Course
Digital Tools and Techniques for the Adventurous Historian

Under construction, 08 May 2016
This page is likely to be messy and incomplete. Check back later.


A workshop organised by the History Council of South Australia for the SA History Festival 2016.
Held at the State Library of South Australia, 10 May 2016.

Knowing what’s possible

In three hours we’re not going to give you all the knowledge and skills you need to make full use of digital tools and technologies in your historical work.

Sorry!

What we’re aiming to do is to give you a sense of what’s possible, so that when you have a particular task to do you’ll know what tools are relevant, what sites to visit, what words to Google.

You can come back to this page at any time and explore in more depth. So don’t worry about the technical details. Think about the possibilities!


LODBook and layers

LODBooks demo site

LOD = Linked Open Data. It’s a way of sharing structured data (like the stuff in databases) on the web so that it can be easily linked, collected, and used.

Historians create LOD all the time – they define entities (people, places, events) and the relationships between them. But when we publish we tend to squeeze out the data, until all that is left is narrative text. I think we can have both – narratives enriched with Linked Open Data.
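To make the idea of entities and relationships a little more concrete, here is a minimal sketch in Python. It is not how LODBooks itself works; it simply stores statements as (subject, property, value) triples, which is the basic shape of Linked Open Data. Every identifier and name in it is invented for illustration.

```python
# A tiny, invented set of statements about entities and their relationships.
# Every name and identifier here is hypothetical, for illustration only.
triples = [
    ("person/1", "name", "Example Person"),
    ("person/1", "bornIn", "place/1"),
    ("place/1", "name", "Example Town"),
    ("event/1", "name", "Example Arrival"),
    ("event/1", "participant", "person/1"),
    ("event/1", "location", "place/1"),
]


def describe(entity):
    """Return every (property, value) pair recorded for one entity."""
    return [(p, o) for s, p, o in triples if s == entity]


def linked_to(entity):
    """Return the entities that point at this one, with the linking property."""
    return [(s, p) for s, p, o in triples if o == entity]


print(describe("person/1"))
print(linked_to("place/1"))
```

Because people, places, and events each have their own identifier, the same entity can be linked from many statements, and from a narrative that mentions it.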

Here you’ll find a link to my LODBooks demo site, using a draft text created by Kate Bagnall. Explore! Scroll! Click! I wanted to show it today because I believe we need to think more creatively about the ways we publish historical work online. Not just narratives, not just databases, but both – together!

I also wanted to show it because it illustrates my general theme for today’s workshop. There will be no talk of revolutions today, and hopefully I’ll avoid using the word ‘new’ too much. Even as we explore this realm of digital possibilities, we are still historians. We must remain critical of our tools and sources. But what these tools and techniques offer are ways of working with the complex layers of history, of exposing structures in our sources, of creating complex narratives that reveal their relationships with data.

Sites to visit


The web isn’t a thing

But before we start peeling back the layers of digital history we first have to understand that this thing we call ‘the web’ isn’t a thing at all. We talk about ‘web publishing’ and ‘web pages’ as if they’re products of the print world. But if you look underneath what’s presented in your browser you’ll see each ‘page’ is an assemblage of files, standards, protocols, and technologies, all pulled together and rendered in a human-readable version by your browser.

This is important because we can play around with these layers. We don’t have to take the web we’re given. We can change it.

Things to try

  • X Ray Goggles – a great educational tool from The Mozilla Foundation. Deconstruct and remix websites. Share the results!
  • Stop Tony Meow – ok, it’s a bit redundant now, but it’s a great example of how web pages can be rewritten in our browser. MOAR KITTENS.

Hacking RecordSearch

I love the collections of the National Archives of Australia, but I am constantly frustrated by their online database RecordSearch. Here’s a little hack which makes RecordSearch a bit more user friendly.

Remember, web pages aren’t fixed. We have more control over them than we think. Our browsers can change the way they look, even the way they behave. Userscripts are little programs installed within browsers to customise specific websites. This userscript adds more detail to the display of search results in RecordSearch. Try it!

Things to try


Working with data

Plotly chart of SA 1901 Census

There is useful data everywhere on the web. Sometimes it’s formatted in a convenient, structured way. Sometimes it isn’t.

More and more institutions are sharing data through repositories such as data.gov.au. Often the data is available in formats such as CSV (comma separated values) which can be opened and manipulated in any spreadsheet or database.
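You don’t need a spreadsheet to work with CSV. As a rough sketch, the Python below reads a few rows using the standard `csv` module and does some simple sums. The data is invented and stands in for a file you might download from a repository.

```python
import csv
import io

# Invented sample data standing in for a downloaded CSV file.
sample = """district,population
District A,1200
District B,850
District C,430
"""

# csv.DictReader turns each row into a dictionary keyed by the header row.
rows = list(csv.DictReader(io.StringIO(sample)))

# Values arrive as text, so convert them before doing arithmetic.
total = sum(int(row["population"]) for row in rows)
largest = max(rows, key=lambda row: int(row["population"]))

print(total)
print(largest["district"])
```

With a real file you would replace `io.StringIO(sample)` with `open("yourfile.csv", newline="")`, where the filename is whatever you saved the download as.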

Organisations also share data through APIs (application programming interfaces). APIs deliver the data you want, when you want it, in a form that computers can understand. You can use APIs to harvest datasets, or build applications that retrieve and display data on the fly.
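Most web APIs follow the same basic pattern: you assemble a URL containing your query, and you get structured data (often JSON) back. The sketch below shows both halves without touching the network. The endpoint, the parameter names, and the response are all hypothetical, so check the documentation of whichever API you actually use.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; check the documentation of the
# API you are actually using.
BASE_URL = "https://api.example.org/search"


def build_request(query, page=1):
    """Assemble the URL that asks the API for one page of results."""
    params = {"q": query, "page": page, "format": "json"}
    return BASE_URL + "?" + urlencode(params)


# An invented response, shaped the way many APIs return results.
sample_response = """
{"total": 2, "results": [
  {"title": "Example article one", "date": "1901-03-31"},
  {"title": "Example article two", "date": "1901-04-02"}
]}
"""


def titles(response_text):
    """Pull the article titles out of the JSON text."""
    data = json.loads(response_text)
    return [item["title"] for item in data["results"]]


print(build_request("federation celebrations"))
print(titles(sample_response))
```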

But sometimes the data you want is just presented as text on a website. No downloads, no structure – just text. To get data like this into a useful form you often have to resort to things like ‘screen scrapers’ – tools that use patterns and structures within the page to isolate the data you want. Zotero, for example, uses a whole series of little screen scrapers to extract structured data from websites.
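Here is a minimal sketch of a screen scraper, using only Python’s built-in `html.parser`. It is not how Zotero does it; it just illustrates the principle of using the page’s own structure (here, table rows and the `class` of each cell) to pick out the data. The page fragment is invented.

```python
from html.parser import HTMLParser

# Invented page fragment: the data is only available as text in a table.
page = """
<table>
  <tr><td class="name">Example Item A</td><td class="year">1901</td></tr>
  <tr><td class="name">Example Item B</td><td class="year">1902</td></tr>
</table>
"""


class CellScraper(HTMLParser):
    """Collect the text of <td> cells, one dictionary per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.field = None  # class of the cell we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append({})
        elif tag == "td":
            self.field = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.field = None

    def handle_data(self, data):
        # Only keep text that sits inside a labelled cell of a row.
        if self.field and self.rows:
            row = self.rows[-1]
            row[self.field] = row.get(self.field, "") + data


scraper = CellScraper()
scraper.feed(page)
print(scraper.rows)
```

Scrapers like this are fragile: if the site changes its layout, the patterns no longer match and the scraper needs updating.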

Once you have your data you can start to explore, analyse and visualise it using a growing range of online tools.

Things to try

  • Creating simple charts with Plotly – copy population data from the 1901 South Australian Census and display it as a bar chart.
  • Zotero – more than just a reference manager, Zotero is your personal research database. Extract metadata and images from sites like Trove and RecordSearch.
  • Trove API Console – try some of the potted examples to get an idea of how to get data out of Trove.
  • DataBasic.io’s WTFcsv – learn how to find out what’s hiding in CSV files, starting with The Titanic!

Sites to visit

  • Pre-harvested data – a small (and slightly weird) collection of data sources that I’ve packaged up for easy access. Hansard, ASIO files, faces & more!
  • Building with Trove – Trove is not just a website, it’s a platform! Grab collections data from the Trove API and build things.
  • Government data – try data.gov.au, and the various state repositories such as data.sa.gov.au. What can you find?

Harvesting Trove

Trove Harvester in action

The Trove Newspaper Harvester lets you download lots and lots of digitised newspaper articles from Trove – hundreds, or even thousands of articles. Why would you want this? Perhaps you’re trying to trace trends over time. The harvester collects the publication details of all the articles into a single CSV file. You can harvest a particular search query and then open and explore the results in a spreadsheet or database. Or perhaps you want to look for patterns in the newspaper content itself – the harvester can save all the OCRd text into separate files for analysis.

Digital tools enable us to harvest, process and visualise large quantities of data. Instead of looking for individual examples we can zoom out and trace patterns and trends across time. It doesn’t have to be articles, it could be photographs, or sound files. Tools like the Trove Harvester, which retrieves data from the Trove API, enable us to assemble our own custom datasets for exploration or display.
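Once you have a CSV of article details, tracing a trend over time can be as simple as counting articles per year. The sketch below assumes a column called `date` in YYYY-MM-DD form. That column name, and the rows themselves, are invented for illustration, so check the headings in your own harvest.

```python
import csv
import io
from collections import Counter

# Invented rows standing in for a harvested CSV file. The column names are
# assumptions; a real harvest may label its columns differently.
harvest = """title,date,newspaper
Example article one,1901-03-31,Example Gazette
Example article two,1901-11-02,Example Gazette
Example article three,1902-06-15,Example Herald
"""

rows = csv.DictReader(io.StringIO(harvest))

# The year is the first four characters of an ISO-style date (YYYY-MM-DD).
articles_per_year = Counter(row["date"][:4] for row in rows)

for year, count in sorted(articles_per_year.items()):
    print(year, count)
```

The resulting counts can be pasted straight into a charting tool to see how coverage of a topic rises and falls.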

Things to try


Text as data

In a word

Sometimes the text is the data. Using digital tools we can break texts down into their component parts – words, phrases, and parts of speech – and manipulate them. How are certain words used within collections of texts? We can analyse things like occurrence, frequency, and context to better understand the layers of meaning within text.
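As a small illustration of frequency and context, the Python sketch below splits an invented passage into words, counts them, and shows each use of a keyword with a couple of words either side (a simple ‘keyword in context’ view). The splitting rule is deliberately crude; real tools handle punctuation, hyphens, and other languages far more carefully.

```python
import re
from collections import Counter

# Invented text; swap in any document you like.
text = (
    "The committee met in the morning. The committee agreed that the "
    "report was late. A report on the delay was requested."
)

# Break the text into lower-case words.
words = re.findall(r"[a-z']+", text.lower())

# Frequency: how often does each word occur?
frequency = Counter(words)


def contexts(keyword, window=2):
    """Return each occurrence of keyword with `window` words either side."""
    found = []
    for i, word in enumerate(words):
        if word == keyword:
            found.append(" ".join(words[max(0, i - window): i + window + 1]))
    return found


print(frequency.most_common(3))
print(contexts("report"))
```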

Things to try

  • DataBasic.io’s WordCounter and SameDiff – some simple and fun tools that introduce you to the possibilities of exploring text as data. Compare the lyrics of Beyonce and Aretha Franklin!
  • Voyant Tools – upload and explore your texts! There are many tools and options, so you might want to look at the Getting Started guide. Try uploading Hansard!
  • Getting started with Topic Modelling and MALLET by Shawn Graham, Scott Weingart and Ian Milligan, Programming Historian – topic modelling is a way of finding ‘themes’ or ‘topics’ within a collection of texts.

Sites to visit

Finding faces

Real face of White Australia

Computers are getting better at seeing. The things we take for granted, like the ability to recognise a face, are challenging tasks for computer vision. But recent years have brought great advances.

Computers can be taught to find shapes and patterns within images. Facial detection (finding a face in a photo) is pretty straightforward. This offers interesting possibilities for historians, but the use of such technologies for surveillance also presents political and social challenges.

Things to try

Sites to visit


Telling stories with maps

Geolocated map of Adelaide

Digital technologies have turned everyone into map makers. Whether you know it or not, it’s likely that your electronic devices are currently plotting your movements through space.

Online mapping tools make it easy to visualise geospatial data. But you can do more than just put markers on a map. What about layering historical maps over modern data? What about combining geospatial data with narrative to tell a story in time and space?
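Many online mapping tools can load GeoJSON, a simple text format for geospatial data. The sketch below turns a short list of places and notes into a GeoJSON file, which is one way of attaching fragments of narrative to points on a map. The coordinates are approximate and the second place is entirely made up, so treat the data as illustrative only.

```python
import json

# Approximate or invented coordinates, for illustration only.
places = [
    {"name": "State Library of South Australia", "lat": -34.9206,
     "lon": 138.602, "note": "Workshop venue, 10 May 2016"},
    {"name": "Example place", "lat": -34.93, "lon": 138.6,
     "note": "A made-up stop in the story"},
]


def to_feature(place, order):
    """Convert one place into a GeoJSON point feature."""
    return {
        "type": "Feature",
        # GeoJSON puts longitude first, then latitude.
        "geometry": {"type": "Point",
                     "coordinates": [place["lon"], place["lat"]]},
        "properties": {"name": place["name"], "note": place["note"],
                       "order": order},
    }


collection = {
    "type": "FeatureCollection",
    "features": [to_feature(p, i + 1) for i, p in enumerate(places)],
}

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

The `order` property is just a convenience added here so that a map or story tool could present the points in sequence; it is not part of the GeoJSON standard.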

Things to try

Sites to visit

  • My georectified maps of Adelaide – here’s something I prepared earlier…
  • American Panorama – amazing new collection of data-rich maps telling stories from American history.
  • Odyssey.js – another tool that combines maps and stories.
  • Neatline – a plugin for Omeka that lets you create rich, multilayered maps with timelines, embedded documents, annotations and stories.

The read/write web

Chinese in NSW

We don’t just consume the web, we make it. People generate vast quantities of online content through social media and sharing sites. This is exciting raw material for historical analysis, but how can we make sure it’s preserved and accessible?

The web also offers historians the opportunity to create, engage, and experiment. We can publish stories, start conversations, build exhibitions, and share our research.

Things to try

Sites to visit

  • Omeka – create online exhibitions!
  • Documenting the Now – important new project aimed at documenting communities and current events.

Going further