Under construction, 08 May 2016
This page is likely to be messy and incomplete. Check back later.
A workshop organised by the History Council of South Australia for the SA History Festival 2016.
Held at the State Library of South Australia, 10 May 2016.
In three hours we’re not going to give you all the knowledge and skills you need to make full use of digital tools and technologies in your historical work.
What we’re aiming to do is to give you a sense of what’s possible, so that when you have a particular task to do you’ll know what tools are relevant, what sites to visit, what words to Google.
You can come back to this page at any time and explore in more depth. So don’t worry about the technical details, think about the possibilities!
LOD = Linked Open Data. It’s a way of sharing structured data (like the stuff in databases) on the web so that it can be easily linked, collected, and used.
Historians create LOD all the time – they define entities (people, places, events) and the relationships between them. But when we publish we tend to squeeze out the data, until all that is left is narrative text. I think we can have both – narratives enriched with Linked Open Data.
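To make 'entities and relationships' a bit more concrete, here's a minimal sketch in Python. Every identifier, name and relationship label in it is invented for illustration. Real Linked Open Data would use web addresses (URIs) and a shared vocabulary, but the basic shape is the same: lots of small statements that link things together.

```python
# Each statement is a (subject, relationship, object) triple.
# All identifiers and relationship names below are made up for illustration.
triples = [
    ("person/1", "name", "Example Person"),
    ("person/1", "bornIn", "place/1"),
    ("place/1", "name", "Example Town"),
    ("event/1", "involved", "person/1"),
    ("event/1", "tookPlaceIn", "place/1"),
]


def describe(entity, statements):
    """Return every (relationship, object) pair recorded for one entity."""
    return [(rel, obj) for subj, rel, obj in statements if subj == entity]


def linked_to(entity, statements):
    """Return the subjects of any statements that point at this entity."""
    return sorted({subj for subj, rel, obj in statements if obj == entity})


print(describe("person/1", triples))
print(linked_to("place/1", triples))
```

Because the place is an entity in its own right, not just a word in a sentence, we can ask what else is connected to it.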
Here you’ll find a link to my LODBooks demo site, using a draft text created by Kate Bagnall. Explore! Scroll! Click! I wanted to show it today because I believe we need to think more creatively about the ways we publish historical work online. Not just narratives, not just databases, but both – together!
I also wanted to show it because it illustrates my general theme for today’s workshop. There will be no talk of revolutions today, and hopefully I’ll avoid using the word ‘new’ too much. Even as we explore this realm of digital possibilities, we are still historians. We must remain critical of our tools and sources. But what these tools and techniques offer are ways of working with the complex layers of history, of exposing structures in our sources, of creating complex narratives that reveal their relationships with data.
But before we start peeling back the layers of digital history we first have to understand that this thing we call ‘the web’ isn’t a thing at all. We talk about ‘web publishing’ and ‘web pages’ as if they’re products of the print world. But if you look underneath what’s presented in your browser you’ll see each ‘page’ is an assemblage of files, standards, protocols, and technologies, all pulled together and rendered in a human-readable version by your browser.
This is important because we can play around with these layers. We don’t have to take the web we’re given. We can change it.
I love the collections of the National Archives of Australia, but I am constantly frustrated by their online database RecordSearch. Here’s a little hack which makes RecordSearch a bit more user friendly.
Remember, web pages aren’t fixed. We have more control over them than we think. Our browsers can change the way they look, even the way they behave. Userscripts are little programs installed within browsers to customise specific websites. This userscript adds more detail to the display of search results in RecordSearch. Try it!
There is useful data everywhere on the web. Sometimes it’s formatted in a convenient, structured way. Sometimes it isn’t.
More and more institutions are sharing data through repositories such as data.gov.au. Often these are available in formats such as CSV (comma separated values) which can be opened and manipulated in any spreadsheet or database.
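You don't need a spreadsheet to work with CSV. As a rough sketch, here's how a few lines of Python might read a CSV file and count records by year. The column names and rows are invented sample data standing in for a real download.

```python
import csv
import io

# Invented sample data standing in for a downloaded CSV file.
sample = io.StringIO(
    "title,year,place\n"
    "Example record A,1901,Adelaide\n"
    "Example record B,1915,Adelaide\n"
    "Example record C,1915,Port Augusta\n"
)

# DictReader uses the first line as column names,
# so each row becomes a dictionary keyed by those names.
rows = list(csv.DictReader(sample))

# Count how many records fall in each year.
per_year = {}
for row in rows:
    per_year[row["year"]] = per_year.get(row["year"], 0) + 1

print(per_year)
```

With a real file you would swap the sample for something like open("yourfile.csv", newline="").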
Organisations also share data through APIs (application programming interfaces). APIs deliver the data you want, when you want it, in a form that computers can understand. You can use APIs to harvest datasets, or build applications that retrieve and display data on the fly.
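What does talking to an API look like? Usually you build a web address that describes what you want, and you get back structured data (often JSON) instead of a web page. The sketch below shows both halves. The endpoint, the parameter names and the shape of the response are all hypothetical; every real API documents its own. The response is canned so the example runs without touching the network.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.org/search"


def build_request(query, page=1):
    """Assemble the URL you would send to the (hypothetical) API."""
    params = {"q": query, "page": page, "format": "json"}
    return BASE_URL + "?" + urlencode(params)


def titles_from(response_text):
    """Pull the titles out of a JSON response shaped like the sample below."""
    data = json.loads(response_text)
    return [item["title"] for item in data["results"]]


# A canned response in an assumed format.
sample_response = (
    '{"total": 2, "results": '
    '[{"title": "First result"}, {"title": "Second result"}]}'
)

print(build_request("history festival"))
print(titles_from(sample_response))
```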
But sometimes the data you want is just presented as text on a website. No downloads, no structure – just text. To get data like this into a useful form you often have to resort to things like ‘screen scrapers’ – tools that use patterns and structures within the page to isolate the data you want. Zotero, for example, uses a whole series of little screen scrapers to extract structured data from websites.
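Here's a small sketch of the screen scraping idea, using only Python's built-in HTML parser. It relies on one pattern in the page (the data sits in table cells) to pull out rows. The page and its contents are invented, and this is just one simple approach, not a description of how any particular tool works.

```python
from html.parser import HTMLParser


class CellScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # a new row begins
        elif tag == "td":
            if not self.rows:             # cell outside any row: start one
                self.rows.append([])
            self.in_cell = True
            self.rows[-1].append("")      # a new, empty cell

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:                  # ignore text outside table cells
            self.rows[-1][-1] += data.strip()


# An invented page: human-readable, but with no download option.
page = """
<html><body><h1>Example list</h1>
<table>
<tr><td>Example Person</td><td>1901</td></tr>
<tr><td>Another Person</td><td>1915</td></tr>
</table></body></html>
"""

scraper = CellScraper()
scraper.feed(page)
print(scraper.rows)
```

Scrapers like this are fragile. If the site changes its layout, the pattern breaks and the scraper needs updating.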
Once you have your data you can start to explore, analyse and visualise it using a growing range of online tools.
The Trove Newspaper Harvester lets you download lots and lots of digitised newspaper articles from Trove – hundreds, or even thousands of articles. Why would you want this? Perhaps you’re trying to trace trends over time. The harvester collects the publication details of all the articles into a single CSV file. You can harvest a particular search query and then open and explore the results in a spreadsheet or database. Or perhaps you want to look for patterns in the newspaper content itself – the harvester can save all the OCR’d text into separate files for analysis.
Digital tools enable us to harvest, process and visualise large quantities of data. Instead of looking for individual examples we can zoom out and trace patterns and trends across time. It doesn’t have to be articles, it could be photographs, or sound files. Tools like the Trove Harvester, which retrieves data from the Trove API, enable us to assemble our own custom datasets for exploration or display.
Sometimes the text is the data. Using digital tools we can break texts down into their component parts – words, phrases, and parts of speech – and manipulate them. How are certain words used within collections of texts? We can analyse things like occurrence, frequency, and context to better understand the layers of meaning within text.
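As a minimal sketch of frequency and context, the Python below splits a made-up passage into words, counts them, and then shows the words either side of each occurrence of a keyword (often called 'keyword in context'). The way it splits words is deliberately crude; dedicated text analysis tools handle punctuation, hyphens and word forms much more carefully.

```python
import re
from collections import Counter

# A made-up passage standing in for a real source text.
text = (
    "The workshop was held at the library. "
    "The library holds newspapers, and the newspapers hold stories."
)

# Break the text into lowercase words (a crude but serviceable split).
words = re.findall(r"[a-z']+", text.lower())

# Frequency: how often does each word occur?
frequency = Counter(words)


def contexts(keyword, tokens, window=2):
    """Keyword in context: the words either side of each occurrence."""
    return [
        " ".join(tokens[max(0, i - window): i + window + 1])
        for i, token in enumerate(tokens)
        if token == keyword
    ]


print(frequency.most_common(3))
print(contexts("library", words))
```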
Computers are getting better at seeing. The things we take for granted, like the ability to recognise a face, are challenging tasks for computer vision. But recent years have brought great advances.
Computers can be taught to find shapes and patterns within images. Facial detection (finding a face in a photo) is pretty straightforward. This offers interesting possibilities for historians, but the use of such technologies for surveillance also presents political and social challenges.
Digital technologies have turned everyone into map makers. Whether you know it or not, it’s likely that your electronic devices are currently plotting your movements through space.
Online mapping tools make it easy to visualise geospatial data. But you can do more than just put markers on a map. What about layering historical maps over modern data? What about combining geospatial data with narrative to tell a story in time and space?
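Many online mapping tools accept GeoJSON, a standard format for describing points, lines and shapes. Here's a hedged sketch of turning a simple list of places into GeoJSON markers. The place names and years are invented and the coordinates are only approximate. One thing to watch: GeoJSON puts longitude before latitude.

```python
import json

# Invented records; coordinates are approximate and for illustration only.
places = [
    {"name": "Example place A", "year": 1901, "lat": -34.92, "lon": 138.60},
    {"name": "Example place B", "year": 1915, "lat": -32.49, "lon": 137.77},
]


def to_geojson(records):
    """Convert a list of place records into a GeoJSON FeatureCollection."""
    features = []
    for record in records:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is [longitude, latitude].
                "coordinates": [record["lon"], record["lat"]],
            },
            # Anything in 'properties' can be shown in a marker's popup.
            "properties": {"name": record["name"], "year": record["year"]},
        })
    return {"type": "FeatureCollection", "features": features}


collection = to_geojson(places)
print(json.dumps(collection, indent=2))
```

Keeping the year in each marker's properties is what lets a map filter or animate the points over time.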
We don’t just consume the web, we make it. People generate vast quantities of online content through social media and sharing sites. This is exciting raw material for historical analysis, but how can we make sure it’s preserved and accessible?
The web also offers historians the opportunity to create, engage, and experiment. We can publish stories, start conversations, build exhibitions, and share our research.