This was the year I finally understood the possibilities of Jupyter notebooks for learning, experimentation, and collaboration. As a result, I spent a lot of time building up a collection of Jupyter-powered tools and examples in my GLAM Workbench.
I also made some much-needed progress on LODBook 3, a Jekyll plugin and theme that helps you create rich historical narratives enriched with linked data.
In other ways 2018 was a bit of a challenge and, as I head into 2019, I’ll be further cutting back my hours at the University of Canberra to just one day per week. Hopefully this will help me get a few things into balance, and give me more time to spend on the sorts of things listed below — things that help people explore and use digital collections.
If you think this work is useful or important, you might like to support me on Patreon.
For regular updates throughout the year, follow @wragge on Twitter or visit 101 DH Hacks on Facebook.
Posts and articles
- Trove bots for all!, 101 DH Hacks, 21 January 2018 — make your very own Trove Twitter bot using Glitch
- ‘Withheld, pending advice’, Inside Story, 2 February 2018 — a summary of my research on ‘closed’ files in the National Archives of Australia
- Closed Access 2017 update, Open Research Notebook, 18 February 2018 — details of my annual harvest of records in the National Archives of Australia with the access status of ‘closed’
- An experiment in two-way direct linking using Hypothes.is, Open Research Notebook, 20 February 2018 — creating two-way direct links to quotes in Historic Hansard
- Records of Resistance, 1 March 2018 — working on the State Library of NSW’s Tribune negatives collection
- Tribune negatives metadata and licensing, Open Research Notebook, 4 March 2018 — some notes relating to my harvest of metadata and images from the State Library of NSW’s Tribune negatives collection
- Mapping Migration in the arts: The Real Face of White Australia, 8 March 2018, interview with Adrian Murphy for the Europeana Migration campaign
- Filing cabinets, secrets, and the introduction that never was, 30 March 2018
- Hacking heritage: understanding the limits of online access, preprint of a chapter submitted for publication as part of The Routledge International Handbook of New Digital Practices in Galleries, Libraries, Archives, Museums and Heritage Sites
Presentations and workshops
- Hacking Cultural Heritage Collections to Understand the Limits of Access (video, slides), Deakin University’s Contemporary Histories Research Group History Seminar Series, 11 April 2018
- Digital History Workshop (slides), workshop for ANU History HDR students, April 2018
- Trove Tips & Tricks (slides), research methods workshop at University of Canberra, April 2018
- Trove as a platform for digital research and creativity (slides), research methods workshop at University of Canberra, May 2018
- Digital History Workshop (slides), workshop for ANU History Honours students, May 2018
- Playing with Linked Open Data, Karen Miller’s report on my workshop at the Curtin University Makerspace, 26 July 2018
- LOD for historians: enriching narratives with structured data (slides), presentation at Landscape Data Art & Models as Linked Open Data, Curtin University, 27 July 2018
- Trove Tips & Tricks, presentation for Sailing Into History Family History Congress, Batemans Bay, September 2018
- Exploring GLAM data (with Jupyter notebooks) (video, notebook), ARDC Tech Talk, 18 September 2018
- Small data and hand-crafted infrastructure (slides), presentation for the School of Humanities and Social Inquiry’s Provocations seminar series, University of Wollongong, 27 September 2018
- A GLAM data workbench for reluctant researchers (video, notebook), presentation at National Digital Forum, Wellington, 20 November 2018
- LODBook III: The Voyage Home (video, slides), presentation at National Digital Forum, Wellington, 21 November 2018
Assorted tools, code, and data
These are mostly updates, but I also created a few remixable apps using Glitch that spawned an army of Trove Twitter bots.
- Trove List Bot (Glitch) — create a Twitter bot that tweets new or randomly-selected items from a Trove list
- Trove Title Bot (Glitch) — create a Twitter bot that tweets articles from selected Trove newspaper titles
- Trove Collection Bot (Glitch) — create a Twitter bot that tweets Trove items from a particular collection
- Trove Tag Bot (Glitch) — create a Twitter bot that tweets Trove items with a particular tag
- Trove Headline Roulette (Glitch version) — a Glitch version of this simple game using Trove’s digitised newspapers
- Trove Titles: Exploring Trove’s digitised journals — updated to feature newly added content and to use version 2 of the Trove API; new titles harvested monthly
- RecordSearch Tools — scraper library updated to work with Python 3
- SRNSW Indexes — harvesting code updated to work with the new index layout; updated all indexes and added ‘Criminal Indictments, 1863-1919’
- Trove Newspaper Harvester — various updates and fixes for better Python 3 compatibility and to handle newspaper articles without titles
- DIY #redactionart — added another SVG outline to the collection, this one bearing a remarkable resemblance to Donald Trump
- Trove texts — OCR’d text and metadata harvested from Trove’s digitised journals; currently includes text from 4465 issues of The Bulletin
- DXLab Tribune — updated harvest of item metadata from the State Library of NSW’s Tribune collection
- LODBook 3 — latest code and demo content for the LODBook project
- Real Face of White Australia data — monthly updates of data from the Real Face of White Australia transcription project
- Faces of the Tribune — all the faces extracted from the State Library of NSW’s Tribune negatives in one very big, deep-zoomable image
GLAM Workbench
This is a collection of tools, apps, and examples to help you work with data from galleries, libraries, archives, and museums (the GLAM sector). It’s created using Jupyter notebooks which combine code, text, and data in a way that encourages experimentation and learning. The notebooks are organised into repositories that focus on particular collections or technologies.
This is very much a work-in-progress. As you can see from the list below, I’m still reorganising some of the repositories. I’ve started work on a documentation site that will provide brief descriptions of each notebook, as well as some suggested pathways through the repositories.
Repositories
The links below go to the repositories on GitHub. For easier browsing of the notebooks you might want to use NBViewer. To run the notebooks live, open the GitHub links in Binder.
- OzGLAM Workbench — this is no longer being updated; sections and notebooks in this repository are gradually being moved into standalone repositories
- OzGLAM Data — harvest details of datasets shared by GLAM institutions through various state and national data portals; results available as a CSV or on Google Sheets
- OzGLAM Data NAA: White Australia Policy — harvest and explore records relating to the White Australia Policy in the National Archives of Australia; includes item-level metadata harvested from 23 series
- OzGLAM Data NAA: ASIO — harvest and explore records relating to ASIO in the National Archives of Australia; includes item-level metadata harvested from 18 series
- OzGLAM Data: Records of Resistance — explore metadata from the State Library of NSW’s Tribune collection
- Archway harvesting (Archives New Zealand) — harvest record metadata from Archives New Zealand’s Archway database
- Te Papa collections API — explore Te Papa’s collection API; includes some simple mapping examples
- Trove lists — use the Trove API to download and analyse the contents of Trove lists
- Trove unpublished — an attempt to use the Trove API to find and explore unpublished works that might be entering the public domain on 1 January 2019
- OzGLAM Data Records of Resistance — analyse metadata from the State Library of NSW’s Tribune negatives collection
- Image recognition — try some basic machine learning examples using Tensorflow to tag and categorise images in the State Library of NSW’s Tribune negatives collection
- Facial detection — use computer vision technologies to detect faces in the State Library of NSW’s Tribune negatives collection
- DigitalNZ — examples using the DigitalNZ API; includes notebooks to harvest searches in Papers Past, and to search by country in DigitalNZ
- Trove newspaper harvester — making my existing command line tool for harvesting large quantities of Trove newspaper articles available through an easy-to-use web interface; includes some examples of how you can analyse and visualise the results
- Trove newspapers — various examples of harvesting, analysing, and visualising data from Trove’s digitised newspapers
- RecordSearch — use the RecordSearch Tools library to harvest and analyse collection metadata from the National Archives of Australia
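If you want to build NBViewer or Binder links for a repository yourself, here is a minimal Python sketch of the URL patterns those services use for GitHub repositories. The account and repository names are hypothetical placeholders, and the `master` branch is an assumption, so swap in the real values for whichever repository you're interested in.

```python
from urllib.parse import quote


def nbviewer_url(user, repo, ref="master", path=""):
    """Build an NBViewer URL for a GitHub repository, folder, or notebook."""
    # NBViewer uses 'blob' for single notebooks and 'tree' for folders.
    kind = "blob" if path.endswith(".ipynb") else "tree"
    parts = ["https://nbviewer.jupyter.org/github", user, repo, kind, ref]
    if path:
        parts.append(quote(path))
    return "/".join(parts)


def binder_url(user, repo, ref="master"):
    """Build a Binder URL that launches a live copy of a GitHub repository."""
    return f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}"


# Hypothetical account and repository names, for illustration only.
print(nbviewer_url("example-user", "example-notebooks"))
print(binder_url("example-user", "example-notebooks"))
```

Passing a notebook path to `nbviewer_url` gives you a link straight to a single rendered notebook rather than the repository listing.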
Apps
Using the Appmode extension and the Binder service, you can hide the code cells in Jupyter notebooks and run them as handy little apps. Of course, if you want to dig around in the code you can: just click on the ‘Edit App’ button.
- Download a high-resolution page image from Trove’s digitised newspapers (part of the Trove Newspapers repository)
- QueryPic deconstructed — visualise searches in Trove’s digitised newspapers with a deconstructed, extended, and hackable version of my classic QueryPic tool (part of the Trove Newspapers repository)
- Download the contents of a digitised file from the National Archives of Australia (part of the RecordSearch repository)
- Focus on faces — an experimental app using facial detection in the Tribune collection (part of the Facial Detection repository)
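To give a sense of how a notebook ends up opening as an app, here is a small Python sketch of the kind of Binder link involved. It assumes the repository has the Appmode extension installed, which serves notebooks in app view under an `/apps/` path. The account, repository, and notebook names are hypothetical, and this is just one way to build such a link, not necessarily how the apps above are wired up.

```python
from urllib.parse import quote


def binder_app_url(user, repo, notebook, ref="master"):
    """Build a Binder URL that opens a notebook in Appmode's app view."""
    # Appmode serves notebooks under /apps/; the path is percent-encoded
    # so it can be passed to Binder in the urlpath query parameter.
    urlpath = quote(f"/apps/{notebook}", safe="")
    return f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}?urlpath={urlpath}"


# Hypothetical names, for illustration only.
print(binder_app_url("example-user", "example-notebooks", "example_app.ipynb"))
```

Leaving off the `urlpath` parameter opens the same repository in the standard Jupyter interface, with all the code cells visible.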