Random acts of meaning

Under construction, 22 June 2017
This page is likely to be messy and incomplete. Check back later.


Introduction

Who can you trust?

Privacy Badger is a browser extension created by the Electronic Frontier Foundation. Install it and be scared. Privacy Badger reveals the many hidden trackers embedded in modern websites. You are being watched.

Whether it’s metadata retention, or facial recognition, surveillance has become the norm. It seems even pizza restaurants are now tracking and logging our appearance.

At the same time knowledge and expertise are being devalued. Cultural institutions are having their funding cut. Misinformation is rife.

It all seems overwhelming. What can we do?

#redactionart

In your envelope you’ll find a piece of #redactionart. These creatures were discovered in digitised ASIO surveillance files held by the National Archives of Australia. An archivist tasked with removing particular words or phrases from the files left a wonderful artistic flourish.

You can read more about the discovery of #redactionart, and its life beyond the archives. It also features in a talk I gave at the Australian Society of Archivists Annual Conference last year.

By studying the processes of redaction in ASIO files I hope to reverse the gaze of state surveillance, and find out more about the way ASIO collected and controlled information on ordinary people. But #redactionart reminds us that these processes are not mechanical, they are historical and human. Little things can make a difference.

Random acts of meaning

Today we are going to try and fight back against the forces of control, surveillance, and misinformation through small acts of meaning-making.

You have four missions to complete:

  1. Unsettle institutions – remix websites using X-Ray Goggles
  2. Take responsibility – create a Twitter archive using Twarc
  3. Mobilise meanings – unleash your own Twitter bot on the world
  4. Challenge assumptions – use Hypothes.is to enrich understandings

X-Ray Goggles

Mission 1: Unsettle Institutions

This thing we call ‘the web’ isn’t a thing at all. We talk about ‘web publishing’ and ‘web pages’ as if they’re products of the print world. But if you look underneath what’s presented in your browser you’ll see each ‘page’ is an assemblage of files, standards, protocols, and technologies, all pulled together and rendered in a human-readable version by your browser.

This is important because we can play around with these layers. We don’t have to take the web we’re given. We can change it.

Let’s have a peek beneath the hood and see if we can break something…

(These instructions assume you’re using Chrome. All browsers provide a console, but the way you open it can vary.)

  • Since we’ve been talking about ASIO #redactionart, let’s start by going to the ASIO website «https://www.asio.gov.au/».

  • Now open up your browser’s Javascript console. In Chrome press Ctrl+Shift+J (Windows / Linux) or Cmd+Opt+J (Mac).

  • Cut and paste the code in the box below into the console, then hit Enter.

document.getElementsByTagName('body')[0].innerHTML = '<h1>All your secret are belong to us!</h1>';

Congratulations, you just hacked ASIO! The unmarked black vans are already pulling up outside…

Don’t worry, it’s ok. Nothing was actually changed on the ASIO site. All the various bits that make up the ASIO home page are stored in your browser in a thing called the Document Object Model, or DOM. The Javascript console lets you play around with the contents of the DOM.

  • Reload the ASIO home page. See, it’s still there!

  • Cut and paste the code again. But this time change the text between the <h1></h1> tags before you hit Enter.

In this example we’ve replaced the contents of the page’s <body> tag. But we can be more selective.

  • Reload the page again.

  • Cut and paste the following code and then hit Enter.

document.getElementsByTagName('img')[1].src = 'https://dl.dropbox.com/s/ezxugjt877wodj3/11981680-p42-1-324-194.jpg'

In this example we’re selecting an <img> tag and replacing the source of the image with the url for some #redactionart. Much better!

  • Reload the page.

  • Cut and paste the code again, but this time try changing the 1 in the square brackets to 0 or another number. What happens?
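
If you’re wondering how many images there are to play with, you can ask the console to count them – this little check isn’t part of the original exercise, but it uses the same standard DOM method:

document.getElementsByTagName('img').length

The number in the square brackets just picks one image out of that list (counting from 0), so anything from 0 up to one less than the total will work.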

The Mozilla Foundation has created a tool called X-Ray Goggles that makes it easy to explore and change the content of any web page.

  • Go to the X-Ray Goggles page

  • You might as well create an account (this will let you save your projects) – so click on ‘Create an account’ at the top of the page and follow the instructions.

  • Now follow the instructions to install X-Ray Goggles in your browser. It’s basically just a matter of dragging the pink button to your browser’s bookmarks bar.

  • Once you’ve installed X-Ray Goggles, click on the ‘Sample activity page’ button and follow the instructions.

  • Note that once you’ve edited an element, you need to click update to save your change and move on to the next task.

Ok, so creating and naming cute animals is not terribly exciting. But now that you know how to use X-Ray Goggles, you can remix any web page!

Have you ever wanted to hack your own institution’s website? Or perhaps ‘improve’ the web presence of your favourite political party or government agency? Do it now.

  • Go to the site you want to hack.

  • Click on the X-Ray Goggles button in your toolbar to activate the editor. You’ll notice some buttons appear at the bottom right of the page.

  • Just click any element (text or image) on the page to open it for editing.

  • Try changing the text. Replace images by substituting the src attribute with a url to an image somewhere on the web.

  • Once you’re happy with your remixed page, click on the Publish button. (You’ll need to be logged in to your account.)

  • Share the ‘published’ version on Twitter using the #nls8 tag. Add it to the Google Doc for this workshop.

There are other ways of modifying the DOM to hack the web. Userscripts are little Javascript programs that run in your browser, modifying the contents of selected pages. For example, I’ve created a userscript that inserts the faces of people subject to the restrictions of the White Australia Policy into relevant records in the National Archives of Australia’s RecordSearch database – instead of just metadata, you see the people inside. You can install it yourself.

Browser extensions are similar – they’re just packaged and distributed in a more formal way. Stop Tony Meow is a browser extension that replaces images of Tony Abbott with kittens. How do you think it might work?

Rehumanize is a browser extension that replaces words like ‘boat people’ and ‘illegals’ with ‘humans’. A simple hack that is nonetheless both pointed and powerful.
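
Under the hood, userscripts and extensions like these rely on a bit of Javascript that runs on each page you visit and rewrites parts of the DOM – just like the commands we typed into the console earlier. A minimal sketch of the word-swapping idea might look something like this (the word list is purely illustrative, and this isn’t the actual code of Rehumanize or any other extension):

// Walk through every text node on the page and swap selected words.
// Illustrative sketch only – not the code of any real extension.
var replacements = {'illegals': 'humans', 'boat people': 'humans'};
var walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
while (walker.nextNode()) {
  var node = walker.currentNode;
  for (var word in replacements) {
    node.nodeValue = node.nodeValue.replace(new RegExp(word, 'gi'), replacements[word]);
  }
}

You could paste something like this straight into the Javascript console to try it on a single page – packaging it as a userscript or extension just makes it run automatically.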


Mission 2: Take Responsibility

It seems that almost every week there’s yet another article warning us about the ‘digital black hole’, ‘digital deluge’, or even a ‘digital dark age’. We’re not paying enough attention to the archiving and preservation of digital sources, such articles exclaim; we’re in danger of losing our memory.

Of course there are plenty of archivists and librarians already working on this problem. It’s not easy, but we’re making good progress.

However, the ‘black hole’ hype does at least encourage us to think about whose responsibility it is to preserve digital culture. The election of Donald Trump has reminded us that we can’t take it for granted that critical government-funded scientific data will be preserved. The threat to climate data, in particular, prompted a group of librarians, researchers, and activists to create Data Refuge – ‘an initiative committed to identifying, assessing, prioritizing, securing, and distributing reliable copies of federal climate and environmental data so that it remains available to researchers’.

While working with a number of libraries and universities to preserve an ‘end of term’ web harvest of US government sites, the Internet Archive also called on individual web users to play a part, asking ‘If you see something, save something’.

The Wayback Machine and its handy browser extension make it easy to save any page with a click. Instead of simply assuming preservation will just happen, we can play an active role, using tools like the Wayback Machine and Perma.cc to help ensure that sources remain accessible.

It’s everyone’s responsibility.

We also need to keep asking critical questions about which digital voices will be preserved and why. How do we broaden the scope of collecting activities to ensure that the experiences of traditionally underrepresented communities are preserved? Projects like Documenting the Now are addressing this question by working with communities. Tools like Twarc, a Twitter archiving tool, make it easy to capture a diversity of voices around political events and social movements. Here, for example, is an article about using Twarc to capture tweets from the #WomensMarch.

Your mission today is to create a Twitter archive using Twarc and undertake some basic analysis.

To avoid problems and speed up configuration, you’ll be doing your Twitter archiving in the workshop using my magic book of fairy tales. It’s basically a Raspberry Pi in a big box – but hey, it looks cool! If you haven’t played around with Raspberry Pis before, check them out – there are lots of fun DIY projects you can undertake.

If you want to try Twarc at home, you’ll need to set up Python and a few related things before installing it. You might also want to look at my notes on ‘Trying out Twarc’.

Installing Twarc

Let’s get things set up…

  • First we’ll just clear the slate. Don’t worry if you get a ‘command not found’ error. In the terminal type the following commands, hitting Enter after each line:
cd
deactivate
  • Now we’ll set up a ‘virtual environment’ which will isolate our changes from the rest of the system. Replace [YOUR GROUP NAME] with a name for your group:
virtualenv [YOUR GROUP NAME]
cd [YOUR GROUP NAME]
source bin/activate
  • Now we’re ready to actually install Twarc. It’s just:
pip install twarc

Twarc configuration

IGNORE THIS SECTION IF YOU’RE DOING THIS IN THE WORKSHOP.

My magic fairy tales box is pre-configured with a set of Twitter access keys I’ve created for the workshop, so you don’t need to configure Twarc yourself.

However, I’ve included the details below for anyone setting Twarc up on their own computer…

Perhaps the trickiest bit of all of this is just getting all the access keys and tokens you need to authenticate yourself with Twitter. Every time I try this it seems a bit different, but the basic principles should be the same.

  • Make sure you have a Twitter account!

  • Go to the Twitter developers site. Login if necessary with your normal Twitter credentials.

  • Click on the ‘My apps’ link at the top of the page.

  • Click on the Create New App button.

  • Fill in the form. The ‘name’ and ‘description’ can be anything you want. For ‘website’ you can use the url of this workshop.

  • Tick the terms and conditions and click the Create button.

  • On your new application’s page click on the ‘Keys and Access Tokens’ tab.

  • Ok, you should now see the first two of the four keys and tokens we need to collect:
    • Consumer Key (API Key)
    • Consumer Secret (API Secret)
  • To get the other two, click on the Create my Access Token button at the bottom of the page. You should now see:
    • Access Token
    • Access Token Secret
  • Keep this page open so you can copy and paste the values into Twarc.

Now go back to the virtual environment in which you installed twarc and type:

twarc configure

Twarc will ask you for your four keys in turn. Just copy and paste them at the command line. The keys will be stored in your user directory, so you should only need to provide them once.

Starting your harvest

There are two main ways of collecting tweets using Twarc: search and filter. Search queries tweets from the recent past, while filter sits and listens to the live Twitter stream, catching new tweets that match your query as they’re posted.

Given the limits of the Twitter API, it’s virtually impossible to get everything matching your query – the standard search API only reaches back about a week. This means that you might want to repeat your search every few days, or supplement search with filter. You can always remove duplicates later on.
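
If you do decide to supplement search with filter, the command looks much the same – it just keeps running, writing new tweets to the file as they arrive, until you stop it with Ctrl+C. For example (the filename here is just a suggestion):

twarc filter "#nls8" > nls8-stream.json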

Let’s run a search to collect tweets using the #nls8 hashtag:

twarc search "#nls8" > nls8.json

This saves the #nls8 tweets in a file called nls8.json. Results from the Twitter API are saved in JSON format. This is a very common format for saving and sharing structured data – but it’s not very human friendly.
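
Each tweet is saved on its own line of the file, so if you’re curious what a single tweet looks like in its raw JSON form you can pretty-print the first line (this uses Python’s built-in json.tool, which is available inside your virtual environment):

head -1 nls8.json | python -m json.tool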

Let’s find out how many tweets we collected. This command counts the number of lines (each tweet is on a new line) in the JSON file.

wc -l < nls8.json

Ok, now it’s time to run your own harvest. Think about what you’d like to search for – note that it doesn’t just have to be a hashtag. The Twarc GitHub site has a few more advanced examples.

Once you’ve decided on a search, repeat the steps above with your new query, saving the results to a suitably-named file. Then you can move on to processing the results.

Processing the tweets

Twitter discourages you from sharing large amounts of tweets harvested from their API. Twarc offers a way around this – you can dehydrate your collection to create a file that contains only tweet ids. Other users can then rehydrate using Twarc or the web service on the Documenting the Now site. This has the added advantage of respecting users’ decisions to delete their own tweets.

In the code below, replace nls8 with the name of your own results file.

To dehydrate:

twarc dehydrate nls8.json > nls8-ids.txt

To inspect your ids:

less nls8-ids.txt

You’ll see a fascinating list of very large numbers. Use the arrow keys to scroll down, and type q to quit.
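
To go the other way – turning a file of ids back into full tweets – Twarc provides a hydrate command. For example (assuming the filenames used above):

twarc hydrate nls8-ids.txt > nls8-rehydrated.json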

Twarc users are sharing files like this for use by other researchers. Here, for example, is my collection of ‘#australianvalues’ tweets.

Twarc includes a number of handy little utilities, but they’re not installed automatically. The easiest way to get them is to make a copy (a ‘clone’) of the Twarc repository. Just type:

git clone https://github.com/DocNow/twarc.git
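
To see the full list of utilities you’ve just downloaded (each is a small Python script that works on a file of tweets):

ls twarc/utils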

Now you can try removing retweets from your harvest:

twarc/utils/noretweets.py nls8.json > nls8-noretweets.json

Count the number of unique tweets:

wc -l < nls8-noretweets.json

So how many retweets were there?
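
If you’d rather let the shell do the subtraction for you, something like this works (assuming your files are named as above):

echo $(( $(wc -l < nls8.json) - $(wc -l < nls8-noretweets.json) ))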

Let’s finish with some simple visualisations. First let’s make a wall:

twarc/utils/wall.py nls8.json > wall.html

To view the results, use the Pi’s file browser to navigate to /home/pi/workshop and then open your group folder. Double click wall.html to open it in a browser.

And a wordcloud:

twarc/utils/wordcloud.py nls8.json > cloud.html

View the results as above.

The Documenting the Now project has a strong focus on the ethics of collecting. A person’s social media activity might reveal their involvement in political protest – what are the risks of preserving such material? How should access be controlled? What would you do if the government asked you to hand over your archive?


Mission 3: Mobilise Meanings

How do we change the conversation? How do we put new ideas into circulation? You might not think of Twitter bots as tools for transformation, but perhaps they can at least expose us to new perspectives.

Twitter bots are just computer programs that interact with Twitter. Some are evil, such as spambots trying to capture your clicks, or fake news bots trying to manipulate public opinion. But others generate art, open collections, and expose political activities to scrutiny.

Some Twitter bots just post tweets, while others interact with Twitter users. Their behaviours can be quite complex and surprising.

TroveNewsBot tweets random articles from Trove’s digitised newspapers. But if you tweet keywords to it the bot will search for them in Trove and tweet back the most relevant result. You can search Trove without ever leaving Twitter! TroveNewsBot does a lot more as well.

  • For a random article, just tweet the hashtag ‘#luckydip’ to @TroveNewsBot.

  • Try tweeting a url together with the hashtag ‘#keywords’ to @TroveNewsBot.

  • What about all the resources in Trove that aren’t newspapers? Try @TroveBot!

But Twitter bots can also make a point. @StL_Manifest shares images of Jewish refugees who were turned away from the US in 1939 and became victims of the Holocaust. Every3Minutes helps us understand the scale of the slave trade in pre-Civil War America. I created Operation Random Words in response to the Australian government’s implementation of Operation Sovereign Borders. For more on the political potential of Twitter bots read A protest bot is a bot so specific you can’t mistake it for bullshit by Mark Sample.

Your mission is to create a Twitter bot. You’ll be using a site called Cheap Bots Done Quick that makes it easy to build bots with a minimum of fuss.

Your bot will combine words and phrases within a pre-designed template to generate random messages. Operation Random Words is an example of this approach. Its tweets generally follow the pattern:

Operation [ADJECTIVE] [NOUN]: [EITHER 'PROTECTING' OR 'DEFENDING'] Australia from [PLURAL NOUN]

To build your bot, you’ll need to design a template something like this, and provide sets of words to fill the slots.
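
To give you a sense of where we’re heading, here’s roughly what a grammar along those lines might look like in the JSON format Tracery expects (the word lists are invented for illustration – they’re not the real bot’s data – and the #...# markers are placeholders you’ll meet in the visual editor below):

{
  "origin": ["Operation #adjective# #noun#: #verb# Australia from #threat#"],
  "adjective": ["Eternal", "Shiny", "Vigilant"],
  "noun": ["Paperclip", "Teapot", "Umbrella"],
  "verb": ["protecting", "defending"],
  "threat": ["unspecified risks", "falling standards", "rogue punctuation"]
}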

  • First decide on a concept for your bot – what would you like it to do? This should help you come up with a name.

  • Once you have a name for your bot, create an account for it on Twitter. Note that you’ll need either a mobile phone number, or an email that’s not already associated with Twitter to verify the new account.

  • Once you’ve created your new account, go to Cheap Bots, Done Quick, and click on the Sign in with Twitter button.

  • Click to give the site permission to use your bot’s Twitter account.

Cheap Bots, Done Quick uses a Javascript library called Tracery to generate the random messages. Most of the work is done behind the scenes, but you have to provide Tracery with the data it needs in a format called JSON (Javascript Object Notation). JSON is used all over the place to move data around.

Fortunately, there’s a visual editor that makes it simple to generate data in the format Tracery expects.

  • Open the Tracery visual editor.

  • If it’s not already selected, choose ‘tinygrammar’ from the dropdown list.

  • Click on show colors and then reroll. You’ll see how the values on the left side of the editor are being combined to create the phrases on the right side.

  • The template for your tweets is always called ‘origin’. The placeholders in your template (where the randomly selected words will be slotted in) are indicated by a hash (#) at the beginning and end. Note that the default example has two placeholders ‘#name#’ and ‘#occupation#’.

  • Try reversing the positions of ‘#name#’ and ‘#occupation#’ in the template and click reroll. What happens?

  • You’ll see that the values to fill the placeholders are drawn from lists called ‘name’ and ‘occupation’. Try adding some extra values to the list of names and click reroll.

  • Click on new symbol to create a new list.

  • Click on the label of the list and change it to ‘mood’.

  • Click on ‘rule’ in the new list and change it to ‘happy’.

  • Click on the ‘+’ sign to add ‘sad’ to the list.

  • Now edit the origin template and add ‘is #mood#’ to the end. Click reroll. What happens?

At this point you should have a good idea of how the templates work. Now you need to design your own! Think about the types of things you want to combine. Then start creating lists of those things. Keep in mind that the complete messages need to be under 140 characters.

Once you’ve created your templates and word lists in the visual editor, it’s time to copy the data across to your bot.

  • In the visual editor, click on the JSON button. Your template and lists will be displayed in JSON format.

  • Select and copy all of the JSON data.

  • Go back to Cheap Bots, Done Quick and paste your data in place of the default values.

  • A sample tweet should appear in the text box below the data. Click on the reload button a few times to check that it’s working the way you expect.

  • If all seems ok – fire off your first tweet! Just click on the Tweet button next to the sample tweet.

  • Check your bot’s Twitter profile to see if the tweet was sent. (You might have to reload the page.)

  • All good? Now go back to Cheap Bots, Done Quick to set up a schedule.

  • In the dropdown box next to ‘post a tweet’ select ‘Every hour’.

  • Now click on the Save button at the bottom of the page.

Congratulations, you’ve created your first bot. Make sure you share the results through Twitter and in the workshop’s Google Doc.

How else do you think you might be able to use Twitter bots?


Screenshot of Hypothes.is

Mission 4: Challenge Assumptions

Who can we trust? In this era of ‘fake news’ we are constantly confronted with the task of critically analysing the media, and seeking out reliable sources of information.

Tools like DiffEngine can help us understand the way the ‘news’ itself is shaped and transformed over time. DiffEngine tracks changes in news articles via RSS feeds. When an article changes it saves a fresh copy to the Internet Archive and tweets the changes.

The Internet Archive’s Trump Archive is both a collection of Trump’s media interviews and a resource for fact checking, while Climatefeedback.org uses the web annotation tool Hypothes.is to add a bit of scientific rigour to media reporting around climate change. We don’t have to just consume the media – we can use digital tools to challenge and critique.

Your mission is to use Hypothes.is to enrich a web page by adding notes, links and images.

Hypothes.is can be used with any webpage (or PDF). Some sites, like this one, have embedded the necessary code so that any visitor can add annotations. You can also install a browser extension that lets you annotate any page.

For this mission we’ll just be feeding the url of the page we want to annotate directly to Hypothes.is. But what page (or pages) are you going to choose?

What about:

  • Articles from a major news organisation?
  • Debates in Parliament via Open Australia (or Historic Hansard)?
  • Policy statements of political parties?
  • Media releases from cultural heritage organisations?

For more ideas have a look at the ‘In Action’ section of the Hypothes.is site.

Once your group has made a decision and found some suitable pages, you’re ready to start annotating.

  • First sign up for a free account with Hypothes.is.

  • Once you’re signed in, copy the url of the page you want to annotate. Go back to the Hypothes.is site and click on ‘Paste a link’. Paste the url in the box and click the Annotate button.

  • The page will load with the Hypothes.is annotation tools embedded. You’ll see some new tabs on the right hand side of the screen. Click on the arrow to open and close the Hypothes.is editor.

  • The easiest way to add an annotation is to simply select some text. A pop up will give you the option to either ‘Annotate’ or ‘Highlight’ – choose annotate and the editor will automatically open.

  • The editor includes a number of simple formatting tools – you can use these to add links and images.

  • You can also add tags to your annotations. Add the tag ‘nls8’ so that we can easily follow the editing activity.

  • Once you’ve finished your annotation, click on the Post to public button.

One cool thing about Hypothes.is is that each annotation has its own unique url that you can share. This means that instead of just sharing a link to a webpage, you can share a link to a specific annotated word or phrase – and anyone who follows it will be taken right back to that spot on the page.

  • Once you’ve saved an annotation, click on the share icon and copy the url.

  • Open a new browser tab and paste in the url. What happens?