DigiVol: DigiVols as Code Breakers

By: Dana Anderson, Category: Museullaneous, Date: 14 Nov 2014

The immense task of digitising our collections raises many challenges for our digitisation and transcription team, the DigiVols.

Page of Edgar Waite's Diary
Photographer:  © Waite Family

With the increased use of computers, it’s true that we don’t put pen to paper as often as we once did. In this digital age our eyes have become accustomed to the neat and uniform text that appears on computer screens, and when we encounter cursive handwriting we can easily become confused as to what that spidery script is trying to say.


Handwriting is idiosyncratic and no two individuals write in exactly the same way. Sometimes it can be illegible and we can be unsure of the words, letters or even numbers that have been written. For someone trying to transcribe a handwritten document, the quirks of personal writing styles and issues like fading ink and discoloured paper can cause problems and confusion. On top of that, transcribers can encounter a range of other issues: words and names that are spelt in a variety of ways; confusing layouts where the writer has crossed out words, written in margins or added minuscule corrections; and even documents that have been written in shorthand or Pidgin. Yep, it all sounds like chaos on paper!


The immense task of digitising our collections raises many of these challenges for our volunteer digitisation and transcription team, the DigiVols. The DigiVols work hard at digitising, transcribing and deciphering handwritten labels from scientific specimens, research notes, manuscripts and diaries. Without these diligent code breakers, institutions such as the Australian Museum wouldn’t be able to store or manage important scientific data, nor make our collections more accessible for scientists, historians and the general public.


Transcription is about interpreting a handwritten document so that it can be typed up, entered into a database and made more easily readable and searchable by people and software. For those who are working on transcribing documents, the task can feel like you’re piecing together an elaborate and confusing puzzle or breaking a code that was intended for only the original writer to read.


The DigiVols are currently working their way through the Museum’s entomology and malacology collections. Many of the specimens currently being digitised were collected by a variety of people at different times throughout the Museum’s history. Each collector has a unique style of labelling, and it is the handwritten labels that cause the most confusion. Luckily, the more you transcribe and read handwritten documents, the easier it becomes. Many of our DigiVols have developed a system of deciphering that helps to decode the puzzle of someone’s scribbly scrawl.


With the help of the Museum’s knowledgeable entomology staff, I have been working on a resource covering key entomology collectors, many of whom the DigiVols encounter on a daily basis. For each collector, the resource records information about the person, the period in which they collected, the locations, the taxa or species collected, and a sample of their handwriting or printed specimen label. This will be a point of interest and reference for the DigiVols, helping them to transcribe specimen labels correctly and to identify collectors.
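
For readers curious how such a resource could be laid out for searching, here is a minimal sketch in Python. It is not the Museum's actual resource or database: the field names, the `possible_collectors` helper and the placeholder entries ("Collector A", "Location X" and so on) are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CollectorRecord:
    """One entry in a hypothetical collector reference list."""
    name: str
    first_year: int                       # start of collecting period
    last_year: int                        # end of collecting period
    locations: List[str] = field(default_factory=list)
    taxa: List[str] = field(default_factory=list)
    label_sample: str = ""                # reference to a handwriting or label image


def possible_collectors(records, year, location):
    """Return collectors who were active in the given year and location."""
    return [
        r for r in records
        if r.first_year <= year <= r.last_year and location in r.locations
    ]


# Placeholder data only; these are not real collectors.
records = [
    CollectorRecord("Collector A", 1890, 1910, ["Location X"], ["beetles"]),
    CollectorRecord("Collector B", 1905, 1930, ["Location X", "Location Y"], ["moths"]),
]

matches = possible_collectors(records, 1908, "Location X")
print([r.name for r in matches])
```

A transcriber who can read the date and place on a label, but not the collector's name, could use a lookup like this to narrow down the candidates and then compare handwriting samples.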


Some tips for transcribing handwritten documents
Transcription can be a time-consuming and laborious process but our DigiVols can pass on the following tips to make it easier:

  • Start by taking an overview of several pages or documents and look for patterns. The author may have a specific way of writing certain letters, particularly those with ascenders and descenders.
  • Compare similar types of documents that have been written by the same person.
  • Check where a letter sits in a word (start, middle or end) and look for combinations, which may represent one or more letters.
  • Lower case letters such as a, o and u can sometimes look the same, especially when the a or o is left slightly open at the top or the u is almost closed at the top.
  • Keep in mind the possibility of abbreviations and Pidgin being used. The names of places, taxa and people as well as commonly used words can often be abbreviated. Abbreviations can be marked by symbols or some may have no symbol at all. Name abbreviations are very common and usually consist of the first three or four letters plus the last letter.
  • Double letters were often written as single letters with a line or flourish above them.
  • Often the more time you spend reading a document the clearer it will become, and it’s a good idea to come back to it after a break.
  • Physically write down what you can read of each sentence, leaving spaces for the words that you can’t figure out. This will help when trying to decipher the sentence.
  • It’s always a good idea to have a thesaurus, dictionary or reference book handy.
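
The abbreviation tip above (first three or four letters plus the last letter) is regular enough to sketch in code. The following Python snippet is an illustration only, not a tool the DigiVols use: the function names and the sample list of names are assumptions, and real labels will contain abbreviations that do not follow this pattern.

```python
def abbreviations(name):
    """Build abbreviations from the first 3 or 4 letters plus the last letter."""
    name = name.strip().lower()
    # Only keep forms that are actually shorter than the full name.
    return {name[:n] + name[-1] for n in (3, 4) if len(name) > n + 1}


def expand(abbrev, known_names):
    """Return the known names that the abbreviation could stand for."""
    key = abbrev.strip().rstrip(".").lower()
    return [n for n in known_names if key in abbreviations(n)]


# Hypothetical list of names a transcriber expects to meet.
known = ["Thomas", "William", "Robert", "Walter"]

print(expand("Thos.", known))
print(expand("Robt", known))
print(expand("Wilm", known))
```

Checking a puzzling abbreviation against a list of expected names in this way narrows the options, but the final call still rests with the transcriber and the context of the document.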