We’ve officially wrapped up our Newspaper Digitization Project. In this new blog post, we take a look at the work, which resulted in some 2,000 newspaper issues from 1938 to 1975 being uploaded to our website.
It is thanks to generous funding from Employment and Social Development Canada's New Horizons for Seniors Program that we were able to complete the newspaper digitization and uploading.
Newspapers are some of the most heavily used records here in the Archives, both in person and online. It is not only local residents who find them interesting; researchers have often travelled from afar to study the newspapers, as AMBA holds the only copies of many editions, and they are instrumental in assisting historians, family researchers and genealogists. Accessibility can be a big issue for those who live far away or cannot visit the Archives, and with COVID-19, access to our Reading Room has often been limited. For seniors, a trip to the Archives has been difficult and sometimes impossible. As they are among our most active researchers, AMBA was eager, when the opportunity for a grant project arose, to make an even greater number of historic resources available online.
Newspapers up to 1937 had already been digitized through past projects. In 2021, AMBA hoped to digitize about 40 more years of local history, and a $25,000 grant was secured through the Government of Canada's New Horizons for Seniors Program.
While there are numerous paper editions of the newspapers in our storage vault, there are also many more newspapers on microfilm. In fact, some of the only surviving copies of some editions of the papers are held on microfilm!
Microfilm is a length of film containing microphotographs: miniature images, much like scans, that can only be read with the assistance of a special reader.
The best method for digitization is to produce high-resolution scans of physical papers, but this is a costly and lengthy endeavour. Digitizing microfilm reels is not only faster and safer but also a much more cost-effective solution.
AMBA's aim was to upload as many newspapers as possible for researchers, so we decided that the reels would be the primary focus for digitization. Dozens of reels of microfilm and 8 boxes of physical newspapers were prepared, in the hope that if we could get through all of the reels, there might be some funding left over for the physical papers. Preparation work was made easier by the indexed list of newspapers previously created and maintained by our dedicated AMBA volunteers.
Of the $25,000 we received from the government for the project, the bulk of the funding was used for the digitization by an outsourced company, Image Advantage. While we have the ability to digitize many items here in the Archives, we do not have the equipment required to digitize the reels of microfilm.
Image Advantage specializes in capturing high-resolution images, and it required several months of work to process all of the newspapers.
In the end, funding covered only the digitization of the microfilm reels between 1938 and 1975: 14 reels of the Arnprior Chronicle and 9 reels of the Arnprior Guide. Image Advantage produced three different digital formats: collated PDFs for uploading and use by our researchers, as well as TIFF and JPEG files for each newspaper page, which are our high-resolution renderings for long-term preservation.
Around mid-summer, once Image Advantage’s digitization was complete, we got to work on tasks to improve the accessibility and usability of the newspapers online, with the assistance of our web-host service, Andornot.
One of the benefits of digitized newspapers is the ability to search through the text using a word or phrase, made possible by Optical Character Recognition, or OCR for short. OCR converts images of typed or printed text into machine-encoded text that the computer can read. However, OCR can only be as good as the quality of the image and text on the microfilm.
There are always some risks with digitizing from microfilm: the physical newspapers may have been in poor condition when captured on microfilm (folds, ink bleeds, missing pages, etc.), and the reels are sometimes copies of previous microfilms, which we consider second-generation duplicates. As a result, the quality of the newspapers digitized by Image Advantage varied.
The resulting OCR was poor for many editions until we were able to run the prepared PDF files through our ABBYY FineReader software.
Example of a poor OCR rendering of a page before processing with ABBYY. Most of the text is not detected by the computer (highlighted in green) and is instead treated as an image (highlighted in red).
OCR of the text after processing with ABBYY. Note how much of the text is now readable.
Even so, it is always worth looking through individual papers for information as well as searching the newspapers for keywords, as the OCR improvements were not equally successful for every page.
Once OCR was improved as much as possible within the project’s parameters, the PDFs of the newspapers were uploaded to the website for testing. Our website and archival search developer and host, Andornot Consulting Inc., then made improvements to enhance the user experience.
In addition to being able to search across all the holdings described online, there is now a search box within the updated newspaper finding aid. Some newspaper titles are also now organized by decade for ease of use, such as the Arnprior Chronicle.
Around 180 GB of digitized material and 120 hours of work later, the newest additions to our website cover nearly 40 more years of local events and news, from near the end of the Great Depression, through World War II, and into the subsequent three decades of growth and change in Arnprior and the surrounding area.
Part of our Archives' mandate is "to make important historical documents available" and thanks to this project, we have the opportunity to better serve some of our most active supporters -- our senior researchers. We hope that this project offers a new gateway to local history for everyone with internet access, regardless of where they live or their mobility. Issues beyond these digitized dates are still available to view at the Archives on microfilm.
To access the search function on the Archives’ website and the finding aids, click here.
Special thanks also to Emma Carey, Laurie Dougherty, Irene Robillard, Diane Bresson, our AMBA volunteers, the team at Andornot and the digitization team at Image Advantage.
By: Kristen Mercier, Archivist