ProQuest Digital Archiving and Access Project (DAAP)
14 Oct 2024
Colleagues from our Metadata and Discovery division of Collection Strategies have recently completed a six-year project, in collaboration with ProQuest, to digitise doctoral theses.
In this Q&A, Helen Scott (Metadata Manager for Digital and Special Collections) shares key details and discusses the outcome of their work.
What was the aim of the Digital Archiving and Access Programme (DAAP) project?
The Library worked in partnership with ProQuest on the DAAP project to improve discovery and access and ensure preservation of The University of Manchester Library’s legacy doctoral theses. As a print reference collection, it wasn’t being utilised to its full potential as a valuable research resource, but now through full-text digitisation and improved metadata and indexing, we bring together the University’s doctoral theses collection in its entirety to be discoverable and accessible via Library Search and external databases including PQDTGlobal and DARTEurope.
Tell us about the theses collection.
A collection of over 22,000 doctoral theses published between 1896 and 2010 by The University of Manchester and former UMIST was previously housed as a print reference collection at the Joule Library, North Campus. The vast collection forms a valuable body of doctoral research, covering a broad range of subjects, with an emphasis on medicine, materials science, chemistry and engineering.
Who did the project involve?
The project was initiated and led by the Metadata and Discovery division in Collection Strategies. Kathryn Sullivan, Metadata and Discovery Manager, was the project lead and Helen Scott, Metadata Manager for Digital and Special Collections, liaised with ProQuest in managing the operational running of the project with support from the Digital Metadata Team. Helen worked in collaboration with the Library’s Digital Services Innovation team who managed the preservation of digitised content in Preservica and enabled discovery and open access via Library Search.
How long did the project take and what did the process involve?
Six years, starting in 2018, with a break in 2020. For each year of the project there were three collections of 2000 theses by an external document scanning company. Each collection took several weeks of planning, checking and producing comprehensive metadata and collection manifests which would be used to create the new MARC 21 standard records, and to ensure each thesis could be checked and tracked through each stage of the digitisation process. Once scanned, PDFs were sent to the USA for quality checking, metadata enrichment and uploading to PQDTGlobal. Finally, the PDFs and metadata were returned to the Library for digital storage in Preservica. The whole process for each individual thesis took around 6 months.
How has the digitisation project affected the way researchers interact with the University's thesis collection?
Improved RDA MARC standard metadata and indexing means the collection is more discoverable and accessible, and OCR’d text enables faster and easier interrogation of content. Crucially, researchers are able to access our content instantly from anywhere in the world - outside the UK the highest numbers of retrievals for our digitised doctoral content to date are from the USA, China, Canada, Australia, and Turkey - whereas previously the collection was for reference use only and limited to those who had the time and means to visit the Library, or to pay a one-off chargeable digitisation fee via other digitisation or inter-library loan services.
What feedback have you received from faculty and alumni whose work has been digitised and made more accessible?
On a number of occasions, we’ve been contacted by alumni or faculties regarding specific titles in our collection. It has given our team great satisfaction to provide immediate access to the full-text PDF of the thesis. Recently we had an enquiry from the University’s School of Natural Sciences about theses supervised by former Professor Peter Aczel, a mathematician and leading figure in his research area of Mathematical Logic and Computing Science over the past 50 years. We quickly discovered the theses he supervised had been digitised via the DAAP project and were able to swiftly offer a list of links to full-text PDF downloads. On another occasion an alumnus from the 1970s had sadly lost his personal copy of his thesis when fleeing war in his home country, and was thankful when we were able to supply him with immediate access to the digital copy of his PhD, which was digitised as part of the DAAP project. He was grateful to the Library and told us our “prompt” and “excellent” response was “typical of Manchester”.
Can you share any interesting statistics or trends you've noticed in terms of access and usage of the digitised theses?
Usage data from the PQDTGlobal ETD Dashboard across the six years of the project shows a steady increase from around 200 retrievals per month in January 2018 to 2500 retrievals in January 2023. Usage peaks also suggest, perhaps not surprisingly, an increase around the time of the integration of PQDTGlobal and Web of Science and when the British Library’s EThOS service was affected by a cyber-attack.
Our most popular title “Online learning and learning styles: The evaluation of two learning methods in an online learning environment. Musa, Abuagila M. 2005.” has been downloaded over 1000 times since it was digitised in 2018 and was observed to be the top of the download charts early in the pandemic, perhaps reflecting the global shift towards blended and online learning.
The collection contains theses written over a hundred years ago, which continue to be downloaded and used to the present day and are viewed more widely than would have been possible when available in print-only format. Amongst our popular older titles is a thesis from 1908, “The Calcutta Plague 1896-1907, with some observations on the epidemiology of plague.” Crake, H. M. – downloaded 26 times since it was digitised in 2020.
How do you see this digitisation effort contributing to The University of Manchester's global academic reputation?
The full-text digitisation of our print doctoral theses collection, with enriched metadata and indexing, makes it easier to search, discover, and access the wealth of research contained within the collection, contributing to the Digital Library Manchester priority area for Imagine2030 and the University’s global reputation as a world-leading research university.
How can I access the doctoral theses collection now?
Full-text access to high-quality, OCR’d, downloadable PDFs is available via Library Search or through the PQDTGlobal database.