Dr Samuel Goebert PhD

CSCAN Network Research Student

Decentralised Hosting and Preservation of Digital Collections

The Internet is changing from a web of documents to a web of data. Open-data collections like Wikipedia, Internet Archive, Stack Exchange and OpenStreetMap have become important sources of global knowledge. The data are freely available and everybody is invited to contribute.

Digital collections are also preserved by entities not affiliated with the original initiatives, which involves copying the content and metadata of a collection to new storage locations. A copy retrieved from an untrusted location must have its authenticity revalidated. Since budgets are limited, novel ways of finding storage space are needed. Safely storing and validating data without the need to own and control the storage location makes this possible.

This thesis develops a protocol for decentralised hosting of digital collections. The result is a formalised, decentralised mode of discovery, curation and hosting for datasets that retains authenticity even at untrusted storage locations. Entities not affiliated with the original initiative can donate storage space and bandwidth, which at the same time ensures long-term public access to the authentic collection. The protocol builds on the BitTorrent protocol and a variation of the blockchain protocol, and is backwards compatible with existing web application architecture.

This novel approach is validated through a proof-of-concept prototype. A series of test scenarios is used to illustrate how a decentralised collection would behave given multiple participants. The results support the use of a decentralised hosting approach for digital collections leveraging storage locations that are under the supervision of entities not affiliated with the original initiative.

The thesis concludes with a detailed summary of the contributions to the field and suggests further areas of study in the context of distributed preservation.

Director of studies: Prof. Dr Bettina Harriehausen-Mühlbauer
Other supervisors: Prof. Dr Christoph Wentzel, Prof. Steven M Furnell

Towards A Unified OAI-PMH Registry
Goebert S, Harriehausen-Mühlbauer B, Furnell SM
Proceedings of the 11th IS&T Archiving Conference, pp. 97-100, ISBN: 978-0-89208-309-1, 2014

Decentralized Hosting and Preservation of Open Data
Goebert S, Harriehausen-Mühlbauer B, Wentzel C
Proceedings of the 10th IS&T Archiving Conference, pp. 264-269, ISBN: 978-1-63266-642-0, 2013

A non-proprietary RAID replacement for long term preservation systems
Goebert S, Sarti A
8th International Conference on Preservation of Digital Objects, November 1-4, 2011, Singapore, pp. 254-256, ISBN: 978-981-07-0441-4, 2011
