Preserving Digital Information
© 2006-7, H. M. Gladney. 3 May 2007
Almost all new information is created with digital tools. Much of this is distributed primarily in digital form, partly because its analog representations lose information contained in digital sources. This recent shift, part of the Information Revolution, creates demand for new functionality that includes means for preserving reliably authentic versions of authors’ works: versions that will be intelligible and/or otherwise usable many years from now.
It will surprise no one that we can solve a technical challenge with further technology. Preserving Digital Information describes a complete technical solution for preserving any digital information whatsoever.[1]
What are the challenges for preserving a digital work, and how can we handle them? We package each work of interest into a bundle containing several representations and critical provenance information, addressing the key requirements summarized in the following table.
| We must ensure that: | How we can accomplish this: |
| --- | --- |
| Some copy survives as long as wanted | Replicating each document package in independent sites, and including the document’s original form in the package |
| And any user can find a copy | Using widely available search tools |
| With sufficient authenticity evidence | Binding each document with its provenance information and sealing the package with cryptographic signatures |
| And can use it as its producer intended | Using ISO standards to represent simple objects and |
| With sufficient contextual information. | With reliable references using universally unique digital object identifiers. |
The book describes how to realize these features in the structure of a Trustworthy Digital Object (TDO) (see the figure below) and in procedures for generating and inspecting TDOs. That the solution is complete and economical is justified with precise language based on early 20th-century epistemology.[3]
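As a rough illustration of "generating and inspecting" a sealed package, the following minimal sketch binds content to its provenance, assigns an identifier, and seals the result. It is not the book's TDO format. The function names `make_package` and `inspect_package`, the JSON layout, and the field names are all hypothetical. A keyed hash (HMAC-SHA256) stands in for the cryptographic signatures the table mentions, because the Python standard library has no public-key signing; a real design would use public-key signatures so that anyone can verify a package without holding a secret. A random UUID stands in for a universally unique digital object identifier.

```python
import hashlib
import hmac
import json
import uuid


def _canonical(body: dict) -> bytes:
    # Serialize deterministically so the same body always yields the same bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def make_package(content: bytes, provenance: dict, key: bytes) -> dict:
    """Bundle content with provenance and seal the bundle (hypothetical layout)."""
    body = {
        "id": str(uuid.uuid4()),  # assumption: a random UUID as the unique identifier
        "content_hex": content.hex(),
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "provenance": provenance,
    }
    # Assumption: HMAC-SHA256 replaces a public-key signature in this sketch.
    seal = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "seal": seal}


def inspect_package(package: dict, key: bytes) -> bool:
    """Return True only if the seal and the content digest both check out."""
    body = package["body"]
    expected = hmac.new(key, _canonical(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, package["seal"]):
        return False
    content = bytes.fromhex(body["content_hex"])
    return hashlib.sha256(content).hexdigest() == body["content_sha256"]
```

Because the seal covers the provenance as well as the content, changing either one after packaging makes `inspect_package` return `False`. Replicating the resulting package to independent sites is a separate step that this sketch does not cover.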
The answer to “How can we preserve digital information?” is, of course, different from that for “How should we manage digital archives?” The archival services needed are available in any of many current digital library offerings.[4]
[Figure 32 from Preserving Digital Information]
[1] A more complete synopsis and table of contents is accessible at http://home.pacbell.net/hgladney/PDIf.pdf. A slide show summary is available at http://home.pacbell.net/hgladney/PDIslides.pdf.
[2] Using a simple Universal Virtual Computer based on the Church-Turing thesis.
[3] For instance, we must avoid confusions about “dynamic objects” and about “the original”, and need objectively testable definitions for “authentic” and for relationships among versions of a work.
[4] Modest and obvious software extensions might be needed for some content management packages.