Digital Document Quarterly

Perspectives on Trustworthy Information

Volume 8, Number 1, 1Q2009

 

 

 

HMG Consulting

Saratoga, CA 95070

©  2009, H.M. Gladney    ISSN: 1547-8610

 

 


Long Term Digital Preservation (LDP)

A DPC Technology Watch Report provides a careful description of JPEG 2000, a proposed digital image preservation standard.[1]

Trusting Digital Documents

Michael Day has summed up cultural heritage community views about trusting stored information.[2]  This and related articles do not address some key sources of risk—sources of interest to CPAs and attorneys:

(1)   End users—the people who, 100 years from now, might depend on sensitive archived documents—are being asked to trust collection custodians.  How can they judge whether such trust is prudent?

(2)   In the Trusted Digital Repositories approach,[3] the correct working of a repository is required.[4]  How can an end user decide whether or not this has been achieved, given his limited pertinent skills and limited time and energy available for the task?

(3)   What specifically is to be trusted?  The only occurrences of “trusted to” in the TRAC documents[5] are in: “They are trusted to store these valuable materials.  They are trusted to provide access to them in order to document and reveal history as well as to foster the growth of knowledge.  They are trusted to preserve these items to the best of their ability for future generations.”  There are no occurrences of “trusted by” in either report!

Trust for repositories is summed up in Moore’s phrase “a preservation environment”.[6]  In my opinion, the root of the difficulties discussed in Critique ... abstracted below is that authors have confounded managing digital repositories with a different topic: long-term digital preservation (LDP).

The tensions are illustrated by how some authors attempt to transfer the “chain of custody” concept from information recorded on paper to digital information.  Careful control of paper along lines developed over several centuries can provide credible authenticity evidence.  But similarly reliable custodial protocols have not yet been described for digital records.

Just as this DDQ number was being completed, IEEE Computer published several papers about Trust Management—too late for careful comment in the current DDQ number.

Emulation as a Digital Preservation Tool

A 2007 Netherlands meeting considered different approaches to emulation, following a 2-year project to develop a modular hardware emulator.  In a related announcement, the Koninklijke Bibliotheek (KB) described a digital repository offering more than 160,000 scientific articles.

There are two distinct approaches to emulation.  KB has focused on emulation of entire computing environments in order to provide the perpetual possibility of executing today's application programs,[7] making its Dioscuri pilot available.  However, we find:

Rothenberg proposes an emulator specification … [for which] the effort involved would be unreasonable.  Two examples … illustrate this objection: First, suppose that future generations are interested in just viewing a picture.  Then [Rothenberg] emulation still requires [one] to preserve the whole software environment for creating and modifying the picture.  Second, consider an email sent using Lotus Notes. Here, for future access the complete software system, which supports a load of other groupware tasks, would have been preserved, just for reading a simple plain text email.  Worse, application software provides just one view of the data … without direct access to the text included.  Therefore, it is impossible to transfer the raw data from the old system into a new system.  In addition, future development of emulation software just on the basis of a specification is considered extremely risky, … since the result cannot be tested by comparing it to the original hardware.  [Borghoff, page 214][8]

IBM is offering the alternative in a pilot for its Universal Virtual Computer solution.[9] 

A Critical View

The abstract of my Critique of Architectures for Long-Term Digital Preservation follows:

Evolving technology and fading human memory threaten the long-term intelligibility of many kinds of documents.  Furthermore, some records are susceptible to improper alterations that make them untrustworthy.  Trusted Digital Repositories (TDRs) and Trustworthy Digital Objects (TDOs) seem to be the only broadly applicable digital preservation methodologies proposed.  We argue that the TDR approach has shortfalls as a method for long-term digital preservation of sensitive information.  Comparison of TDR and TDO methodologies suggests differentiating near-term preservation measures from what is needed for the long term.

TDO methodology addresses these needs, providing for making digital documents durably intelligible.  It uses EDP standards for a few file formats and XML structures for text documents.  For other information formats, intelligibility is assured by using a virtual computer.  To protect sensitive information (content whose inappropriate alteration might mislead its readers), the integrity and authenticity of each TDO is made testable by embedded public-key cryptographic message digests and signatures.  Key authenticity is protected recursively in a social hierarchy.  The proper focus for long-term preservation technology is signed packages that each combine a record collection with its metadata and that also bind context: Trustworthy Digital Objects.[10]
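The message-digest half of the integrity test mentioned in the abstract can be illustrated with a minimal Python sketch.  It is not the TDO packaging format; the field names, the choice of SHA-256, and the functions seal and is_intact are assumptions made only for illustration.  A real TDO would additionally carry a public-key signature over the digest, which needs a cryptographic library beyond the Python standard library and is therefore omitted here.

```python
import hashlib
import json


def _digest(content: bytes, metadata: dict) -> str:
    """Hash the metadata and the content together so that neither can change unnoticed."""
    # Canonical JSON (sorted keys) keeps the digest stable across runs.
    meta_bytes = json.dumps(metadata, sort_keys=True).encode("utf-8")
    # JSON text never contains a raw NUL byte, so NUL is a safe separator.
    return hashlib.sha256(meta_bytes + b"\x00" + content).hexdigest()


def seal(content: bytes, metadata: dict) -> dict:
    """Bundle a record with its metadata and an embedded digest (hypothetical layout)."""
    return {"metadata": metadata, "content": content, "sha256": _digest(content, metadata)}


def is_intact(package: dict) -> bool:
    """Recompute the digest and compare it with the embedded one."""
    return _digest(package["content"], package["metadata"]) == package["sha256"]


if __name__ == "__main__":
    package = seal(b"Minutes of the 2009 board meeting", {"title": "Minutes", "year": 2009})
    print(is_intact(package))                      # unchanged package
    tampered = dict(package, content=b"Minutes of the 2008 board meeting")
    print(is_intact(tampered))                     # altered content
```

A digest alone shows only that content and metadata still match each other; anyone able to alter the record could also recompute the digest, which is why the abstract pairs digests with signatures and a hierarchy for key authenticity.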

Notice other LDP authors' near-silence in the last year.  Absent truly new ideas or challenges from others, I intend to quit writing on LDP except for dealing with referees’ issues with Critique and two other papers.[11] 

Perhaps everything that can be said about methods for LDP has been said, so that only implementation is missing.  A possible exception that came to my attention after Critique was submitted is a proposal to use multivalent architecture.[12]

EU Emulation Project

In February, a pan-European “fresh start for lost file formats” was announced.  This €4.02M Keeping Emulation Environments Portable (KEEP) project aims to create a universal emulator set: software that can recognise, play and open all types of computer file from the 1970s onwards.  As well as basic text documents, it will also let people play computer games that technology has left behind.

The project leaders assert that, “the number of unreadable documents in archives is [growing].  Britain's National Archive estimates that it holds enough information to fill about 580,000 encyclopaedias in [obsolete] formats …  The British Library estimates that the delay caused by accessing and preserving old digital files costs European businesses about £2.7bn a year.”

A brief KEEP technical description is available, together with an abstract of technology it might use.

Open Access

German TextGrid Project

TextGrid intends to create a grid for collaborative editing, annotation, analysis and publication of specialist texts for emerging e-Humanities.  It intends to integrate technologies for analyzing texts with dictionaries, lexica, secondary literature and other tools.  Its intended CommunityGrid will provide for integrating initiatives worldwide.  The project asserts that:

[P]ast and current initiatives for digitising and accessioning texts already accrued a considerable data volume, which exceeds multiple terabytes.  Grids are capable of handling these data volumes.  Also the dispersal of the community as well as the scattering of resources and tools call for establishing a CommunityGrid … for connecting the experts and integrating the initiatives worldwide.

Mining Millions of Metaphors

A San Jose Mercury News article starts by quoting Aristotle that a metaphor “is the one thing that cannot be learned from others”.  Obviously Aristotle was mistaken, since we frequently repeat metaphors that we did not ourselves invent.

A Stanford University project is using computers to analyze ancient and modern texts, mining to create a database for investigating historic patterns of word usage.[13]  Digitized libraries have put an ocean of books—including obscure ones—at readers' fingertips.  Using new data-mining techniques and "machine learning", researchers can study shifts in word usage.

Federated Search for Online Books

In early March I posted the following blog inquiry:

In recent years several giant [book] digitizations… have been mounted,[14] some by commercial enterprises and some by academic institutions.  My relatively cursory Google search has not led me to a Web service from which one can mount a search that inspects a subset of these to determine whether or not a book (or a scholarly article) is available in digital form.

I wonder, has the kind of service been programmed and made available?  … If not, perhaps some list member will take it on.   If you know of an example, I would be grateful for a pointer to it.  I suspect that other members of this distribution list would be as interested as I am.

Roy Tennant reacted with, “OCLC and others are working hard to get records for books being digitized by Google, the Open Content Alliance, [OAIster records,] and others into WorldCat”, but that “past cataloging practices can make it rather difficult to determine whether a URL … points to the full text of an item or to some lesser portion”.  Peter Noerr suggested that “nobody has built connectors (to handle the standards based and non-standard interfaces) … for "anybody anywhere in the world" coverage”.  Andrew Hankinson suggested, “[Y]ou are describing [a] "Holy Grail" for digital librarians”.

Subsequent inspection of 22 articles[15] reveals focus on service to undergraduate students uncomfortable with literature search.  As important as that might be for the college librarians who wrote these articles, it calls for aspects not needed by mature scholars—aspects that had not been on my mind when I inquired. 

Clearly I need to clarify what I’m seeking and what might help other scholars.  It follows naturally from the existence of the massive digitization projects tabulated in DDQ 7(1).[16]  What I want is a “quick and dirty” PC tool whose limitations I understand and that combines searches over major sources such as those provided by Carnegie Mellon, Amazon, Google, Microsoft, … and research libraries they are partnering with.  I would use this as a starting point for more careful searching if and when I wanted that.  The tool would allow a user to select the resources searched, and might include in its output a guess about the completeness of any object it identifies, a guess made from the sources’ own descriptions of what they provide.  It would have a front end similar to that of the SJSU and San Jose Public Library.  Finally, its help text would identify its own weaknesses, together with hints on how each of these can be overcome.
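The shape of such a tool can be suggested with a short Python sketch.  Everything in it is hypothetical: the source names, the stub catalogs, the example.org URLs, and the three completeness labels stand in for real services that would each need their own connector.  It illustrates only the two features named above, user-selected sources and a completeness guess attached to each result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    source: str        # which collection reported the item
    title: str
    url: str
    completeness: str  # guess from the source's own description: "full", "partial", "unknown"


SearchFn = Callable[[str], List[Hit]]
RANK = {"full": 0, "partial": 1, "unknown": 2}


def federated_search(query: str, sources: Dict[str, SearchFn], selected: Iterable[str]) -> List[Hit]:
    """Query only the sources the user selected and merge the results."""
    hits: List[Hit] = []
    for name in selected:
        search = sources.get(name)
        if search is None:
            continue                   # unknown source name: skip it
        try:
            hits.extend(search(query))
        except Exception:
            continue                   # quick and dirty: a failing source is ignored
    # Items believed to be complete come first, then alphabetical by title.
    return sorted(hits, key=lambda h: (RANK.get(h.completeness, len(RANK)), h.title))


def make_stub(name: str, catalog: List[tuple]) -> SearchFn:
    """Build an in-memory stand-in for a real connector (illustration only)."""
    def search(query: str) -> List[Hit]:
        q = query.lower()
        return [Hit(name, title, url, comp) for title, url, comp in catalog if q in title.lower()]
    return search


SOURCES: Dict[str, SearchFn] = {
    "source_a": make_stub("source_a", [("Don Quixote", "https://example.org/a/1", "partial")]),
    "source_b": make_stub("source_b", [("Don Quixote", "https://example.org/b/7", "full"),
                                       ("Moby Dick", "https://example.org/b/8", "full")]),
}

if __name__ == "__main__":
    for hit in federated_search("quixote", SOURCES, ["source_a", "source_b"]):
        print(hit.source, hit.completeness, hit.url)
```

Sorting by the completeness guess puts the candidates most likely to be full texts first, which matches the intended use as a starting point for more careful searching.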

In short, I’m not looking for any Holy Grail, but instead a tool to save human time and effort.  It does not even have to use machine resources particularly efficiently or be very fast.  Tom Keays suggested part of a mechanism: using the xOCLCNUM service as input for FRBR search in WorldCat, illustrated by:

For Don Quixote, there is a copyrighted edition at http://www.worldcat.org/oclc/51848364, but you can use xOCLCNUM for a WorldCat FRBR type search to return freely available ebook versions, as follows:  http://xisbn.worldcat.org/webservices/xid/oclcnum/51848364?method=getEditions&format=txt&library=ebook&fl=oclcnum,url
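The query above can be assembled from its parts, which makes it easy to repeat for other OCLC numbers.  The following Python sketch only builds the URL; it makes no network request, and whether the service still answers at this address is not assumed.  The function name editions_url and its parameter names are illustrative, while the parameter values are taken from the example URL.

```python
from urllib.parse import urlencode

# Base address copied from the example above.
BASE = "http://xisbn.worldcat.org/webservices/xid/oclcnum/"


def editions_url(oclcnum, library="ebook", fields=("oclcnum", "url"), fmt="txt"):
    """Build a getEditions query for one OCLC number (hypothetical helper)."""
    params = {
        "method": "getEditions",
        "format": fmt,
        "library": library,       # "ebook" restricts results to electronic editions
        "fl": ",".join(fields),   # fields to return for each edition
    }
    # safe="," keeps the comma in the field list unescaped, as in the example URL.
    return f"{BASE}{oclcnum}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(editions_url(51848364))
```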

Ex Libris MetaLib enables an institution to provide the metasearch I have in mind, but does so from a Web server rather than as the kind of Web browser tool that I could customize for myself.

I subsequently found promising offerings, and am starting to test Google Custom Search Engine.  Its user guide includes an HTML program fragment for embedding its interface into a personal Web page.  A later DDQ number will report personal experience.

NSF Digital Library Ignored

An article by Jeffrey Mervis suggests that the NSF Digital Library is a practical failure.[17]

The National Science Foundation (NSF) has spent roughly $175 million over the last nine years "to provide organized access to high quality resources and tools that support innovations in teaching and learning at all levels" … [in] its Digital Library (NSDL) program … to make potentially useful Web content easy to find and classroom-customizable.  NSF funded "core integration" groups at Columbia University, Cornell University, and the University Corporation for Atmospheric Research to run the main portal.  The agency then invited the community to compile collections and [funded] 13 Pathway portals …  [However] … academic researchers contribute to digital libraries infrequently.  [A] … survey found that scientists tend to perform their own searches … [and] are heavy Google users.

A new service, ResearchGate, attempts to expedite scientific communication, asserting that:

Instead of disseminating scientific results in regularly scheduled and printed journal issues, now a continuous release of articles in online format will change and expedite the way new results are spread.  Without anonymous review processes, open access journals or wiki-like concepts will assure the quality of science. Hidden conglomerates of various interests will give way to transparent and traceable new concepts of scientific impact measurements. Science is collaboration, so scientific social networks, wikis and other means of collaboration will facilitate and improve the way scientists collaborate.  

Epistemology

Objectivity

Readings in the philosophy of language have gradually made me hypersensitive to objective/subjective distinctions in assertions.  It seems almost impossible to overemphasize these.  Of course, scientific method has long explicitly emphasized empirical observation.  What’s new for me is asking whether I understand what a speaker has in mind even when I am not doing science.  Daston and Galison write:[18]

skeptics … are right to assert a wide gap between epistemological precept and scientific practice, even if the two are correlated.  Epistemology (of whatever kind) advanced in the abstract cannot be easily equated with its practices in the concrete. 

If training a telescope on large, remote causes fails to satisfy, what about the opposite approach, scrutinizing small, local causes under an explanatory microscope?  The problem is mismatch between the heft of explanandum and explanans, rather than distance between them: in their rich specificity, local causes can obscure … wide-ranging effect that is our subject here.  Local circumstances that may seem to lie behind, for example, a change in surgical procedures in a late Victorian London hospital, are missing in an industrial-scale, post-Second World War physics lab in Berkeley.  [A]nd yet in both cases a similar phenomenon is at issue: … how to handle automatically produced scientific images.

Quine comments:[19]

… ontology, or the values available to variables.  [W]e can go far with physical objects.  They are not, however, known to suffice.  … we do not need to add mental objects.  But we do need to add abstract objects, if we are to accommodate science as currently constituted.  Certain things we want to say in science may compel us to admit … not only physical objects but also classes and relations of them; also numbers, functions, and other objects of pure mathematics.  For, mathematics—not uninterpreted mathematics, but genuine set theory, logic, number theory, algebra of real and complex numbers, differential and integral calculus, and so on is best looked upon as an integral part of science, on a par with the physics, economics, etc., in which mathematics is said to receive its applications.

Philosophy that requires mental objects is sometimes called “psychologism”.

Bafflement about Zeno’s Paradox

I have long been puzzled, not by the paradox itself, but rather that the ancients did not solve it even with their limited mathematics.  Recall that, in condensed form, the Achilles and the Tortoise paradox reads:[20]

Before Achilles can catch the tortoise he must reach the point where the tortoise started.  But in the time he takes to do this the tortoise crawls a little further forward.  So Achilles must next reach this new point. But while Achilles achieves this, the tortoise crawls a tiny bit further.  And so on … Achilles has an infinite number of finite catch-ups to do before he can catch the tortoise, and so, Zeno concludes, he never catches the tortoise.

The incorrect reasoning is exposed by the word ‘never’.  Zeno and his critics might simply have asked how long the race could last.  Suppose that, at the start of the race, the Tortoise had a 100 meter lead, that Achilles’ speed was 6 meters/second (slow by modern standards), and the Tortoise’s speed was 6 millimeters/second.  In 17 seconds Achilles will have covered 102 meters, while the Tortoise will have reached only 100.102 meters, leaving it 189.8 centimeters behind.

Thus, Zeno might have discovered[21] that ‘never’ would be shorter than 17 seconds.  This absurdity could have alerted the ancients to the fact that the sum of an infinite series can be finite!
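The arithmetic can be checked with a few lines of Python, offered as an illustrative sketch rather than as part of the historical argument.  It uses the lead and speeds assumed above; the function name catch_up_stages and the choice of ten stages are arbitrary.  Each stage is the time Achilles needs to reach the Tortoise's previous position, and the stage durations form a geometric series whose sum approaches the closed-form catch-up time, lead / (speed difference), which is about 16.68 seconds.

```python
from fractions import Fraction

LEAD_M = Fraction(100)          # Tortoise's head start, in meters
ACHILLES = Fraction(6)          # Achilles' speed, meters/second
TORTOISE = Fraction(6, 1000)    # Tortoise's speed, meters/second (6 mm/s)


def catch_up_stages(n):
    """Durations of Zeno's first n catch-up stages, in seconds."""
    gap = LEAD_M
    durations = []
    for _ in range(n):
        t = gap / ACHILLES      # time for Achilles to reach the Tortoise's last position
        durations.append(t)
        gap = TORTOISE * t      # distance the Tortoise crawled in the meantime
    return durations


# Exact catch-up time: the gap closes at the difference of the two speeds.
closed_form = LEAD_M / (ACHILLES - TORTOISE)

# Partial sum of the infinite series after only ten stages.
partial = sum(catch_up_stages(10))

# Positions after 17 seconds, measured from Achilles' starting point.
achilles_at_17 = ACHILLES * 17
tortoise_at_17 = LEAD_M + TORTOISE * 17
gap_at_17 = achilles_at_17 - tortoise_at_17

if __name__ == "__main__":
    print("catch-up time (s):", float(closed_form))
    print("sum of 10 stages (s):", float(partial))
    print("Achilles' lead after 17 s (m):", float(gap_at_17))
```

Exact fractions are used so that the partial sums can be compared with the closed form without rounding error; every partial sum stays below the closed-form value, and that value is below 17 seconds.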

Patterns in the Information Revolution

DDQ focus is planned to shift from preservation to evaluating three interrelated propositions: 

(1)   That the fundamental principles underlying the "Information Revolution" were mostly worked out between 1850 and 1960;

(2)   That laymen (vis-a-vis science) can achieve comfort with modern and evolving information management by absorbing surprisingly few ideas from the work just alluded to; and

(3)   That the single notion of a "pattern" is an effective tool for comprehending most of the critical ideas underlying the Information Revolution.

A start in these directions is implicit in Part II of Preserving Digital Information.

News

A January editorial article asserts the urgent need to fix federal archiving policies.  An interagency working group has just recommended that the United States develop a strategic policy for preserving and making scientific information accessible in a world in which data increasingly is born, stored and used in digital formats.

2008 was a down year for the economy.  However, according to PC World, it was a banner year for unfounded technology rumors.

At the end of the first week of April, negotiations for IBM’s purchase of Sun Microsystems broke down, perhaps only temporarily.  The stock price of Sun promptly dropped 23%, which probably did not please Sun owners.

Upgrading Tools to Verify Digital Records

As computers become faster and malefactors become more vigorous, old cryptographic tools might no longer be sufficiently secure.  Some months ago, the National Institute of Standards and Technology mounted a competition for new hash algorithms.

Guide for Vetting Charities

“After Madoff, Donors Grow Wary of Giving,” writes the Wall Street Journal, “but you can spot red flags before you write out a check.”

Recommendations for Reading, Listening, or Viewing

Mechanized computing is at least 2500 years old.  Read about the Antikythera Mechanism.

Learn about medical sciences from the videos in the Charlie Rose Science Series.

Learn about computer software architecture from Grady Booch.

In mid-2008, the Scientific American Book Club offered a promotional collection of books, each about one of the most interesting numbers, among them zero, π, and e.  An excerpt follows:

Attempting to convince someone whose mind is already made up is difficult.  In Mathematical Cranks, Underwood Dudley tells of a cyclometer who wrote that “π’s only position in mathematics is its relation to infinite series [and] π has no relation to the circle.  ... Lindemann proclaimed the squaring of the circle impossible; but Lindemann’s proof is misleading for he uses numbers (which are approximate in themselves) in his proof.”  How can you argue with that logic? [22]

Like most scientists, I know that five of the most important numbers are related by a single equation,

e^{iπ} + 1 = 0,

but am nevertheless fascinated by this fact.

J.L. Synge: Science: Sense and Nonsense

This little book[23] is an articulate and amusing reminder of differences between common language usage and scientific language, as illustrated by:

A dignified scholar emerges from the [Alexandria] Library with a roll of manuscript under his arm.  The boy runs up to him.  “Beg pardon, Sir.  What is geometry?” asks the boy.

“That is a long story,” replies the scholar. “Perhaps you will come to the University some day, and you will learn what geometry is.”

The boy makes a face. “Please, Sir,” he pleads, “I want to know now.  I know it would take a long time to learn all about it, but can't you tell me just a little bit now?”

By this time the scholar is very much impressed by the eagerness of the boy.  “Well,” he says, unrolling his manuscript, “I'll read you the first sentence. …  ‘A point is that which has no part.’”

There follows a dead silence. 

“Please, Sir,” [the boy] asks, “has a point got a smell?”  

        [The boy continues, asking whether a point has colour, shape, ability to speak, and so on.  He finally sums up angrily.]

“I think you are a dishonest man.  You tell people that a point has no part, but you don't tell them that a point has no smell, no colour, can't talk, can't hear.  Why do you bother to say that a point has no part, when there are so many other things it hasn't got?  Tell me that.”

Ernst Cassirer: Kant’s Life and Thought

All subsequent epistemological work is influenced by Immanuel Kant’s writings—particularly by his Critique of Pure Reason.  Ernst Cassirer, a famous philosopher in his own right, provided Kant’s Life and Thought early in his career.[24]  An excerpt suggests its tone:

Kant's initial years at the university, to judge by the slight information about them that has been preserved, are also significant more for this education of the will than for the knowledge furnished him in the regular course of lectures.  In Prussia at this time, school and university supervision were still barely distinct from each other.  As late as 1778, under the reign of Frederick the Great, a ministerial edict was promulgated to the professors of the University of Konigsberg expressly forbidding the free organization of academic instruction and demanding the closest adherence to prescribed textbooks, on the grounds that the worst compendium was better than none at all.  The professors might, if they possessed sufficient wisdom, emend the author, but the reading of their own dictata was [prohibited].  Moreover, the syllabus for each subject was laid down in detail, and …

World's Greatest Keyboard

Engineers enjoy outstanding designs.  An example is “the World’s Greatest Keyboard”.  Edwards writes, “From the satisfying click of its keys to its no-nonsense layout and solid steel underpinnings, IBM's 24-year-old Model M is the standard by which all other keyboards must be judged.”  See a slide show.

I’m a fan, using versions extended with the Trackpoint, and piling up in my garage newer keyboards received as part of PC purchases.

Practical Matters

Pegoraro provides a critique of income tax return programs.  The two most prominent offerings create different tax estimates.

10 Minute Mail is a free service that creates a temporary e-mail address, so that signing up for some service does not bring unwanted solicitations as a side effect.  Ten minutes should be long enough to sign up, receive a confirming e-mail, and send a "yes, it's really me" message.  Then the address evaporates.  A slow typist can add an additional ten minutes to the e-mail account life.

I have not tried it, as I suspect that it requires one to use an e-mail client, which I don’t need with Gmail—my preferred service.  However, I achieve similar filtering with a Gmail address that I use only for service sign-ups, ignoring all other incoming traffic.

PC World explains how to create an ad hoc network for information transfer among PCs.

A network hard disk offers easy backup, albeit only supporting recent operating system versions.

Telephone Enhancements and Free Services

It’s hardly a secret that the landline telephone business is threatened by wireless telephony.  Companies are scrambling to provide novel wireless services and handset features.  Some of these are free services funded by sometimes intrusive advertising.  Recent announcements include:

·      Free phone calls from your browser using GizmoCall, CallingAmerica, and perhaps other offerings.

·      Renting a Blackberry for the road.

·      A Google service to redirect an incoming call to several telephone numbers and provide options for handling an accepted call.

·      A handset design combining features from today’s smartphones.

Product Origins from Bar Codes

Product origins can be determined from Universal Product Codes (aka “bar codes”), albeit not always reliably.  The first three digits usually identify the country of manufacture, as follows:

Prefix     Country
00-13      USA & Canada
30-37      France
40-44      Germany
49         Japan
50         UK
57         Denmark
64         Finland
76         Switzerland & Liechtenstein
471        Taiwan
480        Philippines
628        Saudi Arabia
629        United Arab Emirates
690-695    People’s Republic of China
740-745    Central America
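The table lends itself to a simple lookup.  The Python sketch below encodes only the prefixes listed above and is an illustration, not a complete or authoritative registry; the function name origin is hypothetical, and the code assumes a 13-digit EAN-style number in which two-digit table entries stand for every three-digit prefix beginning with those two digits.  As noted above, the result is not always a reliable indication of where a product was made.

```python
# (low, high, country): inclusive prefix ranges copied from the table above.
PREFIX_TABLE = [
    ("00", "13", "USA & Canada"),
    ("30", "37", "France"),
    ("40", "44", "Germany"),
    ("49", "49", "Japan"),
    ("50", "50", "UK"),
    ("57", "57", "Denmark"),
    ("64", "64", "Finland"),
    ("76", "76", "Switzerland & Liechtenstein"),
    ("471", "471", "Taiwan"),
    ("480", "480", "Philippines"),
    ("628", "628", "Saudi Arabia"),
    ("629", "629", "United Arab Emirates"),
    ("690", "695", "People's Republic of China"),
    ("740", "745", "Central America"),
]


def origin(code):
    """Return the country listed for a bar code's leading digits, or None if unlisted."""
    digits = code.strip()
    if len(digits) < 3 or not digits[:3].isdigit():
        return None
    for low, high, country in PREFIX_TABLE:
        head = digits[:len(low)]
        # Equal-length digit strings compare in numeric order.
        if low <= head <= high:
            return country
    return None


if __name__ == "__main__":
    for sample in ("4712345678901", "4901234567894", "9991234567890"):
        print(sample, origin(sample))
```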

Purchasing an HDTV Monitor

San Jose Mercury News offers a good guide for what to look for in a high-definition television.

Studio-Quality Audio

The AVS Audio Editor reads audio CDs almost perfectly.  It is free for non-commercial use.

Browser Enhancers

2008 brought us many free tools to speed browsers to helpful information.  Some that I adopted are:

·      Enhancements to Mozilla Firefox, keeping it ahead of Windows Internet Explorer.  See a review.

·      Xmarks (FoxMarks renamed) for synchronizing browser bookmarks across multiple PCs.

·      FreeDownloadADay weekly e-mail service alerting recipients to new services.

·      ToRead 2-click service to send the content of any Web page to your e-mail.

·      Interclue summary information about Web links.  Hover your mouse pointer over the link, and an icon will appear.  Rest your mouse on the icon to see a linked-page summary.

The Wall Street Journal discussed enhancing Web search efficiency with Surf Canyon and Google’s SearchWiki.

Virtual Machines and Virtual Desktops

You have perhaps noticed an upsurge of virtual machine software offerings.  I am evaluating some to choose among, both for Windows XP (I have tried and rejected Windows Vista) and for Ubuntu Linux.  See a comparative review of Sun xVM VirtualBox and VMware Server.

Virtual Desktops seem to provide a subset of virtual machine services—specifically the ability to see/manage different applications in separate spaces with rapid switching among spaces—a service for MS Windows similar to what is a native part of Linux Ubuntu with KDE.  (A virtual desktop is simply a desktop view that displays only selected windows.)  WindowsPager enjoys positive reviews.  The Fences screen organizer complements it nicely.

Souped-Up Scanner Reads Books Aloud

Plustek has created a scanner for the vision-impaired that reads to you.  Just plunk a novel on the platen, punch a button, and relax to the dulcet sounds of a computerized voice reading aloud.  The buttons and power switch are marked in Braille.  Watch a video (in Japanese, but you can turn off the sound and still get the gist) of the Plustek BookReader in operation.  The scanner also produces MP3s or WAV files that you can listen to at a later time, saves images, and produces PDF files of scanned text.

Plustek scanners have a specially designed edge and lamp that allow "zero edge" scanning.  (The machine can scan right up to its edge where the book spine is placed.)  The book pages are completely flat on the glass, thereby avoiding book spine shadow and distorted text.

The obvious use is for vision-impaired users although it will also work well as a normal scanner.  It is not inexpensive (approx. $600), but within range for private purchases.

Price Watch

Item             Description                                                Price    Unit
HD TV            Samsung 50” 720P Plasma                                    $1310    each
HD TV            Samsung 32” 720P LCD                                       $555     each
HD TV            Samsung 52” 1080P LCD                                      $2280    each
DVD dual layer   Sony 20x internal dual-layer DVD -/+RW drive               $39      each
LCD Monitor      Acer 23” 1920x1080, 40000:1 contrast, 5 ms. response       $240     each
LCD Monitor      Hyundai 19” 1280x1024, 1000:1 contrast, 5 ms. response     $130     each
LCD Monitor      Hyundai 21.6” 1680x1050, 2500:1 contrast, 5 ms. response   $150     each
LCD Monitor      Acer P241W 24” 1920x1200, 3000:1 contrast, 5 ms. response  $330     each

 



[1]     Robert Buckley, JPEG 2000—a Practical Digital Preservation Standard, DPC Report, Feb. 2008.

[3]     Research Libraries Group, Trusted Digital Repositories: Attributes and Responsibilities, May 2002.

[4]     The practical interest of most users will not include most repository contents, but only the authenticity of very few records.

[5]     Loc. cit. endnote 3.  Also RLG-NARA Digital Repository Certification Task Force, Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist, 2007.

[6]     R. Moore, Towards a Theory of Digital Preservation, Intl. J. Digital Curation 3(1), 63-75, 2008.

[7]     Jeffrey van der Hoeven, Bram Lohman, and Remco Verdegem, Emulation for Digital Preservation in Practice: The Results, Intl. J. Digital Curation 2(2), 2007.

[8]     Uwe M. Borghoff, Peter Rödig, Jan Scheffczyk, and Lotar Schmitz, Long-Term Preservation of Digital Documents: Principles and Practices, Springer Verlag, 2006, ISBN 978-3-540-33639-6.

[9]     J. R. Van Der Hoeven, R. J. Van Diessen, K. Van Der Meer, Development of a Universal Virtual Computer (UVC) for long-term preservation of digital objects, J. Info. Sci. 31(3), 196-208, June 2005.

[10]    H.M. Gladney, Preserving Digital Information, Springer Verlag, 2007.

[11]    The slow publication schedules of the target periodicals suggest that this will not be complete until late 2009. 

[12]    Thomas A. Phelps and P.B. Watry, A No-Compromises Architecture for Digital Document Preservation, Proc. 9th European Conf. on Research and Advanced Technology for Digital Libraries (ECDL 2005), September, 2005.

      T.A. Phelps, Multivalent Documents: Anytime, Anywhere, Any Type, Every Way User-Improvable Digital Documents and Systems, Ph.D. Dissertation, University of California, Berkeley, 1998.

[13]    Brad Pasanek and D. Scully, Mining Millions of Metaphors, 2007.

[14]    Brewster Kahle provides an up to date summary of the Economics of Book Digitization.

[15]    Christopher N. Cox, Federated Search: Solution or Setback for Online Library Services, 2008.

[16]    In May 2008, Microsoft closed their Live Search Books and Live Search Academic services.  The project had scanned 750,000 books and indexed 80 million journal articles.

[17]    Jeffrey Mervis, NSF Rethinks its Digital Library, Science 323(5910), 54, January 2009.

[18]    Lorraine Daston and Peter Galison, Objectivity, 2007, ISBN 978-1890951788.

[19]    W.V.O. Quine, The Ways of Paradox, 1975, ISBN 978-0674948378, Ch. 22: The Scope and Language of Science.

[20]    Adapted from the Stanford Encyclopedia of Philosophy.

[21]    Notice that the reasoning that follows uses only concepts already known 2400 years ago.

[22]    David Blatner, The Joy of π, 1997, ISBN 0-8027-7562-7.

[23]    J.L. Synge, Science: Sense and Nonsense, 1951.

[24]    E. Cassirer, Kant’s Life and Thought, Yale U.P., 1981, ISBN 0-300-02982-9.