Digital Document Quarterly

Perspectives on Trustworthy Information

Volume 5, Number 4, 4Q2006

 

 

 


HMG Consulting

Saratoga, CA 95070

©  2006, H.M. Gladney

 

ISSN: 1547-8610

Editorial Comment

DDQ 5(4) offers criticisms of apparently common scholarly practice—criticisms broader and more tenuous than have appeared in prior DDQ numbers.  These views suggest aspects that readers might reflect on.  To the extent that they seem inappropriate, I would be most interested in refutations, either as private correspondence or in the form of short arguments that DDQ could publish in later numbers.

Some of the content, particularly that about inattention across professional boundaries, is stimulated by missing evidence that I would expect to be present, as in the case of the dog’s barking in Arthur Conan Doyle’s The Adventure of Silver Blaze.  (What was significant to Sherlock Holmes was that the dog had not barked.) 

It seems to me that C.P. Snow’s Two Cultures difficulties are still with us,[1] and have impeded digital preservation progress.  From one side of the divide, I offer a perspective of differences of approach that have hampered productive collaboration.  My participation in a 1996 panel discussion stimulated this line of thinking.  Readers will see the adverse influence of The Two Cultures gap throughout the current DDQ number.

Digital Preservation

On Designated Communities

In December, TechWorld reported that the European Union has funded a multi-country digital preservation project called PLANETS (Preservation and Long-term Access through NETworked Services), and that a participants’ team has assembled itself.[2]  The DPE (Digital Preservation Europe) website reports a November PLANETS partners’ meeting.

Among the topics discussed in this meeting, the importance and role of a collection’s “Designated Community” received attention that puzzles me.  This is not because anything said is surprising or controversial, but rather because it has long been normal practice for each library to identify such a community as part of its mission statement.  I would have been interested in explicit distinctions between traditional library practices and aspects that are new and challenging for digital collections, but none were emphasized in the meeting report.

NDIIPP at Mid-Point

“One of the chief tasks of NDIIPP is to identify and provide for all the barriers to progress in digital preservation. The most salient are those caused by the rapid changes in technology.  Frustrations are shared by industry and collecting institutions alike over the multiplicity of formats, rapid technological changes, and hardware and software obsolescence that plague the new information technologies.”[3]

Recent reports remind us that the [U.S.] National Digital Information Infrastructure and Preservation Program (NDIIPP)[4] has reached its midpoint, suggesting that critical evaluations are appropriate, with a view to mid-point course changes.[5]  The next D-Lib Magazine number will contain my critique of technical aspects of NDIIPP reports.[6]  The abstract of Digital Preservation in a National Context: Questions and Views of an Outsider reads:

“This article draws attention to technical opportunities which, if pursued, would significantly accelerate National Digital Information Infrastructure Preservation Program (NDIIPP) progress towards objectives called for by the U.S. Congress.  It also identifies concerns about apparent content scope limitations of the NDIIPP plan.

“A solution is known in principle for every difficult technical problem of digital preservation, including all those identified in NDIIPP publications.  They and other works correctly assert that non-technical preservation challenges are greater than technical ones, but do not discuss using technology to reduce non-technical obstacles.  Available technical choices show that some apparent preservation challenges are not obstacles after all.

“If document representations and network protocols are standardized, then each archive can autonomously adapt itself to its own institutional environment.  Thinking about what end users will want led my colleagues and me to approach the challenge differently than most other authors.   We focus on information contributors and readers instead of on the work of repository employees.  We design document representations instead of new repository methodology.  We treat each repository as a “black box” whose internals can be adapted to local needs instead of discussing sharable repository.”

Compared to the pace of R&D progress expected in the private sector, at least in the Silicon Valley environment that I am familiar with, progress in the technical components of the NDIIPP project is disappointingly slow.[7]

Three management shortfalls seem to contribute to what disappoints me.  (1) NDIIPP has not effectively exploited the skills of software engineers.  (2) NDIIPP has not established productive collaboration with IT enterprises.  There is almost no private sector work on digital preservation.  (3) The NDIIPP digital content scope has not included documents of practical interest such as public infrastructure engineering records, health care records, legal records, and many other record classes that citizens value for their daily lives and futures.

East of England Digital Preservation Project

A recent report describes a pilot project investigating the issues and costs of potential regional digital repositories.  Taking as its starting point the anticipated needs of local authorities, the report looks in detail at the processes and costs involved in preserving and managing digital records of the types routinely dealt with by local authority records managers and archivists, including privately deposited material.

Many of the challenges have to do with ingestion of proffered record collections whose preservation has not been anticipated, whose current formats are problematical, and whose metadata are seriously incomplete.  Communication between archive personnel and collection owners unfamiliar with the technology and jargon of digital collections is another difficulty needing attention.

Not mentioned, but worth thinking about, is the extent to which these challenges are transitory effects of the novelty of digital records—effects that will vanish when our children take over in 20 years or so.[8]

A Misleading Analogy: Paper and Digital Preservation

“[A]s we approach the end of the twentieth century, we find ourselves confronting … a vast void of knowledge filled by myth and speculation.  Information in digital form—the evidence of the world we live in—is more fragile than the fragments of papyrus found buried with the Pharaohs.    [T]o achieve the kind of information density that is common today, we must depend on machines that rapidly reach obsolescence to create information and then make it readable and intelligible.” [9]

“[D]igital objects such as electronic journals are not only mutable but can also be modified or transformed without generating any evidence of change.  It is the mutable nature of digital information objects that represents one of the principal obstacles to the creation of archives for their long-term storage and preservation.” [10]

Pessimism about digital preservation is sometimes accompanied by comparison of the durability of paper to that of digital information.  That printed works are inherently immutable is a professional myth.

The myth is repeated by James Billington, the Librarian of Congress, in a September 2006 Atlantic Monthly article.[11]  This is surprising, since earlier in the year Deanna Marcum, an Associate Librarian of Congress, emphasized that “Only a fraction of what the ancient world committed to papyrus has come down to us.”[12]  Even though nobody seriously proposes saving heritage materials forever on today’s digital media, the comparison has been made often enough to warrant cautioning readers that the analogy is misleading.

Paper is mutable—easily burned, easily torn, easily cut, and easily overwritten.  However, four facts about information on paper are reliable guides to a digital preservation solution.  (1) We are usually more interested in inscribed content patterns than in paper artifacts for themselves.  (2) We protect printed information with immense infrastructure that includes widely dispersed libraries with redundant holdings.  (3) It took us many years to learn how to preserve reliably on paper.  And, (4) changes to information on paper can be detected, often easily.

Digital data has an advantage over most other artifacts: bit-string patterns do not decay.  We know how to make any bit-string as useful perpetually as it is today.[13]  Even if better methods were to be invented, if we save original bit-strings together with convenient transformed versions, we could create replacement versions of today’s OAIS AIPs (Archival Information Packages).

What we expect today of saved information is much more demanding than ever before, including at least ease of reading, ease of finding, very rapid access to portions of a vast information corpus, extremely high quality and fidelity that sometimes should include evidence of authenticity, and quality of references and links.  Why these and other factors make the papyrus-to-digital comparison misleading is analyzed in a forthcoming publication.[14]

Digital Preservation of a Different Sort

Ray Kurzweil, author of The Singularity is Near, suggests that computers will enable people to live forever.  He predicts that non-biological intelligence will allow humans to overcome illness and aging in just 25 years, and that scientists will develop machines surpassing human intelligence.  He says, "We won't experience 100 years of technological advance in the 21st century; we will witness … about 1,000 times greater than what was accomplished in the 20th century."

Inattention across the Boundaries of Professional Disciplines

“The bed-rock of research in this area is to understand in more detail the sociology of preserving and sharing information.  This will include understanding better disciplinary differences, and in particular those requirements that are fundamental versus those that are primarily historical.  For a cultural change to take place, it is important to involve key stakeholders and resource providers and for them to drive this process.”[15]

The digital preservation literature contains repeated calls for cross-disciplinary cooperation.  However, inattention across professional boundaries remains a sad tradition, evident once again.  Each academic community behaves as if what is not represented in its own literature does not exist.

The most amazing twentieth-century development is the unprecedented success of science and technology.  This has been fostered by scientific methodology that includes lively constructive criticism and problem partitioning, with each contributor being confident that aspects he does not address will be handled by others.  Such partitioning is rooted in philosophical analysis starting with Leibniz and Descartes and represented by today’s analytical philosophy.  Getting partitioning right is not easy; false starts are resolved by self-evident utility of successful partitioning.  A great merit is that work, once done, need not be repeated (except sometimes to validate experimental results and applied logic).  I believe that this scientific methodology should be used more extensively by information scientists than seems to be the case.

For my writings on digital libraries and digital preservation, I have inspected over 600 articles written by librarians, archivists, and university information scientists[16]—an informal group sometimes called the cultural heritage community.[17]  This literature has surprisingly few citations to ACM and IEEE articles by software engineers.  This is unfortunate, because the ignored literature contains solutions to technical issues grappled with in the articles alluded to.  Such inattention has permitted, and continues to permit, wastage of public funds.[18]

The literature from the digital heritage community rarely considers the business climate that influences the tools available to it.[19]  The technology that creates today’s excellent access to information for more people than ever before is mostly created by private enterprise, whose rules of engagement emphasize responsiveness to markets.[20]  Unfortunately, industry is unlikely to see cultural heritage repositories as promising customers.  They are simply too few and small, with digital collections smaller than business collections for the foreseeable future.

There is a mismatch—a semantic dissonance—between the language and expectations of cultural heritage community spokespersons and technology vendors.   The current emphasis for technology products seems to be on system components, whereas cultural repositories want customizable “solutions”.

Technology vendors’ work on “solutions” is mostly in the custom contract business, which they call “services” and which is an immense business sector.  Insights and design successes in this area are not published, but rather treated as marketplace advantages that companies nurture, hone, and propagate internally.  This phenomenon contributes to another cultural mismatch: academic librarians seem emotionally and practically unprepared to use outside services.  Their institutions are not financially prepared for outsourcing work, even though they do not seem to have sufficient internal skills to build the middleware components of repository services.[21]

A Two Cultures Model: Stylistic Differences in R&D

My analysis of NDIIPP technical progress brought the Two Cultures rift to my attention more strongly than ever before.  The misunderstandings and intolerance which C.P. Snow described continue to be widespread, and to hamper progress for efficient and effective digital preservation.

The tension is evidenced by differences in writing style between what I have read addressing digital repositories, most of which comes from authors with liberal arts backgrounds, and the physical science and engineering literature with which I have worked from my undergraduate days.  I find the information science literature difficult in that its articles rarely differentiate their novel elements from ideas already published.  In the most influential technical periodicals this difficulty is precluded by expert referees’ demands for clear identification of what is new and for thorough citation of prior literature.

A likely contributing factor is the last 50 years’ increase in the number of university faculty members and “publish or perish” expectations.  The number of new ideas does not seem to have matched the rush of publications.  Critics of scientific literature have pointed to “slice and dice” behavior in which each piece of research is parceled into as many small articles as possible.[22]  I have the impression that information scientists meet the economic imperative by inattention to prior work, repeating what can be found elsewhere.  A consequence has been a large increase in the number of periodicals (and financial pressure on academic libraries).  At least in the sciences and engineering, I believe that most of the new periodicals can be ignored with little risk.[23]

In Wm. Lefurgy’s 2005 NDIIPP presentation, he reminded an ARL audience that there was “still no ‘silver bullet’ solution to digital preservation.”  This repeats earlier authors’ assertions that there would be no single digital preservation solution—a “straw man” assertion.[24]  No good engineer would ever talk about a potential “single solution,” because the phrase has no objective meaning.  (The distinction between simple and compound is entirely subjective, having to do with a speaker’s choice of the level of detail for discussion.)

These impressions from the literature and from personal interactions with members of the cultural heritage community are summarized in the following table of stylistic differences.

| Aspect | Cultural Heritage Community | Content Management: Scientific, Engineering, and Medical Communities |
| --- | --- | --- |
| Collegial | Values consensus more highly than criticism and debate | Values criticism and debate as methodology for progress |
| Working relationships | Emphasizes collegial and institutional collaboration and synergy | Emphasizes independent thought and competition |
| Breadth and depth | Emphasizes global discussions of topic at hand | Emphasizes “in depth” investigation of key topical aspects |
| Didactic | Combines research reporting with advice for newcomers to the topic | Separates research articles from textbooks and teaching materials |
| Subjective / Objective divide | Happy to confront subjective matters of opinion squarely | Focuses on objective topics that can be empirically tested[25] |
| Philosophical basis | Continental philosophy; Cassirer’s “expressive perception”[26] | Analytical philosophy; Carnap’s “purely structural descriptions”[27] |
| Problem attack | Emphasizes relationships among distinct components | Emphasizes partitioning and approximation, with later corrections |
| Typical reaction to a practical challenge | Recommends organizational or personal behavior; often normative | Builds tools and makes them available for user criticism; iteratively refines these |
| Mathematical models | Rarely employs mathematics except for elementary statistics | Uses mathematical models to articulate physical laws and engineering designs |
| Key standards and conventions | OAIS, METS, MARC, … | ODF, MPEG-21, various XML, Unicode, various Java, JSR 170, … |

Readers might reasonably believe that this table is biased towards opinions of a scientist, that it is unrefined and incomplete, and that the information samples from which it is drawn are too small and narrow.  I wonder myself whether or not refinement and base broadening would suggest it to be an appropriate description of widely seen differences between information scientists and physical scientists.  If so, a next question would be, “What do these differences suggest about the likely evolution of information science?”

Some readers might protest that the differences merely reflect that information science is a new discipline—one that will achieve similar rigor to the physical sciences in a decade or two.  I’m skeptical about this because the style of technical literature seems to have been rigorous from its earliest days, or at least since science and engineering emerged as distinct disciplines over a century ago.

My opinion is influenced by the history of philosophy, which was the history of most scholarship until the nineteenth century.  A relatively modern influence is the rift between Continental philosophy and analytic philosophy, which happened of its own accord, but was sharpened by events in the run-up to World War II.[28]  An older influence is Immanuel Kant’s attempt to make his philosophy “scientific”—an effort emulated as later thinkers created what today is called analytic philosophy.[29]

I need to emphasize that nothing written above is intended to be an unqualified recommendation of scientific methodology or scientific style.  In fact, the twentieth-century success of science and engineering depends on careful limitation of scientific methods to focused technical challenges.  Within that limit, it would much improve the quality of digital preservation literature if its authors used scientific methodology for its technical components.

A New U.K. Research Emphasis: Memories for Life

British research funding agencies have for some time selected “grand challenge” research topics that are (presumably) favored for funding.  In 2006, “Memories for Life” was identified as such a topic.  Its intent, together with citations of key articles, is described in a Royal Society paper: Kieron O’Hara et al., Memories for life: a review of the science and technology.  The abstract includes:

“Recent developments in our understanding of memory processes and mechanisms, and their digital implementation, have placed the encoding, storage, management and retrieval of information at the forefront of several fields of research.  At the same time, the divisions between the biological, physical and the digital worlds seem to be dissolving.  Hence, opportunities for interdisciplinary research into memory are being created, between the life sciences, social sciences and physical sciences.  Such research may benefit from immediate application into information management technology as a testbed.”

The value of transcending disciplinary boundaries is exemplified by the “Memories for Life” initiative.

Coping with an Uncertain Future

“The digital realm is one of change and uncertainty, and it is likely to remain so for the foreseeable future. Even the most astute businesspeople cannot forecast anything comfortably because change is so rapid that it is too difficult to develop viable business models.”[30]

“There is no way to predict how the future will unfold.  The Library recognizes the need to track the evolving circumstances … that can have decisive yet unanticipated effects on the preservation mission.”[31]

Curators engaged in preservation need to learn to live with not knowing for sure that they have succeeded.  They will surely see much information disappear, perhaps including some that they believe they have made permanently durable.  Preservation failures can make themselves known, but successes cannot!  Presuming that curators have diligently and correctly applied the best known methodology, how can they achieve peace of mind?[32]

As posed, the question is unlikely to bother scientists or engineers.  “Not knowing for sure” is unavoidable, but worrying about it is neither warranted nor healthy.  In science, we do not claim apodictic truth for any law or deduction, but instead hold that any assertion is conditional on no counter-examples being found.  In engineering we never claim perfect reliability, but instead design for good enough to meet requirements in view of the cost of improvements.

Scientific practice includes experiments and reasoning that test extreme cases of any proposition and vigorous peer criticism.  We gradually become more confident of a scientific truth as time passes, provided that it has been subjected to responsible testing and critical examination.  Engineering practice includes estimating failure probabilities.  Over time, we become confident that artifacts are sufficiently reliable if their component failure probabilities are known to be small, provided that all likely hazards have been evaluated.  Neither in science nor in engineering do we ever assert that the last critical question has been asked and completely answered.
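The kind of failure-probability estimate mentioned above can be illustrated with a small calculation.  The following is only a sketch under a strong assumption that is not part of the discussion above: each replica of a collection is lost independently, with the same probability, over the planning period.  The function names and the sample numbers are hypothetical.

```python
def collection_loss_probability(per_copy_loss: float, copies: int) -> float:
    """Probability that every one of `copies` replicas is lost.

    Assumes losses are statistically independent, which real
    repositories only approximate (shared software, shared staff).
    """
    if not 0.0 < per_copy_loss < 1.0:
        raise ValueError("per_copy_loss must lie strictly between 0 and 1")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return per_copy_loss ** copies


def copies_needed(per_copy_loss: float, target: float) -> int:
    """Smallest number of independent replicas whose joint loss
    probability does not exceed `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    copies = 1
    while collection_loss_probability(per_copy_loss, copies) > target:
        copies += 1
    return copies


if __name__ == "__main__":
    # Hypothetical figures: a 5% chance of losing any one replica,
    # and a budget holder who will accept a one-in-a-million loss.
    print(copies_needed(0.05, 1e-6))
```

The point of the sketch is that the acceptable loss probability is an input chosen by whoever pays the bills, and the number of replicas follows from it.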

Every citizen of the technologically advanced nations is familiar with this, even taking it for granted with familiar technologies.  For instance, you surely use your automobile with confidence that it will perform satisfactorily for approximately 15 years.  Similarly, we are confident that the Golden Gate Bridge will not only be safe for today's crossing, but also will not fall down within the next century.  We habitually traverse such bridges without giving their possible collapse a moment's thought, and without examining the engineering evidence that such confidence is warranted.

For preservation of any document collection, as for automobiles and for bridges, engineers can design for whatever small probabilities of loss those who will pay the bills specify.  Engineers also know how to describe such designs so that independent auditors can vet reliability estimates.  Such considerations lead to simple prescriptions for preservation repositories.  Their managers might consider:

(1) Managing preservation as an increment over providing today's digital library services to clients, doing budgeting accordingly.  The only essential digital preservation activity is to save bit-strings reliably.

(2) Assessing today's most plausible methodologies against the claims made for them, choosing pieces that constitute an end-to-end solution, and asking solution advocates the most searching questions possible.

(3) Wasting neither time nor nervous energy worrying about perfect or permanent preservation, instead thinking in terms of acceptable losses over a normal enterprise planning horizon.

(4) Seeking every opportunity for replacing human clerical procedures with automated machine procedures, because the largest preservation costs and greatest quality risks will be associated with human failings.

(5) Commissioning software to implement the methodology chosen, paying the software engineers whatever is needed to make the tools convenient and fail-safe.
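Prescriptions (1) and (4) can be made concrete with a small fixity audit, an automated check that saved bit-strings are still intact.  What follows is a minimal sketch rather than any repository's actual procedure.  It assumes that a SHA-256 digest was recorded when the document was deposited, and the replica names are hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(bits: bytes) -> str:
    """Return the SHA-256 digest of a bit-string as hexadecimal text."""
    return hashlib.sha256(bits).hexdigest()


def damaged_replicas(replicas: Dict[str, bytes], recorded: str) -> List[str]:
    """Name every replica whose content no longer matches the recorded digest."""
    return sorted(
        name for name, bits in replicas.items() if digest(bits) != recorded
    )


def repair(replicas: Dict[str, bytes], recorded: str) -> Dict[str, bytes]:
    """Overwrite damaged replicas with a copy that still matches the digest."""
    intact = next(
        (bits for bits in replicas.values() if digest(bits) == recorded), None
    )
    if intact is None:
        raise RuntimeError("no intact replica survives; the bit-string is lost")
    return {name: intact for name in replicas}


if __name__ == "__main__":
    original = b"archival bit-string"
    recorded = digest(original)  # stored at deposit time
    replicas = {
        "site_a": original,
        "site_b": b"archival bit-strinG",  # simulated silent corruption
        "site_c": original,
    }
    print(damaged_replicas(replicas, recorded))
    replicas = repair(replicas, recorded)
    print(damaged_replicas(replicas, recorded))
```

A routine of this kind needs no clerical judgment, which is why it is a natural candidate for replacing a human procedure with a machine procedure.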

News

Abstracts for the 2006 International Conference on Formal Ontology in Information Systems are available.

Charles Darwin Online

A new website provides access to 50,000 text pages and 40,000 images of Charles Darwin's publications and manuscripts.  The site currently has 50% of his works online.  Its authors hope to finish by 2009.

More Elbow Room on the Internet

BusinessWeek reports the status of Internet Protocol Version 6 (IPv6), the initiative to relieve the impending shortage of internet addresses.  For an introduction and background, see the IPv6 Wiki.

Identifying Forgeries among Digital Images

A Dartmouth College doctoral thesis, Lighting and Optical Tools for Digital Image Forensics, describes tools for detecting tampering in digital images—tools that do not depend on watermarks or specialized hardware.[33]  M.K. Johnson uses illumination direction to analyze light sources in a photograph, detecting inconsistencies among shadows.  He uses a specularity tool to seek inconsistent reflective highlights, such as differences in reflections from human eyes.  Finally, he is preparing a chromatic aberration tool to examine the natural distortion of a picture caused by a camera lens. If this distortion is not consistent throughout, then the image is probably forged.  While none of these tools will be 100% effective, in concert they will contribute much to image forgery investigation.

DPubS Software for University Publishing

The Cornell and Penn State university libraries have released DPubS (Digital Publishing System) to enable organizing and disseminating electronic scholarly communications.  It is a flexible platform for the delivery of journals, conference proceedings, and books, intended to help libraries build publishing programs and manage the dissemination of an expanding number of scholarly publications.  It can be configured to handle content in almost any format, includes administrative tools for non-technical staff, and supports both open-access and subscription-based publications.

DPubS is written in Perl, runs on Solaris and Linux systems, and conforms to common open-source conventions.  The system has a flexible XML user interface and is OAI-PMH 2.0 compatible.  It can use Fedora as a digital repository, and DSpace support is under consideration.  DPubS is available without fees.
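For readers unfamiliar with OAI-PMH, the sketch below shows what a harvesting request looks like.  The six verbs and the oai_dc metadata prefix come from the OAI-PMH 2.0 specification.  The base URL is hypothetical, since the address at which a DPubS installation exposes its OAI-PMH interface is not given here.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build the URL of an OAI-PMH request for `verb` with its arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH 2.0 verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Hypothetical repository address, for illustration only.
    base = "https://publisher.example.org/oai"
    # Ask for all records as unqualified Dublin Core.
    print(oai_request_url(base, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would fetch such a URL over HTTP and parse the XML response.  That step is omitted because it depends on the repository being harvested.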

News in Depth from the Lyon (France) City Library

DDQ readers who are dissatisfied with “sound bite” news reports and who read French might be interested in a service from the Bibliothèque Municipale (BM) de Lyon.  BM Lyon librarians have begun to provide news breadth and depth on the library website.  "Points d'actu!" / "The News, In Depth" is an illustrated magazine which augments daily news with additional resources, context, and links for inquiry beyond the headlines.

New Data-Archiving Rules for Businesses

New U.S. rules that compel companies to produce electronically stored information for civil litigation could drive improved tracking and archiving of electronic documents, digital images and spreadsheets.

Reading Recommendations

Big Brother Is Preparing to Watch You!

Ian Angell and Jan Kietzmann present a chilling projection of government, commercial, and criminal surveillance that radio-frequency identifier (RFID) technology will make easy.[34]  They focus on addition of RFID circuits to large-denomination currency notes, and extend this to other marking opportunities.  It is particularly worrisome because RFID surveillance does not depend on line-of-sight and will not be noticed by its victims.

I am already concerned about surveillance by Global Positioning System (GPS) circuits in cell phones.  The article amplifies these concerns.  Of course, these issues will affect our children more than those of us more than 50 years old.

Jostein Gaarder’s Sophie’s World

Sophie's World: A Novel about the History of Philosophy, a New York Times recommendation about 10 years ago, is a whimsical, easy-to-read introduction to the questions and history of philosophy.  Its style is intended to appeal to young girls, but it is also informative to anyone interested in deep questions about what it means to be human.

Since the book reports only one topic for each selected philosopher, it might strike some readers as presenting philosophic caricatures rather than a balanced view.  For instance, its chapter on Immanuel Kant pays more attention to ideas from the Critique of Practical Reason than to the seminal Critique of Pure Reason.  This does fit with the book’s second-half emphasis on ethics, rather than on epistemology.  I most enjoyed the first half, because it teaches about early Greek philosophic insights that re-emerged only about 150 years ago.

David Hilbert’s On the Infinite

This 1925 lecture in honor of Karl Weierstrass is a masterful synopsis of the logical problems associated with infinity.  It teaches the essential connections between mathematical arcana, such as the epsilon-delta method that grounded mathematical analysis (calculus) soundly for the first time, and better known problem resolutions, such as those that address Russell’s Paradox.

The English translation of the lecture appears in P. Benacerraf and H. Putnam’s Philosophy of Mathematics: Selected Readings, 1983, which I recommend also for its other chapters.  The latter might appeal more to sophisticated readers than to beginners in reading epistemology.  In contrast, I believe that Hilbert’s lecture will be enjoyed by almost all DDQ readers.

Electronic Health Records: "IEEE Spectrum"

A long-standing dream within the medical profession is “the longitudinal patient record”―a birth-to-death collection of all medical reports on an individual.  The October 2006 number of IEEE Spectrum (v. 43, no. 10) proposes a comprehensive system of electronic health records, linking hospitals, general practitioners, specialists, and insurance offices and replacing paper-based files with accessible digital records.  It promotes benefits such as the ability to monitor for pandemics and enabling doctors to focus on preventative care.

This particular silver lining comes with its own cloud.  A January 2007 BusinessWeek article, Diagnosis: Identity Theft suggests that “for $60, a thief can buy your health records—and use them to get costly care.”  The impact on such identity theft victims apparently does not end with their clearing up incorrect bills.

Practical Matters

Unusually Interesting Web Sites

FILExt, the file extension source, is a good starting point for learning about unfamiliar file types.

FlightStats provides a wealth of information about flights, airlines, airports, and much more related to commercial aviation. 

Sun’s Java Web Start provides a platform-independent, secure, and robust deployment technology.  It helps developers deploy applications to end users by making the applications available on a standard Web server.

WebNote is a note-taking Web application. You type something, save your workspace, and then can revisit it from any computer.  Backpack is a similar application that includes calendaring, reminders sent to your telephone, and the ability to share pages online. 

Rollyo is a personalized search engine service that allows you to customize which groups of sites you search at one time.  Based on Yahoo! Search, Rollyo also lets you share your searches with friends.

PC Magazine lists 99 interesting “undiscovered” Web sites and its choice of 101 top classic Web sites.

Price Watch

Item | Description | Price | Unit price
Laptop computer | No brand named, AMD Sempron 3000+, 256 MB, 15” screen, 40 GB HDD, CD-RW/DVD-ROM, Win/XP Home | $490 | each
Compact PC | HP Slimline S7500N, AMD Sempron 3300+, 512 MB, 200 GB HDD, Double Layer DVD RW, Win/XP Home | $425 | each
Desktop PC | Compaq Presario SR1900NX, Intel Celeron 3.2 GHz, 512 MB, 533 MHz FSB, 120 GB HDD, CD-RW/DVD-ROM, 17” CRT | $325 | each
PC main memory | OCZ PC3200 DDR 1 GB | $86 | $86/GB
HDD portable | Wolverine 2.5” w/enclosure, 120 GB | $98 | $0.82/GB
HDD portable | Soyo 1.5” w/enclosure, 20 GB | $58 | $2.90/GB
HDD external | ACOM 250 GB, 7200 rpm w/backup SW | $75 | $0.30/GB
HDD NAS | Anthology 1 TB, RAID, USB 2.0 and FireWire, w/backup SW | $600 | $0.60/GB
HDD NAS | Buffalo 2 TB, RAID, Gigabit Ethernet and USB 2.0, w/printer attachment support | $970 | $0.48/GB
HDD for laptop | Fujitsu 120 GB, 2.5”, 5400 rpm, 8 MB buffer | $130 | $1.08/GB
DVD Writer | Hi-Val 16x +R/-R, 8x Double Layer +R | $33 | each
Flat panel display | Emprex 17” | $119 | each
Color laser printer | Samsung CLP-510, 1200 DPI, 64 MB, 25 ppm B/W, 6 ppm color | $380 | each
Color laser printer | Minolta 2400W, 400 DPI, 20 ppm B/W, 5 ppm color | $192 | each
Digi-cam memory | SD or CF, 2 GB | $28 | $14/GB
Telephone | Uniden 2.4 GHz, with answering machine, 3 handsets, caller ID, call waiting, 3-way intercom | $43 | each
Antivirus program | ZoneAlarm | $3 | each

These prices include California sales taxes (8.25%).
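The unit prices listed above are simply price divided by capacity, and the stated 8.25% sales tax can be backed out of any listed price.  The short Python sketch below illustrates that arithmetic.  The function names are hypothetical, not part of DDQ, and the sketch assumes that one terabyte is counted as 1000 gigabytes, which is what the $0.60 per gigabyte listed for the 1-terabyte unit implies.

```python
# Illustrative arithmetic only; the function names are assumptions for this sketch.

SALES_TAX_RATE = 0.0825  # California sales tax rate stated in the text


def price_per_gigabyte(price_usd, capacity_gigabytes):
    """Return the tax-inclusive price per gigabyte, rounded to cents."""
    return round(price_usd / capacity_gigabytes, 2)


def pre_tax_price(price_with_tax_usd, tax_rate=SALES_TAX_RATE):
    """Back the sales tax out of a tax-inclusive price, rounded to cents."""
    return round(price_with_tax_usd / (1 + tax_rate), 2)


if __name__ == "__main__":
    # Wolverine portable drive: $98 for 120 gigabytes (listed at $0.82 per gigabyte)
    print(price_per_gigabyte(98, 120))
    # Anthology NAS: $600 for 1 terabyte, assumed here to be 1000 gigabytes
    print(price_per_gigabyte(600, 1000))
    # Laptop listed at $490 including tax
    print(pre_tax_price(490))
```

Dividing by (1 + rate), rather than subtracting 8.25% of the listed price, is the choice that matters here, because the tax was charged on the pre-tax amount.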



[1]     C.P. Snow. The Two Cultures, Cambridge U.P., 1959. 

      Lawrence Krauss discusses the issue anew in Questions that Plague Physics, Scientific American 291(2), 82-85, August 2004.  It also figures in the Stanford Encyclopedia of Philosophy article about Ernst Cassirer’s work.

[2]     A descriptive PLANETS presentation is available.

[3]     Library of Congress, Preserving Our Digital Heritage, NDIIPP Plan, 2002, page 22.

[5]     William Lefurgy, “What if NDIIPP knew what NDIIPP knows?”  NDIIPP Website, 2006.  Also ARL briefing, October 2005.

      Abby Smith, Distributed Preservation in a National Context, D-Lib Magazine 12(6), June 2006.

[6]     Ideas elaborated in the forthcoming D-Lib Magazine article were communicated privately to the NDIIPP management at the Library of Congress in May 2004, but seem to have been ignored.  This 2004 letter is now made publicly available.

[7]    My expectations are conditioned by what is typical for Silicon Valley R&D denizens—that executive managers expect substantial practical progress within a year or so from funding a project that has clear objectives.  In contrast, NDIIPP was funded six years ago.

[8]     “[T]echnology is rather easy.  Or more exactly, technology is the branch of human experience that people can learn with predictable results.  … a good many Englishmen have been skilled in mechanical crafts for half-a-dozen generations.  Somehow we've made ourselves believe that the whole of technology was a more or less incommunicable art.  It's true enough, we start with a certain advantage.  Not so much because of tradition, I think, as because all our children play with mechanical toys.  They are picking up pieces of applied science before they can read.”  C.P. Snow, The Two Cultures, 1959.

[9]     Paul Conway, Preservation in the Digital World, CLIR pub62, March 1996.

[10]    Ann Okerson, YEA: The Yale Electronic Archive, 2002, page 53.

[11]    James Fallows, File Not Found: Why a stone tablet is still better than a hard drive, The Atlantic Monthly, Sept. 2006.

[12]    Deanna Marcum, The Future of Preservation, keynote address at the Symposium on The 3-D’s of Preservation: Disasters, Displays, Digitization, Paris, March 2006.

[13]    H.M. Gladney and R.A. Lorie, Trustworthy 100-Year Digital Objects: Durable Encoding for When It's Too Late to Ask, ACM Trans. Office Info. Sys. 23(3), 299-324, July 2005.

[14]    H.M. Gladney, Preserving Digital Information, Springer Verlag, 2007, Chapter 10.

[16]    What seems to me the best of this literature is cited in Preserving Digital Information, Springer Verlag, 2007.

[17]    I do not know where the designation, “the digital heritage community”, originates.  I believe that I first saw it about 5 years ago in a Digicult Thematic Issue from U. Glasgow.  It occurs on other Web pages, such as those from Cultivate Interactive and a recent conference.

[18]    An example is the proposals funded by the first NSF Digital Library Initiative, which included aspects that had been realized in commercial products already in use by many customers in 1995.  This pattern of inattention continues in the U.S. National Digital Information Infrastructure and Preservation Program (NDIIPP).

[19]    What follows might be seen as a polemic that inadequately considers the value of cultural collections, but this is not intended.  Instead it is simply a reminder of economic facts that affect what heritage institution managers can accomplish.

[20]    Some readers will protest with examples of inventions made by university employees.  Such readers should consider the difference between invention and innovation, and the work necessary to make inventions widely useful.

[21]    In contrast, the British project described above considers outsourcing carefully, and some universities have built widely useful tools.

[22]    Long ago, some wag suggested that faculty tenure committees know how to count better than they know how to read.

[23]    This might extend to some leading periodicals.  I subscribed to the Journal of Chemical Physics until its growth overwhelmed both my ability to read and available shelf space.  I found its articles on quantum chemistry (my field in 1970) repetitive in that they applied the same calculations to one molecule after another, a durable game because there are many molecules.  However, these articles did not contribute to my qualitative understanding of chemistry.  (Realizing this led to my changing fields, since I had no idea how to do better.)

[24]    A “straw man” assertion is a ridiculous statement made primarily so that its author can knock it down.

[25]    This is an explicit part of scientific methodology worked out over many years of analysis.  Karl Popper discusses it under the label “falsification” in The Logic of Scientific Discovery, Anchor Press, 1959.

[26]    Ernst Cassirer, The Logic of the Humanities. New Haven: Yale University Press, 1961.

[27]    Rudolf Carnap, The Logical Structure of the World, U. Chicago Press, 1967.

[28]    See Michael Friedman, A Parting of the Ways: Carnap, Cassirer, and Heidegger, Open Court, 2000, particularly its last two chapters.

[29]    Readers will be quick to notice that “scientific” has several meanings, and perhaps argue that what Kant had in mind was not what we usually mean by the term today.  The point is problematical, as Walter Kaufmann suggests in Discovering the Mind: Goethe, Kant, and Hegel (McGraw Hill, 1980).  Apparently Kant was explicit in wanting to emulate the certainty he saw (erroneously) in Newtonian mechanics and Euclidean geometry as descriptions of the universe.

[30]    Library of Congress, Preserving Our Digital Heritage, NDIIPP Plan, 2002, page 31.

[31]    Ibid., page 40.

[32]    This question, slightly differently phrased, was recently sent to me by an NDIIPP participant, suggesting that it represents a common source of unease within the cultural heritage community.

[33]    Details are available in papers by M.K. Johnson and H. Farid.

[34]    Ian Angell and Jan Kietzmann, RFID and the End of Cash? Comm. ACM 49(12), 91-97, December 2006.