Digital Document Quarterly

Perspectives on Trustworthy Information

Volume 7, Number 3, 3Q2008

Past DDQ numbers and Glossary

HMG Consulting

Saratoga, CA 95070

©  2008, H.M. Gladney    ISSN: 1547-8610

 

Information Science and Scholarly Writing

A Short Life for Information Science (IS)

University Information Science departments are likely to shrivel up and disappear.  What motivates this DDQ 6(3) prediction is partly that IS participants seem not to have identified any unique theoretical base.  The fundamentals of IS are epistemology and philosophy of language, which continue to be the purview of departments of Philosophy.  Furthermore, most of what might be the IS practical side is handled by software engineers and departments of Computer Science.  What's left is little more than 60-year-old library management topics (what used to be taught in a “Library School”) with relatively obvious extensions for digital holdings.

A current IS enthusiasm, ontologies, suffers from a different kind of problem.  No other university faculty (not historians, not chemists, and not professors of literature and the fine arts) wants outsiders' help in organizing their topics.  University faculty are happy to have librarians' help for routine tasks, such as finding books, but do not want librarians' opinions about their own work.

For current research, digital periodicals have pretty much replaced paper versions.  Growing numbers of university faculty prefer on-line means for literature search and access, using services that mostly come from elsewhere than their own campus libraries.  Consider:

Know your library user—and worry about who's not using the library.  That's the main advice to librarians in a new white paper that notes "a growing ambivalence about the campus library" among faculty members ...  [A report from] Ithaka, a nonprofit organization that promotes the use of technology in higher education, ... probes the relationship between libraries and the faculty at institutions of all sizes, and how the digital shift is altering that relationship.[1]

The prestige of research librarians and professional archivists is further threatened by massive book digitization[2] by commercial enterprises.[3]  At the moment, the digitized content has quality constraints that limit how it can be used, and is encumbered by ill-understood intellectual property constraints.  Over time, such problems are likely to be repaired.

None of this implies a reduced role for “world class” libraries—just the contrary, in fact.  I am thrilled by ready access to the catalogs of the University of California, the Library of Congress, the British Library, ..., and by a San Jose State U. (SJSU) service that delivers requested volumes to a nearby public library.  For things not held by SJSU, I have access to Santa Clara U. stacks 15 minutes from home and to U.C. Berkeley stacks an hour's drive away.

What stimulates this DDQ 7(3) return to the topic, the imminent demise of university IS, is a renewed encounter with McLellan's InterPARES report.[4]  This article illustrates, better than anything else that I recall, why the field cannot continue to be represented by academic departments that are organizationally parallel to Computer Science, History, Physics, and so on.  The limitations and a sense of frustrated despair that I read between this article's lines recur less vividly in a second recent article.[5]

These articles are similar to other surveys—descriptions of how research libraries are trying to handle a challenge that seems to be beyond the skills of most of them (if what the articles describe are indeed the efforts of their best and brightest).  McLellan shows that repository institutions have no agreed file format choices.  Her conclusion seems to be that “certain dilemmas” arise; her recommendations are to:

1.       Clarify terminology;

2.       Distinguish between file formats, wrapper formats, and XML formats, and ensure that specifications are complete;

3.       Require that XML files are well-formed and valid;

4.       Specify any restraints on file compression, and prefer uncompressed files;

5.       When in doubt, choose formats accepted by other repository institutions; and

6.       Where possible, work with records creators to persuade them toward the best format choices.

Such weak recommendations could have been made after a few telephone calls that included at least one to an experienced software engineer, saving the work of McLellan's study, and its expense to the funding public.  The obvious conclusion is that institutional experience shows preservation cannot rely on file format standards, except for very few cases.  Nobody counts on near-term improvements to this situation.  These conclusions either escaped McLellan or required too much courage to announce.

When I published the cited DDQ 6(3) column, I thought it might stimulate outraged protest, or at least a whimper.  Nobody said anything.  As usual, such silence implies very little.  Any of several different speculations might be offered.  As I have done for other topics, I challenge and invite any DDQ reader's critical commentary on the opinion just expressed, and offer to cite or to publish concise reactions in future DDQ numbers for anybody who asks.

Critique of Scholarly Writing: Only the Best is Good Enough

Many digital preservation community members with liberal arts backgrounds seem uncomfortable with scientific method, style, and jargon.[6]  For instance, my recent submission to an archivists' periodical[7] was rejected even though the referees thought it highly pertinent.  They might have agreed that it was correct, had they understood it, but they were unfamiliar with the literature it cited.  The editor explained that the article was too far ahead of her readers, and we agreed that I would withdraw it, replacing it with a tutorial  review of the state of the art.[8]

This episode has a sad implication (assuming that the referees were typical professional archivists).  The pace of software innovation apparently exceeds that of archivists' self-education.  This might have the consequence that archivists have little influence on the future of their own institutions!

A reader suggested that I describe methodological and stylistic elements of DDQ—elements that have nowhere been concisely expressed.  Guidelines taught to me as a University of Toronto physical sciences undergraduate include that a scholar must be familiar with other scholars' work and that:

·       Material appearing in earlier articles, whether by other authors or oneself, should be cited instead of being repeated except as needed to identify the ideas intended.[9]

·       Science and engineering proceed by partitioning topics into parts that interact relatively weakly (“divide and conquer”), and by handling interactions between parts as corrections only later.[10]   Models or abstractions that ignore evident complications of observed behavior are widely used.[11]

·       A report of original thinking is permitted to be surprisingly narrow, partly to keep the exposition at hand concise, trying for depth rather than breadth.  A research article is likely to discuss context primarily to identify its topic, leaving breadth to review articles and textbooks.  Educating readers is seldom a priority.  They are assumed to obtain any education needed elsewhere.

·       Heeding Popper,[12] technical authors work hard to avoid errors and single points of failure.

·       That a bridge should be beautiful is less important than that it should not collapse.  However, elegant design is incredibly effective towards technical correctness.

·       Opportunities for progress can be found by analyzing earlier authors’ difficulties.

·       An original author is likely to focus on the most basic and most difficult instance of his topic, because if that is correctly solved, less difficult instances can be handled without much additional effort.

Scientists and engineers tend to avoid normative prescriptions because people detest being told how to behave.  Instead they identify results that somebody might want and methods that might work.  Instead of saying, “You should use rule set Y”, an engineer is likely to say, “Should you want result X with properties Z under circumstances W, consider the method defined by rule set Y”.  Here, Z might be something like “with least human effort” or “with small likelihood of errors or failures” and W might estimate end user skills, as in “for undergraduates”.

Of course, such careful language can be tedious, obscuring key messages in a welter of contingencies.  Scholars therefore often take short cuts, but stay ready to agree that more care might be needed.

Anybody inclined to dismiss notions like these might reconsider the immense success of science and engineering over the last 150 years.

How science should be applied is full of moral and ethical choices that do not belong to science itself.[13]

Not Necessarily Novel

During the years in which I have written DDQ, I have steadily become more unhappy with the literature that it cites.  This DDQ number begins an exploration of such dissatisfaction.

In a nutshell, I am coming to the opinion that most authors from IS and related fields do not write for readers' benefit, but instead primarily for their own satisfaction and to advance career objectives summarized by “publish or perish”.  According to a well-known caricature, “tenure committees know how to count, but not how to read”.  If one believes this, paying attention to other people's articles can seem a waste of time.  Concern for the novelty of one's own thoughts can seem distracting.

Honest scholars are likely to resent the consequences.  Reading becomes a distasteful and wasteful burden of searching for a few nuggets in a wasteland.  The nuggets seem distressingly rare.  It is difficult for readers to learn what is truly new and significant.

Digital Archiving Research: Not Necessarily Novel

As will be discussed below, many articles labeled as addressing digital preservation research are instead about digital content management (CM), an older topic that seems to be relatively mature and to be handled well by many commercial and open source offerings.

I find it very difficult to determine which LDP requirements[14] the most hyped trustworthy digital repository (TDR) offerings, such as SDSC iRODS, Cornell Fedora, and MIT DSpace, actually cover and how they do it.  I also find it difficult to discover what novel ideas their authors believe they are exploiting, or what significant features their software offers beyond what is available in commercial offerings.  Of course the emphasis and style of academic high-level digital repository components differ from those of commercial examples because they are tailored to different end-user expectations.  Nevertheless, I challenge the authors and DDQ readers to identify specific differences related to novel thinking.

To compare academic and commercial CM offerings would be a time-consuming and tedious task, partly because their descriptions are couched in different styles and different jargons.  The academic work is described mostly in formal periodicals with on-line versions.  The commercial work is described mostly in trade press periodicals.  What galls me is that each literature is written as if the other did not exist.  The trade press does not have a habit of citing prior literature.  Academic authors never mention commercial offerings, and seem to cite themselves more frequently than they cite others.

Commercial repository suppliers might not advertise high-level software components because they handle specialized institutional requirements through custom service contracts.  Commercial offerings might be inferior in front-end features to academic offerings, but probably have higher reliability and need  fewer local skills to deploy.  Over time, the commercial offerings are likely to prevail because of business-world imperatives associated with customer support and customer demands.  In contrast, each academic offering is likely to wither when its authors move on to other activities or retire from employment.

It seems prudent to ask, can a digital repository offering be trusted for long-term preservation if its using institutions have no formal commitments from its providers?

Patented Technology: Not Necessarily Novel

Two and a half years after the U.S. Patent and Trademark Office (USPTO) initiated a reexamination of Amazon's 1-Click Patent,[15] the company delivered 20 kg of documents with about 185 citations to the USPTO Examiner assigned to the case, asking for “favorable action”.  So much for Bezos' pledge of less work for the overworked Patent and Trademark Office!  Nobody should be surprised that USPTO is overloaded and that this is part of the reason that it issues many patents of doubtful quality.

The scope of patent eligibility has been a political and legal issue[16] at least since the Prater-Wei patent was litigated 40 years ago.[17]  I first became aware of the issue when, as a staff assistant at IBM HQ in 1970, I was asked to analyze Prater-Wei for Emanuel Priore, IBM V.P. and Chief Scientist (deceased).

Widely considered the first software patent, Prater-Wei was about calculating temperatures for petroleum fractionation.  At the time, software was not patentable, so the authors described a non-computer method of choosing the temperatures.  What amused me was that the central claim was matrix inversion.  The authors disguised this by using 19th-century algebraic notation instead of modern matrix notation.

Prater-Wei could not have been enforced in litigation because it failed the novelty requirement.  A Telkowski patent seems a modern example of the same difficulty.

The invention includes one or more integrity manager applications, each of which monitor the integrity of an aspect of a data archive.  Some integrity manager applications monitor the integrity of processes executed by the archive system, and other integrity manager applications monitor the integrity of communication paths in the archive system.  ...  Further, an event integrity manager application executes predetermined events triggered by characteristics of documents stored in the archive system and ensures that all events have been properly executed.[18]

The assignee, J.P. Morgan Corp., probably understood this weakness, but sought the patent to prevent some other rascal from obtaining a similar patent and forcing J.P. Morgan into expensive litigation.

Digital Archiving and Preservation

Digital Preservation Notes

According to a May Inside Higher Education article, libraries are increasingly seeing digitization as a preservation strategy.

DPE has released the European Quarterly Preservation Digest providing an Overview of the current activities of European digital preservation projects.  It provides information from Planets, CASPAR, LIWA, SHAMAN, DRAMBORA and DPE.

LoC NDIIPP, JISC, The Open Access to Knowledge (OAK) Law Project, and The SURFfoundation have released an international study on the Impact of Copyright Law on Digital Preservation.

The Blue Ribbon Task Force on Sustainable Digital Preservation has published a bibliography.

Modern technology makes photograph manipulation easy to execute and hard to see, but also enables falsified image detection.[19]  In the entertainment world, photographic fakery is called “special effects” and, being expected and not an attempt to defraud, is socially acceptable.

Federal Files Blip Into Oblivion

A September New York Times article publicizes the U.S. Government information preservation challenge.  Quoting from government officials and archivists, Robert Pear summarizes:

Federal agencies have rushed to embrace the Internet and new information technology, but their record-keeping efforts lag far behind.  Moreover, federal investigators have found widespread violations of federal record-keeping requirements.  ...

Experts worry that items preserved in digital form may not be readily accessible in the future because the equipment and software needed to read them will become obsolete.

“All of us have stored personal memories or favorite music on eight-track tapes, floppy disks or 8-millimeter film,” said Allen Weinstein, the archivist of the United States.  “In many cases, these technologies are now relics, and we have no way to access the stored information.  Imagine this problem multiplied millions and millions of times.  That’s what the federal government is facing.”

The National Archives is in the early stages of creating a permanent electronic record-keeping system, ... The electronic archive is behind schedule and over budget.  But officials say they hope that the project, being developed with Lockheed Martin, will be able to take in huge quantities of White House records when President Bush leaves office in January.

The on-line New York Times version links to information provided by NARA.

The comparable British situation seems to be that The National Archives has no plan, instead asking repeatedly for outside comment without reacting to what is suggested.[20]

What Do People Mean by “Digital Preservation”?

DDQ has assumed that “digital preservation” had to do with measures for keeping records useful for readers—measures beyond what any writer does as part of preparing to publish.  It has consistently applied different labels, such as 'content management' and 'document editing', to subjects that were current before preservation was addressed by the Invest to Save task force report.[21] 

Some authors use definitions close to those assumed in DDQ; for instance, consider:[22]

The phrase "digital preservation" creates confusion since readers, familiar with traditional approaches, assume that "preservation" involves the use of well-defined techniques to prevent the original artifact from deteriorating further and to perhaps even improve it to the point where it can be used again.  Digital preservation involves quite different methods, skills, and outcomes and can complement traditional preservation services, ...   definition ... proposed by the Research Libraries Group as follows:[23] 

Digital preservation is defined as the managed activities necessary: 1) For the long term maintenance of a byte stream (including metadata) sufficient to reproduce a suitable facsimile of the original document and 2) For the continued accessibility of the document contents through time and changing technology.

In contrast, part of an American Library Association proposal suggests that:

Digital preservation strategies and actions address content creation, integrity and maintenance.

Content creation includes:

·       Clear and complete technical specifications

·       Production of reliable master files

·       Sufficient descriptive, administrative and structural metadata to ensure future access

·       Detailed quality control of processes

Content integrity includes:

·       Documentation of all policies, strategies and procedures

·       Use of persistent identifiers

·       Recorded provenance and change history for all objects

·       Verification mechanisms

·       Attention to security requirements

·       Routine audits

Content maintenance includes:

·       A robust computing and networking infrastructure

·       Storage and synchronization of files at multiple sites

·       Continuous monitoring and management of files

·       Programs for refreshing, migration and emulation

·       Written disaster prevention and recovery plans

·       Periodic review and updating of policies and procedures

This fails to distinguish ordinary digital library procedures from what is required for LDP, by which I mean  "the complex of measures required for and/or undertaken to mitigate digital object unreliability caused by ravages of time, including human misfeasance, fading human memory, and technological obsolescence".
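One item in the quoted list, “verification mechanisms” together with “routine audits”, is often realized as a fixity check: record a digest for each file, then recompute it later and report differences.  The following Python sketch is only an illustration under stated assumptions.  It is not taken from the ALA proposal; the function names record_fixity and audit_fixity are hypothetical, and SHA-256 is an assumed choice of digest.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths: Iterable[Path]) -> Dict[str, str]:
    """Build a baseline mapping each file name to its digest (hypothetical helper)."""
    return {str(p): sha256_of(p) for p in paths}


def audit_fixity(baseline: Dict[str, str]) -> List[str]:
    """Recompute digests and list files that are missing or whose bytes changed."""
    problems = []
    for name, expected in sorted(baseline.items()):
        path = Path(name)
        if not path.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(path) != expected:
            problems.append(f"changed: {name}")
    return problems
```

A check of this kind detects altered or lost bit-strings.  By itself it says nothing about whether the content will remain intelligible after technological obsolescence.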

Readers might object that these paragraphs are merely pedantic quibbles.  However, a good choice of jargon is an essential prelude to partitioning the preservation challenge into nibbles for small R&D groups.   We need to start by identifying portions that can be solved and later recombined.  To talk about such portions, it is invaluable, or even essential, for us to share labels that distinguish the components being considered.

Time to Build Bridges

A recent experience led me to try to describe the technical part of LDP for readers educated in the liberal arts.  The abstract of the first part of this review[24] reads:

We focus on ... identifying precisely what can be preserved and what cannot.  Our answer comes from [epistemology], especially its discussion about the limits of what can be communicated. 

Philosophers have taught that answers to critical questions have been obscured by "failure to understand the logic of our language".  We can clarify difficulties by paying extremely close attention to the meaning of words such as 'knowledge', 'information', 'the original', and 'dynamic'.

What is valuable in transmitted and stored messages, and what should be preserved, is an abstraction, the pattern inherent in each transmitted and stored digital record.  This answer has, in fact, been lurking just below the surface of archival literature. 

To make progress, archivists must collaborate with software engineers.  Understanding perspectives across disciplinary boundaries will be needed.

The abstract of the second part[25] includes:

Most digital archival repository technology ... has been thoroughly understood and widely deployed for more than a decade.  This technology is inadequate for LDP because it includes no mechanisms for reliably assuring authenticity and intelligibility of digital documents for 50 years or longer.  CM provides for near-term preservation without handling long-term preservation, which must overcome risks associated with technological obsolescence and fading human memory.  We show how to mitigate these risks.  Implementing software would be a small addition to widely deployed CM offerings.

Nothing in this review is new.  Instead it is intended to convey an essentially technical topic to non-technical readers.  DDQ readers are encouraged to critique these articles, especially if they have suggestions how to simplify their explanations.

Partitioning Digital Preservation for Individual SW Contributions

Many authors have discussed digital preservation without contributing to progress.  There are at least two problems apparent in the literature.  (1) Much of what is discussed under the 'digital preservation' label in fact competes with highly refined CM software that is widely deployed.  Unfortunately, few authors advocating “Trusted Digital Repositories” specifically identify novel features of what they propose, or how those novel features could be implemented as enhancements to deployed CM offerings.[26]  (2) Few contributors have partitioned software they assert is needed into portions that small teams could realistically expect to provide in 2 to 3 years.

I was reminded of such difficulties by a draft article from the U.S. National Institute of Standards and Technology (NIST).[27]  Engineering documents are both very important and also interesting as an example that presents unusually difficult preservation challenges.

CAD/CAM files present some particularly difficult challenges, partly because of their complexity, but more because one must be sensitive to very small imprecision.  For instance, the machine part created by a computer-driven milling machine might have very small ridges that are complex functions of the automation instructions, the way the milling machine works, and the order that instructions are followed.  A first question is whether or not these are significant, and if so, whether instructions for a current-year machine would create a trouble-free replacement part when they control a milling machine built 50 years from now.  A second complexity has to do with format conversions that might be needed from the CAD program some engineer uses and input to a CAM program that will be used ..., when today's CAM program is incompatible with the CAM machine available 50 years from now.

This article, combined with thinking about an editor to create an OAIS Submission Information Package,[28] inspires the following speculation.

(1)     Digital preservation is too complicated for any small research group to make a significant contribution without identifying an essential piece whose separate solution can contribute to an integrated whole. 

(2)     As part of such "divide and conquer", it is productive to separate near-term digital preservation (NDP) from LDP.  In this context, what is meant by near-term is everything that CM services need to provide in order to satisfy clientele right now and for 5-10 years.[29]

(3)     One reason for considering NDP separately from LDP is that their assumptions and requirements are in sharp contrast.  For instance, in the near term, repositories need to be open to new ideas, new software components, new implementations, and new user wishes whenever they arise.  CM tools need to be flexible and open-ended.

In contrast, LDP software needs to emphasize stability.  LDP tools and content should not change even when their contextual infrastructure changes.

(4)     The only repository software that needs LDP adaptation is its ingest component.  However, little specific can be said about this without specifications for content and metadata formats to be ingested.

(5)     Whenever people start generating LDP metadata and content, the search technology community will rapidly move to incorporate it.  Information-finding experts are poised to exploit data whenever it becomes available.  The preservation community doesn't need to help them.  In fact, indexing and search experts will almost surely ignore advice from preservationists.

(6)     Another opportunity for "divide and conquer" is partitioning  between (a) what is general to all file formats and purposes, and (b) what is specific to a class of file formats.  Furthermore, each of (a) and (b) can be divided into data schema specifications and editors (and other tools) for each data schema.  The TDO scheme and a TDO editor provide for (a).[30]  For (b), a scheme is needed for each file type (e.g., OpenOffice text documents), together with editor(s) for that file type (e.g., the OpenOffice Text editor).  The editor for each file format must be interfaced to the TDO editor.

Structuring along these lines is essential for preserving engineering documents.  CAD/CAM files will need to be packaged (tightly bound) to corresponding image, text, video, ... bit-strings, as well as to metadata bit-strings that other players (e.g., the Library of Congress) devise, standardize, and build tools for.
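The partition suggested in item (6) can be sketched in a few lines of Python.  This is a hedged illustration only: it is not the TDO scheme or the TDO editor, and the names Packager, FormatHandler, and describe_plain_text are hypothetical.  The general part (a) knows nothing about any file format; each format-specific part (b) is registered with it through one narrow interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# (b) Format-specific part: a handler extracts metadata for one class of formats.
FormatHandler = Callable[[bytes], Dict[str, str]]


def describe_plain_text(payload: bytes) -> Dict[str, str]:
    """Hypothetical handler for UTF-8 plain text."""
    text = payload.decode("utf-8")
    return {"encoding": "utf-8", "line_count": str(len(text.splitlines()))}


@dataclass
class Packager:
    """(a) General part: wraps any bit-string, delegating format details."""

    handlers: Dict[str, FormatHandler] = field(default_factory=dict)

    def register(self, media_type: str, handler: FormatHandler) -> None:
        """Interface a format-specific handler to the general packager."""
        self.handlers[media_type] = handler

    def wrap(self, name: str, media_type: str, payload: bytes) -> Dict[str, object]:
        """Describe one bit-string; unknown formats get empty format metadata."""
        handler = self.handlers.get(media_type)
        return {
            "name": name,
            "media_type": media_type,
            "size_bytes": len(payload),
            "format_metadata": handler(payload) if handler else {},
        }
```

With this shape, a small team could supply a handler for one file type without touching the general component, which is the point of the partition.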

File Formats for Long-Term Digital Preservation (LDP)

Few file format standards are today reasonable candidates for LDP.  Two exceptions are JPEG[31] and PDF variants.

A Digital Preservation Coalition report holds that Portable Document Format/Archives (PDF/A) is suitable for preservation, but only in the context of comprehensive records management.[32]  PDF is, in fact, a complicated topic treated by several standards.  See ISO 19005 (PDF/A, 2005) and ISO 32000-1 (PDF 1.7, 2008).  Inexpensive products convert PDF documents to MS Word representations.[33]

Recommendations for Reading, Listening, or Viewing

NASA has made historic photographs, film and video available to the public for the first time.  With the Internet Archive, it will merge 21 collections into a searchable online resource.  The content covers many NASA programs, including Apollo missions, Hubble Space Telescope views, and experimental aircraft.

Computerworld has surveyed R&D efforts at HP, IBM and Microsoft ($17 billion annually) and asked: Are these companies supporting long-term basic research, or just short-term, product-oriented work?

·          HP is consolidating its focus on a few 'big bet' projects—information explosion, dynamic Internet services, content transformation, intelligent infrastructure, and sustainability.

·          IBM has four high-risk research areas—nanotechnology, Internet computing, integrated systems and chip architecture, and managing business integrity through mathematics.

·          Many Microsoft projects target major product lines, but others have little apparent application to anything the company sells today.

Collapse of Sub-prime Mortgage Market

The collapse of the US sub-prime mortgage market, written for the Institute of Chartered Accountants in Australia, is extraordinarily clear and complete.[34]  Its preface includes:

The report provides an insight into the many factors involved in the United States sub-prime mortgage market collapse, including the effects to financial statements of banks and other financial institutions and investors in mortgage-backed securities around the world.

...  this paper will assist in understanding the complex issues, and contribute to an appropriate understanding of how the ... collapse occurred.  Through the fictional story of Mr. and Mrs. Jones the paper examines the role of those responsible: borrowers; investors; and brokers and those who protect them: preparers; auditors; standard-setters; and regulators and makes recommendations of their roles in the future.

...  [N]one of these parties can guarantee that investors will receive reliable and relevant information to assist in their decision making process, without the will and support of others.  Regulators and standard setters ... need to consider all parties, [not just] traditional scapegoats, [accountants and auditors].

...  the United States Financial Accounting Standards Board has announced [intention to change]   accounting standards .. and propose working with the International Accounting Standards Board to achieve convergence in de-recognition of assets and liabilities.

Jeffrey D. Sachs' Common Wealth

The current liquidity crisis is a serious problem.  However, its gravity pales in comparison to systematic economic problems that have been known for at least a decade, but have not yet been addressed.

To understand what faces our children and grandchildren, read Common Wealth.[35]  I recommend it whole-heartedly, not because it is an easy read (it is not) or fun (it most definitely is not that), but because it is a thorough and thoughtful treatment of critical economic, social, and scientific issues that will affect our children, grand-children, and later generations, more profoundly than the kind of disruptions represented by recent securities market fluctuations.  This book collects, with authoritative citations and quotations, critical thinking about population and ecological pressures.  These are related to human beings deliberately slaughtering other human beings (while they thoughtlessly slaughter other species).

Historians will remember Thomas Malthus's 1798 prediction of a population catastrophe.  Recent Scientific American articles testify that parts of the problem, water shortages[36] and stifling pollution,[37] are already with us.  Sachs has painted a realistic picture and makes excellent recommendations.

Fundamentals for Archiving

A few algorithms ensure the correctness of hundreds of millions of daily business transaction records.  Expanding on 30-year-old fundamental ideas,[38] a Queue article[39] neatly summarizes how this works:  “In partitioned databases, trading some consistency for availability can lead to dramatic scalability improvements.”
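What “trading some consistency for availability” can mean is perhaps easiest to see in a toy example.  The Python sketch below is an assumption-laden illustration, not code from the cited article: the Ledger class, its two in-memory partitions, and the account names are all hypothetical.  A transfer updates the source partition at once and merely queues the update for the other partition, so the books are briefly out of balance until the queue is delivered.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class Ledger:
    """Two partitions of account balances plus a queue of pending credits."""

    def __init__(self, partition_a: Dict[str, int], partition_b: Dict[str, int]) -> None:
        self.partitions = {"a": dict(partition_a), "b": dict(partition_b)}
        self.pending: Deque[Tuple[str, str, int]] = deque()

    def transfer(self, src_part: str, src_acct: str,
                 dst_part: str, dst_acct: str, amount: int) -> None:
        """Debit the source partition now; queue the credit for later delivery."""
        self.partitions[src_part][src_acct] -= amount
        self.pending.append((dst_part, dst_acct, amount))

    def deliver_pending(self) -> None:
        """Apply every queued credit, bringing the partitions back into agreement."""
        while self.pending:
            part, acct, amount = self.pending.popleft()
            self.partitions[part][acct] += amount

    def total(self) -> int:
        """Sum of all balances as currently recorded in both partitions."""
        return sum(sum(p.values()) for p in self.partitions.values())
```

Neither partition ever waits for the other, which is where the availability comes from; the price is that a reader between transfer and deliver_pending sees a total that is temporarily wrong.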

News

Soaring Demand for Storage Space

A Wall Street Journal article reports that, as companies are forced to store more data, they are squeezed by space and cost constraints.  Shipments of magnetic storage on disk and tape have been growing at a compound rate of 50% per annum!
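To see what a 50% compound annual rate implies, a short calculation helps.  The Python sketch below assumes only that the rate stays constant, which the article does not claim; the function names are hypothetical.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Multiple by which shipments grow after the given number of years."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for shipments to double at a constant compound rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)
```

At 50% per annum, doubling_time(0.5) is about 1.7 years and growth_factor(0.5, 10) is roughly 58, that is, nearly a sixtyfold increase in a decade if the rate were sustained.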

Meaningless "Cloud Computing"

A Wall Street Journal article suggests that “'Cloud' has been a go-to metaphor for almost as long as the Internet has existed, conveying the sense that the Internet was intangible and bigger than the sum of its parts”.  Apparently my inability to identify a useful meaning for the term is shared.

Video Cameras in Police Cruisers

A Washington, D.C. board has urged fitting police cars with video cameras, copying Maryland.  Maryland police cruisers are being fitted with roof-top cameras for imaging license plates of cars whizzing by.  The images are immediately processed by the state Department of Motor Vehicles.  If a license number is correlated with an infraction, the system signals the officer to stop the vehicle.

Critics point out that this “service” is much faster than that for a citizen who telephones the DMV.  For that, a Marylander might spend a day hanging on the phone.

Archiving an Archenemy

A UC Davis scholar is analyzing Al-Qaeda audiocassettes:  On the seventh anniversary of the 9/11 attacks, Flagg Miller of UC Davis has analyzed some of the 1,500 audiocassettes that were retrieved after bin Laden fled American soldiers advancing on his Afghan residence.

Advanced Quadruped Robot

BigDog is the alpha male of the Boston Dynamics family of robots.  It is a quadruped robot that walks, runs, climbs rough terrain, carries heavy loads, and resists attempts to upset it.  See the video.

Practical Matters

Browser Enhancement for Fast Link Preview

InterClue, available at no cost, provides fast preview of the target of any displayed link.  When you hover your mouse pointer over a link, one or more LinkClue icons will appear.  When you rest the mouse pointer over one of these icons, a summary of the linked information will appear, showing some of: a text content summary; a small snapshot of the page; sizes and dates of linked files; and next actions possible.

It took me 5 minutes to understand the features and to tailor InterClue.  It is a major time-saver, reducing the number of web pages that I open only to find that they are not quite what I want.

Software for Musicians

Musicians like to study tunes and techniques while listening to passages over and over again.  Usually, they cannot control the playback speed.  An avid amateur cellist enthusiastically showed me Amazing Slow Downer, software that provides for slowing down music from CDs or files in any of many audio formats, such as those held by iTunes.  Speed and pitch are separately variable over a wide range.

An MIT Technology Review article describes another powerful tool, Direct Note Access, a digital sound separator to extract the individual notes of a recorded chord.

Price Watch

For those who want the amusement of assembling a PC, Kingsley-Hughes’ Best Kit List for Aug/Sept 08 might be helpful.

Some recent best prices, with state tax included, follow.

Item                Description                                                    Price   Unit

PC                  Shuttle KPC K4500 extremely compact PC                         $220    each
HDD                 Maxtor 1Tb SATA/300                                            $162    $0.16/Gbyte
HDD external        Hammer 500Gb USB 2.0                                           $86     $0.17/Gbyte
Digital camera      Olympus SP-570, 10 MP, 20x zoom, 2.7” LCD, stabilization       $520    each
Digital camera      Panasonic FZ18K, 8.1 MP, 18x zoom, 2.5” LCD, stabilization     $300    each
Wireless-N router   Airlink 802.11n 300Mbps                                        $24     each
Wireless-N adapter  Airlink PC, PCI, or USB                                        $17     each
Flat panel display  Manufacturer not identified 19”                                $109    each
Flat panel display  Manufacturer not identified 21.6”, 1680x1050, 1000:1 contrast  $165    each
Flat panel display  Hyundai X224W 22” widescreen LCD                               $184    each
GPS device          Garmin NUVI 200 3.5” TFT touch screen w/voice directions       $140    each
GPS device          Garmin NUVI 200W 4.3” TFT touch screen w/voice directions      $185    each
GPS device          Garmin NUVI 260W 4.3” TFT wide screen w/text-to-speech         $258    each
LCD HDTV            Sony 1089P 40”                                                 $1300   each
LCD HDTV            Sony 1089P 46”                                                 $1620   each
Plasma HDTV         Panasonic TH46PZ85U 46” 1080P, 10,000:1 contrast ratio         $1350   each
Plasma HDTV         Panasonic 50” 1080P, 10,000:1 contrast ratio                   $1520   each
LCD HDTV            Mitsubishi 1080P 60”                                           $1620   each
LCD HDTV            Mitsubishi 1080P 65”                                           $1940   each
LCD HDTV            Mitsubishi 1080P 73”                                           $2820   each


 



[1]     Jennifer Howard, Scholars' View of Libraries as Portals Shows Marked Decline, Chron. Higher Education, Aug. 2008.

[2]     Karen Coyle, Mass Digitization of Books, J. Academic Librarianship 32(6), 641-5, 2006.

[4]     E.P. McLellan, Selecting Digital File Formats for Preservation, InterPARES 2 report, 2007.

[7]     H.M. Gladney, Economics and Engineering for Preserving Digital Content, preprint, December 2007.

[8]     See DPC/PADI What's New in Digital Preservation newsletter and similar periodicals.

[9]     In DDQ I sometimes make exceptions to help readers.  For instance, I often summarize ideas already expressed, even though citation of an earlier article or book would satisfy rigorous scholars.

[11]    Choosing models that “work” is a high art.  A famous example is the set of tricks used by Albert Einstein in his annus mirabilis papers.  More recently, Richard Feynman's Lectures on Physics are immensely admired by professional physicists for their explanatory models.

[12]    Karl Raimund Popper, The Logic of Scientific Discovery, Hutchinson, 1959.  Translation of Logik der Forschung, Springer Verlag, 1935.

[13]    In philosophy, how to apply science and engineering belongs to ethics rather than to epistemology or scientific philosophy.

[14]    My current best summary of LDP requirements can be seen in the What Would an LDP Solution Accomplish section of Long-Term Preservation of Digital Records, Part I: A Theoretical Basis.

[15]    Peri Hartman, Jeffrey P. Bezos, et al., Method and system for placing a purchase order via a communications network, U.S. Patent 5,960,411, September 28, 1999.

[16]    Pamela Samuelson, Legally Speaking: Revisiting Patentable Subject Matter, Comm. ACM 51(7), 20-3, July 2008.

[17]    This patent, originally filed by Mobil Oil Corporation in 1960, addressed computerized spectrographic analysis.  It had many method and apparatus claims that could be performed either on an analog or digital computer, or with pencil and paper.  A Court of Customs and Patent Appeals (CCPA) decision is famous because the question "whether computer programs could contain patentable subject matter" was also before the CCPA.  See Application of Charles D. Prater and James Wei, U.S. CCPA, 415 F.2d 1378, November 20, 1968.

[18]    William A. Telkowski et al., System for archive integrity management and related methods, U.S. Patent 7,069,278, 2006.

[19]    Hany Farid, Digital Image Forensics, Scientific American 298(6), 66-71, June 2008.

[20]    This comment is based on Adrian Brown's reaction to my careful response to his and Karen Wilson's calls for advice.

[22]    Ronald Jantz and Michael J. Giarlo, Digital Preservation: Architecture and Technology for Trusted Digital Repositories, D-Lib Magazine 11(6), June 2005.

[26]    This topic is under consideration for a future DDQ number.

[27]    Joshua Lubell, Sudarsan Rachuri, Eswaran Subrahmanian, and Mahesh Mani, Sustaining Engineering Informatics: Toward Methods and Metrics for Digital Curation, submitted for publication, 2008.

[28]    In Gladney, loc. cit. endnote 25, see the section about Object Versions and Audit Trails.

[29]    This is the time period for which a service provider or software engineer implicitly commits to remain available to repair shortfalls in its offerings and to respond to newly identified requirements.

[30]    H.M. Gladney, Digital Preservation in a National Context, D-Lib Magazine 13(1/2), Jan. 2007.

[31]    Paolo Buonora and Franco Liberati, A Format for Digital Preservation of Images: A Study on JPEG 2000 File Robustness, D-Lib Magazine 14(7/8), July 2008.

[32]    Betsy A. Fanning, Preserving the Data Explosion: Using PDF, DPC Technology Report, April 2008.

[33]    For instance, see PDF2Word, VeryDOC PDF to Word Converter, and LEADTOOLS ePrint Professional.  I have not tried any of these, since I use ScanSoft Omnipage for such conversion.

[34]    Patricia Doran Walters, The collapse of the US sub-prime mortgage market, Australian Institute of Chartered Accountants, 2008.

[35]    Jeffrey D. Sachs, Common Wealth: Economics for a Crowded Planet, 2008, ISBN 978-1-59420-127-1.

[36]    Peter Rogers, Facing the Freshwater Crisis, Scientific American 299(2), 46-55, August 2008.

[37]    Dan Fagin, China’s Children of Smoke, Scientific American 299(2), 72-79, August 2008.

[38]    K.P. Eswaran, J. Gray, R.A. Lorie, and I.L. Traiger, The Notions of Consistency and Predicate Locks, Comm. ACM 19(11), 624-632, Nov. 1976.

[39]    Dan Pritchett, BASE: An ACID Alternative, ACM Queue 6(3), May/June 2008.