Digital Document Quarterly Glossary
A work in progress
|
|
HMG Consulting |
To communicate precisely is surprisingly difficult. Writing is more difficult than conversation
because no listener can signal confusion that a speaker might promptly
correct. Even though Digital Document Quarterly numbers are
edited carefully, aided by critical input from a small advisory group, I am not
as confident as I would like to be that readers will infer what I intend. The
difficulty is even greater for documents in long-term storage (Figure 1).
One can reduce the communication difficulty by providing careful definitions and contextual information. However, this remedy creates its own hazards: lengthy explanations that try readers’ patience, blizzards of detail that obscure central points, and seeming pedantry.
Such difficulties hamper community attempts to design information sharing tools as emphasized in digital library literature. Different authors use even well-known terms, such as “archiving”, differently.
The definitions that follow are what are sometimes called “technical definitions”, that is, definitions intended to help readers understand a specific article or body of writings. Typically, a technical definition is used for a word or phrase that has many different meanings in popular usage and/or in the writings of other authors. Often a technical definition does not conform precisely to any previous usage. Readers who are unfamiliar with the idea of a technical definition, or do not notice that this device is in effect, are likely to protest that the author misunderstands the meaning of the term (as they construe it).
Documents and digital objects are used to communicate
information. Information is a subset of
knowledge. Fritz Mauthner said, “Philosophy
is theory of knowledge”.[2] Ludwig Wittgenstein commented, “All
philosophy is a 'critique of language' (though not in Mauthner's sense). … the
apparent logical form of a proposition need not be its real one.”[3] How words and phrases are used in DDQ is
strongly influenced by readings of 20th-century epistemology.
What
disturbed Mauthner, above all, was the tendency ordinary people have to attribute
reality to abstract and general terms. This
natural tendency to reify abstractions he regarded as the origin not just of
speculative confusion, but also of practical injustice and evil in the world.
Reification—to use a Machian phrase—begets all sorts of "conceptual
monsters." In science, these
include such misleading notions as force, laws of nature, matter, atoms and
energy; in philosophy, substance, objects and the absolute; among religious
ideas, God, the devil and natural law; in political and social affairs,
obsession with notions like the Race, the Culture, and the Language, and with
their purity or profanation. In all such
cases, reification involves assuming the existence of entities which are
"metaphysical." So Mauthner
considered metaphysics and dogmatism to be two faces of a single coin, which
was also the fountainhead of intolerance and injustice. Janik,
p.123[4]
It seems to me that verbs are less subject to this problem than are nouns. Cf. “knowledge” and “to know”; “information” and “to inform”.
|
abstract |
(noun)
summary of a statement, document, or speech; (verb) reduce by eliminating all
properties not essential to the concept or the class in question; (adj.)
expressing a characteristic apart from any specific object or instance. For example, abstract data types are
defined without any commitment to particular encodings for instances. |
|
access
control |
(noun)
security component which defines who may do what and administers these rules,
as defined by an ISO standard;[5] the
process of determining which uses of resources within an open system
environment are permitted and, where appropriate, preventing unauthorized
access, which is frequently subdivided into classes known as unauthorized
use, disclosure, modification, destruction, and denial of service. The other
parts of a complete security system ensure that the registered rules are
complied with, and that an audit trail is maintained. |
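As a rough illustration of rules that define who may do what, together with an audit trail of the decisions taken, here is a minimal Python sketch. The names `RULES`, `audit_trail`, and `is_permitted`, and the subjects and objects shown, are hypothetical; none of them comes from the ISO standard or from DDQ.

```python
# Hypothetical rule table: (subject, object) -> set of permitted operations.
RULES = {
    ("alice", "report.txt"): {"read", "modify"},
    ("bob", "report.txt"): {"read"},
}

# Record of every access decision, kept so that an auditor can review it.
audit_trail = []


def is_permitted(subject, obj, operation):
    """Return True if the registered rules allow the operation; log the decision."""
    allowed = operation in RULES.get((subject, obj), set())
    audit_trail.append((subject, obj, operation, allowed))
    return allowed
```

In this sketch a request that matches no rule is refused, which is one common design choice (deny by default) rather than a requirement stated in the definition.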
|
access path |
(noun) means of referring to an
entity by identifying positions in (a nest of) containing entities, e.g. John
Doe in the |
|
architecture |
(noun) abstraction of design, hiding
features not of interest to a user of the thing described; rules for
interfaces provided for some collection of entities and services; the choice
and structuring of what can be viewed and what manipulations can be performed
through these interfaces. |
|
archiving |
(noun) digital content management
needed to ensure ready access to reliable records immediately, in the near
future, and in the distant future. |
|
asymmetric key cryptography |
(noun) see
public key cryptography. |
|
archive |
(noun) (1) persistent storage used
for long term information retention, typically very inexpensive per unit
stored and with a long response time, and often in a different geographic
location to protect against equipment failures and natural disasters; (2)
collection of historical documents or records.12 Notice that these two definitions
identify quite different object classes. Archivists emphasize that the rules
and conventional procedures for their collections are different from those
for libraries. Briefly, what is
important to an archivist is the authenticity of each archived object, and
evidence for that authenticity. These
can be established without understanding the content itself. I.e., the associated metadata are, in a
certain sense, more important than the content. |
|
atomic |
(adj.) not decomposable into parts,
at least for the discussion of the moment.
For example, the integer 2 is
atomic and the list 2 4 6 is
not. |
|
attribute
(esp. of a digital object) |
(noun)
synonym for property; mathematical value that is a mathematical
function of the object. The
words most important to us, such as “value” and “function”, often have
multiple meanings and no synonyms with which we can eliminate ambiguity. Where minimal ambiguity and misunderstanding
are critical in DDQ, we include modifiers or other concise methods to reduce
the risks. Wittgenstein’s lectures[6]
illustrate how fragile natural language is. |
|
auditor |
(noun) human role in which the
individual is responsible for checking that resources are not being misused
or misappropriated and/or that mechanisms to prevent misuse and
misappropriation are in place and being used as prescribed. |
|
audit trail |
(noun) sequence of records of
events deemed important to determine whether or not a set of resources has
been used in accordance with guidelines or limitations defined by appropriate
authorities; the results of monitoring each operation of subjects on
objects; for example, an audit trail might be a record of all actions taken
on a particularly sensitive file or a record of all users who viewed that
file. |
|
authenticate |
(verb) verify the identity of a
person (or other agent external to the protection system) making a request;
for a standard definition, see the cited reference.[7] |
|
authentication |
(noun) mechanism for establishing
with known confidence that a token passing between processes belongs to a set
of allowed tokens; typically each such token identifies a subject and also
contains some secret that could only come from the single user authorized to
use this subject; if the token is acceptable, the subject is bound to the
issuing process, i.e., the process can act on behalf of this subject (a step
called login if the user is a human being). |
|
authenticity |
(noun)
property of being associated correctly with sufficient provenance information
to convince any recipient
that the signer deliberately signed the document. The provenance information should not be
easily reusable, in the sense that it should be difficult to detach the
signature from one document and reattach it to a different document so that a
recipient is convinced that the signer actually signed the latter document.[8] |
|
authority |
(noun) (1) privilege and
responsibility to utilize and/or control some resource; (2) quality of
special value of information stored or conveyed, because of either knowledge
or official right to comment, as in “spoken with authority”; (3) especially
valuable commentator by virtue of superior knowledge, diligence, or
scholarship on the topic at hand, as in “Holmes was an authority on the forensics
of tobacco ashes”. The pertinence is
that the user of library information is at least implicitly interested in how
trustworthy each extracted datum is. |
|
bit |
(acronym)
contraction of the term "binary digit." |
|
bit-stream |
(noun)
potentially unbounded sequence of binary characters, typically an information
representation transmitted over a serial channel. |
|
bit-string |
(noun)
finite sequence of binary characters; a synonym for file or dataset
used to emphasize that it denotes an information representation readily
transmitted via a serial channel or stored on a disk or tape. |
|
BLOB |
(noun) acronym for binary large object, used to denote a
unit of data whose representation, meaning and interpretation are not
pertinent to the discussion at hand, such as the objects stored and
catalogued in a library. The acronym
can be construed also as binary little
object. |
|
|
bottom-up |
|
|
breadth-first |
See the
endnote diagram and table. |
|
cache |
(noun) specialized store used to
hold objects temporarily, often with the objective of more rapid access than
would otherwise be possible. In
computer systems, caches often are intended only for replicas of information
held more securely elsewhere; however, cache
should be construed in the former, broader sense. |
|
catalog |
(noun) in computer science and
related fields, table relating names to names, objects or locations of
objects, and possibly also object descriptions; synonym for directory (q.v.); among librarians,
a specific kind of finding aid with one or several entries for each
collection element and conforming to a carefully documented standard. |
|
certificate |
(noun) in the context of
information security, an unforgeable object that attests to the accuracy,
correctness, completeness, and provenance of some information. |
|
consistent |
(adj.) of a data collection,
conforming to all externally specified rules pertaining to this collection
and required to define correctness. |
|
constraint |
(noun) rule relating (values of)
two entities or limiting the membership of a set. |
|
consumer |
(noun)
person who obtains and makes use of a document, including merely reading it,
whether or not this use is as originally intended by the document’s producer. |
|
context |
(noun) a set of pairs mapping from
names to entities or to other names; definition of the meanings of names; set
of bindings between names and entities.
For example, the meaning of “bald” depends on the context. If the context is English, “bald” means
“without hair”; if it is German, “bald” means “in kurzer Zeit” (“in a short
time”). |
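The “bald” example can be sketched in Python by treating each context as a mapping from names to meanings. The names `ENGLISH`, `GERMAN`, and `resolve` are hypothetical and serve only to illustrate the idea of bindings.

```python
# Hypothetical contexts: each is a set of bindings from names to meanings.
ENGLISH = {"bald": "without hair"}
GERMAN = {"bald": "in a short time"}


def resolve(name, context):
    """Look up the meaning bound to a name in the given context."""
    return context[name]
```

The same name resolves to different meanings depending on which context is supplied, e.g. `resolve("bald", ENGLISH)` versus `resolve("bald", GERMAN)`.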
|
countermeasure |
(noun)
mechanism that reduces vulnerability to a threat. |
|
credentials |
(noun) unforgeable data that
guarantee claimed identity. |
|
cryptanalysis |
(noun) study
and practice of various methods to penetrate ciphertext and deduce the
contents of the original cleartext message. |
|
cryptographic algorithm |
(noun)
mathematical procedure, used in conjunction with a closely guarded secret
key, that transforms original input into a form that is unintelligible
without special knowledge of the secret information and algorithm. Such algorithms are also the basis for
digital signatures and key exchange. |
|
cryptography |
(noun)
originally, the science and technology of keeping information secret from
unauthorized parties by using a cipher.
Cryptography is used for many applications that do not involve
confidentiality. |
|
data |
(noun)
information that is not intended to convey as much meaning as many similar
communications might convey. (The
boundary between data and information is fuzzy.) |
|
decryption |
(noun)
cryptographic procedure of transforming ciphertext into the original message
cleartext. |
|
delegate |
(verb) grant a subject permission
to grant further subjects privileges. |
|
depth-first |
(adj.) See
the endnote diagram and table. |
|
digest |
(noun)
much condensed version of a message produced by processing the message with a hash algorithm. Commonly,
the digest length is independent of the length of the original message. |
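A small Python sketch shows the fixed digest length. SHA-256 is chosen here only as a familiar example; the glossary does not name a particular hash algorithm, and the helper name `digest` is hypothetical.

```python
import hashlib


def digest(message: bytes) -> bytes:
    """Return the SHA-256 digest of a message (always 32 bytes long)."""
    return hashlib.sha256(message).digest()


# A 3-byte message and a 300,000-byte message yield digests of equal length.
short_digest = digest(b"abc")
long_digest = digest(b"abc" * 100_000)
```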
|
digital signature |
(noun) data appended to a message
to assure the recipient of the origin and integrity of the message; a
digitized analog of a written signature, produced by a cryptographic
procedure acting (commonly) on a digest of the message to be signed. |
|
digital signature standard (DSS) |
(noun) |
|
DRI |
(acronym) Digital Resource Identifier, a specific kind of uniform universal
identifier, as described in Preserving Digital
Information, §7.3.4. |
|
directed acyclic graph |
(noun) directed graph (q.v.)
in which no path starts and ends at the same vertex. |
|
|
directed graph |
(noun) graph whose edges are
ordered pairs of vertices. Each edge
can be followed from one vertex to the next. |
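To make the two preceding definitions concrete, the following Python sketch represents a directed graph as a list of ordered pairs and tests whether any path starts and ends at the same vertex. The function name `is_acyclic` and the depth-first colouring technique are illustrative choices, not something prescribed by DDQ; the sketch assumes every vertex named in an edge also appears in `vertices`.

```python
def is_acyclic(vertices, edges):
    """Return True if no path in the directed graph starts and ends at the same vertex."""
    successors = {v: [] for v in vertices}
    for source, target in edges:  # each edge is an ordered pair of vertices
        successors[source].append(target)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in vertices}

    def visit(v):
        colour[v] = GREY
        for w in successors[v]:
            if colour[w] == GREY:  # edge back into the current path: a cycle
                return False
            if colour[w] == WHITE and not visit(w):
                return False
        colour[v] = BLACK
        return True

    # Start a search from every vertex that has not yet been reached.
    return all(colour[v] != WHITE or visit(v) for v in vertices)
```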
|
document |
(noun)
representation of any kind of information, such as a command, text, a
photograph, video or audio information, a scientific table, a spreadsheet, a
computer program, or any other kind of information, or any ordered or
unordered combination of such specialized kinds of information, whether
conveyed on a material substrate or represented and conveyed digitally. |
|
domain |
(noun) see function for mathematical context. In relation to security, a set of subjects
and information objects whose use is governed by a set of rules. |
|
durability |
(noun) in transaction processing,
the property that state changes of successfully completed transactions
survive failures. |
|
encapsulate |
(verb) hide selected information
from an external environment, as in certain programming language definitions
of data types. The unit of text involved
is called a capsule. |
|
encrypt |
(verb) scramble data according to a
secret transformation key, so as to make it safe for transmission or storage
in otherwise inadequately protected environments. |
|
environment |
(noun) relative to an activation,
the set of objects (and their values) reachable for a function evaluation. |
|
essence |
(noun) in communication, the
information that a speaker or writer intended to convey, in contrast to
inevitable accidental information. For
instance, in a lecture, the speaker’s voice pitch is usually accidental, not
essential, as are the page break locations in a printed document. |
|
evidence |
(noun) set of facts demonstrating
the truth of some assertion, with each fact being either obvious or
objectively testable; the evidence for a document’s authenticity can be
either external (in the form of attached metadata) or internal (in the
meaning or representation of the document itself). |
|
|
fact |
(noun) a
thing done; an action performed or an incident happening; an event or
circumstance; an actual occurrence; an actual happening in a time or space or
an event mental or physical; that which has taken place. A fact is either a state of things (an
existence) or a motion (an event). |
|
faithful |
(adj.) of a data copy, conforming
accurately to some other data instance, usually identically bit by bit. |
|
finding aid |
(noun) librarian's term for a tool
that is not a catalog, but serves a purpose similar to that of a catalog, to the
extent that something simpler (and much less expensive) can; compare catalog. |
|
firmware |
(noun)
program information used to control the low-level operations of
hardware. Firmware is commonly stored
in read only memory (ROM), which is initially installed in the factory and
may be replaced in the field to fix mistakes or to improve system
capabilities. |
|
fix |
(verb) (1) of content such as text
or an image, make relatively immutable on a physical medium, e.g., by
printing on paper or developing a photographic film; (2) repair; (3)
(colloq.) damage. |
|
folder |
(noun) digital object that contains other objects by
reference. In the digital analog of a
paper folder system, every document except one occurs in exactly one folder;
the folder relationship is acyclic. |
|
formal |
(adj.) pertaining to, or
emphasizing, the organization or composition of the constituent
elements. For example, in a formal
mathematical system the elements of discourse are not associated with
meanings; interest is limited to relationships between elements, which are
deduced from simpler relationships (axioms) on the basis of (a small number
of) combining forms. |
|
function |
(noun) in DDQ, always a
mathematical function. A function is
defined on two sets, the domain and
the range and consists of a
set of pairs in which the first component is from the domain and the second
component is from the range and in which there are no two pairs with the same
first component. A function is total if every member of the domain
is in some pair, and partial otherwise. In common usage, “function” has
many different meanings. Several of
these are used when discussing computing, sometimes within a single
discourse. This is a source of
confusion. |
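The set-of-pairs definition can be checked mechanically. In the Python sketch below, the names `is_function` and `is_total` are hypothetical, and a function is represented simply as a collection of (first, second) tuples.

```python
def is_function(pairs):
    """True if no two pairs share the same first component."""
    firsts = [first for first, _ in pairs]
    return len(firsts) == len(set(firsts))


def is_total(pairs, domain):
    """True if pairs form a function and every member of the domain is in some pair."""
    covered = {first for first, _ in pairs}
    return is_function(pairs) and set(domain) <= covered
```

For instance, the pairs (1, 1), (2, 4), (3, 9) form a function that is total on the domain {1, 2, 3} and partial on the domain {1, 2, 3, 4}.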
|
generic |
(adj.) referring to all members of
a class, group, or kind. For example,
a generic operator denotes a
set of operations of (presumably) similar function; each member of the set
has different operand types. See also operation and operator. |
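One way to picture a generic operator is a single name that dispatches to different operations according to operand type. The sketch below uses Python's standard `functools.singledispatch`; the operator name `size` and the two operations registered for it are hypothetical examples.

```python
from functools import singledispatch


@singledispatch
def size(value):
    """Generic operator: one name denoting a set of operations."""
    raise TypeError(f"no 'size' operation for {type(value).__name__}")


@size.register
def _(value: str):
    return len(value)  # operation for string operands: character count


@size.register
def _(value: int):
    return value.bit_length()  # operation for integer operands: bits needed
```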
|
global |
(adj.) describing an entity which
is accessible without being explicitly mentioned in an operand of the program
defining an operation, or being derivable from explicit operands, or being
created by the operation. |
|
glyph |
(noun)
picture for a character of printed or written language. |
|
granularity |
(noun) measure of the level of
detail with which some data object set is accessible or is controlled by some
process or program being discussed.
For instance, for objects managed by the library system, the
granularity for access control might be items whereas the granularity for
copying to/from library stores might be item parts. |
|
graph |
(noun) picture, or its abstract counterpart, describing the connections among a set of entities. Used in tracts on “object orientation” to represent the relationships between entities for the purpose of making distinctions clear. The entities are denoted by points, called nodes, and the connections by lines, called arcs. If the direction of |