Notes about XML and XML Tools and Related Stuff: SGML, XSL, XSLT, UML, …

H.M. Gladney

HMG Consulting

20044 Glen Brae Drive

Saratoga, CA 95070

© 2002-6 H.M. Gladney

20-Aug-06

Bibliographies and Links.

IBM’s “Working XML” (Summary)

The XML Cover Pages.

Standards Tracks and XML Standards Reference.

SOAP (Simple Object Access Protocol) and Java SAX (Simple API for XML)

OASIS

DELOS Standardization Forum

Information Society Technologies (IST) Electronic Publishing

XML/EDI

Standard Naming (XML at Builder.com, 30th Oct 02)

XML Dictionary.

Digital Object Models (DOM, CSS, HTML, Web, XML, SVG, …)

NameSpaces and URIs.

Entity and URI Resolvers

W3C Specification: Namespaces in XML

Resource Directory Description Language (RDDL)

ITSE Namespace Navigator

XNS Naming and Addressing

XML Name Access Control (NAC) Repository

XHTML.

W3C Releases XHTML 1.0 Spec (OpenEnterprise Trends 16th Aug 2002)

XML, Query, DB and DL.

SRU protocol (query in the URL, XML as response):

SRU/SRW protocol

XQuery

Full Text Search Functionality in XQuery--a Status Report

XQEngine (formerly XML Query Engine)

James (Java SAX for MARC) Beta (9th May)

James → MARC4J

MARC4J Release 2

XOBIS: the XML Organic Bibliographic Information Schema

Introducing XOBIS to the FRBR Working Group (2003)

MarcXchange standard (April 2005)

Using Greenstone (Gordon Paynter to XML4Lib, 17th April 2002)

Wrapping Websites into XML for Archival Storage

TEI and XML: A Marriage made in Heaven

XML Topic Maps

Create XML Document from Undefined Legacy Data

Database and Persistence.

XML/Database Links

Castor {data binding}

XML Databases

Sleepycat Berkeley DB XML (29-Jul-03)

IBM DB2 9 “Viper” Swallows XML Whole

TeraText and DB

News.

Faster XML Ahead?

Tools (some also shown by W3C main page)

Free XML tools and software (XML Tools by Category)

Microsoft XML Data Files

Sun XML Developer Connection

Forte™ Tools

Web Developer’s Virtual Library

The Web’s Best Freebies

Tips for Unlocking XML’s Secret Powers

XML Console

MIME

Substantives in XML Slots

XMLTree (Free)

Stylus (eXcelon) ($199)

Logictran RTF Converter ($70 and $25 annual maintenance)

Ixiasoft TextML Server

Irfanview Free Software

Metadata Resources from Nelinet

Extensible Document Level Metadata

Component level metadata

Metadata for Science, Research, Education and Technology (Lopatenko)

VTD-XML 1.0 (6-Oct-05)

SGML2X: SGML-to-XML Converter

VorteXML Turns Data Into XML

Hermes LateX to XML

XL2Web Package

Generation of Java from XML Schema

JAVA XML Processing APIs

Dynamic XML for Java: DXMLJ

Tools for XML/XSL/XLink

Crosswalks

E4X

Other Tools

XML Editors.

jEdit

OxygenXML

Xerlin

Swish and Tcl

OOL {Java SAX filters for working with out-of-line markup}

XML Schema Validation

Making a Document Searchable with Swish

xTagger: Authoring Document-centric XML

Eclipse with XMLbuddy

Summary Comment on XML Editors

XML Browsers.

Doczilla

The Versioning Machine.

HTMLtidy

Via IE Plug-Ins

HTML-Kit

XML to PDF

Embedding XML into PDFs (Charles Myers, Adobe, 30th May)

Document tagging

Search.

Sgrep

XML-parsing and XML-APIs.

Pretty Printing Parser: Tony

Comparing XML Files.

Performance.

READY YOUR ENTERPRISE NETWORK FOR XML

Binary XML

XML Compression

AlphaWorks XML and Web Services Development Environment (WSDE)

XML search engine from IBM

IBM XML Generator with VA Java 3.5

WebSphere Development Environment

XML Apache consists of seven sub-projects, each focused on a different aspect of XML:

XML Parser for Java

Cocoon - XML-based web publishing, in Java (downloaded to g:\xml\downloads)

XML Editor Maker

Preservation Metadata.

The eXtensible Rule Markup Language (XRML)

Security and IP Management

XML Security Suite

XML Security Library Releases Tools, Resources

White Paper: XML Web Services Security

Encryption and Key Management

XML Key Management Specification (XKMS) (March 2001)

Signatures and Certificates.

XML-Signature Syntax and Processing

Draft ETSI TS on XML Advanced Electronic Signatures

XML Signatures

Access Control

eXtensible Access Control Language (XACL)

SAML for Security Assertions (OASIS XML-based Security Services TC)

XACML -- A No-Nonsense Developer's Guide

Risks.

XML Security Risks (PC Magazine, April 2, 2002)

Securing XML Data

DRM Tools.

XrML

SMIL XML Multimedia Standard

Synthesize Speech with XML.

TALKING HEADS

SOMETHING OLD, SOMETHING NEW

TEXT TO SPEECH

FORMATTING SSML

HOW TO SPEAK

ADVANCED SSML

MAPFORCE 2004 Release 3!

Whitepaper on Data Integration with MAPFORCE 2004

XML Schema, Structures, Datatypes, and Schema Registry.

XML Schema Resources

Comparing DTDs and XML Schema: Are DTD’s Dead?

Miscellany

Web Service Schema

XML Schema: Formal Description.

From “Getting Started with XML Schemas”.

Keys for XML and Databases.

DocBook XML DTD.

Sample DocBook Documents

DocBook Stylesheets

MatML (Materials XML)

Related Samples.

Data Exchange Descriptions with XML and Java

XML Schema for Schemas:Structures and Data Types.

XML Schema for Particular Topics or Disciplines.

More Specialized Discipline Metadata

From LandXML, for Land management

From NetBryx, XML Schema Files (also StyleSheets)

MARCXML, etc.

XOBIS: the XML Organic Bibliographic Information Schema

XBRL (Business Reporting)

Medicine and Bio-Sciences

DTD for Content Model for Electronic Archiving and Publishing of Journal Articles

SyncML for Wireless.

Tutorial

DATA VALIDATION WITH XML SCHEMAS

XML SCHEMAS

THE XML DOCUMENT

CREATING THE DTD

CREATING THE SCHEMA

Maintaining Schemas for Pipelined Stages

Specify Dataset Needs in XML, not Code

End User Problem with XML

Test Beds and Applications.

Publishing Dissertations in XML at UMich

Stanford Lane Medical Library

The XML Family of Technologies, Technology Watch Briefing 7, DigiCULT

XML for Government Documents.

U.S. Code

Schema for Specific Domains and Disciplines.

Citations.

Common Lisp support for the Extensible Markup Language (CL-XML)

XML Encoding of Simple Dublin Core Metadata

Dublin Core XML Schema

E-Mail

U.S. Navy

U.S. Congress: XML and Legislative Documents

Sources of XML DTDs

METS Opening Day Presentations

Data Dictionary for Technical Metadata for Digital Still Images

Digital Video: Video Development Initiative (ViDe)

Related Languages.

eXtensible Data Format (XDF)

XFDL (Extensible Forms Description Language)

SVG (Scalable Vector Graphics)

Ted Nelson’s ZigZag

XML and Java.

VTD-XML

Conversion to XML.

Xanadu

Books, Articles, and Tutorials.

Web Services.

Keys to Clean Up XML.

Remedial XML.

XML for Records Managers.

XSLT and its Web Applications.

Command Line Invocation

XSL Stylesheets and Docbook

Using XSL tools to publish DocBook documents

Saxon

XSLT Resources

http://jaxen.org/

TclXML

XSLT-Mediated File Format Migration as a Digital Preservation Strategy

Quick Study: Improve XML Transformations

XSLTC

Xlink.

XInclude to Promote Modularity and Reusability.

MODULAR XML

INCLUSION EXAMPLE

PurchaseOrder.xml:

CustomerInformation.xml:

PurchaseOrder.xml:

XInclude Standardization

XML in a Nutshell Index.

Tim Bray, XML and More, May 2000.

XML: The Digital Library Hammer, Tennant, March 2001.

CONTENT AUTHORING

LEARN HOW TO CREATE WEB PAGES IN OO STYLE (from Java Builder e-mail

XML Unleashed ToC

Application Profiles.

Application profiles: mixing and matching metadata schemas

XML: Draft Specification for DC-based Application Profile

Unclassified.

Practical Examples.

Bibliography.

Appendix A: XML and XML Schema Samples.

NetBryx Book.

XML Schema

Recipe.

In XML with link to XSL Stylesheet

DTD

XML Schema N/A

XSL Stylesheet

Poem and Recipe: all pieces for Web Display

Wordsworth Poem..

In XML with link to XSL Stylesheet

DTD

XML Schema

XSL Stylesheet

Metadata Server in the CARMEN Project

Sample of a Dublin Core Description.

DiML (Dissertation Markup Language - 1999)

Samples from TopXML tutorial

XML for Molecular Biology: eXpressML.

Markup Languages (LegalXML, NewsML, Commerce XML, …)

Appendix B: BNF for XML, XSL, XML Schema, DTD, …..

BNF and EBNF: What are they and how do they work?

The extended Backus-Naur format (EBNF)

W3C XML Specification DTD (“XMLspec”)

Bibliographies and Links

IBM’s “Working XML” (Summary)

UML, XMI, and code generation, Part 1
In this first article in a new series on UML and XML schema development, Benoit discusses the motivations for modeling XML schema through the use of UML.

..., Part 2
In the second part of this series on UML and XML, Benoit introduces the UML metamodel. He proceeds to XMI, the XML-based specification for the exchange of ...

..., Part 3
In his third article on UML modeling and XML, Benoit further refines the conversion stylesheet with the introduction of stereotypes and tags.

..., Part 4
In this final article in his series on UML and XML, Benoit wraps up the technique.  He discusses the need to simplify the model by burying some of the logic ...

Using XSLT for content management
This is the first installment of Working XML, a column with companion project code that demonstrates the evolution of full-fledged XML applications.

Link management and preparing the future
This article shows how to use XML filters to add new functionality to XM, an open-source Web publishing application.

Fundamentals of Web publishing with XML
As more developers learn and experiment with XML, many have become interested in using stylesheets to publish and manage Web sites.

Define and load extension points
In this article, Benoit takes integration between XM, the simple content-management solution, and Eclipse one step further. Publishing a Web site requires ...

Creating a project
Work continues to integrate Eclipse -- IBM's open-source project to build an extensible IDE for Java developers -- and Benoit Marchal's simple ...

A lightweight XML client
While excellent solutions are available for large corporations that want to implement XML, few solutions exist for smaller organizations.

Processing instructions and parameters
This month our hardworking columnist adds support for multiple style sheets to the XM content-management project. In so doing, he taps into TrAX URIResolver ...

Compiling the proxy
In this column, Benoit provides the front end for the Handler Compiler, HC, and encounters unexpected problems with the DFA. A stable but less than optimal ...

A first version of the lightweight client
Benoit continues to develop a lightweight XML client. In this article, he shows you how to create SOAP transactions through XSLT.

Importing text as XML with XI
This column marks the launch of the third 'Working XML' project. This new project deals with importing text documents in an XML publishing solution (or any ...

Wrestling with Java NIO
This column takes the XI project to the next step. Here, Benoît reports his findings with the new Java technology APIs -- in particular, ...

Wrapping up XM version 1
In this month's column, developer and author Benoît Marchal adds final features to the first release of XM, a low-cost open-source content management ...

Compiling the paths and automating tests
This month, our columnist discusses the compilation algorithm. He also invests a bit of time automating tests with JUnit.

Building a compiler for the SAX ContentHandler
This installment of the column describes the requirements for the Java project and analyzes its overall design. The new project, called HC (short for ...

Compiling XPaths
This month our columnist describes how he implements the DFA construction algorithm, giving the first concrete example of using the compiler to recognize ...

Map files into SOAP requests, Part 1
Many applications are being upgraded to accommodate e-commerce transactions. In the first of two articles on the subject, Benoit Marchal shows one simple ...

Mapping files into SOAP requests, Part 2
Many applications are being upgraded to accommodate e-commerce transactions. In his previous column, Benoit Marchal analyzed legacy data and showed how to ...

Wrapping up XI
Columnist Benoît Marchal continues to shape XI, an open-source project that converts legacy text to XML. For increased efficiency, XI now implements the SAX ...

Putting XI to good use
When it comes to user interfaces, simplification is the key. Fewer options and fewer controls mean less confusion and less chance for error.

Link management and preparing the future
In this installment of Working XML, Benoit Marchal uses XML filters to add new functionality to XM, his open-source Web publishing application.

Compiling XPaths
The Java-based Handler Compiler (HC) project for SAX parsing nears its alpha release. This month our columnist describes how he implements the DFA ...

The XML Cover Pages

Also Standards and Interoperability

Standards Tracks and XML Standards Reference

SOAP (Simple Object Access Protocol) and Java SAX (Simple API for XML)

NT Explorer: Simple Object Access Protocol (SOAP): Brett Burridge investigates the use of the Simple Object Access Protocol (SOAP), the XML-based protocol that is taking a leading role in the emerging area of Web Services.

OASIS

OASIS, the Organization for the Advancement of Structured Information Standards, is a non-profit, international consortium that creates interoperable industry specifications based on public standards such as XML and SGML, as well as others that are related to structured information processing.

DELOS Standardization Forum 

A number of emerging Web standards will provide much of the basic architecture for digital libraries (RDF, Dublin Core, INDECS, DIENST protocol, UNICODE, XML, Z39.50, etc.). Many of these standards have just begun to move from research to deployment.  Implementers will need to follow the progress of research, while researchers will need to monitor the experience of early adopters. The refinement of a stable architecture for digital libraries will require several iterations as these standards are adapted for various applications. The Standardisation Forum focuses on issues related to the deployment of standard metadata schemas and of related infrastructures for publishing, registering, and cross-linking application schemas based on standards.

Several DELOS Standardization working groups will examine the relation between standards for high-level resource description, such as Dublin Core, and standards for fine-grained description within specialised applications, and how multiple standards can be used in combination to address the unique needs of applications. This issue was the focus of a metadata workshop held in Vienna on 30 June 2000 organised by the Standardisation Forum at an EC concertation event presenting projects sponsored by the Digital Heritage and Cultural Content programme, as well as of a workshop session on "Building Digital Library Portals with Harvested Metadata" held on 8 February 2001 at the First EU-DL All Projects Concertation meeting in Luxembourg.

* Agent’s Requirements in Digital Libraries

* Formal Metadata

Information Society Technologies (IST) Electronic Publishing

XML/EDI

Notes about secure data exchange

Standard Naming (XML at Builder.com, 30th Oct 02)

How many times have you received an XML document from an external source and the data was all there, but you knew you would need to tweak it to make it work with your system? It's a common problem that is often the result of a lack of precision. Being more precise can alleviate problems with many common data types. Let's look at some examples of how you can create precision standards for your XML elements.

THE NAME GAME

One of the most common problematic elements is one that includes name information. Here are a few of the many name formats:

* SCHAFFNER, BRIAN

* Brian T. Schaffner

* Mr. Brian Schaffner

* Rev. Dr. Brian Schaffner, III

There are several different formats for someone's name, and communicating it in XML can be confusing if there's not enough precision in your elements to hold each component. Let's try to create an all-encompassing name element that you can use for nearly any name.

We'll start with the first part of the name--the prefix. Of course, there may be more than one prefix (as in the Reverend Doctor), so you'll need space for multiple prefixes, and probably something that indicates their proper order.  Next is the first or proper name followed by one or more middle names (or perhaps an initial, or perhaps no middle name). Nearly last is the last name or surname followed by the suffix (as in Jr.). When put together, it might end up like this:

<Name>
  <Prefix order="1">
    <Abbreviated>Rev.</Abbreviated>
    <LongForm>Reverend</LongForm>
  </Prefix>
  <Prefix order="2">
    <Abbreviated>Dr.</Abbreviated>
    <LongForm>Doctor</LongForm>
  </Prefix>
  <Proper>Brian</Proper>
  <Middle>T.</Middle>
  <Surname>Schaffner</Surname>
  <Suffix order="1">
    <Abbreviated>III</Abbreviated>
    <LongForm>the third</LongForm>
  </Suffix>
</Name>
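As a rough illustration of why the order attribute matters, the sketch below reads a Name element like the one above and assembles a display string. It uses Python's standard xml.etree.ElementTree module; the function names ordered_parts and format_name, and the choice to separate suffixes with a comma, are assumptions made for this example and are not part of the Builder.com article.

```python
import xml.etree.ElementTree as ET

# Same structure as the Name element above; the prefixes are deliberately
# listed out of order to show that the order attribute drives the result.
NAME_XML = """
<Name>
  <Prefix order="2"><Abbreviated>Dr.</Abbreviated><LongForm>Doctor</LongForm></Prefix>
  <Prefix order="1"><Abbreviated>Rev.</Abbreviated><LongForm>Reverend</LongForm></Prefix>
  <Proper>Brian</Proper>
  <Middle>T.</Middle>
  <Surname>Schaffner</Surname>
  <Suffix order="1"><Abbreviated>III</Abbreviated><LongForm>the third</LongForm></Suffix>
</Name>
"""


def ordered_parts(name, tag, form):
    """Return the chosen form of each <tag> child, sorted by its order attribute."""
    items = sorted(name.findall(tag), key=lambda e: int(e.get("order", "0")))
    return [e.findtext(form, default="") for e in items]


def format_name(name, form="Abbreviated"):
    """Join prefixes, proper name, middle names, surname, then suffixes."""
    parts = ordered_parts(name, "Prefix", form)
    parts.append(name.findtext("Proper", default=""))
    parts.extend(m.text or "" for m in name.findall("Middle"))
    parts.append(name.findtext("Surname", default=""))
    text = " ".join(p for p in parts if p)
    suffixes = ordered_parts(name, "Suffix", form)
    if suffixes:
        # Assumed convention: suffixes follow the surname after a comma.
        text += ", " + " ".join(suffixes)
    return text


name = ET.fromstring(NAME_XML)
print(format_name(name))              # Rev. Dr. Brian T. Schaffner, III
print(format_name(name, "LongForm"))  # Reverend Doctor Brian T. Schaffner, the third
```

Because every component lives in its own element, any of the name formats listed earlier can be produced from the same record without guessing where one part ends and the next begins.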

XML PHONE HOME

Another element in XML documents that frequently suffers from format abuse is telephone numbers. Here are a few of the many ways to format telephone numbers:

* (502) 555-1212

* 502.555.1212

* 5025551212

* 502-555-1212

* +1(502)-555-1212 x334

There are certainly others not listed here. You can run into the same problem with phone numbers as you do with names. The solution is to determine all the components that a telephone number may include and create a model that supports them. That way you won't lose any information in your XML documents.

The formats above contain up to five components of a telephone number: country code, area code, exchange, number, and extension. Creating a comprehensive telephone number element shouldn't be too difficult with this information:

<TelephoneNumber>
  <CountryCode>1</CountryCode>
  <AreaCode>502</AreaCode>
  <Exchange>555</Exchange>
  <Number>1212</Number>
  <Extension>334</Extension>
</TelephoneNumber>
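A minimal sketch of how the loosely formatted strings listed above might be normalized into this element, using only the Python standard library. The regular expression covers just the five North American formats shown; the pattern, the function name to_element, and the default country code of "1" are assumptions for illustration and would need extending for other numbering plans.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: North American numbers only, as in the examples above.
PHONE_RE = re.compile(
    r"""^\s*
        (?:\+(?P<country>\d{1,3})\s*)?        # optional country code
        \(?(?P<area>\d{3})\)?[\s.-]*          # area code, parentheses optional
        (?P<exchange>\d{3})[\s.-]*            # exchange
        (?P<number>\d{4})                     # line number
        (?:\s*(?:x|ext\.?)\s*(?P<ext>\d+))?   # optional extension
        \s*$""",
    re.VERBOSE | re.IGNORECASE,
)


def to_element(text, default_country="1"):
    """Parse a phone string and return a TelephoneNumber element."""
    m = PHONE_RE.match(text)
    if m is None:
        raise ValueError(f"unrecognized telephone number: {text!r}")
    phone = ET.Element("TelephoneNumber")
    # default_country is a hypothetical fallback when no +N prefix is given.
    ET.SubElement(phone, "CountryCode").text = m.group("country") or default_country
    ET.SubElement(phone, "AreaCode").text = m.group("area")
    ET.SubElement(phone, "Exchange").text = m.group("exchange")
    ET.SubElement(phone, "Number").text = m.group("number")
    if m.group("ext"):
        ET.SubElement(phone, "Extension").text = m.group("ext")
    return phone


for raw in ["(502) 555-1212", "502.555.1212", "5025551212",
            "502-555-1212", "+1(502)-555-1212 x334"]:
    print(ET.tostring(to_element(raw), encoding="unicode"))
```

The first four inputs serialize to the same element, which is the point of the exercise: the XML carries the components, and presentation format becomes a concern of whoever displays the number.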

XML Dictionary

My new book "Dictionary of XML Technologies and the Semantic Web" (Springer-Verlag, 2004) is now available for sale world-wide.

This 250-page hardcover includes over 1,800 terms/acronyms and 264 illustrations. An accompanying CD-ROM contains a searchable version of the dictionary.

For more information, visit Amazon http://www.amazon.com/exec/obidos/tg/detail/-/1852337680/ or see sample pages at Springer http://www.springeronline.com/sgw/cda/frontpage/0,10735,3-40109-22-17629028-0,00.html

Digital Object Models (DOM, CSS, HTML, Web, XML, SVG, …)

The tutorial at w3schools <http://www.w3schools.com/dom/default.asp> will get you into the XML DOM quickly. Perl shouldn't be any different from any other language; that's the point of the DOM.
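To make the language-neutrality point concrete, here is a small sketch in Python using the standard xml.dom.minidom module. The sample document is a cut-down version of the telephone number element from the Standard Naming notes, chosen only for illustration. The method and property names used (getElementsByTagName, createElement, createTextNode, appendChild, firstChild, data) come from the W3C DOM interfaces, so roughly the same calls should appear in DOM bindings for other languages.

```python
from xml.dom.minidom import parseString

# Parse a small document into a DOM tree.
doc = parseString(
    "<TelephoneNumber><AreaCode>502</AreaCode>"
    "<Exchange>555</Exchange></TelephoneNumber>"
)
root = doc.documentElement

# Read: find an element by tag name and take the data of its text node.
area = doc.getElementsByTagName("AreaCode")[0]
print(root.tagName, area.firstChild.data)

# Write: create a new element with a text child and append it to the root.
number = doc.createElement("Number")
number.appendChild(doc.createTextNode("1212"))
root.appendChild(number)

print(root.toxml())
```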

NameSpaces and URIs 

The list had some 1840 postings through July 03, 2000. See in particular the post of C. M. Sperberg-McQueen on 2000-07-03: "Moving toward a decision", [Proposed by the Plenary:] "Proposed: to deprecate the use of relative URI references in namespace declarations; that is: to say that while t