The magazine of the Melbourne PC User Group

XML - for the bookshelf
Major Keary
 

The importance of XML (eXtensible Markup Language) cannot be overestimated. In development are a large number of XML-based initiatives designed for doing business over the Internet. Originally conceived as a replacement for HTML, XML has been developed to the point where it "offers the tantalising possibility of truly cross-platform, long-term data formats" [XML in a Nutshell]. There is increasing enterprise-level reliance on XML and its technologies to ensure permanence of data and access to it. Computer end-users are often unaware that applications they use from day to day make use of XML: OpenOffice stores its files in XML format, and Microsoft Office 2003 can write to XML (but still saves in a binary format by default).

Not many present-day users will have ever seen, let alone used, an 8-inch floppy disk. Suppose you are given such a disk that contains document files created by the original CP/M version of WordStar; unless you can find a copy of WordStar and a machine that can run the program and which has an 8-inch drive, it's a case of being up the creek without a paddle. That fate awaits files saved in MS proprietary binary format. Given that the means to read the media are available, OpenOffice files (like the source files created for TeX) will remain readable because, even if the program is extinct, writing a parser is trivial (at least, for those who know what they are doing). TeX files created in the late 1970s are still valid, even though they are processed on and with equipment that was not even a gleam in the engineer's eye thirty-five years ago. The same goes for files created using the Standard Generalized Markup Language (SGML), of which XML is a subset.
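As a rough illustration of why XML-based files stay readable, here is a minimal sketch in Python that pulls the text out of a small XML document using only the standard library. The document and its element names (document, title, para) are invented for the example and are not the real OpenOffice file format; the function name extract_text is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are illustrative only and
# are NOT the real OpenOffice schema.
SAMPLE = """<document>
  <title>Letter to the editor</title>
  <para>XML is plain text.</para>
  <para>Any parser can read it.</para>
</document>"""


def extract_text(xml_source: str) -> list[str]:
    """Return the text of every element that directly contains text."""
    root = ET.fromstring(xml_source)
    # Skip elements whose text is missing or only whitespace.
    return [el.text.strip() for el in root.iter() if el.text and el.text.strip()]


if __name__ == "__main__":
    for line in extract_text(SAMPLE):
        print(line)
```

Nothing here depends on the program that wrote the file: the tags are plain text, so any XML parser, on any platform, can recover the content.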

Users of digital photography with an eye to posterity should consider the problem of images stored in binary format, especially in CD-based archives. What happens when CD drives have become redundant? Will there be software that can read today's recording formats and media, let alone graphics formats? Photographs taken during the American Civil War are still available to us; will yours still be around, and usable, in 150 years' time? However, that's another subject; the topic here is XML and its applications, which are moving towards development of long-term, cross-platform data formats.

Learning XML

Now in its second edition, Learning XML has been expanded to include schemas ("A schema is any type of model document that defines the structure of something, such as databases or documents" [Beginning XML]). This text provides excellent coverage of XML's foundations and the technologies that have been developed for particular needs. It is the definitive introduction to XML.
 
Erik T. Ray: Learning XML 2/e
ISBN 0-596-00420-6
Published by O'Reilly,
400 pp.,
RRP $85.00 incl. GST

XML in a Nutshell

Now in its third edition, this is the definitive guide and reference for developers. The first half of the book is divided into sections that discuss: XML Concepts (XML fundamentals, DTDs, namespaces, and internationalisation); Narrative-Like Documents (XML as a document format, XML on the web, XSLT, XPath, XLinks, XPointers, XInclude, cascading style sheets, XSL formatting objects, and the Resource Directory Description Language); Record-Like Documents (XML as a data format, schemas, programming models, DOM, and SAX).

The second half of the book is a reference section that contains succinct entries, with examples of syntax and use, covering the W3C recommendations for XML, schemas, XPath, XSLT, DOM, SAX, and character sets with Unicode coding.

Information is easy to find with a detailed table of contents, 34-page index, and a convenient thumb index. The writing is excellent, although this, "... the markup has less semantics to lever off of.", took me by surprise. It must have slipped past the usually eagle-eyed O'Reilly editorial staff.

An essential reference for web designers, programmers working with technologies such as REST and SOAP, and developers involved in internationalisation of applications, web pages, and other situations where language encodings are required. A title that should be in any library.
 

Elliotte Harold and W. Scott Means: XML in a Nutshell 3/e
ISBN 0-596-00764-7
Published by O'Reilly,
689 pp.,
RRP $79.95 incl. GST

XML Hacks

Just released in the Hacks series, this title presents "100 Industrial-Strength Tips & Tools" for XML users. Apart from the working hacks, the author gives the reader glimpses of XML technologies that are in development.

The hacks are grouped under categories: Looking at XML Documents (reading, displaying, character entity references, XML vocabulary tools, Java programs that process XML); Creating XML Documents (includes editing tools, using OpenOffice as a conversion tool, create an XML document from a CSV file); Transforming XML Documents (includes XSLT, convert XML to CSV, generate XML from SQL, generate PDF documents from XML and CSS, perform math with XSLT, generate SVG with XSLT); XML Vocabularies (namespaces, create and validate XHTML documents, identify yourself with FOAF, the OpenOffice file format, using XForms, create technical manuals and papers in XML with DocBook); Defining XML Vocabularies with Schema Languages (validation issues); RSS and Atom (one of the hacks shows how to syndicate a list of books from Amazon with RSS and ASP); and Advanced XML Hacks.
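To give a flavour of one of the tasks listed above, creating an XML document from a CSV file, here is a minimal sketch in Python. It is not the method used in the book; the function name csv_to_xml and the default element names (rows, row) are assumptions made for the example, and it assumes the CSV header row contains names that are valid as XML element names.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str, root_tag: str = "rows", row_tag: str = "row") -> str:
    """Convert CSV text with a header row into an XML string.

    Assumes every column name in the header is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, row_tag)
        # One child element per column, named after the column header.
        for column, value in record.items():
            ET.SubElement(row, column).text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "title,pages\nXML Hacks,460\nLearning XML,400\n"
    print(csv_to_xml(sample))
```

Each CSV row becomes one element and each column one child element, so the result can be fed straight into XSLT or any other XML tool.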

Illustrations (mainly screen shots) are used sparingly to support the text, and there are plenty of code examples (which are also available for download from a Web site).

Even though the book is a valuable resource intended for users who know their way around XML technologies, it also provides a fascinating insight for those who would like to know more about XML and its capabilities.
 

Michael Fitzgerald: XML Hacks
ISBN 0-596-00711-6
Published by O'Reilly,
460 pp.,
RRP $49.95 incl. GST

Beginning XML

Wrox Press puts all its efforts into content; their titles typically have multiple authors (and not just two or three), and are information dense — both in the sense of what they pack in, and in the sense of space (pages are wall-to-wall text). Given the number of contributors, I am always surprised by the uniformity of style achieved.

Beginning XML is a book for programmers written by programmers (nine of them) and is in much the same mould as an academic 'introduction': readers are expected to know things or be capable of finding information for themselves. In short, this is not a primer. The intended audience includes programmers familiar with some Web programming or data interchange techniques, and those who work in a programming environment but have no experience of Web development or data exchange. Readers are not expected to know anything about XML, SGML, or markup languages at large.

So, if you can cope with terse entries such as,
Input Functions
At the time of writing, the XQuery input functions are the doc() function and the collection() function and both are implemented in Saxon 7.8.
this book is for you. However, not all of the content is like that; such entries occur frequently, but there is also extensive discussion of concepts, especially in the introductory chapters. At the end of each chapter there are self-test questions, with answers in an appendix.

Of particular interest to database users and developers is a chapter on XQuery, which "provides a powerful mechanism to pull XML content from multiple sources and dynamically generate new content using a user-friendly declarative language" [XML Hacks]. XQuery can be compared to SQL; it is described here, albeit tersely, in more detail than will be found in other current literature.
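For readers who know SQL, the comparison can be made concrete with a small sketch. XQuery is not part of the Python standard library, so the example below only mimics the idea with Python's ElementTree module; the equivalent SQL and XQuery are shown as comments. The catalogue, the file name books.xml, and the function name titles_longer_than are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue built from the titles and page counts in this review.
CATALOGUE = """<books>
  <book><title>Learning XML</title><pages>400</pages></book>
  <book><title>XML Hacks</title><pages>460</pages></book>
  <book><title>Beginning XML</title><pages>995</pages></book>
</books>"""

# Roughly the same question in SQL (assuming a table called books):
#   SELECT title FROM books WHERE pages > 450
# and in XQuery (assuming the catalogue is saved as books.xml):
#   for $b in doc("books.xml")/books/book
#   where $b/pages > 450
#   return $b/title


def titles_longer_than(xml_source: str, min_pages: int) -> list[str]:
    """Return the titles of books with more than min_pages pages."""
    root = ET.fromstring(xml_source)
    return [
        book.findtext("title")
        for book in root.findall("book")
        if int(book.findtext("pages")) > min_pages
    ]


if __name__ == "__main__":
    print(titles_longer_than(CATALOGUE, 450))
```

The point of the comparison is that both SQL and XQuery are declarative: you describe the records you want and leave the searching to the query engine, where the Python version has to spell out the loop.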

This is a hefty, and most comprehensive, overview of XML technologies. It is especially suited to people wanting a self-study introduction, even though the specific audience is programmers who want to learn XML — for them it provides the kind of compact, detailed information they need.
 

David Hunter et al.: Beginning XML 3/e
ISBN 0-7645-7077-3
Published by Wrox Press,
995 pp.,
RRP $66.95 incl. GST

Reprinted from the August 2005 issue of PC Update, the magazine of Melbourne PC User Group, Australia
