DAISY—Structure Guidelines: Intro to Structured Markup, Markup
Markup is the identification and tagging of the components of a text. The more detailed the markup, the greater the access provided to the end user. A DTB without markup is as inaccessible as an analogue book on cassette without tone indexing, which would not allow the reader to navigate to points within the book.
Furthermore, in the digital world, distinguishing one structural element from another is of great importance; when an element is identified and marked up, properties special to that element can be assigned to it, resulting in increased flexibility and enhanced navigation for the reader. For example, in an analogue recording the narrator pronounces or spells out an acronym, as appropriate. In a DTB containing a text file that may be accessed by a browser with synthetic speech, it is important for the markup to indicate whether the acronym should be spelled out or pronounced. Whether the acronym is to be spelled or pronounced is a property assigned to the acronym tag.
When elements are identified they can be displayed according to user needs. A user may not want to hear the sidebars in a book. If the sidebars are identified and marked up with the sidebar tag, the end user can choose to skip them, listen to them as they occur, or even listen only to them.
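As a minimal sketch of the sidebar case, the fragment below marks a sidebar so that a playback system could offer it as skippable. The render attribute and its value are given here as recalled from the DTBook DTD and should be checked against the DTD version in use; the heading and text are invented for illustration.

```xml
<!-- Illustrative fragment; all text content is hypothetical. -->
<p>The main narrative continues here.</p>
<sidebar render="optional">
  <hd>Did You Know?</hd>
  <p>Supplementary material the reader may choose to skip.</p>
</sidebar>
<p>The main narrative resumes after the sidebar.</p>
```

Because the sidebar is a distinct element, a player can treat everything between its opening and closing tags as one unit to skip, play in place, or play on its own.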
XML markup components are variously referred to as elements and tags, although we attempt to maintain a distinction:
A tag is XML code, surrounded by angle brackets (< and >). All tags are either opening (as in <p>), closing (as in </p>) or self-closing (as in <br/>). In the following example, the q tags are used to mark a short quotation within a paragraph:
<p>As Yogi Berra said, <q>"It ain't over 'til it's over."</q></p>
<q> indicates the beginning of the quote and </q> indicates the end of the quote; <p> and </p> wrap the entire paragraph. Note: to be well-formed XML, tags must be closed in the reverse of the order in which they were opened.
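The nesting rule can be illustrated with a short contrast. The sentence used is invented for illustration; the second version is placed inside a comment because an XML parser would reject it.

```xml
<!-- Well-formed: q is opened last, so it is closed first. -->
<p>She said, <q>See you tomorrow.</q></p>

<!-- NOT well-formed: p is closed before q, so the two elements overlap.
     <p>She said, <q>See you tomorrow.</p></q>
-->
```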
An element is a matched pair of tags (opening and closing), attributes in the opening tag, and all text and tags contained between the matched tags. An element can also be a self-closing tag and its attributes.
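The fragment below is a small sketch of both kinds of element described above. The id value and the text are hypothetical.

```xml
<!-- One p element: an opening tag carrying an id attribute, text,
     a self-closing br element, a nested em element, and the closing tag. -->
<p id="p_001">The first line ends here,<br/>and the <em>second</em> line follows.</p>
```

Here the p element includes everything from <p id="p_001"> through </p>, while the br element consists of the single self-closing tag.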
Tags and elements are not normally displayed in a DTB.
An attribute functions somewhat like an adjective, providing more information about the structure a tag is identifying. One of the most commonly used attributes is “class”. In the following example, class="chapter" indicates that the “level” tag begins a chapter section:
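A minimal illustration of such a tag in context, assuming a level element that contains a heading and a paragraph (the heading and paragraph text are hypothetical):

```xml
<level class="chapter">
  <hd>Chapter 1: Getting Started</hd>
  <p>The first paragraph of the chapter.</p>
</level>
```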
The attribute “id” is heavily used to uniquely identify each structural element of the book, and is usually inserted automatically by DTB production software. Other uses of attributes include indicating whether or not an item may be “turned off” as part of a group of items the user wishes to skip, and indicating if an acronym should be pronounced as a word or spelled out letter by letter, as mentioned earlier.
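As a sketch of these uses, the fragment below combines id attributes with an attribute that controls how an acronym is read. The pronounce attribute and its yes/no values are given as recalled from the DTBook DTD and should be confirmed against the DTD version in use; the id values are hypothetical stand-ins for what production software would generate.

```xml
<!-- pronounce="yes": read as a word; pronounce="no": spell out letter by letter. -->
<p id="p_0107">
  <acronym id="acr_0001" pronounce="yes">NATO</acronym> and the
  <acronym id="acr_0002" pronounce="no">FBI</acronym> issued a joint statement.
</p>
```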
An attribute, if used, must appear in the start tag and the value of the attribute (in the above example, “chapter”) must be in quotes. In most cases the use of attributes is optional. Tags for which they are required will be clearly identified in Part II of these guidelines.
One attribute that warrants special mention is “smilref.” It is used to synchronize the textual content file and the SMIL file when a user moves between navigation controlled by the SMIL file and navigation controlled by the textual content file. The DAISY Standard requires it to be present and have a value for each element in the textual content file that is referenced by a SMIL file. Normally, both the SMIL files and the smilref attributes would be created by the DTB production software.
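A minimal sketch of what this might look like in the textual content file, assuming the smilref value is a reference to a SMIL file followed by a fragment identifier. The file name, fragment identifier, and id value are hypothetical.

```xml
<!-- The smilref value points to the SMIL element that references this paragraph. -->
<p id="p_0042" smilref="chapter01.smil#par_0042">Text that is synchronized with recorded audio.</p>
```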
The following tags are required for a book to be valid within the current DTBook DTD.
The complete DAISY DTB is surrounded by the <dtbook> and </dtbook> tags. Within these, the <head> and </head> tags and the <book> and </book> tags must also be present, in that order, as required by the DTD. The <head> tags identify information about the book that is separate from the content. The <book> tags enclose the content of the book. The following example illustrates how these tags are used.
<dtbook>
<head> (Information About the Book) </head>
<book> (The entire content of the book, including cover information, etc.) </book>
</dtbook>
The <link> element appears in the <head> section of a document. It establishes relationships between the current document and other documents, which is useful when the content has been divided into separate DTBook documents. The <link> element conveys relationship information (for example, next and previous) that may be rendered in a variety of ways. <link> is implemented much as it is in XHTML; for information on its use, consult sources on link within XHTML, such as the W3C tip sheet on link or the link element section in the XHTML 2.0 specification.
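As a hedged sketch, a book split into three DTBook documents might relate the middle document to its neighbours as follows. The rel values follow the XHTML link types, and the file names are hypothetical.

```xml
<head>
  <!-- Placed in the second of three documents. -->
  <link rel="prev" href="volume1.xml"/>
  <link rel="next" href="volume3.xml"/>
</head>
```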
Meta provides the metadata elements for the book and appears in <head>. It is the container for the Dublin Core attributes and additional DTBook attributes. As a minimum, dc:Title and dtb:uid are required. Complete, accurate metadata should be included in all DAISY DTBs.
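A minimal sketch of a head section carrying the two required items plus one optional Dublin Core item, assuming the name and content attribute pair used by meta in XHTML. All values are invented for illustration.

```xml
<head>
  <!-- Required: a unique identifier for the book and its title. -->
  <meta name="dtb:uid" content="org-example-dtb-00001"/>
  <meta name="dc:Title" content="An Example Talking Book"/>
  <!-- Optional additional metadata. -->
  <meta name="dc:Creator" content="A. N. Author"/>
</head>
```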
Within <book> the content should generally be divided into three sections which should be presented in the following order and tagged with the elements <frontmatter>, <bodymatter>, and <rearmatter>. See Information Object: Front Matter: Major Structural Elements.
- Front Matter—consists of information found in the preliminary pages of a book (e.g., title, author, book jacket material, foreword, acknowledgements, dedication, and table of contents) as well as information added by the talking book producer (e.g., date of recording, narrator, studio, special copyright message).
- Body Matter—consists of the basic content of the document as distinguished from prefatory and supplementary materials. The body matter may be divided into parts, chapters, sections, etc.
- Rear Matter—consists of material following the main body of the book such as: appendices, bibliographies, alphabetical indexes, etc. These items should be presented in the sequence found in the printed book.
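The three sections described above can be sketched as the following skeleton. The level1 and h1 elements, the class values, and all text are illustrative assumptions; the elements actually required within each section should be taken from the DTD.

```xml
<book>
  <frontmatter>
    <doctitle>An Example Talking Book</doctitle>
    <docauthor>A. N. Author</docauthor>
    <level1 class="foreword">
      <h1>Foreword</h1>
      <p>Preliminary material from the printed book.</p>
    </level1>
  </frontmatter>
  <bodymatter>
    <level1 class="chapter">
      <h1>Chapter 1</h1>
      <p>The main content of the book.</p>
    </level1>
  </bodymatter>
  <rearmatter>
    <level1 class="appendix">
      <h1>Appendix A</h1>
      <p>Supplementary material, in the order found in print.</p>
    </level1>
  </rearmatter>
</book>
```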