DAISY—Structure Guidelines: Intro to Structured Markup, Structure and Hierarchy
The main elements of a document, such as parts, chapters, sections, stanzas, etc., and their interrelationships, constitute its primary structure. These are ordinarily arranged hierarchically. For example, a novel consisting of an introduction and ten chapters has a very simple structure of eleven elements all at the same hierarchical level. On the other hand, a textbook containing parts, chapters, and sections has a more complex structure with text elements at three hierarchical levels: parts at the highest level, chapters at the middle level, and sections at the lowest level. Appropriate markup is used to identify the proper hierarchical structure of a document.
Levels describe the relative position of the major structural elements of a book. The hierarchy they define gives the end user the ability to navigate within the Digital Talking Book (DTB). It is therefore critical that the markup of levels be correct.
Two methods of marking up levels are allowed:
Six tags are used: <level1>, <level2>, <level3>, etc., through <level6>, with the highest level of a book tagged as <level1>.
A single <level> tag is used to mark all levels, with differences between the levels defined by a nesting hierarchy, and optionally with the “depth” attribute. (See Alternative Markup in Part II(a): Major Structures).
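As a rough illustration of the second method, the fragment below nests a single <level> tag two deep and records the depth explicitly with the optional depth attribute. The class values and the placeholder comments are assumptions chosen for illustration only; the Alternative Markup discussion remains the authoritative description.

```xml
<!-- Sketch only: one <level> tag at every depth; nesting defines the hierarchy. -->
<level depth="1" class="part">
  <!-- heading and content of the part -->
  <level depth="2" class="chapter">
    <!-- heading and content of the chapter -->
  </level>
</level>
```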
In the following examples and discussion, only the level1 to level6 method is described.
A level is marked up in the following way. Determine at which level the structural component (part, chapter, section, etc.) occurs in the original document. The class attribute may be used to name (identify) it. The use of class attributes is not required; however, in some players they may provide additional information to the user.
If the highest level of the book is “Part”, the tag might read <level1 class="part">, and if the next level consists of “Chapter”, the second level might read <level2 class="chapter">. If a book is made up of chapters which contain sections, the chapters might be tagged <level1 class="chapter"> and the sections tagged <level2 class="section">. In a book with one level (chapters), only the <level1 class="chapter"> tag would be used.
It is not necessary to use the class attribute names shown in the examples in these guidelines (part, chapter, section, etc.). A level can be called anything that doesn’t violate basic naming conventions (spaces, colons, commas, and periods cannot be used in class attribute values). <level1 class="kazong"> is a valid name, even if it is not very descriptive. DTB producers should assign names to levels using their local language. For example: <level1 class="kapitel"> (Swedish and Danish for “chapter”).
If the structural component has a heading in the print book, mark it using the tags <h1> through <h6>. The numbers of the level tag and of the heading tag must be identical (h1 for level1, h2 for level2, etc.). The class attribute value used in the level tag may also be used within the heading tag. For example:
<h1 class="chapter">Darwin's Formative Years</h1>
In the remaining examples in this document, the class attribute value is not used in the heading tags.
The level tags are the container for the part, chapter, etc., while the h1 to h6 tags mark the heading for that part, chapter, etc.
At the end of the structural component being contained by the level it is necessary to insert the appropriate end tag: </level2> (end of level 2), </level1> (end of level1). For example:
<level1 class="chapter">
<h1>Darwin's Formative Years</h1>
<!-- content of chapter -->
</level1>
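The same pattern extends to deeper hierarchies. The following sketch assumes a book of parts that contain chapters; the heading text and class values are invented for illustration. Each heading tag carries the same number as its enclosing level tag, and each level is closed only after everything it contains.

```xml
<level1 class="part">
  <h1>Part One</h1>
  <level2 class="chapter">
    <h2>Chapter One</h2>
    <!-- content of chapter -->
  </level2> <!-- end of level 2 -->
</level1> <!-- end of level 1 -->
```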
For further discussion of levels, see Information Object: Major Structural Elements, Levels.
In a DTB that is valid to the DTD and the DAISY Standard (and thus produced according to the requirements of XML), components at different levels in the hierarchy must be nested, that is, contained one within the other. See the W3C Extensible Markup Language (XML) 2004 Recommendation. This means that a component at a lower level must fit completely inside the higher level. When a second tag is opened before the previous tag is closed, proper nesting must be observed: the second tag must be closed before the first is closed.
Valid markup:
<level1> <level2> </level2> </level1>
Invalid markup:
<level1> <level2> </level1> </level2>
Note that the invalid markup shown above is also not well-formed XML.
In addition, when marking up levels using the level1 to level6 tags, the tags must be used in sequence. A level 3 element (e.g., section) that is not inside a level 2 element (e.g., chapter) will be invalid to the DTD. If the document is run through a validation process (via a parser), the invalid markup will be flagged.
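The sequence rule can be pictured with two fragments. This is a sketch with invented class values and heading text, not an excerpt from a real book: the first fragment uses the tags in sequence, while the second skips level2 and would be flagged by a validating parser even though its tags are properly nested.

```xml
<!-- In sequence: level3 inside level2 inside level1 -->
<level1 class="part">
  <h1>Part One</h1>
  <level2 class="chapter">
    <h2>Chapter One</h2>
    <level3 class="section">
      <h3>Section One</h3>
    </level3>
  </level2>
</level1>

<!-- Out of sequence: level3 placed directly inside level1 (invalid to the DTD) -->
<level1 class="part">
  <h1>Part Two</h1>
  <level3 class="section">
    <h3>Section One</h3>
  </level3>
</level1>
```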
The hierarchy in the DTB should reflect the hierarchy in the print book. The markup used in the DTB to represent the hierarchy determines the extent of the “global” navigation (from heading to heading) available to the end user.
In most cases, only structural components with headings should be identified using the level1 to level6 tags. Components such as acknowledgements or a dedication sometimes appear in the print book without a heading, in which case they should be marked up with the <div> tag. See Major Structural Elements.
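For instance, a dedication printed without a heading might be marked up along the following lines. The class value and the dedication text are assumptions made for illustration, and where the <div> may be placed depends on the DTD in use.

```xml
<!-- Sketch only: a component with no heading in print, so no level or heading tags -->
<div class="dedication">
  <p>To my parents</p>
</div>
```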
The producer should impose a structural scheme in the DAISY DTB when it is absent from the print book. If the structural scheme is unclear, it may be necessary to promote a level, add a level, or flatten the hierarchy. The producer has the flexibility to do this as long as the final result is a well-structured DAISY DTB. For example, there is sometimes a discrepancy between the level suggested by a heading's typography and its apparent place in the hierarchy of the printed book: a subheading printed in the same typeface as the level 3 headings in that book may appear as the first heading following a chapter heading at level 1. This can happen for at least two reasons. First, there may be no true hierarchy in the book, and the typography may reflect an aspect of content rather than hierarchy. In such cases it would be possible to flatten the hierarchy in the DTB, placing such headings at an appropriate superior level. Second, there may be a hierarchy in the book that is not correctly represented by the typography. In this case the actual hierarchy should be reflected in the DTB regardless of the typography.
The contents of a DTB should generally be presented to the end user in the order in which they appear in the printed book. That sequence does not necessarily relate to the physical location of the digital information in a DTB (that is, items that follow each other in the book may be located in different files in the DTB), or to the order in which the contents were recorded (that is, a note that is read at the end of a sentence in the DTB may in fact have been recorded on a different day than the sentence was). Proper sequence is especially important for the end user who does not navigate randomly through the DTB, but instead listens to it from beginning to end.
A presentation sequence should be established where none exists in the original document. For example, some material, such as pictures, sidebars, and boxes, “floats” within the surrounding text. That is, it does not fall at a clearly identified point within the text. Such items are positioned on the print page for visual effect and may not be meant to be read at a single specific point within the surrounding text. The talking book producer must establish the sequence in which such elements are presented within the surrounding text, and presentation should be consistent throughout the DTB.
Such floating information may be vital for the understanding of the continuous text, but text and floating information may function more or less independently of each other. When this type of material is inserted into the text it should be done as closely as possible to existing reference points or the relevant text without disrupting the flow of the content. Wherever possible it should be included on the same page as it occurs in the printed book (some users may use the audio of a DTB as support or reinforcement with visual reading).
Some books rely strongly on visual presentation and have no continuous text. When there is no apparent order in the printed book, an order must be established for the DTB. This is done according to the conventions of the producing country. For example, in the western world a left-to-right, top-to-bottom sequence would be appropriate.
In a DTB it may sometimes be beneficial to move selected material (e.g., picture captions) from its location in the print book and gather it into a section created for the DTB. This section should be placed at an appropriate point within the overall sequential structure, often as part of the rear matter. Any divergence from the print book should always be described in the producer’s introduction to the DAISY DTB or in a specific Producer’s Note.