DAISY Format 2.0 Specification

STATUS: This is a recommendation of the DAISY Consortium. September 22, 1998

Document maintained by: George Kerscher
If you have any questions, comments, or suggestions please contact George Kerscher at kerscher@montana.com


Thanks to everyone who has helped to author the working drafts that went into the DAISY 2.0 specification, and to all those who have sent suggestions and corrections.

The authors of this specification, the members of the MarkUp Specification Team, deserve much applause for their diligent review of this document, their constructive comments, and their hard work. Special thanks go to:

George Kerscher, DAISY Consortium / Recording For the Blind and Dyslexic; Mark Hakkinen, Productivity Works; Harvey Bingham, Yuri Rubinski Insight Foundation; David Pawson and Stephen King and Keith Gladstone, Royal National Institute for the Blind; Tatsu Nishizawa, Plextor Ltd; Dominic Labbé and Gilles Pepin, VisuAide Inc.; Lynn Leith and Barbara Freeze, Canadian National Institute for the Blind; Matthias Ragaz, Swiss Library for the Blind and Visually Impaired; Michael Moodie and Lloyd Rasmussen, National Library Service for the Blind and Physically Handicapped Library of Congress; Tom McCartney, Gray Wolf Computing; Rej Tanikella, Recording For the Blind & Dyslexic; Jason White, Web Accessibility Initiative; Edmar Schut, Dutch Library for the Visually and Print-handicapped Students and Professionals; Susanne Seidelin and Christian Wallin, Danish National Library for the Blind; Thomas Kalish, Association of Talking Book Libraries of Germany; Jan Lindholm, Labyrinten Inc; Daniel Dardailler and Philipp Hoschka, World Wide Web Consortium.

1.0 Introduction

The DAISY 2.0 HTML Specification is intended to be a simple set of conventions for the structure of Digital Talking Books (DTB) that are marked up primarily to the level of headings and page numbers. This is useful for organizations that produce books using the Sigtuna authoring and recording system. This specification may also be useful for books that are converted from analog recordings. It is also possible to support full text. This specification uses the HTML 4.0 STRICT document type definition (DTD) defined by the World Wide Web Consortium (W3C – http://www.w3.org). The DAISY Specification uses only a small part of the HTML 4.0 tag set.

1.1 Description

A book that complies with the DAISY 2.0 Specification must contain headings, page number references, and metadata for book identification and cataloging. It is expected that the current playback hardware and software will easily support this specification. The specification identifies the essential HTML tags and their usage. Associated with the HTML will be Synchronized Multimedia Integration Language (SMIL) files that synchronize the structure with continuous digital human voice. This specification only describes the metadata contents of the SMIL file. The full SMIL specification is already defined by the W3C. There is also a required file for navigation and control. The NCC file is required under this specification.

It is expected that several mechanisms will be available to produce materials that meet this specification. Materials that comply with this specification can be produced using the Sigtuna recording software. The Sigtuna recorder, when released, will produce the required Navigation Control Center (NCC) file, the SMIL files, and the sound files. It should also be possible to produce compliant materials from HTML files that include the structural and optional full text components. The SMIL files and sound files will also need to be present. The required NCC file may then be produced by examining the HTML data, the SMIL files, and the sound files.

2.0 Page Numbers

Page references are provided in DAISY 2.0 books to support navigation to pages in a book. Players should always support moving to the previous and next pages. Going to a specific page should also be supported, and this specification provides flexibility in how direct access to pages is offered.

Page numbers are represented by the HTML “span” tag. The span element identifies the beginning of the page. The content of the span element is the page number as it appears in the book. Regardless of where a page number appears on the printed page, the span element with the content must be placed before the first word on that page. It is important to note that the exact representation of the page number from the printed book is the content. The producer should not modify this content. Player manufacturers should use the content for navigation and searching purposes.

NOTE: blank pages should be marked with a span element whose content represents the sequential number of the page.

2.1 Class Attributes

Class attributes of: page-front, page-normal, and page-special are supported.

The class attribute “page-front” indicates pages at the front of the book before the normal page numbering begins.

The class attribute “page-normal” indicates pages that have a normal numbering scheme that starts at 1 and continues to the back matter or to the end of the book. It is very important to note that the content of normal page numbers MUST BE THE ASCII VALUE OF A POSITIVE WHOLE NUMBER. Players will use this positive integer for navigation purposes.

The class attribute “page-special” indicates pages that are not front matter and do not have a traditional numbering system. Many times these special page numbers are compound page numbers. A compound page number has substructure. A page number may have a prefix part, a separator, a sequential part at lower level (the traditional page number), an optional separator, and a suffix. For example, page ‘1-15b’ could indicate section 1, page 15, second inserted page thereafter (right after ‘1-15a’ and somewhere before page ‘1-16’ or ‘2-1’).

That style is admittedly ugly, but is still used in the revision process for some loose-leaf books. It may exist in some legacy books.
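The rules for the three page classes can be sketched as a small check. This is only an illustration under stated assumptions: the function name is hypothetical, and the only constraint enforced is the one this section states, namely that “page-normal” content is a positive whole number written in ASCII digits.

```python
PAGE_CLASSES = {"page-front", "page-normal", "page-special"}


def is_valid_page_content(page_class: str, content: str) -> bool:
    """Return True if `content` is acceptable for the given page class.

    Hypothetical helper: only the page-normal rule from this section is
    enforced; page-front and page-special content is accepted as printed.
    """
    if page_class not in PAGE_CLASSES:
        return False
    if page_class == "page-normal":
        # ASCII digits only, and the number must be greater than zero.
        return content.isascii() and content.isdigit() and int(content) > 0
    return True
```

A producer tool could run such a check on every page span before writing the book, so that compound numbers such as 3-85 are never labelled “page-normal”.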

2.2 ID

The ID of each element must be unique. The unique ID is used for synchronization between the SMIL file, the NCC, and the optional full text. The unique ID must be present on each item used for synchronization. If full text is used, the ID must be present on each element that is synchronized with the recording. Note that full text is optional, not required. The ID in most cases will be generated by software according to an internal algorithm. The requirement is that the ID must be unique and conform to the HTML specification for an ID.

NOTE: the W3C describes valid id attributes as follows.

ID and NAME tokens must begin with a letter ([A-Z a-z]) and may be followed by any number of letters, digits ([0-9]), hyphens (“-”), underscores (“_”), colons (“:”), and periods (“.”).

It is advised to avoid the use of the colon in IDs. The colon will be used in the future, and avoiding it now prevents confusion.
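The ID rules above can be sketched as follows. The function names and the three result strings are hypothetical; the pattern is a direct transcription of the W3C wording quoted in this section, and the colon is reported separately because this specification only advises against it.

```python
import re
from collections import Counter

# A letter, followed by any number of letters, digits, hyphens,
# underscores, colons, and periods.
ID_TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")


def check_id(value: str) -> str:
    """Classify an ID as 'ok', 'discouraged' (contains a colon), or 'invalid'."""
    if not ID_TOKEN.fullmatch(value):
        return "invalid"
    if ":" in value:
        return "discouraged"
    return "ok"


def duplicate_ids(ids):
    """Return the IDs that occur more than once, sorted alphabetically."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)
```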

2.3 Examples Of Page Numbers

If the page number in a book is “57” the span element would be:

<span id="x1228" class="page-normal">57</span>

There is no problem with this example and it is what one would expect.

For page 3-85 where the book is numbered specially with the chapter followed by the page number that restarts at 1 for each new chapter, we would have:

<span id="z22273" class="page-special">3-85</span>

For front matter pages numbered with Roman numerals, such as XV, the span would be:

<span id="xx34s7" class="page-front">XV</span>

The final example is where the end matter does not use contiguous numbering to identify pages. Let us say that appendix A is identified as page number “A 1” and that this starts immediately after page 363.

<span id="a123" class="page-special">A 1</span>

3.0 Structural Tags

HTML headings tags one through six are supported. Class attributes can be used to provide additional semantic information.

3.1 Valid Class Attribute Values on Headings

The values for the class attribute should be chosen from:

title, jacket, front, title-page, copyright-page, acknowledgments, prolog, about-author, other-books, introduction, dedication, forward, preface, print-toc, part, chapter, section, sub-section, minor-head, bibliography, glossary, appendix, index, index-category.

NOTE: In some cases blocks (group) of text can be used for navigation. For example, next paragraph, or next list item or table row may be identified in the NCC to provide this type of navigation. In the NCC the “div” element is used as a placeholder to point to this type of information. The class attribute of “group” will be used. The content of the div in the NCC contains an anchor (link). If full text is used, the div in the NCC can point to any HTML element with a unique ID.

The class attributes should be in lower case and enclosed with the double quote character.

HTML 4.0 does not require proper nesting of headings, but the DAISY 2.0 specification suggests proper nesting. This means that an h3 should not appear without being preceded by an h2, and the h2 should be preceded by an h1. Skipping a level is permissible but highly discouraged. Authoring software should warn when this convention is being violated.
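A warning of the kind suggested for authoring software could be sketched as below. The function name is hypothetical, and the input is assumed to be the heading levels (1 to 6) in document order, already extracted from the HTML.

```python
def skipped_heading_levels(levels):
    """Return the positions at which a heading jumps more than one level deeper.

    `levels` is a sequence of integers 1..6 in document order. Moving back
    up to a shallower level is always allowed; only a step down that skips
    a level is reported.
    """
    problems = []
    previous = 0  # treat the document root as level 0, so the first heading must be h1
    for index, level in enumerate(levels):
        if level > previous + 1:
            problems.append(index)
        previous = level
    return problems
```

Because skipping a level is discouraged rather than forbidden, a tool would most naturally report these positions as warnings and still write the book.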

3.2 Examples

<h1 class="front">Front Matter</h1>
   <h2 class="title-page">Title Page</h2>
   <h2 class="copyright-page">Copyright page</h2>
<h1 class="part">House pets</h1>
<h1 class="chapter">Chapter 1, All about Dogs</h1>
   <h2 class="section">Care and feeding</h2>
      <h3 class="sub-section">Dog Food</h3>
         <h4 class="minor-head">Nutritional Value</h4>
            <h5 class="minor-head">meat</h5>
               <h6 class="minor-head">lamb</h6>
      <h3 class="sub-section">Exercise</h3>
   <h2 class="section">Training</h2>
<h1 class="chapter">All about Cats</h1>
.
.
.
<h1 class="bibliography">For further Reading</h1>
<h1 class="glossary">Some terms you should use</h1>
<h1 class="appendix">Appendix A: First Aid Supplies</h1>
<h1 class="index">Index</h1>

NOTE: The indented information is simply provided to aid reading. It is not expected that the HTML use this convention. It may or may not be used. The required ID attribute is not provided in the above example to improve readability. Each of these structural tags should have a unique ID.

3.3 Multiple HTML Files

Producers may use multiple HTML files to create the content. This will especially be the case if full text is used. In that case, logical and simple use of structural block elements such as paragraphs, lists, tables, and preformatted text is expected. Use of tables to format text is not allowed. Use of frames and other graphical formatting features of HTML is discouraged.

To correctly identify the order of the HTML files, the standard HTML link elements should be used. This link information is contained in the head of each HTML file.

3.3.1 Example of the Link Data

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Strict//EN">
<html lang="en">
<head>
other heading information
<link rel="start" href="first-file.html">
<link rel="previous" href="previous-file.html">
<link rel="next" href="next-file.html">
</head>

Note: the starting file would have no previous and the last file would have no next.

NOTE: In addition to the link values specified above the HTML 4.0 specification identifies: contents, index, glossary, chapter, section, subsection, appendix, copyright, help, and bookmark. Please refer to the HTML 4.0 specification for more information.
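One way a tool might recover the reading order from the link data is sketched below. The class and function names are hypothetical, and the sketch assumes that each href is simply the name of another file in the same set and that exactly one file lacks a “previous” link.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects rel -> href for every <link> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and attr.get("rel") and attr.get("href"):
            self.links[attr["rel"].lower()] = attr["href"]


def read_links(html_text):
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links


def reading_order(files):
    """Order file names by following rel="next" from the file with no rel="previous".

    `files` maps a file name to its HTML text (an assumption of this sketch).
    """
    links = {name: read_links(text) for name, text in files.items()}
    starts = [name for name, found in links.items() if "previous" not in found]
    if len(starts) != 1:
        raise ValueError("expected exactly one file without a previous link")
    order, current = [], starts[0]
    while current is not None:
        if current in order or current not in links:
            raise ValueError(f"broken link chain at {current!r}")
        order.append(current)
        current = links[current].get("next")
    return order
```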

4.0 Dublin Core Bibliographic and Cataloging Metadata

DAISY 2.0 provides bibliographic and cataloging information according to the Dublin Core specification. The evolving reference description, including any defined qualifiers, resides at http://purl.org/metadata/dublin_core

The metadata must be included in the HTML file(s) used to store the heading and page number information or the optional full text. The metadata must also be reproduced in the NCC with the required “DC” prefix. The metadata in the full text would have the DC prefix as well.

Note: For compatibility with XML applications and W3C Namespaces the tool developers should use “DC:” and “NCC:”, but be prepared to accept the alternative separator “.”, and the lower-case forms “dc” and “ncc”.
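A tool that accepts the alternative forms might normalize metadata names as in the following sketch. The function name is hypothetical, and lower-casing the label part is an assumption made here for comparison purposes; the note above only speaks of the prefix and the separator.

```python
def normalize_meta_name(name: str):
    """Split a meta name into (prefix, label).

    Accepts ":" or "." as the separator and either case for the prefix.
    Returns (None, label) when the name carries no DC or NCC prefix.
    """
    for separator in (":", "."):
        prefix, found, label = name.partition(separator)
        if found and prefix.upper() in {"DC", "NCC"}:
            # Lower-casing the label is an assumption of this sketch.
            return prefix.upper(), label.lower()
    return None, name.lower()
```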

The metadata labels are: title, creator, subject, description, publisher, contributor, date, type, format, identifier, source, language, relation, coverage, rights.

The “scheme” attribute must be used with the metadata to clarify the information. The scheme attribute allows authors to provide user agents more context for the correct interpretation of metadata. At times, such additional information may be critical, as when metadata may be specified in different formats. Please refer to the HTML 4.0 specification for more information. http://www.w3.org/TR/REC-html40/struct/global.html#profiles

Note: An evolving list of qualifiers to the Dublin Core scheme attribute can be found at: http://www.loc.gov/marc/dcqualif.html

4.1 Example

<head>
<title>All About Dogs</title>
<meta name="DC:title" content="All about Dogs">
<meta name="DC:creator" content="George Kerscher">
<meta scheme="keyword" name="DC:subject" content="animals">
<meta scheme="keyword" name="DC:subject" content="pets">
<meta scheme="original" name="DC:publisher" content="Barking Press">
<meta scheme="yyyy-mm-dd" name="DC:date" content="1998-11-05">
<meta scheme="ISBN" name="DC:identifier" content="333-333-333-333">
<meta scheme="edition" name="DC:identifier" content="3rd">
<meta name="DC:format" content="DAISY 2.0">
<meta name="DC:type" content="textbook">
<meta name="DC:language" content="en">
</head>

4.2 Required Bibliographic Metadata

For DAISY 2.0 books, the following is required bibliographic metadata: title, creator, publisher, identifier including edition, type, format, date, language.

Other Dublin Core labels may be used according to the specification.
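The required list can be checked mechanically, as in this sketch. The function name is hypothetical, the input is assumed to be the `name` attributes of the meta elements already read from the file, and the sketch does not verify that one of the identifiers carries the edition.

```python
REQUIRED_DC = {
    "title", "creator", "publisher", "identifier",
    "type", "format", "date", "language",
}


def missing_dc_labels(meta_names):
    """Return the required Dublin Core labels absent from `meta_names`."""
    present = set()
    for name in meta_names:
        # Accept "DC:" or "DC." and either case of the prefix.
        prefix, separator, label = name.replace(".", ":", 1).partition(":")
        if separator and prefix.upper() == "DC":
            present.add(label.lower())
    return sorted(REQUIRED_DC - present)
```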

5.0 Navigation Control Center (NCC)

The NCC is normally automatically generated by software through an examination of the HTML, the SMIL, and the digital recordings. Some software, such as the Sigtuna recorder, may generate the NCC directly. The NCC is an HTML 4.0 file used by players to provide navigation to headings, pages, and in some cases to blocks (groups) of text. This provides basic identification information for the players. Contained in the NCC are required and optional bibliographic metadata as specified in section four, and additional metadata for the player and links for navigation into the SMIL.

5.1 NCC File Names and Folders

Each DAISY book must have a file called NCC.html. The players look for this file and use it as a way to gain access to all the associated files. Normally a single book would be contained on the CD. The NCC.html file would be in the root of the CD. If more than one DAISY book is provided on the same medium, each book should be stored in a separate folder (directory) of its own. In the root of the CD, DVD, or other media a file called discinfo.html must be created that contains links to all books on the media.

5.1.1 Example Of Discinfo.html File

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Strict//EN">
<html lang="en">
<head><title>CD Information</title></head>
<body>
<a href="./book1/NCC.html">All About Dogs</a>
<a href="./book2/NCC.html">All About Cats</a>
</body>
</html>
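The lookup a player performs on a newly inserted medium might be sketched as follows. The names are hypothetical. The sketch assumes that the hrefs in discinfo.html are relative paths that map directly onto the file system, and it ignores any differences in file-name case between file systems.

```python
from html.parser import HTMLParser
from pathlib import Path


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.hrefs.append(href)


def find_ncc_files(media_root):
    """Return the NCC files reachable from the root of a medium.

    A single book has NCC.html in the root; several books are listed
    in discinfo.html, each in a folder of its own.
    """
    root = Path(media_root)
    single = root / "NCC.html"
    if single.is_file():
        return [single]
    discinfo = root / "discinfo.html"
    if not discinfo.is_file():
        return []
    collector = AnchorCollector()
    collector.feed(discinfo.read_text(encoding="utf-8", errors="replace"))
    collector.close()
    return [root / href for href in collector.hrefs]
```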

5.1.2 CD-ROM Formats

NOTE: there is no restriction on file naming or mode types for CD-ROM delivery. This means that MODE I and II are supported. Long file names using the “Romeo extensions” and “Joliet extensions” are also supported, as are the original ISO 9660 standard levels. The distributor should use the file system that is most appropriate for them.

The “Romeo” file system for CD-ROM delivery is recommended for Japanese characters. It supports long file names and is the system used in Japanese and many other double byte character set languages. The “Joliet” file system supports the UNICODE character set.

CAUTION: the extension “.smi” is not valid. When writing a CD-ROM, the SMIL file extensions may be truncated to an invalid “.smi” extension. This and other path and file names can be affected by the file system selected for the CD-ROM. A distributor of DAISY 2.0 books on CD-ROM must be certain that the file system selected is compatible with the characters used in the production process. In other words, if you create your data in Joliet, you should write Joliet CD-ROMs; and if you create your data using the Romeo system, you should write the CD-ROM with the Romeo system.

NOTE: For more information on these and other CD-ROM Formats please see http://www.ping.be/~pin11466/formtxt.html.

5.1.3 Multiple Volume Book

In most distribution cases, a single book will be contained on a single distribution medium. For example purposes we will describe a CD-ROM distribution system, but this can apply to any removable media. It is suggested that a commonly used codec be used to reduce the size of the sound files. For example, MPEG 2 layer 2 should allow 40 hours of recorded human speech. MPEG 2 layer 3 should allow more than 50 hours. However, it may still be necessary to have more than one volume to deliver large books.

The end user of a multiple volume book should be told to insert a CD-ROM of a set when necessary. For example let us consider a book that requires three CD-ROM disks for distribution. If the end user requested to go to page 802 which is on the third CD-ROM, a message should say, “Please insert disk three.” The user would eject the CD and insert the third disk and the player would then go directly to page 802. To accomplish this the following conventions must be followed:

  1. Each CD-ROM must contain a full NCC with slightly different information.
  2. There must be a meta tag that identifies the current volume of the set.
    <meta name="NCC:setinfo" content="1 of 3">
    Where the content 1 of 3, 2 of 3, or 3 of 3 denotes the order of the CD-ROM.
  3. Any referenced item whose audio data is not on the current CD-ROM will refer to a SMIL file and a corresponding sound file that contains the message, “Please insert disk x,” where x represents the number of the disk in the set.
  4. There must be a “rel” attribute on items pointing to information not on the current CD. The valid values of the rel attribute should identify the number of the CD. Valid rel attributes will be: “1 of 3”, “2 of 3”, “3 of 3”, etc.

In our example of a three CD-ROM volume book, the first CD-ROM would have all the normal NCC information plus the setinfo metadata. The href for any data not contained on the first CD-ROM would point to the SMIL message file with the respective message. All href items that point to data on the second CD-ROM would instead point to the same message file. Most players store information about the place where you stopped reading. When the next CD-ROM is inserted, the player will go directly to that same ID, but this time the ID will point to an href that represents the correct data. If the user inserts the wrong CD-ROM, the end user will get the same message to insert the proper disk.
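The “n of m” convention used by both NCC:setinfo and the rel attribute can be handled as in the sketch below. The function names are hypothetical, and the tolerance for extra whitespace is an assumption of this sketch.

```python
import re

_SETINFO = re.compile(r"\s*(\d+)\s+of\s+(\d+)\s*")


def parse_setinfo(content: str):
    """Parse a value such as "1 of 3" into (volume, total)."""
    match = _SETINFO.fullmatch(content)
    if match is None:
        raise ValueError(f"not an 'n of m' value: {content!r}")
    volume, total = int(match.group(1)), int(match.group(2))
    if not 1 <= volume <= total:
        raise ValueError(f"volume out of range: {content!r}")
    return volume, total


def disk_to_request(rel, current_setinfo):
    """Return the disk number to ask for, or None if the item is on this disk.

    `rel` is the rel attribute of the NCC item (None when absent) and
    `current_setinfo` is the NCC:setinfo content of the inserted disk.
    """
    if rel is None:
        return None
    target, _ = parse_setinfo(rel)
    current, _ = parse_setinfo(current_setinfo)
    return None if target == current else target
```

In the three-disk example, an item carrying rel “3 of 3” in the NCC of the first disk yields 3, which is the disk named in the “Please insert disk” message.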

NOTE: examples of this will be available as reference material supporting this specification.

5.2 NCC Metadata

In addition to the bibliographic metadata, the required meta labels for the NCC are “NCC:format”, “NCC:TOCItems”, “NCC:page-front”, “NCC:charset”, “NCC:page-normal”, “NCC:page-special”, “NCC:generator”, “NCC:publisher”, and “NCC:identifier”.

* NCC:format has the valid content “DAISY 2.0”. For any additional supported DTB format, the NCC:format meta element should be repeated.

* the NCC:TOCItems valid content is an integer which represents the number of navigation points in the DAISY 2.0 document instance.

* NCC:charset

NOTE: One can download a list of registered charset values from ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets The specification for character sets can be found at: http://www.w3.org/TR/REC-html40/charset.html#doc-char-set.

* NCC:page-front, NCC:page-special, and NCC:page-normal have valid content of 0 or a positive integer. The content represents the number of pages of that type.

* NCC:generator is the software that generated the NCC. More than one program may have been applied to produce the NCC. The NCC may also have been generated by hand, and this should be indicated.

* NCC:publisher is the organization that produced the accessible version.

* NCC:identifier is the unique identifier (book number, shelf number, etc.) used by the dtb-publisher.
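A consistency check between the declared page counts and the page entries actually present might look like the sketch below. The function name is hypothetical, and it is an assumption of this sketch that the counts are meant to equal the number of page spans of each class found in the NCC body.

```python
from collections import Counter

PAGE_CLASSES = ("page-front", "page-normal", "page-special")


def page_count_mismatches(declared, page_classes):
    """Compare declared NCC page counts with the page entries found.

    `declared` maps meta names such as "NCC:page-normal" to their content
    strings; `page_classes` lists the class of every page span found.
    Returns {class: (declared, found)} for each class that disagrees.
    """
    found = Counter(page_classes)
    mismatches = {}
    for page_class in PAGE_CLASSES:
        expected = int(declared.get("NCC:" + page_class, "0"))
        if expected != found[page_class]:
            mismatches[page_class] = (expected, found[page_class])
    return mismatches
```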

5.3 Example

<meta name="NCC:format" content="DAISY 2.0">
<meta name="NCC:Generator" content="Sigtuna Digital Audio Recorder 1.6">
<meta name="NCC:ToCItems" content="57">
<meta name="NCC:Charset" content="3,Windows ANSI CP 1252,Windows 3.1 Latin 1 code page (US\, Western Europe),3">
<meta name="NCC:page-front" content="3">
<meta name="NCC:page-normal" content="150">
<meta name="NCC:page-special" content="0">
<meta name="NCC:publisher" content="RFB&D">
<meta name="NCC:identifier" content="shelf zz999">

5.4 Optional Metadata

NOTE: the “NCC:totaltime” is optional, but is very useful for informing the end user how much time is remaining. If this is provided along with the time meta information in the SMIL files, players will be able to determine how much time has elapsed and how much time remains.

* NCC:totaltime is represented as hh:mm:ss.

<meta name="NCC:TotalTime" content="3:04:11">
<meta name="NCC:Country" content="SE,Sweden,99">
<meta name="NCC:Producer" content="production site name">
<meta name="NCC:ProducedDate" content="1998-03-11">
<meta name="NCC:RecordedBy" content="technician name">
<meta name="NCC:RecordedDate" content="1998-03-11">
<meta name="NCC:Revision" content="1998-03-11">
<meta name="NCC:LastRevision" content="1998-03-11">
<meta name="NCC:Narrator" content="John Doe">
<meta name="NCC:ProjectNumber" content="122">

5.5 Body Example

Headings, page numbers, and optionally blocks (group) of text for navigation are identified and a URL to the SMIL file and the id to the SMIL fragment are provided in the anchor. This provides standard mechanisms to link to the components of the DAISY document for navigation and playing of the digitized human voice. The first entry in the body of the NCC must be the title of the book. The class attribute is “title”.

<h1 class="title" id="t1"><a href="title.sml#t1">All about pets</a></h1>
<h1 id="h1_004" class="chapter"> <a href="DAIS004A.smil#h1_004">All About Dogs</a></h1>
<div id="p004_001a" class="group"> <a href="DAIS004A.smil#p004_001a"></a></div>
<h2 id="h2_000" class="section"> <a href="DAIS004B.smil#h2_000">Care and Feeding</a></h2>
<h3 id="h3_001" class="sub-section"> <a href="DAIS0018.smil#h3_001">Dog Food</a></h3>
<span id="p21" class="page-normal"> <a href="DAIS0018.smil#i21">21</a></span>
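A player could turn the NCC body into a flat navigation list roughly as sketched here. The class name and the dictionary keys are hypothetical, and the sketch assumes that navigation elements are not nested inside one another and that each contains a single anchor.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

NAV_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "span", "div"}


class NccNavCollector(HTMLParser):
    """Collects one entry per heading, page span, or group div in an NCC body."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._item = None  # entry being built while inside a navigation element

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in NAV_TAGS:
            self._item = {"tag": tag, "class": attr.get("class"), "id": attr.get("id"),
                          "smil": None, "fragment": None, "label": ""}
        elif tag == "a" and self._item is not None and attr.get("href"):
            # Split "file.smil#id" into the SMIL file and the fragment id.
            self._item["smil"], self._item["fragment"] = urldefrag(attr["href"])

    def handle_data(self, data):
        if self._item is not None:
            self._item["label"] += data

    def handle_endtag(self, tag):
        if self._item is not None and tag == self._item["tag"]:
            self._item["label"] = self._item["label"].strip()
            self.items.append(self._item)
            self._item = None
```

Feeding the NCC text to an instance with `feed()` and reading `items` gives the entries in document order, which is the order a player would use for next and previous navigation.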

6.0 SMIL Files

The SMIL files must conform to the W3C specification for SMIL 1.0. In addition to this specification some metadata is required and several conventions will be used.

6.1 Metadata

The following is required.

<meta name="format" content="Daisy 2.0"/>

The following is optional:

<meta name="DC:title" content="Title of book"/>

The following is optional:

<meta name="title" content="Title of SMIL Section of book"/>

The following is optional, but if provided allows players to inform end users of the elapsed time and time remaining in the audio portions of a document.

<meta name="total-elapsed-time" content="03:32:02"/>
<meta name="time-in-this-smil" content="01:10:22"/>

NOTE: the time information in the meta data above will be replaced with information contained within a “master SMIL file” in future versions of this specification.
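The time arithmetic a player might perform can be sketched as follows. The function names are hypothetical. The sketch assumes that “total-elapsed-time” gives the time elapsed at the start of the current SMIL file, which this specification does not state explicitly, and that the player knows its own offset in seconds within that file.

```python
def parse_clock(value: str) -> int:
    """Convert an hh:mm:ss value (hours may have any number of digits) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError(f"not a valid hh:mm:ss value: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def format_clock(total_seconds: int) -> str:
    """Convert seconds back to h:mm:ss."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def time_remaining(total_time: str, elapsed_before_smil: str, offset_in_smil: int = 0) -> str:
    """Time left in the book, given the book total and the elapsed time.

    Assumes `elapsed_before_smil` is the elapsed time at the start of the
    current SMIL file and `offset_in_smil` is the playback position in
    seconds within that file.
    """
    remaining = parse_clock(total_time) - parse_clock(elapsed_before_smil) - offset_in_smil
    return format_clock(remaining)
```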

6.2 Sound and SMIL File Extensions

All sound files must be stored using the de facto standard convention for the file extension. For example, PCM and ADPCM wave files must use the “.wav” extension. SMIL files must use the “.sml” or “.smil” extension. These file extensions are registered under Microsoft Windows. Some other examples are:

  • “.mpg” for MPEG files.
  • “.mp2” for MPEG 2 layer 2 files.
  • “.mp3” for MPEG 2 layer 3 files.
  • “.ra” for Real Audio files.
  • “.vqe”, “.vqf”, or “.vql” for Twin VQ files.
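A pre-mastering check for the SMIL extensions, including the truncated “.smi” form, could be sketched as below. The function names are hypothetical, and comparing extensions without regard to case is an assumption of this sketch.

```python
from pathlib import PurePath

SMIL_EXTENSIONS = {".sml", ".smil"}


def smil_extension_ok(filename: str) -> bool:
    """Return True if the file name carries a valid SMIL extension."""
    return PurePath(filename).suffix.lower() in SMIL_EXTENSIONS


def truncated_smil_files(filenames):
    """Return the names that end in the invalid, truncated ".smi" extension."""
    return [name for name in filenames if PurePath(name).suffix.lower() == ".smi"]
```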

7.0 Interoperability and DAISY Certification

The HTML and SMIL files can all be validated using standard SGML and XML validation tools. The NCC and the content of the book in HTML can be validated against the HTML 4.0 strict specification. The SMIL files can be validated with an XML parser against the SMIL 1.0 specification.
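As a first step before full validation, a tool might confirm that a SMIL file is well-formed XML and carries the required format metadata from section 6.1. The sketch below does only that; it does not validate against the SMIL 1.0 DTD, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def smil_format_meta(smil_text: str):
    """Return the content of the "format" meta element, or None if absent.

    Raises xml.etree.ElementTree.ParseError when the text is not
    well-formed XML. This is a well-formedness check only, not DTD validation.
    """
    root = ET.fromstring(smil_text)
    for meta in root.iter("meta"):
        if meta.get("name") == "format":
            return meta.get("content")
    return None
```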

In addition to these standard tests by validation tools, several test books will be made available. Player manufacturers that wish to meet the DAISY 2.0 specification should use these specifications and the test books to perform internal testing. If all of the books provided in this test suite can be played by the player, then the player manufacturer may request to have their player certified as DAISY 2.0 compliant.

Manufacturers of authoring tools must be able to meet these specifications in their authoring tool. If the specifications are met and the materials produced by the authoring tool are like the test suite materials, the developer may request to have the authoring tool certified as DAISY 2.0 compliant.

To receive DAISY Consortium certification of their claims, the player or authoring tool must be submitted to the DAISY Consortium for performance testing.