Recommendation, February 28 2001
Copyright © The DAISY Consortium 1998, 1999, 2000, 2001.
This document defines version 2.02 of the DAISY Digital Talking Book (DTB) format. The DAISY format is based on the W3C defined SGML (ISO 8879) applications XHTML 1.0 and SMIL 1.0. Using this framework, a talking book format is presented that enables navigation within a sequential and hierarchical structure consisting of (marked-up) text synchronized with audio.
This is a formal recommendation of the DAISY Consortium.
Document maintained by: Markus Gylling
This document can be found at: www.daisy.org/z3986/specifications/daisy_202.html
If you have any questions, comments, or suggestions please contact Markus Gylling at markus.gylling@tpb.se
George Kerscher, DAISY Consortium / Recording For the Blind and Dyslexic; Markus Gylling, DAISY Consortium / Swedish Library of Talking Books and Braille; James Pritchett, Recording For the Blind and Dyslexic; Heinz Zyset and Dorota Pograniczna and Matthias Ragaz, Swiss Library for the Blind and Visually Impaired; Lynn Leith and Barbara Freeze, Canadian National Institute for the Blind; Lars Sonnebo and Thomas Johansson, Swedish Library of Talking Books and Braille; Mark Hakkinen, isSound Corporation; Jan Lindholm and Diana Hiorth, Labyrinten Inc; Harvey Bingham, W3C Web Accessibility Initiative Invited Expert; David Pawson and Stephen King and Keith Gladstone, Royal National Institute for the Blind; Tatsu Nishizawa, Plextor Ltd; Dominic Labbé and Gilles Pepin, VisuAide Inc.; Michael Moodie and Lloyd Rasmussen, National Library Service for the Blind and Physically Handicapped Library of Congress; Tom McCartney and Rej Tanikella, Recording For the Blind and Dyslexic; Jason White, Web Accessibility Initiative; Edmar Schut, Dutch Library for the Visually and Print-handicapped Students and Professionals; Susanne Seidelin and Christian Wallin, Danish National Library for the Blind; Thomas Kalish, Association of Talking Book Libraries of Germany; Brink Kuchenbrod and Chad Berkley, University of Montana; Daniel Dardailler and Philipp Hoschka, World Wide Web Consortium; Peter Toneby, Umeå University.
This specification extends and revises the DAISY 2.01 specification. These revisions are intended to bring this specification more in line with the DAISY 3/NISO DTB specification, and to clarify ambiguities in the DAISY 2.01 specification.
The major revisions made in this version of the specification are:
The DAISY 2.02 specification is technically backwards compatible with the DAISY 2.0 specification.
Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in IETF RFC 2119.
This specification uses the XHTML 1.0 and SMIL 1.0 specifications defined by the World Wide Web Consortium. Bibliographic and document metadata is based on the Dublin Core Metadata Initiative element set.
As specified in the DAISY structure guidelines, the DAISY 2.02 standard supports the following types of DTB (Digital Talking Book).
To comply with the DAISY 2.02 standard, a DTB must contain exactly one NCC.HTML document and one or more SMIL documents. Depending on the type of DTB made, the DTB may also contain one or more audio files, and one or more text content documents (XHTML). Finally, the DTB may also contain an optional Master SMIL document.
The structure of these document and file types, and the DTB functionality they provide, is defined below in sections 2.1 to 2.5.
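The fileset rule above can be checked mechanically. The following is a minimal sketch, not a normative validator: the function name `check_fileset` is hypothetical, the file names it looks for are the ones this specification prescribes for the NCC, SMIL and Master SMIL documents, and it assumes that the optional Master SMIL document does not count toward the "one or more SMIL documents" requirement.

```python
def check_fileset(filenames):
    """Return a list of problems found in a DTB file listing (empty if none)."""
    problems = []

    # The NCC document has exactly two permitted spellings.
    ncc = [name for name in filenames if name in ("NCC.HTML", "ncc.html")]
    if len(ncc) != 1:
        problems.append(f"expected exactly one NCC document, found {len(ncc)}")

    # Assumption: the optional Master SMIL is not a content SMIL document.
    master_names = ("master.smil", "MASTER.SMIL")
    smil = [
        name for name in filenames
        if name.endswith((".smil", ".SMIL")) and name not in master_names
    ]
    if not smil:
        problems.append("expected at least one SMIL document")

    return problems
```

Audio files and text content documents are deliberately not checked here, because their presence depends on the type of DTB.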
The NCC document contains an index of navigable entry points into the DTB. In the default case these entry points consist of the elements <h1> through <h6> for headings, and the <span> element for pages. Optionally, blocks (group) of text are also used for navigation by means of using the <div> element.
The NCC also implicitly represents the continuous playback order of all the media objects that make up the DTB. This is sometimes referred to as "the flow" of narration and/or text.
The NCC is not necessarily identical to the table of contents (TOC) of the print source. It will often contain more elements than a print source TOC, that is, the NCC may be an expanded version of the TOC based on the content and structure of the body of the book.
The NCC should be an XHTML 1.0 Transitional DTD compliant document. Use of HTML 4.01 is deprecated in this version of the DAISY DTB specification.
The NCC document must be named "NCC.HTML" or "ncc.html".
The <head> element must contain the following children:
The NCC <head> element must contain a set of <meta> children. These have bibliographic as well as technical-descriptive content.
For bibliographic metadata, the DAISY 2.02 specification uses the Dublin Core (DC) Metadata Initiative element set, which is an internationally approved and broadly accepted tool comprising 15 data categories and the rules necessary for the description of document resources. Although the DC element set covers a wide range of bibliographic description for digital talking books, some vital information is not adequately covered by those 15 data categories. Additional elements specific to DAISY DTBs have therefore been developed. These additional elements are designated as "ncc:"-prefixed elements.
The general syntax of the <meta> element is:
<meta name="metaname" content="value of metaname" scheme="scheme for value of metaname" />
The name attribute contains the name for the content of a certain meta statement. In DAISY 2.02 DTBs there are two categories of metanames, identified by the prefixes "dc:" or "ncc:".
The content attribute contains a value for the name attribute.
The "dc:" and "ncc:" prefixes should be lower case. However, playback systems must not be case sensitive when reading these attributes.
To accommodate the provision of additional information about the print source, the Dublin Core element labels are used in meta statements with an "ncc:" prefixed metaname that consists of the DC element label preceded by the word "source".
Please note that there is one exception. Information about the edition of the print source is included by using the metaname "ncc:sourceEdition". DC does not yet have a solution for the inclusion of data regarding editions.
The scheme attribute contains references as to how the value of the content attribute has to be interpreted. Such a reference may consist of a simple syntax model, but normally it is the name of a standard or an authority list. It is not meaningful to use a scheme name that does not refer to a file or a standard. In many cases no scheme is needed.
The <meta> element set in the NCC.HTML must be compliant with the following definition list.
In addition to the above definition list, it is allowed to use arbitrary metadata elements to support producer-specific metadata issues. Such meta elements shall carry a "prod:" prefix and shall be ignored by playback systems.
(All mandatory and some optional elements are included in this example)
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Economics</title>
  <meta http-equiv="Content-type" content='text/html; charset="iso-8859-1"' />
  <meta name="dc:title" content="Economics" />
  <meta name="dc:creator" content="Richard G. Lipsey" />
  <meta name="dc:creator" content="Paul N. Courant" />
  <meta name="dc:creator" content="Douglas D. Purvis" />
  <meta name="dc:creator" content="Peter O. Steiner" />
  <meta name="dc:date" content="2000-09-05" scheme="yyyy-mm-dd" />
  <meta name="dc:format" content="Daisy 2.02" />
  <meta name="dc:identifier" content="DTB00345" />
  <meta name="dc:language" content="EN" scheme="ISO 639" />
  <meta name="dc:publisher" content="TPB" />
  <meta name="dc:source" content="0-065-01022-1" scheme="ISBN" />
  <meta name="dc:subject" content="Qb" />
  <meta name="ncc:sourceDate" content="1993" scheme="yyyy" />
  <meta name="ncc:sourceEdition" content="1" />
  <meta name="ncc:sourcePublisher" content="Harper Collins" />
  <meta name="ncc:charset" content="iso-8859-1" />
  <meta name="ncc:generator" content="LpStudioGen 1.6" />
  <meta name="ncc:narrator" content="Timothy Ocklind" />
  <meta name="ncc:tocItems" content="1024" />
  <meta name="ncc:totalTime" content="91:27:21" scheme="hh:mm:ss" />
  <meta name="ncc:pageNormal" content="881" />
  <meta name="ncc:maxPageNormal" content="881" />
  <meta name="ncc:pageFront" content="27" />
  <meta name="ncc:pageSpecial" content="45" />
  <meta name="ncc:prodNotes" content="0" />
  <meta name="ncc:footnotes" content="0" />
  <meta name="ncc:sidebars" content="0" />
  <meta name="ncc:setInfo" content="1 of 3" />
  <meta name="ncc:depth" content="4" />
  <meta name="ncc:kByteSize" content="1530000" />
  <meta name="ncc:multimediaType" content="audioNCC" />
  <meta name="ncc:files" content="97" />
  <meta name="prod:recLocation" content="Studio 2" />
  <meta name="prod:recEngineer" content="J Klein" />
</head>
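A playback system has to read these meta statements without being case sensitive about the prefixes, and it has to ignore "prod:" entries. The sketch below shows one possible way to group the statements by prefix using Python's standard `html.parser` module. The class name `NccMetaReader`, the function `read_meta` and the "other" bucket for unprefixed names are assumptions made for illustration only.

```python
from html.parser import HTMLParser


class NccMetaReader(HTMLParser):
    """Collect <meta name=... content=... scheme=...> statements by prefix."""

    KNOWN_PREFIXES = ("dc", "ncc", "prod")

    def __init__(self):
        super().__init__()
        self.groups = {"dc": [], "ncc": [], "prod": [], "other": []}

    def handle_starttag(self, tag, attrs):
        # Also called for empty elements such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is None:  # e.g. <meta http-equiv="Content-type" ... />
            return
        prefix, separator, label = name.partition(":")
        key = prefix.lower()  # prefixes are compared case-insensitively
        if not separator or key not in self.KNOWN_PREFIXES:
            key, label = "other", name
        entry = (label, attributes.get("content"), attributes.get("scheme"))
        self.groups[key].append(entry)


def read_meta(xhtml_text):
    """Return a dict mapping prefix to a list of (label, content, scheme)."""
    reader = NccMetaReader()
    reader.feed(xhtml_text)
    return reader.groups
```

Lists rather than dictionaries are used for the entries because a metaname such as "dc:creator" may occur more than once. A player would simply skip `groups["prod"]`.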
The <body> element of a DAISY 2.02 NCC must contain the following children:
The <body> element of a DAISY 2.02 NCC may contain the following children:
No other elements than <h1> to <h6>, <span> and <div> shall occur as children of the NCC.HTML <body>.
Heading references are provided in DAISY 2.02 DTBs to support navigation to chapters and sections. Headings occurring in the DTB are represented by XHTML heading elements one through six (<h1> to <h6>). The content of the heading element should be the chapter or section name as it appears in the print source.
The general syntax of the heading element is:
<hx class="value" id="value"><a href="smil#fragment">Heading content</a></hx>
The heading element must contain the following attributes:
The heading element may contain the following attributes:
The heading element must contain the following child elements:
Class attributes may be used to provide additional semantic information. Typical values for class attributes occurring on the heading elements are:
title, jacket, front, title-page, copyright-page, acknowledgments, prolog, introduction, dedication, foreword, preface, print-toc, part, chapter, section, sub-section, minor-head, bibliography, glossary, appendix, index, index-category.
The first entry in the <body> of the NCC must be the title section of the DTB. The element used must be an <h1>. The class attribute used for this heading must be "title".
As opposed to both HTML 4.01 and XHTML 1.0, the DAISY 2.02 specification requires proper nesting of headings. A level cannot be skipped. For example an <h3> must always be preceded by an <h2> (or another <h3>) and an <h2> must be preceded by an <h1> (or another <h2>).
Example of correct nesting:
...
<h1>The nature of economics</h1>
<h2>The economic problem</h2>
<h2>Economics as a social science</h2>
<h3>The scientific approach</h3>
<h1>An overview of the market economy</h1>
...
Example of incorrect nesting (the <h4> follows an <h2>, so level 3 is skipped):
...
<h1>Demand, supply and price</h1>
<h2>Demand</h2>
<h2>Supply</h2>
<h4>Quantity supplied</h4>
<h3>Determination</h3>
<h1>Elasticity</h1>
...
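The nesting rule reduces to a simple check: a heading may be at most one level deeper than the heading before it, and the first heading must be level 1. The following sketch illustrates this. The function name `check_heading_nesting` and the representation of headings as a list of integers are assumptions for illustration.

```python
def check_heading_nesting(levels):
    """Return the index of the first heading that skips a level, or None.

    `levels` holds the heading levels (1 to 6) in NCC document order.
    """
    previous = 0  # forces the first heading to be level 1
    for index, level in enumerate(levels):
        if level > previous + 1:
            return index
        previous = level
    return None
```

Moving back up by any number of levels (for example from level 3 to level 1) is allowed; only a downward step of more than one level is reported.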
Page references are provided in DAISY 2.02 DTBs to support navigation to pages by the end users. Pages occurring in the DTB are represented by the <span> element.
The <span> element identifies the beginning of the page. The content of the <span> element is the page number as it appears in the print book.
The general syntax of the <span> element is:
<span class="value" id="value"><a href="smil#fragment">span content</a></span>
The <span> element must contain the following elements:
The <span> element must contain the following attributes:
Values for the class attributes occurring on the <span> element when used for pages are:
Blocks (groups) of text may be used for navigation. For example, paragraphs, list items, or table rows may be identified in the NCC to provide navigation by such structural elements. In the NCC the grouping element <div> is used as a placeholder to point to these blocks. The content of the <div> element is determined by the producer in relation to the navigation that will be required by the end users. If full text is used, the <div> in the NCC can point to any HTML element with a unique id.
The general syntax of the <div> element is:
<div class="value" id="value"> <a href="smil#fragment">content</a></div>
The <div> element must contain the following attributes:
The <div> element must contain the following elements:
Values for class attributes occurring on the <div> elements are:
Each child of the <body> element in the NCC.HTML used for synchronization must contain an id attribute.
The value of the id attribute must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
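The id syntax above translates directly into a regular expression. The sketch below is one way to express it; the names `ID_PATTERN` and `is_valid_id` are hypothetical.

```python
import re

# First character: a letter. Remaining characters: letters, digits,
# hyphens, underscores, colons and periods, in any number.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")


def is_valid_id(value):
    """Return True if `value` follows the id syntax described above."""
    return ID_PATTERN.fullmatch(value) is not None
```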
Each child of the <body> element in the NCC.HTML document must contain an anchor element. The href attribute makes this anchor the source anchor of exactly one link.
The general syntax of the <a> element is:
<a href="smil#fragment">document content</a>
The <a> element must contain the following attributes:
Document content must always occur between the anchor start and close tags.
The href attribute defines the link between the source anchor (current position in the NCC) and the destination anchor (a location within a SMIL document). This is expressed as a URI where a fragment identifier points to the id of a location within the resource, i.e. the SMIL document.
In DAISY 2.02 books, it is required that the destination anchor resides within the SMIL <par> or within the SMIL <text> element. Pointers to other types of elements must not occur.
Use of the <par> element as destination anchor is recommended. If the target is a <text> element, it is highly recommended that that <text> element be the first element within its parent <par>.
All child elements of the NCC.HTML <body> must be synchronization points, i.e. they must contain an <a> element with an href attribute pointing to a destination anchor within a <par> or <text> element of a SMIL file.
<body>
  <h1 class="title" id="econ_0001"><a href="econ0001.smil#ec1a_0001">Economics by Richard G. Lipsey et al..</a></h1>
  <h1 id="econ_0002"><a href="econ0002.smil#ec2a0001">Information about the talking book</a></h1>
  <h1 id="econ_0003"><a href="econ0003.smil#ec3a0001">Contents in Brief</a></h1>
  <span class="page-front" id="econ_0004"><a href="econ0003.smil#ec3a0002">iv</a></span>
  <span class="page-front" id="econ_0005"><a href="econ0003.smil#ec3a0003">v</a></span>
  <span class="page-front" id="econ_0006"><a href="econ0003.smil#ec3a0004">vi</a></span>
  ...
  <span class="page-normal" id="econ_0021"><a href="econ0008.smil#ec8a0003">1</a></span>
  <h1 class="part" id="econ_0022"><a href="econ0009.smil#ec9a0001">Part 1. The nature of economics</a></h1>
  <h2 class="chapter" id="econ_0023"><a href="econ0010.smil#ec100001">1. The economic problem</a></h2>
  <span class="page-normal" id="econ_0024"><a href="econ0010.smil#ec100002">2</a></span>
  <span class="page-normal" id="econ_0025"><a href="econ0010.smil#ec100003">3</a></span>
  <h3 id="econ_0026"><a href="econ0010.smil#ec100004">What is economics?</a></h3>
  <h4 id="econ_0027"><a href="econ0010.smil#ec100005">Resources and commodities</a></h4>
  <h4 id="econ_0028"><a href="econ0010.smil#ec100006">Scarcity</a></h4>
  <h4 id="econ_0029"><a href="econ0010.smil#ec100007">Choice</a></h4>
  <span class="page-normal" id="econ_0031"><a href="econ0010.smil#ec100008">4</a></span>
  ...
  <span class="page-special" id="econ_0934"><a href="econ0085.smil#ec850027">G-1</a></span>
  <h1 class="glossary" id="econ_0935"><a href="econ0086.smil#ec860001">Glossary</a></h1>
  <div id="econ_0936" class="group"><a href="econ0086.smil#ec860002">Glossary item 1</a></div>
  <div id="econ_0937" class="group"><a href="econ0086.smil#ec860003">Glossary item 2</a></div>
  <div id="econ_0938" class="group"><a href="econ0086.smil#ec860004">Glossary item 3</a></div>
  ...
  <h1 id="econ_0946"><a href="econ0088.smil#ec880001">Ending announcement</a></h1>
</body>
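The requirements on the children of the NCC <body> (permitted element types, an id attribute, and an anchor whose href carries a fragment identifier) can be sketched as a small check. This is an illustration only: the function name `check_ncc_body` is hypothetical, the input is assumed to be a complete, well-formed XHTML document in the XHTML namespace, and the check does not open the SMIL file to confirm that the fragment targets a <par> or <text> element.

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
ALLOWED = {"h1", "h2", "h3", "h4", "h5", "h6", "span", "div"}


def check_ncc_body(xhtml_text):
    """Return a list of problems found among the children of <body>."""
    root = ET.fromstring(xhtml_text)
    body = root.find(XHTML + "body")
    if body is None:
        return ["no <body> element found"]

    problems = []
    for position, child in enumerate(body, start=1):
        tag = child.tag.replace(XHTML, "")
        if tag not in ALLOWED:
            problems.append(f"child {position}: <{tag}> is not permitted")
            continue
        if not child.get("id"):
            problems.append(f"child {position}: <{tag}> has no id attribute")
        anchor = child.find(XHTML + "a")
        href = anchor.get("href") if anchor is not None else None
        if not href or "#" not in href:
            problems.append(f"child {position}: <{tag}> lacks an href with a fragment")
    return problems
```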
The NCC may include references to elements that can be turned on or off by the end user. The default style for these items should be "on". Note references, non-essential sidebars, optional producer notes, and page numbers fall into this category. The XHTML <span> element must be used to indicate these items in the NCC <body>.
The <span> element must contain the following attributes:
The <span> element must contain the following elements:
Values for class attributes occurring on the <span> element are:
<span class="sidebar" id="econ0047"> <a href="econ0056.smil#ec560004">Sidebar</a> </span>
<span class="optional-prodnote" id="econ0057"> <a href="econ0060.smil#ec600004">Producer's Note</a> </span>
<span class="noteref" id="ncc003"> <a href="smil001.smil#par002"><sup>1</sup></a> </span>
The SMIL 1.0 system-required test attribute shall be used to control selective playback of content such as sidebars, footnotes, producer notes, and page numbers. The system-required attribute must occur on <par> elements.
Values for system-required attributes occurring on the <par> element are:
For the NCC sidebar example above, the following usage of the switch mechanism in the SMIL document shall be used. (The same syntax applies to producer notes and page numbers.)
<seq dur="123.45s">
  <!--Par pre sidebar -->
  <par endsync="last" id="ec560003">
    <text src="ncc.html#econ0046" />
    <seq>
      <audio src="audio001.mp3" clip-begin="npt=0.000s" clip-end="npt=2.507s" id="phrs_0001" />
      <audio src="audio001.mp3" clip-begin="npt=3.345s" clip-end="npt=6.123s" id="phrs_0002" />
    </seq>
  </par>
  <!--Par for sidebar -->
  <par endsync="last" id="ec560004" system-required="sidebar-on">
    <text src="ncc.html#econ0047" />
    <seq>
      <audio src="audio001.mp3" clip-begin="npt=6.123s" clip-end="npt=8.345s" id="phrs_0003" />
      <audio src="audio001.mp3" clip-begin="npt=8.345s" clip-end="npt=10.567s" id="phrs_0004" />
    </seq>
  </par>
  <!--Par post sidebar -->
  <par endsync="last" id="ec560005">
    <text src="ncc.html#econ0046" />
    <seq>
      <audio src="audio001.mp3" clip-begin="npt=10.567s" clip-end="npt=22.347s" id="phrs_0005" />
      <audio src="audio001.mp3" clip-begin="npt=22.347s" clip-end="npt=26.435s" id="phrs_0006" />
    </seq>
  </par>
</seq>
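How a player might act on the system-required attribute can be sketched as a filter over the <par> elements: a <par> without the attribute is always played, and a <par> with it is played only when the end user has the corresponding item switched on. The function name `select_pars` and the representation of each <par> as a dictionary of its attributes are assumptions for illustration; the value "sidebar-on" is taken from the example above.

```python
def select_pars(pars, enabled):
    """Return the ids of the <par> elements that should be played.

    `pars` is a list of attribute dictionaries, one per <par>, in document
    order. `enabled` is the set of system-required values that the end
    user currently has switched on.
    """
    selected = []
    for attributes in pars:
        required = attributes.get("system-required")
        if required is None or required in enabled:
            selected.append(attributes["id"])
    return selected
```

With the three <par> elements of the sidebar example, `select_pars(pars, {"sidebar-on"})` keeps all three ids, while `select_pars(pars, set())` drops "ec560004".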
The footnote SMIL syntax differs from that of sidebars and prodnotes. Here, a nested <seq> is used to contain two <par> elements; the first containing the note reference, and the second containing the note body. The following SMIL syntax corresponds to the NCC footnote example above.
<body>
  <seq dur="123.45s">
    <!--Par pre footnote -->
    <par endsync="last" id="par001">
      <text src="ncc.html#ncc001" />
      <seq>
        <audio src="audio001.mp3" clip-begin="npt=0.000s" clip-end="npt=2.507s" id="phrs_0001" />
        <audio src="audio001.mp3" clip-begin="npt=3.507s" clip-end="npt=6.507s" id="phrs_0002" />
      </seq>
    </par>
    <!--nested seq begins here-->
    <seq>
      <!--Par for note reference -->
      <par endsync="last" id="par002">
        <text src="ncc.html#ncc003" />
        <seq>
          <audio src="audio001.mp3" clip-begin="npt=6.507s" clip-end="npt=7.507s" id="phrs_0003" />
        </seq>
      </par>
      <!--Par for note body -->
      <par endsync="last" id="par003" system-required="footnote-on">
        <text src="ncc.html#ncc003" />
        <seq>
          <audio src="audio009.mp3" clip-begin="npt=7.507s" clip-end="npt=10.507s" id="phrs_0004" />
          <audio src="audio009.mp3" clip-begin="npt=10.507s" clip-end="npt=12.507s" id="phrs_0005" />
        </seq>
      </par>
      <!--nested seq ends here-->
    </seq>
    <!--Par post footnote -->
    <par endsync="last" id="par004">
      <text src="ncc.html#ncc001" />
      <seq>
        <audio src="audio001.mp3" clip-begin="npt=12.507s" clip-end="npt=14.507s" id="phrs_0006" />
        <audio src="audio001.mp3" clip-begin="npt=14.507s" clip-end="npt=18.507s" id="phrs_0007" />
      </seq>
    </par>
  </seq>
</body>
Playback systems, both hardware devices and software, require a mechanism to provide audio descriptions of the structural elements of a book. Though some playback systems may support synthetic speech presentation of textual information, pre-recorded audio announcements are preferred as they are most natural and can be supported by all devices. Pre-recorded messages allow a playback device to correctly announce elements such as "chapter", "section" or any other class name.
A resource file can be used to achieve this. Resources in this case are individual audio files which are associated with structural elements. In traditional software development models, a single resource file may point to or contain multiple resource definitions. Presently, there are no existing standards for defining resource files for use with this particular application.
Audio Style Sheets are used for defining this resource information. Audio Style Sheets are part of the W3C's Cascading Style Sheet recommendation. In addition to providing resource definitions to be used by playback systems, Audio Cascading Style Sheets (ACSS) may be used for general audio styling during playback of DAISY 2.02 DTBs.
Audio Style Sheets provide the necessary definition to associate an audio cue with a structural element. The structural element may be an element name (such as h1) or a class name (such as chapter). The playback system will interpret the Audio Style Sheet and use this information to play out the elements to the end user. Examples include position ("where am I" announcements) and structural information.
Generalized audio styling may be supported by playback systems to implement "earcons" and to format textual content for presentation with speech synthesis. This particular utilization of audio styles is not formally defined or required in DAISY 2.02.
This is an example of a file which associates an audio file with a document element (h1) and with two class names ("chapter" and "subhead").
h1 {cue-before: url("resources/en-h1.wav")}
.chapter {cue-before: url("resources/en-chapter.mp3")}
.subhead {cue-before: url("resources/en-subhead.mp3")}
The uri value points to the file which contains the audio to be associated with the element or class name. The location of the audio resource files is left to the producer. In this example, the audio resource files are stored in a "resources" sub-directory.
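To show how a playback system might use such a resource file, the sketch below reads rules of the single form shown in the example and looks up the cue for an element. It is a toy reader under stated assumptions, not a CSS parser: it only understands one `cue-before` declaration per rule, it accepts the `url(...)` notation of CSS 2 and, leniently, a `uri(...)` spelling, and it assumes that a class cue takes precedence over an element cue. The names `read_cues` and `cue_for` are hypothetical.

```python
import re

# One rule: a simple selector, then a single cue-before declaration.
RULE = re.compile(
    r'([.\w-]+)\s*\{\s*cue-before:\s*ur[il]\(\s*"([^"]+)"\s*\)\s*;?\s*\}'
)


def read_cues(css_text):
    """Map each selector (e.g. "h1" or ".chapter") to its audio file."""
    return {selector: path for selector, path in RULE.findall(css_text)}


def cue_for(cues, element_name, class_name=None):
    """Return the audio cue for an element, or None if none is defined."""
    if class_name is not None and "." + class_name in cues:
        return cues["." + class_name]  # assumption: class cue wins
    return cues.get(element_name)
```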
The standard mechanism for associating a Style Sheet with an XHTML document is through the <link> element. The format of the <link> element is:
<link rel="stylesheet" href="en-resource.css" media="aural" title="daisy Resource File" />
The style sheet link reference should be contained in the NCC.HTML file. Individual XHTML source files may also include additional style sheet references for visual and audio presentation. These must be specified in the documents themselves, using the XHTML style sheet <link> element, and reference the media attribute value appropriate to the stylesheet.
The NCC.HTML document may be used as the carrier of text content if the text content is limited to headings, pages, and simple references to other content, i.e. content conforming to DTB types 1 and 2. The use of additional text content documents is required if there is extensive text content in the DTB, i.e. content conforming to DTB types 3 to 6.
Additional text content documents must be XHTML 1.0 compliant documents. Use of HTML 4.01 is deprecated in this version of the DAISY DTB specification.
The <head> element of a text content document must contain the following children:
The <head> element of a text content document may contain the following children:
In the <body> of text content documents, simple, logical use of structural block elements such as paragraphs, lists, tables, and preformatted text is recommended. Use of tables to format text is discouraged. Use of frames and other graphical formatting features is discouraged.
Multiple XHTML files may be used to carry the text content. This is more often the case with DTBs that are full text productions.
To correctly identify the order of the XHTML files, the XHTML link element may be used.
<head>
  ...
  <link rel="start" href="first-file.html" />
  <link rel="previous" href="previous-file.html" />
  <link rel="next" href="next-file.html" />
  ...
</head>
The SMIL document of a DAISY DTB is a SMIL 1.0 compliant document that provides the text-audio synchronization functionality for all or defined segments of the DTB content. In text-audio synchronized DAISY DTBs, each SMIL document is in itself a continuous sequence that contains one or several parallel time groupings (synchronization units) referring to text and/or audio media objects.
The following describes the document structure convention for implementation of SMIL 1.0 in DAISY 2.02 DTBs.
The <head> element must contain the following children:
The <head> element may contain the following children:
The syntax of <meta> elements used in the SMIL <head> corresponds to those used in NCC.HTML - see section 2.1.2.
Additionally, the SMIL <head> may contain a <layout> element that determines how elements in the document's <body> are positioned on an abstract rendering surface (visual or acoustic).
The <layout> element defines how the <text> elements in the document's <body> are positioned when rendered. If a document contains no layout element, the positioning of the body elements is implementation-dependent.
The general syntax of the <layout> element is:
<layout> <region id="value" /> </layout>
The <layout> element must contain the following children:
The <region> element controls the position, size and scaling of <text> elements. A region element is applied to a <text> element by setting the region attribute of the positionable element to the id value of the region.
Attributes occurring on the <region> element are:
The <body> element of a DAISY 2.02 SMIL must contain the following children:
The <seq> element is a time container whose children form a temporal sequence. In the following definition lists, the <seq> element that occurs as a child of <body> is referred to as the "main" <seq>, and <seq> elements that are nested are referred to as "nested".
The main <seq> element of a DAISY 2.02 SMIL must contain the following children:
The main <seq> element of a DAISY 2.02 SMIL may contain the following children:
Nested <seq> elements of a DAISY 2.02 SMIL must contain one of the following children:
A nested <seq> element must not have children of different types.
Attributes occurring on the <seq> element are
The <par> element is a time container whose children do not form a temporal sequence; as opposed to the <seq> element they instead occur simultaneously. In other words, media objects within a <par> element are synchronized with each other.
The <par> element of a DAISY 2.02 SMIL must contain the following children:
The <par> element of a DAISY 2.02 SMIL may contain the following children:
Attributes occurring on the <par> element are
The <text> element is a media object without an intrinsic duration. In the case of DAISY DTB types 1 to 5, its duration will be determined by the audio with which it is synchronized.
Attributes occurring on the <text> element are
The value of the src attribute is the URI of the media object. In this case it is a pointer to the currently synchronized XHTML document content - a heading, a page number, block text etc.
If any <text> element src attribute refers to the file NCC.HTML, then no other text element within the same DTB should contain a src attribute which refers to a file other than the NCC.HTML file.
The <audio> element is a media object with an intrinsic duration.
Attributes occurring on the <audio> element are
The value of the src attribute is the URI of the media object. In this case it is a pointer to the file containing the currently synchronized audio. If the referenced audio object is not to be played as a whole, subpositions are defined by the attributes clip-begin and clip-end, specifying the beginning and the end of the sub-clip. If several sub-positions occur sequentially they are placed within a seq element.
The values of the clip-begin and clip-end attributes must be a metric specifier followed by a clock value. Of the metric specifiers allowed by the SMIL 1.0 specification, DAISY DTBs must use "Normal Play Time". The metric specifier is "npt", and the clock value must be expressed in legal SMIL timecount values, using the default SMIL 1 timecount (seconds). It is highly recommended to specify fractions of seconds in millisecond resolution.
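A clip value of this form can be read with a short regular expression. The sketch below is illustrative: the names `parse_npt` and `clip_duration` are hypothetical, and treating the trailing "s" as optional is an assumption based on seconds being the default SMIL 1.0 timecount unit. Other SMIL clock value forms (such as "hh:mm:ss") are rejected.

```python
import re

# "npt=" followed by seconds, an optional fraction, and an optional "s".
NPT = re.compile(r"npt=(\d+(?:\.\d+)?)s?")


def parse_npt(value):
    """Return the clip position in seconds, e.g. "npt=2.507s" gives 2.507."""
    match = NPT.fullmatch(value)
    if match is None:
        raise ValueError(f"not an npt timecount in seconds: {value!r}")
    return float(match.group(1))


def clip_duration(clip_begin, clip_end):
    """Return the length in seconds of the sub-clip between two positions."""
    return parse_npt(clip_end) - parse_npt(clip_begin)
```

Because the result is a floating point number, durations should be compared with a small tolerance rather than for exact equality.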
A <par> must never contain several media objects of the same type - for example, not more than one <text> and one <audio> reference (unless the media objects are contained within a nested <seq>).
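This rule concerns only the direct children of a <par>; media objects inside a nested <seq> are not counted. A minimal sketch of the check, using Python's standard `xml.etree.ElementTree` (the function name `check_par` is hypothetical, and the SMIL elements are assumed to carry no XML namespace):

```python
import xml.etree.ElementTree as ET


def check_par(par):
    """Return a list of problems for one <par> element (empty if none)."""
    problems = []
    for media in ("text", "audio"):
        # findall() with a bare tag name matches direct children only,
        # so <audio> elements inside a nested <seq> are ignored.
        count = len(par.findall(media))
        if count > 1:
            problems.append(
                f"<par id={par.get('id')!r}> has {count} <{media}> children"
            )
    return problems
```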
The first <par> shows an example of where an entire audio file is to be rendered. The second <par> shows audio divided into segments (sub-positions) and contained within a <seq>.
<!DOCTYPE SMIL PUBLIC "-//W3C//DTD SMIL 1.0//EN" "http://www.w3.org/TR/REC-SMIL/SMIL10.dtd">
<smil>
  <head>
    <meta name="ncc:generator" content="LpStudioGen v1.6" />
    <meta name="dc:identifier" content="DTB00345" />
    <meta name="dc:format" content="Daisy 2.02" />
    <meta name="dc:title" content="Economics" />
    <meta name="title" content="13. Monopoly" />
    <meta name="ncc:totalElapsedTime" content="14:04:48" />
    <meta name="ncc:timeInThisSmil" content="0:02:38" />
    <layout>
      <region id="txtView" />
    </layout>
  </head>
  <body>
    <seq dur="158.485s">
      <par endsync="last" id="ec79_0002">
        <text src="ncc.html#econ_0346" id="ec79_0003" />
        <audio src="econ25000c.mp3" id="ec79_0004" />
      </par>
      <par endsync="last" id="ec79_0005">
        <text src="ncc.html#econ_0348" id="ec79_0006" />
        <seq id="ec79_0007">
          <audio src="econ25000d.mp3" clip-begin="npt=72.200s" clip-end="npt=74.659s" id="phrs_0011" />
          <audio src="econ25000d.mp3" clip-begin="npt=74.659s" clip-end="npt=81.269s" id="phrs_0012" />
          <audio src="econ25000d.mp3" clip-begin="npt=81.269s" clip-end="npt=91.691s" id="phrs_0013" />
          <audio src="econ25000d.mp3" clip-begin="npt=91.691s" clip-end="npt=95.477s" id="phrs_0014" />
          <audio src="econ25000d.mp3" clip-begin="npt=95.477s" clip-end="npt=110.575s" id="phrs_0015" />
          <audio src="econ25000d.mp3" clip-begin="npt=110.575s" clip-end="npt=115.777s" id="phrs_0016" />
          <audio src="econ25000d.mp3" clip-begin="npt=115.777s" clip-end="npt=119.980s" id="phrs_0017" />
          <audio src="econ25000d.mp3" clip-begin="npt=119.980s" clip-end="npt=127.816s" id="phrs_0018" />
          <audio src="econ25000d.mp3" clip-begin="npt=127.816s" clip-end="npt=135.288s" id="phrs_0019" />
          <audio src="econ25000d.mp3" clip-begin="npt=135.288s" clip-end="npt=141.507s" id="phrs_0020" />
          <audio src="econ25000d.mp3" clip-begin="npt=141.507s" clip-end="npt=158.485s" id="phrs_0021" />
        </seq>
      </par>
    </seq>
  </body>
</smil>
The first <text> element in the first <par> of each SMIL document must refer to a heading element (h1 through h6). In other words, each SMIL document must begin by providing synchronization for an HTML heading. It is recommended to begin with h1 headings. Other than this, there is no required relationship between text structure and SMIL.
There is no required relationship between audio files and SMIL documents. One SMIL document may point to several audio files (at seq and at clip level), or to non-continuous segments of an audio file.
This is an allowed SMIL <seq> segment series:
...
<audio src="econ0001.mp3" clip-begin="npt=0.000s" clip-end="npt=4.659s" id="phrs_0001" />
<audio src="econ0002.mp3" clip-begin="npt=74.659s" clip-end="npt=81.269s" id="phrs_0002" />
<audio src="econ0001.mp3" clip-begin="npt=4.659s" clip-end="npt=6.691s" id="phrs_0003" />
<audio src="econ0001.mp3" clip-begin="npt=6.691s" clip-end="npt=8.477s" id="phrs_0004" />
...
This is also an allowed SMIL <seq> segment series:
...
<audio src="econ0001.mp3" clip-begin="npt=0.000s" clip-end="npt=4.659s" id="phrs_0001" />
<audio src="econ0001.mp3" clip-begin="npt=74.659s" clip-end="npt=81.269s" id="phrs_0002" />
<audio src="econ0001.mp3" clip-begin="npt=4.659s" clip-end="npt=6.691s" id="phrs_0003" />
<audio src="econ0001.mp3" clip-begin="npt=6.691s" clip-end="npt=8.477s" id="phrs_0004" />
...
All SMIL files must have the extension ".smil" or the extension ".SMIL".
In SMIL filenames, it is highly recommended to use only the letters ([A-Za-z]), digits ([0-9]), hyphens ("-"), and underscores ("_").
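The two filename rules above (a required extension and a recommended character set) can be checked separately. This is a sketch only; the function name and the two-flag return value are illustrative choices, not part of the specification.

```python
import re

# Characters the specification recommends for SMIL filenames.
RECOMMENDED_CHARS = re.compile(r"^[A-Za-z0-9_-]+$")


def check_smil_filename(name: str):
    """Return (has_required_extension, uses_recommended_chars).

    The extension must be exactly ".smil" or ".SMIL"; the character
    check applies to the part of the name before the last dot.
    """
    stem, dot, extension = name.rpartition(".")
    has_required_extension = dot == "." and extension in ("smil", "SMIL")
    uses_recommended_chars = RECOMMENDED_CHARS.match(stem) is not None
    return has_required_extension, uses_recommended_chars
```

Keeping the two results apart reflects the difference in strength between the rules: a failed extension check is a conformance error, while a failed character check is only a warning.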
The Master SMIL document is an optional SMIL 1.0 compliant document consisting of a list of the DTB component SMIL documents in playback order.
So that it is easily distinguished from the other SMIL documents in the DTB, the Master SMIL document must be named "master.smil" or "MASTER.SMIL".
The <head> element must contain the following children:
The <head> element may contain the following children:
The <body> element of a DAISY 2.02 Master SMIL must contain the following children:
Attributes occurring on the <ref> element are:
<!DOCTYPE SMIL PUBLIC "-//W3C//DTD SMIL 1.0//EN" "http://www.w3.org/TR/REC-SMIL/SMIL10.dtd"> <smil> <head> <meta name="dc:title" content="Economics" /> <meta name="dc:identifier" content="DTB00345" /> <meta name="dc:format" content="Daisy 2.02" /> <meta name="ncc:generator" content="LpStudioGen 1.6" /> <meta name="ncc:timeInThisSmil" content="91:27:21" /> <layout> <region id="txtView" /> </layout> </head> <body> <ref title="Economics by Richard G. Lipsey et.al." src="econ0001.smil" id="lpID_0001" /> <ref title="Information about the talking book" src="econ0002.smil" id="lpID_0002" /> <ref title="Contents in brief" src="econ0003.smil" id="lpID_0003" /> ... <ref title="Part 1. The nature of economics" src="econ0009.smil" id="lpID_0007" /> <ref title="1. The economic problem" src="econ0010.smil" id="lpID_0008" /> <ref title="2. Economics as a social science" src="econ0011.smil" id="lpID_0009" /> <ref title="3. An overview of the market economy" src="econ0012.smil" id="lpID_0010" /> ... <ref title="Ending announcement" src="econ0088.smil" id="lpID_0095" /> </body> </smil>
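Since the Master SMIL lists the component SMIL documents in playback order, a reader can recover that order by collecting the src attributes of the <ref> elements in document order. The sketch below uses Python's standard xml.etree.ElementTree module; the function name is hypothetical, and the input is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def playback_order(master_smil_text: str):
    """Return the src of each <ref> in <body>, in document order."""
    root = ET.fromstring(master_smil_text)
    body = root.find("body")
    if body is None:
        return []
    # findall() preserves document order, which is the playback order.
    return [ref.get("src") for ref in body.findall("ref")]
```

The parser used here does not fetch or validate against the SMIL 1.0 DTD named in the DOCTYPE; it only reads the element structure.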
The PCM file must use the Microsoft RIFF WAVE file format with a header indicating PCM as its format tag. The PCM file must use the file extension ".wav".
The MPEG file must use header and file structure as defined by either ISO/MPEG or Microsoft RIFF WAVE. The following subformats of MPEG Audio are supported: MPEG-1 Layer 2; MPEG-1 Layer 3; MPEG-2 Layer 2; MPEG-2 Layer 3. If the ISO/MPEG file format is used, the file extension should be ".mp2" or ".mp3" respectively.
If the Microsoft RIFF WAVE file format is used, the file extension should be ".wav".
Use of the ISO/MPEG file format and its extensions is recommended.
The ADPCM2 file must use the file extension ".wav".
In audio filenames, it is highly recommended to use only the letters ([A-Za-z]), digits ([0-9]), hyphens ("-"), and underscores ("_").
The NCC.HTML should be placed in the root of the carrier media, together with SMIL and media (audio) files. This is the default case.
There are however optional exceptions to this case, and although they are allowed by the specification, producers should use them with caution since not all playback devices may be able to handle the non-default cases correctly.
For example, an NCC.HTML placed in the root of the carrier media may have a heading element that looks like this:
<h1 id="econ0128"><a href="./subfolder1/econ0065.smil#ec650001">Chapter 1</a></h1>
and a corresponding SMIL section could look like this:
... <par endsync="last"> <text src="../ncc.html#econ0128" id="ec650001" /> <seq> <audio src="./subfolder2/econ0034.wav" clip-begin=... /> <audio src="./subfolder3/econ0035.wav" clip-begin=... /> ...
To achieve DAISY 2.02 compliance, playback devices are only required to support playback of DTBs that conform to the default case.
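In the non-default case, a player has to resolve each relative reference against the location of the file that contains it. The sketch below shows one way to do this, assuming that paths use forward slashes and are interpreted relative to the referencing file (ordinary URI-style resolution); the function name is hypothetical.

```python
import posixpath
from urllib.parse import urldefrag


def resolve_reference(base_file: str, reference: str):
    """Resolve `reference` relative to the file `base_file`.

    Both paths are relative to the root of the carrier media.
    Returns (path_from_root, fragment_id).
    """
    # Split "file.smil#id" into the file part and the fragment part.
    url, fragment = urldefrag(reference)
    base_dir = posixpath.dirname(base_file)
    # normpath collapses "./" and "../" segments.
    path = posixpath.normpath(posixpath.join(base_dir, url))
    return path, fragment
```

Note that a reference inside a SMIL file stored in a subfolder is resolved against that subfolder, not against the root of the media.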
If a DAISY DTB, due to its size, has to be placed on several (two or more) carrier media, the following method must be used.
Each carrier media must contain a full NCC document, with the following additions:
<meta name="ncc:setInfo" content="1 of 3" />
The content value ("1 of 3", "2 of 3", or "3 of 3") denotes the position of the medium within the set.
The following example shows the second disc in a set of three.
<?xml version="1.0" encoding="iso-8859-1"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> ... other head elements ... <meta name="ncc:setInfo" content="2 of 3" /> </head> <body> <h1 id="ec001" class="title"> <a href="title.smil#info0001">Title section</a></h1> ... <h1 id="ec006"><a href="ch01.smil#ec01" rel="1 of 3">Chapter 1</a></h1> <h1 id="ec007"><a href="ch02.smil#ec01" rel="1 of 3">Chapter 2</a></h1> <h1 id="ec008"><a href="ch03.smil#ec01" rel="1 of 3">Chapter 3</a></h1> <h1 id="ec009"><a href="ch04.smil#ec01" rel="1 of 3">Chapter 4</a></h1> <h1 id="ec010"><a href="ch05.smil#ec01" rel="1 of 3">Chapter 5</a></h1> <h1 id="ec011"><a href="ch06.smil#ec01">Chapter 6</a></h1> <h1 id="ec012"><a href="ch07.smil#ec01">Chapter 7</a></h1> <h1 id="ec013"><a href="ch08.smil#ec01">Chapter 8</a></h1> <h1 id="ec014"><a href="ch09.smil#ec01">Chapter 9</a></h1> <h1 id="ec015"><a href="ch10.smil#ec01">Chapter 10</a></h1> <h1 id="ec016"><a href="ch11.smil#ec01" rel="3 of 3">Chapter 11</a></h1> <h1 id="ec017"><a href="ch12.smil#ec01" rel="3 of 3">Chapter 12</a></h1> <h1 id="ec018"><a href="ch13.smil#ec01" rel="3 of 3">Chapter 13</a></h1> <h1 id="ec019"><a href="ch14.smil#ec01" rel="3 of 3">Chapter 14</a></h1> <h1 id="ec020"><a href="ch15.smil#ec01" rel="3 of 3">Chapter 15</a></h1> </body> </html>
The following example corresponds to "ch01.smil" in the above NCC example:
<!DOCTYPE SMIL PUBLIC "-//W3C//DTD SMIL 1.0//EN" "http://www.w3.org/TR/REC-SMIL/SMIL10.dtd"> <smil> <head> ...other head elements... <layout> <region id="txtView" /> </layout> </head> <body> <seq dur="4.024s"> <par endsync="last"> <text src="ncc.html#ec001" id="info0001" /> <audio src="please_insert_cd_1.wav" clip-begin="npt=0.000s" clip-end="npt=1.456s" id="info0004" /> </par> </seq> </body> </smil>
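A player handling a multi-volume DTB needs to compare the "k of n" value of the current medium with the rel value on a link to decide whether another medium must be inserted. The following is a rough sketch of that comparison. It assumes, based on the NCC example above, that a link without a rel attribute points to content on the current medium; the function names are hypothetical.

```python
import re

# The "k of n" scheme used by ncc:setInfo and by the rel attribute.
K_OF_N = re.compile(r"^(\d+) of (\d+)$")


def parse_k_of_n(value: str):
    """Parse a "k of n" string into the integer pair (k, n)."""
    match = K_OF_N.match(value.strip())
    if match is None:
        raise ValueError(f"not a 'k of n' value: {value!r}")
    k, n = int(match.group(1)), int(match.group(2))
    if not 1 <= k <= n:
        raise ValueError(f"volume number out of range: {value!r}")
    return k, n


def volume_to_insert(set_info: str, rel=None):
    """Return the volume the user must insert, or None if none is needed.

    Assumption: a link without rel targets the current volume.
    """
    current, total = parse_k_of_n(set_info)
    if rel is None:
        return None
    target, rel_total = parse_k_of_n(rel)
    if rel_total != total:
        raise ValueError("rel and setInfo disagree on the set size")
    return None if target == current else target
```

With the second disc of three loaded, a link carrying rel="1 of 3" yields 1, and a link without rel yields None.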
If more than one DAISY DTB is provided on the same carrier media, the following method must be used.
Each DTB must be stored in a separate folder (directory). A file called discinfo.html, containing links to all DTBs on the media, is created in the root of the carrier media. This document must be structured as shown in section 3.3.1.
<?xml version="1.0" encoding="iso-8859-1"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <title>CD Information</title> <meta http-equiv="Content-type" content='text/html; charset="iso-8859-1"' /> </head> <body> <a href="./book1/ncc.html">Economics</a> <a href="./book2/ncc.html">Ecology</a> </body> </html>
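A player can build its list of available books by reading the links in discinfo.html. The sketch below assumes the file is well-formed XHTML as in the example above and is passed in as raw bytes, so that the parser can honor the declared encoding; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements in an XHTML document live in this namespace.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def list_books(discinfo_bytes: bytes):
    """Return (link_text, href) for every <a href> in discinfo.html."""
    root = ET.fromstring(discinfo_bytes)
    books = []
    for anchor in root.iter(XHTML_NS + "a"):
        href = anchor.get("href")
        if href:
            # itertext() also collects text nested in child elements.
            books.append(("".join(anchor.itertext()).strip(), href))
    return books
```

Each returned href points at the NCC document of one DTB, relative to the root of the carrier media.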
Use of a file system that allows long file names (more than 8+3 characters) is required.
For CD-ROM, use of the "Joliet" extension to ISO 9660 is recommended. The "Romeo" file system shall not be used.
Prefix | Label | Scheme | Occurrence |
ncc | charset | - | mandatory |
dc | contributor | - | optional* |
dc | coverage | - | optional* |
dc | creator | - | mandatory (if a creator is known)* |
dc | date | ISO 8601 | mandatory |
ncc | depth | - | optional - recommended |
dc | description | - | optional* |
ncc | files | - | optional - recommended |
ncc | footnotes | - | mandatory if footnotes are used |
dc | format | - | mandatory |
ncc | generator | - | optional |
- | http-equiv | - | optional |
dc | identifier | - | mandatory |
ncc | kByteSize | - | optional |
dc | language | ISO 639-1/ISO 3166 | mandatory* |
ncc | maxPageNormal | - | optional - recommended |
ncc | multimediaType | - | optional - recommended |
ncc | narrator | - | optional - recommended* |
ncc | pageFront | - | mandatory |
ncc | pageNormal | - | mandatory |
ncc | pageSpecial | - | mandatory |
ncc | prodNotes | - | mandatory if producers' notes are used |
ncc | producedDate | ISO 8601 | optional |
ncc | producer | - | optional |
dc | publisher | - | mandatory |
dc | relation | - | optional* |
ncc | revision | - | optional |
ncc | revisionDate | ISO 8601 | optional |
dc | rights | - | optional* |
ncc | setInfo | k of n | mandatory in multiple volume DTBs |
ncc | sidebars | - | mandatory if sidebars are used |
dc | source | ISBN | optional - recommended |
ncc | sourceDate | ISO 8601 | optional - recommended |
ncc | sourceEdition | - | optional - recommended |
ncc | sourcePublisher | - | optional - recommended |
ncc | sourceRights | - | optional |
ncc | sourceTitle | - | optional (mandatory if the titles differ) |
dc | subject | - | optional - recommended* |
dc | title | - | mandatory |
ncc | tocItems | - | mandatory |
ncc | totalTime | hh:mm:ss | mandatory |
dc | type | - | optional* |
* = this meta element may occur multiple times within the NCC meta element set.
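A validation tool can use the table above to report missing NCC meta elements. The sketch below covers only the entries marked "mandatory" without a condition; conditional entries such as dc:creator, ncc:footnotes, or ncc:setInfo depend on the book's content and are left out. The function name is hypothetical, and names are assumed to be given in "prefix:label" form.

```python
# Entries marked "mandatory" with no condition in the NCC table above.
ALWAYS_MANDATORY = frozenset({
    "ncc:charset",
    "dc:date",
    "dc:format",
    "dc:identifier",
    "dc:language",
    "ncc:pageFront",
    "ncc:pageNormal",
    "ncc:pageSpecial",
    "dc:publisher",
    "dc:title",
    "ncc:tocItems",
    "ncc:totalTime",
})


def missing_mandatory(meta_names):
    """Return the unconditionally mandatory names absent from meta_names.

    meta_names is any iterable of "prefix:label" strings; repeated
    names (allowed for the starred entries) are harmless.
    """
    return sorted(ALWAYS_MANDATORY - set(meta_names))
```

An empty result means every unconditionally mandatory element is present; it does not say anything about the conditional ones.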
Prefix | Label | Scheme | Occurrence |
dc | format | - | mandatory |
ncc | generator | - | optional |
dc | identifier | - | optional - recommended |
ncc | timeInThisSmil | SMIL clock | optional - recommended |
- | title | - | optional |
dc | title | - | optional |
ncc | totalElapsedTime | SMIL clock | optional - recommended |
Prefix | Label | Scheme | Occurrence |
dc | format | - | mandatory |
ncc | generator | - | optional |
dc | identifier | - | mandatory |
ncc | timeInThisSmil | SMIL clock | optional - recommended |
dc | title | - | mandatory |
The XHTML and SMIL files can all be validated using standard XML validation tools. The NCC and the XHTML content of the book can be validated against the XHTML 1.0 Transitional DTD. The SMIL files can be validated with an XML parser against the SMIL 1.0 DTD.
In addition to these standard tests by validation tools, several test books will be made available. Player manufacturers that wish to meet the DAISY 2.02 specification should use these specifications and the test books to perform internal testing. If all of the books provided in this test suite can be played by the player, then the player manufacturer may request to have their player certified as DAISY 2.02 compliant.
Manufacturers of authoring tools must ensure that their tools meet these specifications. If an authoring tool creates content that meets these specifications and performs like the test suite materials, the developer may request to have the tool certified as DAISY 2.02 compliant.
To receive DAISY Consortium certification of their claims, the player or authoring tool must be submitted to the DAISY Consortium for performance testing.