DAISY Pipeline: Word document to DTBook XML
The DAISY Pipeline app now includes the Word to DTBook script, so Word documents can be converted directly to DTBook XML in DAISY Pipeline. Previously, the Save as DAISY add-in was the only tool available to create DTBook XML.
The steps for using the Word to DTBook script are listed below.
Prepare the Word document
Format the document in Word according to accessibility guidelines. A properly structured Word document is essential for creating accessible formats such as DAISY and EPUB. Use styles (such as Heading 1, Heading 2, etc.) and add alt text to images.
The tutorial Creating accessible Word documents explains this step.
Make sure to:
- Use built-in heading styles.
- Create alt text for images.
- Mark language and reading order.
- Avoid blank paragraphs for spacing.
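The checklist above can be partly verified before conversion. The following is a minimal sketch, not part of DAISY Pipeline, that opens a .docx file (a ZIP archive) and counts heading paragraphs, empty paragraphs, and drawings that have no alt text. The function names `check_document_xml` and `check_docx` are hypothetical, and the heading test assumes style IDs that start with "Heading", which holds for English versions of Word but may not hold for localized ones.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used inside word/document.xml of a .docx file.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def check_document_xml(xml_bytes):
    """Return a rough accessibility report for one document.xml payload."""
    root = ET.fromstring(xml_bytes)
    headings = 0
    blank_paragraphs = 0
    for para in root.iter(f"{{{W}}}p"):
        style = para.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        style_id = style.get(f"{{{W}}}val", "") if style is not None else ""
        # Assumption: built-in heading style IDs start with "Heading"
        # (true for English Word; localized builds may use other IDs).
        if style_id.lower().startswith("heading"):
            headings += 1
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        has_other_content = (
            para.find(f".//{{{W}}}drawing") is not None
            or para.find(f".//{{{W}}}br") is not None
        )
        # Heuristic: no text, no drawing, no break means a spacer paragraph.
        if not text.strip() and not has_other_content:
            blank_paragraphs += 1
    # wp:docPr carries a drawing's alt text in its "descr" attribute.
    images_without_alt = [
        props.get("name", "(unnamed)")
        for props in root.iter(f"{{{WP}}}docPr")
        if not (props.get("descr") or "").strip()
    ]
    return {
        "headings": headings,
        "blank_paragraphs": blank_paragraphs,
        "images_without_alt": images_without_alt,
    }


def check_docx(path):
    """Read word/document.xml from a .docx file and report on it."""
    with zipfile.ZipFile(path) as archive:
        return check_document_xml(archive.read("word/document.xml"))
```

Calling `check_docx("book.docx")` (a hypothetical file name) returns a dictionary; a heading count of zero or a non-empty `images_without_alt` list suggests the document needs more preparation in Word. The check does not cover language marking or reading order.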
Use the Word to DTBook script in DAISY Pipeline
Install the DAISY Pipeline app. See the DAISY Pipeline App Quick Start Guide for the download link and an overview of the app.
1. Run DAISY Pipeline: Click the DAISY Pipeline icon on the desktop to start it. The Pipeline engine may take a few minutes to start.
2. Select script for conversion: On the DAISY Pipeline main screen, select Word to DTBook in the Select Script list box. Alternatively, drag and drop the Word file into the DAISY Pipeline window; the Word to DTBook script is then selected automatically. You can also click the Browse button and select a folder containing Word documents, which is especially helpful when you want to convert several files together. When a folder is selected, the Word files in it are listed in the DAISY Pipeline window. Select the checkboxes next to the files you want to convert, and then click the Create Job button.
3. Configure the script
- Input docx file: Click the Browse button and select the formatted Word document. If you dropped files into the Pipeline window or selected files from a folder, they are shown here and the Browse button is unavailable.
- Custom job name: Optional. Provide a job name if desired.
- Document title: Provide the title of the document. It is included in the metadata.
- Document author: Enter the name of the author of the content. If there is more than one author, separate the names with commas.
- Document publisher: Provide the name of the publisher of the content.
- Document identifier: Provide a unique identifier for the content, such as an ISBN.
- Subject(s): Provide the subject of the content. It is added as dc:Subject metadata in the XML.
- Accept revisions: Check this checkbox to accept any revisions in the document that have not yet been accepted.
- Pagination mode: Choose how page breaks are inserted in the XML. The two options are numbers that have the PageNumberDAISY style and Word page breaks.
- Image resizing: Choose between Keep image size, Resize images or Resample images.
- Image resampling value: If you choose to resample images, provide the target resolution in dpi (dots per inch).
- Translate character styles: Check this checkbox if you want to retain the character styles of the document, such as bold and italics.
- Footnotes position: Choose where footnotes are placed. The three options are near the paragraph containing the footnote reference, near the page break, or at the end of the level (defined in the next field).
- Footnotes insertion level: If you choose the third option in the previous field, also specify the content level. A value of 0 means each footnote is inserted as close as possible to its first reference.
- Footnotes numbering: You can choose between Use original Word numbering, Disable note numbering or Use custom numbering.
- Footnotes starting value: If you choose custom numbering in the previous field, enter the starting number for the footnotes here.
- Footnotes number prefix: Optionally, enter a prefix to place before the note’s number.
- Footnotes number suffix: Optionally, enter text to place between the note’s number and the note’s content.
- Extract vector shapes (Experimental): If this checkbox is checked, DAISY Pipeline tries to export inline shapes such as diagrams or charts during conversion using Microsoft Word; otherwise, those shapes are replaced by their name and description in the XML. Use this experimental feature with caution.
- Run: Finally, click the Run button to start the conversion process.
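To picture how the footnote numbering options in the list above interact, here is a small illustrative sketch. It is not the code DAISY Pipeline runs. The function name `note_labels` and the mode names are hypothetical, and it assumes that the prefix and suffix are applied in both the original and custom numbering modes, which the script may handle differently.

```python
def note_labels(original_numbers, mode="original", start=1, prefix="", suffix=""):
    """Build the label shown before each note's content.

    mode is one of (hypothetical names):
      "original" - keep the numbers found in the Word document
      "none"     - disable note numbering
      "custom"   - renumber from `start`
    """
    if mode == "none":
        return ["" for _ in original_numbers]
    if mode == "custom":
        numbers = range(start, start + len(original_numbers))
    elif mode == "original":
        numbers = original_numbers
    else:
        raise ValueError(f"unknown numbering mode: {mode}")
    # Assumption: the prefix goes before the number, the suffix after it.
    return [f"{prefix}{n}{suffix}" for n in numbers]
```

For example, `note_labels([1, 2, 3], mode="custom", start=5, prefix="[", suffix="] ")` gives the labels `"[5] "`, `"[6] "` and `"[7] "`.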
The status of the conversion is shown at the top of the window. When the status changes to Completed, click the Open folder link in the Results section to view the DTBook XML created by DAISY Pipeline. The Messages section is most useful when the conversion fails and the status shows errors.
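Once the DTBook XML has been created, a quick script can confirm that the metadata, page numbers, and headings came through. The sketch below is an assumption-laden example rather than a DAISY tool. It assumes the output uses the DTBook 2005 namespace and stores metadata in `meta` elements with `name` and `content` attributes. The function name `summarize_dtbook` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the output uses the DTBook 2005 namespace.
DTB = "http://www.daisy.org/z3986/2005/dtbook/"
HEADING_TAGS = {f"{{{DTB}}}h{level}" for level in range(1, 7)}


def summarize_dtbook(xml_source):
    """Summarize metadata, page numbers and headings of a DTBook document."""
    root = ET.fromstring(xml_source)
    metadata = {}
    for meta in root.iter(f"{{{DTB}}}meta"):
        name = meta.get("name")
        if name:
            # Keep every value, since a name such as dc:Creator may repeat.
            metadata.setdefault(name, []).append(meta.get("content", ""))
    pages = [(p.text or "").strip() for p in root.iter(f"{{{DTB}}}pagenum")]
    headings = sum(1 for element in root.iter() if element.tag in HEADING_TAGS)
    return {"metadata": metadata, "pages": pages, "headings": headings}
```

To try it, read the XML file from the results folder as bytes and pass it to `summarize_dtbook`. An empty `pages` list or a heading count of zero is a hint to recheck the pagination mode or the heading styles in the Word document.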