Advanced Guide to WordToEPUB

(WordToEPUB version: 1.0.0)

Introduction

You can begin using WordToEPUB very easily, as described in the companion document Getting started with WordToEPUB. That document covers installing, updating and uninstalling the tool, and the basic steps to create an EPUB from a Word document.

This guide provides information on more advanced features, for people who want more control of the EPUBs they are creating, and who want to make use of the more sophisticated features.

Create an accessible document in Microsoft Word

The EPUBs you create with this tool are made from regular Word documents. However, the more care you take over your document, the better the experience your readers will ultimately have with the EPUBs you create. For helpful tips on making better Word documents, see our guide Creating accessible Word documents.

Additional Word authoring considerations for WordToEPUB

Including Math as MathML

Math expressions included in Word’s math format (OMML, Office Math Markup Language) are converted and placed in the EPUB as MathML. A growing number of reading systems support MathML, but also see the later section on Math as an image.

There are several ways to add math expressions into your document:

Use Insert / Equation and use Word’s Equation editor to build an expression, which will be stored in the document as OMML. This works well for many expressions, but some people prefer other methods.

If you are familiar with LaTeX then you can use this to enter math expressions. Use Insert / Equation / Insert new equation. In the Conversions group on the ribbon ensure that LaTeX is selected. Enter your LaTeX expression and it will be converted and stored in the document in Word’s OMML format.

There are many tools you can use to generate MathML. This can then be pasted into Word as plain text (use ctrl v to paste, then press ctrl, and then press t to paste as text only). Word will then convert it immediately to OMML.

Lastly, if you use Wiris MathType to create expressions, these can also be converted to OMML. First, configure MathType via MathType / Preferences / Cut and copy preferences / MathML / MathML 3.0 (with namespace attr).

Screenshot of the MathType Cut and Copy Preferences dialog

Then using the MathType expression editor, open the expression you wish to convert:

Screenshot of a MathType math expression

Select and cut the expression (ctrl a and then ctrl x), return to the Word document, and paste the expression into the document with ctrl v. MathType will add the expression as OMML. When you then run WordToEPUB, it will be converted and included in the EPUB as MathML.
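
As a small illustration of the LaTeX method described earlier in this section, the quadratic formula could be typed into a new equation as shown here. This is only a sample expression chosen for this guide; which LaTeX commands are accepted depends on your version of Word.

```latex
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
```

Once Word has converted the expression, it is stored in the document like any other equation and is handled by WordToEPUB in the same way.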

Math as an image

Alternatively, you can include the math expression as an image and provide the alt text. You may want to do this if you are targeting reading apps that do not support MathML.

If you are using this technique, you may find the online MathML Cloud tool useful, which is available at https://mathmlcloud.org/

MathML Cloud will generate the image and alt text for you, and you can then add the image to Word with Insert / Picture. Including the math as an image means it should display correctly in all reading systems, but the accessibility experience is sub-optimal.

It is hoped that support for math in the WordToEPUB tool will be further improved as best practice is defined and reading system support continues to improve.

Metadata

The WordToEPUB program provides the opportunity to add metadata to the EPUB at conversion time. If you use the tool more than once on the same document you need to add the metadata each time. It is more efficient to add this information to the Word document itself. This also means that the metadata can be included in batch conversions.

To add metadata to the Word document, use File / Info / Properties / Advanced properties.

In the Summary tab, Title and Author are mapped to dc:title and dc:creator.

Then in Custom the following fields can be added:

  • Subtitle
  • Publisher
  • Rights
  • ISBN
  • Description
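
If you prepare many documents for batch conversion, it can help to check which of these properties are already filled in. The following is a minimal sketch, not part of WordToEPUB, that reads the properties directly from a .docx file using only the Python standard library. It assumes the custom property names are spelled exactly as in the list above; the function names read_docx_metadata and missing_fields are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the property parts inside a .docx package.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cust": "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties",
}

# Custom property names listed in this guide (assumed to match exactly).
CUSTOM_FIELDS = ["Subtitle", "Publisher", "Rights", "ISBN", "Description"]


def read_docx_metadata(docx):
    """Return a dict of the properties found in a .docx (path or file object)."""
    found = {}
    with zipfile.ZipFile(docx) as package:
        names = set(package.namelist())

        # Built-in properties: Title and Author are stored as dc:title and dc:creator.
        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            for label, tag in (("Title", "dc:title"), ("Author", "dc:creator")):
                element = core.find(tag, NS)
                if element is not None and element.text:
                    found[label] = element.text

        # Custom properties: each <property name="..."> wraps one typed value.
        if "docProps/custom.xml" in names:
            custom = ET.fromstring(package.read("docProps/custom.xml"))
            for prop in custom.findall("cust:property", NS):
                name = prop.get("name")
                value = "".join(prop.itertext()).strip()
                if name in CUSTOM_FIELDS and value:
                    found[name] = value
    return found


def missing_fields(found):
    """List the fields from this guide that do not yet have a value."""
    return [f for f in ["Title", "Author"] + CUSTOM_FIELDS if f not in found]
```

For example, missing_fields(read_docx_metadata("report.docx")) would list the properties still to be added, where report.docx stands for one of your own files.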

Image descriptions

Non-decorative images should be described in Word using Picture format / Alt text. If an image is decorative or described adequately in the surrounding text, then the check box Mark as decorative can be selected. If you are using an older version of Word, then enter the alt text as ‘Decorative’. If an image is neither described nor marked as decorative, the EPUB will still be generated, but it will not meet the needs of all users and would fail an accessibility check.
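
Before converting, you may want to find images that still need a description. The sketch below is not part of WordToEPUB; it scans the main part of a .docx file for drawing properties (wp:docPr elements) that have no alt text. It assumes that Mark as decorative is stored as a child element named decorative with val set to 1 or true, and the function name images_without_alt_text is made up for this illustration. Shapes and charts also carry these properties, so the result may include objects other than pictures.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the wp:docPr element that carries the name and alt text.
WP = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def images_without_alt_text(docx):
    """Return the names of drawings that have no alt text and are not decorative."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    flagged = []
    for doc_pr in root.iter(f"{{{WP}}}docPr"):
        # Alt text is held in the 'descr' attribute.
        has_alt = bool((doc_pr.get("descr") or "").strip())
        # Assumption: 'Mark as decorative' appears as a nested <decorative val="1"/>.
        is_decorative = any(
            el.tag.rsplit("}", 1)[-1] == "decorative"
            and el.get("val") in ("1", "true")
            for el in doc_pr.iter()
        )
        if not has_alt and not is_decorative:
            flagged.append(doc_pr.get("name", "(unnamed)"))
    return flagged
```

An empty list suggests that every drawing in the document body is either described or marked as decorative.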

Single and batch conversions

The Getting started with WordToEPUB guide explains how to convert a single document: by using the ribbon button, by launching WordToEPUB and browsing to the document, or by right clicking on the document in File Explorer and selecting “Convert with WordToEPUB”.

If you have several documents then you may prefer the batch conversion feature. To convert multiple documents in one go, start the WordToEPUB program and then select more than one document in the dialog Select the Word file or files to convert. You can add individual files in combination with the control key, select a range with the shift key, or mark all the files with ctrl a.

When you are ready, select OK, and then choose the destination folder.

In batch mode, WordToEPUB uses the choices that you set in Preferences, so the EPUBs will be created using your currently selected default options.

Advanced features

After selecting a file for conversion, you can select Advanced to reveal additional options. A multipage dialog opens. The available options are described below.

Advanced / Metadata

The metadata in the EPUB provides information about your title to reading apps and in distribution channels.

The minimum fields to complete are: title and author.

Using the metadata dialog you can add or edit the properties that have been read from the Word document. The date will default to today’s date. As described above in the section on metadata, if properties are entered in the Word file, values for title, subtitle, author (creator), date, publisher, rights, ISBN and description will be populated. The source field can be used to identify the source of the page numbering.

An accessibility summary is really helpful so that users can be aware of features and hazards. A summary is suggested by the tool, which can be adjusted if there is additional useful information about the title that can be conveyed to someone with print disabilities.

Advanced / Cover image

Common practice is to include the cover image inline in the EPUB as the first item. If you prefer, the cover image can be included in the EPUB package without appearing inline, or it can be omitted altogether.

The tool can create a cover image specific to your Word file. To do this, select ‘Use first page of the Word document’.

A default cover image is provided with the tool. Alternatively, you can select your own image. For your own image JPG or PNG file types can be used. A canvas size of around 1600px by 2400px works well. When providing your own image, make sure that you update the Cover image alt text on this screen.

Advanced / EPUB options

A future version of the tool should automatically detect the language of the title, but in the meantime, you can set it here. This will ensure that the correct language is used when your eBook is read with a computer voice.

This dialog is also where you set the page progression direction. For French, this would be left to right; for Arabic, it would be right to left.

The tool will not usually generate an inline table of contents, since the reading system (e.g. app) should provide a nicer solution. But you can force the tool to include one in this dialog.

If you are including one, the title of the inline table of contents can be specified here. You may prefer something like ‘Contents’, or you may use a different word in your language.

Your Word document may have an existing table of contents. In most cases you will not want to include it in the EPUB, as it may have links or references that do not translate well in the conversion. However, the tool allows you to retain the Word table of contents as well, if you so wish.

There are a few sample stylesheets supplied with the tool, which can be used to adjust the appearance of the EPUB (depending on the reading system used). You can add your own stylesheets to the installation folder at “Program Files (x86)\DAISY\WordToEPUB\CSS” and they can then be selected from within the tool. If you are generating an EPUB for a right to left script (e.g. Arabic), then choose an appropriate CSS from the selection.

The Word document is split into parts and included in the EPUB as several XHTML files. Usually this is done at each entry with a Heading 1 style. There may be cases where you prefer this to happen at Heading 2 or lower, and you can adjust that here.
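
The idea of splitting at a chosen heading level can be sketched as follows. This is an illustration only and not the tool's actual implementation. It assumes paragraphs are represented as (style, text) pairs, that heading styles are named ‘Heading N’, and that splitting at level 2 also splits at level 1; the function names are made up for this example.

```python
def heading_level(style):
    """Return N for a style named 'Heading N', otherwise None."""
    prefix = "Heading "
    if style.startswith(prefix) and style[len(prefix):].isdigit():
        return int(style[len(prefix):])
    return None


def split_at_headings(paragraphs, split_level=1):
    """Group (style, text) paragraphs into sections.

    A new section starts at each heading whose level is at or above
    the chosen split level (for example, Heading 1 when split_level=1).
    """
    sections = []
    current = []
    for style, text in paragraphs:
        level = heading_level(style)
        if level is not None and level <= split_level and current:
            sections.append(current)
            current = []
        current.append((style, text))
    if current:
        sections.append(current)
    return sections
```

With a document containing two Heading 1 entries and one Heading 2 entry, a lower split level simply yields more, smaller sections, each of which would correspond to one XHTML file.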

Advanced / Pagination

Page navigation is a powerful feature of the EPUB format. Whilst the number of ‘screens’ in the same title will vary according to the display size, font size and margins, the page navigation will always take you to the same place. It means that a user can turn to a specific page quickly, and they will be on the same page as someone using the printed title.

By default, the WordToEPUB tool includes the page numbers used in the Word document. So, if page numbering in Word is set to start from 100, then this is what will be used in the EPUB. Page numbers would be included in the EPUB even if they are not displayed in the Word document. The source of the page numbering is noted in the EPUB metadata, and this can be changed in this dialog. If the source is a specific print edition, it may be helpful to note this.

If your Word document has page numbers in the headers/footers, then these can be used as the source for the EPUB page numbering. In this case, page numbers will use the page number format (upper/lower roman numerals, or Arabic numbering). Page numbers will only be added to the EPUB for the pages where there is a page number present in the header/footer.

Furthermore, some specialist teams have existing documents where an alternative method of marking page numbers has been used, and the WordToEPUB tool supports some of these techniques. Page numbers can be marked manually using the built-in style “Heading level 6”, using the style “Page Number (DAISY)”, or by placing them at the start of a new line preceded by “PRINT PAGE ”. In these cases, the value used for the page label is inserted, so “Page xii”, “123” and “PRINT PAGE seven” would become “xii”, “123” and “seven” respectively. If you are basing the numbering on an existing print book, then match its numbering and naming convention.
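
The three examples above can be reproduced with a short sketch. It is not the tool's own logic, only an illustration of how a page label might be derived from a manually marked paragraph. The style names are copied from this guide and may be spelled differently inside a real document, and the function name page_label is made up for this example.

```python
import re

# Style names as given in this guide (assumed spelling).
PAGE_STYLES = {"Heading level 6", "Page Number (DAISY)"}
PRINT_PAGE_PREFIX = "PRINT PAGE "


def page_label(style, text):
    """Return the page label for a marked paragraph, or None if it is not a page marker."""
    text = text.strip()
    # A line that starts with "PRINT PAGE " marks a page regardless of style.
    if text.startswith(PRINT_PAGE_PREFIX):
        return text[len(PRINT_PAGE_PREFIX):].strip()
    # Paragraphs in a page-number style: drop a leading "Page " if present.
    if style in PAGE_STYLES:
        return re.sub(r"^page\s+", "", text, flags=re.IGNORECASE)
    return None
```

Paragraphs that are not page markers return None, so ordinary text that happens to contain a number is left alone.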

Preferences

The options used by WordToEPUB can be configured and saved for each user.

The default settings for EPUB conversion can be set on the Default conversion options tab.

Next, on the Default cover options tab you can choose whether the cover image is generated by default from the first page of the Word document, or you can set your own image, perhaps for your company or institution.

Third, the behavior of the tool can be adjusted in User interface options. A message prompts the user what to do next when the tool is started in standalone mode, but this can be disabled. By default, the tool starts with a simple interface, but the advanced mode can be set to appear if desired. During the production of the EPUB, the tool can also show detailed messages about the different stages of the conversion if this option is selected.

Lastly on this tab, the language used for the tool’s user interface can be chosen from the available options.

Now we move to the Word Add-in tab. Many users will choose to use WordToEPUB from the ribbon. This uses a simple COM Add-in to launch the tool. There is the option to install this when setup is first run, but it is also possible to add or remove this from within the tool itself. It is likely that Word will need to be restarted for the Add-in button to appear.

Finally, in the Updates tab it is possible to check for the latest version of the WordToEPUB tool, and to choose whether this is done automatically whenever the tool is started.

Troubleshooting

The wrong language is selected in the EPUB

The language of the EPUB is not detected automatically. You need to select the language using Advanced / EPUB options / Language. You can also change the text direction in this tab, e.g. for Arabic.

Not all images are displayed in the EPUB

Sometimes images are included in Word in WMF format, and these are not supported in EPUB format. To resolve this issue, select Advanced / EPUB options / Convert all images to PNG.
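
To confirm whether a document contains WMF images before converting it, you can inspect the .docx file directly, since it is a zip package with its images stored under word/media. The following sketch is not part of WordToEPUB and uses only the Python standard library; the function name find_wmf_images is made up for this illustration.

```python
import zipfile
from pathlib import PurePosixPath


def find_wmf_images(docx):
    """Return the package paths of WMF images embedded in a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.startswith("word/media/")
            and PurePosixPath(name).suffix.lower() == ".wmf"
        )
```

If the returned list is not empty, the Convert all images to PNG option described above is likely to be needed.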

EPUB conversion fails with pandoc error code 251

This is an out-of-memory message. If you are using a 32-bit installation of Windows, you can try the same conversion on a 64-bit installation and hopefully the error will not occur.

The WordToEPUB Add-in installation fails

The button on the Word ribbon provides a convenient way to run the tool inside Word, but installation is sometimes not possible due to security policies or version issues. You can instead convert the Word document from File Explorer or by running the tool itself, as explained in Getting started with WordToEPUB. No features are lost and the conversion results are the same.

A word about Pandoc

One of the key components of WordToEPUB is Pandoc, a wonderful free, cross-platform document converter. It is included in the distribution under the GNU General Public License.

The command line Pandoc tool is available for free from http://pandoc.org and is under active development.
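
For readers curious about using Pandoc on its own, the sketch below builds a basic command line for converting a Word document to EPUB and runs it from Python. It does not show how WordToEPUB itself calls Pandoc, and the results will differ from those produced by the tool. The function names and file names are made up for this illustration; the options used (-o, --metadata and --epub-cover-image) are standard Pandoc options.

```python
import shutil
import subprocess


def build_pandoc_command(source, target, title=None, cover=None):
    """Return the argument list for a basic Word to EPUB conversion with Pandoc."""
    command = ["pandoc", source, "-o", target]
    if title:
        # Sets the title in the EPUB metadata.
        command += ["--metadata", f"title={title}"]
    if cover:
        # Uses the given image file as the EPUB cover.
        command += [f"--epub-cover-image={cover}"]
    return command


def convert(source, target, **options):
    """Run the conversion; requires the pandoc program to be installed and on PATH."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(build_pandoc_command(source, target, **options), check=True)
```

For example, convert("report.docx", "report.epub", title="Annual Report") would run Pandoc on a file of your own named report.docx.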

Several other open source libraries are used in the WordToEPUB tool, and their distribution licenses are followed and acknowledged.

About WordToEPUB

Select the Help button to learn more about the WordToEPUB tool. The release notes for the installed version can be read from here, and you can also submit feedback. We’d love to hear from you!
