Using Text To Speech in DAISY Pipeline

The DAISY Pipeline App, a multi-format converter, can use high-quality Text-to-Speech (TTS) cloud voices from Microsoft Azure, Google and Amazon Polly to generate audio for the given text.

You need credentials to connect DAISY Pipeline to these cloud services. Once connected, you can choose any of the high-quality voices available from Azure, Google or Amazon, in dozens of languages, to add audio narration to the accessible formats you create with DAISY Pipeline.

Once you have registered and obtained credentials, configure DAISY Pipeline as follows.

  1. Open DAISY Pipeline and click on Settings in the File menu.
  2. Now click on TTS Engines. Provide the credentials you have for the listed services and click Connect. Once connected, the Connect button changes to ‘Disconnect’. You can connect to all the services if you have their keys. The credentials are retained when you restart DAISY Pipeline or update to a new version.
  3. Now click on Browse Voices in the Settings window to choose the voices in which you want to record your documents. You can filter the voices by TTS engine, language, dialect and gender.
  4. From the list of available voices, you can select any voice and make it read out any piece of text. Click the “Add to preferred voices” button when you like a voice and want to use it.
  5. If the document you want to process contains more than one language, repeat the process above to select a preferred voice for each of the other languages in the document.
  6. Now click on the Preferred Voices tab. Here you will see all the voices you shortlisted in the Browse Voices tab. You need to select a default voice if more than one voice is listed for a particular language. After selecting a preferred voice for each language in the document to be converted, click the Close button.
  7. You can also click on More options in the Settings window to configure other TTS parameters such as Speech rate, MP3 bitrate and Sample rate. If you are not sure, leave them at their default values.
  8. You can also browse for and select a Lexicon file, if one is available. A Lexicon file for TTS narration contains pronunciation rules for certain words, acronyms and abbreviations. The Lexicon file must be created and edited externally. When no Lexicon is selected, the TTS engine narrates the text using its default rules.
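
Because the Lexicon file has to be created outside the app, a small script can help keep it consistent. The following is a minimal sketch, assuming the lexicon is a W3C Pronunciation Lexicon Specification (PLS) file; please check the DAISY Pipeline documentation for the exact format it expects. The function names, the sample entries and the output file name are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification 1.0.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_lexicon(entries, lang="en"):
    """Build a PLS <lexicon> element from {written form: spoken form}."""
    # Serialize PLS elements without a prefix (as the default namespace).
    ET.register_namespace("", PLS_NS)
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", XML_LANG: lang},
    )
    for written, spoken in entries.items():
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        # <grapheme> is the text as written in the document,
        # <alias> is the text the TTS engine should speak instead.
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = written
        ET.SubElement(lexeme, f"{{{PLS_NS}}}alias").text = spoken
    return root


def write_lexicon(entries, path, lang="en"):
    """Write the lexicon to `path` as UTF-8 XML with an XML declaration."""
    tree = ET.ElementTree(build_lexicon(entries, lang))
    tree.write(path, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical sample entries: expand acronyms for narration.
    sample = {"W3C": "World Wide Web Consortium", "TTS": "text to speech"}
    print(ET.tostring(build_lexicon(sample), encoding="unicode"))
```

Calling `write_lexicon(sample, "lexicon.pls")` would save the file so that it can be selected with the Browse option described in step 8.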

Now you are ready to use the TTS feature of DAISY Pipeline. The settings remain saved when you close the app. Whenever you process a document, check the languages it contains and adjust the selection of TTS voices in the DAISY Pipeline settings accordingly.
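
One way to check which languages a document contains is to scan it for language attributes. The sketch below is not part of DAISY Pipeline; it assumes the source is a well-formed XML file (for example XHTML or DTBook) whose languages are marked with `xml:lang` or `lang` attributes. The function name is hypothetical, and files that rely on undefined entities such as `&nbsp;` may fail to parse with this approach.

```python
import sys
import xml.etree.ElementTree as ET
from collections import Counter

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def count_languages(source):
    """Count elements that declare a language via xml:lang or lang.

    `source` is a file name or a binary file object with well-formed XML.
    Language tags are lower-cased so that "FR" and "fr" count together.
    """
    counts = Counter()
    for element in ET.parse(source).iter():
        tag = element.get(XML_LANG) or element.get("lang")
        if tag:
            counts[tag.strip().lower()] += 1
    return counts


if __name__ == "__main__":
    # Usage: python list_languages.py book.xhtml [more files...]
    for path in sys.argv[1:]:
        print(path)
        for tag, count in count_languages(path).most_common():
            print(f"  {tag}: declared on {count} element(s)")
```

The counts reflect how many elements declare each language, not how much text is in that language, so they only indicate which languages need a preferred voice.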
