PhatVoice User's Guide


October, 2003

This manual describes the PhatVoice text-to-speech application for the PhatNoise™ Car Audio System.

Revision/Update Information: This is a new manual

Operating System and Version: Microsoft® Windows® 98 Second Edition or later

Software Version: PhatVoice V2.0


7 October 2003

Permission is granted to copy and redistribute this document and the accompanying software for non-commercial purposes.

The information in this document is subject to change without notice and should not be construed as a commitment by the authors. The authors assume no responsibility for any errors that may appear in this document.

DISCLAIMER: Use of this product involves making changes to the contents of the PhatNoise Car Audio System. Such changes are made at the sole risk of the user and neither the authors nor PhatNoise, Inc. shall be liable for the results of such changes.

PhatVoice is a freeware programming project not associated with PhatNoise, Inc.

The authors make no representations or warranties with respect to the contents or function of this documentation or software and specifically disclaim any implied warranties of merchantability or fitness for any particular purpose.

The following are trademarks of PhatNoise, Inc.: PhatNoise, PhatBox, PhatNoise Car Audio System, SSA, DMS, and PhatNoise Music Manager.

Natural Voices is a registered trademark of AT&T Corporation.

Microsoft and Microsoft Windows are registered trademarks of Microsoft Corporation.


This product incorporates ActivePerl from ActiveState Corporation.

Commercial support for ActivePerl is available through ActiveState at: http://www.ActiveState.com/Support/Enterprise/.

For peer support resources for ActivePerl issues see: http://www.ActiveState.com/Support/.

Copyright © 2003 Terry Kennedy (Documentation), Dylan Ginsburg, Ben Cohen, and Terry Kennedy (Software).


Preface

This guide explains how to install and use the PhatVoice text-to-speech utility.

Intended Audience

This manual is intended for all PhatVoice users. It describes the operation of the PhatVoice utility as well as advanced options such as additional voices, custom pronunciation, and rebuilding PhatVoice from source.

It is expected that the user is already familiar with the installation and operation of the PhatNoise Car Audio System, including the PhatNoise Music Manager (PMM) application.

Important Cautions

Please remember that the PhatVoice software project is not affiliated with PhatNoise, Inc., makers of the PhatNoise Car Audio System.

Use of this product involves making changes to the contents of the PhatNoise Car Audio System. Such changes are made at the sole risk of the user and neither the authors nor PhatNoise, Inc. shall be liable for the results of such changes.

Related Materials

Conventions

In this document, the following conventions will be used to refer to the various components:


Chapter 1
Introduction

This chapter describes the PhatVoice package. It includes an overview of PhatVoice as well as information on additional resources.

1.1 Overview

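PhatVoice is a freeware add-on to the PhatNoise system that improves on the speech generation capabilities of the PhatNoise Music Manager. It does this in three ways: using hints to improve pronunciation, generating speech for song titles, and utilizing additional Natural Voices.

Each of these is discussed below.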

1.1.1 Using hints to improve pronunciation

The PhatNoise Music Manager (PMM) generates speech based on the ID tags in MP3 files or from data entered by the user. Text-to-speech (TTS) engines are commonly limited by their built-in pronunciation rules and traditionally do not do well with proper names, among other things. While Natural Voices represents a large improvement over prior TTS engines, it can still generate speech that ranges from "creatively mispronounced" to "what the heck did it say?". In some cases you can edit the data in PMM to create better speech, but this is not always possible. Further, it can be a repetitive task (for example, if you wanted to change all instances of "CD 1" to "Disc 1").

PhatVoice lets you perform these substitutions in a simple manner without needing to manually edit each song in PMM. In addition to changing the spelling of words, you can also embed specific commands to the TTS engine, such as phonetic pronunciation for a difficult word, or adding / removing emphasis on a word.
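The kind of substitution described above can be pictured with a short Python sketch. It is only an illustration of the idea, not PhatVoice's own code or hints file format (neither is shown in this section); the HINTS table and the apply_hints function are hypothetical names made up for this example.

```python
import re

# Hypothetical hint table: text as written in a tag -> text to be spoken.
# This is NOT the PhatVoice hints file format, only an illustration.
HINTS = {
    "CD 1": "Disc 1",
    "CD 2": "Disc 2",
}

def apply_hints(text, hints=HINTS):
    """Return text with each hinted phrase replaced by its spoken form."""
    for written, spoken in hints.items():
        # \b keeps "CD 1" from matching inside "CD 10".
        pattern = r"\b" + re.escape(written) + r"\b"
        # A function replacement avoids backslash processing in the spoken text.
        text = re.sub(pattern, lambda match, s=spoken: s, text)
    return text

print(apply_hints("Greatest Hits (CD 1)"))  # prints: Greatest Hits (Disc 1)
```

One table entry then covers every album that contains the phrase, which is the saving over editing each song by hand in PMM.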

1.1.2 Generating speech for song titles

PMM does not currently generate speech for song (track) titles - it is limited to artist, album, playlist, and genre. PhatVoice adds the ability to generate speech for track titles.

To take advantage of this feature, you need to make a minor change to a configuration file on the PhatNoise Digital Media Storage (DMS) cartridge and copy a new announcement file to the DMS. These changes are documented in Section 2.4 of this manual.

1.1.3 Utilizing additional Natural Voices

If you have purchased and installed either the 16 kHz versions of the standard Mike and Crystal Natural Voices or any of the additional Natural Voices, those voices are available to you in PhatVoice. You may either change the default voice used for all speech, or you may override the default voice for a particular item of speech. This is useful if you have a number of albums with an artist, album name, or song title in a foreign language. The US English Mike and Crystal Natural Voices tend to do poorly when pronouncing such items. If you have a foreign language Natural Voice installed, you can add a hint to have it switch when it encounters these items. The sample hints file distributed with PhatVoice contains a number of these hints.

1.2 Additional resources

A number of additional resources are available to you.

There are discussion and announcement-only mailing lists for PhatVoice. The discussion group is a user-to-user forum (though the PhatVoice developers monitor it and respond to posts as well) where users can share pronunciation hints, tips and tricks, ask questions, and so forth. The announcement list is used to inform users about updates to PhatVoice. For more information, visit:


http://listserv.tmk.com/archives 

The PhatVoice home page is located at:


http://www.tmk.com/PhatVoice 

You may download the latest version of PhatVoice as well as additional materials from the above page.

PhatNoise, Inc. maintains discussion forums at:


http://www.phatnoise.com/forum/index.php 

While PhatVoice is not affiliated with PhatNoise, Inc., it grew out of an idea first posted in one of those forums, and notices of updates will be posted there as well.


Chapter 2
Installing PhatVoice

This chapter provides information about the prerequisites for PhatVoice as well as detailed installation steps for both the setup-based and zipfile-based installation processes.

2.1 Installation prerequisites

You will need to have either the PhatNoise Music Manager or the Microsoft Speech 5.1 SDK installed. One of these is needed in order to provide the SAPI interface that PhatVoice uses. Consult the Related Materials section of this manual for download locations for these software packages if you need to install them.

2.2 Installing using the PhatVoice Setup Wizard

The recommended installation method is to use the PhatVoice Setup Wizard. This provides a Windows-based setup which will install and configure PhatVoice for you.

Note

Neither PhatVoice nor the Setup Wizard modifies any registry keys or installs any files outside of the user-specified installation directory. The only difference between the Setup Wizard and the zipfile-based installation is that the Setup Wizard automates the process and creates Start Menu and optional Desktop shortcuts for you.

To launch the Setup Wizard, simply select the PhatVoice_setup.exe file from the Windows Start / Run / Browse menu. This will launch the Setup Wizard as shown below:


After you click the Next button, you will be presented with the License Agreement screen. Be sure to read the license agreement completely and make sure you understand and accept it before continuing.


Once you click on the Accept button and click the Next button, you will see the Select Destination Directory dialog box. The default location has been pre-selected for you, but you may choose any other directory if you prefer. We strongly recommend that you install PhatVoice in a directory by itself rather than placing it in a directory with other programs or files.


After you select a destination directory and click Next, you may now select the Start Menu Folder (location) for the program. Again, the default location has been pre-selected. Likewise, we strongly recommend that you place PhatVoice in its own folder.


Once you have chosen a Start Menu Folder, click Next. Now, you have the choice of having an icon for PhatVoice placed on your Windows desktop. The default is to perform this operation, but if you don't want to clutter your desktop with another icon, feel free to de-select the checkbox here.


After clicking the Next button, you will now have a chance to review (and change, if you desire) your installation options.


When you click on the Install button, installation will proceed automatically.


At this point, PhatVoice setup has been completed. You have the option to run PhatVoice now, or to exit the Setup Wizard without running PhatVoice at this time. Click on the Finish button to exit the Setup Wizard.

At any time, you can start PhatVoice from the Windows Start Menu, using the folder name you gave the Setup Wizard. If you chose to install the Desktop shortcut, you may also start PhatVoice via that shortcut.

2.3 Installing from the zipfile

If you decided to download and install from the zipfile instead of using the PhatVoice Setup Wizard, simply open the PhatVoice.zip file using a zip utility of your choice and extract the files into a new directory. You will need to create any Start Menu items or Desktop shortcuts manually.

2.4 Necessary changes on the PhatBox DMS

You can use PhatVoice without making any changes to your DMS. In fact, you can experiment with it and work on text pronunciation hints without even having a cradle attached to your PC. However, a few changes to your DMS will let you get even more out of PhatVoice.

2.4.1 Firmware revision

We recommend that you run the latest firmware for your particular PhatBox, regardless of whether you use PhatVoice or not. You can check and / or update your firmware revision from the Device view in PMM. Select the Hardware Options item and then Firmware Setup. Consult Appendix A of the PhatNoise Car Audio System User Manual for more information about updating firmware.

2.4.2 phatbox.ini changes

Your PhatBox will use the new speech for Playlist, Artist, Album, and Genre without requiring any changes to the PhatBox configuration files. However, if you want your PhatBox to announce song names, you will need to edit the phatbox.ini file on your DMS. This file is found in the root directory of the PHTSYS partition. The following examples all assume that your computer mounts PHTSYS as drive G:; if your computer uses a different drive letter, substitute it accordingly in each example.

First, make a backup copy of your existing phatbox.ini file in case anything goes wrong:


G: 
copy phatbox.ini phatbox.ini_save 
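If you would rather script the backup, the following Python sketch does the same copy and also refuses to overwrite a backup that already exists, so the pre-edit original is never lost. It is a sketch only: the backup_ini function is a hypothetical helper, not part of PhatVoice, and it assumes the PHTSYS partition is reachable at the path you pass in.

```python
import shutil
from pathlib import Path

def backup_ini(phtsys_root, name="phatbox.ini", suffix="_save"):
    """Copy phatbox.ini to phatbox.ini_save unless a backup already exists."""
    source = Path(phtsys_root) / name
    backup = Path(phtsys_root) / (name + suffix)
    if backup.exists():
        # Keep the existing backup: it was made before any edits.
        return backup
    shutil.copy2(source, backup)
    return backup
```

With PHTSYS mounted as drive G:, this would be called as backup_ini("G:/").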

Next, open the phatbox.ini file in Notepad. Search for the section for your particular car (it will be inside square brackets). In this example, we will be making the change for a Toyota PhatBox. Scroll down a bit and you will see a number of fields labeled audioid.3....:


[Toyota] 
auto_pong=on 
sync_after_announce=on 
   .
   .
   .
audioid.2.6=WAIT 3 
audioid.2.7=END 
audioid.3.0=/dos/tts/beep3.wav 
audioid.3.1=CURMODE 
audioid.3.2=PLAYLIST 
audioid.3.3=ARTIST 
audioid.3.4=ALBUM 
audioid.3.5=GENRE 

You will be changing the audioid.3... tags. First, decide what you want your PhatBox to say. The default is Playlist / Artist / Album / Genre. You can change these to add, remove, or rearrange the items that are spoken when a track is announced. Personally, I use:


audioid.3.0=/dos/tts/beep3.wav 
audioid.3.1=TITLE 
audioid.3.2=ARTIST 
audioid.3.3=ALBUM 

Here, I have added the TITLE keyword to speak the song title. You can use any set of keywords in your configuration. Experiment until you find a combination you like. The items will be announced in numerical order as shown in the above list. When creating your list, make sure it is in order and there are no gaps in the sequence. You need to make this change on each of your DMS cartridges if you have more than one.
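Because the numbering must be in order with no gaps, it can help to generate the lines rather than renumber them by hand. The Python sketch below shows one way to do that. The audioid_lines function is a hypothetical helper, and it assumes that item 0 is always the beep file, as in the examples in this section.

```python
def audioid_lines(keywords, group=3, beep="/dos/tts/beep3.wav"):
    """Return audioid.<group>.N lines numbered 0, 1, 2, ... with no gaps."""
    # Assumption: item 0 is the beep file, as in the examples in this manual.
    items = [beep] + list(keywords)
    return ["audioid.%d.%d=%s" % (group, n, item) for n, item in enumerate(items)]

for line in audioid_lines(["TITLE", "ARTIST", "ALBUM"]):
    print(line)
```

Calling it with TITLE, ARTIST, and ALBUM prints the same four lines as the example above. On models with a CURMODE tag, put CURMODE first in the keyword list if you want to keep the "Now browsing..." message.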

Note

Some PhatBox models (such as the Toyota example above) have a CURMODE tag, which generates the "Now browsing..." message. You can keep that keyword as item number 1 in the list and start your announcements at item 2, or you can remove it and start your announcements at item 1.

Note

The [BMW] part of the phatbox.ini file is slightly scrambled in all of the .ini files I've seen:


audioid.2.5=TITLE 
audioid.2.6=PLAY 
audioid.3.0=/dos/tts/beep3.wav 
audioid.3.1=PLAYLIST 
audioid.3.2=ARTIST 
audioid.3.3=ALBUM 
audioid.3.4=GENRE 
auto_pong=on 
audioid.2.7=WAIT 5 
audioid.2.8=END 

Don't worry about the pieces starting with auto_pong=on being out of sequence - just arrange the audioid.3... tags as you normally would.
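The edit can also be scripted in a way that does not care where the other lines of a section fall. The Python sketch below replaces every audioid.3 line in one section with a new list and leaves all other lines, including out-of-sequence ones, where they are. The replace_audioid3 function is a hypothetical helper written for this example, not part of PhatVoice; try it on a copy of phatbox.ini first.

```python
import re

def replace_audioid3(ini_text, section, new_lines):
    """Swap the audioid.3.* lines of one [section] for new_lines."""
    # Reuse the file's own line-ending style for the inserted lines.
    newline = "\r\n" if "\r\n" in ini_text else "\n"
    out, in_section, inserted = [], False, False
    for line in ini_text.splitlines(keepends=True):
        header = re.match(r"\s*\[(.+?)\]\s*$", line)
        if header:
            in_section = header.group(1) == section
        if in_section and re.match(r"\s*audioid\.3\.\d+\s*=", line):
            if not inserted:
                # Put the new list where the first old audioid.3 line was.
                out.extend(item + newline for item in new_lines)
                inserted = True
            continue  # drop the old audioid.3 line
        out.append(line)
    return "".join(out)
```

When reading and writing the file from Python, opening it with newline="" keeps the original line endings untouched.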

2.4.3 The title.mp3 file

As mentioned previously, the PhatNoise system does not directly support track titles. In addition to generating the track titles, PhatVoice supplies a pair of speech files, one of which says "The current track is..." and the other of which says "Track", both in the same voice as the rest of the PhatNoise announcements. One of these files needs to be copied to a file called title.mp3 in the TTS directory of the PHTSYS partition on your DMS. Which one you use is a matter of personal preference. The following examples assume that your computer mounts PHTSYS as drive G:; if your computer uses a different drive letter, substitute it accordingly.

For "The current track is...":


copy "C:\Program Files\PhatVoice\current_track.mp3" G:\TTS\title.mp3 

For "Track...":


copy "C:\Program Files\PhatVoice\track.mp3" G:\TTS\title.mp3 

Again, note that PHTSYS is the first of two drive letters on the DMS. If you copy the title.mp3 file to the PHTDATA partition, it won't be used. You need to make this change on each of your DMS cartridges if you have more than one.
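If you have several cartridges, a small script can make the copy less error-prone. The Python sketch below copies one announcement file to title.mp3 and stops if the target has no TTS directory, which may mean the wrong drive letter was given. The install_title function is a hypothetical helper, not part of PhatVoice.

```python
import shutil
from pathlib import Path

def install_title(announcement, phtsys_root):
    """Copy one announcement file to <PHTSYS>/TTS/title.mp3."""
    tts_dir = Path(phtsys_root) / "TTS"
    if not tts_dir.is_dir():
        # No TTS directory: possibly the wrong drive letter was given.
        raise FileNotFoundError("no TTS directory under %s" % phtsys_root)
    target = tts_dir / "title.mp3"
    shutil.copyfile(announcement, target)
    return target
```

With the default installation directory and PHTSYS mounted as drive G:, the call would be install_title(r"C:\Program Files\PhatVoice\track.mp3", "G:/").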

