Flite+HTS Engine for Flash - An HMM Speech Synthesizer Running Entirely in Adobe Flash

Flite+HTS Engine for Flash is a Hidden Markov Model (HMM)-based speech synthesis system (HTS) running entirely in Flash. The idea is to make HTS run in any modern browser, so that dynamic text-to-speech can be added to any webpage with the entire synthesis performed on the client.

It requires Flash version 10 to run. If the above applet doesn’t load, a demo is also available here. It has been tested on Windows and Mac, although I’ve seen it throw exceptions a few times.

I have wanted to learn more about HTS and try out the engine for some time. Installing under Windows requires building from source on Cygwin. After seeing Quake running in the browser, I thought it would be interesting to try to get the HTS engine to run in Flash and make it widely accessible. Instead of re-writing the entire system for Flash, I first took the HTS_Engine and Flite code and compiled it to a Flash component (swc) using the Alchemy compiler. With the core engine compiled, I used FlashDevelop to build a small interface, and I looked at QuakeFlash to understand how to handle the interop between the two. The UI elements are from the Minimalcomps library, and the audio playback comes from Standingwave.

By far the trickiest part was getting the audio to play in the Flash applet. First, the ByteArray used for moving data to and from the swc seems to default to big-endian (thanks Heiga Zen). Second, the Flash player was unable to output 16 kHz audio, and I couldn’t get HTS to increase the sample rate to 44.1 kHz without pitching up the speech. The current implementation uses a very crude up-sampling technique, which is why the pitch of the voice may sound different when played back through the applet.
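To make the two audio problems above concrete, here is a minimal Python sketch. It is not the code used in the applet; the function names are made up for illustration, and it assumes 16-bit samples written little-endian on the C side. It shows what happens when a sample is read back with the wrong byte order, and how two crude up-sampling approaches (repeating samples by a whole-number factor, or sample-and-hold at the exact rate ratio) behave.

```python
import struct


def misread_sample(value):
    """Write a 16-bit sample little-endian, then read it back as big-endian."""
    return struct.unpack(">h", struct.pack("<h", value))[0]


def repeat_upsample(samples, factor):
    """Crude up-sampling: repeat every sample `factor` times (integer factor only)."""
    return [s for s in samples for _ in range(factor)]


def hold_upsample(samples, src_rate, dst_rate):
    """Sample-and-hold resampling to an arbitrary rate (no filtering, so still crude)."""
    n_out = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    return [samples[min(i * src_rate // dst_rate, last)] for i in range(n_out)]


def pitch_ratio(src_rate, factor, playback_rate):
    """Pitch change heard when integer-up-sampled audio is played at playback_rate.

    A value below 1.0 means the voice sounds lower than intended.
    """
    return playback_rate / (src_rate * factor)


if __name__ == "__main__":
    print(misread_sample(1))
    print(pitch_ratio(16000, 3, 44100))
```

Repeating each 16 kHz sample three times gives 48 kHz data, so playing it at 44.1 kHz shifts the pitch by a factor of 44100/48000. Sample-and-hold at the exact ratio keeps the pitch but adds audible artefacts because nothing is filtered.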

Assuming the HTS engine has the functionality underneath, I think it should be possible to perform streaming output and write the buffer to Flash as soon as speech is ready. Likewise, if it’s possible to adjust engine parameters without having to re-initialize, exposing those parameters in the interface would be a nice way of exploring HTS.

Code for the demo is available on GitHub: https://github.com/edobashira/Flite-hts_engine-for-Flash

FSMJS

On Google Code, Georg Jaehnig has created a finite state machine library in JavaScript called fsmjs. It covers the algorithms from the Mohri paper Weighted automata algorithms. There are demos of fsmjs here. I was playing around with it and wondered if I could compile it for the .NET Framework using the JScript.NET compiler. I made a few changes and got it to compile and run! The modified file is here.

To compile it, type the following at the Visual Studio command prompt:

jsc fsm.js

The default demo performs a star closure on a toy machine and dumps the result in dot format.

fsm.exe | dot -Tpdf > test.pdf

This creates a PDF of the machine.
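For readers who haven't met the operation, here is a rough Python sketch of what a star closure followed by a dot dump does. It is not fsmjs code and doesn't use its API; it handles an unweighted toy acceptor only, and the data layout (a start state, a set of final states, and a list of (source, label, destination) arcs) is my own assumption for illustration.

```python
EPS = "<eps>"  # label used here for epsilon (empty) transitions


def star_closure(start, finals, arcs):
    """Kleene star: add a new final start state and loop finals back to the old start."""
    states = {start} | set(finals) | {s for s, _, _ in arcs} | {d for _, _, d in arcs}
    new_start = max(states) + 1
    new_arcs = list(arcs) + [(new_start, EPS, start)]
    new_arcs += [(f, EPS, start) for f in sorted(finals)]
    return new_start, set(finals) | {new_start}, new_arcs


def accepts(start, finals, arcs, symbols):
    """Check whether the machine accepts the given sequence of symbols."""
    def eps_close(states):
        stack, seen = list(states), set(states)
        while stack:
            s = stack.pop()
            for src, label, dst in arcs:
                if src == s and label == EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = eps_close({start})
    for sym in symbols:
        step = {dst for src, label, dst in arcs if src in current and label == sym}
        current = eps_close(step)
    return bool(current & set(finals))


def to_dot(start, finals, arcs):
    """Render the machine as Graphviz dot text (final states are double circles)."""
    states = {start} | set(finals) | {s for s, _, _ in arcs} | {d for _, _, d in arcs}
    lines = ["digraph fsm {", "  rankdir=LR;"]
    for s in sorted(states):
        shape = "doublecircle" if s in finals else "circle"
        style = ", style=bold" if s == start else ""
        lines.append(f"  {s} [shape={shape}{style}];")
    for src, label, dst in arcs:
        lines.append(f'  {src} -> {dst} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    toy = (0, {2}, [(0, "a", 1), (1, "b", 2)])  # accepts exactly "ab"
    print(to_dot(*star_closure(*toy)))
```

Saved under a hypothetical name such as star.py, the output can be piped through dot in the same way as the fsm.exe output above.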

The JScript.NET compiler doesn't seem to like globals, so I can't currently compile it to a library. If it can be compiled to a .NET library, though, it should be possible to call it from other .NET programs.

OpenKernel and far-style files for OpenFst

Just noticed there is a new version of OpenKernel that came out on 11th January 2010. I spotted some nice code in the last version, so I was eager to check this one. My expectations were correct: extracting the archive revealed a directory named far. Could this be compatibility with the ATT far format?

First, to compile I had to grab the ICU 4.0 library from here, as it wasn’t available through yum. These are the far commands that are available:

  • farextract – Seems to be the same as the old ATT farsplit command
  • farinfo
  • farprintstrings
  • farcompilestrings
  • farcreate – Not in the ATT tools; allows compiled fsts to be merged into a far file

I quickly tried it out with a far file created by the ATT tools, and it is not compatible (kind of obvious really). However, it’s nice to have far-like files for OpenFst. One thing I noticed when checking out the commands is that the output file must be specified; it doesn't default to standard output.

Now to update my fstcount utility to support the OpenFst far format.