Quick & Dirty python: Converting a text file to audio (.wav)

Matt - 31 May, 2012

Tags:

This is a bit of a tangent but for some crazy reason, I wanted to convert some text to audio so I could listen to it while I drive. A quick Google search left me without any freeware that could handle the 53 page document–there are some cool websites that do text to mp3 like vozme and YAKiToMe! but they didn’t convert the whole document. I then found pyTTS, a python package that serves as a wrapper to the Microsoft Speech API (SAPI) , which has been in version 5 since 2000. But I didn’t easily find a version of pyTTS for python 2.6. So I decided to see if I could roll my own.

As it turns out, getting python to talk using SAPI is relatively easy. Reading a plain text file can be done in a few lines.

from comtypes.client import CreateObject

infile = "c:/temp/text.txt"

engine = CreateObject("SAPI.SpVoice")

f = open(infile, 'r')
theText = f.read()
f.close()

engine.speak(theText)

And it wasn’t that much more to have it write out a .wav file:

from comtypes.client import CreateObject

engine = CreateObject("SAPI.SpVoice")
stream = CreateObject("SAPI.SpFileStream")

infile = "c:/temp/text.txt"
outfile = "c:/temp/text4.wav"
stream.Open(outfile, SpeechLib.SSFMCreateForWrite)
engine.AudioOutputStream = stream

f = open(infile, 'r')
theText = f.read()
f.close()

engine.speak(theText)

stream.Close()

And with that chunk of code, I was able to convert my 54 page document into a 4 hour long .wav file (over 600 MB) that I used another software package to convert to .mp3 (200 MB). The voice is a bit robotic but not too bad, I just hope the content that I converted (a database specification standard) doesn’t put me to sleep while I drive.