Some time back, we talked about how to build a speech recognition system in Python. Now let's look into the other end of it: how to make a Python program that talks. More specifically, let's look at building a text-to-speech system.
There are several libraries out there that would let you build a text-to-speech model: gTTS, tts_watson, Pyttsx, etc. But today, we'll be talking about using PyWin32 on Windows 10.
Windows 10 has a built-in speech engine, and you can access it through the PyWin32 library. Because it uses the built-in system, it's more efficient than other TTS methods on Windows, and it doesn't require any external tools to play back the audio.
The PyWin32 library gets installed automatically if you're using Anaconda Python. If it's not installed, you can install it using either `conda install pywin32` or `pip install pywin32`.
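If you're not sure whether PyWin32 is already present in your environment, a quick check along these lines can tell you before you run the install command. This is only a minimal sketch: `has_module` and `pywin32_hint` are hypothetical helper names made up for this example, and the check only confirms that Python can find the `win32com` package, not that the speech engine itself works.

```python
import importlib.util


def has_module(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


def pywin32_hint():
    """Return a short message saying whether the win32com package was found."""
    if has_module("win32com"):
        return "pywin32 is available"
    return "pywin32 not found; try `conda install pywin32` or `pip install pywin32`"
```

Calling `print(pywin32_hint())` from a Python prompt shows which case applies to your setup.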
Text-to-speech with PyWin32
Once PyWin32 is installed, you can import the win32com package in Python.
    # Start by importing the win32com package
    import win32com.client as wincom
You can then create the voice dispatcher object:
    speak = wincom.Dispatch("SAPI.SpVoice")
Then send it the text you want to read:
    text = "Python text-to-speech test. using win32com.client"
    speak.Speak(text)
PyWin32 will speak the text directly using the built-in Microsoft speech engine. Unlike many other TTS libraries, it won't open a player program to produce the output.
If you're reading a longer text and want a gap or pause in the narration, you can insert a `time.sleep()` call for the required duration between the `Speak()` calls.
    import win32com.client as wincom
    # you can insert gaps in the narration by adding sleep calls
    import time

    speak = wincom.Dispatch("SAPI.SpVoice")

    text = "Python text-to-speech test. using win32com.client"
    speak.Speak(text)

    # 3 second sleep
    time.sleep(3)

    text = "This text is read after 3 seconds"
    speak.Speak(text)
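For a longer narration with several pauses, repeating the `Speak()` and `sleep()` pair by hand gets tedious. One way to tidy it up is to keep the script as a list of text and pause pairs and loop over it. The following is a rough sketch rather than a definitive implementation: `narrate`, `make_sapi_speaker`, and `demo` are hypothetical names invented for this example, and it assumes `Speak()` returns only after the text has been read, as in the snippets above.

```python
import time


def narrate(segments, speak, sleep=time.sleep):
    """Speak each (text, pause_seconds) pair in order, pausing after each one."""
    for text, pause_seconds in segments:
        speak(text)
        # Skip the sleep call entirely when no pause is requested.
        if pause_seconds > 0:
            sleep(pause_seconds)


def make_sapi_speaker():
    """Return a function that reads text aloud (Windows with PyWin32 only)."""
    # Imported here so the rest of the module still loads on other platforms.
    import win32com.client as wincom

    voice = wincom.Dispatch("SAPI.SpVoice")

    def speak(text):
        voice.Speak(text)

    return speak


def demo():
    """Read the two test sentences from above with a 3 second gap between them."""
    script = [
        ("Python text-to-speech test. using win32com.client", 3),
        ("This text is read after 3 seconds", 0),
    ]
    narrate(script, make_sapi_speaker())
```

On a Windows machine with PyWin32 installed, calling `demo()` should behave like the script above. Passing `speak` and `sleep` in as arguments also makes it easy to try the loop without any audio, for example with `narrate(script, print)`.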
You can see the code in action in the video below.
You can also do longer narrations with PyWin32, as I have done in the following video.