Speech Recognition just hanging there, not recognizing that you're speaking |
We found out that this was happening due to ambient noise.
Although we humans are able to distinguish speech from noise naturally, for a computer program they are just audio levels. It needs to know which levels should be considered speech (which it needs to process in order to recognize what's being said), and which levels should be considered silence or background noise. So, libraries like the SpeechRecognition has an energy threshold set which defines what audio level and above should be considered speech.
Now, this default energy threshold works most of the time. If your environment is sufficiently quiet, it will be able to recognize you talking without problems. But, if your environment is noisy - e.g. an office environment with many people talking, or there's machinery around - then the program will have issues distinguishing speech from noise, which will cause the issue we observed.
So, in a situation like that, we should adjust the energy threshold to properly distinguish the speech from noise. The SpeechRecognition package has a couple of parameters that helps you with this.