Last week, Google announced it has added free speech-to-text capabilities to Google Docs (Google calls it Voice Typing). This would have been huge news 20 years ago, yet when Google unveiled it, it was described in only a single paragraph in the middle of a larger blog entry. In a world with Apple’s Siri, Microsoft’s Cortana, and Google Now, a free speech-to-text service that works on multiple computing platforms may not seem like big news anymore.
Voice Typing is different, though; it’s kind of a built-in version of Dragon NaturallySpeaking (for those of you who remember and/or still use that program). Voice Typing works in Chrome on the desktop, as well as the Docs apps for Apple iOS (iPhone and iPad) and Android.
Here’s how it works: To start Voice Typing on an iOS device, tap the microphone icon to the left of the spacebar near the bottom of the screen. To start Voice Typing on an Android phone or tablet, tap the microphone icon on the right side of the screen above the on-screen keyboard. If you want to voice type on a Mac or Windows PC, you need to use Google Docs in the Chrome web browser, then select Tools > Voice Typing. A microphone icon with the tooltip “Click to speak” will appear in the browser window near your Docs document.
Google Docs Voice Typing currently supports 48 languages, including regional variants of Chinese, English, Portuguese, and Spanish. You do not need to perform any kind of training before using Voice Typing, and it doesn’t appear to need a special microphone. For this article, I used the built-in microphones of my Dell Windows notebook, a Nexus 6, and an iPhone 6+ to test Google’s speech-to-text.
Voice Typing does require you to speak words to add punctuation: “Period”, “Comma”, “Exclamation point”, “Question mark”, “New line”, and “New paragraph.” Unlike dedicated speech-to-text systems, Voice Typing does not have a way to correct or change text using just your voice. Even with Voice Typing left turned on, you must use your keyboard (physical or on-screen) to make changes to text.
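To make the spoken-punctuation idea concrete, here is a minimal Python sketch of how a transcript containing those six commands could be turned into punctuated text. This is only an illustration: Google has not published how Voice Typing handles these commands, and the names `SPOKEN_PUNCTUATION` and `apply_spoken_punctuation` are hypothetical.

```python
import re

# The six spoken commands listed above, mapped to the text they produce.
# Hypothetical mapping for illustration; not Google's actual implementation.
SPOKEN_PUNCTUATION = {
    "exclamation point": "!",
    "question mark": "?",
    "new paragraph": "\n\n",
    "new line": "\n",
    "period": ".",
    "comma": ",",
}

# Match any command as a whole word, along with the whitespace before it,
# so the symbol attaches directly to the preceding word.
_COMMAND_PATTERN = re.compile(
    r"\s*\b(" + "|".join(re.escape(cmd) for cmd in SPOKEN_PUNCTUATION) + r")\b",
    re.IGNORECASE,
)


def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken punctuation commands with the symbols they stand for."""
    text = _COMMAND_PATTERN.sub(
        lambda match: SPOKEN_PUNCTUATION[match.group(1).lower()], transcript
    )
    # Drop the space left behind at the start of a new line or paragraph.
    return re.sub(r"\n[ \t]+", "\n", text)


print(apply_spoken_punctuation("hello comma world period"))
```

A naive mapping like this cannot tell a command from ordinary speech, so a phrase such as “a period of time” would be mangled. A real system presumably uses context to decide, which this sketch does not attempt.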
In addition to my regular voice, I tested how well Voice Typing would work on truly continuous speech by playing a Stephen Colbert video on YouTube into the microphone of my Nexus 6 phone running the Google Docs app. Google Docs recorded 288 words using Voice Typing by the time I pressed the Pause button. It appeared to do a credible job of performing speech-to-text on a person speaking relatively fast. My rough estimate is that it was about 85 to 90% correct. And, of course, there was no punctuation, since you need to actually speak the punctuation marks for them to appear in the document.
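My accuracy figure was an eyeball estimate. If you have a correct transcript to compare against, you could measure it instead with a word error rate, the standard metric for speech recognition. The Python sketch below is one way to do that, assuming simple lowercase, whitespace-separated words; the function name and the sample sentences are made up for illustration and are not from my test.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word out of ten.
reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog today"
wer = word_error_rate(reference, hypothesis)
print(f"Word accuracy: {(1 - wer) * 100:.0f}%")
```

Strip punctuation from the reference transcript first, since Voice Typing output has none unless it is spoken.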
One tip: Voice Typing doesn’t like it when you swear. For example, if I say, “What the f***?”, it will censor the offending word in the text. This was, appropriately enough, first noted in a blog about the linguistics of swearing.
I started, but didn’t finish, writing this article using Voice Typing. Unless you are a smooth extemporaneous speaker (I am not), it is not the fastest way to write more than a few sentences of text. And, like all speech-to-text systems, it works best in a relatively quiet environment. I’m not sure if I will use Voice Typing regularly. I can see myself using it to make a few notes on my phone. And it may be interesting to see how well it performs in an interview situation with multiple people.