iMac and Macbook Pro between a large speaker and a large microphone.

Speech-to-Text: Dictation Software for OS X

Speech-to-text software, sometimes known as dictation software, lets you talk to the computer and have it react appropriately to what you are saying. This is totally different to text-to-speech software, which reads out text that is already in the computer.

In this article you’ll learn about different types of speech-to-text software for Mac OS X, and what your options are if you want to use it to control your computer, dictate text, or both.


Command and Control Software

There are two types of speech-to-text software available. One type is called “command and control” and it lets you speak commands to your computer to control it; hence the name. For example, a command that the computer understands might be, “go to the Apple website” or, “tell me the time”. Each command is pre-programmed and the computer will only recognise those commands it’s been programmed for; you can’t use this software to write an email or use iChat for example.
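To make the "pre-programmed commands only" idea concrete, here is a minimal sketch in Python. It is not how Speakable Items is actually implemented; the command table, the function names, and the returned strings are all hypothetical and exist only to illustrate the idea that a recognised phrase either matches a known command or is ignored.

```python
from datetime import datetime

# Hypothetical actions: illustrative only, not part of any Apple API.
def open_apple_website():
    return "opening https://www.apple.com"

def tell_time():
    return "the time is " + datetime.now().strftime("%H:%M")

# The fixed table of phrases the "computer" has been programmed for.
COMMANDS = {
    "go to the apple website": open_apple_website,
    "tell me the time": tell_time,
}

def normalize(phrase):
    # Lowercase, collapse repeated whitespace, and drop surrounding punctuation.
    return " ".join(phrase.lower().split()).strip(".,!?")

def handle(phrase):
    # Return the action's result, or None if the phrase is not a known command.
    action = COMMANDS.get(normalize(phrase))
    if action is None:
        return None
    return action()
```

Calling `handle("Tell me the time.")` runs the matching action, while free-form text such as `handle("Dear Sam, thanks for your email")` returns `None`, which is why this kind of software cannot be used to write an email.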

Command and control software for the Mac – known as “Speakable Items” (or sometimes, confusingly, “spoken commands”) – is already built into every OS X computer and can be accessed via the Accessibility panel. Although this software is less capable than dictation software, it is more helpful for people with some types of disabilities.

OS X Mavericks Accessibility preferences pane opened to Speakable Items.

To get you up and running with Speakable Items, check out the Apple Support article for your version of OS X:

Dictation Software

The other type of speech-to-text software is usually called “dictation” software. This is the type that lets you use your voice to write an article like this one, chat with your friends in iChat, or type an email.

There is dictation software built into OS X and there is a program developed by Nuance called Dragon Dictate for Mac. Dictate is the successor to a program named iListen which MacSpeech used to produce.

All dictation-capable speech-to-text products work very well for some people and fairly badly for others. Whether it will work for you depends on many things, including: how much effort you’re willing to put into learning it, how good your microphone is, your age (speech-to-text usually works less well for children), how closely your accent matches what the program expects, whether your disability affects your speech, and whether your voice changes a lot through the day.

These speech-to-text dictation programs have improved hugely in the last few years though, so even if you have used dictation software before and given up, it is worth trying again.

Built-in OS X Dictation

OS X’s free built-in dictation requires OS X 10.8 Mountain Lion or later and can be accessed via the “Dictation and Speech” panel in System Preferences.

OS X Mavericks' System Preferences pane for Speech and Dictation.

Under Mountain Lion and by default in Mavericks it functions by listening to up to 30 seconds of speech and sending the speech to Apple’s servers for processing – the same way that Dictation to Siri on your iPhone works, as it’s essentially the same thing. If you have a stable and reliable broadband connection this is fine, but those with slow or metered internet connections may have trouble. For those users who want local speech processing, under Mavericks you can turn on Enhanced Dictation which allows continuous speech and offline processing.

To start you off with OS X’s dictation, here are some Apple support articles:

Nuance Dragon Dictate

Nuance’s Dragon Dictate for Mac version 4, the current version, requires Intel-based Macintosh hardware, Mac OS X Mountain Lion 10.8.3 or higher, and a Nuance-approved noise-canceling headset microphone. It will set you back approximately US$200 plus the cost of a microphone.

Dragon Dictate icon

Nuance’s Dragon is a more complete product than OS X’s built-in dictation, allowing you to mix dictation and commands without needing to use the keyboard or mouse. For those who find keyboard or mouse use extremely difficult or impossible and wish to do as much as possible by voice, Dragon is still the only functional solution.

Dragon also allows for transcription of recorded files, provided they only contain a single speaker and that person has already set up recognition. There is an iPhone/iPad app called Dragon Recorder specifically for recording files for later transcription.

The speech recognition engine which powers Nuance’s Dragon is the same as that powering NaturallySpeaking, the premier speech recognition program for Windows, and it is continually improving. Since its release in 2008, Dragon has made enormous improvements in speech recognition and it is much more forgiving and usable than it was then! I hope that improvements continue just as fast in the future – it’s a great thing for all users.

- Ricky


If you are going to buy or upgrade any version of Nuance's Dragon Dictate, please consider using the links in this article. If you do, I'll get a commission - a small percentage of the sale price. It won't cost you anything and it will help to support me and ATMac.

4 thoughts on “Speech-to-Text: Dictation Software for OS X”

  1. Ricky, what voice-to-text software could be used for recording-and-transcribing lectures? The lecturer would not “train” the software in advance to his/her voice. Is this even possible?

    Cindy

    • Cindy: Unfortunately it’s not really possible to transcribe untrained speakers, especially if you are recording them in a situation where the microphone is far from their mouth. Microphones for speech-to-text really need to be just an inch or two away from the speaker; recording from where a student would sit in a lecture hall is very dicey. You may have to wait a few years for technology to catch up with this need, I’m sorry.

  2. Hello,

    is it possible to feed this speech recognition software with line-in audio instead of the microphone input? For instance to transcribe a recorded speech which is on CD.

    Thanks!
    -martin

    • Martin: You can transcribe pre-recorded audio, provided the profile you will be using has been trained for that speaker – that means you have to have trained the system with the same speaker using the same microphone, which sounds unlikely if it’s on a CD. If you want to record something for later transcription, check the Nuance website for what recorders are recommended.
