Posts Tagged with 'transcribe'

Speech-to-Text: Dictation software for Mac OS X

A microphone

Speech-to-text software, sometimes known as dictation software, lets you talk to the computer in some form and have the computer react appropriately to what you are saying. This is totally different to text-to-speech software, which is software that can read out text already in the computer.

Command and Control Software

There are two types of speech-to-text software available. One type is called "command and control" and it lets you speak commands to your computer to control it; hence the name. For example, a command that the computer understands might be, "go to the Apple website" or, "tell me the time". Each command is pre-programmed and the computer will only recognise those commands it's been programmed for; you can't use this software to write an email or use iChat for example.
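
To make the "pre-programmed commands only" idea concrete, here is a minimal Python sketch of a fixed command table. It is purely illustrative: the table contents, the action names, and the `handle_utterance` function are my own assumptions and say nothing about how Speakable Items is actually built. It also assumes the speech has already been turned into text.

```python
# Hypothetical command table: phrase -> action name.
# These entries are illustrative only, not real Speakable Items internals.
COMMANDS = {
    "go to the apple website": "open_apple_website",
    "tell me the time": "speak_time",
}


def handle_utterance(utterance):
    """Return the action name for a known command, or None otherwise."""
    # Normalise case and whitespace so "Tell me  the time" still matches.
    key = " ".join(utterance.lower().split())
    # Anything not in the table is simply ignored: no free-form dictation.
    return COMMANDS.get(key)
```

With this sketch, a known phrase maps to its action, while a free-form sentence such as the opening of an email matches nothing, which is exactly why this type of software cannot be used for dictation.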

Command and control software for the Mac - known as "Speakable Items" (or sometimes, confusingly, "spoken commands") - is already built into every OS X computer, although most people don't know about it. You don't need to download, buy, or install anything to get this software to work; all you need is a microphone that works with your computer. The main drawback is that the Speakable Items software is programmed for English with a standard American accent, and has significant trouble with any other accent. It doesn't function at all with languages other than English.

Some resources for getting you up and running with Speakable Items include:

Dictation Software

The other type of speech-to-text software is usually called "dictation" software. This is the type that lets you write an article like this one, type stuff to your friends in iChat, or type an email. The most common Windows software for speech-to-text dictation - you've probably heard of it - is Dragon NaturallySpeaking. Only one dictation-capable speech-to-text program for OS X is still being updated and developed: MacSpeech Dictate. Dictate is the successor to a program named iListen which MacSpeech used to produce.

MacSpeech Dictate icon

Like all dictation-capable speech-to-text products, MacSpeech Dictate works very well for some people and very badly for others. Whether it will work for you depends on many things including: how much effort you're willing to put into learning it, how good your microphone is, your age (speech-to-text usually works less well for children), how much your accent matches what the program expects, and whether your voice changes a lot through the day.

MacSpeech Dictate is also still fairly new software - it was only released on the 15th of February, 2008. In comparison, the premier speech recognition program for Windows is Dragon NaturallySpeaking, which has been in development since the 1980s[1].

When MacSpeech Dictate was originally released it had several major problems which made it unusable for people with disabilities, but most of these have now been resolved:

  • There were no good help functions inside the application - this was rectified in Dictate version 1.3
  • It didn't learn from corrections - this was rectified in Dictate version 1.2
  • Couldn't spell words out by voice - this was rectified in Dictate version 1.2
  • Couldn't request individual key presses (such as command-s or command-option-escape) by voice - this was rectified in Dictate version 1.3
  • Couldn't be taught new words, such as names or jargon specific to your profession - this was largely rectified in Dictate version 1.2, although some words still resist training
  • There was no way to control the mouse by voice - this was finally rectified in Dictate version 2.0.

I tried using the old iListen program a few years ago and could not get results that were useful; an on-screen keyboard was the best solution at the time. Although MacSpeech Dictate is in its early days as a program, its recognition of my particular voice is hugely better than iListen's was. This is not surprising though, as MacSpeech Dictate's speech recognition engine is based on the same engine used by Windows' Dragon NaturallySpeaking - widely recognised as the best consumer speech recognition available.

MacSpeech Dictate requires Intel-based Macintosh hardware and Mac OS X 10.5.6 (Leopard) or higher. Thirteen English dialects/accents are supported, with US and UK spelling options. These are:

  • US Spelling
    • American
    • American - Inland Northern
    • American - Southern
    • American - Teens
    • Australian
    • British
    • Indian
    • Latino
    • Southeast Asian
  • UK Spelling
    • Australian
    • British
    • Indian
    • Southeast Asian

Specialised versions - Dictate Medical and Dictate Legal - are available for dictating in these language areas, and Dictate International is now available and recognises speech in French, German, and Italian. MacSpeech have strongly hinted that Spanish language recognition is next on their agenda.

MacSpeech Dictate is a great program for dictation and some computer control, but it is not something that will let you control the computer completely "hands free". Quadriplegic users and others who need full computer control will need to supplement Dictate with a mouth stick and keyboard, or with a program such as SwitchXS for switch access to functions not available by voice. I highly recommend Dictate though; it's part of my suite of accessibility technology and I use it whenever I am able to.

Website: MacSpeech Dictate

- Ricky Buchanan

New MacSpeech Scribe For Transcription

Icon for MacSpeech Scribe

One of the major things that the MacSpeech Dictate family has been lacking is the ability to take pre-recorded files and convert them to text. Not any more: MacSpeech Scribe will do just that for you, with up to 99% accuracy.

MacSpeech Scribe will accept any file in one of these formats:

  • .wav
  • .aif or .aiff
  • .m4v, .mp4, or .m4a
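
If you have a folder full of recordings and want to check which ones are in a format Scribe accepts, a few lines of Python can do the filtering. This is only a sketch based on the extension list above: the function name is my own, and a matching extension does not guarantee the audio inside the file is something Scribe can actually read.

```python
from pathlib import Path

# Extensions taken from the list of formats MacSpeech Scribe accepts.
SUPPORTED_EXTENSIONS = {".wav", ".aif", ".aiff", ".m4v", ".mp4", ".m4a"}


def has_supported_extension(filename):
    """Return True if the file name ends in one of the listed extensions."""
    # Compare case-insensitively so "MEMO.WAV" and "memo.wav" both pass.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `has_supported_extension("memo.WAV")` is true, while an MP3 file would be rejected and need converting first.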

Audio file quality will affect the quality of your recognition, of course, so using a certified recording device is recommended but not required - anything that will produce the correct file format will work. At the moment, the iPhone, iPod Touch, and several Olympus digital voice recorders are the only certified devices, but I expect MacSpeech to expand this range fairly quickly.

Recording a sound file to run through Scribe is pretty much like using MacSpeech Dictate itself, but without the ability to correct and train phrases as you go. If you want your transcribed document to include punctuation, you need to speak the punctuation signs into the recording, and you need to train MacSpeech Scribe to the voice of the person who recorded the audio file before it can transcribe.

Photo of an iPhone in somebody's hand

MacSpeech Scribe lets you record sound on your iPhone, iPod Touch, or other recording device, then transcribes it when you're back at your computer.

So what are the limitations? Bear in mind that I have not had access to MacSpeech Scribe myself, but these are the limits that have been described by MacSpeech or can be inferred from the behaviour of other products in the MacSpeech family:

  • You can only have one speaker per file, so MacSpeech Scribe will not be helpful for transcribing a meeting or class or any other situation where there is more than one speaker.
  • The program must be trained to the voice in the recording, so it's also unlikely to be useful for transcribing a speech or lecture unless the speaker is willing to spend some time with you creating a profile for MacSpeech Scribe.
  • Because of the need for punctuation to be spoken aloud, I am not sure if the accuracy would be adequate in a situation where punctuation was not spoken - from Scribe's perspective the text produced would be one really long paragraph.
  • We know from other MacSpeech products that the distance from mouth to microphone is very important for recognition, so I would think any speaker who is moving around would significantly degrade accuracy. If you need to record a speaker like this for MacSpeech Scribe's use I would suggest investing in a lapel microphone for your recorder.
  • Background noise or any other non-speech noise in your recording will also degrade accuracy. Get a directional microphone for your voice recorder so it only picks up your own voice, or dictate in a quiet place.
  • Changes in voice quality from emotion or emphasis also degrade recognition. MacSpeech Dictate, in my experience, does best with a very steady tone of voice - not a monotone, but no getting excited or sad or speaking too fast or too slowly - so I would expect that MacSpeech Scribe is similar in this respect.

MacSpeech says:

"MacSpeech Scribe lets you easily add new words and acronyms, edit and navigate transcribed documents, and so much more. MacSpeech Scribe makes it easy to work with your transcribed document so you can create the perfect document for your needs."

This leaves me unsure whether its editing abilities are the same as those of other MacSpeech products and, if they are, whether it lets you verbally add a word or phrase that was missed by the dictation engine. If so, what does Dictate have that Scribe lacks? I'll have to get hold of it to clarify that one for you!

MacSpeech Scribe is available immediately, in English only, for all the dialects of English usually recognised by MacSpeech products. There is a special price of US$99 for currently registered MacSpeech Dictate 1.5 customers; the regular suggested retail price is US$149.

- Ricky Buchanan

Photo credit to Twon.

How Can I Dictate In Other Languages?

Icon for MacSpeech Dictate

[Last updated 18 October, 2009]

MacSpeech Dictate is currently the only speech-to-text program under development for OS X. Thirteen English dialects/accents are supported, with US and UK spelling options. These are:

  • US Spelling
    • American
    • American - Inland Northern
    • American - Southern
    • American - Teens
    • Australian
    • British
    • Indian
    • Latino
    • Southeast Asian
  • UK Spelling
    • Australian
    • British
    • Indian
    • Southeast Asian

Specialised versions - Dictate Medical and Dictate Legal - are available for dictating in these language areas, and Dictate International is now available and recognises speech in French, German, and Italian. MacSpeech have strongly hinted that Spanish language recognition is next on their agenda.

For Windows, Dragon NaturallySpeaking 10 Preferred supports Spanish and Dutch, but there is currently no way to recognise these languages using MacSpeech Dictate. If you want to dictate in Dutch or Spanish and don't want to wait for Dictate International to include these languages, you can use Parallels or VMware Fusion to run Windows XP or Vista on a "virtual machine" inside your OS X machine. Using Dragon NaturallySpeaking 10 Preferred you can dictate into a document within the Windows virtual machine.

To get your text back into OS X you'll need to select and copy it in the Windows program you're using, switch back to OS X, and paste it into a document, email message, or other OS X window. So this method won't let you dictate directly into OS X documents or control your OS X computer by voice, but if you need to dictate in other languages for writing documents it could be ideal. Dragon NS 10 has specific language editions for each language, and they all include English dictation as well as the language purchased. You'll also need a copy of Parallels or VMware Fusion and a licensed copy of Windows XP or Windows Vista for this method.

Thanks to Paul for the creative solution! I hope it helps others until MacSpeech Dictate is up to the task.

- Ricky Buchanan

MacSpeech Dictate Instructional Videos

MacSpeech Dictate icon

The MacSpeech Dictate people have been busy updating their website recently. As well as significant updates to the knowledge base, and new forums for users, they are now also offering free instructional videos.

MacSpeech Dictate instructional videos are designed to provide you with easy-to-understand, practical tips and techniques for getting the most from using MacSpeech Dictate.

The videos currently offered, and their lengths in minutes and seconds, are:

  • How to Install MacSpeech Dictate (2:17)
  • How to Create a Profile (2:19)
  • How to Use Phrase Training (1:52)
  • How to Create a Voice Command for a Text Macro (1:22)
  • Editing a Document (1:04)

Each of the videos is available in high-resolution and low-resolution versions. I especially recommend that all Dictate users watch the tutorials on Phrase Training and Editing a Document, as these are vital to the basic use of MacSpeech Dictate.

Website: MacSpeech Dictate Instructional Videos

- Ricky Buchanan

MacSpeech Dictate 1.2 Upgrade Released

MacSpeech Dictate icon

MacSpeech Dictate has finally been upgraded to version 1.2. The list of improvements and fixes is extensive, and essential functions such as correction (they call it "phrase training") and a spelling mode have been introduced.

Some people have reported problems related to upgrading on the MacSpeech discussion forums, but hopefully these will be quickly addressed. I have had a minor problem myself but it doesn't seem to have affected the functioning of Dictate itself, just the appearance of one of the windows.

The full list of changes and new features is displayed when upgrading but I couldn't find it on the MacSpeech website. It includes changes/improvements in installation, setup, and the user interface overall.

There is also a new Recognition Window which aids with phrase training/correction. Here's an image of the recognition window with the Dictate notepad - it had correctly recognised the phrase "It knows my name" but offers other plausible alternatives which I could have used for correction:

Image of MacSpeech Dictate's Notepad and Recognition Windows

As well as the new commands related to training, these commands have also been added to ease editing:

  • "Move Backward/Forward 1-99 Words"
  • "Capitalize The Word[s] "text" [Through/To "text"]"
  • "Uppercase The Word[s] "text" [Through/To "text"]"
  • "Lowercase The Word[s] "text" [Through/To "text"]"
  • "Send Message" for iChat
  • "Purge Document"
  • "Cache Document"
  • "No Trailing Space" for Spelling mode
  • "Play The Selection"
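
The bracket-and-slash notation in these commands describes a small grammar: a slash means "either word", and "1-99" means any number in that range. As a rough illustration only, here is how the "Move Backward/Forward 1-99 Words" pattern could be written as a Python regular expression. The pattern and the `parse_move_command` function are my own sketch, not anything from Dictate itself, and it assumes the spoken number has already been turned into digits.

```python
import re

# "Move Backward/Forward 1-99 Words": one of two directions, then a
# number from 1 to 99, then "Word" or "Words". Hypothetical pattern.
MOVE_COMMAND = re.compile(
    r"^move (backward|forward) ([1-9][0-9]?) words?$",
    re.IGNORECASE,
)


def parse_move_command(phrase):
    """Return (direction, count) for a valid move command, else None."""
    match = MOVE_COMMAND.match(phrase.strip())
    if match is None:
        return None
    direction = match.group(1).lower()
    count = int(match.group(2))
    return direction, count
```

A phrase such as "Move Backward 3 Words" parses to a direction and a count, while anything outside the 1-99 range, such as "Move Forward 100 Words", is rejected.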

This seems like a great upgrade and should definitely be installed by all MacSpeech Dictate users. Let's hope that minor upgrades flow more quickly now they've got this one out the door!

Website: MacSpeech Dictate

- Ricky Buchanan
