Posts Tagged with 'speech-recognition'

Speech-to-Text: Dictation software for Mac OS X


Speech-to-text software, sometimes known as dictation software, lets you talk to the computer and have the computer react appropriately to what you are saying. This is totally different to text-to-speech software, which is software that can read out text already in the computer.

Command and Control Software

There are two types of speech-to-text software available. One type is called "command and control" and it lets you speak commands to your computer to control it; hence the name. For example, a command that the computer understands might be, "go to the Apple website" or, "tell me the time". Each command is pre-programmed and the computer will only recognise those commands it's been programmed for; you can't use this software to write an email or use iChat for example.
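To make that limitation concrete, here is a minimal Python sketch of how a command-and-control system behaves. The two phrases are the examples given above, but the command table, the action strings, and the `handle_utterance` function are all made up for illustration; this is not how Speakable Items is actually implemented.

```python
# Hypothetical command table: the action strings are illustrative only,
# not the real Speakable Items command set.
COMMANDS = {
    "go to the apple website": "open https://www.apple.com",
    "tell me the time": "speak the current time",
}


def handle_utterance(heard):
    """Return the action for a pre-programmed command, or None if unknown."""
    # Normalise case, surrounding spaces, and trailing punctuation
    # before looking the phrase up in the fixed table.
    phrase = heard.lower().strip().rstrip(".!?")
    return COMMANDS.get(phrase)
```

Anything that is not in the table, such as the first sentence of an email, simply returns `None`. That is the gap dictation software fills.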

Command and control software for the Mac - known as "Speakable Items" (or sometimes, confusingly, "spoken commands") - is already built into every OS X computer, although most people don't know about it. You don't need to download, buy, or install anything to get this software to work; all you need is a microphone that works with your computer. The main drawback is that the Speakable Items software is programmed for English with a standard American accent, and has significant trouble with any other accent. It doesn't function at all with languages other than English.

Some resources for getting you up and running with Speakable Items include:

Dictation Software

The other type of speech-to-text software is usually called "dictation" software. This is the type that lets you write an article like this one, type stuff to your friends in iChat, or type an email. The most common Windows software for speech-to-text dictation - you've probably heard of it - is Dragon NaturallySpeaking. Only one dictation-capable speech-to-text program for OS X is still being updated and developed: [msd]. Dictate is the successor to a program named iListen which MacSpeech used to produce.

Like all dictation-capable speech-to-text products, MacSpeech Dictate works very well for some people and very badly for others. Whether it will work for you depends on many things including: how much effort you're willing to put into learning it, how good your microphone is, your age (speech-to-text usually works less well for children), how much your accent matches what the program expects, and whether your voice changes a lot through the day.

MacSpeech Dictate is also still fairly new software - it was only released on the 15th of February, 2008. In comparison, the premier speech recognition program for Windows is Dragon NaturallySpeaking, which has been in development since the 1980s[1].

When MacSpeech Dictate was originally released it had several major problems which made it unusable for people with disabilities, but most of these have now been resolved:

  • There were no good help functions inside the application - this was rectified in Dictate version 1.3
  • It didn't learn from corrections - this was rectified in Dictate version 1.2
  • Couldn't spell words out by voice - this was rectified in Dictate version 1.2
  • Couldn't request individual key presses (such as command-s or command-option-escape) by voice - this was rectified in Dictate version 1.3
  • Couldn't be taught new words, such as names or jargon specific to your profession - this was largely rectified in Dictate version 1.2, although some words still resist training
  • There was no way to control the mouse by voice - this was finally rectified in Dictate version 2.0.

I tried using the old iListen program a few years ago and could not get results that were useful; an on-screen keyboard was the best solution at the time. Although MacSpeech Dictate is in its early days as a program, its recognition of my particular voice is hugely better than iListen's was. This is not surprising though, as MacSpeech Dictate's speech recognition engine is based on the same engine used by Windows' Dragon NaturallySpeaking - widely recognised as the best consumer speech recognition available.

[msd] requires Intel-based Macintosh hardware and Mac OS X 10.5.6 (Leopard) or higher. Thirteen English dialects/accents are supported, split across US and UK spelling options. These are:

  • US Spelling
    • American
    • American - Inland Northern
    • American - Southern
    • American - Teens
    • Australian
    • British
    • Indian
    • Latino
    • Southeast Asian
  • UK Spelling
    • Australian
    • British
    • Indian
    • Southeast Asian

Specialised versions - Dictate Medical and Dictate Legal - are available for dictating in these language areas, and Dictate International is now available and recognises speech in French, German, and Italian. MacSpeech have strongly hinted that Spanish language recognition is next on their agenda.

MacSpeech Dictate is a great program for dictation and some computer control, but it is not something that will let you control the computer completely "hands free". Quadriplegic users and others who need full computer control will need to supplement Dictate with a mouth stick and keyboard, or with a program such as SwitchXS for switch access to functions not available by voice. I highly recommend Dictate though; it's part of my suite of accessibility technology and I use it whenever I am able to.

Website: [msd]

- Ricky Buchanan

[msddisclaim]

[msdbanner]

Dragon Dictate for Mac 2.0 Announced

Nuance Communications today announced the release of Dragon Dictate for Mac 2.0, a paid and rebranded upgrade for MacSpeech Dictate.

This is a major upgrade, bringing Dictate much closer to the Windows based Dragon NaturallySpeaking product. Major features include:

  • Uses the same speech recognition engine as the new Dragon NaturallySpeaking 11
  • Mouse movement with voice commands using a 3 by 3 grid system is now built in.
  • Mouse clicking with voice commands including clicks with modifiers, double clicking, etc., is now built in.
  • Proofreading documents with the Mac's built in text-to-speech commands is now also included.
  • More than one microphone can now be attached to a single profile.
  • New editing commands have been added to match the commands that will be familiar to Windows NaturallySpeaking users.
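For anyone who hasn't used a grid system before, here is a rough Python sketch of how 3 by 3 mouse grids generally work: the screen is split into nine cells, you say a cell number to zoom into it, and you repeat until the cell is small enough to click. The cell numbering (1 to 9, left to right, top to bottom) and the function names are my assumptions for illustration, not details taken from Nuance's announcement.

```python
def narrow(region, cell):
    """Shrink region (x, y, width, height) to one cell of a 3 by 3 grid.

    Assumption: cells are numbered 1-9, left to right, top to bottom.
    """
    if not 1 <= cell <= 9:
        raise ValueError("cell must be between 1 and 9")
    x, y, width, height = region
    row, col = divmod(cell - 1, 3)
    return (x + col * width / 3, y + row * height / 3, width / 3, height / 3)


def pointer_target(screen, spoken_cells):
    """Follow a sequence of spoken cell numbers and return the centre point."""
    region = screen
    for cell in spoken_cells:
        region = narrow(region, cell)
    x, y, width, height = region
    return (x + width / 2, y + height / 2)
```

On a 900 by 900 area, `pointer_target((0, 0, 900, 900), [1, 9])` gives `(250.0, 250.0)`: the first number picks the top-left ninth of the screen and the second picks the bottom-right ninth of that cell. Each spoken number makes the target area nine times smaller, so a few numbers are enough to reach most buttons.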

This sounds like it could now function as a complete keyboard replacement for disabled Mac users, which is great news!

Unfortunately my computer is still off being fixed (the first fix only worked for a few days), so I haven't had the chance to try this new version. As soon as is humanly possible, I will be getting myself a copy and testing it out. Meanwhile, Dan Cohen at GearDiary has reviewed Dragon Dictate and declares it 'awesome' - not a bad start!

Dragon Dictate for Mac costs US$199 including a basic microphone. The upgrade costs US$49 for a downloadable version, more if you need the upgrade on CD or want to purchase a new microphone at the same time.

If you use this banner to purchase your upgrade online I will get a small portion of your upgrade price, which will help support me and ATMac:

[msdbanner]

Have you upgraded yet? Are you planning to upgrade soon, or later, or not at all? And what new feature are you most excited about?

- Ricky Buchanan

ShoutOUT Speech To Text Messaging For iOS

ShoutOUT is a messaging application for text messages (SMSs), and Facebook and Twitter updates. Typed messages can be entered for free, while messages entered using speech-to-text are charged for via in-app purchases, beginning at 50 voice credits for US$1.99.

I couldn't test this app because it only works for USA customers who already have a USA mobile phone number, but it seems to have some good reviews from users.

ShoutOUT allows you to send and receive messages, with full texting capabilities including:

  • Inbox and outbox
  • Discussions threaded by recipient
  • Status updates on Facebook and Twitter
  • Push notification of incoming messages
  • Thumbnail images for all your contacts
  • Shake-to-Clear

ShoutOUT is a full-featured messaging app with voice dictation. Speak your text messages or Facebook and Twitter updates and see the results in seconds—there's no faster way to create and send messages.

ShoutOUT allows you to send and receive text messages at a cost far lower than the standard rates charged by mobile operators. For outbound messages, keyboard entered messages are free and unlimited, and you pay only pennies per message for voice-generated texts. All inbound messages are free and unlimited.

- Ricky Buchanan

The Ultimate MacSpeech Dictate 1.5 Global Commands List

[msd] is a great program but learning so many commands at once can be intimidating. I've put together another document to help you learn and remember all the global commands found in Dictate version 1.5.*.

MacSpeech Dictate has two types of commands - global commands and application-specific commands. The global commands work in all programs and the application-specific commands work only in a single application, for example Mail, Safari, or iChat. This document is only concerned with the global commands, which are the ones you'll need to know best and are likely to use most often.

These documents aren't in any way meant to replace the Dictate User's Manual - every Dictate user should absolutely read the manual, even if you're not "the manual reading type". Trust me, you'll get far better use of Dictate if you have read the manual! But nobody's memory is perfect, especially for a program with so many commands, so I've made this commands list to help you out.

The first "Global Commands List" I created, for MacSpeech Dictate 1.2.1, was three pages long - this new one contains fourteen full pages of commands! Dictate has really matured and grown in just a few versions. I've been through all of MacSpeech's available documentation and looked at the AppleScript commands within the program to pull this together. There's no hidden "behind the scenes" knowledge included here, but it took many hours and a lot of organisation to get all of these commands together and in one place in a useful format.

Instead of downloading this one directly, I'm asking you to sign up to download it. As soon as you've confirmed your subscription you'll be taken to a page containing the zipped PDF file ready to download:

Why sign up? I'll occasionally be sending you information about MacSpeech Dictate and the new MacSpeech Scribe, letting you know there's a new blog post on the topic, and telling you about important upgrades. If you aren't interested in the information you can always unsubscribe right away.

Once you've downloaded the list, I suggest you print it out and read through it, highlighting commands that you often forget or ones that you didn't know about but think you might find useful. This way you can find them quickly when you need them.

If you have any trouble signing up to receive the Ultimate MacSpeech Dictate 1.5 Global Commands List, please contact me and I'll happily help you out.

- Ricky Buchanan

[msddisclaim]

[msdbanner]

Nuance Buys MacSpeech: What Now?

It's been announced that the MacSpeech company has been purchased by Nuance. Nuance are the company behind the Windows product "Dragon NaturallySpeaking" and other recent Dragon products for iPod Touch and iPhone.

So what does this mean for [msd] and the other MacSpeech products? Nuance are quick to assure us that "nothing will change in the near term" but I think things will change for the better. Nuance is a much bigger company than MacSpeech - the Windows market has always been more than ten times bigger than the Mac OS X market, and the company which is now Nuance has been around much longer than MacSpeech has. I don't know the number of employees that either Nuance or MacSpeech actually has, but I'd be willing to bet that Nuance has a lot more. This will probably mean good things for MacSpeech's products: with more talent available to work on them, they can progress more quickly.

MacSpeech Dictate still has some major features missing, such as mouse control, but it's growing and maturing quickly. One big problem for people wanting to switch from using Windows with NaturallySpeaking to using OS X with Dictate is that the names of various commands are completely different between the two products. Actually, Dictate's command set has grown in a haphazard way and commands are difficult to memorise because different commands are constructed in different ways. I would think that with the acquisition of MacSpeech, there's the possibility of the Dictate command set becoming more like the NaturallySpeaking command set. This may be confusing for existing customers, but it would be a huge blessing for customers switching from Windows to Mac and, I think, in the long run it would be a good thing.

Nuance obviously thinks the OS X market for voice recognition is growing and viable, and they're willing to spend money to get into it. This might mean that MacSpeech products eventually cost the same as the equivalent NaturallySpeaking products - at the moment the Mac versions cost significantly more, despite having fewer features. Reducing the cost would not have been possible for MacSpeech alone, as they needed cash flow, but Nuance are in a stronger financial position and they're bigger so they can probably cope with a bumpier cash flow. Cheaper assistive technology is good for everybody, so I hope this one comes true!

I've got a MacSpeech Dictate-related free download coming up soon too - so regular readers stay tuned!

What else do you think will change, or hope will change, with Nuance's buyout of MacSpeech?

- Ricky Buchanan

[msdbanner]

[msddisclaim]