TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Subject: Re: voice recognition software
From: Matt Ion <soundy -at- SOUNDY -dot- ML -dot- ORG>
Date: Fri, 21 Aug 1998 17:58:23 -0800
On Fri, 21 Aug 1998 15:46:40 -0500, Dave Whelan wrote:
>>Both use speech recognition, as opposed to voice
>>recognition. Simple system navigation works very well with little or no
>>training.
>>With an hour or so put into training, dictation of discrete speech runs
>>upward of 90-95wpm.
>
>Does this mean that if one person trains it for an hour or so, it can
>produce corrected type at 90-95wpm from anyone's speech?
Well, it might have a little more trouble with, say, Groundskeeper Willie than
with Walter Cronkite. The training has the user recite several English phrases
designed to cover as many combinations of spoken sounds as possible. Some
users require less training than others, and it does allow you to stop at
several different levels.
The latest versions are supposed to work well with continuous speech, while
the one I use is designed for discrete speech (that ... means ... a ... slight
... pause ... between ... words ... which ... while ... it ... may ... seem ...
slow ... is ... usually ... close ... to ... the ... same ... rate ... as ...
normal ... conversation ... punctuated ... by ... 'ums' ... 'uhs' ... and ...
other ... such ... pauses.).
The dictation process is quite effectively context-sensitive, so for example,
if you dictated "to", and the next word was "much", it would jump back and
correct the word as "too".
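To make the idea concrete, here is a toy sketch in Python of that kind of look-ahead correction. It is only an illustration: the rule table and the function name are made up for this example, and I have no idea how VoiceType actually does it internally (presumably with something far more statistical than a hand-written lookup).

```python
# Toy illustration only: a hand-written rule table, not how VoiceType works.
# Maps (word_already_typed, next_word_heard) -> corrected spelling of the
# word that was already typed.
FOLLOWING_WORD_RULES = {
    ("to", "much"): "too",
    ("to", "many"): "too",
    ("there", "own"): "their",
}


def correct_with_lookahead(words):
    """Return a new list of words, fixing homophones using the following word."""
    corrected = []
    for word in words:
        if corrected:
            key = (corrected[-1].lower(), word.lower())
            if key in FOLLOWING_WORD_RULES:
                # "Jump back" and rewrite the word that was already emitted.
                corrected[-1] = FOLLOWING_WORD_RULES[key]
        corrected.append(word)
    return corrected


print(" ".join(correct_with_lookahead("that is to much".split())))
# prints: that is too much
```

The point is just that the earlier word is not final until the next one arrives; "to" followed by anything not in the table is left alone.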
95wpm is IBM's claim for it; I've never timed it myself, although it does keep
up very well in my experience. The original version used a custom sound card
that handled most of the processing; current versions can use a standard sound
card, but require a lot more processor power (IBM states a P75 with 24MB RAM
minimum for the Warp 4 version, with a P100 and 32MB or more recommended; my
laptop is a P166MMX with 48MB RAM).
>If so, it is far better than anything I have ever heard of
Well of course now... IBM couldn't market fire to the Eskimos.
>how much is it?
There are at least a couple different versions available for Win32 platforms,
going under the name ViaVoice and offering varying levels of speech
control/dictation for varying prices. Drop by www.ibm.com and plug "ViaVoice"
into their search engine.
If you feel like blowing off Windows for a technically superior operating
system, Warp 4 with VoiceType included can be found retail for probably around
$150.
Your friend and mine,
Matt
<All standard disclaimers apply>
"Reality is in alpha test on prototype hardware."
----------------------------------------------------------------------------
They say there are strangers, who threaten us
Our immigrants and infidels
They say there is strangeness, too dangerous
In our theaters and bookstore shelves
Those who know what's best for us
Must rise and save us from ourselves
- Rush, "Witch Hunt"