RE: Scanned Docs to Revise

Subject: RE: Scanned Docs to Revise
From: "Cassandra Greer" <cassandra -at- greer -dot- de>
To: "TECHWR-L" <techwr-l -at- lists -dot- techwr-l -dot- com>
Date: Sat, 18 Jun 2005 09:33:59 +0200

I have been using Omnipage 14 quite a bit lately for scanning bad-quality
faxes, protected PDFs, and PDFs that are simply large scans. I have to
say it does a thigh-slapp'n good job, including making a pretty good attempt
at keeping something that resembles the original formatting and images. For
those of you who aren't familiar with OCR programs, it also lets you see
what it thinks the original says in comparison to the actual original so
that you can make changes right away if necessary - also in different

For long docs, it can take a while, but I think that depends more on your RAM
and processor speed. I recently did a 350-page PDF which was just a bunch of
pages scanned from a book (three columns per page, with words from up to 7
languages in each column). I went off to lunch, letting my 500 MB of RAM and
2.4 GHz processor take over, then came back and there it was. It got all the
columns per page and all the words, though it stumbled on the formatting a bit.
However, the file was still quite usable for my purposes. I have so far found
only a few language-specific letter recognition problems (in this case, I
think it didn't feel like letting me check target to source because it was
pooped out but I don't blame it ;) but thankfully in the parts that I don't

I have also heard good things about ABBYY FineReader...but have not yet tried
it out - there may be a trial version of it. There isn't of Omnipage 14


Cass :)

> It's called "optical character recognition" or OCR...although I don't
> envy you doing "hundreds of pages"...
> The most advanced packages even try to maintain original formatting.
> Perhaps the most popular of the advanced packages is Omnipage Pro 14
> Office, from As they say about it:
> "<snip> blurb </snip>."
> David
> On 6/17/05, twriter01 -at- hotmail -dot- com <twriter01 -at- hotmail -dot- com> wrote:
> >
> > Help! Suggestions on getting hundreds of scanned docs (.pdf) into
> > MS Word format. These docs were hard copies scanned into .pdf format.
> >
> > Only options I can think of are re-typing everything or using
> > voice-recognition. Any suggestions? Any technologies? TIA!



Re: Scanned Docs to Revise: From: David Neeley
