Subject: Re: Converting to text
From: "Sandy Harris" <sharris -at- dkl -dot- com>
To: TECHWR-L <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Wed, 23 Feb 2000 09:46:58 -0500
Ruth Lundquist wrote:
>
> I've asked this before, I'll ask it again (in hopes that someone has
> invented some new whiz-bang gizmo):
>
> Does anyone know of a macro, tool, program, utility, potion, magic spell,
> and/or sacrificial offering that will convert a Word document to *readable*
> text?
I'd try printing (not saving) to a text file first.
If that doesn't do it, I'd try a sequence of actions. In Word, save as HTML.
Then go to w3c.org and get either the HTML Tidy utility or the Amaya
browser/editor. Both are free and either will clean most of the
MS-rubbish(TM) out of the HTML files.
Then grab a copy of the Lynx text-only browser, which is standard in any
Linux or [Net|Free|Open]BSD distribution and available as free source. I've
no idea if it is available for Windoze or Mac. Then use the lynx command to
get plain text output: lynx -d or lynx --dump or ... I don't recall exactly.
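
For anyone who wants to script the Tidy-then-Lynx steps above, here is a minimal sketch in Python. It assumes the `tidy` and `lynx` executables are installed and on the PATH, and that Word has already saved the document as HTML. The function names and file names are made up for illustration. On the Lynx builds I know of, the option for plain-text output is spelled `-dump`, and `-nolist` leaves off the list of links at the end; check `lynx -help` on your own system before relying on either.

```python
import shutil
import subprocess


def tidy_command(html_path):
    """Build an HTML Tidy call that cleans the file in place (-m), quietly (-q)."""
    return ["tidy", "-q", "-m", str(html_path)]


def lynx_dump_command(html_path):
    """Build a Lynx call that writes the rendered page to stdout as plain text."""
    # -dump renders the page and exits; -nolist omits the trailing list of links.
    return ["lynx", "-dump", "-nolist", str(html_path)]


def html_to_text(html_path, text_path):
    """Clean a Word-exported HTML file, then dump it to a plain text file."""
    for tool in ("tidy", "lynx"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")

    # Assumption: tidy exits with 1 when it only has warnings, so anything
    # above 1 is treated as a failure instead of using check=True.
    cleaned = subprocess.run(tidy_command(html_path))
    if cleaned.returncode > 1:
        raise RuntimeError("tidy could not repair the HTML")

    dumped = subprocess.run(
        lynx_dump_command(html_path), capture_output=True, text=True, check=True
    )
    with open(text_path, "w", encoding="utf-8") as out:
        out.write(dumped.stdout)
```

Calling `html_to_text("report.html", "report.txt")` would overwrite `report.html` with the cleaned version, so work on a copy of the file Word produced.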