TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Subject: Re: Search Engines and PDF Files
From: Max Wyss <prodok -at- prodok -dot- ch>
To: "Becky Roberts" <becky_bee -at- hotmail -dot- com>
Date: Sun, 8 Oct 2000 10:19:42 +0200
Becky,
You will have to distinguish between the two server environments you
will be using: a file server environment (CD-ROM, hard disk, mounted
volumes, etc.) and a web server environment (intranet and Internet,
served over HTTP).
The following comments apply mainly to documents distributed in PDF
format (I don't know, or care much anymore, about the RoboHelp
format).
In the file server environment, the indexing tool already comes with
the full version of Acrobat. I am speaking of Catalog, which creates
indexes for groups of documents. To use these indexes, your users
must have the Search plug-in installed. It is standard in the full
Acrobat versions and also in the Reader, unless someone is too cheap
and downloads only the crippled Reader (without Search) instead of
the somewhat bigger Reader+Search. As you distribute the documents on
CD, this is no problem, because you can then provide the right
version of the Reader.
To create these indexes, you start Catalog, specify the index and
its parameters (name of index, description, folders to process,
folders to exclude, etc.) and then let Catalog do its work. This
should be pretty simple and straightforward.
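To make the "folders to process, folders to exclude" idea concrete,
here is a minimal sketch in Python. It is not how Catalog works
internally; it only illustrates selecting the PDF files an indexer
would see. The function name collect_pdfs and its parameters are
made up for this example.

```python
import os

def collect_pdfs(include_dirs, exclude_dirs=()):
    """Return sorted paths of PDF files under include_dirs, skipping exclude_dirs.

    Hypothetical helper: it mimics the include/exclude folder settings of an
    index definition, not Acrobat Catalog itself.
    """
    excluded = {os.path.abspath(d) for d in exclude_dirs}
    found = []
    for root_dir in include_dirs:
        for current, subdirs, files in os.walk(os.path.abspath(root_dir)):
            # Prune excluded folders in place so os.walk does not descend into them.
            subdirs[:] = [d for d in subdirs
                          if os.path.join(current, d) not in excluded]
            for name in files:
                if name.lower().endswith(".pdf"):
                    found.append(os.path.join(current, name))
    return sorted(found)
```

Calling collect_pdfs(["docs"], exclude_dirs=["docs/drafts"]) would
list every PDF below docs except those under docs/drafts.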
In the web server environment, on the other hand, you will need an
indexer and search engine that supports PDF. There are a few around,
ranging from very expensive but powerful (such as the Verity search
engine, which is also used by Acrobat Catalog, BTW) to open source
and still useful. If you have a MessyNT server, Adobe even provides
a free add-on for IIS 4 that adds this capability.
You can find an overview of search engines that handle PDF on
PlanetPDF (http://www.planetpdf.com). Note that the search engine is
_server_ based. This will affect your choice.
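For readers wondering what "indexer and search engine" means in
practice, here is a rough sketch in Python of the general idea: the
indexer builds a word-to-document table once, and the search engine
answers queries from that table on the server. It assumes the text
has already been extracted from each PDF (that step is not shown),
and build_index and search are names invented for this example, not
part of Verity or any other product mentioned above.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    docs is a dict of {document name: already-extracted plain text}.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Return sorted names of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        # Keep only documents that also contain this word.
        hits &= index.get(word, set())
    return sorted(hits)
```

The index is built once per document set, so each query is only a
few set lookups, however many documents there are.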
Hope this helps.
Max Wyss
PRODOK Engineering
Low Paper workflows, Smart documents, PDF forms
CH-8906 Bonstetten, Switzerland
Anybody out there using a search tool that they like to search within
a documentation set delivered online and on CD-ROM? I have a rather
large doc set, all in either PDF or RoboHelp, and need cross-document
search capability. Thanks for any tips on what to look at or
info on what works for you.