HTML Conversion: My way (the hard way)

Subject: HTML Conversion: My way (the hard way)
From: "Chris Hansen @ Work" <chris -at- KIVA -dot- COM>
Date: Wed, 19 Jun 1996 08:19:03 -0700

Hello Shannon.

On Tuesday, you wrote:

I need to establish a procedure for converting large numbers of documents
created with various software packages, e.g., Word, PageMaker, QuarkXpress,
Excel, Lotus, FrameMaker, IslandWrite, to HTML for our Internet site.

The FrameMaker converters I have used (WebWorks and the FrameMaker 5 HTML
Converter) are not trivial to use. Is Internet Assistant still the choice
for Word to HTML? How about PageMaker and QuarkXpress to HTML converters?

Instead of using these HTML converters, has anyone thought of using
anything like a translator to divide the document into chunks of text, add
hypertext links, and basic HTML tags for headlines, paragraphs?

Well, I don't know if I can be of much help, but let me tell you what we did
over the last three weeks and perhaps it will be of some assistance.

As our software moved from a character-based to a graphical interface-based
product, our online documentation had to move with it. Our software is written
in Oracle Forms, so the character online documentation was simply a textual
conversion of the printed documentation for each form, approximately 415 forms
between the 8 different modules offered from our company. The original
documentation was written in WordPerfect 5.1 for MS-DOS, so a couple of nasty
macros stripped the documentation to a fairly readable ASCII file which was then
loaded into the database and displayed in an Edit field from within the
application.

To make the online documentation cross-platform compatible, I stuck my neck out
and recommended HTML to the boss... he was happy to let me be in the 'hot seat'.

After checking out several different conversion applications from word
processor to HTML, I decided to take a different route. The plug-in for Word
(Internet Assistant) or the one for WordPerfect converted the documents to HTML
but the formatting was abysmal and not at all consistent. Plus, I was in a
hurry to make a deadline. Instead, I took the WordPerfect documents (each form
has a separate document for the documentation, allowing easy revision and
updates) and stripped them of formatting with macros (wonderful things, macros)
and then appended a standard chunk of text to the beginning and end of the
document which contained a basic set of HTML coding.

The beginning text added to the document contained the HTML codes for the start
of the document plus a copyright notice in comments and META tags for
Authorship, ending with the <PRE> tag so that the initial file would be
viewable, but barely. The ending text started with a </PRE> tag and then
contained a standard address and then the HTML codes for the end of the
document. The entire document was then saved under the same name as the form
the documentation refers to.
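
A minimal sketch of that wrapping step, written in Python rather than as
WordPerfect macros. It is an illustration only, not the macros actually used:
the function names, the copyright placeholder, the default author and address
values, and the ".htm" extension are all assumptions made for the example.

```python
import html


def wrap_as_html(form_name, body_text, author="Documentation Team",
                 address="Example Company"):
    """Wrap stripped plain text in a minimal HTML shell, body inside <PRE>."""
    # Standard chunk for the start of every file: document start,
    # copyright comment, authorship META tag, then the opening <PRE>.
    header = (
        "<HTML>\n"
        "<HEAD>\n"
        "<!-- Copyright notice goes here (placeholder) -->\n"
        f'<META NAME="Author" CONTENT="{html.escape(author)}">\n'
        f"<TITLE>{html.escape(form_name)}</TITLE>\n"
        "</HEAD>\n"
        "<BODY>\n"
        "<PRE>\n"
    )
    # Standard chunk for the end of every file: close <PRE>, address, end.
    footer = (
        "</PRE>\n"
        f"<ADDRESS>{html.escape(address)}</ADDRESS>\n"
        "</BODY>\n"
        "</HTML>\n"
    )
    # Escape &, < and > so stray characters in the text are not read as tags.
    return header + html.escape(body_text, quote=False) + "\n" + footer


def output_name(form_name):
    """File name for a form's documentation (the extension is assumed)."""
    return form_name + ".htm"
```

The result is viewable, but barely, exactly as described: one preformatted
block that still needs real tags added by hand.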

The programmers modified the help function to call a Visual Basic
extension? (*.VBX) which Oracle allows in the Forms 4.5 product on all
platforms (as far as I know). This VBX takes an HTML document and displays it
formatted (HTML 2.0) in the form. When the user presses the help key, the
form calls the VBX and passes it the form name so that the correct
documentation is loaded.
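
The lookup that naming scheme makes possible can be sketched as below. This is
not the Oracle Forms or VBX code, which is not shown in this post; it is a
hypothetical Python illustration, and the function name, the lower-casing of
the form name, and the ".htm" extension are assumptions.

```python
from pathlib import Path


def help_file_for_form(form_name, doc_dir):
    """Return the help file path for a form, or None if no file exists."""
    # One file per form, named after the form itself (assumed convention).
    candidate = Path(doc_dir) / (form_name.strip().lower() + ".htm")
    return candidate if candidate.is_file() else None
```

Because each file carries the form's own name, no lookup table has to be
maintained: a new form only needs a new file.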

Then, my assistant and I hashed out a format for the documentation, since the
new HTML files had no formatting codes whatsoever. I then set him to opening
each document in a great shareware tool called Aardvark Pro and had him add the
tags according to our agreed-upon format. While this was a bit tedious for
him, it allowed me to design the index, glossary and related pages without much
worry about consistency. For 400+ forms (i.e., documents), this took about 5 days
and they look quite good. If you want, I'll send you our template. We stayed
with HTML 2.0 tags since our VBX will only handle that level, but that still
allowed us plenty of flexibility in formatting.

The conversion tools didn't let us decide how we wanted to format the
documents. Each document has at least one section of field definitions, if not
more, as well as other minor formatting problems. Microsoft's Internet
Assistant kept trying to put our definitions into tables, or it would tag some
definitions as definitions and others as blockquotes. Very tiresome.
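
The consistency at stake can be sketched like this: every field definition
goes into one HTML 2.0 definition list (DL, DT, DD), never a table or a
blockquote. The tagging described in this post was done by hand in an editor,
so the function name and the (field name, description) input format below are
assumptions made for the example.

```python
import html


def definitions_to_dl(fields):
    """Render (field name, description) pairs as a single HTML 2.0 <DL>."""
    lines = ["<DL>"]
    for name, description in fields:
        # HTML 2.0 allows the closing tags of DT and DD to be omitted.
        lines.append("<DT>" + html.escape(name, quote=False))
        lines.append("<DD>" + html.escape(description, quote=False))
    lines.append("</DL>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up field names, for illustration only.
    print(definitions_to_dl([("Order No", "Unique order number"),
                             ("Qty", "Must be > 0")]))
```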

By using a code-based HTML editor, we were able to produce complete HTML
documents that can be used with Netscape, our internal VBX, Internet Explorer,
Mosaic, or any other browser and keep most if not all of the formatting intact.
Important, since that was one of the conditions from my boss.

I don't know if this helps, but there is my two cents' worth, plus inflation.

chris
the rambling-on-and-on one

p.s. For all of you on the list, this next sentence is OFF-TOPIC. I just
returned from seeing the touring company's performance of 'The Who's Tommy' and
it was simply amazing. The stage design and transitions, music, performances
and adaptation were stunning, to say the least. I was enthralled. See it if
you get the chance. Now, back to your regularly scheduled discussion.
--------------------------------------------------
Flying saucers are real, the Air Force doesn't exist.
-----------------------------
Contact Chris Hansen at: chansen -at- kiva -dot- com
tchansen -at- xmission -dot- com
http:\\www.xmission.com\~tchansen

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Post Message: TECHWR-L -at- LISTSERV -dot- OKSTATE -dot- EDU
Get Commands: LISTSERV -at- LISTSERV -dot- OKSTATE -dot- EDU with "help" in body.
Unsubscribe: LISTSERV -at- LISTSERV -dot- OKSTATE -dot- EDU with "signoff TECHWR-L"
Listowner: ejray -at- ionet -dot- net

