Re: FM+SGML and multilanguage publication

Subject: Re: FM+SGML and multilanguage publication
From: Dan Emory <danemory -at- primenet -dot- com>
To: "Carla Martinek" <carla_martinek -at- baxter -dot- com>, "Michal Kadlec" <kadlec -at- nex -dot- sk>, "FrameSGML List" <FrameSGML -at- onelist -dot- com>, Free Framers <framers -at- omsys -dot- com>, Techwrl-l <techwr-l -at- lists -dot- raycomm -dot- com>
Date: Tue, 2 May 2000 10:43:31 -0700 (MST)

Michal Kadlec asked how a bilingual publication (Slovak and English),
presently in side-by-side disconnected columns in FrameMaker, could
successfully be transformed into a structured document in FM+SGML so that it
could be roundtripped to/from SGML.

Disconnected text flows definitely will not work, for the following reasons:

1. Each new text flow must begin with an element that is valid at the
highest level.

2. On export to SGML, FM+SGML can only export the information in a single
text flow.

Carla Martinek's suggestion of using sidehead and body text columns of equal
width would eliminate the disconnected text flow problem.

Assuming you use Carla's approach:

1. In the EDD/DTD, you'll need to create a wrapper element for each of the
two languages (e.g., English_Only, Slovak_Only).

2. At each alternation in language, the applicable wrapper would be
inserted. The structure rules for these two wrappers would be identical, and
would include all of the elements that are common to both languages.

3. Let's assume that one of these common elements is named Para. The EDD
format rules for element Para would include rules such as this:

If context is: < * Slovak_Only
Specify formatting for Slovak in sidehead column
Else
Specify formatting for English in body column

You make similar format rules for each element that contains text.
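
As a rough sketch of steps 1 and 2, the corresponding DTD declarations might
look something like the following. Only English_Only, Slovak_Only, and Para
come from the discussion above; the Doc root element, the Heading element,
and the %common; parameter entity are hypothetical placeholders, and the
sketch assumes your SGML declaration permits names of this length and form.

```sgml
<!-- Hypothetical: the elements common to both languages -->
<!ENTITY % common "Heading | Para">

<!-- Hypothetical root: wrappers alternate at each change of language -->
<!ELEMENT Doc          - - (English_Only | Slovak_Only)+>

<!-- Both wrappers share one content model -->
<!ELEMENT English_Only - - (%common;)+>
<!ELEMENT Slovak_Only  - - (%common;)+>

<!ELEMENT Heading      - - (#PCDATA)>
<!ELEMENT Para         - - (#PCDATA)>
```

Because the two wrappers share one content model, the only thing that
distinguishes them is the context they supply to format rules such as the
one for Para.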

So far, so good. Now comes the sticking point.

As discussed in an earlier posting of mine, FM and FM+SGML both have a
half-dozen or so undisplayable locked code points, some of which are for
characters in the Slovak language. Here's how to solve this problem:

1. You'll need to have available some font (let's call it Slovak) in which
those missing characters occupy code points that are not locked (find
unlocked upper ASCII code points that are not needed in either language).

2. Now, to insert these characters, you must create, in both the EDD and the
DTD, a special element (let's call it Fix_The_Bug). In the EDD, define this
element as a text range container, and, in this element's format rules,
specify a character tag (let's call it Fix_Bug) which in your template will
specify the special Slovak font described in step 1.

3. For each of the missing characters, you must modify the read/write rules
in the isolat2.rw file (located in directory isoents) as follows:

entity "entname" is fm char CODE in "Fix_Bug";

Where:
entname is the name of the ISO LATIN 2 PUBLIC entity for the character.

CODE is the hexadecimal code (0xyy, where yy is the code value) for the
character in the font named Slovak.

Fix_Bug is the name of the character tag described in step 2.

4. In file isoall.rw (also in directory isoents), you will find the top two
lines to be as follows:

#include "isolat1.rw"
#include "isolat2.rw"

Edit the file so that the first two lines are as follows:

#include "isolat2.rw"
#include "isolat1.rw"

And save the modified file under the same name.

5. Be sure the first line in your main read/write rules file is:

#include "isoall.rw"

6. Create the DTD from the EDD. Then, add to the DTD declarations and
invocations of all the ISO PUBLIC entity sets that you are using. As a
minimum, this should include ISO LATIN1 and ISO LATIN2.
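
For step 6, the declarations and invocations added to the DTD would
typically look like the sketch below. The public identifiers are the
standard ISO 8879 ones; how they resolve to the actual entity files depends
on how your SGML application maps them, which is not shown here.

```sgml
<!-- Declare each ISO entity set, then invoke it -->
<!ENTITY % ISOlat1 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN">
%ISOlat1;

<!ENTITY % ISOlat2 PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN">
%ISOlat2;
```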

Now, having done all these things, when you export SGML, the proper entity
references will be written out, and, on import to FM+SGML, these entity
references should be converted to the correct character within each
Fix_The_Bug element.

However, there is a bug in FM+SGML 5.5.6, which I discovered a year ago and
which Adobe confirmed as a new bug. The problem arises from the fact that some
of the code points used in isolat2.rw are also used in isolat1.rw and several
other files. Step 4 should give isolat2.rw preference over all others when the
same code point appears in more than one place. Unfortunately, the bug
causes the order of precedence specified in the modified isoall.rw file
to be ignored. In other words, FM+SGML appears to have the order of
precedence hard-wired. I have tried every possible trick to eliminate this
bug, and short of massive editing of the isoxxx.rw and isoxxx.ent files,
there is no solution. It might be worthwhile to inquire whether this bug has
been fixed in FM+SGML 6.0. If it hasn't been fixed (which would be
unconscionable), you should demand that Adobe provide you with a viable
workaround.

The entire problem with locked code points and their negative impact on
language translations could have been eliminated if Adobe had made
FM+SGML 6.0 a Unicode-compliant product with a fully compliant XML
import/export capability. Then, by using a multi-language Unicode font in
FM+SGML, you could seamlessly type any character in any language, and
round-trip to/from XML, where entity references for characters are
unnecessary except for characters that are also used in the concrete syntax.








====================
| Nullius in Verba |
====================
Dan Emory, Dan Emory & Associates
FrameMaker/FrameMaker+SGML Document Design & Database Publishing
Voice/Fax: 949-722-8971 E-Mail: danemory -at- primenet -dot- com
10044 Adams Ave. #208, Huntington Beach, CA 92646
---Subscribe to the "Free Framers" list by sending a message to
majordomo -at- omsys -dot- com with "subscribe framers" (no quotes) in the body.




