TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
Tokenizing is what programmers call it when a bit of software breaks
text up into chunks (tokens) that the software can deal with.
In HTML, whitespace (spaces, tabs, and line breaks) only separates
tokens; it isn't displayed literally. For example, when a browser looks
at
"this is
some text"
it sees four words ("this", "is", "some", "text") and displays them with
single spaces in between: "this is some text".
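For anyone who wants to see the idea in code, here is a rough sketch in
Python. It is not how any real browser is written (browsers run a full HTML
parser, and Python's idea of whitespace is a little broader than HTML's);
the function names tokenize and render are made up for illustration. It
only shows the "split on whitespace, then rejoin with single spaces" step
described above.

```python
def tokenize(text):
    """Break text into tokens, treating any run of whitespace
    (spaces, tabs, line breaks) as a single separator."""
    # str.split() with no arguments splits on runs of whitespace
    # and discards leading and trailing whitespace.
    return text.split()


def render(text):
    """Rejoin the tokens with single spaces, roughly the way a
    browser displays ordinary (non-preformatted) text."""
    return " ".join(tokenize(text))


source = "this   is\n\tsome text"
print(tokenize(source))  # ['this', 'is', 'some', 'text']
print(render(source))    # this is some text
```

However many spaces, tabs, or line breaks sit between the words in the
source, the output is the same, which is why the spacing after a period
does not matter in hand-written HTML.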
In the days before WYSIWYG document composition software, tech writers
who did their own layout had to deal with stuff like that.
But now it's way off topic (except for those of us who write HTML by
hand), and while I'm willing to discuss it off-list with anyone
interested, I won't post any more about it.
Mike Huber
mike -dot- huber -at- software -dot- rockwell -dot- com
>-----Original Message-----
>From: Mike Collier - SSG [SMTP:MikeCol -at- SBSERVICES -dot- COM]
>Sent: Wednesday, July 02, 1997 11:35 AM
>To: TECHWR-L -at- LISTSERV -dot- OKSTATE -dot- EDU
>Subject: Tokenization (Was: RE: Stop It! Was: Spacing after a period...
>
>On behalf of myself and the one or two other people who don't know, I'd
>like an explanation of tokenization of text streams...
>
>P.S. I did check the archives and couldn't find anything.
>
>Michael Collier
>mikecol -at- sbservices -dot- com
>
>..who, after seeing a reply to this, will be able to ask
>"So, what part of 'the browser tokenizes the text stream' don't you
>understand?"
>
>>One of the things I like about HTML is that the number of spaces you put
>>between sentences does not matter. The browser tokenizes the text stream
>
TECHWR-L (Technical Communication) List Information: To send a message
to 2500+ readers, e-mail to TECHWR-L -at- LISTSERV -dot- OKSTATE -dot- EDU. Send commands
to LISTSERV -at- LISTSERV -dot- OKSTATE -dot- EDU (e.g. HELP or SIGNOFF TECHWR-L).
Search the archives at http://www.documentation.com/ or search and
browse the archives at http://listserv.okstate.edu/archives/techwr-l.html