TechWhirl (TECHWR-L) is a resource for technical writing and technical communications professionals of all experience levels and in all industries to share their experiences and acquire information.
For two decades, technical communicators have turned to TechWhirl to ask and answer questions about the always-changing world of technical communications, such as tools, skills, career paths, methodologies, and emerging industries. The TechWhirl Archives and magazine, created for, by and about technical writers, offer a wealth of knowledge to everyone with an interest in any aspect of technical communications.
Subject: Re: Tool to Analyze Text for Possible Snippets
From: Jack DeLand <jackdeland -at- adamcharlesconsulting -dot- com>
To: Paul Hanson <twer_lists_all -at- hotmail -dot- com>
Date: Thu, 12 Apr 2018 18:49:42 -0400
Meh. Switch to Flare and Analyzer.
On Thu, Apr 12, 2018 at 3:59 PM, Paul Hanson <twer_lists_all -at- hotmail -dot- com>
wrote:
> Hi,
>
> I am looking at 8 different Word documents. The end game for these
> documents is to import them into my HAT (RoboHelp 2015) and maintain them
> in HTML. No problem - I know how to do all that.
>
> What I want to pick your brains about is how to determine the frequency of
> the duplicated text. I know there is duplicate text across the documents
> because I took the 8 Word documents, inserted each into a single Word
> document, stripped out the graphics, and sorted the paragraphs.
>
> I ended up with 280 sentences.
>
> Sure, I can visually scan the list and find a sentence like this - "Create
> and confirm a 4-digit Citrix PIN." - and see that it exists twice. I know I
> could paste the list of 280 sentences into Excel and remove the rows that
> are duplicated - that's NOT what I'm looking for.
>
> Instead, I'm looking for something close to this site:
> https://www.online-utility.org/text/analyzer.jsp, BUT I want to know how
> many times a sentence exists. For example, I pasted in the 280 sentences
> and the site came back with this information:
> |
> Some top phrases containing 8 words (without punctuation marks) | Occurrences
> configure secure hub configure secure hub configure secure | 4
> |
> However, that text is the following text:
> |
> Configure Secure Hub
> Configure Secure Hub
> Configure Secure Hub
> Configure Secure Hub
> Configure Secure Hub
> Configure Secure Hub
> |
> So what I want to do is paste in the 280 sentences and get a report that
> "Configure Secure Hub" appears 6 times in the list of 280.
>
> Have you found an easy way to do this?
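
One way to get the per-sentence count described above is a short script rather than an online analyzer. The following is a minimal sketch, assuming the 280 sentences are available as plain text with one sentence per line. The function name count_repeats and the sample text are made up for illustration. Matching is exact and case-sensitive after whitespace is collapsed.

```python
from collections import Counter


def count_repeats(text, min_count=2):
    """Return (sentence, count) pairs for lines that occur at least min_count times."""
    # Treat each non-blank line as one sentence and collapse runs of whitespace
    # so that stray spaces or tabs do not hide a duplicate.
    lines = (" ".join(line.split()) for line in text.splitlines())
    counts = Counter(line for line in lines if line)
    # most_common() sorts by count, highest first.
    return [(sentence, n) for sentence, n in counts.most_common() if n >= min_count]


# Hypothetical sample standing in for the pasted list of sentences.
sample = "\n".join(
    ["Configure Secure Hub"] * 6
    + ["Create and confirm a 4-digit Citrix PIN."] * 2
    + ["Install the app."]
)

for sentence, n in count_repeats(sample):
    print(f"{n}\t{sentence}")
```

With the real list, the text could be read from a file saved out of Word as plain text. Sentences that differ only in case or punctuation would be counted separately unless they are normalized first.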
>
> The next step, after I figure out how to get the list of duplicated text,
> is to generate .hts files (snippet files that RoboHelp recognizes) so that
> I can analyze the text outside of RoboHelp, create the .hts files, import
> the snippets into RoboHelp and then run find and replace actions to replace
> "Configure Secure Hub" with the reference to the snippet that will store
> the "Configure Secure Hub" text. I know how to create the snippet file,
> using a DOS command to "Copy [template.hts file] [name of snippet file]"
> but have yet to figure out how to get the actual text I want to store in
> the snippet INTO the snippet without manually pasting the text - Configure
> Secure Hub - into the snippet... but that's after I figure out how to analyze
> the text automatically to know that "Configure Secure Hub" is repeated 6
> times in the 280 sentences.
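
The copy-the-template step could also fill in the text at the same time. The following is a rough sketch, not a description of how RoboHelp itself works: it assumes the template.hts file is a text file that has been edited by hand to contain a placeholder token. The token {{SNIPPET_TEXT}}, the UTF-8 encoding, the file-naming scheme, and the function names are all assumptions for illustration. The demo writes a stand-in template to a temporary folder so it runs on its own.

```python
import html
import re
import tempfile
from pathlib import Path

# Hypothetical token placed in the template by hand; not a RoboHelp feature.
PLACEHOLDER = "{{SNIPPET_TEXT}}"


def slugify(text):
    """Turn snippet text into a safe file name stem."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


def write_snippets(template_path, out_dir, sentences):
    """Write one .hts file per sentence by filling the template placeholder."""
    # Encoding is assumed to be UTF-8; adjust if the real template differs.
    template = Path(template_path).read_text(encoding="utf-8")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sentence in sentences:
        target = out_dir / f"{slugify(sentence)}.hts"
        # Escape &, <, > and quotes so the text is safe inside markup.
        body = template.replace(PLACEHOLDER, html.escape(sentence))
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written


# Demo with a stand-in template; a real template.hts would come from RoboHelp.
demo_dir = Path(tempfile.mkdtemp())
template_file = demo_dir / "template.hts"
template_file.write_text(
    "<html><body><p>{{SNIPPET_TEXT}}</p></body></html>", encoding="utf-8"
)
created = write_snippets(template_file, demo_dir / "snippets", ["Configure Secure Hub"])
print(created[0].name)
```

The list of sentences passed in could be the output of whatever duplicate count is used, so that only text repeated often enough gets a snippet file.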
>
> Paul Hanson
> My blog: http://prhmusic.blogspot.com/
> Me Playing Drums: http://prhmusic.blogspot.com/p/videos-of-me-playing-drums.html
> Twitter: @prhmusic
>
>
>
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Visit TechWhirl for the latest on content technology, content strategy and
> content development | http://techwhirl.com
>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
Adam Charles Consulting
adamcharlesconsulting.com