[guispeak] Announcing CHM2TXT

  • From: Jamal Mazrui <empower@xxxxxxxxx>
  • To: guispeak@xxxxxxxxxxxxx, program-l@xxxxxxxxxxxxx, programming@xxxxxxxxxxxxxxxxxxxx
  • Date: Thu, 16 Aug 2007 17:15:30 -0400 (EDT)

Now available at

I hope this utility increases access to information stored in CHM
archives.  It is explained below.



Version 1.0
August 16, 2007
Copyright 2007 by Jamal Mazrui
Modified GPL License

Description

CHM2TXT (chm2txt.exe) is a command line utility for Windows 98 and above
that converts a file from Compiled HTML format (.chm) to structured text
(.txt).  The CHM format, which combines multiple HTML and graphics files,
is commonly used for software documentation, e.g., what is displayed by
pressing F1.  The usual help viewing program, however, can be challenging
to search globally or to read continuously.  A single, structured text
file provides an alternative in such cases.  CHM2TXT is a free, open
source program that seeks to fill an observed need of many users.  One
current limitation is that topics are ordered alphabetically, rather than
according to the outline view of the CHM file.

The command line syntax of CHM2TXT is as follows:
chm2txt "SourceFile.chm" "TargetFile.txt"

A file name should include a leading path -- either absolute or relative
-- if the file is not located in the current directory.  Quotes around a
file name may be omitted if it does not contain a space character.  If the
target is omitted, the output file is named like the source except for its
extension.  Status messages are displayed on the console (via standard
output) during the conversion process.
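
For readers who want to call CHM2TXT from a script, the syntax above might
be wrapped as in the following Python sketch.  It is an illustration, not
part of the distribution: the function names are hypothetical, it assumes
chm2txt.exe can be found on the PATH, and the meaning of the exit code is
an assumption, since it is not documented here.

```python
import subprocess
from pathlib import Path


def default_target(source):
    """Return the source path with its extension replaced by .txt,
    mirroring the documented behavior when the target is omitted."""
    return Path(source).with_suffix(".txt")


def build_command(source, target=None, exe="chm2txt"):
    """Build the argument list for one conversion.

    Passing a list to subprocess avoids manual quoting, even when a
    path contains spaces.
    """
    command = [exe, str(source)]
    if target is not None:
        command.append(str(target))
    return command


def convert(source, target=None, exe="chm2txt"):
    """Run one conversion; return the exit code and the status messages
    captured from standard output."""
    result = subprocess.run(
        build_command(source, target, exe),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

Calling convert("My Help.chm") would run the program with only the source
argument, leaving the choice of target name to CHM2TXT itself.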

The chm2txt.exe executable may be copied to and run from any directory.
The program creates a workspace in a subdirectory of the user's temporary
directory.  Batch files or other applications may invoke CHM2TXT in order
to convert multiple files with a single command, or to provide a graphical
user interface for specifying source and target files.  For example, such
capabilities are included in the EdSharp editor available at
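
As a rough illustration of converting multiple files with a single
command, the Python sketch below plans one chm2txt invocation per source
file.  The function name plan_batch and its parameters are hypothetical,
and this is not how EdSharp implements the feature.

```python
from pathlib import Path


def plan_batch(sources, target_dir=None, exe="chm2txt"):
    """Return one chm2txt argument list per source file.

    Each target is named like its source with a .txt extension.  It is
    placed in target_dir if given, otherwise next to the source file.
    """
    commands = []
    for source in map(Path, sources):
        out_dir = Path(target_dir) if target_dir is not None else source.parent
        target = out_dir / (source.stem + ".txt")
        commands.append([exe, str(source), str(target)])
    return commands
```

A caller could pass Path("docs").glob("*.chm") as the sources and hand each
resulting list to subprocess.run, assuming chm2txt.exe is on the PATH.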

The text file produced by CHM2TXT observes a few conventions that
facilitate navigation in editors that implement the "Homer editor
interface."  Besides EdSharp, TextPal is another such application,
available at

A structured text document is divided into sections separated by a
character sequence consisting of a hard page break followed by a line
break (ASCII codes 12, 13, and 10, that is, form feed, carriage return,
and line feed).  The first section is the table of contents, and the
remaining sections are the body.  Each topic name in the contents is also
a section heading in the body.
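
The layout just described can be read back with a few lines of Python.
This is a minimal sketch under two assumptions not stated above: that the
contents section lists one topic name per line, and that the file is read
with newline translation disabled so the carriage returns survive.

```python
SECTION_BREAK = "\f\r\n"  # ASCII 12, 13, 10


def parse_structured_text(text):
    """Split a structured text document into (topics, body_sections).

    The first section is treated as the table of contents, with one
    topic name per line (an assumption).  Remaining sections are
    returned unchanged.
    """
    contents, *body = text.split(SECTION_BREAK)
    topics = [line.strip() for line in contents.splitlines() if line.strip()]
    return topics, body


def read_structured_text(path):
    """Read a file without newline translation and parse it."""
    # newline="" keeps "\r\n" intact; the default mode would turn it
    # into "\n" and the separator would no longer match.
    with open(path, newline="") as handle:
        return parse_structured_text(handle.read())
```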

Relevant Homer keys for navigation are as follows.  Press Control+PageDown
to go to the next section, or Control+PageUp for the previous one.  Press
F6 to go from a topic in the contents to its corresponding section in the
body.  Press Shift+F6 to reverse that, going from a section in the body to
its topic in the contents.  Press Control+F6 to search for a section based
on text in its topic name.  Press Alt+F6 to search for the next match.
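
To show the kind of lookup behind these keys, here is a small Python
sketch.  It is not the Homer implementation; the function names are
hypothetical, a section's heading is assumed to be its first non-blank
line, and the case-insensitive matching is likewise an assumption.

```python
def heading_of(section):
    """Return the first non-blank line of a section, stripped."""
    for line in section.splitlines():
        if line.strip():
            return line.strip()
    return ""


def section_for_topic(topic, body_sections):
    """Like F6: return the index of the body section whose heading
    equals the topic name, or None if there is no such section."""
    for index, section in enumerate(body_sections):
        if heading_of(section) == topic:
            return index
    return None


def find_section(query, body_sections, start=0):
    """Like Control+F6: return the index of the first section at or
    after start whose heading contains the query, or None.

    Calling again with start set to the previous match plus one gives
    the effect of Alt+F6 (search for the next match).
    """
    query = query.lower()
    for index in range(start, len(body_sections)):
        if query in heading_of(body_sections[index]).lower():
            return index
    return None
```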

A structured text document may also be converted to an equivalent HTML
version, with a table of contents linked to section headings.  Press
Control+H to convert the current document to HTML format.  Press Control+S
to save it to disk.  Press F5 to launch it in the default web browser.
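
One way such a conversion could work is sketched below in Python.  It is
an assumption-laden illustration rather than the editor's actual code: it
rebuilds the linked table of contents from the first line of each body
section, and the anchor names (section1, section2, ...) are invented for
the example.

```python
from html import escape

SECTION_BREAK = "\f\r\n"  # ASCII 12, 13, 10


def to_html(text, title="Document"):
    """Convert structured text to HTML with a linked table of contents."""
    # The original contents section is skipped; the list of links is
    # regenerated from the section headings instead.
    _contents, *body = text.split(SECTION_BREAK)
    toc, sections = [], []
    for number, section in enumerate(body, start=1):
        # Assumption: the first line of a section is its heading.
        heading, _, rest = section.partition("\r\n")
        anchor = f"section{number}"
        toc.append(f'<li><a href="#{anchor}">{escape(heading)}</a></li>')
        sections.append(
            f'<h2 id="{anchor}">{escape(heading)}</h2>\n'
            f"<pre>{escape(rest)}</pre>"
        )
    return "\n".join([
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        "<ul>", *toc, "</ul>",
        *sections,
        "</body>",
        "</html>",
    ])
```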

Development Notes
I developed CHM2TXT with the Perl Developer Kit 7.0 from
It incorporates Perl 5.8, as well as the libraries Text::CHM,
HTML::Stripper, and File::OldSlurp from the Comprehensive Perl Archive
Network at

The distribution archive, chm2txt.zip, contains Perl source code
(chm2txt.pl) and the batch file to compile it (compile.bat).  The code is
covered by a modified version of the GNU General Public License (GPL),
which is explained at
Essentially, software that uses the code must be open source, except that
I am willing to relax GPL conditions in a particular case if persuaded
that a greater good would result.

I welcome feedback, which helps CHM2TXT improve over time.  When reporting
a problem, the more specifics the better, including steps to reproduce it,
if possible.  If you happen to be a programmer, please consider
contributing code that fixes a problem or improves functionality.

The latest version of CHM2TXT is available at the same URL,

Jamal Mazrui

End of Document