CHM files, known as Microsoft Compressed HTML Help files, are a common format for eBooks and online documentation. They are basically a collection of HTML files stored in a compressed archive with the added benefit of an index.
Under Linux, you can view a CHM file with the xchm viewer. But sometimes that’s not enough. Suppose you want to edit, republish, or convert the CHM file into another format such as the Plucker eBook format for viewing on your Palm. To do so, you first need to extract the original HTML files from the CHM archive.
This can be done with CHMLIB (the CHM library) and its included helper application, extract_chmLib.
In Debian or Ubuntu:
$ sudo apt-get install libchm-bin
$ extract_chmLib book.chm outdir
where book.chm is the path to your CHM file and outdir is a new directory that will be created to contain the HTML extracted from the CHM file.
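If you have a whole folder of CHM files, the same command can be wrapped in a small loop. The following is only a sketch: the function name extract_all_chm is made up for this example, it assumes extract_chmLib is on your PATH, and it names each output directory after the CHM file with the .chm suffix removed.

```shell
# Extract every .chm file in a directory into its own output directory.
# Usage: extract_all_chm <directory>
# (extract_all_chm is a hypothetical helper, not part of CHMLIB.)
extract_all_chm() {
    src_dir=$1
    for chm in "$src_dir"/*.chm; do
        # If the glob matched nothing, the pattern itself is returned; skip it.
        [ -e "$chm" ] || continue
        # book.chm -> book
        outdir="${chm%.chm}"
        # Be cautious and never extract into something that already exists.
        if [ -e "$outdir" ]; then
            echo "skipping $chm: $outdir already exists" >&2
            continue
        fi
        extract_chmLib "$chm" "$outdir"
    done
}
```

For example, extract_all_chm ~/ebooks would turn ~/ebooks/book.chm into a directory ~/ebooks/book.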
On other Linux distributions, you can install it from source. First download the CHMLIB source archive from the CHMLIB website. I couldn’t get the extract_chmLib utility to compile under the latest version, 0.4, so I used version 0.37.4 instead.
$ tar xzf chmlib-0.37.4.tgz
$ cd chmlib-0.37/
$ ./configure
$ make
$ make install
$ make examples
After doing the “make examples”, you will have an executable extract_chmLib in your current directory. Here is an example of running the command with no arguments and the output it produces:
$ ./extract_chmLib
usage: ./extract_chmLib <chmfile> <outdir>
After running the utility, the HTML files extracted from your CHM file will appear in <outdir>. Unfortunately, there won’t be an “index.html” file, so you’ll have to inspect the filenames and/or their contents to find the appropriate main page or Table of Contents.
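To narrow the search, you can let find do the looking. This is a rough sketch under a few assumptions: the function name find_chm_entry_points is invented for this example, the list of page names is only a guess at common conventions, and the -iname option requires a find that supports it (GNU find does). CHM archives usually carry their table of contents in a .hhc file and their keyword index in a .hhk file, so those are listed first when they are present.

```shell
# List likely entry points in a directory of extracted CHM content.
# Usage: find_chm_entry_points <outdir>
# (find_chm_entry_points is a hypothetical helper, not part of CHMLIB.)
find_chm_entry_points() {
    dir=$1
    # .hhc = table of contents, .hhk = keyword index (HTML Help files).
    find "$dir" -type f \( -iname '*.hhc' -o -iname '*.hhk' \) | sort
    # Common main-page names. These are guesses and may not exist.
    find "$dir" -type f \( -iname 'index.htm*' -o -iname 'default.htm*' \
        -o -iname 'toc.htm*' -o -iname 'contents.htm*' \) | sort
}
```

Running find_chm_entry_points outdir prints one candidate path per line. If it prints nothing, you are back to browsing the filenames by hand.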
Now the HTML is yours to enjoy!