Re: Announcing PDF2HTM
- From: "Peter Donahue" <pdonahue1@xxxxxxxxxxxxx>
- To: <programmingblind@xxxxxxxxxxxxx>
- Date: Sun, 25 Jan 2009 12:57:33 -0600
Hello Jamal and listers,
How about a single program that would do it all? If one wants to create
a TXT or an HTML file from a PDF document, they should be able to select the
kind of translation desired from within the program. Rather than having two
separate programs, why not combine them into one? Perhaps you could
investigate offering translation from PDF into DAISY, MP3 audio, and
Braille as well. That would be one awesome program.
Peter Donahue
----- Original Message -----
From: "Jamal Mazrui" <empower@xxxxxxxxx>
To: <ProgrammingBlind@xxxxxxxxxxxxx>; <Program-L@xxxxxxxxxxxxx>;
<GUISpeak@xxxxxxxxxxxxx>
Sent: Sunday, January 25, 2009 12:49 PM
Subject: Announcing PDF2HTM
From the archive
http://EmpowermentZone.com/pdf2htm.zip
PDF2HTM
Version 1.0
January 25, 2009
Copyright 2009 by Jamal Mazrui
GPL License
PDF2HTM is a command-line utility that converts one or more files from PDF
to HTML format. The syntax is
pdf2htm.exe SourcePDF
where the parameter is either a file name or a wildcard spec like
*.pdf
Enclose it with quotes if it contains a space. A resulting HTML file has
the same name except for a .htm extension.
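As a rough illustration of the naming rule described above, here is a minimal sketch of how a file name or wildcard spec could be expanded and mapped to .htm output names. It is not taken from pdf2htm.py; the function names expand_source, output_name, and plan_conversions are hypothetical, and it is written for current Python rather than Python 2.5.

```python
import glob
import os


def expand_source(spec):
    """Return the sorted list of existing files matching a name or wildcard spec."""
    # A plain file name with no wildcard characters matches itself if it exists.
    return sorted(glob.glob(spec))


def output_name(pdf_path):
    """Keep the directory and base name, but swap the extension for .htm."""
    base, _ext = os.path.splitext(pdf_path)
    return base + ".htm"


def plan_conversions(spec):
    """Pair each matching source file with the HTML file name it would produce."""
    # Hypothetical helper: it only plans the names and converts nothing.
    return [(src, output_name(src)) for src in expand_source(spec)]
```

Calling plan_conversions("*.pdf") in a folder of PDF files would list each source next to its target name, for example report.pdf paired with report.htm.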
This was built with Python 2.5 and the packages PDFMiner and py2exe. The
top-level script, pdf2htm.py, is an adaptation of the PDFMiner tool called
pdf2txt.py. The batch file, RunSetup.bat, runs the py2exe script,
setup.py, to create the stand-alone executable, pdf2htm.exe.
All aspects of the HTML format are determined by underlying PDFMiner
routines. Visual aspects such as fonts are present, but structural
aspects such as headings do not seem to be converted, unfortunately.
Other programmers interested in this project may wish to work on improving
HTML structure.
__________
View the list's information and change your settings at
http://www.freelists.org/list/programmingblind