Re: edSharp pdf converter and new lines

  • From: Jamal Mazrui <empower@xxxxxxxxx>
  • To: programmingblind@xxxxxxxxxxxxx
  • Date: Tue, 28 Sep 2010 20:38:26 -0400

In general, EdSharp does not do file conversions itself; it calls a free, external converter, either an executable or a COM server. Converters are associated with file extensions and are defined in the Import and Export sections of EdSharp.ini. If only one converter is defined for an extension, EdSharp chooses it automatically. Otherwise, EdSharp prompts with a list of converters to choose from.
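
The selection rule just described can be sketched roughly as follows. This is only an illustration in Python, not EdSharp's own code: the INI layout, the key names, and the converter names other than PdfToText are hypothetical, and the real EdSharp.ini entries may look quite different.

```python
import configparser

# Hypothetical INI layout for illustration; the real EdSharp.ini may differ.
SAMPLE_INI = """
[Import]
pdf = PdfToText
doc = ConverterA, ConverterB
"""

def converters_for(ini_text, section, extension):
    """Return the list of converter names defined for a file extension."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    key = extension.lstrip(".").lower()
    raw = parser.get(section, key, fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]

def choose_converter(candidates, prompt=None):
    """Pick automatically when there is one candidate; otherwise ask."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Several candidates: defer to a caller-supplied chooser (the "prompt").
    return prompt(candidates) if prompt else None
```

With this sample data, the .pdf extension resolves to its single converter without any prompt, while .doc would require a choice between the two hypothetical entries.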


Currently, the only converter defined for the .pdf extension is the PdfToText utility available at
http://www.foolabs.com/xpdf/download.html

Thanks for making me aware of that weakness in its conversions. Unfortunately, I have not found a free PDF conversion utility that consistently outperforms all others. I find that this one tends to do better than others in terms of inferring proper reading order, but worse than others in terms of formatting of the text.
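
For anyone who wants to experiment with PdfToText outside EdSharp, here is a minimal Python sketch. It assumes the pdftotext executable is on the PATH, and the file names are placeholders. The -layout option asks pdftotext to maintain the original physical layout; whether it helps with the line-break issue reported here is untested, and this is not necessarily how EdSharp invokes the tool.

```python
import subprocess

def pdftotext_command(pdf_path, txt_path, keep_layout=False):
    """Build the argument list for the pdftotext command-line tool."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to maintain the original physical layout.
        command.append("-layout")
    command += [pdf_path, txt_path]
    return command

def convert(pdf_path, txt_path, keep_layout=False):
    """Run pdftotext; raises CalledProcessError if the tool reports failure."""
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
```

Running convert("sample.pdf", "plain.txt") and then convert("sample.pdf", "layout.txt", keep_layout=True) gives two outputs to compare side by side.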

In terms of the PDF conversion features most important to you, you may wish to compare results with PDF2TXT, available at
http://EmpowermentZone.com/p2tsetup.exe

Try both the regular conversion, and the Extra HTML option (Alt+X).

FileDir uses yet another conversion utility called GetText, which converts other formats to text as well as PDF.
http://EmpowermentZone.com/dirsetup.exe

Jamal


On 9/27/2010 1:21 PM, Alex Hall wrote:
Pretty sure. When I open the email to which the pdf is attached in
gmail, I can download it or have google translate it to html for me.
If I choose to have it translated, the resulting page looks how I
expect, with all new lines where they should be. If I download the
file and open it in edsharp, single new lines are gone.

On 9/27/10, Homme, James <james.homme@xxxxxxxxxxxx> wrote:
Hi,
I'm not trying to defend EdSharp, but are you sure that the problem is not
with the PDF?

Thanks.

Jim

Jim Homme,
Usability Services,
Phone: 412-544-1810. Skype: jim.homme


-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx
[mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Alex Hall
Sent: Monday, September 27, 2010 12:39 PM
To: programmingblind
Subject: edSharp pdf converter and new lines

Hi all, mostly Jamal:
I am wondering if new lines in pdf files could be handled better. When
I convert a pdf with edsharp, any single return is turned into a
space. If there are two or more hard returns in a row, they are
preserved, but a single return is lost. This can get frustrating with
documents containing text to be parsed, questions to be answered, and
other types of structured or semi-structured text that are not purely
for reading. A couple of my professors love pdf files, so I get a
lot of non-reading ones. Thanks.
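
To make the reported behavior concrete, the following Python sketch mimics it: every lone newline becomes a space, while runs of two or more newlines are left alone. It only illustrates the symptom as described; it is not taken from EdSharp or from any converter.

```python
import re

def collapse_single_newlines(text):
    """Replace each lone newline with a space; keep runs of two or more."""
    # A newline counts as "lone" when no newline sits directly before or after it.
    return re.sub(r"(?<!\n)\n(?!\n)", " ", text)
```

For example, a numbered list of questions separated by single returns would come out as one long line, which is the kind of loss that makes structured text hard to work with.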

--
Have a great day,
Alex (msg sent from GMail website)
mehgcap@xxxxxxxxx; http://www.facebook.com/mehgcap
__________
View the list's information and change your settings at
//www.freelists.org/list/programmingblind


This e-mail and any attachments to it are confidential and are intended
solely for use of the individual or entity to whom they are addressed.  If
you have received this e-mail in error, please notify the sender immediately
and then delete it.  If you are not the intended recipient, you must not
keep, use, disclose, copy or distribute this e-mail without the author's
prior permission.  The views expressed in this e-mail message do not
necessarily represent the views of Highmark Inc., its subsidiaries, or
affiliates.