Re: another pdf to text problem

  • From: "InthaneElf" <inthaneelf@xxxxxxxxxxxxxx>
  • To: <programmingblind@xxxxxxxxxxxxx>
  • Date: Sun, 5 Oct 2008 00:14:02 -0700

Mind you, I said to wait an hour just "to be sure"; that was not a 
statement of how long PDF2TXT actually takes on 5 to 15 pages.

Just wanted to clarify that. Thanks for the recommendation; I'll try it.

inthane
proprietor, The Grab Bag, 
for blind computer users and programmers
http://grabbag.alacorncomputer.com
Owner: Alacorn Computer Enterprises
"own the might and majesty of a Alacorn!"
www.alacorncomputer.com
Owner: Agemtree
"merchants in fine facetted and cabochon gemstones"
www.agemtree.com
operator: Fruit Basket Demo Sight, where you can find a similar project done in 
several programming languages, along with its source code, so you can decide 
what language is right for you
http://fruitbasketdemo.alacorncomputer.com

  ----- Original Message ----- 
  From: Manish Agrawal 
  To: programmingblind@xxxxxxxxxxxxx 
  Sent: Saturday, October 04, 2008 10:20 PM
  Subject: Re: another pdf to text problem


  I use ABBYY FineReader to convert any PDFs that have accessibility turned
  off or are otherwise too long to be read using Acrobat Reader (with no
  continuous scrolling etc.).
  FineReader does a very good job of converting PDFs and has never frozen on
  me, even for documents as big as 500 pages.
  For a 15-page document, FineReader will not take more than 5 minutes, in
  contrast to the "about 1 hour" mentioned below for PDF2TXT.
  I am sure other commercial products like OmniPage are just as good.

  HTH,
  Manish



  On Fri, Oct 3, 2008 at 2:21 AM, programming <rproglock@xxxxxxx> wrote:

    Hi,

    Thanks for the help in getting the pdf file converted. Your advice worked. 
However, when the conversion was completed, the txt file was not right.

    Anyway, thanks for the help.

    Bob

      ----- Original Message ----- 
      From: InthaneElf 
      To: programmingblind@xxxxxxxxxxxxx 
      Sent: Wednesday, October 01, 2008 10:33 PM
      Subject: Re: another pdf to text problem


      Ah, then you might wish to hit F11 to get the latest copy of PDF2TXT.
      The option is in the main window: just tab across it and there will be
      a check box which says "image format". Check it and run the
      translation. If it can handle the file it will, but keep in mind that
      this is going to be a slow process. If it sits there a while, let it
      sit; for 15 pages, I'd say wait an hour. If it has not finished the
      work by then, and is not counting pages to you when you sit in the
      window for a while, then it choked on it. I have 4 scanned books here
      that it can't translate; they're too big for it *sigh*

      Of course, they're between 200 and 400 pages each, so...

      You shouldn't have much trouble with your 15-page document; I was up
      to page 40-something when P2T froze on me, both times I tried it with
      one of those books.

      good luck,
      inthane

        ----- Original Message ----- 
        From: programming 
        To: programmingblind@xxxxxxxxxxxxx 
        Sent: Wednesday, October 01, 2008 7:50 PM
        Subject: Re: another pdf to text problem


        Hi,

        Could you please tell me how to find the checkbox that turns on OCR?

        This might be a stupid question but I can't find it.

        Thanks for your help.

        Bob

          ----- Original Message ----- 
          From: InthaneElf 
          To: programmingblind@xxxxxxxxxxxxx 
          Sent: Wednesday, October 01, 2008 8:44 PM
          Subject: Re: another pdf to text problem


          Did you check the checkbox for using the OCR function in PDF2TXT
          and then try scanning it? It sounds like this is the age-old
          problem of a scanned image being used to create the .PDF instead
          of a text document, and it will require OCR to read, if it's
          possible to do at all.
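
One rough way to spot this kind of file ahead of time is to look at the properties the converter reports. The sketch below is only an illustration, assuming a "Key=Value" dump like the one quoted in the original message of this thread; the function names and the list of hint words are made up for the example and are not part of PDF2TXT.

```python
# Hypothetical helper: guess from a "Key=Value" properties dump whether a PDF
# came from a scanner and so probably holds page images rather than text.

# Illustrative keywords only (an assumption); real scanner metadata varies.
SCAN_HINTS = ("scan", "image format")


def parse_properties(dump: str) -> dict:
    """Turn 'Key=Value' lines into a dict, skipping lines without '='."""
    props = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            props[key.strip()] = value.strip().strip('"')
    return props


def looks_scanned(props: dict) -> bool:
    """Return True if Title, Subject, or Producer contains a scan hint."""
    fields = (props.get(k, "") for k in ("Title", "Subject", "Producer"))
    text = " ".join(fields).lower()
    return any(hint in text for hint in SCAN_HINTS)


sample = """\
Title=Network Scan Data
Subject=MFP Image Format
Producer=HPDFlib 1.01(MFP)
Page count=15
"""

props = parse_properties(sample)
print(looks_scanned(props))
```

A positive result only suggests that OCR will be needed; a scanned file with blank or generic metadata would slip past a check like this.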

          inthane

            ----- Original Message ----- 
            From: programming 
            To: programmingblind@xxxxxxxxxxxxx 
            Sent: Wednesday, October 01, 2008 2:41 PM
            Subject: another pdf to text problem


            Hi list,

            When I open the listed pdf file in PDF-TO-TEXT, I get the
            following message:

            " 
            Cannot convert August 2008 Beacon.pdf

            File name=C:\PDF2TXT\PDF\August 2008 Beacon.pdf

            File size=1945487

            Author=Panasonic Communications Co.,LTD.

            Title=Network Scan Data

            Subject=MFP Image Format

            Creator=HPDFlib

            Producer=HPDFlib 1.01(MFP)

            PDF version=1.2

            Page count=15

            Number of form fields=0

            User Password=No

            Master Password=No

            Printing=Fully Allowed

            Changing the Document=Allowed

            Content Copying or Extraction=Allowed

            Authoring Comments and Form Fields=Allowed

            Form Field Fill-in or Signing=Allowed

            Content Accessibility Enabled=Allowed

            Document Assembly=Allowed

            Encryption Level=Blank"




            Is there any way to read this pdf file?

            Jamel, would it be OK for me to send you the file so you can work 
with it? If so, what is your email address?


            Thanks for any help you can give me, as the file is one of my
            church's newsletters.

            Bob



  -- 
  Regards,
  Manish
  http://iaccessible.blogspot.com
