[recoll-user] Re: Recoll 1.12.0 hangs

  • From: Jean-Francois Dockes <jean-francois.dockes@xxxxxxxxxx>
  • To: recoll-user@xxxxxxxxxxxxx
  • Date: Mon, 23 Feb 2009 08:06:29 +0100

aehtytd02@xxxxxxxxxxxxxx writes:
 > 
 > It's my first time using recoll, so sorry if this is a known issue.  I'm 
 > using recoll 1.12.0 from the SuSE rpm from the website on SuSE 11.1.  I 
 > tried having recoll index a small directory mostly of PDFs as a test, and 
 > it keeps hanging on various PDFs, with 100% CPU use split between recoll 
 > and pstotext (1.9, from the SuSE build service).  I thought xpdf was used 
 > for indexing PDFs, or?  Maybe one in ten PDFs has this problem: I gave up 
 > after three of them.  The first such file can be found here: 
 > http://www.kyb.mpg.de/publications/attachments/Luxburg06_TR_%5B0%5D.pdf
 > 
 > Probably this is a pstotext bug more than a recoll bug, but how can I get 
 > around it (apart from moving every tenth file out of the directory to be 
 > indexed)?

This is not a known issue, thanks a lot for taking the time to investigate
it.

pstotext is normally not used at all for indexing PDF files. I also tried
indexing the PDF you linked to on a SuSE machine and it went through
without a problem. This makes me think that the problem is not simply
recoll/pstotext choking on some PDFs.

How do you determine which file is being processed when the indexing
hangs? The status line messages in the GUI are only indicative sample
points and can't be trusted for this.

Are there by any chance any PostScript files stored with the PDFs?

To better determine which command is actually running, you could use
something like the following in a terminal window while the indexing is
hanging:

ps awwx| egrep 'recoll|pdftotext|pstotext|awk' | grep -v grep
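
If a single snapshot is not conclusive, the same check could be repeated a
few times to see which helper process stays alive. The following is only a
minimal sketch, assuming a POSIX shell; the function names
filter_indexing_procs and watch_indexing_procs are made up for illustration
and are not part of recoll. It uses "grep -E", which is equivalent to egrep.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical helpers.

# Keep lines mentioning recoll or the text-extraction helpers,
# and drop the grep process itself from the listing.
filter_indexing_procs() {
    grep -E 'recoll|pdftotext|pstotext|awk' | grep -v grep
}

# Print a timestamped snapshot COUNT times, INTERVAL seconds apart.
# Defaults (5 snapshots, 2 seconds) are arbitrary assumptions.
watch_indexing_procs() {
    count=${1:-5}
    interval=${2:-2}
    i=0
    while [ "$i" -lt "$count" ]; do
        date '+%H:%M:%S'
        ps awwx | filter_indexing_procs
        i=$((i + 1))
        sleep "$interval"
    done
}
```

After sourcing these definitions, something like "watch_indexing_procs 10 3"
would print ten snapshots three seconds apart. A command line that shows up
unchanged in every snapshot is a likely candidate for the hanging step.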

Also, I'd recommend using recollindex, not recoll, to do the indexing. The
result is normally the same, but running a separate process rather than a
thread inside the recoll GUI simplifies troubleshooting. Use "recollindex -z"
in a terminal window.

If anything is unclear, don't hesitate to ask questions (possibly off-list
if they are not directly recoll-related; I have no way to know at this
point how familiar you are with the command line, etc.).

Thanks again for reporting the issue.
Cheers,
JF