[recoll-user] Re: Memory usage for purging

  • From: jfd@xxxxxxxxxx
  • To: recoll-user@xxxxxxxxxxxxx
  • Date: Mon, 22 Aug 2011 22:39:33 +0200

Theo Wollenleben writes:
 > I moved a directory tree containing most of my indexed files. Updating
 > the index almost doubled its size. Now there are 28.6 GB of index files
 > under the directory xapiandb. At the end of the index update, while the
 > status bar shows "Indexing in progress: Purge", the recoll process
 > starts consuming all of the available memory until swapping to disk
 > begins (recoll apparently needs more than 3 GB for purging my index). I
 > tried to let it finish but eventually killed the recoll process after a
 > few hours. Is there a way to purge the index without excessive memory
 > usage?

It is normal for the index size to double after renaming the main directory:
the renamed files are indexed as new documents before the purge phase
deletes the old data. Recoll has no concept of renaming or moving
files. In this situation, it would actually be better to remove the
xapiandb directory before reindexing (or else use recollindex -z).

But I really have no idea why the purge phase is using so much
memory. It is normally a simple loop that deletes the documents that no
longer exist, just a repeated Xapian "delete" call.
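
To make the shape of that loop concrete, here is a minimal Python sketch.
It is not Recoll's actual code and does not use the Xapian bindings:
ToyIndex is a hypothetical in-memory stand-in for the database, and the
"exists" parameter is an assumption added so the loop can be tried without
touching the disk.

```python
import os


class ToyIndex:
    """Hypothetical in-memory stand-in for a Xapian database."""

    def __init__(self):
        self._docs = {}      # docid -> path of the indexed file
        self._next_id = 1

    def add_document(self, path):
        docid = self._next_id
        self._docs[docid] = path
        self._next_id += 1
        return docid

    def delete_document(self, docid):
        del self._docs[docid]

    def items(self):
        # Return a copy so callers can delete entries while looping.
        return list(self._docs.items())

    def __len__(self):
        return len(self._docs)


def purge(index, exists=os.path.exists):
    """Delete every indexed document whose file no longer exists.

    Returns the number of documents removed.
    """
    removed = 0
    for docid, path in index.items():
        if not exists(path):
            index.delete_document(docid)   # the repeated "delete" call
            removed += 1
    return removed
```

After a directory rename, every document recorded under the old path fails
the existence check and is deleted, while the copies indexed under the new
path are kept.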

For reference, which Recoll and Xapian versions are you using?

I'd like to have a better suggestion, but the only idea that comes to
mind is to just delete the xapiandb directory and reindex. I do realize
that regenerating a dozen GB of index is no fun, but I have no other
idea about what to do.

Regards,

jf
