[recoll-user] Re: Files with multiple records

Randy Kramer writes:
 > I'm still using recoll version 1.6.1.  I wondered if you ever got around to 
 > doing any more work on handling multiple records in a single file.  
 > Something like an mbox file, but with some simpler record separator (or an
 > arbitrary record separator that I, as a user, can set).
 > 
 > If you remember, I'm using the mbox file format for my askSam / TWiki like 
 > thing, and using recoll for the search function, but something with a less 
 > cumbersome record separator would be nice.

There is no progress on this point in newer Recoll versions. Mbox is still
the only supported multi-record format. 
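
Until that changes, one possible stopgap is to keep the notes in a simpler
format and convert them to mbox before indexing. What follows is only a
sketch using Python's standard mailbox module. The "%%" separator line,
the function names and the placeholder From address are all assumptions
made up for illustration; none of this is part of Recoll.

```python
import mailbox
from email.message import EmailMessage


def split_records(text, separator="%%"):
    """Split text into records on lines that contain only the separator."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == separator:
            if current:
                records.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    if current:
        records.append("\n".join(current).strip())
    # Drop records that turned out to be empty or whitespace only.
    return [r for r in records if r]


def records_to_mbox(records, mbox_path):
    """Append each record to mbox_path as one message; return the count."""
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    try:
        for body in records:
            msg = EmailMessage()
            # Hypothetical convention: the first line doubles as the subject.
            msg["Subject"] = body.splitlines()[0]
            msg["From"] = "notes@localhost"  # placeholder address
            msg.set_content(body)
            box.add(msg)
        box.flush()
    finally:
        box.close()
    return len(records)
```

Running split_records() over the notes file and passing the result to
records_to_mbox() gives a file that the existing mbox handling should be
able to index, one document per record. Note that mailbox.mbox appends to
an existing file, so the output should be removed before regenerating it.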

This may change in the not too distant future as I am currently working on
cleaning up the core index management code in ways that could make these
kinds of things easier. 

In practice, I am removing all knowledge of how data is located or
retrieved (file system paths etc.) from the core index code, which will
now treat these elements as opaque. This is mostly because it looks
cleaner, but it also opens the way for indexers that work totally
differently from the current one (e.g. extracting data from an RDBMS or
any other kind of data store).
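
To make the idea of opaque locators concrete, here is a toy sketch. It is
not Recoll code: the class names, the "file:" and "db:" identifier
prefixes and the in-memory postings table are all hypothetical. The point
is only that the index stores identifiers without ever parsing them, so
different indexers can feed it from different kinds of stores.

```python
from dataclasses import dataclass, field


@dataclass
class CoreIndex:
    """Toy index mapping terms to opaque document identifiers."""
    postings: dict = field(default_factory=dict)

    def add_document(self, doc_id, text):
        # doc_id is stored as-is; only the indexer that produced it
        # knows how to turn it back into a file, a row, etc.
        for term in set(text.lower().split()):
            self.postings.setdefault(term, set()).add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class FileIndexer:
    """Backend whose identifiers happen to be file system paths."""

    def __init__(self, files):
        self.files = files  # path -> text, standing in for real files

    def feed(self, index):
        for path, text in self.files.items():
            index.add_document("file:" + path, text)


class TableIndexer:
    """Backend whose identifiers are table name / row key pairs."""

    def __init__(self, table, rows):
        self.table = table
        self.rows = rows  # key -> text, standing in for database rows

    def feed(self, index):
        for key, text in self.rows.items():
            index.add_document(f"db:{self.table}/{key}", text)
```

Both indexers call the same add_document() method, and a search returns
identifiers from either backend side by side.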

I am also working on exposing the core index functionality through Python,
which would probably make it easier to write such an indexer. Things go
slowly though: I am also learning Python as I go, and few people seem to
actually *need* this, so I'm mostly exploring for my own amusement :)

A Python query interface is already available. If anybody would like to
play with it, just ask. It doesn't work with the 1.10 production version
though, so this is still quite experimental.

jf

