[recoll-user] Re: Files with multiple records

On Saturday 16 August 2008 05:26 am, Jean-Francois Dockes wrote:
> There is no progress on this point in newer Recoll versions. Mbox is still
> the only supported multi-record format. 
> 
> This may change in the not too distant future as I am currently working on
> cleaning up the core index management code in ways that could make these
> kinds of things easier. 
> 
> In practice I am removing all knowledge about how data is located or
> retrieved (file system paths etc.) out of the core index code for which
> these elements will now be opaque. This is mostly because it looks cleaner,
> but it also opens the way for indexers that would work totally differently
> from the current one (e.g. extracting data from an RDBMS, or any other kind
> of data store).

Sounds good to me!  You may remember that I was at least trying to leave a 
path open to switch to an RDBMS (like MySQL)--at the present time, that seems 
less likely--even large text files (well, ~10 MB) seem to work well for me, 
and I suspect that for many things they are faster than an RDBMS--hmm, but if 
Recoll indexed those records .... ;-)

> I am also working on exposing the core index functionality through Python,
> which would probably make it easier to write such an indexer. Things go
> slowly though: I am also learning Python as I go, and few people seem to
> actually *need* this, so I'm mostly exploring for my own amusement :)

What can I do to keep you amused? ;-)

> A Python query interface is already available. If anybody would like to
> play with it, just ask. It doesn't work with the 1.10 production version
> though, so this is still quite experimental.

I might ask for that in a few months--quite a bit on my plate at the moment 
and I need to avoid distractions.

regards,
Randy Kramer
-- 
"I didn't have time to write a short letter, so I created a video 
instead."--with apologies to Cicero, et al.
