On Wed, Aug 17, 2011 at 2:43 AM, <jfd@xxxxxxxxxx> wrote:

> Mike Roark writes:
> > Hi,
>
> Hello, and sorry it took so long, I was away on holidays.
>
> > I'm having trouble registering my custom filter with recoll. I
> > wrote a filter for rar files using rclzip as a template (incredibly easy
> > using python rarfile module, literally just a substitution from 'zip' to
> > 'rar'). It seems to work ok at command prompt...
> >
> > I'm using the debian recoll 1.15.9-1 package from testing.
> >
> > I added the following line in ~/.recoll/mimemap:
> >
> > .rar = application/x-rar
> >
> > I added the following line in ~/.recoll/mimeconf
> >
> > application/x-rar = execm rclrar
> >
> > I put the rclrar script in /usr/share/recoll/filters, with same
> > perms as rclzip:
> >
> > [mike@timetraveler ~]$ ls -l /usr/share/recoll/filters/rcl{zip,rar}
> > -rwxr-xr-x 1 root root 3503 Aug 7 12:25 /usr/share/recoll/filters/rclrar
> > -rwxr-xr-x 1 root root 3503 Jun 15 00:17 /usr/share/recoll/filters/rclzip
> >
> > I set my loglevel = 6 in recoll.conf, and I have a small testdir
> > with a rar file and a zip file in it which I'm indexing. I run
> > recollindex -z > recoll.log 2>&1 to reindex...
> >
> > I expect to see some mention of rclrar in the log file. For
> > comparison I see stuff like this for rclzip and for my zip file:
> >
> > :4:../rcldb/rcldb.cpp:1215:Db::needUpdate:yes (new): [Q/home/mike/tmp/testindex/puppet.zip|]
> > :5:../index/fsindexer.cpp:360:processone: processing: [5 MB ] /home/mike/tmp/testindex/puppet.zip
> > :4:../internfile/internfile.cpp:224:FileInterner:: [/home/mike/tmp/testindex/puppet.zip] mime [(null)] preview 0
> > :4:../internfile/internfile.cpp:298:FileInterner:: init ok application/zip [/home/mike/tmp/testindex/puppet.zip]
> > :4:../internfile/internfile.cpp:767:FileInterner::internfile. ipath []
> > :4:../internfile/mh_execm.cpp:142:MimeHandlerExecMultiple::next_document(): [/home/mike/tmp/testindex/puppet.zip]
> > :4:../internfile/mh_execm.cpp:42:MimeHandlerExecMultiple::startCmd
> > :4:../utils/execmd.cpp:185:ExecCmd::startExec: (1|1) /usr/share/recoll/filters/rclzip
> > :4:../internfile/mh_execm.cpp:214:MHExecMultiple: got ipath [Apress.Pro.Puppet.May.2011.pdf]
> > ...
> >
> > However, I see no mention of rclrar and no ipaths getting found.
> > The rar file seems to only be indexed by filename (which I would expect
> > with no filter), since I cannot search on any of the content of the pdf
> > file inside of it... (log output for the rar file below)...
> >
> > Please let me know if you see anything wrong with my approach... I
> > feel I'm missing something obvious.
>
> What you do looks quite ok.
>
> Did you add an [index] section at the top of your mimeconf though ? Lines
> like "application/x-rar = execm rclrar" should be inside an "index" section
> (there are other sections in mimeconf, to define what icon to use, how to
> group doc types into broader genres etc.)
>
> Otherwise I can see really no reason why this wouldn't work, I'm quite
> confident that rar will soon be a supported type :)
>
> Cheers,
>
> jf

That was exactly the problem, and the script seems to work. Added it as an
attachment here:
https://bitbucket.org/medoc/recoll/issue/52/requests-for-additional-format-support
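
For anyone hitting the same thing: the fix was to put the handler line under an [index] header in ~/.recoll/mimeconf. Below is a rough Python sketch of a quick sanity check for that. It is not part of recoll and does not use recoll's own config parser. It assumes a simplified format ("[section]" headers, "name = value" lines, "#" comments), and the names find_sections, BROKEN and FIXED are made up for illustration.

```python
# Minimal sketch: report which section(s) of a recoll-style config assign a key.
# Assumption: simplified parsing only ("[section]" headers, "name = value"
# lines, "#" comments). This is NOT recoll's real parser.

def find_sections(config_text, key):
    """Return the section names in which `key` is assigned.

    A key that appears before any "[section]" header is reported
    under the empty string "".
    """
    section = ""
    found = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()  # entering a new section
            continue
        name, sep, _value = line.partition("=")
        if sep and name.strip() == key:
            found.append(section)
    return found


# The mimeconf as first described in the thread: no section header.
BROKEN = """\
application/x-rar = execm rclrar
"""

# The same line placed inside an [index] section, as suggested in the reply.
FIXED = """\
[index]
application/x-rar = execm rclrar
"""

print(find_sections(BROKEN, "application/x-rar"))  # key sits outside any section
print(find_sections(FIXED, "application/x-rar"))   # key sits inside "index"
```

If the first call is what your own mimeconf looks like (the key reported under the empty section name), the handler line is probably sitting outside [index], which matches the symptom above: the rar file indexed by filename only, with no rclrar in the log.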