[recoll-user] Re: help writing custom filter

  • From: jfd@xxxxxxxxxx
  • To: recoll-user@xxxxxxxxxxxxx
  • Date: Wed, 17 Aug 2011 09:43:58 +0200

Mike Roark writes:
 > Hi,

Hello, and sorry it took so long; I was away on holiday.

 >      I'm having trouble registering my custom filter with recoll. I 
 > wrote a filter for rar files using rclzip as a template (incredibly easy 
 > using python rarfile module, literally just a substitution from 'zip' to 
 > 'rar'). It seems to work ok at command prompt...
 > 
 >      I'm using the debian recoll 1.15.9-1 package from testing.
 > 
 >      I added the following line in ~/.recoll/mimemap:
 > 
 > .rar = application/x-rar
 > 
 >      I added the following line in ~/.recoll/mimeconf
 > 
 > application/x-rar = execm rclrar
 > 
 >     I put the rclrar script in /usr/share/recoll/filters, with same 
 > perms as rclzip:
 > 
 > [mike@timetraveler ~]$ ls -l /usr/share/recoll/filters/rcl{zip,rar}
 > -rwxr-xr-x 1 root root 3503 Aug  7 12:25 /usr/share/recoll/filters/rclrar
 > -rwxr-xr-x 1 root root 3503 Jun 15 00:17 /usr/share/recoll/filters/rclzip
 > 
 >      I set my loglevel = 6 in recoll.conf, and I have a small testdir 
 > with a rar file and a zip file in it which I'm indexing. I run 
 > recollindex -z > recoll.log 2>&1 to reindex...
 > 
 >      I expect to see some mention of rclrar in the log file. For 
 > comparison I see stuff like this for rclzip and for my zip file:
 > 
 > :4:../rcldb/rcldb.cpp:1215:Db::needUpdate:yes (new): 
 > [Q/home/mike/tmp/testindex/puppet.zip|]
 > :5:../index/fsindexer.cpp:360:processone: processing: [5 MB ] 
 > /home/mike/tmp/testindex/puppet.zip
 > :4:../internfile/internfile.cpp:224:FileInterner:: 
 > [/home/mike/tmp/testindex/puppet.zip] mime [(null)] preview 0
 > :4:../internfile/internfile.cpp:298:FileInterner:: init ok 
 > application/zip [/home/mike/tmp/testindex/puppet.zip]
 > :4:../internfile/internfile.cpp:767:FileInterner::internfile. ipath []
 > :4:../internfile/mh_execm.cpp:142:MimeHandlerExecMultiple::next_document(): 
 > [/home/mike/tmp/testindex/puppet.zip]
 > :4:../internfile/mh_execm.cpp:42:MimeHandlerExecMultiple::startCmd
 > :4:../utils/execmd.cpp:185:ExecCmd::startExec: (1|1) 
 > /usr/share/recoll/filters/rclzip
 > :4:../internfile/mh_execm.cpp:214:MHExecMultiple: got ipath 
 > [Apress.Pro.Puppet.May.2011.pdf]
 > ...
 > 
 >       However, I see no mention of rclrar and no ipaths getting found. 
 > The rar file seems to only be indexed by filename (which I would expect 
 > with no filter), since I cannot search on any of the content of the pdf 
 > file inside of it... (log output for the rar file below)...
 > 
 >      Please let me know if you see anything wrong with my approach... I 
 > feel I'm missing something obvious.

What you did looks fine.

Did you add an [index] section header at the top of your mimeconf, though?
Lines like "application/x-rar = execm rclrar" must be inside the "index"
section. mimeconf has other sections too, which define things like what icon
to use and how to group document types into broader categories.
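
For reference, a minimal ~/.recoll/mimeconf along these lines should be
enough. This assumes the file contains nothing else that you customized:

```
# Handlers used at indexing time go in the "index" section
[index]
application/x-rar = execm rclrar
```

The mimemap entry (".rar = application/x-rar") stays as you have it.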

Otherwise I really can't see any reason why this wouldn't work. I'm quite
confident that rar will soon be a supported type :)
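
If you want to check the file mechanically, here is a rough sketch. It is not
part of recoll, and the function name find_handler_sections is made up for
illustration. It assumes the plain "[section]" / "name = value" layout and
ignores backslash continuation lines. It reports the section in which a MIME
type is assigned, so an empty section name means the line sits outside any
section:

```python
def find_handler_sections(conf_text, mime_type):
    """Return the section names in which mime_type is assigned a value."""
    section = ""  # lines before any [section] header belong to no section
    found = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        name, sep, _value = line.partition("=")
        if sep and name.strip() == mime_type:
            found.append(section)
    return found


# The handler line alone, with no section header above it
broken = "application/x-rar = execm rclrar\n"
# The same line placed under an [index] header
fixed = "[index]\napplication/x-rar = execm rclrar\n"

print(find_handler_sections(broken, "application/x-rar"))
print(find_handler_sections(fixed, "application/x-rar"))
```

To run it against the real file, read ~/.recoll/mimeconf into a string and
pass that instead of the sample text.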

Cheers,

jf
