[recoll-user] Re: Using recoll to index (and search) programming code?

Randy Kramer writes:
 > So I decided to try recoll.  For whatever reason (I couldn't find an obvious 
 > reason in the recoll.conf file), it would not index the .c and .h files.  So 
 > I renamed a few of them (actually, one each) to .c.txt and .h.txt.
 > 
 > With that, the indexing works, and seems reasonably helpful so far.
 > 
 > I just wondered:
 >    * is there some reason I shouldn't be trying to index code with recoll--I 
 > mean, maybe recoll will create ridiculously big databases or something 
 > because it is code?

There is no reason not to index program code.

 >    * is there a way to get recoll to index files ending in .c and .h other 
 > than by the renaming trick?  (In the recoll.conf file, indexallfilenames is 
 > set to 1.)

indexallfilenames will only affect the indexing of file names, not file
contents. 

To index program code, you have to associate the file suffixes with a MIME
type which recoll will index. For example, in ~/.recoll/mimemap:

.c = text/plain
.h = text/plain

This would get recoll to index C code as plain text. As a consequence, the
external viewer used for C code will be the same as the one for plain text
(if you want nedit, you'd have to change this in mimeconf; I think that the
default is emacs).

Another, slightly more powerful approach would be to keep ".c = text/x-c"
and define an external filter for text/x-c, as described in the user
manual. This would allow having a different editor for plain text and C
code. The filter might also do a better job of turning C into HTML for
previewing; I imagine there are tools for doing this (I didn't actually
check, though).
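
To give an idea of what such a filter could look like, here is a minimal
sketch in Python. It is an illustration only, not something shipped with
recoll: the script name (c2html.py) is made up, and I am assuming the
filter is called with the file path as its single argument and has to
write HTML to standard output. Check the user manual for the actual filter
conventions and for how to declare the filter in mimeconf.

```python
#!/usr/bin/env python3
"""Hypothetical c2html.py: wrap a C source file in minimal HTML."""
import html
import sys


def c_to_html(source: str, title: str) -> str:
    """Return an HTML page showing the source verbatim inside <pre>."""
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'
        f"<title>{html.escape(title)}</title>"
        "</head><body><pre>"
        # Escape <, > and & so that C operators are not read as HTML markup.
        f"{html.escape(source)}"
        "</pre></body></html>\n"
    )


def main(path: str) -> None:
    # Assumption: sources are UTF-8; undecodable bytes are replaced
    # rather than making the filter fail.
    with open(path, encoding="utf-8", errors="replace") as f:
        source = f.read()
    sys.stdout.buffer.write(c_to_html(source, path).encode("utf-8"))


# Assumption: invoked as "c2html.py FILE"; does nothing otherwise.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

This only escapes the text and keeps the layout. A real C-to-HTML tool
could add syntax colouring on top of the same idea.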

Of course, Recoll has no idea of the semantic value of words in C code, so
it will not, for example, distinguish a word used as a function name from
the same word inside a literal string.

Regards,
jf

