Alexander writes:

> Hi JF,
>
> thanks again for the script! I tried it, it works and
> this is what it adds to the head tag:
>
> <head>
> <title></title>
> <meta name="Producer" content="ABBYY FineReader 8.0 Professional Edition"/>
> <meta name="CreationDate" content=""/>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <meta name="date" content="2012-02-18T00:00:00">
> </head>
>
> It seems to be picked up by recoll correctly, since I was
> subsequently able to find the docs using a keyword in combination
> with the date filter of the advanced search. :-)

Good! I am attaching the updated script because the initial one had a
few problems. While this script is specific to the date/document format
at hand, modifying it to process some other date format, or field, would
be quite easy.

> If you find the time to answer I have two further questions
> regarding the customisation of recoll.
>
> 1. Will recollindex pick up more than one date tag?
> (I was thinking of adding all contained dates to the head)

Recoll will only handle one field as the document date, meaning that
only the "date" field can be queried using the specific date/time
interval syntax.

However, it might be useful to add other dates from the document, giving
them other field names (e.g. date1, date2, etc.). They will be processed
as ordinary text fields, but if they are in YMD format (as opposed to
DMY), you should be able to do something with wildcards (just specifying
'1' or '2', the first digit of the year, as the first character will
dramatically reduce the processing time).

You'll need to add your fields to the "fields" configuration file to get
them to be indexed. See:

http://www.lesbonscomptes.com/recoll/usermanual/rcl.program.html#rcl.program.filters.html
http://www.lesbonscomptes.com/recoll/usermanual/rcl.program.fields.html
http://www.lesbonscomptes.com/recoll/usermanual/rcl.install.config.html#rcl.install.config.fields

+ comments in the fields file.
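
For illustration only (this is not the attached script), here is a
minimal Python sketch of the idea: it inserts the "date" meta tag, plus
extra date1, date2, ... tags, just before the closing </head>. The
function names and the sample dates are made up for the example, and it
assumes the dates have already been extracted and converted to YMD form.

```python
import re
from html import escape


def meta_tag(name, content):
    """Build one <meta> element, escaping both attribute values."""
    return '<meta name="%s" content="%s">' % (
        escape(name, quote=True), escape(content, quote=True))


def add_date_fields(html_text, doc_date, other_dates=()):
    """Insert a "date" meta tag, then date1, date2, ... for other dates,
    just before the first closing </head> tag (hypothetical helper)."""
    tags = [meta_tag("date", doc_date)]
    for i, value in enumerate(other_dates, start=1):
        tags.append(meta_tag("date%d" % i, value))
    block = "\n".join(tags) + "\n"
    # Only the first </head> is touched; text without one comes back unchanged.
    new_text, _count = re.subn(
        r"(?i)</head>", lambda m: block + m.group(0), html_text, count=1)
    return new_text


# Made-up sample document and dates, just to show the call.
sample = "<html><head>\n<title></title>\n</head><body>text</body></html>"
result = add_date_fields(
    sample, "2012-02-18T00:00:00", ["2011-12-24", "2012-01-05"])
print(result)
```

Only "date" would be usable with the date interval syntax; date1 and
date2 would still have to be declared in the fields file, as described
above, before they get indexed.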
I was thinking that I needed to write a tutorial, but it already exists :)
https://bitbucket.org/medoc/recoll/wiki/HandleCustomField

> 2. Does recoll take the original CreationDate from pdftotext into
> account? (I was thinking of putting the real file creation date
> reported by stat there)

No, just one date field for now. You could add the file mtime as one of
the custom fields above, but, as said, it will be processed as ordinary
text, not like a date.

Actually, it would not be impossible to modify recoll to handle multiple
date fields (but not trivial either). If other people think that this
would be an interesting feature, please speak up.

Cheers,

jf