On 4/25/12 10:39 AM, Steve Basford wrote:
>
>> I have just done some tests and bofhland_malware_URL.ndb increases my
>> database reload time by 400%.
>> Total number of sigs. loaded is not a factor in the reload time increase
[...]
> I think I might have made a little progress, after doing a few tests on
> signature variations...
>
> Can you do the following test for me...
>
> 1. Load db's as normal.. make a note of the time to load.
> 2. sed -i "s/:687474703A2F2F/:2F2F/g" bofhland_malware_URL.ndb
> 3. reload and take a note of the time.
>
> Any improvement?
>
> Simple test for me using clamscan and one database only:
>
[...]
> So, from 42s down to 2s

Re-opening an old thread.

The same problem seems to have popped up again; although Steve's
suggestion is still in place, the root cause is probably the same.

In detail: over the last few days a lot of spam has been (ab)using t.co
shortened URLs in the payload, so these are ending up in
bofhland_cracked_URL.ndb (~7K distinct URLs at the moment).
As a result, reload times for that sigfile are increasing seriously.

Last time, the cause of the issue was that every URL in the sigfile
started with "http://", and for some reason clamav didn't like that.
The workaround applied was to replace "http://" with "(B)".

It seems to me that there's something in how clamav stores these
signatures that breaks down when too many rules start with the same
prefix.

Right now, AFAICT, the problem is exactly the same, except that the
culprit prefix is now "t.co" instead of "http://".
And obviously I can't strip that part as I did last time...

Suggestions?
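
For anyone who wants to check the shared-prefix theory against their own
copy of the sigfile, here is a rough sketch in Python that counts which
leading bytes open the most signatures. It assumes the usual extended
signature layout for .ndb files (Name:TargetType:Offset:HexSignature);
the function names and the sample signatures are made up for
illustration and are not taken from the real database.

```python
from collections import Counter


def leading_prefix_counts(lines, nbytes=4):
    """Count how often each leading run of `nbytes` bytes opens a signature.

    Assumption: each line is Name:TargetType:Offset:HexSignature[:...],
    so the hex signature is the fourth colon-separated field.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 4:
            continue  # not in the assumed layout
        hexsig = fields[3]
        # Two hex digits per byte; normalise case so 2e and 2E match.
        counts[hexsig[: nbytes * 2].upper()] += 1
    return counts


def describe(prefix_hex):
    """Show a hex prefix as ASCII when possible, else return it unchanged."""
    try:
        return bytes.fromhex(prefix_hex).decode("ascii")
    except ValueError:
        # Wildcards or non-ASCII bytes: leave the raw hex as-is.
        return prefix_hex


# Hypothetical sample signatures, for illustration only.
sample = [
    "Example.Url.1:0:*:742E636F2F616263",        # t.co/abc
    "Example.Url.2:0:*:742e636f2f646566",        # t.co/def
    "Example.Url.3:0:*:6578616D706C652E636F6D",  # example.com
]

counts = leading_prefix_counts(sample)
for prefix, n in counts.most_common(5):
    print(describe(prefix), n)
```

To run it on a real file, pass the open file object instead of the
sample list, e.g. leading_prefix_counts(open("bofhland_cracked_URL.ndb")).
If one prefix accounts for most of the ~7K entries, that at least
confirms the pattern described above; it says nothing about why clamav
slows down on it.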