Hi,
i got a special kind of issue today. There is a big list of logs and we have to
return the first x (e.g. 1000) entries sorted on time, newest first.
In order to simplify the issue we just discuss one search criteria and this
will be account name.
There are three scenarios:
1.) account name = 'XXX' (simple)
2.) account_name like 'ABE%' (not simple)
3.) account_name in (..…) potentially a log list
Rather than just by rowcount we could potentially limit additionally by a
number of days in the past. One day has between 1.5 million up tp 8 million
rows.
Response time should be below 10 seconds.
There are 180 days worth of data kept online. Altogether 800 million rows.
Account_name has 2758 distinct values.
As you can see even the result for one account can be very big.
Of course for case 1.) an index (account_name, create_time desc) avoids sorting
and the answer is instand.
case 2 and 3 are not that easy.
I think I would need a 2 dimensional index like a spatial index… or can even a
text index do the trick?
(This is standard Edition.)
I guess somebody has already worked on such case.
Regards
Lothar