RE: oracle text

  • From: <Peter.Hitchman@xxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 8 Sep 2005 12:52:40 +0100

Hi,
You are asking some questions I cannot answer, as it has been a while since I 
worked with Oracle text.
I think that you will be better going with the concatenated datastore, I did 
this with the older InterMedia Text release and it worked very well. 

When it comes to the access order, you may well know better than the optimizer 
in which order to do things, but Oracle will usually do the text index search 
first and then filter the results from that using other predicates. This is 
because it see's doing the other way as less efficient, what is referred to as 
a functional lookup, for each row source being returned fire up the Oracle text 
code and see if the row should stay in the result set. But like I said you may 
know better, so gather full statistics (since 9i the text index stats are 
gathered on the fly by Oracle) and then if need be use hints.

Regards

Pete

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Veres Lajos
Sent: 07 September 2005 14:44
To: oracle-l@xxxxxxxxxxxxx
Subject: oracle text


Hello,

I have a table with 2 text columns (varchar2(80),varchar2(4000)) and about 
10 numeric colums (and some unreleated to this post).
version 10.1.0.4, linux 2.4

Our users can search with the most of combinations of text and numeric 
rows.

I checked Oracle Text, and I have some questions.

1.
There is 2 text columns, sometimes I search only from the first. (title 
and body are this columns), and sometimes from all the two.
I should set up 2 index? Or I can use multicolumn or other datastore to 
sometimes search from the first field and sometimes from all the two?
Which will be faster?
Or exists any best practice?
Maybe section grouping or similar can help?

2.
The second column (body) can contain html, but not really sure that this 
html is correct. (users can fill this fields)
And I wont search for the html entries.
Can I set up filtering to filter out HTML entries?
As I seen INSO filter skips filtering html.

3. Context index vs ctxcat
Manual says i should create a ctxcat index with all available 
combinations of queried fields.
But there is too much.
Our search form allows our users to create all combinations of queries 
for this 2+10 fields.

4. filtering, searching
Most queries should be faster if "contains" operator filters only the 
result.
We have queries like:
where contains(field,'searchstring')>0 and id1 IN (1,2,3) and id2 in (3,4) 
and id3=5

I think the query should be faster if at first oracle collects data with 
indexes from the table, and later filters only for searchstring.
id{1-10} fields are mostly indexed.
(top used, and some top used combinations at least)

Optimizer can guess similar things? Or It will start gather data everytime 
with contains and filter later?

Today I found this:
http://www.oracle.com/technology/sample_code/products/text/htdocs/concatenated_text_datastore/cdstore_readme.html
This concatenated datastore seems to be an other solution to the same problem.
Have you any experience with this?

Any help would be appreciated.

Thanks.

-- 
Veres Lajos
vlajos@xxxxxxxxxxxxxx
+36 20 438 5909
--
//www.freelists.org/webpage/oracle-l
--
//www.freelists.org/webpage/oracle-l

Other related posts: