Re: Complex CONTEXT index

  • From: Manuela Atoui <atoui@xxxxxxxxxxxxx>
  • To: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 26 Jan 2009 10:32:00 +0100

Rich Jesse wrote:

Don't you need to translate the BLOB content into indexable text before you
index it? A simple transliteration of hex values is no help; you need
something that would convert the enclosed encoded Word or PDF into real
words.

This is exactly what Oracle's Ultra Search does.  I've used it in the past
(10gR1) to power an Intranet search site that crawled Intranet sites as well
as file shares.  It indexes very well, grabbing text from every popular
format including binary MS documents, PDFs, drawings, as well as the headers
in images and video files.  But it wasn't the most stable.  I needed to
bounce it regularly, at least monthly, IIRC.

Perhaps it's better with 10gR2 or 11g, if it's still available.

HTH!  GL!

Rich




Dear All,
maybe not exactly what the OP is looking for, but why concatenate the different columns manually? You can use the MULTI_COLUMN_DATASTORE to create a virtual document for each record.
I have used it with VARCHAR2 and BLOB columns as well as CLOBs.
Please see link below for details:
http://download.oracle.com/docs/cd/B19306_01/text.102/b14218/cdatadic.htm#sthref327
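
A minimal sketch of what this could look like. The table DOCS, its columns (title, summary, attachment) and the names docs_multi_ds and docs_text_idx are made up for illustration. The FILTER attribute is as I understand it from the 10gR2 reference, so please check it against the documentation for your release:

```sql
-- Hypothetical table: DOCS(id NUMBER, title VARCHAR2, summary CLOB, attachment BLOB)
-- Needs EXECUTE on CTX_DDL (for example via the CTXAPP role).
BEGIN
  -- Datastore preference that builds one virtual document per row
  ctx_ddl.create_preference('docs_multi_ds', 'MULTI_COLUMN_DATASTORE');

  -- Columns to concatenate into the virtual document
  ctx_ddl.set_attribute('docs_multi_ds', 'COLUMNS',
                        'title, summary, attachment');

  -- One Y/N flag per column, in the same order as COLUMNS.
  -- Assumption: attachment holds binary documents (Word, PDF),
  -- so only that column is sent through the document filter.
  ctx_ddl.set_attribute('docs_multi_ds', 'FILTER', 'N,N,Y');
END;
/

-- The index is created on one column, but it covers all listed columns
CREATE INDEX docs_text_idx ON docs (title)
  INDEXTYPE IS ctxsys.context
  PARAMETERS ('DATASTORE docs_multi_ds');

-- Query through the indexed column; hits in any listed column match
SELECT id
  FROM docs
 WHERE CONTAINS(title, 'oracle') > 0;
```

One thing to keep in mind: as far as I know, the index only registers a row for re-indexing when the column named in CREATE INDEX (title here) is updated, so a change to one of the other columns may need a trigger or a dummy update on that column.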

Have a nice day
Manuela
