Re: Complex CONTEXT index

> Don't you need to translate the BLOB content into indexable text before you
> index it? A simple transliteration of hex values is no help; you need
> something that would convert the enclosed encoded Word or PDF into real
> words.

This is exactly what Oracle's Ultra Search does.  I've used it in the past
(10gR1) to power an intranet search site that crawled intranet sites as well
as file shares.  It indexes very well, extracting text from every popular
format, including binary MS documents, PDFs, and drawings, as well as the
headers in image and video files.  But it wasn't the most stable: I needed
to bounce it regularly, at least monthly, IIRC.

Perhaps it's better with 10gR2 or 11g, if it's still available.

HTH!  GL!

Rich


--
http://www.freelists.org/webpage/oracle-l

