Re: Saving MS Office Documents into an Oracle Database?

  • From: Mladen Gogala <gogala.mladen@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 11 Jun 2019 20:10:08 -0400

I would be very careful with that. File systems are not made to contain hundreds of thousands of file entries. Directories would become extremely slow, and an ordinary "ls" command could take 20 minutes. BFILE is good when you don't have gazillions of them. When there is a large quantity of documents, BLOB is a better option. However, make sure that you have tiered storage and that your BLOB tablespace goes on something cheap, like Isilon. That will work if you have tens of millions of documents. If you have hundreds of millions or even billions of documents, an alternative is to use a NoSQL database like MongoDB and keep just the record key in the Oracle database. You will also need a separate text search system in that case. I can wholeheartedly recommend Sphinx.
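
To make the sizing rule of thumb above concrete, here is a minimal Python sketch. The function name and the two numeric cutoffs are assumptions picked only to mirror the rough tiers in the paragraph (BFILE for small collections, BLOB up to tens of millions, an external store beyond that). They are not measured limits, so adjust them to your own file system and storage.

```python
# Rule-of-thumb sketch only. Both cutoffs are assumptions, not measured limits.
BFILE_MAX_DOCS = 100_000        # assumed: stay below "hundreds of thousands" of files
BLOB_MAX_DOCS = 100_000_000     # assumed: BLOBs are fine up to "tens of millions"


def suggest_storage(doc_count: int) -> str:
    """Return a rough storage suggestion for a given number of documents."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    if doc_count < BFILE_MAX_DOCS:
        return "BFILE"
    if doc_count < BLOB_MAX_DOCS:
        return "BLOB on tiered storage"
    return "external document store, key in Oracle"


if __name__ == "__main__":
    for n in (5_000, 2_000_000, 3_000_000_000):
        print(f"{n:>13,} documents -> {suggest_storage(n)}")
```

The point is only that the decision depends on the order of magnitude of the document count, not on the exact numbers.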

However, be aware that we're now talking about a Library of Congress-sized document store. I know of no company with such an immense collection of documents. If there is one, then it becomes a political problem: what are these documents about, do they contain personal data, and what is the purpose of such a collection? Only the almighty Google knows everything about everybody.


On 6/11/19 2:45 PM, Sayan Malakshinov wrote:

You can use BFILEs (i.e., store just links to files in an Oracle directory, not whole files in Oracle datafiles) or BLOBs for that.
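
The difference between the two approaches can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module as a stand-in, not Oracle, and the table and column names (docs_link, docs_inline, file_path, body) are made up for the example. The "link" table keeps only the file's location, which is the BFILE idea, while the "inline" table keeps the bytes themselves, which is the BLOB idea.

```python
import os
import sqlite3
import tempfile
from typing import Tuple


def demo() -> Tuple[bytes, bytes]:
    """Store one document both ways and read it back through each route."""
    payload = b"pretend this is a .docx file"
    with tempfile.TemporaryDirectory() as tmp:
        # The document lives on the file system for the link-style table.
        path = os.path.join(tmp, "report.docx")
        with open(path, "wb") as fh:
            fh.write(payload)

        conn = sqlite3.connect(":memory:")
        # Link style (BFILE-like): the row holds only the file location.
        conn.execute(
            "CREATE TABLE docs_link (id INTEGER PRIMARY KEY, file_path TEXT)"
        )
        # Inline style (BLOB-like): the row holds the document bytes.
        conn.execute(
            "CREATE TABLE docs_inline (id INTEGER PRIMARY KEY, body BLOB)"
        )
        conn.execute("INSERT INTO docs_link VALUES (1, ?)", (path,))
        conn.execute("INSERT INTO docs_inline VALUES (1, ?)", (payload,))

        # Link route: fetch the path, then read the file outside the database.
        (stored_path,) = conn.execute(
            "SELECT file_path FROM docs_link WHERE id = 1"
        ).fetchone()
        with open(stored_path, "rb") as fh:
            via_link = fh.read()

        # Inline route: the database returns the bytes directly.
        (via_blob,) = conn.execute(
            "SELECT body FROM docs_inline WHERE id = 1"
        ).fetchone()
        conn.close()
    return via_link, via_blob


if __name__ == "__main__":
    link_bytes, blob_bytes = demo()
    print(link_bytes == blob_bytes)
```

With the link style, the file can be moved or deleted behind the database's back; with the inline style, the bytes are stored and backed up together with the row.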

Btw, Oracle Text supports full-text indexes for MS Office documents such as Excel and Word :)

On Tue, Jun 11, 2019 at 9:42 PM Jeremy Ovenden <Jeremy@xxxxxxxxxxxxx> wrote:

    Heh thanks for reinforcing your view with a useful image.

    Why would you store anything in a database? Presumably for
    centralised, secure access to data.

    Many thanks


Best regards,
Sayan Malakshinov
Oracle performance tuning engineer
Oracle ACE Associate

Mladen Gogala
Database Consultant
Tel: (347) 321-1217
