RE: Choosing data file size for a multi TB database?

  • From: "Parker, Matthew" <matthewp@xxxxxxxxxx>
  • To: <zoran_martic@xxxxxxxxx>, <BranimirP@xxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Mon, 5 Sep 2005 23:35:27 -0700

In Oracle a large number of datafiles do affect you administratively and in 

In a 22TB database that has +4500 files. A simple query such as 
select tablespace_name,sum(bytes)/1024/1024 from dba_segments group by 

takes 5 minutes and that is not indicative of a large number of objects, it is 
indicative of the freespace bitmap walk across the datafiles (locally managed 
tablespaces). Under some circumstances or patch versions it can take much 

Also in examination of this database even under low load the time to check 
point is 15 minutes and many times overlays the next log switch. Someday Oracle 
will parallelize the CKPT process, similar to how it is done on recovery.

As to Tim's point about a Java application with 42 million files and 
directories, the Java app would be non-performant if it had to update all 42 
million files/directories on a regular basis.

For you 10TB database 16GB files will give you a database with about 650 files. 
This still makes the files managable from a size perspective for 
backup/recovery or other types of movement requirements.

As to database backup verification we could talk for hours about backup 
scenarios and backup methodolgies, but in it's simplest terms, this comes down 
to a question I normally ask in interviews of DBA's. How do you test a backup 
of a database when you do not have sufficient space to hold a secondary copy of 
the database? I never accept trust the rust as an answer. The real answer of 
course is simple, you perform recovery of a part of the database and you walk 
through all the parts. It can be done, it simply takes time. 

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Zoran Martic
Sent: Saturday, September 03, 2005 7:47 AM
To: BranimirP@xxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Choosing data file size for a multi TB database?

The checkpointing heavily depends on the multiple DBWR
having the good I/O subsystem behind that can survive
the I/O throughput usually measured in random I/O
operations then throughput. Usually does not matter is
it 50 or 1000 files, it I believe more depends on the
physical I/O layout and how the disk I/O is handling
fast delivery of I/O operations at the backend, then
is it 100 or 1000 files in te database.

The huge database are using various ways to protect
itself. Who has money usually have some crazy disk
based backup strategy that is anyway the fastest
backup and recovery. Usually all very important
databases are also covered by good disaster standby or
somekind of replication.


--- Branimir Petrovic <BranimirP@xxxxxxxx> wrote:

> What about checkpoint against tens of thousands of
> data files, surely 
> more-merrier rule holds? For that reason (or due to
> a fear factor) I 
> was under may be false impression that smaller
> number (in hundreds) 
> of relatively larger data files (20 GB or so) might
> be better choice. 
> Other very real problem with 10TB database I can
> easily foresee, but 
> for which I do not know proper solution, is how
> would one go about the
> business of regular verification of taped backup
> sets? Have another 
> humongous hardware just for that purpose? Fully
> trust the rust? (i.e. 
> examine backup logs and never try restoring, or...)
> What do people 
> do to ensure multi TB monster databases are surely
> and truly safe 
> and restorable/rebuildable?
> Branimir
> -----Original Message-----
> From: Tim Gorman [mailto:tim@xxxxxxxxx]
> Sent: Friday, September 02, 2005 5:59 PM
> To: oracle-l@xxxxxxxxxxxxx
> Subject: Re: Choosing data file size for a multi TB
> database?
> Datafile sizing has the greatest regular impact on
> backups and restores.
> Given a large multi-processor server with 16 tape
> drives available, which
> would do a full backup or full restore fastest?
> *     a 10-Tbyte database comprised of two 5-Tbyte
> datafiles 
> *     a 10-Tbyte database comprised of ten 1-Tbyte
> datafiles 
> *     a 10-Tbyte database comprised of two-hundred
> 50-Gbyte datafiles? 
> *     a 10-Tbyte database comprised of two-thousand
> 5-Gbyte datafiles?
> Be sure to consider what type of backup media are
> you using, how much
> concurrency will you be using, and the throughput of
> each device?
> There is nothing "unmanageable" about hundreds or
> thousands of datafiles;
> don't know why that's cited as a concern.  Oracle8.0
> and above has a
> limitation on 65,535 datafiles per tablespace, but
> otherwise large numbers
> of files are not something to be concerned about. 
> Heck, the average
> distribution of a Java-based application is
> comprised of 42 million
> directories and files and nobody ever worries about
> it...

Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 

Other related posts: