Re: Insert into partitioned tables and specifying the partition name

  • From: amonte <ax.mount@xxxxxxxxx>
  • To: Greg Rahn <greg@xxxxxxxxxxxxxxxxxx>
  • Date: Sat, 10 Apr 2010 13:47:10 +0200

Hello

I have tested using PDML.
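Roughly the pattern I used (object names changed, degree illustrative):

```sql
-- Sketch only: hypothetical table names, direct-path parallel insert with PDML.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t, 8) */ INTO fact_sales t
SELECT * FROM stage_sales;

COMMIT;  -- PDML requires a commit (or rollback) before the table can be queried again
```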

Yes, the tablespace where I observed contention is a bigfile tablespace with
uniform extents of 10MB each. The one I created is a normal smallfile
tablespace with 8 datafiles.
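For reference, the bigfile tablespace was created along these lines (names,
path and sizes illustrative, not the actual DDL):

```sql
-- Illustrative DDL for the bigfile tablespace described above.
CREATE BIGFILE TABLESPACE dwh_big
  DATAFILE '/u01/oradata/DWH/dwh_big01.dbf' SIZE 10G
  AUTOEXTEND ON NEXT 100M
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 10M;
```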

When inserting into the bigfile tablespace I constantly observe waits such as
gc buffer busy and buffer busy waits on block 2 of the bigfile, and a single
wait can last as long as 60 seconds (v$session.SECONDS_IN_WAIT).

After checking, the datafile autoextend setting looks like it could be causing
the problems I am seeing, since the autoextend increment was set to only 100M.
I have just changed it to 512M and will see how it affects performance.
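For a bigfile tablespace the increment can be changed at the tablespace level,
since there is only one datafile; for the smallfile one it has to be set per
datafile (names and paths illustrative):

```sql
-- Bigfile tablespace: AUTOEXTEND can be set on the tablespace itself.
ALTER TABLESPACE dwh_big AUTOEXTEND ON NEXT 512M;

-- Smallfile tablespace: set the increment on each datafile.
ALTER DATABASE DATAFILE '/u01/oradata/DWH/dwh_small01.dbf'
  AUTOEXTEND ON NEXT 512M;
```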

Finally, is bigfile really a good idea in 10g?

Thank you


2010/4/10 Greg Rahn <greg@xxxxxxxxxxxxxxxxxx>

> The coordinator should not be doing writes, assuming you are using
> PDML.  Are you?
> alter session enable parallel dml;
>
> If you are seeing waits on block 2 then the issue is space allocation
> - your initial/next extents are too small (maybe the default of 64k
> for ASSM tablespaces?).  If you have multiple parallel loads going on
> at the same time then the extent allocation rate could be quite high.
> I'd recommend using 8MB for initial/next in these cases (assuming the
> partitions end up with more than 8MB).  This should drastically reduce
> the number of extents that get allocated and thus reduce the locking
> on block 2.  Are all these partitioned tables in a single bigfile
> tablespace?
>
> On Fri, Apr 9, 2010 at 5:23 PM, amonte <ax.mount@xxxxxxxxx> wrote:
> > Hello
> >
> > It is 10.2.0.4 RAC so no SQL Monitor :-(
> >
> > I just noticed that when the slaves are waiting on "PX Deq Credit: send
> > blkd" it is because the coordinator is blocked on events such as
> >
> > db file parallel write
> > controlfile parallel write
> > library cache lock
> >
> > Not sure why, when the first two events kick in, the whole system freezes
> > until dbwr or ckpt finishes writing. I say freeze because I query
> > v$sess_io every second, and while these two events appear there is no
> > increase in consistent gets or physical reads in v$sess_io.
> >
> > I did some tests just now with the 25 million row table. If I create the
> > table in a tablespace where other fact tables reside and run the load on
> > node 1, and some fact tables residing in the same tablespace are being
> > loaded at the same time on another node, my load is slow: I constantly
> > see gc waits such as gc current block request on block 2 of the datafile
> > (isn't that the bitmap block?). In this case the load into the
> > partitioned table without the partition clause can take as long as 40
> > minutes (also because of the waits mentioned previously).
> >
> > If I create a new tablespace for my table only, the load is pretty fast;
> > it consistently finishes in around 8 minutes.
> >
> > TIA
> >
> >
> >
> > 2010/4/10 Greg Rahn <greg@xxxxxxxxxxxxxxxxxx>
> >>
> >> First off, what is the performance data telling you when you compare
> >> the 40 minute to the 15 minute runs?  SQL Monitor would be my first
> >> recommendation (if on 11g) else ASH.
> >> What version are you on?
> >>
> >> Inserting into an unpartitioned table may be slightly faster than a
> >> partitioned table, but it should not be that significant.  It also
> >> depends on a few things but ultimately it comes down to the parallel
> >> execution plan you are getting.  This can obviously change
> >> depending on whether the source table is partitioned and whether
> >> the partitioning schemes match (equipartitioned).
> >>
> >> PX Deq Credit: send blkd means that the "writer/consumer" PX servers
> >> are not writing fast enough and the "reader/producer" PX servers are
> >> having to wait to send them rows.  That's not enough to say
> >> partitioning is the problem.
> >>
> >> You can use partition extended syntax to insert into a specific
> >> partition, but in your case you would need 3 sessions (or
> >> iterations) because there can only be one target partition (you
> >> cannot use partition extended syntax for 3 partitions at once).
> >>
> >>
> >> On Fri, Apr 9, 2010 at 4:03 PM, amonte <ax.mount@xxxxxxxxx> wrote:
> >> > I wonder if it is general practice in a DWH, when loading data into
> >> > partitioned tables using INSERT .. SELECT, to specify the partition
> >> > name in the INSERT?
> >> >
> >> > We have seen performance instabilities when we do INSERT .. SELECT
> >> > without specifying the partition name. It is normal that there is an
> >> > extra step to calculate which partition each row goes to, but
> >> > sometimes it is just too slow. For example, when inserting 25 million
> >> > rows into a heap table we get a constant 4 minutes. When inserting
> >> > the same number of rows into a 500-partition table (the data only
> >> > goes to 3 partitions) it sometimes runs in 15 minutes and sometimes
> >> > in 40. In the 40-minute case we observe the parallel slaves' physical
> >> > reads increasing much more slowly (event PX Deq Credit: send blkd).
> >> >
> >> > When specifying the partition name we get stability as well; not as
> >> > fast as the heap table, but at least the time is predictable. So I
> >> > wonder if others specify the partition name when doing mass loading?
> >>
> >> --
> >> Regards,
> >> Greg Rahn
> >> http://structureddata.org
> >
> >
>
>
>
> --
> Regards,
> Greg Rahn
> http://structureddata.org
>
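PS: for completeness, the partition extended syntax Greg describes looks like
this, one statement per target partition (table, partition and column names
hypothetical):

```sql
-- Hypothetical names: direct-path load of the rows for one known partition.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(f, 8) */
INTO fact_sales PARTITION (p_2010_04) f
SELECT *
FROM stage_sales s
WHERE s.sale_date >= DATE '2010-04-01'
  AND s.sale_date <  DATE '2010-05-01';

COMMIT;
```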
