Re: Curiosity: single-column index on sparse data cannot be built in parallel

From: Charles Schultz <sacrophyte@xxxxxxxxx>
To: Jonathan Lewis <jonathan@xxxxxxxxxxxxxxxxxx>
Date: Tue, 14 Jul 2015 19:42:19 -0500

One more observation; when I create the said index in a midst of other
indexes on the same table (all in a series, one after another), the index
does not seem to be built in parallel. However, after waiting several
hours, if I drop and rebuild that index in isolation it does in fact use
parallel slaves (and in just over 4 minutes to boot!). This gives me
another angle to investigate; my gut reaction says the TEMP tablespace
group might have some contention.

More later,

On Tue, Jul 14, 2015 at 7:05 PM, Charles Schultz <sacrophyte@xxxxxxxxx>
wrote:

I didn't check $pq_tqstat, but I show results when I next do the test. I
say it was not running in parallel because only one session was active via
Enterprise Manager (yeah, I was being lazy, I know) and the creation took
over 1 hour the first test, 51 minutes the next test, whereas all my other
indexes (mix of local and non-partitioned) on the same table were taking
less than 20 minutes (some less than 10).

Jonathan, can you elaborate a little more on the "interesting anomaly"? :)
It is entirely possible I am hitting that, but would like to learn more.

As far as parallel 100, I was just trying to open the doors to see where
my bottlenecks are. The extent sizes are included in the index ddl (5m);
again, I don't have issues with the other indexes, but there is something
about this being on a single column, so you may be on the right track with
large extents and small volumes of data per slave. You are making me
curious now. :) I love the anthromorphic angle on "refusing to cooperate".
*grin*

PS - in case it is not obvious, this is a vendor supplied table and index.
PS2 - tests currently executing on Sun T5440, for those that care about
such things.
PS3 - as to "why"; I got a lot of indexes to build in a short time, so I
am looking for the bounds. :)

On Tue, Jul 14, 2015 at 4:12 PM, Jonathan Lewis <
jonathan@xxxxxxxxxxxxxxxxxx> wrote:

I'm not going to try for 500M rows, but I just built a table with 10M
rows with your pattern of data and ran a parallel create index.

There is an interesting anomaly to the create index (11.2.0.4) that might
make the build look like a serial build - how are you showing that the
build isn't parallel ?
I queried v$pq_tqstat after the build.

Parallel 100 seems a little optimistic for such a "small" index - 17M
rows at about 13 bytes per row gives about 220MB or 2MB per slave to build.
What's your default extent size - perhaps there's something about large
extents and small volumes of data per slave that makes Oracle refuse to
co-operate.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
@jloracle
------------------------------
*From:* oracle-l-bounce@xxxxxxxxxxxxx [oracle-l-bounce@xxxxxxxxxxxxx] on
behalf of Charles Schultz [sacrophyte@xxxxxxxxx]
*Sent:* 14 July 2015 21:59
*To:* ORACLE-L
*Subject:* Curiosity: single-column index on sparse data cannot be built
in parallel

Good day,

I am trying to find the technical reason for why Oracle cannot use
parallel slaves to build a single-column index on a sparse column with few
distinct values:

F COUNT(*)
- ----------
538937561
Y 51464
N 17486819

Just by playing around, I discovered that if I put this column as the
leading edge on an index with many columns, it can be built in parallel.

According to the relevant documentation
<http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel007.htm#i1009131>,
I would expect that the base table is sampled, but perhaps the random
sampling returns 0 keys due to the sparse nature of the column?

Facts:
Oracle Enterprise Edition 11.2.0.4
table is partitioned, but not on this key (lol)
parallel_max_servers=3600

ddl extracted via datapump:
CREATE INDEX "FIMSMGR"."FGBTRND_ENCD_INDEX" ON "FIMSMGR"."FGBTRND"
("FGBTRND_DEFER_GRANT_IND")
PCTFREE 10 INITRANS 2 MAXTRANS 255
STORAGE(INITIAL 5242880 NEXT 5242880 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "FIN_LARGE_INDX" PARALLEL 100 ;

--
Charles Schultz

--
Charles Schultz

--
Charles Schultz

Follow-Ups:
- RE: Curiosity: single-column index on sparse data cannot be built in parallel
  - From: Jonathan Lewis

References:
- Curiosity: single-column index on sparse data cannot be built in parallel
  - From: Charles Schultz
- RE: Curiosity: single-column index on sparse data cannot be built in parallel
  - From: Jonathan Lewis
- Re: Curiosity: single-column index on sparse data cannot be built in parallel
  - From: Charles Schultz

Re: Curiosity: single-column index on sparse data cannot be built in parallel

Other related posts: