RE: Table with ~255+ columns

From: "Mark W. Farnham" <mwf@xxxxxxxx>
To: <mwf@xxxxxxxx>, <jlewisoracle@xxxxxxxxx>, <loknath.73@xxxxxxxxx>
Date: Tue, 17 Nov 2020 10:34:40 -0500

By the way, the covering view method of handling huge numbers of columns was
the primary reason for the implementation of “insertable views” as an
operational method in Oracle. IF you use the splitting tables with covering
view method to improve the performance of high column count design abortio…,
er, “architectures” make sure you don’t violate any of the rules for the view
to be insertable. Otherwise it is an odds on favorite to break your application.

mwf

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Mark W. Farnham
Sent: Tuesday, November 17, 2020 9:45 AM
To: jlewisoracle@xxxxxxxxx; loknath.73@xxxxxxxxx
Cc: 'Oracle L'
Subject: RE: Table with ~255+ columns

AND… not disagreeing, but a useful technique in splitting into two tables is to
use a covering view that appears to be exactly the same to applications.

Usually useful:

1)    All the natural keys in the “most often used” columns table “A” of the
split

2)    A unique, indexed key from table “A” to table “B” as an equijoin in the
view (If not obvious, the index is on table “B” and this key is a new column,
not referenced in any application for the exclusive use of the covering view.
100% of the time by experience if this index is allowed to be used by other
than the view, additional columns are added to it and the covering view
performance make your system worse than following the continued rows.)

Re-read what JL wrote, and if you do experiment with the split tables
technique, carefully consider all the applications queries and the texture of
your data before you engineer a few variants of splitting to test.

The potential upside is enormous, but re-read what JL wrote.

I HAVE seen very large numbers of columns where only 20 or so were actually
used in the queries of the applications and all the rest were “just in case”
dynamically used user defined columns of an application suite. That observation
directly leads to the question, if you are using a purchased application, “What
happens if the vendor adds a column in an upgrade or patch.”

Good luck.

PS: Re-read what JL wrote. Really: one more time.

mwf

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On ;
Behalf Of Jonathan Lewis
Sent: Tuesday, November 17, 2020 5:26 AM
To: loknath.73@xxxxxxxxx
Cc: Oracle L
Subject: Re: Table with ~255+ columns

Lok P,

There are so many details about that "continued rows" statistic that have to be
tested by version it's not really safe to give you a straight answer to "how do
you work out the impact"

In your version I think the statistic is only incremented if the continued
fetch has to visit another block, (in earlier versions you the count would go
up even if all the pieces were in the same block) so one measure would be to
compare the number of table fetch continued rows with the number of session
logical I/Os - because (if I'm right) the each fetch continued row will also be
a session logical I/O. If most of the instance CPU is spent in SQL, and you
aren't doing a huge number of tablescans (compare rows fetched by tablescan
with rows fetched by rowid) then as a rough guide the count of continued row
fetches compared to session logical I/O is an indication of CPU used because of
rows that are spread across two or more blocks.

This is a very rough guide, and you will need to check whether my memory of
counting for your version is correct. Even if the tests show that the above is
technically correct, it doesn't mean this table is the major culprit since (as
you've already noted) you can get row migration and row chaining without
breaking the 255 limit.

Something to bear in mind when pursuing this issue - it takes a lot of
developer and dba time in coding and testing to split a table into two sets of
columns so even if it's theoretically the right thing to do it may not be
sensible to do it.  Possibly the better option is to see if a one-off rebuild,
and some simpler changes in coding and configuration could do some effective
damage limitation.  (e.g. a common cause of row migration is having a bad
setting of pctfree, and code that inserts row stubs and then updates them to
make them grow).

If you've got a clone of the production system, running a few static analyses
of the data to what its storage patterns look like might be a starting point to
highlight potential waste of resources and strategies for addressing them. e.g.

Compare table HWM with space required to hold the rows in the table.  If
intra-row chaining has one nasty things then there might be a lot of waste
space all through the table. This can be interpreted not only as a waste of
space but also as an indication of excess costs of updates and queries.  Write
code to calculate the row length for a row (sum of column length + overheads)
and get a histogram of row lengths - anything other than a simple bell curve is
suggestive of "insert stub and grow". Write code to draw a picture of null and
not null columns for a row (i.e. a string of 270 bytes with "-" for used
column, " " for blank column" and see where the unused columns are - you may
find a column rebuild with a change in column order will make most rows use
less than the critical 255.

Apart from anything else, a consideration for your developers is that if they
add 2 columns and then update a lot of history to put values in those two
columns the impact on the table will probably be a disaster as Oracle will try
to rewrite those columns on the update and may do terrible things with chaining
AND migration. The effect MUST be tested carefully before implementation.

Regards

Jonathan lewis

On Mon, 16 Nov 2020 at 21:14, Lok P <loknath.73@xxxxxxxxx> wrote:

We are using version 11.2.0.4 of oracle. We have a table having ~270 columns.
But considering the rows chaining is inevitable after exceeding the ~255
columns limit(i also see in the below blog stating the same) we are asking the
team to not add additional new columns and to plan for dropping those existing
columns such that the total number of columns will be restricted within ~255.
But the team is asking to measure the current overhead on the queries as we
already have ~15more columns in the table exceeding the ~255 column limit. Is
there a way to measure the overhead of row chaining for that table in
production in terms of DB time ?

References:
- Table with ~255+ columns
  - From: Lok P
- Re: Table with ~255+ columns
  - From: Jonathan Lewis
- RE: Table with ~255+ columns
  - From: Mark W. Farnham

RE: Table with ~255+ columns

Other related posts: