Re: DWH varchar2(4000)

  • From: Tim Gorman <tim@xxxxxxxxx>
  • To: ricard.martinez@xxxxxxxxxxxx, oracle-l@xxxxxxxxxxxxx
  • Date: Mon, 22 Dec 2014 09:50:14 -0700

Ricard,

My apologies for my misreading your question (and for misspelling your name)...

There is no impact on space consumption to the practice of using VARCHAR2(4000). A 10-byte string within a VARCHAR2(4000) column takes up the same amount of space as a 10-byte string in a VARCHAR2(10) column or a VARCHAR2(30) column.

Of course, since the precision is often used by ad-hoc tools and IDEs for default display settings and input validation, this capability goes out the window, but the designers have probably taken that into account :-) , and do not anticipate ad-hoc queries against this DWH, which I think is an extremely questionable assumption.

Hope this helps...

-Tim



On 12/22/14 9:21, ricard.martinez@xxxxxxxxxxxx wrote:

Hi Tim,

Thanks for that, but i think i do not explain myself properly.

They are using varchar2 (4000) on the varchar2 columns.

On the other columns they are using what they need (date, timestamp,clob, etc,etc), so in this case the datatype conversion does not concern me really.

There are just using 4000 for a simple code description (AA, AB, etc) just in case this code description increases in future. Now apply this to all varchar2 on all the tables.

Hope is more clear now J

Anyway thanks again for your reply.

Regards

*From:*oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Tim Gorman
*Sent:* 22 December 2014 16:06
*To:* oracle-l@xxxxxxxxxxxxx
*Subject:* Re: DWH varchar2(4000)

Richard,

Space is not an issue; a VARCHAR2 uses only as much storage as needed for the data. The precision of 4000 is only a constraint for stored values and a hint to display programs; it is not like the right-padded CHAR datatype.

What is the issue is datatype conversion. Besides implicit sorting semantics for numeric and date/datetime datatypes which will be eliminated, the Oracle optimizer in particular relies on the semantics of the datatype to understand data cardinality. For example, two dates stored as DATE datatypes with values of 28-FEB-2014 and 01-MAR-2014 are known to be "adjacent" values, while values of 28-FEB-2016 and 01-MAR-2016 are known not to be adjacent. This may not seem like a big deal, but it really is, particularly for DWH databases which need to make the best decisions possible for an execution plan for long-running SQL statements.

Oracle is not a NOSQL datastore and not using the appropriate datatype is tantamount to crippling the Oracle optimizer, which is crucial for query performance whether they think it is or not.

If the developers/data modelers are pre-provisioning "spare columns" because they are afraid of running ALTER TABLE ... ADD COLUMN, then the alternative they have chosen has absolutely zero advantages over using ALTER TABLE ... ADD COLUMN. Every possible objection they can dream up is magnified by the course they have chosen. If they simply don't want to worry about type-conversion, then they simply need to grow up and realize that data typing is important on this platform, even if it isn't on other platforms.

Bottom-line: use the correct datatype. If they can't decide, then they're doing it wrong.

Best of luck!

-Tim


On 12/22/14 8:01, Ricard Martinez wrote:

    Hi list,

    We have a DWH database on 11.2.0.4.

    The developers/data modelers are creating all the tables with
    varchar2(4000) as standard by default. They don't care if they
    just insert 10 bytes, 200  or nothing.

    When asked about why this standard is in place, the answer is
    just: "Just in case we need it".

    For my point of view this is not a good practice, as can cause
    memory waste on client applications, as well as data garbage or
    errors like ORA-01450: maximum key length (3218) exceeded.

    But as i'm not a DWH expert, maybe im missing information that
    proves that this is really a good practice, and not just a
    laziness way of working.

    I will be glad to know your opinion, or point me to any doc/test
    that proves me wrong, in order to learn about this.

    Thanks


Other related posts: