RE: How to select only columns having values..

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: "'Ric Van Dyke'" <ric.van.dyke@xxxxxxxxxx>, <mark.powell2@xxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 18 Feb 2014 12:27:22 -0500

"But why do any of that?"

 

I infer that the time to be saved is some person manipulating the data.

 

I'll bet another dozen donuts that the OP meant "about" when writing
"above."

 

My solution probably violates the "simple" requirement. I picture the user
hand copying this data into a spreadsheet.

 

The larger point is that we increasingly find delays in both fetching and
transmitting extraneous columns to client machines. Pre-empting that on the
server will often be worth the cost.

 

mwf

 

From: Ric Van Dyke [mailto:ric.van.dyke@xxxxxxxxxx] 
Sent: Tuesday, February 18, 2014 8:57 AM
To: Mark W. Farnham; mark.powell2@xxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

Roger that Mark.

 

But why do any of that?  Why not just read the entire row, really the whole
table, into some local structure in the app and parse through it looking for
what you need?  Now you hit the DB once and you're done.  No dynamic SQL, no
hacking through the data dictionary, and one parse/execute/fetch cycle.
Even using a PL/SQL block that processes each row one at a time would likely
take a couple of seconds at worst.
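
Something along these lines, say (BIG_T, SOME_FLAG and COL1..COL3 are
placeholders here; the real block would check all 200 columns):

-- Minimal sketch of the "one round trip" idea: fetch the predicated rows
-- once and print only the columns that actually hold a value.
SET SERVEROUTPUT ON
BEGIN
  FOR r IN (SELECT * FROM big_t WHERE some_flag = 'Y') LOOP
    IF r.col1 IS NOT NULL THEN dbms_output.put_line('COL1 = '||r.col1); END IF;
    IF r.col2 IS NOT NULL THEN dbms_output.put_line('COL2 = '||r.col2); END IF;
    IF r.col3 IS NOT NULL THEN dbms_output.put_line('COL3 = '||r.col3); END IF;
  END LOOP;
END;
/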

 

I agree that this part of the code shouldn't take minutes for what appears
to be a small amount of data.  Of course "above 1000 records" is vague
enough to mean just about any number of rows: 1,001?  10,000,000?

 

We don't know the grander picture of all this, so it's really hard to say
exactly what the right solution is.  I just don't see the need for such
complex SQL at this point.

 

From: Mark W. Farnham [mailto:mwf@xxxxxxxx] 
Sent: Tuesday, February 18, 2014 8:05 AM
To: Ric Van Dyke; mark.powell2@xxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

 

The OP wrote:

"If I get the data like this, It will save few minutes for the data
arrangement."

and

"I have a table with 200 columns and having above 1000 Records." 

I sincerely hope any of these methods take far less than a few minutes for
the computer to execute (an eye blink, right?), so I infer that the time to
be saved is some person manipulating the data.

If I grok'd the attachment, it relies on some non-obvious conversion of NULL
to 'N', and it delivers column name/value pairs for each column of a
returned row that is not equal to 'N'.

For me, at least, the rows under the minimum number of column headings would
be easier to digest.

Even so, the attachment operates by scanning each returned row and parsing
it to discover columns not equal to 'N' and then, for each such column
encountered, tossing the column name and value into the output.

That is only going to read the table once, but it parses 200 columns for
each returned row interpretively.

My suggestion reads the entire table twice, but only parses the 200 columns
once to generate the new SQL statement delivering the column-sparse results
as a set.

For 3 rows and 10 columns of 1000 rows by 200 columns, either way should not
take very long. Probably the person who crafted the attachment knew it was
intended to operate on a small data set, so zero of this is intended as
criticism in that person's direction.

A functional difference: only non-'N'-valued row/column pairs are delivered
by the attachment, while my suggestion would deliver all the columns for
each selected row if any of the selected rows has some non-NULL value for
that column. I'm not sure which the OP would prefer.

Given larger data sets, where we cared about the data processing performance
rather than the "data arrangement," I'd bet a dozen donuts that reading the
predicated result set for all the columns in the table once and the selected
columns a second time would significantly outperform row-by-row,
column-by-column parsing.

The double pass against the database does require some two-statement read
consistency to be bulletproof. In recent versions I suppose an AS OF clause
referencing the time of the first pass would be better than opening a
transaction.
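
Roughly like this (the SCN capture and the table, predicate, and column
names are only illustrative; the second pass simply queries the table as of
the SCN captured before the first pass):

-- Capture the SCN once, then run both passes against that point in time.
-- (dbms_flashback.get_system_change_number needs EXECUTE on the package.)
VARIABLE scn NUMBER
EXEC :scn := dbms_flashback.get_system_change_number;

-- ... first pass runs here and decides which columns to keep ...

SELECT col1, col4, col7             -- whatever columns the first pass kept
FROM   big_t AS OF SCN :scn
WHERE  some_flag = 'Y';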

mwf

 

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Ric Van Dyke
Sent: Monday, February 17, 2014 2:11 PM
To: mwf@xxxxxxxx; mark.powell2@xxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

Cool as that sounds Mark, I doubt it would work any "better" (as in faster),
which is the poster's original concern.  This is basically just a different
way to do the same thing his block of code already does.  The problem in
both cases is the creation of dynamic SQL, which is never going to be faster
than just running a statement.

 

At the base of this issue are some horrible design decisions.  No matter how
much "bling" you put on this, it's still a terrible way to store the data,
which in turn makes it very difficult to retrieve.  Maybe it was done to
make the insert of the data "easier"; that is never a good idea.  We
generally insert once and query a gazillion times.

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Mark W. Farnham
Sent: Monday, February 17, 2014 1:33 PM
To: mark.powell2@xxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

While I'm not taking the time to do it, and I'm not claiming this is
efficient (in other words it is possibly saving analyst time at the expense
of computer time), I *think*

you could generate a dynamic select clause using the <table> with the
particular where clause joined and pivoted on column_name <c> from
user_tab_columns for the <table> having

sum(decode(nvl(<c>,0),<c>,1,0,0)) "<c>"  for each column <c> having > 0,

plucking the rows from the result set as the column names to create the
desired select statement with the original <table> and predicate.

 

This, of course, would only work on types for which the nvl function is
valid. For any <c> in your table for which nvl is not valid you would have
to use some other means to determine whether any of the rows for your
current predicate have non-null values.
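
A rough PL/SQL sketch of the generation step might look something like this
(BIG_T and the SOME_FLAG = 'Y' predicate are placeholders, and COUNT(<c>)
stands in for the nvl/decode expression since it counts non-nulls regardless
of column type):

-- Pass 1: one scan of the predicated rows, counting non-nulls per column.
-- The loop over the counts then builds the text of the second-pass query.
DECLARE
  l_probe  VARCHAR2(32767);
  l_select VARCHAR2(32767);
  l_cur    INTEGER := dbms_sql.open_cursor;
  l_cols   dbms_sql.desc_tab;
  l_ncols  INTEGER;
  l_cnt    NUMBER;
  l_dummy  INTEGER;
BEGIN
  -- Build: SELECT COUNT("C1") "C1", COUNT("C2") "C2", ... FROM big_t WHERE ...
  FOR c IN (SELECT column_name
            FROM   user_tab_columns
            WHERE  table_name = 'BIG_T'
            ORDER  BY column_id) LOOP
    l_probe := l_probe || CASE WHEN l_probe IS NOT NULL THEN ', ' END
               || 'COUNT("' || c.column_name || '") "' || c.column_name || '"';
  END LOOP;
  l_probe := 'SELECT ' || l_probe || ' FROM big_t WHERE some_flag = ''Y''';

  dbms_sql.parse(l_cur, l_probe, dbms_sql.native);
  dbms_sql.describe_columns(l_cur, l_ncols, l_cols);
  FOR i IN 1 .. l_ncols LOOP
    dbms_sql.define_column(l_cur, i, l_cnt);
  END LOOP;
  l_dummy := dbms_sql.execute_and_fetch(l_cur);

  -- Keep only the columns with at least one non-null value in the result set.
  FOR i IN 1 .. l_ncols LOOP
    dbms_sql.column_value(l_cur, i, l_cnt);
    IF l_cnt > 0 THEN
      l_select := l_select || CASE WHEN l_select IS NOT NULL THEN ', ' END
                  || '"' || l_cols(i).col_name || '"';
    END IF;
  END LOOP;
  dbms_sql.close_cursor(l_cur);

  -- Pass 2 would be this generated statement.
  dbms_output.put_line('SELECT ' || l_select
                       || ' FROM big_t WHERE some_flag = ''Y''');
END;
/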

 

The text of your generated queries for a 200 column table is going to be
pretty long.

 

There is probably a reasonable way to code this up in PL/SQL other than
using the pivot, but if you're citing an arbitrary where clause and no data
monitoring, I agree there is no way to specify which columns might have only
nulls without running the query. On top of this, understand that since this
is a two-stage process you would need to construct this internal to a
transaction boundary, lest some column changes status from having some
non-nulls to containing only nulls between the first and second queries.

 

I suppose you could reduce the number of candidate columns by invariantly
including in the query any columns that carry a NOT NULL constraint.

 

Getting the syntax all correct and doing this double dynamic select
statement generation seems pretty tricky to me, but I *think* you could do
it.

 

mwf 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Powell, Mark
Sent: Monday, February 17, 2014 10:23 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

I also agree with Ric.  Select the entire row and let the application
front-end deal with displaying the data.  Stored PL/SQL could be used
between the front-end and the database, but it is going to be really
interesting when the situation arises that one of the returned rows does not
have data in the same columns as the other rows in the target set.  I expect
that this situation is bound to happen sooner or later.

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Maaz Anjum
Sent: Monday, February 17, 2014 8:52 AM
To: ric.van.dyke@xxxxxxxxxx
Cc: rajakannan@xxxxxxxxxx; Chitale, Hemant K; oracle-l@xxxxxxxxxxxxx
Subject: Re: How to select only columns having values..

 

I agree with Ric. 

 

I assume you need these populated rows at execution time. This is just a
thought, but if statistics are up to date on the table then perhaps checking
user_tab_columns.num_nulls against the total number of rows
(user_tables.num_rows) might help you. The point being, you query metadata,
which is less costly than querying the actual table. I might be wrong, but
it's just an idea.
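
For example, something like this (the table name is a placeholder, and the
numbers are only as fresh as the last statistics gathering):

-- Columns that, according to the optimizer statistics, hold at least one
-- non-null value somewhere in the table.
SELECT c.column_name
FROM   user_tab_columns c
       JOIN user_tables t ON t.table_name = c.table_name
WHERE  c.table_name = 'BIG_T'
AND    c.num_nulls < t.num_rows
ORDER  BY c.column_id;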

 

Either way, it doesn't seem you can avoid a complicated SQL statement. 

 

Cheers,

Maaz

 

On Mon, Feb 17, 2014 at 8:40 AM, Ric Van Dyke <ric.van.dyke@xxxxxxxxxx>
wrote:

Offhand this certainly has the sound of some very bad data modeling, but
that is likely beyond your control.  Is this some sort of "generic table"?
If so, that will just lead to problems, and you're only just starting to
have them.  Simply put, this table breaks just about every rule of
relational theory and hence is not going to perform well no matter what you
do.

 

It seems to me the "best" thing would be to just select the whole row(s)
and then have the application dissect the row(s) and pull out where there is
data and where there isn't.  Creating the dynamic SQL as you do works, but
will always be slow.  Getting the entire row and then parsing through it to
find the data is likely going to be faster.

 

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Raja Kannan Sengoden
Sent: Monday, February 17, 2014 2:36 AM
To: Chitale, Hemant K


Cc: oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

Thanks a lot, Hemant K Chitale.

 

In the real scenario, all the columns will have values.

 

But for this particular condition, only a limited set of columns will have
values.

 

At this time, I don't want to analyze the data and maintain such a list.

 

Hence my request: is there any simple way?

 

The attached query helped me to get the expected result.

 

Thanks to Tony, who helped me get the expected result.

 

Thanks & Regards,

Raja

 

From: Chitale, Hemant K [mailto:Hemant-K.Chitale@xxxxxx] 
Sent: Monday, February 17, 2014 3:12 PM
To: Raja Kannan Sengoden
Cc: oracle-l@xxxxxxxxxxxxx
Subject: RE: How to select only columns having values..

 

> If I don't know exactly, Is there any way to select only those 10 columns?

Without actually querying the table, you cannot identify the 10 columns.

 

You could periodically query the table for all the rows, identify columns
with NOT NULL values, and then maintain a list of such columns as "meta
data".  However, there is no guarantee that a column that had NULL values
for all 1000 rows yesterday, or even a minute ago, still has NULL values,
because there might have been an INSERT or UPDATE that set one row's value
to a non-NULL value.
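
For example, a periodic probe along these lines could rebuild that list
(table and column names are illustrative, COUNT(col) counts only non-null
values, and the real query would list all 200 columns):

-- Any column whose count is greater than zero currently holds at least one
-- non-null value.
SELECT COUNT(col1) AS col1_non_nulls,
       COUNT(col2) AS col2_non_nulls,
       COUNT(col3) AS col3_non_nulls
FROM   big_t;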

 

How do you know that the 190 columns have NULL values?

 

Hemant K Chitale

 

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Raja Kannan Sengoden
Sent: Monday, February 17, 2014 12:38 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: How to select only columns having values..

 

Dear Experts,

 

I have a table with 200 columns and having above 1000 Records. 

 

But for a particular where clause, there are only 3 records fetched.

 

The table has 200 columns, but those particular 3 records have only 10
columns with some value; the other 190 columns have null values.

 

If I am sure which columns I want, or which columns have values, then I can
select only those columns.

 

If I don't know exactly, Is there any way to select only those 10 columns?

 

If I get the data like this, It will save few minutes for the data
arrangement.

 

 

Thanks in advance.

 

Raja.







 

-- 
A life yet to be lived... 
