thanks very much for the link, I must confess I never heard of this feature and
it could save me a lot of work.
But there is a subtle point in the described feature – the proposed calculation
is order independent. This is achieved with the combination of the row hashes
using the XOR operation, that is commutative, so any ordering leads to the same
This small difference has a big performance impact as for the hash code
calculation simple full table scan is sufficient (no need of sort).
Anyway this is a very good news because to achieve the described functionality
with a native Oracle implementation would be sufficient to add new digest_types
to existing functionality of DBMS_SQLHASH that represents a calculation of say
MD5 with XOR row hash combination.
From: Andy Sayer [mailto:andysayer@xxxxxxxxx]
Sent: Dienstag, 05. Juni 2018 23:49
Cc: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
Subject: Re: Hash Code Based Identity of a Relational Table
There is certainly a benefit to being able to compare things in two different
locations without sending the entire thing over the wire again. You might be
very annoyed though, Oracle has already written something for this, it's not
very well documented but it's there: dbms_sqlhash
You just give it a query which can just be a select * statement. It computes
the hash of each row (by concatenating the columns together as a string with a
space between each column) and then combines that for a hash of the complete
row set at the very end. You need to be careful about row ordering and any NLS
style parameters that will change the implicit conversion of any data type you
have in your table.
On 5 June 2018 at 22:34, Jaromir D.B.Nemec <jaromir@xxxxxxxxxxxx
<mailto:jaromir@xxxxxxxxxxxx> > wrote:
I worked once on a project where a file had to be spooled from a database table
and transferred on a remote host. I wanted to introduce a possibility of
validating that the produced file has the same identical content as the source
table and came across the idea of using hash codes (MD5) to achieve it.
On a next occasion when there were two tables to be verified being identical I
came back to this idea and I wrote a simple user defined aggregate function
based on a MD5 hash code for that task.
So basically the aggregate function MD5_XOR represented a state of the table
that could be compared with other table.
select MD5_XOR(to_char(ID)||COL_TXT|| to_char(COL_DATE,'dd.mm.yyyy
hh24:mi:ss')) md5 from tab1
The details can be found in
I found that approach interesting and shared the idea with some colleagues, but
the response was ranging from “may I ask what is it good for?” to “this is very
unwise idea” which may be trace of the Oracle “Database Ideas” page under
This motivated me to write a white paper, where I tried to explain the idea in
more detail and present some possible use cases:
So my main point is to receive this way a more technical grounded feedback
about this approach of using a hash code for representing a state of a whole
table or a part of it.
Any response is highly appreciated.
Jaromir D.B. Nemec