Hi All,
I worked once on a project where a file had to be spooled from a database
table and transferred on a remote host. I wanted to introduce a possibility
of validating that the produced file has the same identical content as the
source table and came across the idea of using hash codes (MD5) to achieve
it.
On a next occasion when there were two tables to be verified being identical
I came back to this idea and I wrote a simple user defined aggregate
function based on a MD5 hash code for that task.
So basically the aggregate function MD5_XOR represented a state of the table
that could be compared with other table.
Example
select MD5_XOR(to_char(ID)||COL_TXT|| to_char(COL_DATE,'dd.mm.yyyy
hh24:mi:ss')) md5 from tab1
MD5_XOR
-------------------------------
173f1f8f85f1a154044b7629a23e949c
The details can be found in
http://www.db-nemec.com/MD5/CompareTablesUsingMD5Hash.html ;
I found that approach interesting and shared the idea with some colleagues,
but the response was ranging from "may I ask what is it good for?" to "this
is very unwise idea" which may be trace of the Oracle "Database Ideas" page
under https://community.oracle.com/ideas/20275 ;
This motivated me to write a white paper, where I tried to explain the idea
in more detail and present some possible use cases:
http://www.db-nemec.com/MD5/HashCodeBasedIndentityintheDatabase.html ;
So my main point is to receive this way a more technical grounded feedback
about this approach of using a hash code for representing a state of a whole
table or a part of it.
Any response is highly appreciated.
Kind Regards,
Jaromir D.B. Nemec