Re: Migration validation to OCI
- From: Tim Gorman <tim.evdbt@xxxxxxxxx>
- To: mwf@xxxxxxxx, ecandrietta@xxxxxxxxx, 'ORACLE-L' <oracle-l@xxxxxxxxxxxxx>
- Date: Mon, 15 Apr 2024 11:50:34 -0700
List,
Mark makes an excellent point citing the old technique of "control
totals"...
Attached are two SQL*Plus scripts that use the equally old technique of
SQL-generating-SQL to create what I call "comparison tables" to generate
control totals for comparing the contents of two tables, each with the
same "shape", but different versions of one another. Each of the
columns in the two tables have several aggregations run against them,
dependent on the data_type of the column.
From the comments header in the scripts...
For the entire tables being compared...
- name of the table
- row count of the table
For each NUMBER column...
- count(distinct column-name)
- min(nvl(column-name, 0))
- max(nvl(column-name, 0))
- avg(nvl(column-name, 0))
- sum(nvl(column-name, 0))
- stddev(nvl(column-name, 0))
For each CHAR or VARCHAR2 column...
- count(distinct column-name)
- min(length(column-name))
- max(length(column-name))
- avg(length(column-name))
- sum(length(column-name))
- min(ora_hash(column-name))
- max(ora_hash(column-name))
- avg(ora_hash(column-name))
- sum(ora_hash(column-name))
For each DATE column...
- count(distinct column-name)
- min(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
- max(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
- avg(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
- sum(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
For other datatypes...
- not yet supported by this script
The purpose of these calculations is merely to detect differences, not
necessarily to pinpoint where the differences lie or what they are.
Here is a sample output based on testing after a code change to ETL
logic to fix how one column (named LOCATION_KEY) is loaded...
COLUMN_NAME T1_VALUE T2_VALUE
----------------------------- ----------------------------------------
----------------------------------------
TABLE_NAME NEW_TABLE_AA OLD_TABLE_AA
CNT_LOCATION_KEY 4002 4272
AVG_LOCATION_KEY 20303.866432106804794529027401907016245
20153.5951564461457324878614033480490347
STD_LOCATION_KEY 14050.4442319081117478342482185723117511
17204.6251672774281857692695152895980603
SUM_LOCATION_KEY 8200264663 8139573551
MIN_DATE_ENTERED 15-APR-2024 10:38:28 15-APR-2024
04:17:10
AVG_DATE_ENTERED 15-APR-2024 10:38:28 15-APR-2024
04:17:10
MAX_DATE_ENTERED 15-APR-2024 10:38:28 15-APR-2024
04:17:10
MIN_DATE_LAST_UPDATED 15-APR-2024 10:38:28
15-APR-2024 04:17:10
AVG_DATE_LAST_UPDATED 15-APR-2024 10:38:28
15-APR-2024 04:17:10
MAX_DATE_LAST_UPDATED 15-APR-2024 10:38:28
15-APR-2024 04:17:10
This output from the SQL*Plus script generated by the
"query_compare.sql" script shows that only the column LOCATION_KEY has
been affected by the code change, as the COUNT DISTINCT, AVERAGE,
STANDARD DEVIATION, and SUM aggregations of the values in that column
are all different. Even the overall table row counts are the same, as
are the values in the dozens of other columns.
The report also shows differences in DATE columns named DATE_ENTERED and
DATE_LAST_UPDATED, but those differences are expected since the old
table was loaded early in the morning while the new table values were
populated mid-morning, about 6 hours later. These columns are artifacts
of the data loading process, not the results of the data load logic.
This output doesn't necessarily tell us exactly what is different in the
LOCATION_KEY column, but it does assure us that there are differences in
LOCATION_KEY in one or more columns, and importantly this tells us that
there are no differences in the total row count of the table, nor in any
other significant column of the tables.
Hope this helps?
Thanks!
-Tim
On 4/14/2024 6:29 AM, Mark W. Farnham wrote:
A fairly standard minimum is the number of rows in each user table and
the sum of the values of numeric columns in each table. The CDP exam
circa 1982 called these “control totals.”
A more aggressive verification is to also pipe the text fields through
a hasher and compare the hash values.
A reverse migration of all the data back to the original machine (even
one user table at a time if space is a problem) with a full row by
row, column by column comparison is comprehensive, but the cost versus
the benefit does need to be considered. (This can also be done between
OCI and the existing machine, but usually the network latency doing
things row by row makes a back load of the data more performant.)
Good luck,
mwf
PS: I presume your OCI means Oracle Cloud Infrastructure, not Oracle
Call Interface
*From:*oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] *On Behalf Of *Eriovaldo Andrietta
*Sent:* Sunday, April 14, 2024 8:28 AM
*To:* ORACLE-L
*Subject:* Migration validation to OCI
Hello,
An exadata database server onpremise will be migrated to OCI As Is
to the same Oracle version (12.x).
I would like to know what else would have to be validated at OCI to
ensure that the migration was successful.
I thought I would check:
- dba_users
- dba_objects (and each object type separately)
- dba_tab_privs
- dba_sys_privs
- dba_roles
- dba_synonym
- dba_extents (in order to check the volumn)
- count the number of lines for each table, or to check at the
dba_tables numrows if the statistics are updated.
I guess that the number of lines migrated are enough to accept to
content of the table
The goal is to check only in the context of objects and not
configuration (v$parameters, tablespaces, resource manager and other
structures).
What else could be validated?
Regards
Eriovaldo
/**********************************************************************
* File: setup_compare.sql
* Type: SQL*Plus script
* Author: Tim Gorman (Bridgeton Data, Inc.)
* Date: 02-Apr 2002
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
*
http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* Copyright (c) 2024 by Bridgeton Data Inc. All rights reserved.
*
*
* Description:
*
* SQL*Plus script to use the technique of SQL-generating-SQL to
* query the Oracle data dictionary in order to generate another
* SQL*Plus script to query two "comparison" tables and find whether
* there are differences in the aggregate calculations from the
* columns.
*
* This should indicate whether the two tables being compared are
* different in any way.
*
* The comparison tables would have columns based on aggregations
* from the values within the columns of the tables being compared.
*
* For the entire tables being compared...
* - name of the table
* - row count of the table
* For each NUMBER column...
* - count(distinct column-name)
* - min(nvl(column-name, 0))
* - max(nvl(column-name, 0))
* - avg(nvl(column-name, 0))
* - sum(nvl(column-name, 0))
* - stddev(nvl(column-name, 0))
* For each CHAR or VARCHAR2 column...
* - count(distinct column-name)
* - min(length(column-name))
* - max(length(column-name))
* - avg(length(column-name))
* - sum(length(column-name))
* - min(ora_hash(column-name))
* - max(ora_hash(column-name))
* - avg(ora_hash(column-name))
* - sum(ora_hash(column-name))
* For each DATE column...
* - count(distinct column-name)
* - min(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - max(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - avg(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - sum(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* For other datatypes...
* - not yet supported by this script
*
* The purpose of these aggregations is to quickly show where there
* are differences in column values in the table.
*
* Modifications:
* TGorman 02apr02 written
*********************************************************************/
define V_PREFIX="" -- assuming prefix for the name of the table in
which comparison data is stored
define V_SUFFIX="_SAVE" -- assuming prefix for the name of the table in
which comparison data is stored
undefine V_OWNER
undefine V_TBLNAME
col sort0 noprint
col sort1 noprint
set echo off feedback off timing off pagesize 0 linesize 999 trimspool on
recsep off verify off
spool run_setup_compare_&&V_OWNER._&&V_TBLNAME..sql
prompt set echo on feedback on timing on pagesize 100 linesize 999 trimspool on
prompt spool run_setup_compare_&&V_OWNER._&&V_TBLNAME
select 10 sort0,
0 sort1,
'REM drop table
'||upper('&&V_OWNER')||'.CMP_'||substr(upper('&&V_TBLNAME'),1,26)||'
purge;'||chr(10)||
'create table
'||upper('&&V_OWNER')||'.CMP_'||substr(upper('&&V_TBLNAME'),1,26)||'
as'||chr(10)||
'select '''||table_name||'''
TABLE_NAME'||chr(10)||','||chr(9)||'count(*) ROW_CNT' txt
from all_tables
where owner = upper('&&V_OWNER')
and table_name = upper('&&V_TBLNAME')
union all
select 20 sort0,
column_id sort1,
case when data_type = 'NUMBER' then
','||chr(9)||'count(distinct '||column_name||') as
LCNT_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(nvl('||column_name||',0)) as
MIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(nvl('||column_name||',0)) as
AVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(nvl('||column_name||',0)) as
MAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'stddev(nvl('||column_name||',0)) as
STD_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(nvl('||column_name||',0)) as
SUM_'||substr(column_name,1,25)
when data_type in ('CHAR', 'VARCHAR2') then
','||chr(9)||'count(distinct '||column_name||') as
LCNT_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(length('||column_name||')) as
LMIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(length('||column_name||')) as
LAVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(length('||column_name||')) as
LMAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(length('||column_name||')) as
LSUM_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'count(distinct ora_hash('||column_name||')) AS
HCNT_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(ora_hash('||column_name||')) AS
HMIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(ora_hash('||column_name||')) AS
HAVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(ora_hash('||column_name||')) AS
HMAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(ora_hash('||column_name||')) AS
HSUM_'||substr(column_name,1,25)
when data_type = 'DATE' then
case when nullable = 'N' then
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(min(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MIN_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(avg(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as AVG_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(max(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MAX_'||substr(column_name,1,26)
else
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,26)||chr(10)||','||chr(9)||
'to_char(to_date(min(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MIN_'||
substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(avg(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as AVG_'||
substr(column_name,1,26)||chr(10)||','||chr(9)||
'to_char(to_date(max(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MAX_'||
substr(column_name,1,26)
end
else
','||chr(9)||'COLUMN "'||column_name||'":
<<type='||data_type||'>>'
end txt
from all_tab_columns
where owner = upper('&&V_OWNER')
and table_name = upper('&&V_TBLNAME')
union all
select 30 sort0,
0 sort1,
'from'||chr(9)||owner||'.'||table_name txt
from all_tables
where owner = upper('&&V_OWNER')
and table_name = upper('&&V_TBLNAME')
union all
select 40 sort0,
0 sort1,
'union all'||chr(10)||'select '''||table_name||'''
TABLE_NAME'||chr(10)||','||chr(9)||'count(*) ROW_CNT' txt
from all_tables
where owner = upper('&&V_OWNER')
and table_name = '&&V_PREFIX'||upper('&&V_TBLNAME')||'&&V_SUFFIX'
union all
select 50 sort0,
column_id sort1,
case when data_type = 'NUMBER' then
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(nvl('||column_name||',0)) as
MIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(nvl('||column_name||',0)) as
AVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(nvl('||column_name||',0)) as
MAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'stddev(nvl('||column_name||',0)) as
STD_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(nvl('||column_name||',0)) as
SUM_'||substr(column_name,1,25)
when data_type in ('CHAR', 'VARCHAR2') then
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(length('||column_name||')) as
LMIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(length('||column_name||')) as
LAVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(length('||column_name||')) as
LMAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(length('||column_name||')) as
LSUM_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'min(ora_hash('||column_name||')) AS
HMIN_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'avg(ora_hash('||column_name||')) AS
HAVG_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'max(ora_hash('||column_name||')) AS
HMAX_'||substr(column_name,1,25)||chr(10)||
','||chr(9)||'sum(ora_hash('||column_name||')) AS
HSUM_'||substr(column_name,1,25)
when data_type = 'DATE' then
case when nullable = 'N' then
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(min(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MIN_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(avg(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as AVG_'||substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(max(to_number(to_char('||column_name||',''YYYYMMDDHH24MISS''))),
''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MAX_'||substr(column_name,1,26)
else
','||chr(9)||'count(distinct '||column_name||') as
CNT_'||substr(column_name,1,26)||chr(10)||','||chr(9)||
'to_char(to_date(min(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MIN_'||
substr(column_name,1,26)||chr(10)||
','||chr(9)||'to_char(to_date(avg(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as AVG_'||
substr(column_name,1,26)||chr(10)||','||chr(9)||
'to_char(to_date(max(to_number(to_char(nvl('||column_name||
',to_date(''2249'',''YYYY''),''YYYYMMDDHH24MISS''))),''YYYYMMDDHH24MISS''),''DD-MON-YYYY
HH24:MI:SS'') as MAX_'||
substr(column_name,1,26)
end
else
','||chr(9)||'COLUMN "'||column_name||'":
<<type='||data_type||'>>'
end txt
from all_tab_columns
where owner = upper('&&V_OWNER')
and table_name = '&&V_PREFIX'||upper('&&V_TBLNAME')||'&&V_SUFFIX'
union all
select 60 sort0,
0 sort1,
'from'||chr(9)||owner||'.'||table_name txt
from all_tables
where owner = upper('&&V_OWNER')
and table_name = '&&V_PREFIX'||upper('&&V_TBLNAME')||'&&V_SUFFIX'
order by 1, 2;
prompt /
prompt set echo off feedback 6 timing off
prompt spool off
spool off
set feedback 6 pagesize 100 recsep each verify on
@run_setup_compare_&&V_OWNER._&&V_TBLNAME
/**********************************************************************
* File: query_compare.sql
* Type: SQL*Plus script
* Author: Tim Gorman (Bridgeton Data, Inc.)
* Date: 02-Apr 2002
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
*
http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
* Copyright (c) 2024 by Bridgeton Data Inc. All rights reserved.
*
*
* Description:
*
* SQL*Plus script to use the technique of SQL-generating-SQL to
* query the Oracle data dictionary in order to generate another
* SQL*Plus script to create a "comparison table".
*
* This comparison table would have columns based on aggregations
* from the values within the columns of the tables being compared.
*
* The purpose of doing this should indicate whether the two tables
* being compared are different in any way. The aggregated calculations
* themselves have no meaning except to reveal any differences.
*
* For the entire tables being compared...
* - name of the table
* - row count of the table
* For each NUMBER column...
* - count(distinct column-name)
* - min(nvl(column-name, 0))
* - max(nvl(column-name, 0))
* - avg(nvl(column-name, 0))
* - sum(nvl(column-name, 0))
* - stddev(nvl(column-name, 0))
* For each CHAR or VARCHAR2 column...
* - count(distinct column-name)
* - min(length(column-name))
* - max(length(column-name))
* - avg(length(column-name))
* - sum(length(column-name))
* - min(ora_hash(column-name))
* - max(ora_hash(column-name))
* - avg(ora_hash(column-name))
* - sum(ora_hash(column-name))
* For each DATE column...
* - count(distinct column-name)
* - min(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - max(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - avg(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* - sum(to_number(to_char(column-name,'YYYYMMDDHH24MISS'))
* For other datatypes...
* - not yet supported by this script
*
* The purpose of these aggregations is to quickly show where there
* are differences in column values in the table.
*
* Modifications:
* TGorman 02apr02 written
*********************************************************************/
undefine V_OWNER
undefine V_TBLNAME
define V_PREFIX=""
define V_SUFFIX="_SAVE"
set echo off feedback off timing off pages 0 lines 999 trimspool on recsep off
verify off
spool run_query_cmp_&&V_OWNER._&&V_TBLNAME..sql
prompt set echo off feedback on timing off pages 100 lines 999 trimspool on
prompt spool run_query_cmp_&&V_OWNER._&&V_TBLNAME
set termout off
select decode(c.column_id,1,null,'UNION ALL'||chr(10))||
'select'||chr(9)||''''||c.column_name||''' COLUMN_NAME, '||
case when c.data_type in ('NUMBER') then
'TO_CHAR(T1.'||c.column_name||')'
when c.data_type in ('DATE') then
'TO_CHAR(T1.'||c.column_name||',''DD-MON-YYYY HH24:MI:SS'')'
else
'T1.'||c.column_name
end||
' T1_VALUE, '||
case when c.data_type in ('NUMBER') then
'TO_CHAR(T2.'||c.column_name||')'
when c.data_type in ('DATE') then
'TO_CHAR(T2.'||c.column_name||',''DD-MON-YYYY HH24:MI:SS'')'
else
'T2.'||c.column_name
end||
' T2_VALUE'||chr(10)||
'from'||chr(9)||c.owner||'.CMP_'||substr(upper('&&V_TBLNAME'),1,26)||'
T1, '||
c.owner||'.CMP_'||substr(upper('&&V_TBLNAME'),1,26)||' T2'||chr(10)||
'where'||chr(9)||'T1.TABLE_NAME = '''||t1.table_name||''''||chr(10)||
'and'||chr(9)||'T2.TABLE_NAME = '''||t2.table_name||''''||chr(10)||
'and'||chr(9)||'T1.'||c.column_name||' <> T2.'||c.column_name txt
from all_tab_columns c,
&&V_OWNER..cmp_&&V_TBLNAME t1,
&&V_OWNER..cmp_&&V_TBLNAME t2
where c.owner = upper('&&V_OWNER')
and c.table_name = 'CMP_'||substr(upper('&&V_TBLNAME'),1,26)
and t1.table_name = upper('&&V_TBLNAME')
and t2.table_name = upper('&&V_PREFIX')||t1.table_name||upper('&&V_SUFFIX')
order by c.column_id;
prompt /
prompt spool off
prompt set feedback 6
spool off
set termout on
@run_query_cmp_&&V_OWNER._&&V_TBLNAME
Other related posts: