Re: "Re-sequencing" a pseudo-key

From: Kim Berg Hansen <kibeha@xxxxxxxxx>
To: De DBA <dedba@xxxxxxxxxx>
Date: Fri, 10 Oct 2014 08:18:51 +0200
Hi, Tony

Sorry, I should have mentioned: We have good old database EE license
(currently on 11.2.0.3.0 EE) with just one option, Active Data Guard.
Funding for Golden Gate for this one project, well, I don't see that
coming, sorry ;-)

Other than that, interesting idea, definitely.


Regards


Kim Berg Hansen

http://dspsd.blogspot.com
kibeha@xxxxxxxxx
@kibeha


On Fri, Oct 10, 2014 at 2:43 AM, De DBA <dedba@xxxxxxxxxx> wrote:

>  Hi Kim,
>
> We don't know which options you have at your disposal, however if you have
> access to GoldenGate you could also consider the following variant on
> Iggy's solution:
>
>
>    1. Create a new schema (in a new database, ideally) with the same
>    objects as the original (perhaps reserve storage as you create the tables),
>    but no triggers or foreign keys yet (GG cannot handle those). As newly
>    created rows have a unique, incrementing RECID, we can use the offset
>    between the new recid and the maximum negative old_recid in the translation
>    table -- as we're currently slowly climbing to 0 -- to calculate the new
>    recid for new rows as new.recid = x + abs( old.recid ) - abs(y) where x =
>    max(new_recid) and y is the greatest negative recid in the translation
>    table. GoldenGate can be configured to do such transformations on the fly,
>    if I am not mistaken.
>    2. I don't know if it makes much difference, but I'd create the master
>    translation table as IOT with old_recid as the leading column, in the new
>    schema. For this variant it is necessary to populate the entire translation
>    table up front, as the maximum new recid is needed by GG.
>     3. Populate tables in the new schema with the existing rows. You can
>    repopulate table by table and do a recid/newrecid lookup for each row in
>    the new schema. GG copies & resequences new rows as the are created.
>    4. (no longer needed)
>    5. When resequencing of the old rows is completed, schedule a short
>    outage to create foreign keys and swap the application from the old schema
>    to the new.
>
>
> I believe this will have some additional advantages:
>
>
>    - the original production schema (database) remains untouched and
>    complete, and so a provides a reliable rollback option if
>    resequencing/cutover fails for some reason
>    - the new schema is identical to the old, no object renaming needed.
>    Only the translation table is added.
>    - no need to repeat the translation process (point 4) and less chance
>    of data loss due to oversight or error.
>     - if the new schema is created in a second database/machine there is
>    minimal stress on the production database, as transforms and recid lookups
>    run in the new database
>    - the outage will be short, as all indexes & MVs already exist and
>    only foreign keys need to be created
>
>
> There are some cons of course:
>
>
>    - GG is not cheap
>    - GG can be fickle and "abend" (= abort) when problems arise - it
>    needs TLC, lots of it
>    - GG does not support all Oracle datatypes
>    - Extra storage & resource use (ideally a second server with dedicated
>    network link)
>
>
> Hth,
> Tony
>
> On 09/10/14 09:50, Iggy Fernandez wrote:
>
> It's an amazing problem with numerous restrictions but where there is a
> will there may be a way.
>
>  To summarize, you have a list of sequence number with holes. The holes
> are probably caused by the way the sequence numbers are being used. A
> process that reads 10 sequence numbers may end up only using a few of them.
> This results in the sequence numbers being gobbled up faster than expected.
>
>  The challenge is doing billions and billions of lookups into the
> translation table in a short downtime window without an Exadata. Winning
> the lottery or finding that much spare change is unrealistic. Not to
> mention that you have to update indexes and refresh materialized views.
>
>  I think an incremental solution is possible as follows:
>
>
>    1. Add new_recid columns to each table. There will be as many new
>    columns in each tables as there are columns that need to be translated. For
>    now, they need to be nullable.
>    2. Create a master translation table with 600 million rows and two
>    columns: recid and new_recid. New rows are being constantly added to the
>    database but that's not a problem for this algorithm. New rows can be
>    tackled later. Using a master translation table for all recid columns will
>    ensure that all the translations  are in the same order as the original
>    values. Preserving this order is not necessary but is aesthetically
>    pleasing.
>    3. Incrementally update the values of the new_recid column with the
>    translation of the corresponding recid value. This can be done
>    incrementally over the next six months. If updating a large number of rows,
>    you can commit at regular intervals to avoid long transactions.
>    4. Purge the master translation table and repeat steps 2 and 3 for new
>    entries only. Keep doing this until the day of cutover. In fact, you don't
>    have to tackle 600 million entries in the first shot. You could do perhaps
>    10 million entries every day for 60 days.
>    5. As you get close to the cutover date, you can begin building
>    indexes involving the new_recid columns instead of the recid columns using
>    ONLINE build. You can also create a second set of MVs using the new_recid
>    columns. For fast refreshability you can use ROWID materialized views
>    instead of PRIMARY KEY materialized views. The only downside is that any
>    DDL operation that requires row movement will force a complete refresh but
>    what are the chances.
>
>
>  With this incremental approach, you have not impacted the application in
> any way up to the time of cutover. The cutover steps:
>
>
>    1. Perform the last few updates.
>    2. Put the database into flashback recovery mode.
>    3. Drop primary key and foreign key constraints
>    4. Rename the recid columns to old_recid and new_recid columns to recid
>    5. The required indexes already exist
>    6. The new materialized views already exist. You don't need to rename
>    them if you are using automatic query rewrite. If you are accessing them
>    directly, you will have to work out some solution involving prebuilt 
> tables.
>    7. You may not need the primary key constraints any more
>    8. You can drop the old_recid columns; it would happen instantaneously
>    but you can postpone this step to a future time
>    9. Test the application and flash back if there is a failure
>
>
>  The downside to an approach such as this is that you have added new
> columns and  caused some chained rows and row migrations. Lots of details
> still need to be worked out. Lots of testing will ensure that you have a
> good plan.
>
>  Now if you had an Exadata ...
>
>   ------------------------------
>
>
>
Follow-Ups:
- Re: "Re-sequencing" a pseudo-key
  - From: Hans Forbrich
References:
- "Re-sequencing" a pseudo-key
  - From: Kim Berg Hansen
- RE: "Re-sequencing" a pseudo-key
  - From: Iggy Fernandez
- Re: "Re-sequencing" a pseudo-key
  - From: De DBA
Re: "Re-sequencing" a pseudo-key

Other related posts: