Hearing the Oracle

Home » SQL Tips » Deletion Insurance

Deletion Insurance

If you have some experience, the topic of deleting consequential data automatically brings to mind optional safety measures in case things go wrong. The recent article concerning data deduplication is a good example. What if — I didn’t test my DELETE code enough and a snafu occurs? What if — I have to retrieve some or all of the deleted rows after the fact? What if — I want to keep the deleted rows (or partial rows) available on the side for inspection for awhile? This post will examine several techniques which could be used in this type of situation and compare them from the point of view of administrative setup (or DBA reliance).

Safety Valves

We’ll use the dataset described in the article mentioned above (‘De-duplicating Rows’) for our testbed. If you want code to rebuild and populate it, here you go:
w h i t e s p a c e

  rob@XE_11g2> CREATE TABLE test_deleting
  (alpha_code VARCHAR2(2) NOT NULL,
   num_code INTEGER NOT NULL,
   batch_id INTEGER,
   timestamp DATE DEFAULT sysdate);

  Table Created.

  rob@XE_11g2> BEGIN
   FOR i IN 1 .. 460
   LOOP
      FOR j IN 1 .. 1000
      LOOP
         INSERT INTO test_deleting VALUES
         (
         CHR(65 + FLOOR(dbms_random.value(0,26)))||CHR(65 + FLOOR(dbms_random.value(0,26))),
         CEIL(dbms_random.value(0,676)),
         ROUND(dbms_random.value(2000, 2013),0),
         TO_DATE(sysdate, ‘DD-MON-YYYY’) + dbms_random.value(-31,-1)
         );
      END LOOP;
      COMMIT;
   END LOOP;
   END;

  Procedure completed.

w h i t e s p a c e
The deletion SQL we will use will be a modified version of the original dedup code. For our present purposes we’ll change it slightly so as to affect a smaller number of rows. The following script will delete 13150 rows.
w h i t e s p a c e


   DELETE FROM test_deleting X
   WHERE batch_id = 2002 AND X.rowid <>
      (SELECT MAX(Y.rowid) FROM test_deleting Y
       WHERE Y.alpha_code = X.alpha_code
       AND Y.num_code = X.num_code)
   /

w h i t e s p a c e

The three ‘safety’ measures we are going to consider are:

         • Flashback Querying
         • DELETE Triggers
         • Twin Bulk Operations

The first two methods are distinguished by depending to some degree on the availability and permissiveness of your DBA. Of course, the best situation, as a developer, is to have your own private desktop sandbox database and enough admin savvy to perform routine DBA-type functions.

If you have not yet entered COMMIT and you run your sessions with SET AUTOCOMMIT OFF, then you may undo any mistakenly deleted rows immediately with the ROLLBACK command. Of course, this depends upon realizing the unintended deletion before moving on to anything else. In all that follows, we are assuming that a session COMMIT has already happened.

Flashback Queries

Oracle’s various Flashback capabilities would seem positively magical to a DBA from release 7 transported into the future. Depending upon the size of the UNDO tablespace area along with the size and frequency of your database’s transactions, you can retrieve ‘deleted’ data as far back as resources permit storing. Some sites use Flashback Archives architecturally in their design to permit analyses of evolving stages of the same datasets. For our purposes here, we are concerned with two different data aging methods: by timestamp and by SCN, or System Change Number. Every Oracle transaction which is committed and written to the UNDO space is marked by sequentially increasing numbers, since the database’s creation. These are the SCNs. Many other internal events act to increase the current SCN value, but it is guaranteed that each transaction will carry a unique one. If you want to delve into the minutae of SCN internals, this fellow’s blog entry takes a stab at it.

SCNs are regarded as more accurate than timestamps to pinpoint a specific state of a piece of data in history. Both are easily correlated, however, as depicted in the scripts below.
w h i t e s p a c e


   SELECT
      timestamp_to_scn(sysdate – 5/(24*60)) AS scn_5minutes_ago,
      timestamp_to_scn(sysdate) AS scn_now
   FROM dual;

   SELECT
      scn_to_timestamp(&which_scn) AS scn_time
   FROM dual;

w h i t e s p a c e
I usually keep a script called ‘SCN.SQL’ within my $SQLPATH directory for quick execution, via the command:  @scn . It is handy for gathering a baseline SCN as a potential restore point right before doing a DELETE or UPDATE you might have reservations about:
w h i t e s p a c e


   SELECT
      DBMS_FLASHBACK.SET_SYSTEM_CHANGE_NUMBER AS scn,
      TO_CHAR(sysdate, ‘YYYY-MM-DD HH24:MI:SS’) AS timepoint
   FROM dual;

w h i t e s p a c e
First, let’s look at a basic FLASHBACK TABLE scenario. This capability is for restoring a single table to an earlier state, and is ideal when no dependencies or entanglements are in play with related tables. The main limit is the UNDO_RETENTION period setting, which you’ll note in our session is set to the default value: 15 minutes. Additionally, the user must have been granted the FLASHBACK ANY TABLE system privilege, and must have ‘row movement’ enabled, which conveniently can be done dynamically within a session. We collect the SCN (29719987) before deleting and then use this value to flashback the table a few minutes after committing the deletion:

                                                                                                                    view Flashback Table session 

The FLASHBACK TABLE functionality is usually not a bundled freebie within Oracle installations unless your database software is superior to the Standard Edition. Without this feature, the flashback command in the session above would result in an error: ORA-00439: feature not enabled: Flashback Table. A quick way to discern whether your system runs this feature is to query the V$OPTION table.
w h i t e s p a c e


   SELECT * FROM v$option
   WHERE UPPER(parameter) LIKE ‘FLASH%’;

w h i t e s p a c e

Another approach in this case, which is not dependent upon any specific database options, is the plain vanilla flashback query. This employs the AS OF syntax within the SELECT command. The only real obstacle to it’s effectiveness is the age of the desired data exceeding what remains available in UNDO. If the data persists within UNDO, you can both query it and therefore restore it with the appropriate UPDATE or INSERT statement. Here’s a simple flashback session recovering deleted rows. Note the use of AS OF within the SELECT and INSERT statements, which signals a flashback query.
w h i t e s p a c e

   rob@XE_11g2> @scn
           SCN TIMEPOINT
   ========== ===================
     29753589 2013-03-14 13:04:11

   rob@XE:11g2> DELETE FROM test_deleting X
   WHERE batch_id = 2004 AND X.rowid <>
      (SELECT MAX(Y.rowid) FROM test_deleting Y
       WHERE Y.alpha_code = X.alpha_code
       AND Y.num_code = X.num_code);

   12199 rows deleted.

   rob@XE:11g2> COMMIT;
   Commit complete.

   rob@XE:11g2> SELECT
   (SELECT COUNT(*) FROM test_deleting) AS count_after_del,
   (SELECT COUNT(*) FROM test_deleting AS OF SCN &scn_earlier) AS count_before_del
   FROM dual;

   Enter value for scn_earlier: 29753589
   COUNT_AFTER_DEL  COUNT_BEFORE_DEL
   ===============  ================
            447001            460000

   rob@XE_11g2> @scn
           SCN TIMEPOINT
   ========== ===================
     29755096 2013-03-14 13:06:44

   rob@XE:11g2> INSERT INTO test_deleting X
   (SELECT * FROM test_deleting AS OF SCN &scn_earlier
    MINUS
    SELECT * FROM test_deleting);

   Enter value for scn_earlier: 29753589
   12199 rows created.

   rob@XE:11g2> COMMIT;
   Commit complete.

w h i t e s p a c e

DELETE Triggers

If you have a particularly sensitive table, you could consider placing a trigger on it such that every DELETE operation will first fire off some PL/SQL code which traps the rows which are going to be removed, perhaps by journaling the important info somewhere or simply copying the rows wholesale to a preservation table before executing the DELETE. You will need to have been granted some permissions before doing this; or you can do it yourself if you are master of your own database. Additionally, you will need to have EXECUTE privileges for the new trigger and make sure it has been ENABLED.
w h i t e s p a c e


   CREATE TABLE test_deleting_journal
      (timestamp DATE DEFAULT sysdate,
       culprit VARCHAR2(30),
       alpha_code VARCHAR2(2),
       num_code INTEGER)
   /

w h i t e s p a c e


   GRANT CREATE TRIGGER TO rob;

   CREATE OR REPLACE TRIGGER tr_test_deleting
   BEFORE DELETE ON test_deleting
   FOR EACH ROW
   DECLARE
      v_culprit ALL_USERS.USERNAME%TYPE;
   BEGIN
      SELECT user INTO v_culprit FROM dual;

      INSERT INTO test_deleting_journal
      VALUES
         (sysdate, v_culprit, :old.alpha_code, :old.num_code);
   END;
   /

   GRANT EXECUTE ON tr_test_deleting TO rob;
   ALTER TRIGGER tr_test_deleting ENABLED;

w h i t e s p a c e

Now we’ll find that once we perform the DELETE, the same rows which have been deleted by the dedup code have been preserved within the journal table. Note that this is an instance of a row-level trigger, as indicated by the FOR EACH ROW clause within the trigger definition. So, in addition to the standard overhead incurred for the insertions in the trigger, there is also the overhead of being reduced to a row-by-row operation. In the original rowid deletion SQL we did not have this problem, which was why it had a comparatively quick execution time. At any rate, if need be, any mistakenly dropped rows can now be reconstructed into the original table by querying the journal table populated by the trigger.

Religious Proclamation:  I’m not a great fan of triggers. I see them as a kind of unfortunate tool under necessary circumstances; something to correct later when there is more time. I begrudge them their place in controlled moderation, but anytime you see their usage proliferating within a development project or DB app, it usually means performance issues, the need for improved design work, and a more complicated debugging context. I regard INSTEAD OF triggers as borderline pathological, right up there with Frankenstein and genetically-modified brussel sprouts. If you want to read an excellent overview about trigger usage, which includes a cautionary stance similar to my own, you can’t do better than this writeup from Tim Hall’s Oracle blog.

Twin Bulk Operations

You might think to try improving on the row-level DELETE trigger with one which works by bulk, executing once for the entire statement. The PL/SQL coding for this is not so simple, however, since you need to anticipate or deal with whatever mechanisms are used or forms are taken by the original triggering SQL. At least the trigger above, as written, is ‘universal’ in that it will apply to any DELETEs. All this suggests that if you want to have a safety mechanism that applies to all situations, and avoids slower row-by-row processing, why not just use two separate SQL statements? The first, a bulk journaling operation, and the second our usual deletion.

This means you need to think on the fly. But once the SQL code is worked out to perform the deletion, it’s quite straightforward to amend it for journaling code (i.e. INSERT) as a prelude. This kind of approach works best if you have a one-off or case-by-case scenario and needn’t worry about multiple users potentially deleting rows from the table in question.
w h i t e s p a c e

   rob@XE_11g2> INSERT INTO test_deleting_journal
   SELECT sysdate, (SELECT user FROM dual), alpha_code, num_code
   FROM test_deleting X
   WHERE batch_id = 2002 AND X.rowid
      (SELECT MAX(Y.rowid) FROM test_deleting Y
       WHERE Y.alpha_code = X.alpha_code
       AND Y.num_code = X.num_code);

   13150 rows created.

   rob@XE:11g2> DELETE FROM test_deleting X
   WHERE batch_id = 2002 AND X.rowid
      (SELECT MAX(Y.rowid) FROM test_deleting Y
       WHERE Y.alpha_code = X.alpha_code
       AND Y.num_code = X.num_code);

   13150 rows deleted.

   rob@XE:11g2> SELECT * FROM test_deleting_journal WHERE rownum < 11;
   TIMESTAMP  CULPRIT  AL NUM_CODE
   ========= ======== == ========
   02-MAR-13 ROB      BV       27
   02-MAR-13 ROB      BV      119
   02-MAR-13 ROB      BV      212
   02-MAR-13 ROB      BV      360
   02-MAR-13 ROB      BV      406
   02-MAR-13 ROB      BV      407
   02-MAR-13 ROB      BV      454
   02-MAR-13 ROB      BV      541
   02-MAR-13 ROB      BV      633
   02-MAR-13 ROB      BW       52
   10 rows selected.

w h i t e s p a c e

To make this approach even more spontaneous, you could dispense with the prior table creation entirely and handle it on the fly. Just make the following substitution in the first line of INSERT snippet, so that

       INSERT INTO test_deleting_journal

becomes:

       CREATE TABLE test_deleting_backup AS…

~RS

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: