Undo Function

The database will have a “roll-back” table which will allow for “undo” operations. It's envisaged that each transaction (which is a collection of stored procedure calls) will have a unique transaction identifier. When a user wants to undo a previous action they will be able to as the previous transaction identifiers will be stored in their session history. There will be an arbitrary limit to how far they can roll-back transactions.

The roll-back table will also capture the previous/current value for each change. This will be useful for reporting and also invaluable for auditing activity in the database.

This will result in a lot of disk space being used to support this functionality. These days storage is cheap so it is not considered a bad choice.

Another method is to never update a row, but mark it as old and copy a new row details. This will then allow you to easily roll back by just copying the row again to a previous one. This means you can also snapshot the whole application at a point in time. The downside of this is that it's going to store a lot of data. To try and mitigate this the tables should not have too many columns. This means more tables, and more normalisation. This is also known as a "Point in Time Architecture".

This idea came from: stackoverflow.com/questions/2640610/best-practices-for-managing-updating-a-database-with-a-complex-set-of-changes and www.simple-talk.com/sql/database-administration/database-design-a-point-in-time-architecture/.

There will be a few meta-data columns in each table:

  • _table_id – this is the internal identifier for the record. This is a logical identifier for the record (BIGINT).
  • _table_uuid – this is the universal unique identifier for the logical record. This is the one that is passed around in the application (BINARY(16))
  • _system_auto_id – this is the auto-increment column for the table (BIGINT)
  • _system_valid_from – this is the date that the row was valid from (TIMESTAMP)
  • _system_valid_to – this is the date that the row was valid until (TIMESTAMP)
  • _system_is_valid – this is a boolean to indicate the row is active or not. This is a shortcut for a clustered index combining with table_id (BIT).
  • _system_transaction_uuid – this is the UUID for the transaction which caused the row to be created (BINARY(16)).
  • _system_previous_id – this is a reference to the previous version of the row (refers to the system_auto_id, BIGINT).
  • _system_next_id – this is a reference to the next version of the row (refers to the system_auto_id, BIGINT).

The table_uuid and table_id will be generated the first time a new item is added to a table. These will then be re-used for each time the new version of the row is added. There will be a “Sequence” table or something like it, which will store the next id for every table in the system. This will have a mutex lock on it so that the next generated id will not be clashing with something else.

There will also be an audit table to capture field level changes, and a human readable audit message which describes what has been changed. See the “Detailed Auditing” section below.

The clustered index (main index) for every table will be the following columns:

  • _table_id
  • _table_uuid
  • _system_is_valid