
Crash Recovery Facilities

The crash recovery facilities of the SSM can be divided into: logging, checkpointing, and recovery management.

Logging

Updates performed by transactions are logged so that they can be rolled back (in the event of a transaction abort) or restored (in the event of a crash). Both the old and new values of an updated location are logged. This allows a steal, no-force buffer management policy, which means the buffer manager is free to write dirty pages to disk at any time and yet does not have to write dirty pages for a transaction to commit.
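As a rough illustration of why both values are logged, the following Python sketch models an update log record that carries a before-image and an after-image. The names UpdateRecord, redo, and undo are hypothetical and chosen for this example only; they are not the SSM's log record format or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateRecord:
    """Hypothetical log record for one update; not the SSM's actual format."""
    tid: int      # transaction that made the update
    offset: int   # start of the updated location within the page
    old: bytes    # before-image, used to roll back
    new: bytes    # after-image, used to restore after a crash


def redo(page: bytearray, rec: UpdateRecord) -> None:
    """Reapply the update. Needed under no-force: a committed
    update may not have reached disk before a crash."""
    page[rec.offset:rec.offset + len(rec.new)] = rec.new


def undo(page: bytearray, rec: UpdateRecord) -> None:
    """Revert the update. Needed under steal: an uncommitted
    update may already have been written to disk."""
    page[rec.offset:rec.offset + len(rec.old)] = rec.old


page = bytearray(b"hello world")
rec = UpdateRecord(tid=1, offset=6, old=b"world", new=b"shore")

redo(page, rec)
after_redo = bytes(page)

undo(page, rec)
after_undo = bytes(page)
```

The sketch assumes the old and new values have the same length, so the update never changes the size of the page.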

The log is a location holding log records. Currently the log is stored in Unix files in a special directory (we plan to support using a raw device partition in the future). The size and location of the log are determined by configuration options described in the initialization section above.

The proper value for the size of the log depends upon the expected transaction mix. More specifically, it depends on the age of the oldest (longest running) transaction in the system and the amount of log space used by all active transactions. Here are some general rules to determine the amount of free log space available in the system.

Log records between the first log record generated by the oldest active transaction and the most recent log record generated by any transaction cannot be thrown away.
Log records from a transaction are no longer needed once the transaction has committed or completely aborted and all updates have made it to disk. Aborting a transaction causes log space to be used, so space is reserved for aborting each transaction. Enough log space must be available to commit or abort all active transactions at all times.
Only space starting at the beginning of the log can be reused. This space can be reused if it contains log records only for transactions meeting the previous rule.
All ss_m calls that update records require log space twice the size of the space updated in the record. All calls that create, append, or truncate records require log space equal to the size created, inserted, or deleted. Log records generated by these calls (generally one per call) have an overhead of approximately 50 bytes.
The amount of log space reserved for aborting a transaction is equal to the amount of log space generated by the transaction.
When insufficient log space is available for a transaction, the transaction is aborted.
The log should be at least 1 Mbyte.
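The first three rules can be sketched as a small calculation of how much of the start of the log is reusable. This is a simplified model, not the SSM's implementation: it assumes log positions are plain integers, and the function name and its parameters are hypothetical.

```python
from typing import Dict, Optional


def reusable_prefix_end(first_record_of_active: Dict[int, int],
                        oldest_unflushed_update: Optional[int],
                        end_of_log: int) -> int:
    """Return the log position before which space can be reused.

    first_record_of_active maps each active transaction id to the
    position of its first log record. oldest_unflushed_update is the
    position of the oldest log record whose update has not yet
    reached disk, or None if every update is on disk.
    """
    bounds = list(first_record_of_active.values())
    if oldest_unflushed_update is not None:
        bounds.append(oldest_unflushed_update)
    # Only a prefix of the log can be reused, so the earliest record
    # that is still needed bounds the reusable space.
    return min(bounds, default=end_of_log)


# Two active transactions: nothing past position 100 can be reused.
both_active = reusable_prefix_end({1: 100, 2: 120}, None, 900)

# Transaction 1 has finished and its updates are on disk.
one_active = reusable_prefix_end({2: 120}, None, 900)

# No active transactions and no unflushed updates.
idle = reusable_prefix_end({}, None, 900)
```

In this model a single long-running transaction pins everything written after its first log record, which is why the age of the oldest transaction matters when sizing the log.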

For example, consider a transaction T1 that creates 300 records of size 2,000 bytes, writes 20 bytes in 100 objects, and is committed. T1 requires at least 615 Kbytes of log space for the creates (300 x 2,050 bytes, counting the roughly 50 bytes of overhead per log record) and 9 Kbytes for the writes (100 x 90 bytes), for a total of 624 Kbytes. Since an equal amount of log space must be reserved to abort the transaction, the log size must be over 1.248 Mbytes to run this transaction. Assuming T1 is the only transaction running in the system, all the log space it uses and reserves becomes available when it completes. If another transaction, T2, is started at the same time as T1, but is still running after T1 is committed, only the reserved space for T1 is available for other transactions. The portion of the log used by T1 and T2 is not available until T2 is finished.
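The arithmetic for T1 can be reproduced with a short Python sketch of the sizing rules. The function names are hypothetical, the 50-byte overhead is the approximate figure given in the rules, and a Kbyte is taken to be 1,000 bytes, as the figures in the example imply.

```python
RECORD_OVERHEAD = 50  # approximate bytes of overhead per log record


def create_log_bytes(size: int) -> int:
    """Log space for creating (or appending/truncating) `size` bytes."""
    return size + RECORD_OVERHEAD


def update_log_bytes(size: int) -> int:
    """Log space for updating `size` bytes: old and new values are logged."""
    return 2 * size + RECORD_OVERHEAD


def required_log_bytes(creates, updates) -> int:
    """Log space a transaction needs, including the equal amount
    reserved so that the transaction can always be aborted."""
    used = (sum(create_log_bytes(s) for s in creates)
            + sum(update_log_bytes(s) for s in updates))
    return 2 * used


create_bytes = sum(create_log_bytes(s) for s in [2000] * 300)
update_bytes = sum(update_log_bytes(s) for s in [20] * 100)
t1_total = required_log_bytes([2000] * 300, [20] * 100)
```

This is only an estimate: it assumes exactly one log record per call and ignores any other log records the system generates.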

Transactions that fail because of insufficient log space are commonly those that load a large number of objects into a file during the creation of a database. A solution to this problem is to load the file in a series of smaller transactions. When the last transaction is committed, the load is complete. If the load needs to be aborted, a separate transaction is run to destroy the file.
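A minimal sketch of loading in a series of smaller transactions is shown below. FakeStore is an in-memory stand-in written for this example, and its begin, create_record, and commit methods are hypothetical; they do not correspond to the ss_m interface.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batches(items: Iterable[bytes], batch_size: int) -> Iterator[List[bytes]]:
    """Yield successive lists of at most batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


class FakeStore:
    """In-memory stand-in for a storage manager (hypothetical API)."""

    def __init__(self) -> None:
        self.records: List[bytes] = []
        self.pending: List[bytes] = []
        self.commits = 0

    def begin(self) -> None:
        self.pending = []

    def create_record(self, data: bytes) -> None:
        self.pending.append(data)

    def commit(self) -> None:
        self.records.extend(self.pending)
        self.pending = []
        self.commits += 1


def load_in_batches(store: FakeStore, items: Iterable[bytes],
                    batch_size: int) -> None:
    """Load items using one transaction per batch, so that no single
    transaction has to fit the whole load into the log."""
    for batch in batches(items, batch_size):
        store.begin()
        for data in batch:
            store.create_record(data)
        store.commit()


store = FakeStore()
load_in_batches(store, (bytes(10) for _ in range(1000)), batch_size=300)
```

The batch size would be chosen so that the log space used by one batch, plus the equal amount reserved for aborting it, fits comfortably in the free log space.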

Checkpointing

Checkpoints are taken periodically by the SSM in order to free log space and shorten recovery time. Checkpoints are ``fuzzy'' and do not require the system to pause while they are completing.

Recovery

The SSM recovers from software, operating system, and CPU failure by restoring updates made by committed transactions and rolling back all updates by transactions that did not commit by the time of the crash. Recovery is performed when an instance of class ss_m is created.

Recovery has three phases:

Analysis
During the analysis phase, the log is scanned to determine which transactions were active and which devices were mounted at the time of the failure.
Redo
During the redo phase, the devices are remounted and the log is scanned starting at a location determined by analysis. The operation recorded in each log record is redone if necessary. After redo, the database is in the state it was in just before the crash.
Undo
During the undo phase, all transactions that were active at the time of the crash are undone. The devices are then dismounted, and a checkpoint is taken.
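The three phases can be sketched on a toy in-memory log. This is a heavily simplified model under stated assumptions, not the SSM's recovery code: log records are plain tuples, there are no checkpoints or devices, redo starts at the beginning of the log, and every update is redone unconditionally rather than only when necessary. The recover function and the record layout are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layouts:
#   ("update", tid, key, old_value, new_value)   old_value None = key absent
#   ("commit", tid)
LogRecord = Tuple


def recover(log: List[LogRecord],
            disk: Dict[str, str]) -> Dict[str, str]:
    """Return the database state after analysis, redo, and undo."""
    db = dict(disk)

    # Analysis: find transactions that had not committed at the crash.
    seen, committed = set(), set()
    for rec in log:
        seen.add(rec[1])
        if rec[0] == "commit":
            committed.add(rec[1])
    active = seen - committed

    # Redo: reapply every logged update in log order, restoring the
    # state the database was in just before the crash.
    for rec in log:
        if rec[0] == "update":
            _, _, key, _, new = rec
            db[key] = new

    # Undo: roll back updates of active transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in active:
            _, _, key, old, _ = rec
            if old is None:
                db.pop(key, None)
            else:
                db[key] = old
    return db


log = [
    ("update", 1, "a", None, "1"),
    ("update", 2, "b", None, "2"),
    ("update", 1, "a", "1", "3"),
    ("commit", 1),
    ("update", 2, "c", None, "4"),
]
# At the crash, an uncommitted update by transaction 2 had reached
# disk (steal), while transaction 1's committed updates had not
# (no-force).
disk_at_crash = {"b": "2"}

recovered = recover(log, disk_at_crash)
```

In the example, transaction 1's committed updates are restored by redo and transaction 2's updates are removed by undo, so only the committed value of "a" remains.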

The time it takes for recovery depends on several factors, including the number of transactions in progress at the time of the failure, the number of log records generated by these transactions, and the number of log records generated since the last checkpoint.


Marvin Solomon
Fri Aug 2 13:40:00 CDT 1996