2 ARIES Recovery
Next: 3 The Minirel RDBMS
Up: An ARIES Log Manager
Previous: 1 Introduction
ARIES (Algorithms for Recovery and Isolation Exploiting Semantics)
recovery is based on the Write-Ahead Logging (WAL) protocol. Every
update operation writes a log record, which is one of the following:
- An undo-only log record: Only the before image is
logged. Thus, an undo operation can restore the old data.
- A redo-only log record: Only the after image is logged.
Thus, a redo operation can reapply the new data.
- An undo-redo log record: Both the before image and the after
image are logged.
Every log record is assigned a unique and monotonically increasing
log sequence number (LSN).
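
The three record kinds and the LSN assignment can be illustrated with a minimal Python sketch. The names `LogRecord`, `RecordKind`, and `make_record`, and the `txn_id` and `page_id` fields, are assumptions made for illustration; they are not part of the log manager described in this report.

```python
import itertools
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RecordKind(Enum):
    UNDO_ONLY = "undo-only"
    REDO_ONLY = "redo-only"
    UNDO_REDO = "undo-redo"


@dataclass(frozen=True)
class LogRecord:
    lsn: int                  # unique, monotonically increasing
    txn_id: int               # hypothetical field: the updating transaction
    page_id: int              # hypothetical field: the updated page
    before: Optional[bytes]   # before image, if logged
    after: Optional[bytes]    # after image, if logged

    @property
    def kind(self) -> RecordKind:
        # The kind follows from which images were logged.
        if self.before is not None and self.after is not None:
            return RecordKind.UNDO_REDO
        if self.before is not None:
            return RecordKind.UNDO_ONLY
        return RecordKind.REDO_ONLY


_lsn_counter = itertools.count(1)  # assumption: LSNs are a simple counter


def make_record(txn_id, page_id, before=None, after=None) -> LogRecord:
    if before is None and after is None:
        raise ValueError("an update record needs a before or an after image")
    return LogRecord(next(_lsn_counter), txn_id, page_id, before, after)
```

Here the LSN is a plain counter; a real log manager might instead derive the LSN from the record's position in the log.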
Every data page has a page LSN field that is set to the LSN of the log
record corresponding to the last update on the page. WAL requires
that the log record corresponding to an update make it to stable
storage before the data page corresponding to that update is written
to disk. For performance reasons, individual log writes are not
immediately forced to disk. Instead, a log tail is maintained in main
memory to buffer log writes.
The log tail is flushed to disk when it gets full. A transaction
cannot be declared committed until the commit log record makes it to
disk.
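
The buffering rules above can be sketched as follows. This is a simplified in-memory model under stated assumptions: the class `LogTail`, its `capacity` counted in records rather than bytes, and the `stable` list standing in for the on-disk log are all hypothetical.

```python
class LogTail:
    """In-memory log tail with a list standing in for stable storage."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumption: capacity counted in records
        self.buffer = []          # records not yet on stable storage
        self.stable = []          # stand-in for the log on disk
        self.next_lsn = 1
        self.flushed_lsn = 0      # highest LSN known to be on stable storage

    def append(self, payload):
        # A full tail is flushed before the next record is buffered.
        if len(self.buffer) >= self.capacity:
            self.flush()
        lsn = self.next_lsn
        self.next_lsn += 1
        self.buffer.append((lsn, payload))
        return lsn

    def flush(self):
        if self.buffer:
            self.stable.extend(self.buffer)
            self.flushed_lsn = self.buffer[-1][0]
            self.buffer.clear()

    def can_write_page(self, page_lsn):
        # WAL rule: a data page may go to disk only after the log record
        # for its last update (its page LSN) is on stable storage.
        return page_lsn <= self.flushed_lsn

    def commit(self, txn_id):
        # The commit record must reach disk before commit is declared.
        lsn = self.append(("commit", txn_id))
        self.flush()
        return lsn
```

A buffer manager built on this sketch would call `can_write_page` with the page LSN before writing a dirty page, and force a flush first if the check fails.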
Periodically, the recovery subsystem writes a checkpoint
record to the log. The checkpoint record contains the transaction
table (which gives the list of active transactions) and the dirty page
table (the list of modified data pages in the buffer pool whose changes
have not yet made it to disk). A master log record is maintained separately, in stable
storage, to store the LSN of the latest checkpoint record that made it
to disk. On restart, the recovery subsystem reads the master log
record to find the checkpoint's LSN, reads the checkpoint record, and
starts recovery from there on.
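
A minimal sketch of the checkpoint and master log record follows. The names `Checkpoint`, `StableStore`, `take_checkpoint`, and `restart_point` are hypothetical. Storing a last LSN per transaction and a first-dirtying LSN per page follows the usual ARIES presentation and is an assumption here, since the text above only says the tables list the active transactions and dirty pages.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    txn_table: dict    # active txn id -> LSN of its last log record (assumed)
    dirty_pages: dict  # page id -> LSN of the first update that dirtied it (assumed)


class StableStore:
    """Stand-in for stable storage: the log plus the master log record."""

    def __init__(self):
        self.log = {}       # LSN -> record
        self.master = None  # LSN of the latest checkpoint record on disk


def take_checkpoint(store, lsn, txn_table, dirty_pages):
    # Copy the tables so later changes do not alter the logged snapshot.
    store.log[lsn] = Checkpoint(dict(txn_table), dict(dirty_pages))
    # The master record is updated only once the checkpoint record is stable.
    store.master = lsn


def restart_point(store):
    # On restart: read the master record, then the checkpoint it points to.
    if store.master is None:
        return None, Checkpoint({}, {})  # no checkpoint taken yet
    return store.master, store.log[store.master]
```

Updating the master record last means a crash in the middle of a checkpoint leaves the master pointing at the previous complete checkpoint.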
The actual recovery process consists of three passes:
- Analysis. The recovery subsystem scans the log forward from the
checkpoint record to construct a snapshot of what the system looked
like at the instant of the crash. It also determines the earliest log
record from which the next pass must start.
- Redo. Starting at the earliest LSN determined in the
Analysis pass, the log is read forward and
each update is redone.
- Undo. The log is scanned backward and updates
corresponding to loser transactions (those that had not committed at
the time of the crash) are undone.
For further details of the recovery process, see [Mohan et al. 92, Ramamurthy & Tsoi 95].
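
The three passes can be sketched in a heavily simplified form. The dictionary-based record layout and the function names below are assumptions for illustration. The sketch also omits parts of full ARIES, such as compensation log records written during undo and per-transaction backward chaining of log records, so it shows only the overall shape of the passes.

```python
def analysis(log, checkpoint_lsn=0, ckpt_txns=None, ckpt_dirty=None):
    """Scan forward from the checkpoint; return (losers, redo_start)."""
    txns = dict(ckpt_txns or {})    # active txn -> LSN of its last record
    dirty = dict(ckpt_dirty or {})  # page -> LSN that first dirtied it
    for rec in log:
        if rec["lsn"] <= checkpoint_lsn:
            continue
        if rec["type"] == "update":
            txns[rec["txn"]] = rec["lsn"]
            dirty.setdefault(rec["page"], rec["lsn"])
        elif rec["type"] == "commit":
            txns.pop(rec["txn"], None)
    # Transactions still active at the crash are the losers.
    return set(txns), min(dirty.values(), default=None)


def redo(log, pages, redo_start):
    """Reapply every logged update the on-disk page has not yet seen."""
    if redo_start is None:
        return
    for rec in log:
        if rec["type"] != "update" or rec["lsn"] < redo_start:
            continue
        page = pages.setdefault(rec["page"], {"value": None, "lsn": 0})
        if page["lsn"] < rec["lsn"]:  # page LSN shows the update is missing
            page["value"] = rec["after"]
            page["lsn"] = rec["lsn"]


def undo(log, pages, losers):
    """Scan backward, restoring before images of loser updates."""
    for rec in reversed(log):
        if rec["type"] == "update" and rec["txn"] in losers:
            page = pages.setdefault(rec["page"], {"value": None, "lsn": 0})
            page["value"] = rec["before"]


def recover(log, pages, checkpoint_lsn=0, ckpt_txns=None, ckpt_dirty=None):
    losers, redo_start = analysis(log, checkpoint_lsn, ckpt_txns, ckpt_dirty)
    redo(log, pages, redo_start)
    undo(log, pages, losers)
    return losers
```

Note that redo reapplies the updates of loser transactions as well; the undo pass then rolls them back. This repeating of history is what lets the page LSN comparison decide, page by page, which updates are missing.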
This description of ARIES implies that a log manager must provide the
following features:
- Ability to write log records. The log manager should
maintain a log tail in main memory and write log records to it. The
log tail should be written to stable storage on demand or when the log
tail gets full. Implicit in this requirement is the fact that the log
tail can become full halfway through the writing of a log record. It
also means that a log record can be longer than a page.
- Ability to wrap around. The log is typically maintained
on a separate disk. When the log reaches the end of the disk, it
wraps around to the beginning.
- Ability to store and retrieve the master log record.
The master log record is stored separately in stable storage, possibly
on a different duplex-disk.
- Ability to read log records given an LSN. Also, the
ability to scan the log forward from a given LSN to the end of log.
Implicit in this requirement is that the log manager should be able to
detect the end of the log and distinguish the end of the log from a
valid log record's beginning.
- Ability to create a log. In practice, this will
require setting up a duplex-disk for the log, a duplex-disk for the
master log record, and a raw device interface that reads and writes the
disks, bypassing the operating system.
- Ability to maintain the log tail. This requires some
sort of shared memory because the log tail is common to all
transactions accessing the database to which the log belongs.
Mutual exclusion of log writes and reads has to be taken care of.
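
Two of these requirements, wraparound and end-of-log detection, can be illustrated together. The sketch below shows one possible scheme and is not necessarily the one used by the log manager described in this report: it assumes that an LSN is the record's logical byte offset, that every record begins with a header holding its own LSN and payload length, and that a header whose stored LSN does not match the requested LSN marks the end of the log. The class `CircularLog` and its methods are hypothetical, and a `bytearray` stands in for the log disk.

```python
import struct

HEADER = struct.Struct("<QI")  # stored LSN (8 bytes), payload length (4 bytes)


class CircularLog:
    def __init__(self, size):
        self.size = size
        self.disk = bytearray(size)  # stand-in for the log disk
        self.next_lsn = 1            # LSN 0 is reserved to mean "no record"

    def _write(self, lsn, data):
        pos = lsn % self.size
        first = min(len(data), self.size - pos)
        self.disk[pos:pos + first] = data[:first]
        self.disk[0:len(data) - first] = data[first:]  # wrapped remainder

    def _read(self, lsn, n):
        pos = lsn % self.size
        first = min(n, self.size - pos)
        return bytes(self.disk[pos:pos + first] + self.disk[0:n - first])

    def append(self, payload):
        record = HEADER.pack(self.next_lsn, len(payload)) + payload
        if len(record) > self.size:
            raise ValueError("record larger than the log")
        lsn = self.next_lsn
        self._write(lsn, record)     # may overwrite the oldest records
        self.next_lsn += len(record)
        return lsn

    def read(self, lsn):
        stored, length = HEADER.unpack(self._read(lsn, HEADER.size))
        if stored != lsn or length > self.size - HEADER.size:
            return None              # stale or empty bytes: end of log
        return self._read(lsn + HEADER.size, length)

    def scan(self, lsn):
        # Forward scan from a valid LSN to the end of the log.
        while (payload := self.read(lsn)) is not None:
            yield lsn, payload
            lsn += HEADER.size + len(payload)
```

This scheme relies on stale bytes not happening to match the expected LSN, so a production design would likely add a checksum to the header. It also overwrites old records without checking whether recovery still needs them, and it omits the mutual exclusion that concurrent readers and writers would require.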
The following sections describe the simplifying assumptions that we
made to fit the protocol into Minirel, and
the interface and implementation of our log manager.
ajitk@cs.wisc.edu, cjin@cs.wisc.edu