Heap File and Disk Space Manager
Ravi Murthy ravim@cs.wisc.edu
Sriram Narasimhan nsriram@cs.wisc.edu
Project report
General Description
Our group provides two independent interfaces.
Database (DB) Interface
The DB class provides the abstraction of a single database. The
operations on this class include creating and destroying
databases. Existing databases can be opened and closed. In addition, there
are methods to retrieve certain properties of the
database, such as the number of pages and the page size.
The DB class
also provides a directory service. This service is implemented
in terms of records consisting of file names and their header page
ids. There are functions to insert, delete, and access file entries.
The top level of the program should create an instance of this class
for every database used.
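The directory service described above can be pictured with a minimal sketch. This is only an illustration under stated assumptions, not the actual DB class: the class name Directory and the method names add_file_entry, get_file_entry and delete_file_entry are hypothetical, and the entries live in an in-memory dictionary rather than in records on database pages.

```python
class Directory:
    """Toy directory: maps file names to header page ids (hypothetical API)."""

    def __init__(self):
        self._entries = {}  # file name -> header page id

    def add_file_entry(self, name, header_page_id):
        # Refuse to overwrite an existing entry with the same file name.
        if name in self._entries:
            raise KeyError(f"file entry already exists: {name}")
        self._entries[name] = header_page_id

    def get_file_entry(self, name):
        # Raises KeyError if the file has no directory entry.
        return self._entries[name]

    def delete_file_entry(self, name):
        # Raises KeyError if the file has no directory entry.
        del self._entries[name]
```

A higher layer that opens a file would call get_file_entry with the file name and then read the returned header page.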
The abstraction of a page is given by the class Page. All higher-level applications use this Page. A Page contains an LSN (log sequence number) and a data part. Higher layers can impose their own page types on this common structure.
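As a rough picture of a page made of an LSN followed by a data part, here is a small sketch. The page size (4096 bytes), the 8-byte little-endian LSN at offset 0, and all names are assumptions chosen for illustration; the real Page class may lay things out differently.

```python
import struct

PAGE_SIZE = 4096                          # assumed page size in bytes
LSN_FORMAT = "<Q"                         # assumed: 8-byte unsigned LSN
LSN_SIZE = struct.calcsize(LSN_FORMAT)    # 8
DATA_SIZE = PAGE_SIZE - LSN_SIZE          # bytes left for higher layers


class Page:
    """Toy page: a fixed-size buffer holding an LSN and a data part."""

    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)

    @property
    def lsn(self):
        # Decode the LSN stored at the start of the buffer.
        return struct.unpack_from(LSN_FORMAT, self.buf, 0)[0]

    @lsn.setter
    def lsn(self, value):
        struct.pack_into(LSN_FORMAT, self.buf, 0, value)

    def data(self):
        # Writable view of the data part; a higher layer interprets it
        # according to its own page type.
        return memoryview(self.buf)[LSN_SIZE:]
```

A higher layer, such as a heapfile, would define its own layout (slot directory, records) inside the view returned by data().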
HeapFile Interface
The heapfile interface provides the capability of manipulating
heapfiles, which are unordered sets of records. Heapfiles can be
created and destroyed. Existing heapfiles can be opened and
closed. Records can be inserted and deleted. Each record is uniquely
identified by a record id (RID). A specific record can be retrieved by
giving its record id. The interface also supports sequential scans on
heapfiles. Opening a scan returns a scan object, which is used to retrieve all
records starting from the first record. Note that any selections can
only be applied by a higher layer, which uses this scan to get all
records.
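The operations listed above can be sketched with a toy in-memory heapfile. Everything here is an assumption made for illustration: the RID is modeled as a (page number, slot number) pair, pages hold a fixed small number of slots, and the method names insert_record, get_record, delete_record and scan are hypothetical rather than the real interface.

```python
from typing import Iterator, NamedTuple, Tuple


class RID(NamedTuple):
    page_no: int   # assumed RID layout: page number plus slot number
    slot_no: int


class HeapFile:
    """Toy heapfile: an unordered collection of records kept in memory."""

    SLOTS_PER_PAGE = 4  # assumed, kept tiny for demonstration

    def __init__(self):
        self._pages = []  # each page is a list of slots (bytes or None)

    def insert_record(self, record: bytes) -> RID:
        for page_no, page in enumerate(self._pages):
            # Reuse a slot freed by an earlier delete, if there is one.
            for slot_no, slot in enumerate(page):
                if slot is None:
                    page[slot_no] = record
                    return RID(page_no, slot_no)
            if len(page) < self.SLOTS_PER_PAGE:
                page.append(record)
                return RID(page_no, len(page) - 1)
        self._pages.append([record])  # no room anywhere: add a page
        return RID(len(self._pages) - 1, 0)

    def get_record(self, rid: RID) -> bytes:
        try:
            record = self._pages[rid.page_no][rid.slot_no]
        except IndexError:
            record = None
        if record is None:
            raise KeyError(rid)
        return record

    def delete_record(self, rid: RID) -> None:
        self.get_record(rid)  # raises KeyError if the record is missing
        self._pages[rid.page_no][rid.slot_no] = None

    def scan(self) -> Iterator[Tuple[RID, bytes]]:
        # Sequential scan: yields every live record, page by page.
        for page_no, page in enumerate(self._pages):
            for slot_no, record in enumerate(page):
                if record is not None:
                    yield RID(page_no, slot_no), record
```

Since the scan returns every record, a selection is applied by the caller, for example with `[rec for _, rec in hf.scan() if rec.startswith(b"a")]`.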
Private Interface
The following is a description of the space manager interface. It
is not needed by any other group. However, if you are interested in
knowing how the space manager really works, you are welcome to read
on.
The space manager is the component of the system that takes care
of the allocation and deallocation of pages among the different files
of a database. A database is a single UNIX file. It contains several
files, such as heapfiles, B+tree files, and so on. A database
has a single space map that keeps track of the usage of the pages. The
space manager maintains the space map in shared memory and ensures
consistent access to it by different processes.
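One common way to track page usage is a bitmap with one bit per page; the text does not say how the space map is actually represented, so the sketch below is only an assumption. A threading.Lock stands in for whatever mechanism guards the real shared-memory structure across processes, and the names SpaceMap, allocate_page, deallocate_page and is_allocated are hypothetical.

```python
import threading


class SpaceMap:
    """Toy space map: one bit per page, guarded by a lock (assumed design)."""

    def __init__(self, num_pages):
        self._num_pages = num_pages
        self._bits = bytearray((num_pages + 7) // 8)  # all pages free
        self._lock = threading.Lock()  # stand-in for cross-process locking

    def _is_set(self, page_no):
        return bool(self._bits[page_no // 8] & (1 << (page_no % 8)))

    def allocate_page(self):
        # Find the first free page, mark it used, and return its number.
        with self._lock:
            for page_no in range(self._num_pages):
                if not self._is_set(page_no):
                    self._bits[page_no // 8] |= 1 << (page_no % 8)
                    return page_no
        raise RuntimeError("no free pages left in the database")

    def deallocate_page(self, page_no):
        with self._lock:
            if not self._is_set(page_no):
                raise ValueError(f"page {page_no} is not allocated")
            # Clear the bit; the mask keeps the value within one byte.
            self._bits[page_no // 8] &= ~(1 << (page_no % 8)) & 0xFF

    def is_allocated(self, page_no):
        with self._lock:
            return self._is_set(page_no)
```

Taking the lock around every read and update is what keeps concurrent callers from handing out the same page twice.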
The other function of the space manager is maintaining the directory
for every database. The directory contains the mapping between file
names and their header pages. Thus, when a file needs to be opened, its
directory entry can be looked up to obtain the header page. The
directory is also maintained in shared memory, and consistent access
to it is ensured.
The space manager logs all changes to the space map and the
directory of a database. The case of deallocation of pages is handled
carefully to ensure that the effects of an aborting transaction can be
completely undone. Deallocation requests by a transaction are not
executed immediately. Instead, they are deferred and the page ids are
maintained in a list. Only when the transaction commits are changes
made to the space map to reflect the deallocation.
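The deferred deallocation scheme can be sketched as follows. This is a simplified illustration, not the actual space manager: the class and method names are hypothetical, transactions are identified by plain strings, a Python set of allocated page ids stands in for the space map, and logging is left out.

```python
class DeferredDeallocator:
    """Toy model: deallocations take effect only when the transaction commits."""

    def __init__(self, allocated_pages):
        self.allocated = set(allocated_pages)  # stand-in for the space map
        self._pending = {}  # transaction id -> list of deferred page ids

    def deallocate(self, txn_id, page_id):
        # Do not touch the space map yet; just remember the request.
        self._pending.setdefault(txn_id, []).append(page_id)

    def commit(self, txn_id):
        # Only now are the deferred deallocations applied.
        for page_id in self._pending.pop(txn_id, []):
            self.allocated.discard(page_id)

    def abort(self, txn_id):
        # Nothing was changed in the space map, so dropping the list
        # is all that is needed to undo the requests.
        self._pending.pop(txn_id, None)
```

Because an aborting transaction never modified the space map, its pages cannot have been handed to another transaction in the meantime, which is what makes the undo complete.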