Heap File

Introduction

A heap file is an unordered set of records. The following operations are supported:
* Heap files can be created and destroyed.
* Existing heapfiles can be opened and closed.
* Records can be inserted and deleted.
* Records are uniquely identified by a record id (rid). A specific record can be retrieved by using the record id.

Sequential scans on heap files are also supported. A scan object is created, and get-next calls on this object allow us to retrieve all records starting from the first record.

A file scan iterator uses a heap file scan, and calls the eval function to apply any desired selections to the retrieved tuples.

Temporary heap files are supported as well. These are used for external sorting.

A word on naming: The catalog also names files. It would be best to have a global naming algorithm.

Design Decisions

The main kind of page structure used in the Heap File is HFPage, and this is viewed as a Page object by lower-level code. The HFPage class uses a slotted page structure with a slot directory that contains (slot offset, slot length) pairs. The page number and slot number are used together to uniquely identify a record (rid). When a record is deleted, the space is reclaimed by compacting records on the page, and the length in the corresponding slot entry is set to a negative value (indicating that the slot is empty).

Pages are maintained in a doubly linked list, and can be de-allocated when empty. The heap file layer makes calls to the buffer manager. The only exception is that the directory (file naming) services of the DB class (disk space management) are called directly.

Alignment issues are not considered. Tuples are stored only as a series of bytes. It is up to upper layers to make sense of them. (See related point in discussion of tuples.)

See the text for a more detailed description of implementation alternatives for heap files.

Click here for the public interface.

Back to the List of Components
Back to the Minibase Home Page