The heapfile component consists of the HeapFile, HFPage and Scan classes. Of these, the HFPage code could be reused in large part from the earlier version of minirel. However, since the heapfile itself is now built on top of a totally different interface, most of the HeapFile code had to be written afresh. An entirely new class, Scan, was implemented to support sequential scans.
The development of the multiuser versions of the above components required us to modify some of the single-user code to deal with shared memory. Further, implementing locking and logging for concurrency control and crash recovery ensured that we gained a complete understanding of the primitives offered, their limitations, and how best to use them.
The normal operation of the space manager was tested by a suite of tests that exercises every call in the interface. Then, several pathological cases were explored. We forced the space map to span several pages by using a large number of pages and a smaller page size; the allocation and deallocation of runs that span page boundaries were then checked. Similarly, the header page can also overflow: by choosing a small enough page size (128 bytes), we forced the directory to overflow across several pages and tested the routines that insert, delete and retrieve file entries.
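The run allocation being tested can be sketched as follows. This is a minimal illustration, not our actual code: the space map is shown as one bit vector indexed continuously, so a run crossing a map-page boundary is simply a run of bits whose indices straddle a page-sized multiple; all names here are made up.

```cpp
#include <cassert>
#include <vector>

// Illustrative sketch of a space map: one bit per disk page
// (true = allocated). A run of N pages is N consecutive clear bits,
// which may cross a map-page boundary.
class SpaceMap {
public:
    explicit SpaceMap(int numPages) : bits(numPages, false) {}

    // Returns the first page of a free run of length n, or -1 if none.
    int allocateRun(int n) {
        int runStart = 0, runLen = 0;
        for (int i = 0; i < static_cast<int>(bits.size()); ++i) {
            if (bits[i]) { runStart = i + 1; runLen = 0; continue; }
            if (++runLen == n) {
                for (int j = runStart; j <= i; ++j) bits[j] = true;
                return runStart;
            }
        }
        return -1;
    }

    // Clears the bits of a previously allocated run.
    void deallocateRun(int start, int n) {
        for (int i = start; i < start + n; ++i) bits[i] = false;
    }

private:
    std::vector<bool> bits;
};
```

The pathological cases in our tests correspond to calling allocateRun with lengths that force the run to straddle a map-page-sized index boundary, and then deallocating the same run.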
The multiuser version of the space manager uses shared memory for storing the space map and the directory. Synchronization of accesses to shared memory was tested by designing one process to lock the shared memory and sleep for a while. This caused the other processes to block while trying to access the shared memory. When the first process woke up and released the lock, the other processes could finish their execution.
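The blocking pattern of this test can be sketched as follows. For a self-contained illustration, threads stand in for the processes and a std::mutex stands in for the lock on shared memory; the function name and timings are made up.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Sketch of the synchronization test: the first party grabs the lock
// guarding "shared memory" and sleeps; the second blocks on the same
// lock and can only proceed after the release. Returns true if the
// second party observed the first party's completed work.
bool lockBlocksSecondParty() {
    std::mutex shmLock;              // stands in for the shared-memory lock
    std::atomic<bool> held(false);   // set once the holder owns the lock
    bool firstDone = false;
    bool observedDone = false;

    std::thread holder([&] {
        std::lock_guard<std::mutex> g(shmLock);
        held = true;
        std::this_thread::sleep_for(std::chrono::milliseconds(100)); // "sleep for a while"
        firstDone = true;            // still holding the lock here
    });
    while (!held) std::this_thread::yield();   // ensure the holder acquired first
    std::thread waiter([&] {
        std::lock_guard<std::mutex> g(shmLock); // blocks while the holder sleeps
        observedDone = firstDone;    // only reached after the release
    });
    holder.join();
    waiter.join();
    return observedDone;
}
```

In the real test the two parties are separate processes and the lock lives in the shared region itself, but the observable behavior is the same: the waiter makes no progress until the holder releases.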
The heapfile component was exhaustively tested by a series of tests that exercise all the methods in the public interface of the heapfile, in several different combinations. We inserted a large number of records, causing the heapfile to grow to many pages. Under such conditions, we tested the working of inserts, deletes, scans and all other operations.
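The stress pattern can be sketched as follows. This toy model is not our HeapFile: the page capacity and class names are invented purely to show the shape of the test, namely that enough inserts force growth onto many pages and that a full scan still accounts for every record.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a heapfile: pages hold a fixed number of records
// (capacity is made up), and inserting past capacity grows the file.
struct Page { std::vector<std::string> recs; };
const std::size_t RECS_PER_PAGE = 10;

class MiniHeapFile {
public:
    void insert(const std::string& r) {
        if (pages_.empty() || pages_.back().recs.size() == RECS_PER_PAGE)
            pages_.push_back(Page());           // file grows by a page
        pages_.back().recs.push_back(r);
    }
    std::size_t pageCount() const { return pages_.size(); }
    std::size_t scanCount() const {             // full sequential scan
        std::size_t n = 0;
        for (std::vector<Page>::const_iterator p = pages_.begin();
             p != pages_.end(); ++p)
            n += p->recs.size();
        return n;
    }
private:
    std::vector<Page> pages_;
};
```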
Our code has been integrated with several other components and has been used by them for several weeks. No major bug has been reported by any of them, reinforcing the reliability of our code. The space management code has been used by the buffer manager group, and heapfiles have been used by the catalogs group, among others.
Currently, there are several restrictions on the use of databases within a process and among processes. Support for several open databases in a transaction might be useful. Also, processes with different open databases should be permitted to share the same buffer pool.
The current structure requires that the higher layers talk to the buffer manager most of the time, but they must also call the DB methods for allocation, deallocation and directory services. It might be better to develop a common interface that provides all of these features.
The shared memory implementation handles only allocation of memory and does not support deallocation. Users must write their own mechanisms for managing deallocated space. In a system like minirel, where several components use shared memory and are developed separately, this means severe code redundancy: every component has its own free list implementation, and some components require the maintenance of several lists. It would be nice if the shared region class also supported deallocation of memory in some clean way.
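The kind of per-component free list each group ends up writing can be sketched as follows. This is a hypothetical illustration (class and method names are made up): fixed-size blocks are carved from one arena, and freed blocks are chained together by arena offset rather than raw pointer, since different processes may map the shared region at different addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of a fixed-block free list over a contiguous arena. Freed
// blocks store the offset of the next free block in their first bytes,
// so blockSize must be at least sizeof(std::size_t).
class FixedBlockPool {
public:
    static constexpr std::size_t NIL = static_cast<std::size_t>(-1);

    FixedBlockPool(std::size_t blockSize, std::size_t numBlocks)
        : blockSize_(blockSize), arena_(blockSize * numBlocks), freeHead_(NIL) {
        for (std::size_t i = 0; i < numBlocks; ++i)
            freeBlock(i * blockSize_);          // initially all blocks free
    }

    std::size_t allocBlock() {                  // returns an arena offset, or NIL
        if (freeHead_ == NIL) return NIL;
        std::size_t off = freeHead_;
        std::memcpy(&freeHead_, &arena_[off], sizeof freeHead_); // pop the chain
        return off;
    }

    void freeBlock(std::size_t off) {           // push the block on the chain
        std::memcpy(&arena_[off], &freeHead_, sizeof freeHead_);
        freeHead_ = off;
    }

private:
    std::size_t blockSize_;
    std::vector<char> arena_;   // stands in for the shared region
    std::size_t freeHead_;      // offset of the first free block
};
```

A shared region class that offered this (or a general-purpose variant) directly would let every component drop its private copy of the same logic.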
Only page-level locks are supported. This severely restricts the amount of concurrency possible. For instance, a change to a single record on a heapfile page causes the entire page to be locked in exclusive mode, and no other transaction can access that page until the first one commits or aborts. Locks of different granularities and support for a richer variety of lock modes would aid high concurrency.
Byte-level logging causes high overhead when logging changes to a few bits of the space map. We are not aware of any scheme that avoids this overhead.
As regards our implementation, there is some scope for improvement in the way inserts into heapfiles are handled. We are not sure that maintaining a list of pages with some free space on them is a good solution; a better scheme is needed that makes inserts reasonably fast while using the available space efficiently. Also, we have a restriction on the nature of updates that are possible on records; the use of forwarding pointers could remove this restriction.
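One common way forwarding pointers are realized can be sketched as follows; the report does not fix a design, so all names here are hypothetical. When an updated record no longer fits in its home slot, the record moves elsewhere and the home slot keeps the new RID, so the record's original RID remains valid to callers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Record identifier: (page, slot). Ordered so it can key a std::map.
struct RID {
    int page, slot;
    bool operator<(const RID& o) const {
        return page < o.page || (page == o.page && slot < o.slot);
    }
};

// A slot either holds record data or forwards to the record's new home.
struct Slot { bool forwarded; RID fwd; std::string data; };

class ToyHeap {
public:
    RID insert(const RID& at, const std::string& d) {
        slots_[at] = Slot{false, RID{0, 0}, d};
        return at;
    }
    // An update that outgrew the home slot: store the new image at `to`
    // and leave a forwarding pointer behind at `from`.
    void updateAndMove(const RID& from, const RID& to, const std::string& d) {
        slots_[to] = Slot{false, RID{0, 0}, d};
        slots_[from] = Slot{true, to, std::string()};
    }
    std::string fetch(const RID& r) const {     // follows at most one forward
        const Slot& s = slots_.at(r);
        return s.forwarded ? slots_.at(s.fwd).data : s.data;
    }
private:
    std::map<RID, Slot> slots_;
};
```

The cost is one extra page access when fetching a forwarded record; the benefit is that updates may change a record's size without invalidating RIDs held elsewhere.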
During the recovery phase, the recovery manager loads pages into memory and restores the before-images for the aborted transaction. This causes a problem for the space map and the header pages: these pages are kept in shared memory by the space manager, not in the buffer pool, so changes made to them by the recovery manager do not get reflected in the space manager. A possible solution is for the recovery manager to treat these pages specially: given access to the shared memory structures maintained by the space manager, it can make all such changes there rather than in the buffer pool.
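The proposed special-case dispatch can be sketched as follows. This is only an illustration of the idea, not an implemented interface (all names are made up): ordinary pages take the normal buffer-pool path, while space-map and header pages are patched directly in the space manager's shared-memory copy.

```cpp
#include <cassert>
#include <map>
#include <vector>

enum PageKind { ORDINARY, SPACE_MAP, HEADER };

// Sketch of where a restored before-image should land during recovery.
struct RecoveryTarget {
    std::map<int, std::vector<char> > bufferPool;   // pageNo -> contents
    std::map<int, std::vector<char> > sharedRegion; // space-manager pages

    void restoreBeforeImage(int pageNo, PageKind kind,
                            const std::vector<char>& beforeImage) {
        if (kind == ORDINARY)
            bufferPool[pageNo] = beforeImage;   // normal path
        else
            sharedRegion[pageNo] = beforeImage; // bypass the buffer pool so
                                                // the space manager sees it
    }
};
```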