The SSM provides a hierarchy of storage structures. A description of each type of storage structure is given below, followed by a description of the identifiers used to refer to them.
A device is a location, provided by the operating system, for storing data. In the current implementation, a device is either a disk partition or an operating system file. A device is identified by the name used to access it through the operating system. Each device is managed by a single server. A device has a quota. The sum of the quotas of all the volumes on a device cannot exceed the device quota. Note: devices are currently limited to containing only one volume.
The device management interface is part of class ss_m and is described in device(ssm).
For each mounted device, the server forks a process called diskrw (determined by the sm_diskrw option) to perform asynchronous I/O on the device. These processes communicate with the server through sockets and shared memory, so your OS must be configured with shared memory support.
A volume is a collection of file and index storage structures (described below) managed as a unit. All storage structures reside entirely on one volume. A volume has a quota specifying how must large it can grow. Every volume has a dedicated B+-tree index, called the root index to be used for cataloging the data on the volume.
The volume management interface is part of class ss_m and is described in volume(ssm).
A record is an un-typed container of bytes, consisting of a tag, header and body. The tag is a small, read-only location that stores the record size and other implementation-related information. The header is variable length (limited to what will fit on a page) location for a VAS to store meta-information about the record (such as its type). The body is the primary data storage location and can range in size from zero bytes to 4-GB. A record can grow and shrink in size by operations that append and truncate bytes at the end of the record.
A file is a collection of records. Files are used for clustering records and have an interface for iterating over all the records they contain. The number of records that a file can hold is limited only by the space available on the volume containing the file. The minimum size of a file is 64K-bytes (8 pages). We are working on ways to reduce this to 8K, but in either case, using a file to store a collection containing only a few small records will waste space.
Methods for creating/destroying files and creating/destroying/modifying records are part of class ss_m and described in file(ssm). There is a pin_i class for pinning records for reading and modifying. This class is documented in pin_i(ssm). There are the classes scan_file_i for iterating over the records in a file, and append_file_i for appending records to a file. Both are described in scan_file_i(ssm).
The B+tree index facility provides associative access to data. Keys and their associated values can be variable length (up to the size of a page). Keys can be composed of any of the basic C-language types plus variable length character strings. A bulk-loading facility is provided. The number of key-value pairs that an index can hold is limited only by the space available on the volume containing the index. The minimum size of a B+tree index is 8K-bytes (1 page).
Methods for index operations are part of class ss_m and described in btree(ssm). There is scan_index_i class for iterating over a range of keys in the index This class is documented in scan_index_i(ssm).
An R-Tree is a height-balanced tree structure designed specifically for indexing multi-dimensional spatial objects. It stores the minimum bounding box (with 2 or higher dimension) of a spatial object as the key in the leaf pages. The current implementation in SHORE is a variant of R-Tree called R*-Tree [BKSS], which improves the search performance by using a better heuristic for redistributing entries and dynamically reorganizing the tree during insertion. Currently, only 2-dimensional R*-trees with integer coordinates are supported by the SSM. A bulk-loading facility is provided. The number of key-value pairs that an index can hold is limited only by the space available on the volume containing the index. The minimum size of an R*tree index is 64K-bytes (8 pages).
The R*-Tree implementation stores [key, value] pairs, where the key is of type nbox_t. and the value is of type vec_t. A 2-D nbox_t is a rectangle which stores coordinates in the order of x_low, y_low, x_high, y_high (lower left point and higher right point). Currently, only integer values are supported for the coordinates.
Methods for R*-tree index operations are part of class ss_m and described in rtree(ssm). There is scan_rt_i class for iterating over a range of keys in the index This class is documented in scan_rt_i(ssm).
Volumes, files, records and indexes all have identifiers. There are two broad categories of identifiers: logical and physical. Logical IDs are location-independent meaning there is a level of indirection for mapping the ID to a physical location. Physical IDs are location-dependent, meaning they refer to the physical location (usually location on disk) of the referenced object. Although the SSM has both physical and logical ID versions of its interface, only the version using logical IDs is supported at this time.
Volume IDs are a globally unique, 8-byte long ID called lvid_t described in lid_t(common). File, record and index IDs are identified by a serial number, serial_t, unique to the volume containing them. Serial numbers are currently 4 bytes long, but we plan to make them 8 bytes long in the future. The complete ID for a file, index or record is a combination of the volume ID and serial number call lid_t and described in lid_t(common). Once the object a serial number identifies is destroyed, the serial number will never be reused.
The purpose of having two-part IDs (volume ID, serial number) is to allow local (ie. intra-volume) references to be stored using only the serial number. This can significantly reduce the size of databases containing records with lots of pointers. To keep remote (ie. inter-volume) references just as short, special serial numbers for remote references are used.
Methods for operating on IDs and generating remote references are described in lid(ssm).