Bulk-Loading Indexes

Detailed Description

Bulk-loading indexes consists of the following steps:

To avoid excessive logging of files that do not need to persist after the bulk-load is done, use the sm_store_property_t property t_load_file for the source files.


Functions

static rc_t ss_m::bulkld_index (const stid_t &stid, int nsrcs, const stid_t *source, sm_du_stats_t &stats, bool sort_duplicates=true, bool lexify_keys=true)
 Bulk-load a B+-Tree index from multiple data sources.
static rc_t ss_m::bulkld_index (const stid_t &stid, const stid_t &source, sm_du_stats_t &stats, bool sort_duplicates=true, bool lexify_keys=true)
 Bulk-load a B+-Tree index from a single data source.
static rc_t ss_m::bulkld_index (const stid_t &stid, sort_stream_i &sorted_stream, sm_du_stats_t &stats)
 Bulk-load a B+-Tree index from a single data stream.
static rc_t ss_m::bulkld_md_index (const stid_t &stid, int nsrcs, const stid_t *source, sm_du_stats_t &stats, int2_t hff=75, int2_t hef=120, nbox_t *universe=NULL)
 Bulk-load a multi-dimensional index from multiple sources.
static rc_t ss_m::bulkld_md_index (const stid_t &stid, const stid_t &source, sm_du_stats_t &stats, int2_t hff=75, int2_t hef=120, nbox_t *universe=NULL)
 Bulk-load a multi-dimensional index from a single source. The storage manager does not provide complete support for non-unique multidimensional indexes. While you may insert multiple (distinct) entries for the same key in a multi-dimensional index, you will not be able to use them; only the first can be retrieved.
static rc_t ss_m::bulkld_md_index (const stid_t &stid, sort_stream_i &sorted_stream, sm_du_stats_t &stats, int2_t hff=75, int2_t hef=120, nbox_t *universe=NULL)
 Bulk-load a multi-dimensional index from a sorted stream source. The storage manager does not provide complete support for non-unique multidimensional indexes. While you may insert multiple (distinct) entries for the same key in a multi-dimensional index, you will not be able to use them; only the first can be retrieved.


Function Documentation

static rc_t ss_m::bulkld_index ( const stid_t &  stid,
int  nsrcs,
const stid_t *  source,
sm_du_stats_t &  stats,
bool  sort_duplicates = true,
bool  lexify_keys = true 
) [static, inherited]

Bulk-load a B+-Tree index from multiple data sources.

Parameters:
[in] stid ID of the index to be loaded.
[in] nsrcs Number of files used for data sources.
[in] source Array of IDs of files used for data sources.
[out] stats Statistics concerning the load activity will be written here.
[in] sort_duplicates If "true", the bulk-load sorts duplicates by value.
[in] lexify_keys If "true", the keys are assumed not to be in lexicographic format, and the bulk-load reformats each key before storing it in the index; otherwise the keys are assumed to be in lexicographic format already.
Lexicographic format is the translation of numbers (int, float, double, unsigned, etc.) into byte strings such that a lexicographic comparison of the byte strings yields the same result as a numeric comparison of the original data.
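
The idea of lexicographic format can be illustrated with a minimal sketch. The helpers below (lexify_i32, lexify_f64, lex_compare) are hypothetical names chosen for this example and are not part of the storage manager's API; the encoding shown is one common order-preserving scheme and is not necessarily the one the storage manager uses internally.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a Shore function): encode a signed 32-bit integer
// as 4 bytes whose byte-wise order matches the numeric order.
std::array<unsigned char, 4> lexify_i32(std::int32_t v) {
    // Flip the sign bit so that negative values sort below non-negative ones.
    std::uint32_t u = static_cast<std::uint32_t>(v) ^ 0x80000000u;
    std::array<unsigned char, 4> out{};
    // Emit big-endian so the most significant byte is compared first.
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>(u >> (24 - 8 * i));
    return out;
}

// Hypothetical helper: the same idea for an IEEE-754 double.
// Note that -0.0 and +0.0 receive different encodings in this sketch.
std::array<unsigned char, 8> lexify_f64(double d) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    assert(d == d);  // NaN has no numeric order, so this sketch rejects it
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    // Negative values: invert every bit, which reverses their order.
    // Non-negative values: set the sign bit, which places them above negatives.
    u = (u >> 63) ? ~u : (u | 0x8000000000000000ull);
    std::array<unsigned char, 8> out{};
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>(u >> (56 - 8 * i));
    return out;
}

// Byte-wise (lexicographic) comparison of two encoded keys of equal length.
template <std::size_t N>
int lex_compare(const std::array<unsigned char, N>& a,
                const std::array<unsigned char, N>& b) {
    return std::memcmp(a.data(), b.data(), N);
}
```

With these helpers, lex_compare(lexify_i32(-5), lexify_i32(3)) is negative, just as -5 < 3 is true numerically, whereas a byte-wise comparison of the raw two's-complement values would order them the other way.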

Note:
The data must already be sorted by key, in the order given by the keys' lexicographic format. The keys themselves need not be stored in lexicographic format; if they are not, lexify_keys must be set to "true".
If there are duplicate keys and sort_duplicates is "true", the bulk-load sorts the elements that share a key; this sort uses a lexicographic comparison of the byte strings that compose the elements.
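
The duplicate handling described in the note can be sketched in memory as follows. This is an illustration only: Entry, bytes_less and sort_duplicates are hypothetical names, the entries live in a std::vector rather than in storage manager files, and the sketch assumes the input is already ordered by key, as the note requires.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for an index entry: (key bytes, element bytes).
using Entry = std::pair<std::string, std::string>;

// Compare two byte strings as unsigned bytes, the way memcmp would.
inline bool bytes_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}

// For input already sorted by key, sort each run of equal keys by element bytes.
void sort_duplicates(std::vector<Entry>& entries) {
    auto run = entries.begin();
    while (run != entries.end()) {
        // Find the end of the run of entries that share run->first as key.
        auto run_end = std::find_if(run, entries.end(), [&](const Entry& e) {
            return e.first != run->first;
        });
        std::sort(run, run_end, [](const Entry& l, const Entry& r) {
            return bytes_less(l.second, r.second);
        });
        run = run_end;
    }
}
```

The relative order of the keys is left untouched; only elements within a run of equal keys are reordered.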

static rc_t ss_m::bulkld_index ( const stid_t &  stid,
const stid_t &  source,
sm_du_stats_t &  stats,
bool  sort_duplicates = true,
bool  lexify_keys = true 
) [static, inherited]

Bulk-load a B+-Tree index from a single data source.

Parameters:
[in] stid ID of the index to be loaded.
[in] source ID of the file used as the data source.
[out] stats Statistics concerning the load activity will be written here.
[in] sort_duplicates If "true", the bulk-load sorts duplicates by value.
[in] lexify_keys If "true", the keys are assumed not to be in lexicographic format, and the bulk-load reformats each key before storing it in the index; otherwise the keys are assumed to be in lexicographic format already.

static rc_t ss_m::bulkld_index ( const stid_t &  stid,
sort_stream_i &  sorted_stream,
sm_du_stats_t &  stats 
) [static, inherited]

Bulk-load a B+-Tree index from a single data stream.

Parameters:
[in] stid ID of the index to be loaded.
[in] sorted_stream Iterator that serves as the data source.
[out] stats Statistics concerning the load activity will be written here.
See sort_stream_i.

static rc_t ss_m::bulkld_md_index ( const stid_t &  stid,
int  nsrcs,
const stid_t *  source,
sm_du_stats_t &  stats,
int2_t  hff = 75,
int2_t  hef = 120,
nbox_t *  universe = NULL 
) [static, inherited]

Bulk-load a multi-dimensional index from multiple sources.

Parameters:
[in] stid ID of the index to be loaded.
[in] nsrcs Number of files used for data sources.
[in] source Array of IDs of files used for data sources.
[out] stats Statistics concerning the load activity will be written here.
[in] hff Heuristic fill factor. Not used.
[in] hef Heuristic expansion factor. Not used.
[in] universe Universal bounding box of all spatial objects indexed.
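
As an illustration of what a universal bounding box is, the sketch below computes the smallest box that encloses a set of boxes. Box2 and universe_of are hypothetical names for this example; Box2 is a simplified two-dimensional stand-in and does not reflect the actual interface of nbox_t.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 2-D box used only for illustration; not Shore's nbox_t.
struct Box2 {
    int xlo, ylo, xhi, yhi;
};

// Return the smallest box that encloses every box in `boxes`.
// The input must be non-empty.
Box2 universe_of(const std::vector<Box2>& boxes) {
    assert(!boxes.empty());
    Box2 u = boxes.front();
    for (const Box2& b : boxes) {
        u.xlo = std::min(u.xlo, b.xlo);
        u.ylo = std::min(u.ylo, b.ylo);
        u.xhi = std::max(u.xhi, b.xhi);
        u.yhi = std::max(u.yhi, b.yhi);
    }
    return u;
}
```

For the boxes (0,0)-(2,2) and (-1,1)-(1,5), the result is the box (-1,0)-(2,5).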

static rc_t ss_m::bulkld_md_index ( const stid_t &  stid,
const stid_t &  source,
sm_du_stats_t &  stats,
int2_t  hff = 75,
int2_t  hef = 120,
nbox_t *  universe = NULL 
) [static, inherited]

Bulk-load a multi-dimensional index from a single source. The storage manager does not provide complete support for non-unique multidimensional indexes. While you may insert multiple (distinct) entries for the same key in a multi-dimensional index, you will not be able to use them; only the first can be retrieved.

Parameters:
[in] stid ID of the index to be loaded.
[in] source ID of the file used as the data source.
[out] stats Statistics concerning the load activity will be written here.
[in] hff Heuristic fill factor. Not used.
[in] hef Heuristic expansion factor. Not used.
[in] universe Universal bounding box of all spatial objects indexed.

static rc_t ss_m::bulkld_md_index ( const stid_t &  stid,
sort_stream_i &  sorted_stream,
sm_du_stats_t &  stats,
int2_t  hff = 75,
int2_t  hef = 120,
nbox_t *  universe = NULL 
) [static, inherited]

Bulk-load a multi-dimensional index from a sorted stream source. The storage manager does not provide complete support for non-unique multidimensional indexes. While you may insert multiple (distinct) entries for the same key in a multi-dimensional index, you will not be able to use them; only the first can be retrieved.

Parameters:
[in] stid ID of the index to be loaded.
[in] sorted_stream Sorted input stream that serves as the data source.
[out] stats Statistics concerning the load activity will be written here.
[in] hff Heuristic fill factor. Not used.
[in] hef Heuristic expansion factor. Not used.
[in] universe Universal bounding box of all spatial objects indexed.


Generated on Mon Apr 11 13:51:01 2011 for Shore Storage Manager by  doxygen 1.4.7