Compiling Instrumented Executables

The instrumentor provides a driver script that acts as a transparent wrapper around GCC. It is run as follows:

sampler-cc { C source file | GCC compiler flag | instrumentor flag ...}

Recognized instrumentor flags are as follows:

Instrumentor flags

-fsampler-scheme=scheme

Activate the named instrumentation scheme. This flag may be used multiple times to activate multiple schemes. However, the order in which active schemes are applied is fixed and does not depend on the order in which they are named on the command line.

Important

If this flag never appears on the command line, then no instrumentation scheme is activated and no instrumentation is added to the compiled code.

The following instrumentation schemes are currently available:

Instrumentation schemes

atoms

For each statement which accesses a potentially shared memory location, count how many times the last access to the same location was from the same thread or from a different thread. Each statement induces one instrumentation point with a pair of counters:

  1. last access from same thread

  2. last access from different thread

bounds

At each assignment of a scalar value, record the minimum and maximum values ever assigned. Each assignment induces one instrumentation point with one global min/max pair:

  1. minimum value

  2. maximum value

The minimum component is initialized to the maximum representable value for the assigned type, whereas the maximum component is initialized to the minimum representable value for the assigned type. A site which has never been sampled, then, will have a minimum which is greater than the corresponding maximum.
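
As an illustration of the initialization rule just described, here is a minimal sketch of one bounds site for int assignments. The struct and function names are hypothetical and are not taken from the instrumentor's generated code.

```c
#include <limits.h>

/* Hypothetical stand-in for one "bounds" site tracking int assignments. */
struct bounds_site {
    int min;
    int max;
};

/* Initial state: min starts at the largest int, max at the smallest. */
static void bounds_init(struct bounds_site *site)
{
    site->min = INT_MAX;
    site->max = INT_MIN;
}

/* Record one sampled assignment of `value`. */
static void bounds_record(struct bounds_site *site, int value)
{
    if (value < site->min) site->min = value;
    if (value > site->max) site->max = value;
}

/* A site whose min exceeds its max has never been sampled. */
static int bounds_never_sampled(const struct bounds_site *site)
{
    return site->min > site->max;
}
```

A report reader can apply the same min > max test to tell an unsampled site apart from one that only ever saw a single value, where min equals max.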

See also the -fassign-across-pointer, -fassign-into-field, and -fassign-into-index flags for ways to customize which kinds of assignments are instrumented.

branches

For each conditional (branch), count how many times the branch predicate is false or true. This includes if statements as well as branches that are implicit in looping control structures and certain operators (&&, ||, ?:). Each branch induces one instrumentation point with a pair of counters:

  1. branch predicate false

  2. branch predicate true
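
The following hand-written sketch imitates what the branches scheme counts, assuming every instrumentation point runs unconditionally (as with -fno-sample). The observe helper and the site variables are hypothetical. The real instrumentor inserts its own code, which may look quite different.

```c
/* Hypothetical counter pair for one branch site: [0] = false, [1] = true. */
struct branch_site {
    unsigned long counts[2];
};

/* Bump the matching counter and pass the predicate's truth value through. */
static int observe(struct branch_site *site, int predicate)
{
    site->counts[predicate ? 1 : 0]++;
    return predicate;
}

static struct branch_site loop_site, if_site;

/* Counts positive elements; the loop test and the if are separate sites. */
static int count_positive(const int *values, int n)
{
    int found = 0;
    for (int i = 0; observe(&loop_site, i < n); i++) {
        if (observe(&if_site, values[i] > 0))
            found++;
    }
    return found;
}
```

Note that the loop condition is a branch site in its own right: a loop over three elements evaluates it four times, three times true and once false.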

coverage

Count how many times each statement is executed. This scheme currently has a performance bug: it also counts executions of scalar-pairs instrumentation, so we suggest not using coverage and scalar-pairs together on large programs. Each statement induces one instrumentation point with one counter:

  1. statement executed

data

For each statement that accesses a potentially aliased memory location, count how many times that statement is executed. Each such statement induces one instrumentation point with one counter:

  1. memory accessed

The -ftrace-sites flag changes this scheme to record the addresses of memory accessed.

float-kinds

At each assignment of a floating point value, count how many times the assigned value is in each of nine possible categories. Each such assignment induces one instrumentation point with a nonuple of counters:

  1. assigned value is -Inf

  2. assigned value is negative and normalized, but neither -Inf nor -0

  3. assigned value is negative and denormalized

  4. assigned value is -0

  5. assigned value is NaN

  6. assigned value is +0

  7. assigned value is positive and denormalized

  8. assigned value is positive and normalized, but neither +Inf nor +0

  9. assigned value is +Inf

This scheme applies to all real floating point types: float, double, and long double. It does not apply to complex floating types.
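
The nine categories can be reproduced with the standard fpclassify and signbit macros from <math.h>. The sketch below is one plausible classification for double values. The function name float_kind is hypothetical, and the instrumentor's own test may be implemented differently.

```c
#include <math.h>

/* Returns 1..9, matching the counter order listed above. */
static int float_kind(double value)
{
    int negative = signbit(value) != 0;

    switch (fpclassify(value)) {
    case FP_NAN:       return 5;
    case FP_INFINITE:  return negative ? 1 : 9;
    case FP_ZERO:      return negative ? 4 : 6;
    case FP_SUBNORMAL: return negative ? 3 : 7;
    default:           return negative ? 2 : 8;   /* FP_NORMAL */
    }
}
```

NaN is checked without regard to its sign bit, so all NaN values land in the single middle category.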

See also the -fassign-across-pointer, -fassign-into-field, and -fassign-into-index flags for ways to customize which kinds of assignments are instrumented.

function-entries

Count how many times each function is called. Each function body induces one instrumentation point with one counter:

  1. function entered

g-object-unref

This scheme is intended for use with the GLib Object System. Before each call to g_object_unref, check the current reference count for the object about to be unref'd. Each such call induces one instrumentation point with a quadruple of counters:

  1. zero references: object already being reclaimed

  2. one reference: object about to be reclaimed

  3. more than one reference: object not about to be reclaimed

  4. invalid: argument is not a GObject instance

returns

At each scalar-returning call site, count how many times the called function returned a negative, zero, or positive value. Each such call induces one instrumentation point with a triple of counters:

  1. return value negative

  2. return value zero

  3. return value positive
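
Many C library functions encode their outcome in the sign of the return value, which is what makes this triple informative. The sketch below hand-simulates one returns site around a call to strcmp. The struct, the record_return helper, and the same_text wrapper are hypothetical names used only for illustration.

```c
#include <string.h>

/* Hypothetical counter triple for one call site. */
struct returns_site {
    unsigned long negative, zero, positive;
};

/* Classify one observed return value by sign. */
static void record_return(struct returns_site *site, long value)
{
    if (value < 0)
        site->negative++;
    else if (value == 0)
        site->zero++;
    else
        site->positive++;
}

static struct returns_site strcmp_site;

/* The strcmp call below is the scalar-returning call site being observed. */
static int same_text(const char *a, const char *b)
{
    int result = strcmp(a, b);
    record_return(&strcmp_site, result);
    return result == 0;
}
```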

scalar-pairs

At each assignment of a scalar value, count how many times the assigned value is less than, equal to, or greater than each other same-typed in-scope variable. Each such comparable variable at each such assignment induces one instrumentation point with a triple of counters:

  1. assigned value less than other

  2. assigned value equal to other

  3. assigned value greater than other
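
To make the "one instrumentation point per comparable variable" rule concrete, the sketch below simulates a single assignment to x with two other int variables in scope, which yields two sites. All names are hypothetical, and the placement of the comparisons relative to the store is an assumption rather than a description of the generated code.

```c
/* Hypothetical counter triple: [0] less, [1] equal, [2] greater. */
struct pair_site {
    unsigned long counts[3];
};

/* Compare the assigned value against one other in-scope variable. */
static void compare(struct pair_site *site, int assigned, int other)
{
    if (assigned < other)
        site->counts[0]++;
    else if (assigned == other)
        site->counts[1]++;
    else
        site->counts[2]++;
}

static struct pair_site x_vs_a, x_vs_b;

static int difference(int a, int b)
{
    int x;

    x = a - b;                 /* the assignment being instrumented */
    compare(&x_vs_a, x, a);    /* one site for the pair (x, a) */
    compare(&x_vs_b, x, b);    /* another site for the pair (x, b) */
    return x;
}
```

Because the number of sites grows with the number of same-typed variables in scope, this scheme can add far more instrumentation than the others.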

This scheme can also optionally compare each assigned value to each constant-valued integer expression seen in the program. This can be useful to trap problems relating to fixed-size buffers, structure sizes, or other magic values. To turn on constant expression comparisons, add -fcompare-constants to the compiler command line.

See also the -fassign-across-pointer, -fassign-into-field, and -fassign-into-index flags for ways to customize which kinds of assignments are instrumented.

-fsample, -fno-sample

Enable or disable sampling of instrumentation points. If sampling is disabled, instrumentation points are run unconditionally. Default is to sample.

-finclude-function=function, -finclude-function=*, -fexclude-function=function, -fexclude-function=*

Function filtering. Each of these flags takes one mandatory argument which can be the name of a function or the special wildcard * which names every function. These flags can be given multiple times, creating an ordered include/exclude list. Each function that might be instrumented is checked against this list. The first match determines whether that function should be included or excluded from instrumentation.

These flags are useful for filtering out boring functions:

-fexclude-function=boringFunction

They can also be used to create executables with just a few selected functions instrumented:

-finclude-function=interestingFunction -fexclude-function=*

Default is to include all functions, as though -finclude-function=* were implicitly the last flag given.
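
The first-match rule and the implicit default can be modeled as a short loop over the flags in command-line order. This is a sketch of the documented behavior only. The function_rule type and function_included name are hypothetical and say nothing about how the driver stores its list internally.

```c
#include <string.h>

/* Hypothetical rule: include is 1 for -finclude-function, 0 for -fexclude-function. */
struct function_rule {
    int include;
    const char *pattern;   /* a function name, or "*" for every function */
};

/* Returns nonzero if `function` should be instrumented. */
static int function_included(const struct function_rule *rules, size_t count,
                             const char *function)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(rules[i].pattern, "*") == 0 ||
            strcmp(rules[i].pattern, function) == 0)
            return rules[i].include;   /* first match wins */
    }
    return 1;   /* implicit trailing -finclude-function=* */
}
```

With the rules from the second example above (include interestingFunction, then exclude *), only interestingFunction is instrumented. Reversing the two flags would exclude everything, because the wildcard would match first.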

Tip

If you use the wildcard, remember to protect it from expansion by the shell, for example by quoting it as '*' or "*".

-finclude-file=filename, -finclude-file=*, -fexclude-file=filename, -fexclude-file=*, -finclude-location=filename:line-number, -finclude-location=*:line-number, -fexclude-location=filename:line-number, -fexclude-location=*:line-number

File name and line number filtering. Each of these flags takes one mandatory argument which can be the name of a source file or the special wildcard * which names every file. For the location-based flags, the filename argument must be followed by a colon (:) and a decimal line number. Line numbers may not be given as *; to match all lines in a file, use the file-based flags rather than the location-based flags.

These flags can be given multiple times, creating an ordered include/exclude list. Each potential instrumentation point is checked against this list, using the source location (file name and line number) containing that point. The first match determines whether that instrumentation point should be retained or discarded.

Included and excluded locations are checked against names as seen by the compiler. These will tend to be simple file names for source files given on the command line, but may be relative or absolute path names for header files pulled in by the preprocessor. File names must match exactly to be considered. In some cases it may be necessary to examine the preprocessor output to learn what file name the compiler is seeing for a given fragment of code.

Note that checks are done using the name of the file actually containing the potential instrumentation point. If source file code.c brings in header file header.h, and that header contains a complete function definition, then instrumentation points in that function body will be included or excluded based on header.h as their file name, not code.c.

Also note that this check is performed at each individual instrumentation point. If a single function contains code from multiple source files, this filter can include some instrumentation points while excluding others. This can happen, for example, in Bison parsers and Flex lexers which mix user-specified actions with fixed boilerplate. One can instrument the actions while excluding the boilerplate:

-fexclude-file=/usr/gnu/share/bison.simple -fexclude-file=lex.yy.c

Default is to include instrumentation points from all files, as though -finclude-file=* were implicitly the last flag given.
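
The exact-match requirement on file names is easy to trip over, so the sketch below models it explicitly. It assumes that file-based and location-based flags share one ordered list, which the text above suggests but does not state outright. The location_rule type and location_included name are hypothetical.

```c
#include <string.h>

/* Hypothetical rule; line == 0 marks a file-based rule matching every line. */
struct location_rule {
    int include;
    const char *file;   /* exact file name as seen by the compiler, or "*" */
    int line;
};

/* Returns nonzero if a point at file:line should be retained. */
static int location_included(const struct location_rule *rules, size_t count,
                             const char *file, int line)
{
    for (size_t i = 0; i < count; i++) {
        int file_matches = strcmp(rules[i].file, "*") == 0 ||
                           strcmp(rules[i].file, file) == 0;
        int line_matches = rules[i].line == 0 || rules[i].line == line;

        if (file_matches && line_matches)
            return rules[i].include;   /* first match wins */
    }
    return 1;   /* implicit trailing -finclude-file=* */
}
```

In this model a rule for lex.yy.c does not match a point that the compiler reports as ./lex.yy.c, which is why it can be necessary to inspect the preprocessor output for the name actually in use.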

Tip

If you use the wildcard, remember to protect it from expansion by the shell, for example by quoting it as '*' or "*".

-fsampler-random=randomizer

Random countdown management. randomizer must be one of the following:

online

Random countdowns are generated dynamically, as needed, while the program runs.

offline

A fixed bank of random countdowns is generated and stored before the instrumented program is launched.

fixed

Countdowns are not random. Samples are taken according to some fixed period determined when the program is launched.

See the section called “Low-Level Control Via Environment Variables” for how this choice affects which environment variables should be set at run time. Default is online.
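
The idea of a countdown can be sketched for the fixed randomizer, where every reset uses the same period. The variable and function names below are hypothetical. The sketch assumes a period of at least 1 and does not reflect how the runtime actually reads its configuration.

```c
/* Hypothetical global "next sample" countdown and its fixed period. */
static unsigned countdown;
static unsigned period;

/* Assumes fixed_period >= 1. */
static void start_fixed_sampling(unsigned fixed_period)
{
    period = fixed_period;
    countdown = fixed_period;
}

/* Called at every instrumentation point; returns 1 when this one is sampled. */
static int should_sample(void)
{
    if (--countdown > 0)
        return 0;
    countdown = period;   /* a random scheme would draw a fresh countdown here */
    return 1;
}
```

With a period of 3, every third instrumentation point is sampled. The online and offline randomizers differ only in where the next countdown value comes from.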

-fshow-stats, -fno-show-stats

Static metrics. Used for data collection when writing papers. Default is silent operation. Safe to ignore.

-fspecialize-empty-regions, -fno-specialize-empty-regions, -fspecialize-singleton-regions, -fno-specialize-singleton-regions

Region specialization optimizations. Default is to perform all specializations. Safe to ignore.

-fuse-points-to, -fno-use-points-to

Use a conservative points-to analysis to identify possible callees at indirect function call sites. This feature is experimental and should be used with caution.

-fcompare-constants, -fno-compare-constants

When using the scalar-pairs instrumentation scheme, compare each assigned value to each constant-valued integer expression seen anywhere in the program. This can be useful to trap problems relating to fixed-size buffers, structure sizes, or other magic values, but may substantially increase the amount of instrumentation in the code. Defaults to not comparing with constant expressions.

-fassign-across-pointer, -fno-assign-across-pointer

When using the float-kinds or scalar-pairs instrumentation scheme, instrument an assignment whose left-hand side crosses a pointer. Defaults to not instrumenting pointer-crossing assignments.

Clarification is needed here about treatment of complex lvalues with a mix of pointer, field, and array accesses.

-fassign-into-field, -fno-assign-into-field

When using the float-kinds or scalar-pairs instrumentation scheme, instrument an assignment whose left-hand side is a structure field access. Defaults to not instrumenting structure field assignments.

Clarification is needed here about treatment of complex lvalues with a mix of pointer, field, and array accesses.

-fassign-into-index, -fno-assign-into-index

When using the float-kinds or scalar-pairs instrumentation scheme, instrument an assignment whose left-hand side is an array index access. Defaults to not instrumenting indexed array assignments.

Clarification is needed here about treatment of complex lvalues with a mix of pointer, field, and array accesses.
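
For simple left-hand sides, the three -fassign-* flags above distinguish the following kinds of assignment. The function below is a hypothetical example written for this section. It deliberately avoids mixed lvalues such as p->items[i].x, whose treatment is not specified here.

```c
struct point {
    double x;
};

/* Performs one assignment of each kind and returns the sum of stored values. */
static double assignment_kinds(void)
{
    double plain;
    double target = 0.0;
    double *pointer = &target;
    struct point field = { 0.0 };
    double array[2] = { 0.0, 0.0 };

    plain = 1.0;       /* plain variable: no -fassign-* flag involved */
    *pointer = 2.0;    /* crosses a pointer: see -fassign-across-pointer */
    field.x = 3.0;     /* structure field: see -fassign-into-field */
    array[1] = 4.0;    /* array index: see -fassign-into-index */

    return plain + target + field.x + array[1];
}
```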

-fthreads, -fno-threads

Generate code suitable for multi-threaded execution. Defaults to generating single-threaded code. Multi-threaded code is also generated if the standard GCC -pthread flag is observed on the command line.

If both -fno-threads and -pthread are given, then the former overrides the latter regardless of the order in which they appear. This can be useful to override automatically generated GCC command lines which use -pthread but which do not actually need thread support.

Multithreaded code requires a native C compiler with support for thread-local storage: GCC 3.2 at a minimum, 3.3 preferred. Linux distributions with support for the Native POSIX Threads Library (NPTL) will offer the best performance. Even so, there will be some additional overhead beyond that of single-threaded code, so do not enable multithreaded code generation unless your application is really multithreaded.

-frename-locals, -fno-rename-locals

Uniquely rename all local variables and formal parameters. This can be useful to avoid confusing multiple same-named variables when examining instrumentation site information. A variable var within a function func is renamed as func$var. Defaults to no renaming.
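
Tools that post-process site information may need to build or recognize these names. A minimal sketch of the func$var convention follows. The helper name renamed_local is hypothetical.

```c
#include <stdio.h>

/* Writes "func$var" into `out`; returns nonzero if the result fit. */
static int renamed_local(char *out, size_t size,
                         const char *func, const char *var)
{
    int written = snprintf(out, size, "%s$%s", func, var);
    return written >= 0 && (size_t)written < size;
}
```

For example, a variable argc inside main would be reported as main$argc.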

-fcache-countdown, -fno-cache-countdown

Cache the global next sample countdown in local variables. This is an important performance optimization. Defaults to caching, and should only be disabled when benchmarking the effect of this particular optimization.
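
The general shape of this optimization can be sketched as follows: the global is read once on entry, a local copy is decremented at each instrumentation point, and the result is written back once on exit. This is an assumption about the technique in general, not a transcription of the generated code. All names and the fixed reset value of 10 are hypothetical.

```c
/* Hypothetical global countdown and sample tally. */
static unsigned long next_sample_countdown = 10;
static unsigned long samples_taken;

/* Simulates a function body containing `points` instrumentation points. */
static void process(int points)
{
    unsigned long local = next_sample_countdown;   /* read the global once */

    for (int i = 0; i < points; i++) {
        if (--local == 0) {
            samples_taken++;   /* this point is sampled */
            local = 10;        /* fixed reset, for illustration only */
        }
    }

    next_sample_countdown = local;                 /* write it back once */
}
```

Keeping the countdown in a local variable lets the compiler hold it in a register across many instrumentation points instead of touching memory at each one.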

-fpredict-checks, -fno-predict-checks

Emit static branch prediction hints when checking the global next sample countdown. This is a minor performance optimization. Defaults to emitting prediction hints, and should only be disabled when benchmarking the effect of this particular optimization.

-ftrace-sites, -fno-trace-sites

Use LTTng-UST to record a trace of all instrumentation sites instead of incrementing counters and dumping a summary report. When using the data instrumentation scheme, the trace will record the address of each memory access. When using the branches, data, and function-entries instrumentation schemes together, a complete dynamic dependence graph can be reconstructed from a trace.