Adaptive Transactional Memory Test Platform

Sun has recently announced that its forthcoming multicore processor, code-named Rock, will support a form of hardware transactional memory (HTM). The Adaptive Transactional Memory Test Platform (ATMTP) extends GEMS to model Rock's HTM instructions and to provide a first-order approximation of the success and failure characteristics of transactions on Rock. ATMTP is not an accurate model of Rock’s implementation or performance, but nonetheless will allow developers to test and tune their code for Rock, before Rock-based systems are available. ATMTP will also allow GEMS users to experiment with extending its functionality or adapting it to model variations on Rock’s HTM support to guide research on future HTM implementations.

ATMTP is brought to you by the Scalable Synchronization Research Group of Sun Microsystems Laboratories. For information about GEMS or to download the latest version of GEMS (which includes ATMTP) see the GEMS website.

ATMTP users are encouraged to read our TRANSACT 2008 papers about ATMTP. Please cite the first one in any publication describing use of ATMTP.

http://research.sun.com/scalable/pubs/TRANSACT2008-ATMTP.pdf

http://research.sun.com/scalable/pubs/TRANSACT2008-ATMTP-Apps.pdf

Acknowledgements

We thank the Multifacet Group at the University of Wisconsin for their support using the GEMS and LogTM simulators and for agreeing to distribute the ATMTP with GEMS. We are especially grateful to Luke Yen for his hard work and patience with us during the ATMTP integration process. We also thank Shailender Chaudhry, Bob Cypher, Martin Karlsson, and Marc Tremblay for many useful conversations about Rock’s design and behavior.

Feedback? Questions?

We would like to hear about your experience with ATMTP, positive or otherwise. Please send email to atmtp-feedback AT sun.com to tell us about your plans or experience using ATMTP, to report any problems with the simulator or documentation, to receive future announcements about ATMTP, or if you would like to contribute to future releases.

Setting up ATMTP

First, follow the guidelines for setting up Simics and your GEMS environment (Setup). Before beginning the instructions on the QuickStart page, add a symbolic link to the file Rock.h, as follows:

cd $GEMS/simics/src/extensions/ruby
ln -s ../../../../ruby/Rock/Rock.h

Next, set the default value for XACT_MEMORY to true in $GEMS/ruby/config/rubyconfig.defaults. Finally, build Ruby following the directions in the QuickStart guide, but specifying the protocol "MESI_CMP_filter_directory" instead of "MOSI_SMP_bcast." For more information on that protocol and the LogTM-SE transactional memory simulator included in GEMS, see: Transactional Memory. ATMTP currently works only with Ruby (not Opal) so ignore the directions for building Opal.

Creating ATMTP Programs

ATMTP uses a different interface than the LogTM-SE simulator to represent transactions (the LogTM-SE simulator interface is described in TM Workload Setup). For ATMTP (and Rock), transactions are specified by the instructions chkpt and commit. Chkpt takes a single argument, a PC-relative address (the fail_pc) to which execution jumps if the transaction fails. Commit takes no arguments. The chkpt instruction begins a hardware transaction. If the processor executes a commit instruction, all instructions between the chkpt and commit will have executed atomically and the transaction will have succeeded. Otherwise, if the transaction fails, the program will behave as if the chkpt was simply a branch to the fail_pc (label 1 in the example below) and none of the intervening code will have executed. A simple hardware transaction:

        chkpt 1f
        ... transaction body
        commit
        ba  2f
1:      ... failure path
2:      ...

For convenience, we have included C/C++ wrapper functions that can be used to execute Rock transactions without assembly programming. See $GEMS/microbenchmarks/atmtp_examples/rock-tm-test/rock_tm.* for the wrapper code. To write the above transaction using the C++ wrappers:

if(begin_transaction()){
        // transaction body
        commit_transaction();
} else {
        // failure path
}

Because transactions may abort repeatedly, we strongly recommend against blindly retrying failed transactions (e.g., by using the address of the chkpt instruction as the fail_pc). Programs can avoid repeated transaction failures by deciding whether to retry a failed transaction based on the contents of the checkpoint status register (%cps), which reports the cause of failure. The cps register is a bit vector with the following bits of information (bit 0 is unused):

  1. COH: (Coherence) Another processor executed a conflicting memory request
  2. TCC: (Trap) Executed a software trap
  3. INST: (Instruction) Executed an instruction not supported in transactions
  4. PREC: (Precise Exception) Generated a precise exception (e.g., TLB miss or divide by zero)
  5. ASYNC: (Asynchronous) An asynchronous interrupt was received
  6. SIZ: (Size) Exceeded store queue
  7. LD: (Load) A line in read set was evicted from the L1 cache or a load missed in the DTLB
  8. ST: (Store) A store missed in the DTLB
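The bit layout above can be turned into a small decoder. The following is a minimal sketch, assuming the list numbering gives the bit position of each cause (so COH is bit 1 and ST is bit 8). The constant names and the function cps_causes are hypothetical and are not taken from rock_tm.h.

```cpp
#include <string>
#include <vector>

// Assumed masks: bit positions follow the numbered list (bit 0 is unused).
// These names are illustrative, not identifiers from rock_tm.h.
constexpr unsigned CPS_COH   = 1u << 1;  // conflicting memory request
constexpr unsigned CPS_TCC   = 1u << 2;  // software trap
constexpr unsigned CPS_INST  = 1u << 3;  // unsupported instruction
constexpr unsigned CPS_PREC  = 1u << 4;  // precise exception
constexpr unsigned CPS_ASYNC = 1u << 5;  // asynchronous interrupt
constexpr unsigned CPS_SIZ   = 1u << 6;  // store queue exceeded
constexpr unsigned CPS_LD    = 1u << 7;  // read-set eviction or load DTLB miss
constexpr unsigned CPS_ST    = 1u << 8;  // store DTLB miss

// Return the names of every failure cause set in a cps value,
// in ascending bit order. More than one bit may be set at once.
std::vector<std::string> cps_causes(unsigned cps) {
    static const struct { unsigned mask; const char* name; } table[] = {
        {CPS_COH, "COH"},     {CPS_TCC, "TCC"}, {CPS_INST, "INST"},
        {CPS_PREC, "PREC"},   {CPS_ASYNC, "ASYNC"}, {CPS_SIZ, "SIZ"},
        {CPS_LD, "LD"},       {CPS_ST, "ST"},
    };
    std::vector<std::string> causes;
    for (const auto& entry : table) {
        if (cps & entry.mask) {
            causes.push_back(entry.name);
        }
    }
    return causes;
}
```

A program could log the result of cps_causes after each failed transaction to see which causes dominate before choosing a retry strategy.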

Programs can read the cps register with the rd instruction:

rd %cps, <destination register>

Or, programs written in C/C++ can use the read_cps_reg wrapper function in rock_tm.h:

unsigned int read_cps_reg();
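One way to act on the cps value is a bounded retry loop with a non-transactional fallback. The sketch below is only illustrative: the choice of which causes are worth retrying is an assumption of this example, not a rule stated by ATMTP or Rock, and the names worth_retrying and run_with_bounded_retry are hypothetical. The masks assume the bit positions match the numbering of the list of cps bits. The transaction attempt, the cps read, and the fallback are passed in as callables so that the policy can be exercised without Rock hardware.

```cpp
#include <functional>

// Assumed masks (bit positions taken from the numbered cps list).
constexpr unsigned CPS_COH  = 1u << 1;  // conflicting memory request
constexpr unsigned CPS_TCC  = 1u << 2;  // software trap
constexpr unsigned CPS_INST = 1u << 3;  // unsupported instruction
constexpr unsigned CPS_SIZ  = 1u << 6;  // store queue exceeded

// Illustrative policy (an assumption): retry only when a coherence conflict
// was reported and no cause suggests the same code would fail again.
bool worth_retrying(unsigned cps) {
    const unsigned likely_persistent = CPS_TCC | CPS_INST | CPS_SIZ;
    if (cps & likely_persistent) {
        return false;
    }
    return (cps & CPS_COH) != 0;
}

// try_once: runs one transaction attempt; returns true if it committed.
// read_cps: returns the cps value after a failed attempt.
// fallback: non-transactional path (for example, a lock-based version).
// Returns true if a transaction committed, false if the fallback ran.
bool run_with_bounded_retry(const std::function<bool()>& try_once,
                            const std::function<unsigned()>& read_cps,
                            const std::function<void()>& fallback,
                            int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (try_once()) {
            return true;
        }
        if (!worth_retrying(read_cps())) {
            break;  // give up early on causes unlikely to go away
        }
    }
    fallback();
    return false;
}
```

With the rock_tm.h wrappers, try_once would presumably wrap the begin_transaction()/commit_transaction() pattern and read_cps would call read_cps_reg(). The attempt limit keeps a contended transaction from spinning indefinitely.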

Using ATMTP

To use ATMTP, the following parameter values must be set in Ruby:

  1. ATMTP_ENABLED = true;
  2. XACT_LAZY_VM = true;
  3. XACT_EAGER_CD = true;

Set them either at compile time in $GEMS/ruby/config/rubyconfig.defaults or at runtime with the command "ruby0.setparam_str <parameter> <true/false>" at the Simics prompt. For example:

simics> ruby0.setparam_str ATMTP_ENABLED "true"
simics> ruby0.setparam_str XACT_LAZY_VM "true"
simics> ruby0.setparam_str XACT_EAGER_CD "true"

To run ATMTP, set these parameters, then invoke Ruby as specified in the GEMS QuickStart guide.

For examples of how to run ATMTP, see the section titled "Running the ATMTP Examples," which describes the three example ATMTP/Rock programs that are included in GEMS 2.1 in the directory $GEMS/microbenchmarks/atmtp_examples.

Configuration Options

Running the ATMTP Examples

ATMTP comes with three example programs: "rock-tm-test," "cps-test," and "stl-vector-test." All three are included in the GEMS 2.1 distribution in the directory $GEMS/microbenchmarks/atmtp_examples. Running these examples on ATMTP requires installing and configuring ATMTP and GEMS as described above and creating at least one Simics checkpoint of a booted USIII+ (e.g., sarek) machine (for help creating a checkpoint, see the Simics documentation: https://www.simics.net/). Once you have a checkpoint and have built and configured ATMTP, follow the instructions below to compile and run the examples.

Compiling the Example Programs

Each example program comes with a makefile that will compile the program using Sun Studio version 12 (CC version 5.9), which is available for free download at http://developers.sun.com/sunstudio/. To use this downloaded version of the compiler, simply unzip and untar the downloaded distribution file and include the SUNWspro/bin directory in your path. Although we recommend updating to the latest version of Sun Studio, it is possible to compile the examples with an older compiler. With CC v5.9, use the provided makefiles. With CC v5.5 through v5.8, edit the provided makefiles, replacing "-m64" with "-xarch=v9".

rock-tm-test

The first example, rock-tm-test, is a "hello world" for Rock transactions. The program creates a group of threads, each of which executes a series of simple transactions that atomically increment one of a set of counters. The transactions are likely to succeed in almost all cases.

To compile rock-tm-test, make sure the Sun Studio (http://developers.sun.com/sunstudio/) compiler, CC, is in your path (or, edit the Makefile to specify the location of the compiler). Then, build the example program by running make.

bash> cd $GEMS/microbenchmarks/atmtp_examples/rock-tm-test
bash> export PATH=$COMPILER_DIR:$PATH
bash> make

Rock-tm-test takes two arguments: the number of threads and the number of transactions. We have provided a Simics script (rock-tm-test.simics) that executes rock-tm-test on ATMTP. By default, the script creates one thread for each processor of the simulated machine and runs 10000 transactions. To run rock-tm-test using rock-tm-test.simics, go to the DESTINATION directory specified in the Ruby build (e.g., $GEMS/simics/home/MESI_CMP_filter_directory). From that directory, invoke Simics using an existing checkpoint (with the -c option) and instruct it to execute rock-tm-test.simics (with the -x option), as shown below.

bash> cd $GEMS/simics/home/MESI_CMP_filter_directory
bash> export SIMICS_EXTRA_LIB=./modules
bash> ./simics -c <checkpoint> -x ../../../microbenchmarks/atmtp_examples/rock-tm-test/rock-tm-test.simics

cps-test

The second example, cps-test, demonstrates several different causes of transaction failure in Rock and tests the ATMTP transaction failure feedback via the cps register. This program also binds threads to processors to prevent thread migration.

As with rock-tm-test, to compile cps-test, make sure the Sun Studio compiler (CC) is in your path (or edit the Makefile to specify the location of the compiler). Then, build the example program by running make.

bash> cd $GEMS/microbenchmarks/atmtp_examples/cps-test
bash> export PATH=$COMPILER_DIR:$PATH
bash> make

Cps-test takes three arguments: the number of threads, the number of iterations executed by the larger tests, and a number representing the machine configuration (see cps-test.cpp for details). To run cps-test, set appropriate environment variables in your shell and execute cps-test.simics from the DESTINATION directory specified in the Ruby build, specifying a checkpoint file with the -c option, as shown below. By default, cps-test.simics will run cps-test with one thread per simulated processor, 10000 iterations, and an automatically selected machine configuration.

bash> cd $GEMS/simics/home/MESI_CMP_filter_directory
bash> export SIMICS_EXTRA_LIB=./modules
bash> ./simics -c <checkpoint> -x ../../../microbenchmarks/atmtp_examples/cps-test/cps-test.simics

Notice the output of cps-test in the simulated console window. The program prints out the number of successful transactions for each test as well as the number of transactions that failed due to each of several causes (as reported via the cps register).

stl-vector-test

The third example, stl-vector-test, provides an example of one possible use of Rock's HTM, eliding locks in critical sections. In this case, the critical sections are operations on a vector from the standard template library (STL).
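The general shape of lock elision can be sketched as follows. This is not the code used in stl-vector-test; it is a minimal illustration in which the HTM begin and commit operations are passed in as callables (hypothetical stand-ins for wrappers such as begin_transaction and commit_transaction), and the lock is a simple spin lock built on std::atomic<bool>. The function name elide_lock is also hypothetical.

```cpp
#include <atomic>

// Run `body` as a critical section protected by `lock_held`.
// First try a hardware transaction that only reads the lock word;
// if the transaction cannot be used, acquire the lock instead.
// `begin` returns true when a transaction has started and false when the
// attempt failed; `commit` ends a running transaction.
// Returns true if the lock was elided, false if it was acquired.
template <typename Begin, typename Commit, typename Body>
bool elide_lock(std::atomic<bool>& lock_held, Begin begin, Commit commit,
                Body body) {
    if (begin()) {
        // Reading the lock word adds it to the transaction's read set, so a
        // later lock acquisition by another thread conflicts with us.
        if (!lock_held.load(std::memory_order_relaxed)) {
            body();
            commit();
            return true;
        }
        commit();  // lock is held: end the empty transaction, use the lock
    }
    // Fallback path: take the lock for real (simple spin lock).
    while (lock_held.exchange(true, std::memory_order_acquire)) {
        // spin until the current holder releases the lock
    }
    body();
    lock_held.store(false, std::memory_order_release);
    return false;
}
```

A production version would normally combine this with a retry decision based on the cps register rather than falling back to the lock after a single failed attempt.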

To compile stl-vector-test, run the script "compile-all" (see example below). That script builds four versions of the program stl-vectorFull. Two use HTM to elide locks (stl-vectorFull.htm.oneLock and stl-vectorFull.htm.rwLock); the other two use locks for synchronization. Two use read-write locks, which are elided or not (stl-vectorFull.htm.rwLock and stl-vectorFull.noTM.rwLock); the other two use a single exclusive lock, which is elided in stl-vectorFull.htm.oneLock.

bash> cd $GEMS/microbenchmarks/atmtp_examples/stl-vector-test
bash> export PATH=$COMPILER_DIR:$PATH
bash> ./compile-all

Stl-vector-test depends on the standard template library and requires that the file libstlport.so.1 be available for run-time linking. To run stl-vector-test, first copy that file (from $COMPILER_DIR/lib/stlport4/v9/) to $GEMS/microbenchmarks/atmtp_examples/stl-vector-test/. Then, as above, set appropriate environment variables in your shell and execute stl-vector-test.simics from the DESTINATION directory specified in the Ruby build. For example:

bash> cp $COMPILER_DIR/lib/stlport4/v9/libstlport.so.1 $GEMS/microbenchmarks/atmtp_examples/stl-vector-test/
bash> cd $GEMS/simics/home/MESI_CMP_filter_directory
bash> export SIMICS_EXTRA_LIB=./modules
bash> ./simics -c <checkpoint> -x ../../../microbenchmarks/atmtp_examples/stl-vector-test/stl-vector-test.simics

To run other versions of the program (e.g., to measure the impact of this use of HTM), edit the command line that launches the stl-vectorFull program in stl-vector-test.simics (e.g., replacing stl-vectorFull.htm.oneLock with stl-vectorFull.nohtm.oneLock).

Interpreting the Results

Ruby counts and profiles several memory system events, including several related to transactional memory. ATMTP adds counters that track ATMTP-specific events. To access the ATMTP and Ruby statistics, invoke the command "dump-stats" on the Ruby object from the simulator console. For example, to print Ruby's statistics to the file my_output.stats, execute the following command:

simics> ruby0.dump-stats "my_output.stats"

Transactional memory statistics all include the prefix "xact" and are described on the TransactionalMemory Wiki page, under the heading "Output Statistics."
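Because the transactional memory statistics share the "xact" prefix, they can be pulled out of a statistics dump with a simple text filter. The sketch below makes no assumption about the layout of the dump beyond one statistic per line; it keeps every line that mentions "xact" anywhere, in case names are indented or qualified. The function name xact_lines is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Return every line of `stats_text` that contains the substring "xact".
// `stats_text` is expected to hold the contents of a file written by
// dump-stats; reading the file is left to the caller.
std::vector<std::string> xact_lines(const std::string& stats_text) {
    std::vector<std::string> matches;
    std::istringstream in(stats_text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("xact") != std::string::npos) {
            matches.push_back(line);
        }
    }
    return matches;
}
```

The match is case-sensitive, so statistics spelled with an upper-case "XACT" would need an additional check.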

In addition to the statistics provided by Ruby and LogTM-SE, ATMTP provides the following statistics:

ATMTP (last edited 2008-06-09 18:53:22 by KevinMoore)