Automated Checkpoint Generation
This article covers the automatic checkpoint creation scripts contained in GEMS_ROOT/checkpoints. Most of the content here repeats what is contained in the checkpoints README.
1. Introduction
The scripts contained in the GEMS checkpoints directory are used to create Simics checkpoints for a Serengeti Simics target machine running Solaris. No other target is supported, and future support for other targets should not be expected.
The scripts serve two related functions. First, they can be used to create "naked checkpoints," which consist of a Simics target machine configuration with Solaris installed and running. These naked checkpoints can be used as a base for installing and/or running evaluation workloads. The second function is the automatic installation and checkpointing of commercial workloads. Currently, there is support for Apache, SPECjbb2000, and dss. Support for oltp is present in the GEMS 2.1 release but untested.
The specific configuration of the naked machines is controlled by the parameters of a "naked checkpoint family". A family configuration is used as the base for several machine configurations that differ in CPU count and memory size. A family is defined partly by a set of Simics scripts (the naked scripts, located in checkpoints/simics_scripts) and partly by several parameters in the system.conf configuration file. As these checkpointing scripts mature, more of the functionality contained in the Simics scripts will be extracted and made controllable via configuration parameters.
The workload creation scripts will create two checkpoints: a cold checkpoint taken immediately after the workload has been installed and invoked, and a warm checkpoint that is taken after a number of transactions have transpired (determined by parameters in workload.conf). The workloads are loaded on a naked checkpoint system.
2. Usage Requirements
1. A Simics host machine with Simics 2.x installed (we use Simics 2.2.19 internally). This is the machine from which you will launch the checkpointing scripts. If creating an apache workload, this machine must be connected to the Internet. If you require a proxy server, make sure the proxy configuration is set in system.conf.
2. If creating any workload checkpoint, a SPARC machine running Solaris is needed. This can be the same machine running your Simics guest, or a remote machine to which you have ssh access. It is used to natively compile the workloads that are ultimately loaded into a naked checkpoint. In the configuration files, all parameters beginning with SPARC_BUILD_MACHINE refer to this machine. (NOTE: it is possible to compile and configure workloads inside the simulated naked machine by altering the checkpoint scripts. If you choose to do so, however, you are on your own.)
3. Source files. You will need to download several source/image files before creating a checkpoint. At a minimum, you will need the CD images for a Solaris installation. For dss or oltp, you will also need an evaluation version of DB2.
3. Quick Start
3.1. Setup Steps (only do once)
- Download the first two installation CD images for Solaris from the Sun website and set the configuration variables OS_NAME, SIMICS_OS_INSTALL_CD_1, and SIMICS_OS_INSTALL_CD_2 in system.conf accordingly. Note that the paths you specify in SIMICS_OS_INSTALL_CD_X should point to the *unzipped* CD images. Currently, Solaris Express, Solaris 10, Solaris 9, and Solaris 8 are supported.
- Read through system.conf and adjust any paths and/or configuration options to suit your needs. A detailed explanation of each parameter can be found in the configuration file.
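As an illustration only, the Solaris-related settings in system.conf might end up looking like the fragment below. The paths, ISO file names, and the exact value syntax for OS_NAME are placeholders; the comments in system.conf itself describe the accepted forms.

```shell
# Illustrative system.conf fragment -- paths and the OS_NAME value
# are placeholders; see the comments in system.conf for accepted forms.
OS_NAME="solaris10"
SIMICS_OS_INSTALL_CD_1="/scratch/isos/solaris10-sparc-cd1.iso"
SIMICS_OS_INSTALL_CD_2="/scratch/isos/solaris10-sparc-cd2.iso"
```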
3.2. Naked Checkpoint creation
- To create a naked checkpoint from the family described in system.conf, run the naked-check-create.sh script, passing the number of CPUs and memory size (in MB) to the script. e.g.:
./naked-check-create.sh 16 65536
3.3. Workload Creation
- To create a checkpoint containing a running workload in a system of a specified size, run the workload-check-create.sh script, passing the number of CPUs, memory size (in MB), and workload name to the script ("all" can also be specified as the workload name, which will create checkpoints for all the workloads these scripts know how to create). e.g.:
./workload-check-create.sh 16 65536 apache
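If you want checkpoints for several machine sizes, the script can simply be invoked once per configuration. A small sketch that prints the invocations (piping its output to sh would run them); the 4096 MB-per-CPU memory ratio here is an arbitrary example, not a GEMS requirement:

```shell
# Print a workload-check-create.sh invocation per CPU count.
# The 4 GB-per-CPU ratio is purely illustrative.
gen_invocations() {
    workload=$1; shift
    for cpus in "$@"; do
        echo "./workload-check-create.sh $cpus $((cpus * 4096)) $workload"
    done
}

gen_invocations apache 1 4 16
```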
4. Details of Script Operation
There are two configuration files that you should inspect and adjust according to your needs. The first, system.conf, pertains to options that affect the creation of naked checkpoints, including the target system OS, hardware configuration, file paths, and Simics version information. The second, workload.conf, contains parameters that affect the compilation and execution of workloads that are set up in a naked checkpoint environment.
4.1. Naked checkpoint creation
./naked-check-create.sh 16 65536
This command performs up to two steps, skipping any whose output already exists:
- If no disk image for the naked checkpoint family exists, the script will boot and install Solaris, saving the installation to a disk image in the directory NAKED_CHECKPOINT_DIR.
- If a checkpoint file corresponding to the system described by the naked checkpoint family and the command-line parameters cannot be found (i.e., has not already been created), the Solaris disk image created in step 1 is rebooted, the OS is reconfigured for the CPU count and memory size specified at the command line, host filesystem support is set up, and finally the system is checkpointed and written to the directory specified by NAKED_CHECKPOINT_DIR in system.conf.
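The skip-if-present logic of the two steps can be sketched as below. Only the two-step structure comes from the description above; the file names and directory layout are assumptions, not the script's actual naming.

```shell
# Hypothetical sketch of naked-check-create.sh's skip logic.
# File names under the checkpoint directory are illustrative.
steps_needed() {
    dir=$1 cpus=$2 mem_mb=$3
    if [ ! -f "$dir/solaris.img" ]; then
        echo "step 1: boot installer, install Solaris, save disk image"
    fi
    if [ ! -f "$dir/naked-${cpus}p-${mem_mb}MB.conf" ]; then
        echo "step 2: reconfigure for $cpus CPUs / $mem_mb MB, checkpoint"
    fi
}
```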
4.2. Workload checkpoint creation
./workload-check-create.sh 16 65536 apache
This command performs up to four steps, skipping any whose output already exists:
- naked-check-create.sh is invoked to create a naked checkpoint for the given system size if one does not already exist.
- If a previous build of the workload(s) is not found in the directory WORKLOAD_BUILD_DIR, the script invokes a remote build on the SPARC build machine, based on the parameters in workload.conf.
- If a cold checkpoint for the given workload, naked checkpoint family, processor count, and memory size is not found, then one is created. The naked checkpoint for the specified system size is booted in Simics, the workload binaries and associated files are copied into the guest filesystem, and the workload application is started. After hitting the first magic break delineating a single transaction completion, a cold checkpoint of the workload is taken and stored in WORKLOAD_CHECKPOINT_DIR.
- If a warm checkpoint for the given workload, naked checkpoint family, processor count, and memory size is not found, then one is created. The cold checkpoint for the specified system is reloaded and the workload is run for a number of transactions indicated by configuration parameters in workload.conf. After enough transactions have transpired, a warm checkpoint is taken and stored in WORKLOAD_CHECKPOINT_DIR.
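The warm-up step above amounts to counting magic breakpoints (one per completed transaction) until the threshold from workload.conf is reached. A minimal sketch of that loop, with a plain text stream standing in for Simics magic-break events; the event name and target-count parameter are illustrative, not taken from workload.conf:

```shell
# Count "magic-break" events until the configured transaction count is
# reached, then signal that the warm checkpoint should be taken.
warm_up() {
    target=$1 count=0
    while read -r event; do
        if [ "$event" = "magic-break" ]; then
            count=$((count + 1))
            if [ "$count" -ge "$target" ]; then
                echo "warm checkpoint after $count transactions"
                return 0
            fi
        fi
    done
    return 1
}
```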
5. Using Naked Checkpoints
Naked checkpoints can be created with an arbitrary number of CPUs and memory size, up to the limits allowed by Simics. The checkpoints are set up with support for mounting the host file system to facilitate easy manual installation of workloads. To map the host filesystem into the simics guest, type "mount /host" at the command prompt of the guest machine after the naked checkpoint has booted.
Naked checkpoints can also be integrated with microbenchmark scripts. The Simics command scripts used to start microbenchmarks can be pointed at the latest naked checkpoints via the environment variables available in the workloads dictionary: simply have the microbenchmark script read its configuration from the latest naked checkpoint family instead of an older checkpoint, such as those from the golden or silver families. In a Simics command file, e.g. barnes.simics, locate and modify the following:
checkpoint_dir = workloads.get_var(env_dict, "checkpoint_dir")
checkpoint = workloads.get_var(env_dict, "checkpoint")
read-configuration "%s/%s" % (checkpoint_dir, checkpoint)
6. Workload Specific Details
Below are quirks specific to individual scripts.
6.1. naked checkpoints
The naked checkpoints need the first two ISO CD images for the Solaris operating system. Download a release of either Solaris Express or Solaris 10 from the Sun Microsystems website, then point the configuration parameters in system.conf at the files.
6.2. apache
The source for apache is downloaded with wget during the execution of workload-check-create.sh. Be sure to set any proxy information in system.conf before beginning installation.
6.3. jbb
The official SPECjbb2000 release needs to be modified before using it with our scripts. The details can be found in the configuration comments in workload.conf.
6.4. dss
The dss workload needs a release of DB2 version 9 or later. You can retrieve this source from the IBM website. After downloading the tarball, indicate its location in workload.conf.
6.5. oltp
As of GEMS 2.1, there is no official release of an oltp script. A fairly straightforward modification of remote_build/dss_script.py will render an oltp-like workload, however. To do so, obtain a copy of the DBmbench release (http://www.ece.cmu.edu/~simflex/software.html) and navigate to the tpcc driver, which itself consists of three files. You can use this driver by calling it instead of the tpch driver found on the last line of the setup_cmds list in dss_build.py. Of course, you will also need to transfer the new driver into Simics beforehand by issuing a cp command from /host/<path to tpcc driver> to the guest.