Raw Data
This section describes how to obtain and interpret the raw data used for this paper. The data consists of the output from the gem5-gpu simulator and supporting scripts for parsing, filtering, and displaying that output.
How to obtain
A tarball of all of the data is available at this address: http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/files/raw_data.tar.bz2
The above tarball is 216 MB compressed and about 4 GB uncompressed.
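As a convenience, fetching and unpacking the tarball can be sketched in Python using only the standard library. The URL is the one given above; the function names fetch and extract, and the local file names in the usage note, are illustrative assumptions and are not part of the released scripts.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the text above.
RAW_DATA_URL = "http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/files/raw_data.tar.bz2"


def fetch(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest


def extract(archive, out_dir):
    """Unpack a bzip2-compressed tarball into out_dir.

    Returns the sorted top-level names found in the archive.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(str(out_dir))
    return sorted({name.split("/")[0] for name in names})
```

Calling extract(fetch(RAW_DATA_URL, "raw_data.tar.bz2"), "raw_data") would download the archive once and unpack it into a local raw_data directory. Allow for roughly 4 GB of disk space, per the sizes listed above.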
How to interpret
The tarball contains a set of directories, each of which contains the data for all applications for a particular configuration of the simulator. The raw statistics can be found in the leaf directories under the name "stats.txt". This file is generated by gem5-gpu automatically and contains all of the statistics that were registered in the code as gem5 stats. These statistics include all of the gem5 statistics, gem5-gpu statistics, and additional statistics added for this paper. The tarball also contains a script (getData.py) used to filter and process the raw stats output, and some images that were used in the paper. Below we describe the file structure of this data, briefly describe the scripts included, and finally describe how graphs were generated for the paper.
File structure
Each top-level directory contains the output of one particular configuration of the simulator. The directories are named based on the configuration with the following general format: <L1 TLB size>-<Shared L2 TLB size>-<page walk cache size>. Additionally, "-p" refers to TLB prefetching enabled, and "mt<num>" refers to the depth of the multithreaded page table walker. Finally, the directory "baseline" is the baseline we compared to, described in the paper.
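The naming scheme above can be decoded mechanically. The following is a minimal sketch, assuming that every field (including "p" and "mt<num>") is separated by a dash and that the size fields appear in the order listed. The function name parse_config_name, the dictionary keys, and the example name in the usage note are hypothetical, so check them against the actual directory names in the tarball.

```python
import re


def parse_config_name(name):
    """Split a configuration directory name into its parts.

    Assumes dash-separated fields: the size fields in the order given in
    the text, plus optional "p" and "mt<num>" markers.
    """
    if name == "baseline":
        return {"baseline": True}
    info = {"baseline": False, "prefetch": False, "walker_depth": None}
    sizes = []
    for field in name.split("-"):
        match = re.fullmatch(r"mt(\d+)", field)
        if field == "p":
            info["prefetch"] = True
        elif match:
            info["walker_depth"] = int(match.group(1))
        else:
            sizes.append(field)  # kept as strings; units are not interpreted
    keys = ("l1_tlb", "shared_l2_tlb", "page_walk_cache")
    info.update(zip(keys, sizes))
    return info
```

For a made-up name such as "64-1024-8k-p-mt32", this returns the three size fields as strings, sets prefetch to True, and sets walker_depth to 32.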
Each top-level directory has the following layout:
- rodinia-nocopy
- Simulation size (test<simsmall<sim1day<simlarge): We used the largest size for each benchmark that ran to completion for our data.
- benchmark
- stats.txt: gem5 statistics (the main stats file)
- gpu_stats.txt: GPU specific stats
- run.rev: A file that records the version, patches, and other details of the simulator used to run this workload. See "How to configure and compile" below for details.
- config.ini: Details the configuration of the simulator for this run.
- Other files not used for the data in this paper. See gem5 and gem5-gpu documentation for details
- benchmark
- Simulation size (test<simsmall<sim1day<simlarge): We used the largest size for each benchmark that ran to completion for our data.
- run: a script to run all of the workloads under this configuration
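To locate every raw statistics file without relying on the exact nesting order of the directories, the extracted tree can simply be walked. This is a sketch only; find_stats_files is an illustrative name and is not a function from getData.py.

```python
import os


def find_stats_files(root):
    """Yield (directory parts relative to root, full path) for each stats.txt."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if "stats.txt" in filenames:
            rel = os.path.relpath(dirpath, root)
            yield tuple(rel.split(os.sep)), os.path.join(dirpath, "stats.txt")
```

The first element of each yielded pair holds the configuration, suite, benchmark, and size components of the path, which can then be used to group results.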
Scripts
The file getData.py is a Python script that parses the stats.txt file for each of the configurations. It is not meant to be used standalone. Instead, it should be imported by other Python scripts, which extract the data and generate graphs or other output. Below in the "Graphs" section we describe how we generated the graphs for this paper.
getData.py contains classes to wrap the gem5 stats types (like Hist), and a class for the general m5 stats file. This main class (Stats) is used to parse the statistics files and each instance of a Stats class can be queried to obtain (almost) any stat in the stats file.
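For readers who only need a few scalar values, the general shape of a gem5 stats.txt file (a name, a value, and a "#" comment on each line, between "Begin" and "End" marker lines) can be read with a few lines of Python. This is a minimal sketch and not the Stats class from getData.py: it ignores histograms and vector details, and the values in SAMPLE are made up for illustration.

```python
# Made-up sample in the general layout of a gem5 stats.txt dump.
SAMPLE = """
---------- Begin Simulation Statistics ----------
sim_seconds                                  0.000500                       # Number of seconds simulated
sim_insts                                      200000                       # Number of instructions simulated

---------- End Simulation Statistics   ----------
"""


def parse_stats(text):
    """Return {stat name: float value} for the first statistics dump in text."""
    stats = {}
    for line in text.splitlines():
        if line.startswith("---------- End"):
            break  # keep only the first dump
        body = line.split("#", 1)[0].split()
        if len(body) < 2 or line.startswith("-"):
            continue  # blank lines and marker lines
        try:
            stats[body[0]] = float(body[1])
        except ValueError:
            continue  # non-numeric entries are skipped
    return stats
```

Only the first numeric column of each line is kept, so percentage and cumulative columns on distribution lines are dropped.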
Additionally, getData.py contains a Benchmark class which is used to query the statistics for some particular benchmark (like backprop, in fs mode, size simsmall). This class also has many helper functions such as "getPKIStat", which automatically divides the stat by the total number of instructions executed (get data per kilo-instruction).
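The per-kilo-instruction normalization described above amounts to a single division. The sketch below shows the arithmetic only; per_kilo_instruction is a hypothetical helper and does not reproduce the signature or internals of getPKIStat.

```python
def per_kilo_instruction(value, instructions):
    """Scale a raw event count to events per 1000 executed instructions."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    return value * 1000.0 / instructions
```

For example, 50 events over 10000 instructions corresponds to 5 events per kilo-instruction.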
There are also helper functions included in getData.py meant to be used to find the largest run of a benchmark that completed and gather its stats, as well as printing and plotting stats.
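Selecting the largest completed run can be sketched as follows. Two assumptions are made here that the text does not confirm: that each benchmark directory holds one subdirectory per simulation size, and that a non-empty stats.txt indicates the run completed. The helper in getData.py may use different criteria.

```python
import os

# Simulation sizes from smallest to largest, as listed in the file structure.
SIZES = ("test", "simsmall", "sim1day", "simlarge")


def largest_completed(benchmark_dir):
    """Return the largest size with a non-empty stats.txt, or None if there is none."""
    for size in reversed(SIZES):
        stats = os.path.join(benchmark_dir, size, "stats.txt")
        # Assumption: an empty or missing stats.txt means the run did not finish.
        if os.path.isfile(stats) and os.path.getsize(stats) > 0:
            return size
    return None
```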
Finally, getData.py contains a set of helper functions to take raw stats from the Benchmark objects and generate interesting statistics, like TLB miss rate, memory accesses per kilo-cycle, etc.
Graphs
We generated the graphs for this project with the Python package Matplotlib, and we used IPython notebook to organize the graphs and data. Links to the notebook files we used are given below. To use the notebook files yourself, you will need to install and configure Matplotlib and IPython notebook, and you will need to run an IPython notebook server on your computer. Finally, the notebooks have been sanitized by removing our local paths, so you will need to adjust the paths if you want to regenerate any graphs from the data in the above tarball.
To view the notebook you can follow this link: http://nbviewer.ipython.org/url/research.cs.wisc.edu/multifacet/gpummu-hpca14/GPUMMU-data-hpca-2014.ipynb
Or download the notebook from http://research.cs.wisc.edu/multifacet/gpummu-hpca14/GPUMMU-data-hpca-2014.ipynb
Raw Code
In this section, we describe how to reproduce the experiments that were run to generate the data above. We describe how to obtain, compile, and run the code. Additionally, we describe how to reproduce the configurations used in this work.
How to obtain
Code
The code is contained in a set of Mercurial repositories. Specifically, it is a set of patch queues on top of a particular version of gem5-gpu. For information about how to use Mercurial and patch queues, please see the Mercurial documentation.
Below are the steps required to check out the code:
- Check out gem5-gpu. See http://gem5-gpu.cs.wisc.edu/
- Update to the following changesets:
- gem5: 57aac1719f86
- gem5-gpu: 353fd0030d60
- gpgpu-sim: 65e93a2eddf9
- Check out project-specific patch queues:
- cd gem5
- hg qq -c personal
- cd .hg
- hg clone http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/repo/gem5-patches-gpummu-hpca14 patches-personal
- cd ../../gem5-gpu/.hg/
- hg clone http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/repo/gem5-gpu-patches-gpummu-hpca14 patches
- cd ../../gpgpu-sim/
- hg qq -c personal
- cd .hg
- hg clone http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/repo/gpgpu-sim-patches-gpummu-hpca14 patches-personal
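The checkout steps above can also be written down as data and replayed. The following sketch assumes that the gem5, gem5-gpu, and gpgpu-sim repositories sit side by side under one root directory; the names STEPS and run_steps are illustrative. It uses only the hg commands, changesets, and URLs listed above, and by default it prints the plan without running anything.

```python
import os
import subprocess

BASE = "http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/repo/"

# (working directory relative to the checkout root, command) pairs.
STEPS = [
    ("gem5", ["hg", "update", "-r", "57aac1719f86"]),
    ("gem5-gpu", ["hg", "update", "-r", "353fd0030d60"]),
    ("gpgpu-sim", ["hg", "update", "-r", "65e93a2eddf9"]),
    ("gem5", ["hg", "qq", "-c", "personal"]),
    ("gem5/.hg", ["hg", "clone", BASE + "gem5-patches-gpummu-hpca14", "patches-personal"]),
    ("gem5-gpu/.hg", ["hg", "clone", BASE + "gem5-gpu-patches-gpummu-hpca14", "patches"]),
    ("gpgpu-sim", ["hg", "qq", "-c", "personal"]),
    ("gpgpu-sim/.hg", ["hg", "clone", BASE + "gpgpu-sim-patches-gpummu-hpca14", "patches-personal"]),
]


def run_steps(root, steps=STEPS, dry_run=True):
    """Return the planned shell lines; run each command only if dry_run is False."""
    lines = []
    for cwd, cmd in steps:
        workdir = os.path.join(root, cwd)
        lines.append("(cd {} && {})".format(workdir, " ".join(cmd)))
        if not dry_run:
            subprocess.check_call(cmd, cwd=workdir)
    return lines
```

Printing the result of run_steps(".") shows the plan; passing dry_run=False executes it and stops at the first failing command.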
Workloads
The workloads we used come from the Rodinia benchmark suite and are included in the gem5-gpu repositories. Update the workloads to the following changesets:
- benchmarks/rodinia-nocopy: 80d632b78b02
- benchmarks/libcuda: 45e1ffdfd958
- benchmarks/common: c7eaccf4f118
Extra files
Since we used gem5 full-system mode, other files, including the disk images and kernel image, are required. Note: These are large files (about 5 GB in total for the disk images and 56 MB for the kernel image, uncompressed). The images are available below:
- kernel image (21 MB compressed, 56 MB uncompressed)
- root disk image (140 MB compressed, 530 MB uncompressed)
- benchmark disk image (1.2 GB compressed, 4.3 GB uncompressed)
Additionally, for this work we used a set of scripts to run gem5-gpu. You can obtain those by checking out the mercurial repository as follows.
- hg qclone http://gem5-gpu.cs.wisc.edu/gpummu-hpca14/repo/regression
- hg up -r a0a8f0bf8d44
- hg qpush -a
How to configure and compile
A script is included that automatically updates all of the repositories and patch queues to the revisions given in a .rev file. This script can be found at regression/revisions.py. It can be used to both retrieve and restore a set of revisions across all of the gem5-gpu repositories. We use this script to set up and save the configurations used to generate our data.
To update to a particular revision (restore) or to record the current set of revisions (retrieve) using the revisions script:
- regression/revisions.py -f <revision file> restore
- regression/revisions.py -f <revision file> retrieve
For each configuration that we tested, there is a file run.rev saved to the output directory which contains the revision information. This file can be restored using the above command to recreate the environment that configuration used.
How to run
Compiling
These instructions are the same as for gem5-gpu.
- cd gem5
- scons --default=X86 build/VI_hammer_tlb/gem5.opt PROTOCOL=VI_hammer GPGPU_SIM=True EXTRAS=../gem5-gpu/src:../gpgpu-sim
Running
We used the scripts found in the regression/ directory for running gem5-gpu. This is not a requirement. To run all of the benchmarks with the test inputs:
- mkdir run
- cd run
- mkdir <some config>
- cd !$
- ../../regression/regress.py -eVI_hammer_tlb --gem5-fusion-root=../../ --config-params=--work-end-exit-count=1 -f -u rodinia-nocopy -m test
Running the workloads this way will generate output directories that mirror those found in the data in the first part of this document.
There is some documentation on the regression script in the code and in its help output (--help), but the script is unsupported.