From Vertical Research Group

Main: Tools

On this page:

  1. Optimization based spatial architecture scheduler (October 2013)
  2. DySER Framework x86 toolchain (compiler, simulator) (October 2013)
  3. DySER Framework SPARC toolchain (compiler, simulator, Verilog, FPGA tutorial/bringup instructions) (June 2013)
  4. ISA Power Struggles data and detailed techreport (February 2013)
  5. iCompiler (February 2013)
  6. Dark Silicon Models (January 2012)
  7. Parsec on CUDA (July 2011)
  8. Timing Speculation analysis
  9. MapReduce for Cell

Our group has a strong record of making our research tools publicly available and supporting them. Below is a list of tools, newest first. Email us with any questions, dangling pointers, etc.

1.  Optimization based spatial architecture scheduler (October 2013)

This page describes our ILP scheduler tool. It can be downloaded here (source tarball). A web version is also available at the Live demo link.

If you use this tool, please cite this PLDI paper or journal article:

Related Material:

2.  DySER Framework x86 toolchain (compiler, simulator) (October 2013)

We have released the compiler and simulator for the entire DySER toolchain. This page describes the x86-based toolchain, which we have released as a virtual machine. Click here.

Related papers:

3.  DySER Framework SPARC toolchain (compiler, simulator, Verilog, FPGA tutorial/bringup instructions) (June 2013)

This is our SPARC-based toolchain for the DySER project. Its compiler is slightly less sophisticated than the x86 compiler. On the other hand, this toolchain includes all of our Verilog and tutorials for FPGA bringup. Click here.

4.  ISA Power Struggles data and detailed techreport (February 2013)

We have released a techreport and all of the data from our ISA Power Struggles HPCA paper. Click here.

5.  iCompiler (February 2013)

One outcome of our idempotence work is iCompiler, an LLVM-based compiler that outputs programs consisting of continuous idempotent regions. The tools page for iCompiler is here. If you use the tool, we would appreciate a citation to either of these papers:

6.  Dark Silicon Models (January 2012)

We have developed a web-based interface for exploring the models used in our ISCA-10 paper on Dark Silicon. A webpage devoted to this tool explains the details and hosts the web model. Click here.

Related papers:

7.  Parsec on CUDA (July 2011)

We have developed GPU implementations in CUDA of some of the PARSEC benchmarks, specifically blackscholes, fluidanimate, streamcluster, and swaptions.

Please note that these files are provided AS IS and can be improved in many respects. While we performed some performance optimization, there is more to be done. We do not claim that these are optimal implementations. The code is presented only as a representative CUDA implementation of these workloads. It is NOT meant to be interpreted as a definitive answer to how well these applications can perform on GPUs or CUDA. If you are interested in improving the performance of these benchmarks, please let us know.

Link to paper. Please cite it if you use our work (BibTeX).

Note also that this implementation is based on CUDA SDK 2.3. Later versions of CUDA support more C++ features, which may simplify this code or enable other optimizations (our paper notes some of these places).

The benchmarks were released on July 13th, 2011. To request a download of this implementation, email the following addresses:

Important Notes:

8.  Timing Speculation analysis

Based on the timing speculation models in our DSN paper, "A Unified Model for Timing Speculation: Evaluating the Impact of Technology Scaling, CMOS Design Style, and Fault Recovery Mechanism," we have developed a web-based tool that makes those models available for others to use. See below.

9.  MapReduce for Cell

MapReduce is a simple and flexible parallel programming model proposed by Google for large-scale data processing in a distributed computing environment. We have designed and implemented MapReduce for the Cell processor architecture.

The runtime is available for public download here. The package includes an application suite that demonstrates usage of the runtime, including word count, distributed sort, k-means clustering, and several other applications.
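For readers new to the programming model, the word count application mentioned above can be sketched in a few lines. This is only an illustration of the map, group, and reduce phases written sequentially in Python. It is not the API of our Cell runtime, and every function name in it is hypothetical.

```python
from collections import defaultdict


def map_word_count(document):
    """Map phase: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def group_by_key(pairs):
    """Group intermediate values by key, the step a runtime performs between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_word_count(word, counts):
    """Reduce phase: combine all counts emitted for a single word."""
    return word, sum(counts)


def run_word_count(documents):
    """Run the three phases sequentially (a real runtime would run them in parallel)."""
    intermediate = []
    for document in documents:
        intermediate.extend(map_word_count(document))
    grouped = group_by_key(intermediate)
    return dict(reduce_word_count(word, counts) for word, counts in grouped.items())
```

Calling `run_word_count` on a list of strings returns a dictionary from each lowercased word to its total count. The user supplies only the map and reduce functions, and the grouping step is what a MapReduce runtime takes care of on the user's behalf.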

If you use this work in your own work, we would appreciate you letting us know. If you want to cite MapReduce for Cell in your research writings, please refer to the paper "MapReduce for the Cell B.E. Architecture" by M. de Kruijf and K. Sankaralingam, University of Wisconsin Computer Sciences Technical Report CS-TR-2007-1625, October 2007. An extended version of this paper is also published in the IBM Journal of Research and Development (Volume 53, Issue 5).

Page last modified on December 13, 2013, at 01:49 PM