Introduction:
The LLVM-based DySER compiler is a collection of LLVM optimization
passes that transform the LLVM bitcode and generate optimized code for DySER
(Dynamically Specialized Execution Resources).
Downloads:
| Files | Size |
|---|---|
| dyser-compiler-vm.nvram | 8.5K |
| dyser-compiler-vm-s001.vmdk | 1.6G |
| dyser-compiler-vm-s002.vmdk | 2.0G |
| dyser-compiler-vm-s003.vmdk | 2.0G |
| dyser-compiler-vm-s004.vmdk | 2.0G |
| dyser-compiler-vm-s005.vmdk | 2.0G |
| dyser-compiler-vm-s006.vmdk | 2.0G |
| dyser-compiler-vm-s007.vmdk | 2.0G |
| dyser-compiler-vm-s008.vmdk | 2.0G |
| dyser-compiler-vm-s009.vmdk | 2.0G |
| dyser-compiler-vm-s010.vmdk | 1.9G |
| dyser-compiler-vm-s011.vmdk | 64K |
| dyser-compiler-vm.vmdk | 973B |
| dyser-compiler-vm.vmsd | 0 |
| dyser-compiler-vm.vmx | 2.6K |
| dyser-compiler-vm.vmxf | 272B |
README:
Introduction
------------
We are releasing the DySER toolchain, including the DySER compiler and
simulator, as a VMware virtual machine. It is available at
http://research.cs.wisc.edu/vertical/dyser-compiler/. The download is
about 20GB.
VM User
-------
username: vertical
password: vertical
This user has sudo access.
DySER Tools Install
----------------------
The released DySER tools, including the compiler, the simulator and
supporting tools, are in /opt/dyser-tools/.
Install Directory Structure
---------------------------
/opt/dyser-tools/bu/                  - X86 binutils (as, ld)
/opt/dyser-tools/gcc/                 - X86 gcc 4.7.1
/opt/dyser-tools/gem5/gem5.opt        - X86+DySER simulator
/opt/dyser-tools/llvm                 - LLVM
/opt/dyser-tools/slicer               - DySER compiler
/opt/dyser-tools/dg/x86/dragonegg.so  - LLVM plugin for X86 GCC
/opt/dyser-tools/dytools              - Various tools for compiling and
                                        simulating
    /dycc                             - Driver program to compile
    /run-gem5                         - To invoke the simulator
DySER Tools Source
------------------
The source for the DySER tools is in /home/vertical/dyser-tools/. The
toolchain is already built and installed in /opt/dyser-tools/. It
is not necessary to build the toolchain again.
However, if you want to build dyser tools from source, please follow
these steps.
Step 1: Change directory to /home/vertical/dyser-tools
$> cd /home/vertical/dyser-tools
Step 2: (Optional) Remove any old build.
$> rm -Rf .build
Step 3: Invoke Install.sh as follows
$> ./Install.sh --prefix ${PREFIX}
#${PREFIX} is the path where you want to install dyser-tools
If the prefix is not provided, the tools will be installed in the
subdirectory named "install" in the current path.
This will build all the tools needed, including gcc, llvm, dragonegg
(llvm plugin for gcc to generate llvm IR), slicer (dyser compiler
passes), binutils (assembler, linker, etc.), gem5 (simulator) and
various tools such as run-gem5, dyconfcc (to generate configuration
bits for dyser) and dysched (a gui tool to edit/view the dyser
schedule).
Running DySER Benchmarks
------------------------
The DySER benchmarks (a modified version of the Parboil benchmarks) are in
/home/vertical/dyser-benchmarks.
To compile and simulate them, change directory to
/home/vertical/dyser-benchmarks/expts.
If you built the toolchain yourself, please change the INSTALL_DIR
variable in run-expts.sh and the DY_INSTALL variable in
/home/vertical/dyser-benchmarks/parboil/make-auto.config and
/home/vertical/dyser-benchmarks/parboil/make.config.
Invoke run-expts.sh script as shown below:
$> clean=yes bash run-expts.sh
This will compile and create binaries for the benchmarks in
/home/vertical/dyser-benchmarks/parboil/
The binaries will be in /home/vertical/dyser-benchmarks/parboil/binary/
In addition, it creates a bash script called cs.jobs. Invoking
this script will simulate the binaries under GEM5.
If the make files are left unchanged, the prebuilt dyser
tools in /opt/dyser-tools are used. To use the
tools that you built yourself, please change the INSTALL_DIR and
DY_INSTALL variables as mentioned above.
$> bash cs.jobs
This will run the binaries with the small input set under the GEM5
simulator's OOO model. Use extract-results.sh in the same directory
to create a result table with the cycle counts.
$> bash extract-results.sh
Benchmarks scalar dyser auto
(Note: These results differ from the results published in our PACT
2013 paper [Breaking SIMD shackles with an exposed flexible
microarchitecture and the Access Execute PDG, in PACT 2013], because
our compiler has improved since then and the input size is smaller in
this run.)
A Small tutorial on how to use the DySER Compiler
-------------------------------------------------
We will use a simple kernel that computes the dot product of two
vectors as the running example. The various implementations of the
dot product are in the /home/vertical/dyser-benchmarks/dotp
directory. The directory contents are:
scalar.c  - scalar implementation of the dotproduct kernel.
auto.c    - a version of dotproduct with pragmas for the DySER compiler.
            Using the pragma, the dyser compiler can target DySER
            automatically.
autovec.c - a version of dotproduct with pragmas for the DySER compiler
            that also enables vectorization.
dyser.c   - a programmer-optimized version that targets dyser.
Makefile - Makefile that compiles and simulates the dotproduct kernels.
Scalar Kernel
-------------
float dotp(float *a, float *b, int size)
{
  float result = 0; int i;
  for (i = 0; i < size; ++i) {
    result += a[i] * b[i];
  }
  return result;
}
Using the compiler to auto "DySERize" the kernel:
-------------------------------------------------
The steps to generate dyser code automatically using the compiler are:
1. Include "dyser-dlp.h" from /opt/dyser-tools/include. This header
has the dyser pragmas.
2. Insert the DYSER_LOOP("") pragma inside the loop to let the compiler
know that the loop is a candidate for dyserization.
3. Compile the kernel with the dyser compiler driver "dycc", located at
/opt/dyser-tools/dytools/dycc.
#include "dyser-dlp.h"
int dotp(int *a, int *b, int size)
{
  int result = 0, i;
  for (i = 0; i < size; ++i) {
    // DySER Loop Pragma.
    // Use :peeled{1} to shut off the built-in loop peeler, because it
    // does not work well with other transformations yet.
    DYSER_LOOP(" :peeled{1}");
    result += a[i] * b[i];
  }
  return result;
}
See the auto.c in the dotp directory for the auto dyserized C code.
See the autovec.c for the auto-dyserization with vectorization.
Commands for compiling and running DySERized binary
---------------------------------------------------
$> export PATH=/opt/dyser-tools/dytools/:$PATH
$> dycc -o auto.out auto.c -I/opt/dyser-tools/include -static -DAUTO_DYSER
$> run-gem5 auto.out
DySERizing a region instead of loop body
----------------------------------------
Sometimes you may prefer to dyserize a region instead of a whole loop
body, either because the loop body is larger than the dyser or because
you want finer control over which operations run in DySER.
To dyserize a region, use the DYSER_REGION_START() and
DYSER_REGION_END() macros. They are defined in
install/include/dyser-dlp.h, along with the DYSER_LOOP() macro.
These macros mark the start and end of the candidate region for
DySER. See the example below.
// dyser-dlp.h has the definitions for DYSER_REGION_START and
// DYSER_REGION_END macros
#include "dyser-dlp.h"
int foo_dyser_region(int x1, int x2, int x3)
{
  // BEGIN the DySER region.
  // The first argument is the region id;
  // the second argument is a char pointer and is not used.
  DYSER_REGION_START(0 /* REGION_ID */, (char*)0);
  x3 += x1 + x2;
  x3 *= x1;
  x3 -= x2;
  // END the DySER region.
  // The first argument is the region id.
  // The second argument is a region output.
  // Passing the output prevents LLVM from hoisting this macro
  // above the dyser region.
  DYSER_REGION_END(0 /* REGION_ID */, x3 /* REGION_OUTPUT */);
  return x3;
}
//===============
To generate the dyser region (assuming you have
/opt/dyser-tools/dytools in your path),
$> dycc -c -o foo.o foo.c -DAUTO_DYSER -I/opt/dyser-tools/include -O3 -static
TPACF in the parboil benchmarks is an example of this way of "dyserization".
Manually DySERized Kernel (Using dyser intrinsics)
--------------------------------------------------
Please see dyser.c and dyser-sched.txt in the dotp directory.
To manually dyserize the kernel, please follow the steps below:
1. Include "dyser-dlp.h" from /opt/dyser-tools/include. This header
file has dyser pragmas, dyser intrinsics, etc.
2. Manually partition the loop body into the memory slice
(instructions that compute memory address) and computation slice
(everything else) for the dot product kernel.
float dotp(float *a, float *b, int size)
{
  float result = 0; int i;
  for (i = 0; i < size; ++i) {
    // Memory accesses execute in the main processor.
    float a_val = a[i];
    float b_val = b[i];
    // dyser is stateless -- no loop carried dependence -- must feed previous
    // result as input
    float prev_result = result;
    // Pure computation; this part executes in dyser.
    float tmp = a_val * b_val;
    float output = prev_result + tmp;
    // Output
    result = output;
  }
  return result;
}
3. Schedule the computation slice to DySER using the dysched GUI tool in
/opt/dyser-tools/dytools/dysched.
dysched is a GUI tool that makes it easier to create a schedule
file. You can create and edit dyser schedules with it.
See the dyser-sched.txt file in the dotp directory for the dyser
schedule of the dotproduct.
More examples of dyser schedules are in the /home/vertical/dyser-benchmarks/ directory.
4. Generate dyserconfig.h, which holds the dyser configuration, using
dyconfcc (available at /opt/dyser-tools/dytools).
See the Makefile in the dotp directory for how to invoke dyconfcc to create dyserconfig.h.
The generated dyserconfig.h will have a static function called
dyInit(), which initializes dyser.
5. Include "dyserconfig.h" in the C program.
6. Insert calls to the DySEND, DyRECV, DyLOAD and DySTORE intrinsics to
send data to and receive data from dyser.
// Switch to dotproduct schedule (DyID defined in dyserconfig.h)
DySwitch(DyID);
for (i = 0; i < size; ++i) {
DyLOAD(a[i],