Introduction:
The LLVM-based DySER compiler is a collection of LLVM optimization
passes that transform the LLVM bitcode and generate optimized code for DySER
(Dynamically Specialized Execution Resources).
Related Publications:
Downloads:
Files | Size |
---|---|
dyser-compiler-vm.nvram | 8.5K |
dyser-compiler-vm-s001.vmdk | 1.6G |
dyser-compiler-vm-s002.vmdk | 2.0G |
dyser-compiler-vm-s003.vmdk | 2.0G |
dyser-compiler-vm-s004.vmdk | 2.0G |
dyser-compiler-vm-s005.vmdk | 2.0G |
dyser-compiler-vm-s006.vmdk | 2.0G |
dyser-compiler-vm-s007.vmdk | 2.0G |
dyser-compiler-vm-s008.vmdk | 2.0G |
dyser-compiler-vm-s009.vmdk | 2.0G |
dyser-compiler-vm-s010.vmdk | 1.9G |
dyser-compiler-vm-s011.vmdk | 64K |
dyser-compiler-vm.vmdk | 973B |
dyser-compiler-vm.vmsd | 0 |
dyser-compiler-vm.vmx | 2.6K |
dyser-compiler-vm.vmxf | 272B |
README:
Introduction
------------
We are releasing DySER Toolchain including DySER compiler and simulator as a VMware Virtual Machine. It is available at http://research.cs.wisc.edu/vertical/dyser-compiler/. It is about 20GB.

VM User
-------
username: vertical
password: vertical
This user has sudo access.

DySER Tools Install
-------------------
The released DySER tools including compiler, simulator and other toolchains are in /opt/dyser-tools/.

Install Directory Structure
---------------------------
    /opt/dyser-tools/bu/                  - X86 binutils (as, ld)
    /opt/dyser-tools/gcc/                 - X86 gcc 4.7.1
    /opt/dyser-tools/gem5/gem5.opt        - X86+DySER simulator
    /opt/dyser-tools/llvm                 - LLVM
    /opt/dyser-tools/slicer               - DySER compiler
    /opt/dyser-tools/dg/x86/dragonegg.so  - LLVM plugin for X86 GCC
    /opt/dyser-tools/dytools              - Various tools for compiling and simulating
        /dycc                             - Driver program to compile
        /run-gem5                         - To invoke the simulator

DySER Tools Source
------------------
The source for the DySER tools is in /home/vertical/dyser-tools/. The toolchain is already built and installed in /opt/dyser-tools/. It is not necessary to build the toolchain again. However, if you want to build dyser tools from source, please follow these steps.

Step 1: Change directory to /home/vertical/dyser-tools

    $> cd /home/vertical/dyser-tools

Step 2: (Optional) Remove any old build.

    $> rm -Rf .build

Step 3: Invoke Install.sh as follows

    $> ./Install.sh --prefix ${PREFIX}
    #${PREFIX} is the path where you want to install dyser-tools

If the prefix is not provided, the tools will be installed in the subdirectory named 'install' in the current path.
This will build all the tools needed, including gcc, llvm, dragonegg (LLVM plugin for gcc to generate LLVM IR), slicer (DySER compiler passes), binutils (assembler, linker, etc.), gem5 (simulator) and various tools such as run-gem5, dyconfcc (to generate configuration bits for DySER), dysched (a GUI tool to edit/view the DySER schedule), etc.

Running DySER Benchmarks
------------------------
DySER benchmarks (a modified version of the Parboil benchmarks) are in /home/vertical/dyser-benchmarks. To compile and simulate them, change directory to /home/vertical/dyser-benchmarks/expts.

By default, the scripts use the prebuilt DySER tools in /opt/dyser-tools. If you built the toolchain yourself and want to use it instead, change the INSTALL_DIR variable in run-expts.sh and the DY_INSTALL variable in /home/vertical/dyser-benchmarks/parboil/make-auto.config and /home/vertical/dyser-benchmarks/parboil/make.config.

Invoke the run-expts.sh script as shown below:

    $> clean=yes bash run-expts.sh

This will compile the benchmarks in /home/vertical/dyser-benchmarks/parboil/ and place the binaries in /home/vertical/dyser-benchmarks/parboil/binary/. It also creates a bash script called cs.jobs. Invoking this script simulates the binaries under GEM5:

    $> bash cs.jobs

This will run the binaries with the small input set under the GEM5 simulator's OOO model. Use extract-results.sh in the same directory to create a result table with the cycle counts:

    $> bash extract-results.sh
    Benchmarks   scalar   dyser   auto

(Note: These results are different from the published results in our PACT 2013 paper [Breaking SIMD shackles with an exposed flexible microarchitecture and the Access Execute PDG, in PACT 2013], because our compiler has improved since then and the input size is smaller in this run.)

A Small tutorial on how to use the DySER Compiler
-------------------------------------------------
We will use a simple kernel that computes the dot product of two vectors as the running example. The various implementations of the dot product are in the /home/vertical/dyser-benchmarks/dotp directory. The directory contents are:

    scalar.c  - scalar implementation of the dot product kernel.
    auto.c    - a version of the dot product with pragmas for the DySER compiler.
                Using the pragma, the DySER compiler can target DySER automatically.
    autovec.c - a version of the dot product with pragmas for the DySER compiler
                that also enables vectorization.
    dyser.c   - a programmer-optimized version that targets DySER.
    Makefile  - Makefile that compiles and simulates the dot product kernels.

Scalar Kernel
-------------

    float dotp(float *a, float *b, int size) {
      float result = 0;
      int i;
      for (i = 0; i < size; ++i) {
        result += a[i] * b[i];
      }
      return result;
    }

Using the compiler to auto "DySERize" the kernel:
-------------------------------------------------
The steps to generate DySER code automatically using the compiler are:

1. Include "dyser-dlp.h" from /opt/dyser-tools/include. This header has the DySER pragmas.
2. Insert the DYSER_LOOP("") pragma inside the loop to let the compiler know that the loop is a candidate for dyserization.
3. Compile the kernel with the DySER compiler driver "dycc" in /opt/dyser-tools/dytools/dycc.

    #include "dyser-dlp.h"

    int dotp(int *a, int *b, int size) {
      int result = 0, i;
      for (i = 0; i < size; ++i) {
        // DySER Loop Pragma.
        // Use :peeled{1} to shut off the built-in loop peeler, because it
        // does not work well with other transformations yet.
        DYSER_LOOP(" :peeled{1}");
        result += a[i] * b[i];
      }
      return result;
    }

See auto.c in the dotp directory for the auto-dyserized C code. See autovec.c for auto-dyserization with vectorization.

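Before handing an annotated kernel to dycc, it can help to confirm on an ordinary host compiler that the annotation did not change what the loop computes. The sketch below is not part of the DySER release. It assumes a hypothetical no-op fallback for DYSER_LOOP when AUTO_DYSER is not defined (the real macro comes from dyser-dlp.h, and its expansion is not shown in this README), and the helper names dotp_reference and dotp_matches_reference are made up for illustration.

```c
/* ASSUMPTION: hypothetical stand-in so this file also builds without the
   DySER headers. With -DAUTO_DYSER the real dyser-dlp.h is used instead. */
#ifdef AUTO_DYSER
#include "dyser-dlp.h"
#else
#define DYSER_LOOP(options) ((void)0)
#endif

/* Annotated kernel, same shape as the auto-DySERized example above. */
int dotp(int *a, int *b, int size) {
  int result = 0, i;
  for (i = 0; i < size; ++i) {
    DYSER_LOOP(" :peeled{1}");
    result += a[i] * b[i];
  }
  return result;
}

/* Hypothetical reference: the same loop with no annotation. */
int dotp_reference(const int *a, const int *b, int size) {
  int result = 0, i;
  for (i = 0; i < size; ++i) {
    result += a[i] * b[i];
  }
  return result;
}

/* Returns 1 when the annotated kernel agrees with the reference, else 0. */
int dotp_matches_reference(int *a, int *b, int size) {
  return dotp(a, b, size) == dotp_reference(a, b, size);
}
```

Calling dotp_matches_reference on a few small arrays from a test main is enough to catch accidental edits to the loop body. Keeping the fallback behind the AUTO_DYSER guard means the same source can still be passed to dycc with -DAUTO_DYSER, in which case the real header is included.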
Commands for compiling and running DySERized binary
---------------------------------------------------

    $> export PATH=/opt/dyser-tools/dytools/:$PATH
    $> dycc -o auto.out auto.c -I/opt/dyser-tools/include -static -DAUTO_DYSER
    $> run-gem5 auto.out

DySERizing a region instead of loop body
----------------------------------------
Sometimes you may prefer to dyserize a region instead of the whole loop body, either because the loop body is larger than DySER or because you want fine control over which operations run in DySER. To dyserize a region, use the DYSER_REGION_START() and DYSER_REGION_END() macros. They are defined in install/include/dyser-dlp.h, along with the DYSER_LOOP() macro. These macros mark the start and end of the candidate region for DySER. See the example below.

    // dyser-dlp.h has the definitions for DYSER_REGION_START and
    // DYSER_REGION_END macros
    #include "dyser-dlp.h"

    int foo_dyser_region(int x1, int x2, int x3) {
      // BEGIN the DySER region.
      // First argument is the region id.
      // Second argument is a char pointer; it is not used.
      DYSER_REGION_START(0 /* REGION_ID */, (char*)0);
      x3 += x1 + x2;
      x3 *= x1;
      x3 -= x2;
      // END the DySER region.
      // First argument is the region id.
      // Second argument is a region output. It prevents LLVM from
      // hoisting this macro above the DySER region.
      DYSER_REGION_END(0 /* REGION_ID */, x3 /* REGION_OUTPUT */);
      return x3;
    }

To generate the DySER region (assuming you have /opt/dyser-tools/dytools in your path):

    $> dycc -c -o foo.o foo.c -DAUTO_DYSER -I/opt/dyser-tools/include -O3 -static

TPACF in the Parboil benchmarks is an example of this way of "dyserization".

Manually DySERized Kernel (Using dyser intrinsics)
--------------------------------------------------
Please see dyser.c and dyser-sched.txt in the dotp directory. To manually dyserize the kernel, follow the steps below:

1. Include "dyser-dlp.h" from /opt/dyser-tools/include. This header file has the DySER pragmas, DySER intrinsics, etc.

2. Manually partition the loop body into the memory slice (instructions that compute memory addresses) and the computation slice (everything else) for the dot product kernel.

    float dotp(float *a, float *b, int size) {
      float result = 0;
      int i;
      for (i = 0; i < size; ++i) {
        // Memory accesses execute in the main processor.
        float a_val = a[i];
        float b_val = b[i];
        // DySER is stateless -- no loop-carried dependence -- so the
        // previous result must be fed in as an input.
        float prev_result = result;
        // Pure computation; this part executes in DySER.
        float tmp = a_val * b_val;
        float output = prev_result + tmp;
        // Output
        result = output;
      }
      return result;
    }

3. Schedule the computation slice to DySER using the dysched GUI tool in /opt/dyser-tools/dytools/dysched. dysched makes it easier to create and edit DySER schedule files. See the dyser-sched.txt file in the dotp directory for the DySER schedule of the dot product. More examples of DySER schedules are in the /home/vertical/dyser-benchmarks/ directory.

4. Generate dyserconfig.h, which holds the DySER configuration, using dyconfcc (available at /opt/dyser-tools/dytools). See the Makefile in the dotp directory for how to invoke dyconfcc to create dyserconfig.h. The generated dyserconfig.h has a static function called dyInit(), which initializes DySER.

5. Include "dyserconfig.h" in the C program.

6. Insert calls to the DySEND, DyRECV, DyLOAD and DySTORE intrinsics to send data to and receive data from DySER.

    // Switch to dotproduct schedule (DyID defined in dyserconfig.h)
    DySwitch(DyID);
    for (i = 0; i < size; ++i) {
      DyLOAD(a[i], |