Internal Tutorial for Vertical Group

The DySER open-source project has changed radically. The changes include a new file structure and scripts, a new compiler (DyCC), and some new tools.

Building the Framework

The opensplyser evaluation framework contains two pieces:

  • SW toolchain files
    • The current version we use for micro2013 is in /p/vertical/projects/dyser/install/dyser-tools-full
  • HW toolchain files
    • Check out the HW framework from git. This directory occupies around 2.0 GB. (VCS waveform files created in simulation may exceed 100 GB.)
       git clone /p/vertical/projects/dyser/svn/splyser.git <checkout_dir>

The following instructions describe how to build the toolchain.

  • Building SW toolchain:
    • [TODO] No updated build instructions are available yet.
  • Building HW toolchain:
    • You will find 4 directories in your <checkout_dir>: dyser-1.0, dyser-2.0, hardDySER, opensparc.
      • dyser-1.0: The DySER RTL developed for the OpenSPLySER paper.
      • dyser-2.0: The DySER RTL currently under development; it does not work with opensparc yet.
      • hardDySER: A toolchain that creates a non-configurable DySER from a DySER schedule file.
      • opensparc: The OpenSPARC toolchain that includes the dyser integration RTL.
    • To set up the toolchain, navigate to:
      cd <checkout_dir>/opensparc

      and look for

      OpenSPARCT1.bash
    • <checkout_dir>/opensparc/OpenSPARCT1.bash: this is the file you need to modify for your environment
      • Modify lines 6 (HW) and 7 (SW) to point to your install directories:
        # ***Modification required for new install***
        # Top of opensparc portion
        export HW_ROOT="<path to your checkout dir>"
        export SW_ROOT="<path to your SW toolchain install dir>"

        HW_ROOT should be your <checkout_dir>, and SW_ROOT is currently /p/vertical/projects/dyser/install/dyser-tools-full.

      • Modify line 23 for the scratch space of VCS object files. Make sure you can access the scratch space you assigned.
        # ***Modification required for new install***
        #Regression run-time scratchspace
        export DRMJOBSCRATCHSPACE=/scratch/vcsjobscratch
      • This file is sourced by vcs-config.sh when a VCS simulation runs.
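
Before launching a long simulation, it can help to confirm that the three variables exported by OpenSPARCT1.bash point at real directories. The following is a minimal sketch only: check_dyser_env is a hypothetical helper, not part of the toolchain, and it assumes you have already sourced OpenSPARCT1.bash in the current shell.

```shell
# Hypothetical helper (not shipped with the toolchain): report any of the
# variables set in OpenSPARCT1.bash that is unset or not a directory.
check_dyser_env() {
    status=0
    for name in HW_ROOT SW_ROOT DRMJOBSCRATCHSPACE; do
        # Indirect lookup of the variable called $name (POSIX sh compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    # VCS object files are written to the scratch space, so it must be writable.
    if [ -n "${DRMJOBSCRATCHSPACE:-}" ] && [ -d "$DRMJOBSCRATCHSPACE" ] \
        && [ ! -w "$DRMJOBSCRATCHSPACE" ]; then
        echo "DRMJOBSCRATCHSPACE=$DRMJOBSCRATCHSPACE is not writable"
        status=1
    fi
    return $status
}
```

A possible use is `source OpenSPARCT1.bash && check_dyser_env`; the function prints one line per problem and returns non-zero if it finds any.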

Structure

In the DySER open-source project, the benchmarks are maintained in the HW toolchain. The main working directories are:

  • opensparc/verif/diag/c/<benchmark_sets> - This is the directory that has all the benchmarks:
    • hardDYSER - includes scalar/(scalar code) and splyser/(hand DySERized code)
    • micro13 - includes the vec/ directory, which holds the annotated C code for DyCC to compile (compiler DySERized code)
  • opensparc/regr_runs/<benchmark_sets> - This is the directory that has all the VCS simulation scripts.
    • hardDYSER - VCS simulation scripts with hand DySERized code
    • micro13 - VCS simulation scripts with compiler DySERized code

We provide a GEM5 simulation script and embed it in the makefiles under the diag directory, so that we can compile and run the GEM5 simulation in place. The VCS framework, on the other hand, searches the diag directory, compiles the benchmarks and the Verilog RTL, and invokes VCS for RTL-level simulation.

Simulation

The DySER evaluation framework can evaluate the design on three platforms: GEM5 simulation, VCS simulation, and FPGA.

GEM5 Simulation

The GEM5+DySER simulator is part of the SW toolchain. Installing the SW toolchain creates a wrapper script, run-gem5, which is then used in config.mk under opensparc/verif/diag/c/<benchmark_sets>. Here is an example that simulates fft in hardDYSER:

Navigate to:

opensparc/verif/diag/c/hardDYSER/fft/splyser

Compile and run simulation:

make run_perf GEM5=1 1W=1

A GEM5 simulation should run, and its output is written to the m5out directory. m5out/stats.txt is the simulation report, in which system.switch_cpus.numCycles is the total number of cycles in the region of interest.
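
The cycle count can be pulled out of the report with a one-line awk filter. This is a sketch rather than part of the framework: get_num_cycles is a hypothetical helper name, and it assumes the usual gem5 stats.txt layout in which each line reads "<stat name> <value> # <description>".

```shell
# Hypothetical helper: print the value of system.switch_cpus.numCycles.
# Assumes whitespace-separated "<name> <value> # <description>" lines.
# If stats.txt holds several statistics dumps, one value per dump is printed.
get_num_cycles() {
    stats_file="${1:-m5out/stats.txt}"
    awk '$1 == "system.switch_cpus.numCycles" { print $2 }' "$stats_file"
}
```

Run from the benchmark directory, `get_num_cycles` reads m5out/stats.txt by default; pass a path to read a report stored elsewhere.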

In the hardDySER benchmark set, the Makefile command-line options are GEM5, FPGA, 1W and 8W. The GEM5 option adds the -DFF flag, which translates the markers inserted in the benchmark (they delimit the region of interest) for GEM5 simulation. The FPGA option translates the markers for FPGA simulation. The 1W and 8W options compile the 1W and 8W (vectorized) versions of the DySERized code.

In the micro13 benchmark set, the Makefile command-line options are GEM5, FPGA, AUTO_DYSER and AUTO_VEC. The GEM5 and FPGA options behave as described above. The AUTO_DYSER option tells the compiler to DySERize the annotated region, and the AUTO_VEC option tells the compiler to DySERize and vectorize the annotated region.
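
Because the two benchmark sets accept different options, a small wrapper can catch a mismatched option before make is invoked. The sketch below is illustrative only: dyser_make_cmd is a hypothetical helper, and it assumes that the micro13 makefiles expose the same run_perf target shown for hardDYSER above and that each option is enabled by setting it to 1.

```shell
# Hypothetical helper: print the make command for a benchmark set, rejecting
# options that the text above does not list for that set.
# Assumption: every set uses the run_perf target and OPTION=1 syntax.
dyser_make_cmd() {
    set_name="$1"
    shift
    case "$set_name" in
        hardDYSER) allowed="GEM5 FPGA 1W 8W" ;;
        micro13)   allowed="GEM5 FPGA AUTO_DYSER AUTO_VEC" ;;
        *) echo "unknown benchmark set: $set_name" >&2; return 1 ;;
    esac
    cmd="make run_perf"
    for flag in "$@"; do
        case " $allowed " in
            *" $flag "*) cmd="$cmd $flag=1" ;;
            *) echo "$flag is not an option of $set_name" >&2; return 1 ;;
        esac
    done
    echo "$cmd"
}
```

For example, `dyser_make_cmd hardDYSER GEM5 1W` prints the command used in the fft example, while `dyser_make_cmd micro13 8W` fails because 8W belongs to the other set.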

VCS Simulation

The VCS simulation framework is based on the OpenSPARC sims script. Read the OpenSPARC manual in opensparc/doc/OpenSPARCT1_DVGuide.pdf for more information on the sims script. In brief, the sims script compiles the Verilog files, compiles and links the benchmark code, creates the image, and runs the VCS simulation.

First, navigate to:

/p/vertical/dyser-huge/dyser-micro13/opensparc/regr_runs/hardDYSER/fft.splyser.8w

You will see two files in this directory:

  • run.sh: This script sources OpenSPARCT1.bash to set up the environment and executes the sims script. It contains variables that control the VCS output, such as the waveform dump, switching activity for power estimation, and DySER trace files.
  • fft.splyser.8w.s: This file contains the compilation flags for the compiler. The sims script calls MIDAS, another tool in the OpenSPARC toolchain, to read this file and compile the benchmark.

Now, execute:

bash run.sh

Several files will be created after simulation:

  • build/: This directory contains every intermediate assembly and executable.
  • log files: The vlog.log is the VCS OpenSPARC execution trace, the d0.dycore.log is the DySER core trace, the d0.dyser.log is the DySER interface trace, and the sims.log is the script execution log.
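
A quick way to confirm that a run produced everything listed above is to test for each output by name. This is a sketch under one assumption not stated above: that build/ and the four log files all land directly in the run directory. check_vcs_outputs is a hypothetical helper, not part of the sims flow.

```shell
# Hypothetical helper: list expected VCS outputs that are missing.
# Assumption: build/ and the logs are created directly in the run directory.
check_vcs_outputs() {
    run_dir="${1:-.}"
    missing=0
    [ -d "$run_dir/build" ] || { echo "missing: build/"; missing=1; }
    for log in vlog.log d0.dycore.log d0.dyser.log sims.log; do
        [ -f "$run_dir/$log" ] || { echo "missing: $log"; missing=1; }
    done
    return $missing
}
```

Calling `check_vcs_outputs` after `bash run.sh` prints nothing and returns 0 when every output exists; otherwise it names each missing item and returns 1.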

FPGA

The FPGA flow is the same as in the last revision.
[TODO: new FPGA tutorial]

Creating a New Benchmark

Currently, there are two ways of creating a new benchmark for simulation:

  • New hand DySERized benchmark: Start from a new program, identify the DySER schedule and create it with the dysched tool, use genCore.py and genRom.py to generate the DySER core and the DySER broadcast ROM, and use gcc inline-assembly DySER instructions to DySERize the C code.
  • New compiler DySERized code: [TODO: C code annotation]

Hand DySERized Benchmark

The first step is to read the code, identify the computation in the program, and then create a DySER schedule. To do so, run the software tool dytools/dysched.

[TODO: detailed tutorial of dysched] Right-click on an FU to assign a function to that functional unit.

After you assign the function, the FU will be colored and an edge will be created.

Double-click on the FU and the cursor becomes a triangle with the same color as the FU. You can now click on the edges to create the datapath. Next, create the primary inputs (PIs) and primary outputs (POs) on the switches.

An input or output can be placed at any switch, and each switch can have two inputs and two outputs. (This logical view of DySER differs from the hardware implementation.) Click on Show Port Numbers to see the port numbers of the PIs and POs in the configuration. The port numbers are important because we will use them later.

Use file→save config to save the config file to disk. We will use this file to generate the Verilog RTL of a DySER core. Note the following limitations:

  • The genCore script can only generate a DySER core with 2-fanout switches. A switch with a fanout of 3 or more will produce colliding signals in the RTL.
  • The OpenSPARC-DySER interface has only 32 inputs and 32 outputs.
  • Only a subset of functional units are supported.

The saved configuration file contains the dimensions and a number of switch and FU declarations. With the port numbers in mind, we can modify the C code using the macros in opensparc/verif/diag/c/include/dyser-dlp-sparc.h:

  • DySEND(var, port): sends data from a register to DySER.
  • DyRECV(port, var): receives data from DySER into a register.
  • DyLOAD(mem, port): loads data from memory into DySER.
  • DySTORE(port, mem): stores data from DySER to memory.
  • DyLOADPD(mem, port): broadcast load; loads to a special vector port.

Before modifying the code, use genCore.py to generate the Verilog RTL. Because the logical view differs from the actual RTL, we need a port mapping that maps the logical ports to the physical ports in the RTL.

Use hardDySER/tools/genCore.py to read the created config file.

python genCore.py -f <config_file>

A <config_file>.hardDYSER.conf file will be created in the same directory. It contains the splyser port mapping. Follow this port mapping to add the DySER sends and receives to the code. You can now compile and simulate in GEM5 and VCS. Refer to the benchmarks in hardDYSER/ to see the config files (*.hardDySER) and how the macros are used in C.

[TODO: tutorial of genRom and broadcast load]

internal/dyser-internal-tutorial.txt · Last modified: 2013/06/07 14:15 by chenhan