The aim of the new world order build/test framework is to separate the functionality of the framework from the component-specific build/test scripts. This is achieved by defining clear semantics for how the component-specific scripts are invoked.
It should have the following features:
Ease of adding new components without modifying the build system code.
Ease of adding new platforms, preferably without modifying the build system code.
Building multiple components at the same time when they are independent, and following the dependency order when they are not.
Log the build/test related information in the database.
Transfer the externals required by a particular component to the remote build machine. This can even include externals such as Java. How flexible this feature should be is debatable.
Handle operating system signals and report a failure if any of the scripts is killed.
Web interface for component builders/testers.
Remote Machine(s)
Machine where the platform-specific tasks of your build/test execute. If you have submitted a job that involves more than one platform, there will be more than one remote machine (one per platform) where your jobs are executed.
Glue scripts
Set of scripts written by the user to build/test the component. Glue scripts bind the two entities (framework scripts and component-related scripts) together.
Framework/Framework scripts
Set of scripts that control the execution of the glue scripts. The user does not need to know much about these scripts other than the interfaces the framework provides. The framework expects a specific code of conduct from the glue scripts but is not concerned with how they achieve their tasks.
The goal of the framework is to make simple builds simple to set up, without requiring you to understand the machinery available for more complex builds/tests.
For example, building a product foo by fetching its sources from the web, untarring them, and running make (e.g., "ncftp foo.edu:/foo.tar.gz; tar zxvf foo.tar.gz; configure; make") does not require anything special for different platforms, so it can behave just as it would on a single system. You write each of those commands in a script, specify the scripts for the corresponding build steps in your submit file, add a remote_post to tar up whatever build output you want to save, and you're done.
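As an illustration of the last step, a remote_post glue script could be as small as the sketch below. The function name pack_results and the example paths are assumptions made for this sketch; only the archive name results.tar.gz comes from the framework's conventions.

```shell
#!/bin/sh
# Hypothetical remote_post helper: pack selected build output into
# results.tar.gz inside the given directory.
# pack_results is an illustrative name, not part of the framework.

pack_results() {
    # $1: directory to work in
    # remaining args: paths (relative to $1) to include in the archive
    dir=$1
    shift
    ( cd "$dir" && tar -czf results.tar.gz "$@" )
}

# Example call from a glue script (paths are placeholders):
#   pack_results . install build.log
```

The subshell keeps the caller's working directory unchanged, and the function's exit status is that of tar, so a failed archive step can be reported as a failed remote_post.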
Thus the build system functionality is divided into two parts: the submit side and the remote side.
These semantics are shown below. Lines written in "blue" are the functionalities provided by the build system. Lines written in "black" are the "Glue-Scripts" provided by the component builder (user). Lines in "green" are remarks or more information available to the user.
The submit side of the framework is mainly concerned with the preliminary tasks performed before the component build/test is dispatched to the remote machine(s), and with the post-build tasks performed after the results arrive back from the remote machine(s).
The user initiates the build/test by passing the NMI submit file to the framework. This file contains all the information the framework requires to build/test the component: information about the component, where the glue scripts are, and which glue script to invoke at a given time. A sample NMI submit file is shown in the examples below. Once the framework has parsed this information, it creates an initial database record for the build/test run and obtains a runid. It also creates a working directory where the build/test related information is stored, with the required sub-directories based on the platform list. Once these general tasks are done, the framework creates a DAG of Condor jobs for the requested run. Some of these jobs execute on the submit machine and some on the remote machine(s). With the stage set, the framework fetches the component sources as instructed in the "sources" option of the NMI submit file and places them in a known location. It then invokes the pre_all script for the component, if declared. The framework also invokes the platform_pre and platform_post scripts, if declared, at the appropriate times.
Once the Condor job running on the remote machine(s) completes (successfully or unsuccessfully), the framework invokes the post_all script.
The framework also updates the database with the results of each subtask of your build/test.
| Step | Remarks |
| Read the NMI submit file | |
| Create an initial database record and get a runid | |
| Fetch sources | Fetches sources in $Workspace/$gid/common |
| pre_all | Wakes up in $Workspace/$gid/common |
| for each platform { | |
| cp common/* <arch>/ | |
| cd <arch> | |
| platform_pre | Wakes up in $Workspace/$gid/$platform |
| submit vanilla job | |
| update database | |
| platform_post | Wakes up in $Workspace/$gid/$platform |
| update database | |
| } | |
| post_all | Wakes up in $Workspace/$gid/common |
| update database | |
| display html | |
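The ordering of the submit-side steps can be sketched as a small dry run. This is not the framework's code: run_step and submit_side_plan are hypothetical names, the steps are only printed rather than executed, and the loop is sequential, whereas the real framework schedules the work as a DAG of Condor jobs.

```shell
#!/bin/sh
# Dry-run sketch of the submit-side ordering. Nothing is executed;
# each step is printed together with the directory it would start in.
# run_step and submit_side_plan are illustrative names only.

run_step() {
    # $1: step name, $2: directory the step would wake up in
    printf '%s in %s\n' "$1" "$2"
}

submit_side_plan() {
    # $1: per-run workspace root; remaining args: platform names
    root=$1
    shift
    run_step fetch_sources "$root/common"
    run_step pre_all "$root/common"
    for platform in "$@"; do
        run_step platform_pre "$root/$platform"
        run_step submit_job "$root/$platform"
        run_step platform_post "$root/$platform"
    done
    run_step post_all "$root/common"
}

# Example with a placeholder workspace path and two platforms.
submit_side_plan /tmp/workspace/42 x86_rh_9 sun4u_sol_5.9
```

The database updates and the HTML display are left out of the sketch to keep the focus on which glue script runs where.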
Once the Condor job (the job that builds the component on a targeted platform) arrives at the remote machine, the first script invoked is the framework wrapper script, which in turn executes the glue scripts. The user can split the build/test into several subtasks of their choice in remote_declare. Any action needed before declaring the remote task list can be taken in remote_pre_declare. Similarly, any tasks to be executed after remote_declare but before the first build/test task can be done in remote_pre.
Then, for every subtask declared by the user, the build system runs the subtask, saves its statistics and results, and streams the output back to the submit machine. When the component is built/tested and all the subtasks have completed, the remote_post script is run. It serves for tasks such as packaging the component or correlating the test results, which are not part of the build/test itself but are required for the component distribution. The framework expects the glue scripts to create a single tar file called "results.tar.gz" containing everything that is to be transferred back to the submit side. Stdout and stderr for all the subtasks are transferred back by default. Any other log files created by the user's scripts should be included in results.tar.gz by the user.
| Component build wrapper |
| remote_pre_declare |
| remote_declare |
| remote_pre |
| if declare_list is empty { |
| insert a special noop task in the tasklist |
| } |
| for each task in declare_list { |
| remote_task |
| record task runtime |
| save error and output |
| save return status |
| } |
| remote_post |
| send back all results |
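The task loop of the wrapper can be sketched roughly as follows. The function name run_tasks and the per-task file names (<task>.out, <task>.err, <task>.status) are assumptions for this sketch, and it keeps running after a failed task, which may or may not match the real wrapper.

```shell
#!/bin/sh
# Sketch of the remote-side task loop: run one command per declared task,
# capture stdout/stderr, and record the exit status and elapsed seconds.
# run_tasks and the output file names are hypothetical.

run_tasks() {
    # $1: command line run for every task (sees the task in _NMI_TASKNAME)
    # $2: directory for the per-task output files
    # remaining args: task names, e.g. as produced by remote_declare
    task_cmd=$1
    outdir=$2
    shift 2
    # An empty task list still gets one placeholder task.
    if [ "$#" -eq 0 ]; then
        set -- noop
    fi
    for task in "$@"; do
        start=$(date +%s)
        if _NMI_TASKNAME=$task sh -c "$task_cmd" \
                >"$outdir/$task.out" 2>"$outdir/$task.err"; then
            status=0
        else
            status=$?
        fi
        end=$(date +%s)
        # Line format: "<exit status> <elapsed seconds>"
        printf '%s %s\n' "$status" "$((end - start))" >"$outdir/$task.status"
    done
}

# Example (placeholder script path and task names):
#   run_tasks './remote_task' /tmp/out configure compile
```

Passing the task name through the environment mirrors the _NMI_TASKNAME variable that the framework exposes to remote glue scripts.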
Refer to the examples if in doubt.
| Parameters | Description |
| description | Component description string |
| project | Name of the project for which you are building or testing |
| project_release | Version of the project for which you are building or testing |
| component | Name of the component you are building or testing |
| component_version | Version of the component you are building or testing |
| run_type | BUILD or TEST. If nothing is specified it defaults to UNKNOWN in the database. It is always good to specify the run_type in order to distinguish builds from tests. |
| sources (deprecated) | This has been deprecated. Use the parameter "inputs" (see below) instead. |
| inputs | List of sources to be fetched. You must also fetch the glue separately if it is not available with the sources. This list is actually a list of NMI submit files, each telling how to fetch an individual source. For example, if you want to fetch the glue from a CVS repository and the component sources from the web, write the instructions in two separate files, say glue.cvs and source.ftp, and your inputs option should look like: inputs = glue.cvs, source.ftp Note that if you use multiple inputs and they download a file with the same name, you will end up with just one of the copies of the file, and there is no guarantee which one it will be. Parameters available for different methods for fetching sources: |
| platforms | List of platforms to build on. To get the list of available platforms from Condor, run the following command on the submit machine (grandcentral.cs.wisc.edu): /usr/local/bin/condor_status -l | grep nmi_platform Specify "platforms" as a comma-separated list of the platforms you want to build/test on. |
| notify | Email notification list. Separate multiple addresses with commas. |
| priority | Priority of the user's own jobs relative to each other |
| prereqs, prereqs_<platform> | Comma-separated list of the software that must be present on the system in order to build/test your component. For example, you may require a specific version of java, binutils, etc. Prereqs should be specified in the format prereq_name-prereq_version (example: java-1.4.2_05). To list the prereqs available on the remote machines, run the following command: condor_status -l | grep has_ This shows the locations where the prereq software is installed. Take the prereq names in that format and put them in your list. If you are building/testing on multiple platforms and need different versions of a prereq on different platforms, use the parameter prereqs_<platform>, replacing <platform> with the actual platform name. For example, if you need gcc-2.95.3 on platform sun4u_sol_5.9 and gcc-3.2.2 on all the other platforms, your NMI submit file should look like this: prereqs = gcc-3.2.2, ....... |
| pre_all, platform_pre, platform_post, post_all, remote_pre_declare, remote_declare, remote_pre, remote_task, remote_post, remote_post_always | Refer to this section |
| pre_all_args, platform_pre_args, platform_post_args, post_all_args, remote_pre_declare_args, remote_declare_args, remote_pre_args, remote_task_args, remote_post_args, remote_post_always_args | Arguments to the respective glue scripts |
| +<condor_parameter> | If you would like to pass a specific parameter to Condor, add a "+" before it. Example: +periodic_remove = true |
| ++<user_parameter> | If you would like to pass a user-defined parameter to Condor, add a "++" before it. Example: ++foo = bar |
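To make the prereqs / prereqs_<platform> relationship concrete, here is a minimal sketch that reads "name = value" lines from a submit file and picks the prereq list for one platform. The helper names get_param and effective_prereqs are hypothetical, and the sketch assumes that a platform-specific list replaces the general one rather than being merged with it. It also assumes each parameter sits on a single line.

```shell
#!/bin/sh
# Sketch: choose the prereq list that applies to a given platform.
# get_param and effective_prereqs are illustrative helpers, not framework
# commands. Assumption: prereqs_<platform> overrides prereqs entirely.

get_param() {
    # $1: submit file, $2: parameter name; prints the first matching value
    awk -v key="$2" '
        {
            pos = index($0, "=")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }' "$1"
}

effective_prereqs() {
    # $1: submit file, $2: platform name
    specific=$(get_param "$1" "prereqs_$2")
    if [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    else
        get_param "$1" prereqs
    fi
}

# Example (cmdfile is a placeholder submit file name):
#   effective_prereqs cmdfile sun4u_sol_5.9
```

Splitting on the first "=" keeps values that themselves contain "=" intact.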
The framework makes some information available in the environment for use in the glue scripts. The available environment variables are:
| ENV Variable | Local/Remote | Description |
| NMI_<parameter> | Both | All the parameters used in the NMI submit file are available in the environment in the format NMI_<parameter>. For example, the parameter "component" is available to glue scripts as the $NMI_component environment variable. |
| NMI_PLATFORM | Remote | Name of the current platform. |
| _NMI_TASKNAME | Remote | Name of the remote (user defined) task |
| _NMI_STEP_FAILED | Remote | If this variable is present in the environment, the last NMI task failed |
| NMI_BIN | Local | Bin directory where the nmi executables are located |
| _NMI_NMIDIR | Local | Your workspace directory |
| _NMI_USERDIR | Local | Userdir in the workspace |
| _NMI_DBLOGDIR | Local | Directory where database logs are stored |
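A remote glue script might use these variables along the following lines. This is only a sketch: describe_task is a hypothetical function, the fallback values are illustrative, and skipping work when _NMI_STEP_FAILED is present is one possible policy rather than something the framework requires.

```shell
#!/bin/sh
# Sketch of a remote glue script reading the framework's environment.
# describe_task is an illustrative name; the messages are placeholders.

describe_task() {
    platform=${NMI_PLATFORM:-unknown}
    task=${_NMI_TASKNAME:-unknown}
    # Test for presence, not value: the variable only has to exist.
    if [ -n "${_NMI_STEP_FAILED+set}" ]; then
        echo "skipping $task on $platform: an earlier step failed"
        return 1
    fi
    echo "running $task for ${NMI_component:-unnamed} on $platform"
}

# Example: call describe_task at the top of a remote_task script and
# branch on "$_NMI_TASKNAME" to run the matching build step.
```

The ${VAR+set} expansion is used because the table describes _NMI_STEP_FAILED as a flag that is either in the environment or not.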
The framework allows basic macro substitution in the NMI submit file. If you define an environment variable $_NMI_FOO, you can use the macro $(FOO).
One default macro is provided by the system:
| Macro | Description |
| $(USER) | Framework substitutes this with your login name. |
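The substitution rule can be approximated with the sketch below. It is not the framework's parser: expand_macros is a hypothetical name, unknown macros are left untouched by choice, and giving $(USER) precedence over a user-defined $_NMI_USER is an assumption.

```shell
#!/bin/sh
# Sketch of $(NAME) macro expansion for submit file text read on stdin.
# $(USER) becomes the login name; $(FOO) becomes the value of the exported
# environment variable _NMI_FOO. expand_macros is an illustrative name.

expand_macros() {
    awk -v user="$(id -un)" '
        {
            out = ""
            rest = $0
            while (match(rest, /\$\([A-Za-z_][A-Za-z0-9_]*\)/)) {
                # Strip the leading "$(" and the trailing ")".
                name = substr(rest, RSTART + 2, RLENGTH - 3)
                if (name == "USER") {
                    value = user
                } else if (("_NMI_" name) in ENVIRON) {
                    value = ENVIRON["_NMI_" name]
                } else {
                    # Unknown macro: keep the original text.
                    value = substr(rest, RSTART, RLENGTH)
                }
                out = out substr(rest, 1, RSTART - 1) value
                rest = substr(rest, RSTART + RLENGTH)
            }
            print out rest
        }'
}

# Example (cmdfile is a placeholder; _NMI_FOO must be exported):
#   _NMI_FOO=bar; export _NMI_FOO
#   expand_macros < cmdfile
```

Because awk reads the process environment, a variable has to be exported before its macro is picked up.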
| [$prompt] export PATH=$PATH:/nmi/bin |
Assuming that your NMI submit file is named "cmdfile", run the following to build the component:
| [$prompt] nmi_submit cmdfile |
A very basic web interface is available for the users to see the results of their build/test run. It is available at http://grandcentral.cs.wisc.edu/build/. We are working on making it more user friendly by providing more filters for easy access.
For every run that a user submits, the framework creates a record in the "Run" table and assigns it a runid. You can click the link for the Run table to see the information related to this record. The framework also creates a record for every task that is run, such as fetching the sources, the pre_all script, platform_pre, the tasks defined in tasklist.nmi, etc. Each task has its own unique taskid. To get the information for the tasks related to a particular run, look at the tasks that carry its runid. The web interface also provides access to the logs, output, and error files.
Local Builder is a stripped-down version of the framework. It lets users build/test their components on the same system from which it is run. This also means that users cannot build/test binaries for platforms other than the local system. Conceptually it works on the same principles as the framework, i.e., it uses the same glue scripts; however, it does not rely on jobs being run by Condor. All the glue and framework scripts are invoked locally. This lets users leverage the work done by the NMI group and build/test binaries on their own systems. Everything discussed above for the framework therefore also applies to the local builder unless otherwise stated.
| [$prompt] export PATH=$PATH:/nmi/bin |
| [$prompt] nmi_run_local cmdfile |
Since the local builder does not rely on the database being present, information about local builds/tests is not logged to the database. nmi_run_local creates a local working directory (henceforth called $workdir) named in the format <username>_<machine name>_<epoch secs>_$$. This $workdir is created in the current directory from which the build/test is launched. The results of your run can be found in $workdir/userdir/<platform>/results.tar.gz
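The naming scheme and the results location can be spelled out in a few lines of shell. Both function names are hypothetical, and the exact commands nmi_run_local uses to obtain the user and machine names are not documented here, so id and uname stand in for them.

```shell
#!/bin/sh
# Sketch of the local builder's directory conventions.
# local_workdir_name and results_path are illustrative helpers.

local_workdir_name() {
    # <username>_<machine name>_<epoch secs>_<process id>
    printf '%s_%s_%s_%s\n' "$(id -un)" "$(uname -n)" "$(date +%s)" "$$"
}

results_path() {
    # $1: working directory, $2: platform name
    printf '%s/userdir/%s/results.tar.gz\n' "$1" "$2"
}

# Example: list the archive contents of a finished local run
# (the workdir and platform values are placeholders):
#   tar -tzf "$(results_path ./some_workdir x86_rh_9)"
```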
Examples below show the command description files for GsiOpenssh and condor. The scripts used to build the products are also linked from the description file.
NMI submit file (cmdfile) - GsiOpenssh
description = Gsissh build for nmi
project = nmi
project_release = 6.0
component = gsi_openssh
component_version = 3.5
sources = gsissh-glue.cvs, gsissh-prereq.scp, gsissh-compat.ftp, gsissh-setup.ftp, gsissh-src.ftp
platform_pre = nwo/glue/gsissh/build/platform_pre
platform_pre_args = /space/parag/nmi-6.0/bundles
remote_declare = nwo/glue/gsissh/build/remote_declare
remote_task = nwo/glue/gsissh/build/remote_task
remote_task_args = nondebug-bins
remote_post = nwo/glue/gsissh/build/remote_post
platform_post = nwo/glue/gsissh/build/platform_post
platform_post_args = /space/parag/nmi-6.0/bundles
platforms = x86_rh_9, x86_rh_7.2, sun4u_sol_5.9
NMI submit file (cmdfile) - condor
description = nightly condor 6.7.x build run
project = condor
project_release = 6, 7, x
component = condor
component_version = 6, 7, x
sources = condor_srcsfile-BUILD-V6_7-branch-2004-8-25
pre_all = nmi_glue/build/pre_all
remote_declare = nmi_glue/build/remote_declare
remote_pre = nmi_glue/build/remote_pre
remote_task = nmi_glue/build/remote_task
remote_post = nmi_glue/build/remote_post
platform_post = nmi_glue/build/platform_post
post_all = nmi_glue/build/post_all
platforms = x86_rh_9, x86_rh_8.0, x86_rh_7.2, sun4u_sol_5.9
prereqs = perl-5.8.5, tar-1.14, patch-2.5.4, m4-1.4.1, binutils-2.15, flex-2.5.4a, make-3.80, byacc-1.9, bison-1.25, gzip-1.2.4, gcc-2.95.3, coreutils-5.2.1
You can do a CVS checkout of the glue scripts for NMI components as follows:
[$prompt] export CVSROOT=":ext:<username>@<cvs server>:/p/condor/repository/nmi"
[$prompt] export CVS_RSH="ssh"
[$prompt] export CVS_SERVER="/p/condor/public/bin/auth-cvs"
[$prompt] cvs co nwo/glue
Replace <username> with your login name and <cvs server> with the name of the CSL machine where you ran the "stashticket" command. You can find more details about this here