This section contains the instructions for installing HTCondor. The installation will have a default configuration that can be customized. Sections of the manual that follow this one explain customization.
Read this entire section before starting installation.
Please read the copyright and disclaimer information in the manual. Installation and use of HTCondor is acknowledgment that you have read and agree to the terms.
The platform-dependent HTCondor files are currently available from two sites. The main site is at the University of Wisconsin-Madison, Madison, Wisconsin, USA. A second site is the Istituto Nazionale di Fisica Nucleare Sezione di Bologna, Bologna, Italy. Please choose the site nearest to you.
Make a note of the location into which you download the binary.
The HTCondor binary distribution is packaged in the following files and directories:
Before you install, please consider joining the condor-world mailing list. Traffic on this list is kept to an absolute minimum. It is only used to announce new releases of HTCondor. To subscribe, send a message to email@example.com with the body:
Before installation, make a few important decisions about the basic layout of your pool. The decisions answer the questions:
One machine in your pool must be the central manager. Install HTCondor on this machine first. This is the centralized information repository for the HTCondor pool, and it is also the machine that does match-making between available machines and submitted jobs. If the central manager machine crashes, any currently active matches in the system will keep running, but no new matches will be made. Moreover, most HTCondor tools will stop working. Because of the importance of this machine for the proper functioning of HTCondor, install the central manager on a machine that is likely to stay up all the time, or on one that will be rebooted quickly if it does crash.
Also consider network traffic and your network layout when choosing your central manager. All the daemons send updates (by default, every 5 minutes) to this machine. Memory requirements for the central manager depend on the number of machines in the pool. A pool with up to about 100 machines will require approximately 25 Mbytes of memory for the central manager's tasks. A pool with about 1000 machines will require approximately 100 Mbytes of memory for the central manager's tasks.
A faster CPU will improve the time to do matchmaking.
HTCondor can restrict the machines allowed to submit jobs. Alternatively, it can accept job submissions from any machine that the network permits to connect to a submit machine. If the HTCondor pool is behind a firewall, and all machines inside the firewall are trusted, the HOSTALLOW_WRITE configuration entry can be set to *. Otherwise, it should be set to reflect the set of machines permitted to submit jobs to this pool. HTCondor tries to be secure by default, so out of the box, the configuration file ships with an invalid definition for this configuration variable. This invalid value allows no machine to connect and submit jobs, so after installation, change this entry. Look for the entry defined with the value YOU_MUST_CHANGE_THIS_INVALID_CONDOR_CONFIGURATION_VALUE.
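One way to locate the placeholder after installation is to search the configuration file for it. The following is a minimal shell sketch; the function name find_unset_entries, the temporary file, and the host name cm.example.org are assumptions made for illustration, and on a real system the file to search is whichever global configuration file your installation uses.

```shell
# Hypothetical helper: list (with line numbers) every entry in a configuration
# file that still carries the shipped placeholder value.
find_unset_entries() {
    grep -n 'YOU_MUST_CHANGE_THIS_INVALID_CONDOR_CONFIGURATION_VALUE' "$1"
}

# Demonstration against a throwaway file; the contents are made up.
demo_config=$(mktemp)
printf '%s\n' \
    'CONDOR_HOST = cm.example.org' \
    'HOSTALLOW_WRITE = YOU_MUST_CHANGE_THIS_INVALID_CONDOR_CONFIGURATION_VALUE' \
    > "$demo_config"

find_unset_entries "$demo_config"
rm -f "$demo_config"
```

Every line the search reports must be edited before the pool will accept job submissions.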
Start up the HTCondor daemons as the Unix user root. Without this, HTCondor can do very little to enforce security and policy decisions. You can install HTCondor as any user; however, doing so has serious security and performance consequences. Please see section 3.6.13 in the manual for the details and ramifications of running HTCondor as a Unix user other than root.
Either root will administer HTCondor directly, or someone else will act as the HTCondor administrator. If root has delegated the responsibility to another person, keep in mind that as long as HTCondor is started up as root, whoever has the ability to edit the HTCondor configuration files can effectively run arbitrary programs as root.
To simplify installation of HTCondor, create a Unix user named condor on all machines in the pool. The HTCondor daemons will create files (such as the log files) owned by this user, and the home directory can be used to specify the location of files and directories needed by HTCondor. The home directory of this user can either be shared among all machines in your pool, or could be a separate home directory on the local partition of each machine. Both approaches have advantages and disadvantages. Having the directories centralized can make administration easier, but also concentrates the resource usage such that you potentially need a lot of space for a single shared home directory. See the section below on machine-specific directories for more details.
Note that the user condor must not be an account into which a person can log in. If a person can log in as user condor, it permits a major security breach, in that the user condor could submit jobs that run as any other user, providing complete access to the user's data by the jobs. A standard way of not allowing log in to an account on Unix platforms is to enter an invalid shell in the password file.
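As a rough check of the point above, the following shell sketch inspects a password-file entry and reports whether its shell field is one of a few commonly used non-login shells. The function names, the sample entry (uid, gid, and home directory), and the list of shells are assumptions for illustration; sites may use other invalid shells.

```shell
# Print the login shell, the 7th colon-separated field of a passwd-format line.
login_shell_of() {
    printf '%s\n' "$1" | cut -d: -f7
}

# Succeed if the given shell is one of a few commonly used non-login shells.
# This list is an assumption; extend it to match local practice.
is_nologin_shell() {
    case "$1" in
        /sbin/nologin|/usr/sbin/nologin|/bin/false) return 0 ;;
        *) return 1 ;;
    esac
}

# Made-up sample entry; on a real system, take the line from the password file.
entry='condor:x:64:64:HTCondor daemons:/var/lib/condor:/sbin/nologin'
if is_nologin_shell "$(login_shell_of "$entry")"; then
    echo "condor: log in is disabled"
else
    echo "condor: WARNING, this account has a usable login shell"
fi
```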
If you choose not to create a user named condor, then you must specify, via either the CONDOR_IDS environment variable or the CONDOR_IDS configuration file setting, which uid.gid pair should be used for the ownership of various HTCondor files. See section 3.6.13 on UIDs in HTCondor in the Administrator's Manual for details.
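As an illustration of the uid.gid format, the following shell sketch builds a CONDOR_IDS value from an existing account using the standard id utility. The helper name condor_ids_for is hypothetical, and root appears only because that account exists on every Unix system; substitute the account that should own the HTCondor files.

```shell
# Hypothetical helper: print "uid.gid" for the named account, which is the
# format the text above gives for CONDOR_IDS.
condor_ids_for() {
    printf '%s.%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Demonstration only: root stands in for the account that should own the files.
CONDOR_IDS=$(condor_ids_for root)
export CONDOR_IDS
echo "CONDOR_IDS=$CONDOR_IDS"
```

Setting the environment variable this way affects only processes started from the same shell; the configuration file setting is the persistent alternative.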
HTCondor needs a few directories that are unique on every machine in your pool. These are spool, log, and execute. Generally, all three are subdirectories of a single machine specific directory called the local directory (specified by the LOCAL_DIR macro in the configuration file). Each should be owned by the user that HTCondor is to be run as.
If you have a Unix user named condor with a local home directory on each machine, the LOCAL_DIR could just be user condor's home directory (LOCAL_DIR = $(TILDE) in the configuration file). If this user's home directory is shared among all machines in your pool, you would want to create a directory for each host (named by host name) for the local directory (for example, LOCAL_DIR = $(TILDE)/hosts/$(HOSTNAME)). If you do not have a condor account on your machines, you can put these directories wherever you'd like. However, where to place the directories will require some thought, as each one has its own resource needs:
Generally speaking, it is recommended that you do not put these directories (except lock) on the same partition as /var, since if the partition fills up, you will fill up /var as well. This will cause lots of problems for your machines. Ideally, you will have a separate partition for the HTCondor directories. Then, the only consequence of filling up the directories will be HTCondor's malfunction, not your whole machine.
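The per-machine layout described above can be sketched in shell as follows. This is only an illustration under stated assumptions: the function name make_local_dirs is hypothetical, a temporary directory stands in for the shared home directory of user condor, and uname -n stands in for the value HTCondor substitutes for $(HOSTNAME).

```shell
# Hypothetical helper: create the three machine-specific directories
# (spool, log, execute) under the given local directory.
make_local_dirs() {
    local_dir=$1
    for d in spool log execute; do
        mkdir -p "$local_dir/$d" || return 1
    done
    # On a real installation, each directory should also be owned by the user
    # HTCondor runs as (for example, with chown, which normally requires root).
}

# Demonstration in a scratch area, mimicking LOCAL_DIR = $(TILDE)/hosts/$(HOSTNAME).
demo_home=$(mktemp -d)
make_local_dirs "$demo_home/hosts/$(uname -n)"
ls "$demo_home/hosts/$(uname -n)"
rm -rf "$demo_home"
```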
In general, there are a number of places that HTCondor will look to find its configuration files. The first file it looks for is the global configuration file. These locations are searched in order until a configuration file is found. If none contain a valid configuration file, HTCondor will print an error message and exit:
If you specify a file in the CONDOR_CONFIG environment variable and there's a problem reading that file, HTCondor will print an error message and exit right away, instead of continuing to search the other options. However, if no CONDOR_CONFIG environment variable is set, HTCondor will search through the other options.
Next, HTCondor tries to load the local configuration file(s). The only way to specify the local configuration file(s) is in the global configuration file, with the LOCAL_CONFIG_FILE macro. If that macro is not set, no local configuration file is used. This macro can be a list of files or a single file.
Every binary distribution contains five subdirectories: bin, etc, lib, sbin, and libexec. Wherever you choose to install these five directories we call the release directory (specified by the RELEASE_DIR macro in the configuration file). Each release directory contains platform-dependent binaries and libraries, so you will need to install a separate one for each kind of machine in your pool. For ease of administration, these directories should be located on a shared file system, if possible.
All of the files in the bin directory are programs that HTCondor end users should expect to have in their path. You could either put them in a well-known location (such as /usr/local/condor/bin) which you have HTCondor users add to their PATH environment variable, or copy those files directly into a well-known place already in the users' PATH (such as /usr/local/bin). With the above examples, you could also leave the binaries in /usr/local/condor/bin and put in soft links from /usr/local/bin to point to each program.
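The soft link approach can be sketched in shell as follows. The function name link_bin_programs is hypothetical, and the demonstration uses scratch directories with a stub program rather than the real /usr/local/condor/bin and /usr/local/bin, which would normally require root to modify.

```shell
# Hypothetical helper: for every executable file in the source directory,
# create a soft link of the same name in the destination directory.
link_bin_programs() {
    src=$1
    dest=$2
    for prog in "$src"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$prog" ] && [ -x "$prog" ] || continue
        name=$(basename "$prog")
        # Never overwrite something already present in the destination.
        if [ -e "$dest/$name" ] || [ -L "$dest/$name" ]; then
            echo "skipping $name: already present in $dest" >&2
            continue
        fi
        ln -s "$prog" "$dest/$name"
    done
}

# Demonstration with scratch directories and a stub program.
demo=$(mktemp -d)
mkdir -p "$demo/condor/bin" "$demo/bin"
printf '#!/bin/sh\n' > "$demo/condor/bin/condor_q"
chmod +x "$demo/condor/bin/condor_q"
link_bin_programs "$demo/condor/bin" "$demo/bin"
ls -l "$demo/bin"
rm -rf "$demo"
```

Linking instead of copying means an upgrade of the release directory is picked up without touching the directory already in the users' PATH.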
All of the files in the sbin directory are HTCondor daemons and agents, or programs that only the HTCondor administrator would need to run. Therefore, add these programs only to the PATH of the HTCondor administrator.
All of the files in the libexec directory are HTCondor programs that should never be run by hand, but are only used internally by HTCondor.
The files in the lib directory are the HTCondor libraries that must be linked in with user jobs for all of HTCondor's checkpointing and migration features to be used. lib also contains scripts used by the condor_compile program to help re-link jobs with the HTCondor libraries. These files should be placed in a location that is world-readable, but they do not need to be placed in anyone's PATH. The condor_compile script checks the configuration file for the location of the lib directory.
etc contains an examples subdirectory which holds various example configuration files and other files used for installing HTCondor. etc is the recommended location to keep the master copy of your configuration files. You can put in soft links from one of the places mentioned above that HTCondor checks automatically to find its global configuration file.
The documentation provided with HTCondor is currently available in HTML, Postscript and PDF (Adobe Acrobat). It can be locally installed wherever is customary at your site. You can also find the HTCondor documentation on the web at: http://www.cs.wisc.edu/condor/manual.
If you are using AFS at your site, be sure to read section 3.12.1 in the manual. HTCondor does not currently have a way to authenticate itself to AFS. A solution is not ready for Version 7.8.8. This implies that you are probably not going to want to have the LOCAL_DIR for HTCondor on AFS. However, you can (and probably should) have the HTCondor RELEASE_DIR on AFS, so that you can share one copy of those files and upgrade them in a centralized location. You will also have to do something special if you submit jobs to HTCondor from a directory on AFS. Again, read manual section 3.12.1 for all the details.
HTCondor takes up a fair amount of space. This is another reason why it is a good idea to have it on a shared file system. The compressed downloads currently range from a low of about 100 Mbytes for Windows to about 500 Mbytes for Linux. The compressed source code takes approximately 16 Mbytes.
In addition, you will need a lot of disk space in the local directory of any machines that are submitting jobs to HTCondor. See question 6 above for details on this.
The Perl script condor_configure installs HTCondor. Command-line arguments specify all needed information to this script. The script can be executed multiple times, to modify or further set the configuration. condor_configure has been tested using Perl 5.003. Use this or a more recent version of Perl.
After download, all the files are in a compressed, tar format. They need to be untarred, as
tar xzf completename.tar.gz
After untarring, the directory will have the Perl scripts condor_configure and condor_install, as well as bin, etc, examples, include, lib, libexec, man, sbin, sql and src subdirectories.
condor_configure and condor_install are the same program, but have different default behaviors. condor_install is identical to running
condor_configure --install=.
condor_configure and condor_install work on the named directories. As the names imply, condor_install is used to install HTCondor, whereas condor_configure is used to modify the configuration of an existing HTCondor install.
condor_configure and condor_install are completely command-line driven; they are not interactive. Several command-line arguments are always needed with condor_configure and condor_install. The argument
--install=/path/to/release
specifies the path to the HTCondor release directories. The default command-line argument for condor_install is
--install=.
The argument
--prefix=<directory>
specifies the path to the install directory. The argument
--local-dir=<directory>
specifies the path to the local directory.
The --type option to condor_configure specifies one or more of the roles that a machine may take on within the HTCondor pool: central manager, submit or execute. These options are given in a comma-separated list. So, if a machine is both a submit and an execute machine, the proper command-line option is
--type=submit,execute
Install HTCondor on the central manager machine first. If HTCondor will run as root in this pool (Item 3 above), run condor_install as root, and it will install and set the file permissions correctly. On the central manager machine, run condor_install as follows.
% condor_install --prefix=~condor \
    --local-dir=/scratch/condor --type=manager
To update the above HTCondor installation, for example, to also be a submit machine:
% condor_configure --prefix=~condor \
    --local-dir=/scratch/condor --type=manager,submit
As in the above example, the central manager can also be a submit point or an execute machine, but this is only recommended for very small pools. If this is the case, the --type option changes to manager,execute or manager,submit or manager,submit,execute.
After the central manager is installed, the execute and submit machines should then be configured. Decisions about whether to run HTCondor as root should be consistent throughout the pool. For each machine in the pool, run
% condor_install --prefix=~condor \
    --local-dir=/scratch/condor --type=execute,submit
See the condor_configure manual page in section 10 for details.
Now that HTCondor has been installed on the machine(s), there are a few things to check before starting up HTCondor.
For Unix platforms other than Linux, HTCondor can monitor the activity of your mouse and keyboard, provided that you tell it where to look. You do this with the CONSOLE_DEVICES entry in the condor_startd section of the configuration file. On most platforms, reasonable defaults are provided. For example, the default device for the mouse is 'mouse', since most installations have a soft link from /dev/mouse that points to the right device (such as tty00 if you have a serial mouse, psaux if you have a PS/2 bus mouse, etc). If you do not have a /dev/mouse link, you should either create one (you will be glad you did), or change the CONSOLE_DEVICES entry in HTCondor's configuration file. This entry is a comma separated list, so you can have any devices in /dev count as 'console devices' and activity will be reported in the condor_startd's ClassAd as ConsoleIdleTime.
To start up the HTCondor daemons, execute <release_dir>/sbin/condor_master. This is the HTCondor master, whose only job in life is to make sure the other HTCondor daemons are running. The master keeps track of the daemons, restarts them if they crash, and periodically checks to see if you have installed new binaries (and if so, restarts the affected daemons).
If you are setting up your own pool, you should start HTCondor on your central manager machine first. If you have done a submit-only installation and are adding machines to an existing pool, the start order does not matter.
To ensure that HTCondor is running, you can run either:
ps -ef | egrep condor_
or
ps -aux | egrep condor_
depending on your flavor of Unix. On a central manager machine that can submit jobs as well as execute them, there will be processes for:
Once you are sure the HTCondor daemons are running, check to make sure that they are communicating with each other. You can run condor_status to get a one line summary of the status of each machine in your pool.
Once you are sure HTCondor is working properly, you should add condor_master to your startup/bootup scripts (for example, /etc/rc) so that your machine runs condor_master upon bootup. condor_master will then start the necessary HTCondor daemons whenever your machine is rebooted.
If your system uses System-V style init scripts, you can look in <release_dir>/etc/examples/condor.boot for a script that can be used to start and stop HTCondor automatically by init. Normally, you would install this script as /etc/init.d/condor and put in soft links from various directories (for example, /etc/rc2.d) that point back to /etc/init.d/condor. The exact location of these scripts and links will vary on different platforms.
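The System-V steps above might look like the following shell sketch. Several details are assumptions rather than anything the manual prescribes: the function name install_init_script, the link name S95condor (the S prefix and sequence number follow the usual System-V naming convention, but the right number depends on your platform), and the root argument, which lets the steps be tried in a scratch directory before running them for real as root with an empty prefix.

```shell
# Hypothetical helper: install the example boot script and link it into a
# run-level directory. $2 is a prefix prepended to /etc ("" on a real system).
install_init_script() {
    release_dir=$1
    root=$2
    mkdir -p "$root/etc/init.d" "$root/etc/rc2.d" || return 1
    cp "$release_dir/etc/examples/condor.boot" "$root/etc/init.d/condor" || return 1
    chmod 755 "$root/etc/init.d/condor" || return 1
    # Relative link, so it resolves to the script in the sibling init.d directory.
    ln -s ../init.d/condor "$root/etc/rc2.d/S95condor"
}

# Dry run in a scratch area, with a stub standing in for condor.boot.
demo=$(mktemp -d)
mkdir -p "$demo/release/etc/examples"
printf '#!/bin/sh\n' > "$demo/release/etc/examples/condor.boot"
install_init_script "$demo/release" "$demo/root"
ls -l "$demo/root/etc/rc2.d"
rm -rf "$demo"
```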
If your system uses BSD style boot scripts, you probably have an /etc/rc.local file. Add a line to start up <release_dir>/sbin/condor_master.
Now that the HTCondor daemons are running, there are a few things you can and should do:
This section contains the instructions for installing the Windows version of HTCondor. The install program will set up a slightly customized configuration file that may be further customized after the installation has completed.
Please read the copyright and disclaimer information in the manual. Installation and use of HTCondor is acknowledgment that you have read and agree to the terms.
Be sure that the HTCondor tools are of the same version as the daemons installed. The HTCondor executable for distribution is packaged in a single file named similar to:
condor-7.4.3-winnt50-x86.msi
This file is approximately 80 Mbytes in size, and it may be removed once HTCondor is fully installed.
Before installing HTCondor, please consider joining the condor-world mailing list. Traffic on this list is kept to an absolute minimum. It is only used to announce new releases of HTCondor. To subscribe, follow the directions given at http://www.cs.wisc.edu/condor/mail-lists/.
For any installation, HTCondor services are installed and run as the Local System account. Running the HTCondor services as any other account (such as a domain user) is not supported and could be problematic.
Before installing the Windows version of HTCondor, there are two major decisions to make about the basic layout of the pool.
If the answers to these questions are already known, skip to the Windows Installation Procedure section below (section 3.2.5). If unsure, read on.
One machine in your pool must be the central manager. This is the centralized information repository for the HTCondor pool and is also the machine that matches available machines with waiting jobs. If the central manager machine crashes, any currently active matches in the system will keep running, but no new matches will be made. Moreover, most HTCondor tools will stop working. Because of the importance of this machine for the proper functioning of HTCondor, we recommend installing it on a machine that is likely to stay up all the time, or at the very least, one that will be rebooted quickly if it does crash. Also, because all the services will send updates (by default every 5 minutes) to this machine, it is advisable to consider network traffic and network layout when choosing the central manager.
Install HTCondor on the central manager before installing on the other machines within the pool.
The HTCondor release directory takes up a fair amount of space. The size requirement for the release directory is approximately 250 Mbytes. HTCondor itself, however, needs space to store all of the jobs and their input files. If there will be large numbers of jobs, consider installing HTCondor on a volume with a large amount of free space.
Installation of HTCondor must be done by a user with administrator privileges. After installation, the HTCondor services will be run under the local system account. When HTCondor is running a user job, however, it will run that user job with normal user permissions.
Download HTCondor, and start the installation process by running the installer. The HTCondor installation is completed by answering questions and choosing options within the following steps.
If HTCondor has been previously installed, a dialog box will appear before the installation of HTCondor proceeds. The question asks if you wish to preserve your current HTCondor configuration files. Answer yes or no, as appropriate.
If you answer yes, your configuration files will not be changed, and you will proceed to the point where the new binaries will be installed.
If you answer no, then there will be a second question that asks if you want to use answers given during the previous installation as default answers.
The first step in installing HTCondor is a welcome screen and license agreement. You are reminded that it is best to run the installation when no other Windows programs are running. If you need to close other Windows programs, it is safe to cancel the installation and close them. You are asked to agree to the license. Answer yes or no. If you should disagree with the License, the installation will not continue.
Also fill in name and company information, or use the defaults as given.
The HTCondor configuration needs to be set based on whether this machine is to create a new pool or to join an existing one. Choose the appropriate radio button.
For a new pool, enter a chosen name for the pool. To join an existing pool, enter the host name of the central manager of the pool.
Each machine within an HTCondor pool may either submit jobs or execute submitted jobs, or both submit and execute jobs. A check box determines if this machine will be a submit point for the pool.
A set of radio buttons determines whether, and under what conditions, this machine will execute jobs. There are four choices:
For testing purposes, it is often helpful to use the always run HTCondor jobs option.
For a machine that is to execute jobs and the choice is one of the last two in the list, HTCondor needs to further know what to do with the currently running jobs. There are two choices:
This choice involves a trade off. Restarting the job on a different machine is less intrusive on the workstation owner than leaving the job in memory for a later time. A suspended job left in memory will require swap space, which could be a scarce resource. Leaving a job in memory, however, has the benefit that accumulated run time is not lost for a partially completed job.
Enter the machine's accounting (or UID) domain. On this version of HTCondor for Windows, this setting is only used for user priorities (see section 3.4) and to form a default e-mail address for the user.
Various parts of HTCondor will send e-mail to an HTCondor administrator if something goes wrong and requires human attention. Specify the e-mail address and the SMTP relay host of this administrator. Please pay close attention to this e-mail, since it will indicate problems in the HTCondor pool.
For more details on these access permissions, and others that can be manually changed in your configuration file, please see the section titled Setting Up IP/Host-Based Security in HTCondor (section 3.6.9).
Running HDFS requires Java to be installed, and HTCondor must know where the installation is. Running HDFS in data node mode also requires the installation of Cygwin, and the path to the Cygwin directory must be added to the global PATH environment variable.
HDFS has several configuration options that must be filled in to be used.
The next step is where the destination of the HTCondor files will be decided. We recommend that HTCondor be installed in the location shown as the default in the install choice: C:\Condor. This is due to several hard-coded paths in scripts and configuration files.
Clicking on the Custom choice permits changing the installation directory.
Installation on the local disk is chosen for several reasons. The HTCondor services run as local system, and within Microsoft Windows, local system has no network privileges. Therefore, for HTCondor to operate, HTCondor should be installed on a local hard drive, as opposed to a network drive (file server).
The second reason for installation on the local disk is that the Windows usage of drive letters has implications for where HTCondor is placed. The drive letter used must not change, even when different users are logged in. Local drive letters do not change under normal operation of Windows.
While it is strongly discouraged, it may be possible to place HTCondor on a hard drive that is not local, if a dependency is added to the service control manager such that HTCondor starts after the required file services are available.
This section details how to run the HTCondor for Windows installer in an unattended batch mode. This mode is one that occurs completely from the command prompt, without the GUI interface.
The HTCondor for Windows installer uses the Microsoft Installer (MSI) technology, and it can be configured for unattended installs analogous to any other ordinary MSI installer.
The following is a sample batch file that is used to set all the properties necessary for an unattended install.
@echo on
set ARGS=
set ARGS=NEWPOOL="N"
set ARGS=%ARGS% POOLNAME=""
set ARGS=%ARGS% RUNJOBS="C"
set ARGS=%ARGS% VACATEJOBS="Y"
set ARGS=%ARGS% SUBMITJOBS="Y"
set ARGS=%ARGS% CONDOREMAIL="firstname.lastname@example.org"
set ARGS=%ARGS% SMTPSERVER="smtp.localhost"
set ARGS=%ARGS% HOSTALLOWREAD="*"
set ARGS=%ARGS% HOSTALLOWWRITE="*"
set ARGS=%ARGS% HOSTALLOWADMINISTRATOR="$(IP_ADDRESS)"
set ARGS=%ARGS% INSTALLDIR="C:\Condor"
set ARGS=%ARGS% POOLHOSTNAME="$(IP_ADDRESS)"
set ARGS=%ARGS% ACCOUNTINGDOMAIN="none"
set ARGS=%ARGS% JVMLOCATION="C:\Windows\system32\java.exe"
set ARGS=%ARGS% USEVMUNIVERSE="N"
set ARGS=%ARGS% VMMEMORY="128"
set ARGS=%ARGS% VMMAXNUMBER="$(NUM_CPUS)"
set ARGS=%ARGS% VMNETWORKING="N"
set ARGS=%ARGS% USEHDFS="N"
set ARGS=%ARGS% NAMENODE=""
set ARGS=%ARGS% HDFSMODE="HDFS_NAMENODE"
set ARGS=%ARGS% HDFSPORT="5000"
set ARGS=%ARGS% HDFSWEBPORT="4000"
msiexec /qb /l* condor-install-log.txt /i condor-7.1.0-winnt50-x86.msi %ARGS%
Each property corresponds to answers that would have been supplied while running an interactive installer. The following is a brief explanation of each property as it applies to unattended installations:
After defining each of these properties for the MSI installer, the installer can be started with the msiexec command. The following command starts the installer in unattended mode, and it dumps a journal of the installer's progress to a log file:
msiexec /qb /lxv* condor-install-log.txt /i condor-7.2.2-winnt50-x86.msi [property=value] ...
More information on the features of msiexec can be found at Microsoft's website at http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/msiexec.mspx.
If you are to install HTCondor on many different machines, you may wish to use some other mechanism to install HTCondor on additional machines rather than running the Setup program described above on each machine.
WARNING: This is for advanced users only! All others should use the Setup program described above.
Here is a brief overview of how to install HTCondor manually without using the provided GUI-based setup program:
The HTCondor service can be installed and removed using the sc.exe tool, which is included in Windows XP and Windows 2003 Server. The tool is also available as part of the Windows 2000 Resource Kit.
Installation can be done as follows:
sc create Condor binpath= c:\condor\bin\condor_master.exe
To remove the service, use:
sc delete Condor
CONDOR_CONFIG should point to the condor_config file. In this version of HTCondor, it must reside on the local disk.
RELEASE_DIR should point to the directory where HTCondor is installed. This is typically C:\Condor, and again, this must reside on the local disk.
These files currently must reside on the local disk for a variety of reasons. Advanced Windows users might be able to put the files on remote resources. The main concern is twofold. First, the files must be there when the service is started. Second, the files must always be in the same spot (including drive letter), no matter who is logged into the machine.
Note also that when installing manually, you will need to create the directories that HTCondor will expect to be present given your configuration. This normally is simply a matter of creating the log, spool, and execute directories.
After the installation of HTCondor is completed, the HTCondor service must be started. If you used the GUI-based setup program to install HTCondor, the HTCondor service should already be started. If you installed manually, HTCondor must be started by hand, or you can simply reboot. NOTE: The HTCondor service will start automatically whenever you reboot your machine.
To start HTCondor by hand:
Or, alternatively you can enter the following command from a command prompt:
net start condor
Run the Task Manager (Control-Shift-Escape) to check that HTCondor services are running. The following tasks should be running:
Also, you should now be able to open up a new cmd (DOS prompt) window, and the HTCondor bin directory should be in your path, so you can issue the normal HTCondor commands, such as condor_q and condor_status.
Once HTCondor services are running, try submitting test jobs. Example 2 within section 2.5.1 presents a vanilla universe job.
RPMs are available in HTCondor Version 7.8.8. We provide a Yum repository, as well as installation and configuration in one easy step. This RPM installation is currently available for Red Hat-compatible systems only. As of HTCondor version 7.5.1, the HTCondor RPM installs into FHS locations.
Yum repositories are at http://research.cs.wisc.edu/htcondor/yum/ . The repositories are named to distinguish stable releases from development releases and by Red Hat version number. The 6 repositories are:
For HTCondor to work properly under RHEL5, the Red Hat Network channel called "RHEL Virtualization" must be explicitly enabled.
Here is an ordered set of steps that get HTCondor running using the RPM.
cd /etc/yum.repos.d
wget http://research.cs.wisc.edu/htcondor/yum/repo.d/condor-stable-rhel5.repo
Note that this step need be done only once; do not get the same repository more than once.
yum install condor
For 64-bit machines:
yum install condor.x86_64
/sbin/service condor start
Debian packages are available in HTCondor Version 7.8.8. We provide an APT repository, as well as installation and configuration in one easy step. These Debian packages of HTCondor are currently available for Debian 5 (Lenny) and Debian 6 (Squeeze). As of HTCondor version 7.5.1, the HTCondor Debian package installs into FHS locations.
The HTCondor APT repositories are specified at http://research.cs.wisc.edu/htcondor/debian/ . See this web page for repository information.
Here is an ordered set of steps that get HTCondor running.
deb http://research.cs.wisc.edu/htcondor/debian/stable/ lenny contrib
deb http://research.cs.wisc.edu/htcondor/debian/development/ lenny contrib
deb http://research.cs.wisc.edu/htcondor/debian/stable/ squeeze contrib
deb http://research.cs.wisc.edu/htcondor/debian/development/ squeeze contrib
Note that this step need be done only once; do not add the same repository more than once.
apt-get update
apt-get install condor
Then, if any configuration changes are made, restart HTCondor with
Dynamic deployment is a mechanism that allows rapid, automated installation and start up of HTCondor resources on a given machine. In this way any machine can be added to an HTCondor pool. The dynamic deployment tool set also provides tools to remove a machine from the pool, without leaving residual effects on the machine such as leftover installations, log files, and working directories.
Installation and start up is provided by condor_cold_start. The condor_cold_start program determines the operating system and architecture of the target machine, and transfers the correct installation package from an ftp, http, or grid ftp site. After transfer, it installs HTCondor and creates a local working directory for HTCondor to run in. As a last step, condor_cold_start begins running HTCondor in a manner which allows for later easy and reliable shut down.
The program that reliably shuts down and uninstalls a previously dynamically installed HTCondor instance is condor_cold_stop. condor_cold_stop begins by safely and reliably shutting off the running HTCondor installation. It ensures that HTCondor has completely shut down before continuing, and optionally ensures that there are no queued jobs at the site. Next, condor_cold_stop removes and optionally archives the HTCondor working directories, including the log directory. These archives can be stored to a mounted file system or to a grid ftp site. As a last step, condor_cold_stop uninstalls the HTCondor executables and libraries. The end result is that the machine's resources are left unchanged once a dynamic deployment of HTCondor has been removed.
Dynamic deployment is designed for the expert HTCondor user and administrator. Tool design choices were made for functionality, not ease-of-use.
Like every installation of HTCondor, a dynamically deployed installation relies on a configuration. To add a target machine to a previously created HTCondor pool, the global configuration file for that pool is a good starting point. Modifications to that configuration can be made in a separate, local configuration file used in the dynamic deployment. The global configuration file must be placed on an ftp, http, grid ftp, or file server accessible by condor_cold_start. The local configuration file must be on a file system accessible by the target machine.
There are some specific configuration variables that may be set for dynamic deployment. A list of executables and directories which must be present for HTCondor to start on the target machine may be set with the configuration variables DEPLOYMENT_REQUIRED_EXECS and DEPLOYMENT_REQUIRED_DIRS. If these variables are defined and any item in the comma-separated list of executables or directories is not present, then condor_cold_start exits with an error. Note that this does not affect what is installed, only whether start up is successful.
A list of executables and directories which are recommended to be present for HTCondor to start on the target machine may be set with the configuration variables DEPLOYMENT_RECOMMENDED_EXECS and DEPLOYMENT_RECOMMENDED_DIRS. If these variables are defined and any item in the comma-separated lists of executables or directories is not present, then condor_cold_start prints a warning message and continues.
Here is a portion of the configuration relevant to a dynamic deployment of an HTCondor submit node:
DEPLOYMENT_REQUIRED_EXECS = MASTER, SCHEDD, PREEN, STARTER, \
                            STARTER_STANDARD, SHADOW, \
                            SHADOW_STANDARD, GRIDMANAGER, GAHP, CONDOR_GAHP
DEPLOYMENT_REQUIRED_DIRS = SPOOL, LOG, EXECUTE
DEPLOYMENT_RECOMMENDED_EXECS = CREDD
DEPLOYMENT_RECOMMENDED_DIRS = LIB, LIBEXEC
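The difference between the required and recommended lists can be illustrated with a small shell sketch. This is not the code used by condor_cold_start. The function name check_paths is hypothetical, and it takes file system paths directly, whereas the configuration above lists names such as MASTER and SPOOL; how condor_cold_start resolves those names is not shown here. The sketch also assumes that paths contain no spaces or glob characters.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1 is "required" or "recommended"; $2 is a comma-separated list of paths.
# A missing required path yields an error and a non-zero return;
# a missing recommended path yields only a warning.
check_paths() {
    mode=$1
    missing=0
    # Turn commas into blanks so the shell splits the list into words.
    for p in $(printf '%s' "$2" | tr ',' ' '); do
        if [ ! -e "$p" ]; then
            missing=1
            if [ "$mode" = "required" ]; then
                echo "error: required path missing: $p" >&2
            else
                echo "warning: recommended path missing: $p" >&2
            fi
        fi
    done
    if [ "$mode" = "required" ] && [ "$missing" -eq 1 ]; then
        return 1
    fi
    return 0
}
```

A caller might run check_paths required "/some/dir/sbin/condor_master, /some/dir/spool" and stop start up if it returns non-zero, then run check_paths recommended on the second list and continue regardless. The paths shown are placeholders.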
Additionally, the user must specify which HTCondor services will be started. This is done through the DAEMON_LIST configuration variable. Another excerpt from a dynamic submit node deployment configuration:
DAEMON_LIST = MASTER, SCHEDD
Finally, the location of the dynamically installed HTCondor executables is tricky to set, since the location is unknown before installation. Therefore, the variable DEPLOYMENT_RELEASE_DIR is defined in the environment. It corresponds to the location of the dynamic HTCondor installation. If, as is often the case, the configuration file specifies the location of HTCondor executables in relation to the RELEASE_DIR variable, the configuration can be made dynamically deployable by setting RELEASE_DIR to DEPLOYMENT_RELEASE_DIR as
RELEASE_DIR = $(DEPLOYMENT_RELEASE_DIR)
In addition to setting up the configuration, the user must also determine where the installation package will reside. The installation package can be in either tar or gzipped tar form, and may reside on an ftp, http, grid ftp, or file server. Create this installation package by tar'ing up the binaries and libraries needed, and place it on the appropriate server. The binaries can be tar'ed in a flat structure or within bin and sbin. Here is a list of files to give an example structure for a dynamic deployment of the condor_schedd daemon.
% tar tfz latest-i686-Linux-2.4.21-37.ELsmp.tar.gz
bin/
bin/condor_config_val
bin/condor_q
sbin/
sbin/condor_preen
sbin/condor_shadow.std
sbin/condor_starter.std
sbin/condor_schedd
sbin/condor_master
sbin/condor_gridmanager
sbin/gahp_server
sbin/condor_starter
sbin/condor_shadow
sbin/condor_c-gahp
sbin/condor_off
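A package with this layout could be produced along the following lines. This is a minimal sketch under stated assumptions: the function name make_package is hypothetical, the source directory is assumed to already contain bin and sbin subdirectories holding the needed files, and the tar in use is assumed to support the -C option (GNU tar and bsdtar do).

```shell
#!/bin/sh
# Hypothetical helper: build a gzipped tar package with bin/ and sbin/
# at the top level of the archive.
# $1 is the directory containing bin/ and sbin/; $2 is the output file.
make_package() {
    src=$1
    out=$2
    # -C changes into the source directory first, so archive members
    # are stored as bin/... and sbin/... rather than with a longer prefix.
    tar -czf "$out" -C "$src" bin sbin
}
```

For example, make_package /path/to/release my-package.tar.gz, where both names are placeholders. Listing the result with tar tfz should show entries in the same form as the example above.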