This is an outdated version of the HTCondor Manual. You can find current documentation at http://htcondor.org/manual.

Next: 5.4 Dynamic Deployment Up: 5. Grid Computing Previous: 5.2 Connecting HTCondor Pools Contents Index

Subsections

5.3 The Grid Universe

5.3.1 HTCondor-C, The condor Grid Type

HTCondor-C allows jobs in one machine's job queue to be moved to another machine's job queue. These machines may be far removed from each other, providing powerful grid computation mechanisms, while requiring only HTCondor software and its configuration.

HTCondor-C is highly resistant to network disconnections and machine failures on both the submission and remote sides. An expected usage sets up Personal HTCondor on a laptop, submits some jobs that are sent to an HTCondor pool, waits until the jobs are staged on the pool, then turns off the laptop. When the laptop reconnects at a later time, any results can be pulled back.

HTCondor-C scales gracefully when compared with HTCondor's flocking mechanism. The machine upon which jobs are submitted maintains a single process and network connection to a remote machine, without regard to the number of jobs queued or running.

5.3.1.1 HTCondor-C Configuration

There are two aspects to configuration to enable the submission and execution of HTCondor-C jobs. These two aspects correspond to the endpoints of the communication: there is the machine from which jobs are submitted, and there is the remote machine upon which the jobs are placed in the queue (executed).

Configuration of a machine from which jobs are submitted requires a few extra configuration variables:

CONDOR_GAHP=$(SBIN)/condor_c-gahp
C_GAHP_LOG=/tmp/CGAHPLog.$(USERNAME)
C_GAHP_WORKER_THREAD_LOG=/tmp/CGAHPWorkerLog.$(USERNAME)

The acronym GAHP stands for Grid ASCII Helper Protocol. A GAHP server provides grid-related services for a variety of underlying middle-ware systems. The configuration variable CONDOR_GAHP gives a full path to the GAHP server utilized by HTCondor-C. The configuration variable C_GAHP_LOG defines the location of the log that the HTCondor GAHP server writes. The log for the HTCondor GAHP is written as the user on whose behalf it is running; thus the C_GAHP_LOG configuration variable must point to a location the end user can write to.

A submit machine must also have a condor_collector daemon to which the condor_schedd daemon can submit a query. The query is for the location (IP address and port) of the intended remote machine's condor_schedd daemon. This facilitates communication between the two machines. This condor_collector does not need to be the same collector that the local condor_schedd daemon reports to.

The machine upon which jobs are executed must also be configured correctly. This machine must be running a condor_schedd daemon. Unless specified explicitly in a submit file, CONDOR_HOST must point to a condor_collector daemon that it can write to, and the machine upon which jobs are submitted can read from. This facilitates communication between the two machines.

An important aspect of configuration is the security configuration relating to authentication. HTCondor-C on the remote machine relies on an authentication protocol to know the identity of the user under which to run a job. The following is a working example of the security configuration for authentication. This authentication method, CLAIMTOBE, trusts the identity claimed by a host or IP address.

SEC_DEFAULT_NEGOTIATION = OPTIONAL
SEC_DEFAULT_AUTHENTICATION_METHODS = CLAIMTOBE

Other working authentication methods are GSI, SSL, KERBEROS, and FS.

5.3.1.2 HTCondor-C Job Submission

Job submission of HTCondor-C jobs is the same as for any HTCondor job. The universe is grid. grid_resource specifies the remote condor_schedd daemon to which the job should be submitted, and its value consists of three fields. The first field is the grid type, which is condor. The second field is the name of the remote condor_schedd daemon. Its value is the same as the condor_schedd ClassAd attribute Name on the remote machine. The third field is the name of the remote pool's condor_collector.

The following represents a minimal submit description file for a job.

# minimal submit description file for an HTCondor-C job
universe = grid
executable = myjob
output = myoutput
error = myerror
log = mylog

grid_resource = condor joe@remotemachine.example.com remotecentralmanager.example.com
+remote_jobuniverse = 5
+remote_requirements = True
+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"
queue

The remote machine needs to understand the attributes of the job. These are specified in the submit description file using the '+' syntax, followed by the string remote_. At a minimum, this will be the job's universe and the job's requirements. It is likely that other attributes specific to the job's universe (on the remote pool) will also be necessary. Note that attributes set with '+' are inserted directly into the job's ClassAd. Specify attributes as they must appear in the job's ClassAd, not the submit description file. For example, the universe is specified using an integer assigned for a job ClassAd JobUniverse. Similarly, place quotation marks around string expressions. As an example, a submit description file would ordinarily contain

when_to_transfer_output = ON_EXIT

This must appear in the HTCondor-C job submit description file as

+remote_WhenToTransferOutput = "ON_EXIT"

For convenience, the specific entries of universe, remote_grid_resource, globus_rsl, and globus_xml may be specified as remote_ commands without the leading '+'. Instead of

+remote_universe = 5

the submit description file command may appear as

remote_universe = vanilla

Similarly, the command

+remote_gridresource = "condor schedd.example.com cm.example.com"

may be given as

remote_grid_resource = condor schedd.example.com cm.example.com

For the given example, the job is to be run as a vanilla universe job at the remote pool. The (remote pool's) condor_schedd daemon is likely to place its job queue data on a local disk and execute the job on another machine within the pool of machines. This implies that the file systems for the resulting submit machine (the machine specified by remote_schedd) and the execute machine (the machine that runs the job) will not be shared. Thus, the two inserted ClassAds

+remote_ShouldTransferFiles = "YES"
+remote_WhenToTransferOutput = "ON_EXIT"

are used to invoke HTCondor's file transfer mechanism.

As HTCondor-C is a recent addition to HTCondor, the universes, associated integer assignments, and notes about the existence of functionality are given in Table 5.1. The note "untested" implies that submissions under the given universe have not yet been thoroughly tested. They may already work.

Table 5.1: Functionality of remote job universes with HTCondor-C

Universe Name	Value	Notes
standard	1	untested
vanilla	5	works well
scheduler	7	works well
grid	9
	grid_resource is condor	works well
	grid_resource is cream	untested
	grid_resource is gt2	works well
	grid_resource is gt5	untested
	grid_resource is nordugrid	untested
	grid_resource is unicore	untested
	grid_resource is lsf	works well
	grid_resource is pbs	works well
java	10	untested
parallel	11	untested
local	12	works well

For communication between condor_schedd daemons on the submit and remote machines, the location of the remote condor_schedd daemon is needed. This information resides in the condor_collector of the remote machine's pool. The third field of the grid_resource command in the submit description file says which condor_collector should be queried for the remote condor_schedd daemon's location. An example of this submit command is

grid_resource = condor schedd.example.com machine1.example.com

If the remote condor_collector is not listening on the standard port (9618), then the port it is listening on needs to be specified:

grid_resource = condor schedd.example.comd machine1.example.com:12345

File transfer of a job's executable, stdin, stdout, and stderr are automatic. When other files need to be transferred using HTCondor's file transfer mechanism (see section 2.5.4 on page ), the mechanism is applied based on the resulting job universe on the remote machine.

5.3.1.3 HTCondor-C Jobs Between Differing Platforms

HTCondor-C jobs given to a remote machine running Windows must specify the Windows domain of the remote machine. This is accomplished by defining a ClassAd attribute for the job. Where the Windows domain is different at the submit machine from the remote machine, the submit description file defines the Windows domain of the remote machine with

  +remote_NTDomain = "DomainAtRemoteMachine"

A Windows machine not part of a domain defines the Windows domain as the machine name.

5.3.2 HTCondor-G, the gt2, and gt5 Grid Types

HTCondor-G is the name given to HTCondor when grid universe jobs are sent to grid resources utilizing Globus software for job execution. The Globus Toolkit provides a framework for building grid systems and applications. See the Globus Alliance web page at http://www.globus.org for descriptions and details of the Globus software.

HTCondor provides the same job management capabilities for HTCondor-G jobs as for other jobs. From HTCondor, a user may effectively submit jobs, manage jobs, and have jobs execute on widely distributed machines.

It may appear that HTCondor-G is a simple replacement for the Globus Toolkit's globusrun command. However, HTCondor-G does much more. It allows the submission of many jobs at once, along with the monitoring of those jobs with a convenient interface. There is notification when jobs complete or fail and maintenance of Globus credentials that may expire while a job is running. On top of this, HTCondor-G is a fault-tolerant system; if a machine crashes, all of these functions are again available as the machine returns.

5.3.2.1 Globus Protocols and Terminology

The Globus software provides a well-defined set of protocols that allow authentication, data transfer, and remote job execution. Authentication is a mechanism by which an identity is verified. Given proper authentication, authorization to use a resource is required. Authorization is a policy that determines who is allowed to do what.

HTCondor (and Globus) utilize the following protocols and terminology. The protocols allow HTCondor to interact with grid machines toward the end result of executing jobs.

GSI: The Globus Toolkit's Grid Security Infrastructure (GSI) provides essential building blocks for other grid protocols and HTCondor-G. This authentication and authorization system makes it possible to authenticate a user just once, using public key infrastructure (PKI) mechanisms to verify a user-supplied grid credential. GSI then handles the mapping of the grid credential to the diverse local credentials and authentication/authorization mechanisms that apply at each site.
GRAM: The Grid Resource Allocation and Management (GRAM) protocol supports remote submission of a computational request (for example, to run a program) to a remote computational resource, and it supports subsequent monitoring and control of the computation. GRAM is the Globus protocol that HTCondor-G uses to talk to remote Globus jobmanagers.
GASS: The Globus Toolkit's Global Access to Secondary Storage (GASS) service provides mechanisms for transferring data to and from a remote HTTP, FTP, or GASS server. GASS is used by HTCondor for the gt2 grid type to transfer a job's files to and from the machine where the job is submitted and the remote resource.
GridFTP: GridFTP is an extension of FTP that provides strong security and high-performance options for large data transfers.
RSL: RSL (Resource Specification Language) is the language GRAM accepts to specify job information.
gatekeeper: A gatekeeper is a software daemon executing on a remote machine on the grid. It is relevant only to the gt2 grid type, and this daemon handles the initial communication between HTCondor and a remote resource.
jobmanager: A jobmanager is the Globus service that is initiated at a remote resource to submit, keep track of, and manage grid I/O for jobs running on an underlying batch system. There is a specific jobmanager for each type of batch system supported by Globus (examples are HTCondor, LSF, and PBS).

In its interaction with Globus software, HTCondor contains a GASS server, used to transfer the executable, stdin, stdout, and stderr to and from the remote job execution site. HTCondor uses the GRAM protocol to contact the remote gatekeeper and request that a new jobmanager be started. The GRAM protocol is also used to when monitoring the job's progress. HTCondor detects and intelligently handles cases such as if the remote resource crashes.

There are now two different versions of the GRAM protocol in common usage: gt2 and gt5. HTCondor supports both of them.

gt2: This initial GRAM protocol is used in Globus Toolkit versions 1 and 2. It is still used by many production systems. Where available in the other, more recent versions of the protocol, gt2 is referred to as the pre-web services GRAM (or pre-WS GRAM) or GRAM2.
gt5: This latest GRAM protocol is an extension of GRAM2 that is intended to be more scalable and robust. It's usually referred to as GRAM5.

5.3.2.2 The gt2 Grid Type

HTCondor-G supports submitting jobs to remote resources running the Globus Toolkit's GRAM2 (or pre-WS GRAM) service. This flavor of GRAM is the most common. These HTCondor-G jobs are submitted the same as any other HTCondor job. The universe is grid, and the pre-web services GRAM protocol is specified by setting the type of grid as gt2 in the grid_resource command.

Under HTCondor, successful job submission to the grid universe with gt2 requires credentials. An X.509 certificate is used to create a proxy, and an account, authorization, or allocation to use a grid resource is required. For general information on proxies and certificates, please consult the Globus page at

http://www-unix.globus.org/toolkit/docs/4.0/security/key-index.html

Before submitting a job to HTCondor under the grid universe, use grid-proxy-init to create a proxy.

Here is a simple submit description file. The example specifies a gt2 job to be run on an NCSA machine.

executable = test
universe = grid
grid_resource = gt2 modi4.ncsa.uiuc.edu/jobmanager
output = test.out
log = test.log
queue

The executable for this example is transferred from the local machine to the remote machine. By default, HTCondor transfers the executable, as well as any files specified by an input command. Note that the executable must be compiled for its intended platform.

The command grid_resource is a required command for grid universe jobs. The second field specifies the scheduling software to be used on the remote resource. There is a specific jobmanager for each type of batch system supported by Globus. The full syntax for this command line appears as

grid_resource = gt2 machinename[:port]/jobmanagername[:X.509 distinguished name]

The portions of this syntax specification enclosed within square brackets ([ and ]) are optional. On a machine where the jobmanager is listening on a nonstandard port, include the port number. The jobmanagername is a site-specific string. The most common one is jobmanager-fork, but others are

jobmanager
jobmanager-condor
jobmanager-pbs
jobmanager-lsf
jobmanager-sge

The Globus software running on the remote resource uses this string to identify and select the correct service to perform. Other jobmanagername strings are used, where additional services are defined and implemented.

The job log file is maintained on the submit machine.

Example output from condor_q for this submission looks like:

% condor_q


-- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi

 ID      OWNER         SUBMITTED     RUN_TIME ST PRI SIZE CMD
   7.0   smith        3/26 14:08   0+00:00:00 I  0   0.0  test

1 jobs; 1 idle, 0 running, 0 held

After a short time, the Globus resource accepts the job. Again running condor_q will now result in

% condor_q


-- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi

 ID      OWNER         SUBMITTED     RUN_TIME ST PRI SIZE CMD
   7.0   smith        3/26 14:08   0+00:01:15 R  0   0.0  test

1 jobs; 0 idle, 1 running, 0 held

Then, very shortly after that, the queue will be empty again, because the job has finished:

% condor_q


-- Submitter: wireless48.cs.wisc.edu : <128.105.48.148:33012> : wireless48.cs.wi

 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held

A second example of a submit description file runs the Unix ls program on a different Globus resource.

executable = /bin/ls
transfer_executable = false
universe = grid
grid_resource = gt2 vulture.cs.wisc.edu/jobmanager
output = ls-test.out
log = ls-test.log
queue

In this example, the executable (the binary) has been pre-staged. The executable is on the remote machine, and it is not to be transferred before execution. Note that the required grid_resource and universe commands are present. The command

transfer_executable = false

within the submit description file identifies the executable as being pre-staged. In this case, the executable command gives the path to the executable on the remote machine.

A third example submits a Perl script to be run as a submitted HTCondor job. The Perl script both lists and sets environment variables for a job. Save the following Perl script with the name env-test.pl, to be used as an HTCondor job executable.

#!/usr/bin/env perl

foreach $key (sort keys(%ENV))
{
   print "$key = $ENV{$key}\n"
}

exit 0;

Run the Unix command

chmod 755 env-test.pl

to make the Perl script executable.

Now create the following submit description file. Replace example.cs.wisc.edu/jobmanager with a resource you are authorized to use.

executable = env-test.pl
universe = grid
grid_resource = gt2 example.cs.wisc.edu/jobmanager
environment = foo=bar; zot=qux
output = env-test.out
log = env-test.log
queue

When the job has completed, the output file, env-test.out, should contain something like this:

GLOBUS_GRAM_JOB_CONTACT = https://example.cs.wisc.edu:36213/30905/1020633947/
GLOBUS_GRAM_MYJOB_CONTACT = URLx-nexus://example.cs.wisc.edu:36214
GLOBUS_LOCATION = /usr/local/globus
GLOBUS_REMOTE_IO_URL = /home/smith/.globus/.gass_cache/globus_gass_cache_1020633948
HOME = /home/smith
LANG = en_US
LOGNAME = smith
X509_USER_PROXY = /home/smith/.globus/.gass_cache/globus_gass_cache_1020633951
foo = bar
zot = qux

Of particular interest is the GLOBUS_REMOTE_IO_URL environment variable. HTCondor-G automatically starts up a GASS remote I/O server on the submit machine. Because of the potential for either side of the connection to fail, the URL for the server cannot be passed directly to the job. Instead, it is placed into a file, and the GLOBUS_REMOTE_IO_URL environment variable points to this file. Remote jobs can read this file and use the URL it contains to access the remote GASS server running inside HTCondor-G. If the location of the GASS server changes (for example, if HTCondor-G restarts), HTCondor-G will contact the Globus gatekeeper and update this file on the machine where the job is running. It is therefore important that all accesses to the remote GASS server check this file for the latest location.

The following example is a Perl script that uses the GASS server in HTCondor-G to copy input files to the execute machine. In this example, the remote job counts the number of lines in a file.

#!/usr/bin/env perl
use FileHandle;
use Cwd;

STDOUT->autoflush();
$gassUrl = `cat $ENV{GLOBUS_REMOTE_IO_URL}`;
chomp $gassUrl;

$ENV{LD_LIBRARY_PATH} = $ENV{GLOBUS_LOCATION}. "/lib";
$urlCopy = $ENV{GLOBUS_LOCATION}."/bin/globus-url-copy";

# globus-url-copy needs a full path name
$pwd = getcwd();
print "$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts\n\n";
`$urlCopy $gassUrl/etc/hosts file://$pwd/temporary.hosts`;

open(file, "temporary.hosts");
while(<file>) {
print $_;
}

exit 0;

The submit description file used to submit the Perl script as an HTCondor job appears as:

executable = gass-example.pl
universe = grid
grid_resource = gt2 example.cs.wisc.edu/jobmanager
output = gass.out
log = gass.log
queue

There are two optional submit description file commands of note: x509userproxy and globus_rsl. The x509userproxy command specifies the path to an X.509 proxy. The command is of the form:

x509userproxy = /path/to/proxy

If this optional command is not present in the submit description file, then HTCondor-G checks the value of the environment variable X509_USER_PROXY for the location of the proxy. If this environment variable is not present, then HTCondor-G looks for the proxy in the file /tmp/x509up_uXXXX, where the characters XXXX in this file name are replaced with the Unix user id.

The globus_rsl command is used to add additional attribute settings to a job's RSL string. The format of the globus_rsl command is

globus_rsl = (name=value)(name=value)

Here is an example of this command from a submit description file:

globus_rsl = (project=Test_Project)

This example's attribute name for the additional RSL is project, and the value assigned is Test_Project.

5.3.2.3 The gt5 Grid Type

The Globus GRAM5 protocol works the same as the gt2 grid type. Its implementation differs from gt2 in the following 3 items:

The Grid Monitor is disabled.
Globus job managers are not stopped and restarted.
The configuration variable GRIDMANAGER_MAX_JOBMANAGERS_PER_RESOURCE is not applied (for gt5 jobs).

Normally, HTCondor will automatically detect whether a service is GRAM2 or GRAM5 and interact with it accordingly. It does not matter whether gt2 or gt5 is specified. Disable this detection by setting the configuration variable GRAM_VERSION_DETECTION to False. If disabled, each resource must be accurately identified as either gt2 or gt5 in the grid_resource submit command.

5.3.2.4 Credential Management with MyProxy

HTCondor-G can use MyProxy software to automatically renew GSI proxies for grid universe jobs with grid type gt2. MyProxy is a software component developed at NCSA and used widely throughout the grid community. For more information see: http://grid.ncsa.illinois.edu/myproxy/

Difficulties with proxy expiration occur in two cases. The first case are long running jobs, which do not complete before the proxy expires. The second case occurs when great numbers of jobs are submitted. Some of the jobs may not yet be started or not yet completed before the proxy expires. One proposed solution to these difficulties is to generate longer-lived proxies. This, however, presents a greater security problem. Remember that a GSI proxy is sent to the remote Globus resource. If a proxy falls into the hands of a malicious user at the remote site, the malicious user can impersonate the proxy owner for the duration of the proxy's lifetime. The longer the proxy's lifetime, the more time a malicious user has to misuse the owner's credentials. To minimize the window of opportunity of a malicious user, it is recommended that proxies have a short lifetime (on the order of several hours).

The MyProxy software generates proxies using credentials (a user certificate or a long-lived proxy) located on a secure MyProxy server. HTCondor-G talks to the MyProxy server, renewing a proxy as it is about to expire. Another advantage that this presents is it relieves the user from having to store a GSI user certificate and private key on the machine where jobs are submitted. This may be particularly important if a shared HTCondor-G submit machine is used by several users.

In the a typical case, the following steps occur:

The user creates a long-lived credential on a secure MyProxy server, using the myproxy-init command. Each organization generally has their own MyProxy server.
The user creates a short-lived proxy on a local submit machine, using grid-proxy-init or myproxy-get-delegation.
The user submits an HTCondor-G job, specifying:

MyProxy server name (host:port)

MyProxy credential name (optional)

MyProxy password
At the short-lived proxy expiration HTCondor-G talks to the MyProxy server to refresh the proxy.

HTCondor-G keeps track of the password to the MyProxy server for credential renewal. Although HTCondor-G tries to keep the password encrypted and secure, it is still possible (although highly unlikely) for the password to be intercepted from the HTCondor-G machine (more precisely, from the machine that the condor_schedd daemon that manages the grid universe jobs runs on, which may be distinct from the machine from where jobs are submitted). The following safeguard practices are recommended.

Provide time limits for credentials on the MyProxy server. The default is one week, but you may want to make it shorter.
Create several different MyProxy credentials, maybe as many as one for each submitted job. Each credential has a unique name, which is identified with the MyProxyCredentialName command in the submit description file.
Use the following options when initializing the credential on the MyProxy server:
```
myproxy-init -s <host> -x -r <cert subject> -k <cred name>
```
The option -x -r <cert subject> essentially tells the MyProxy server to require two forms of authentication:
1. a password (initially set with myproxy-init)
2. an existing proxy (the proxy to be renewed)

A submit description file may include the password. An example contains commands of the form:

executable      = /usr/bin/my-executable
universe        = grid
grid_resource   = gt2 condor-unsup-7
MyProxyHost     = example.cs.wisc.edu:7512
MyProxyServerDN = /O=doesciencegrid.org/OU=People/CN=Jane Doe 25900
MyProxyPassword = password
MyProxyCredentialName = my_executable_run
queue

Note that placing the password within the submit file is not really secure, as it relies upon whatever file system security there is. This may still be better than option 5.

Use the -p option to condor_submit. The submit command appears as
```
condor_submit -p mypassword /home/user/myjob.submit
```
The argument list for condor_submit defaults to being publicly available. An attacker with a log in to the local machine could generate a simple shell script to watch for the password.

Currently, HTCondor-G calls the myproxy-get-delegation command-line tool, passing it the necessary arguments. The location of the myproxy-get-delegation executable is determined by the configuration variable MYPROXY_GET_DELEGATION in the configuration file on the HTCondor-G machine. This variable is read by the condor_gridmanager. If myproxy-get-delegation is a dynamically-linked executable (verify this with ldd myproxy-get-delegation), point MYPROXY_GET_DELEGATION to a wrapper shell script that sets LD_LIBRARY_PATH to the correct MyProxy library or Globus library directory and then calls myproxy-get-delegation. Here is an example of such a wrapper script:

#!/bin/sh
export LD_LIBRARY_PATH=/opt/myglobus/lib
exec /opt/myglobus/bin/myproxy-get-delegation $@

5.3.2.5 The Grid Monitor

HTCondor's Grid Monitor is designed to improve the scalability of machines running the Globus Toolkit's GRAM2 gatekeeper. Normally, this service runs a jobmanager process for every job submitted to the gatekeeper. This includes both currently running jobs and jobs waiting in the queue. Each jobmanager runs a Perl script at frequent intervals (every 10 seconds) to poll the state of its job in the local batch system. For example, with 400 jobs submitted to a gatekeeper, there will be 400 jobmanagers running, each regularly starting a Perl script. When a large number of jobs have been submitted to a single gatekeeper, this frequent polling can heavily load the gatekeeper. When the gatekeeper is under heavy load, the system can become non-responsive, and a variety of problems can occur.

HTCondor's Grid Monitor temporarily replaces these jobmanagers. It is named the Grid Monitor, because it replaces the monitoring (polling) duties previously done by jobmanagers. When the Grid Monitor runs, HTCondor attempts to start a single process to poll all of a user's jobs at a given gatekeeper. While a job is waiting in the queue, but not yet running, HTCondor shuts down the associated jobmanager, and instead relies on the Grid Monitor to report changes in status. The jobmanager started to add the job to the remote batch system queue is shut down. The jobmanager restarts when the job begins running.

The Grid Monitor requires that the gatekeeper support the fork jobmanager with the name jobmanager-fork. If the gatekeeper does not support the fork jobmanager, the Grid Monitor will not be used for that site. The condor_gridmanager log file reports any problems using the Grid Monitor.

The Grid Monitor is enabled by default, and the configuration macro GRID_MONITOR identifies the location of the executable.

5.3.2.6 Limitations of HTCondor-G

Submitting jobs to run under the grid universe has not yet been perfected. The following is a list of known limitations:

No checkpoints.
No job exit codes. Job exit codes are not available when using gt2.
Limited platform availability. Windows support is not yet available.

5.3.3 The nordugrid Grid Type

NorduGrid is a project to develop free grid middleware named the Advanced Resource Connector (ARC). See the NorduGrid web page (http://www.nordugrid.org) for more information about NorduGrid software.

HTCondor jobs may be submitted to NorduGrid resources using the grid universe. The grid_resource command specifies the name of the NorduGrid resource as follows:

grid_resource = nordugrid ng.example.com

NorduGrid uses X.509 credentials for authentication, usually in the form a proxy certificate. condor_submit looks in default locations for the proxy. The submit description file command x509userproxy is used to give the full path name to the directory containing the proxy, when the proxy is not in a default location. If this optional command is not present in the submit description file, then the value of the environment variable X509_USER_PROXY is checked for the location of the proxy. If this environment variable is not present, then the proxy in the file /tmp/x509up_uXXXX is used, where the characters XXXX in this file name are replaced with the Unix user id.

NorduGrid uses RSL syntax to describe jobs. The submit description file command nordugrid_rsl adds additional attributes to the job RSL that HTCondor constructs. The format this submit description file command is

nordugrid_rsl = (name=value)(name=value)

5.3.4 The unicore Grid Type

Unicore is a Java-based grid scheduling system. See http://www.unicore.eu/ for more information about Unicore.

HTCondor jobs may be submitted to Unicore resources using the grid universe. The grid_resource command specifies the name of the Unicore resource as follows:

grid_resource = unicore usite.example.com vsite

usite.example.com is the host name of the Unicore gateway machine to which the HTCondor job is to be submitted. vsite is the name of the Unicore virtual resource to which the HTCondor job is to be submitted.

Unicore uses certificates stored in a Java keystore file for authentication. The following submit description file commands are required to properly use the keystore file.

keystore_file: Specifies the complete path and file name of the Java keystore file to use.
keystore_alias: A string that specifies which certificate in the Java keystore file to use.
keystore_passphrase_file: Specifies the complete path and file name of the file containing the passphrase protecting the certificate in the Java keystore file.

5.3.5 The batch Grid Type (for PBS, LSF, and SGE)

The batch grid type is used to submit to a local PBS, LSF, or SGE system using the grid universe and the grid_resource command by placing a variant of the following into the submit description file.

grid_resource = batch pbs

The second argument on the right hand side will be one of pbs, lsf, or sge.

Any of these batch grid types requires two variables to be set in the HTCondor configuration file. BATCH_GAHP is the path to the GAHP server binary that is to be used to submit one of these batch jobs. GLITE_LOCATION is the path to the directory containing the GAHP's configuration file and auxillary binaries. In the HTCondor distribution, these files are located in $(LIB)/glite. The batch GAHP's configuration file is in $(GLITE_LOCATION)/etc/batch_gahp.config. The batch GAHP's auxillary binaries are to be in the directory $(GLITE_LOCATION)/bin. The HTCondor configuration file appears

GLITE_LOCATION = $(LIB)/glite
BATCH_GAHP     = $(GLITE_LOCATION)/bin/batch_gahp

The batch GAHP's configuration file has variables that must be modified to tell it where to find

PBS: on the local system. pbs_binpath is the directory that contains the PBS binaries. pbs_spoolpath is the PBS spool directory.
LSF: on the local system. lsf_binpath is the directory that contains the LSF binaries. lsf_confpath is the location of the LSF configuration file.

The popular PBS (Portable Batch System) can be found at http://www.pbsworks.com/, and Torque is at (http://www.adaptivecomputing.com/products/open-source/torque/).

As an alternative to the submission details given above, HTCondor jobs may be submitted to a local PBS system using the grid universe and the grid_resource command by placing the following into the submit description file.

grid_resource = pbs

HTCondor jobs may be submitted to the Platform LSF batch system. See the Products page of the Platform web page at http://www.platform.com/Products/ for more information about Platform LSF.

As an alternative to the submission details given above, HTCondor jobs may be submitted to a local Platform LSF system using the grid universe and the grid_resource command by placing the following into the submit description file.

grid_resource = lsf

The popular Grid Engine batch system (formerly known as Sun Grid Engine and abbreviated SGE) is available in two varieties: Oracle Grid Engine (http://www.oracle.com/us/products/tools/oracle-grid-engine-075549.html) and Univa Grid Engine (http://www.univa.com/?gclid=CLXg6-OEy6wCFWICQAodl0lm9Q).

As an alternative to the submission details given above, HTCondor jobs may be submitted to a local SGE system using the grid universe and adding the grid_resource command by placing into the submit description file:

grid_resource = sge

5.3.6 The EC2 Grid Type

HTCondor jobs may be submitted to clouds supporting Amazon's Elastic Compute Cloud (EC2) interface. Amazon's EC2 is an on-line commercial service that allows the rental of computers by the hour to run computational applications. It runs virtual machine images that have been uploaded to Amazon's online storage service (S3 or EBS). More information about Amazon's EC2 service is available at http://aws.amazon.com/ec2.

The ec2 grid type uses the EC2 Query API, also called the EC2 REST API.

5.3.6.1 EC2 Job Submission

HTCondor jobs are submitted to Amazon's EC2 with the grid universe, and setting the grid_resource command to ec2, followed by the service's URL. For example, partial contents of the submit description file may be

grid_resource = ec2 https://ec2.amazonaws.com/

Since the job is a virtual machine image, most of the submit description file commands specifying input or output files are not applicable. The executable command is still required, but its value is ignored. It can be used to identify different jobs in the output of condor_q.

The VM image for the job must already reside in one of Amazon's storage service (S3 or EBS) and be registered with EC2. In the submit description file, provide the identifier for the image using either ec2_ami_id.

This grid type requires access to user authentication information, in the form of path names to files containing the appropriate keys.

The ec2 grid type has two different authentication methods. The first authentication method uses the EC2 API's built-in authentication. Specify the service with expected http:// or https:// URL, and set the EC2 access key and secret access key as follows:

ec2_access_key_id = /path/to/access.key
ec2_secret_access_key = /path/to/secret.key

While both pairs of files may be associated with the same account, the credentials are not the same.

The second authentication method for the EC2 grid type is X.509. Specify the service with an x509:// URL, even if the URL was given in another form. Use ec2_access_key_id to specify the path to the X.509 public key (certificate), and ec2_secret_access_key specifies the path to the X.509 private key as in the following example:

grid_resource = ec2 x509://service.example
ec2_access_key_id = /path/to/x.509/public.key
ec2_secret_access_key = /path/to/x.509/private.key

If using an X.509 proxy, specify the proxy in both places.

HTCondor can use the EC2 API to create an SSH key pair that allows secure log in to the virtual machine once it is running. If the command ec2_keypair_file is set in the submit description file, HTCondor will write an SSH private key into the indicated file. The key can be used to log into the virtual machine. Note that modification will also be needed of the firewall rules for the job to incoming SSH connections.

An EC2 service uses a firewall to restrict network access to the virtual machine instances it runs. Typically, no incoming connections are allowed. One can define sets of firewall rules and give them names. The EC2 API calls these security groups. If utilized, tell HTCondor what set of security groups should be applied to each VM using the ec2_security_groups submit description file command. If not provided, HTCondor uses the security group default.

The EC2 API allows the choice of different hardware configurations for instances to run on. Select which configuration to use for the ec2 grid type with the ec2_instance_type submit description file command. HTCondor provides no default.

Each virtual machine instance can be given up to 16Kbytes of unique data, accessible by the instance by connecting to a well-known address. This makes it easy for many instances to share the same VM image, but perform different work. This data can be specified to HTCondor in one of two ways. First, the data can be provided directly in the submit description file using the ec2_user_data command. Second, the data can be stored in a file, and the file name is specified with the ec2_user_data_file submit description file command. This second option allows the use of binary data. If both options are used, the two blocks of data are concatenated, with the data from ec2_user_data occurring first. HTCondor performs the base64 encoding that EC2 expects on the data.

5.3.6.2 EC2 Configuration Variables

The ec2 grid type requires these configuration variables to be set in the HTCondor configuration file:

EC2_GAHP     = $(SBIN)/ec2_gahp
EC2_GAHP_LOG = /tmp/EC2GahpLog.$(USERNAME)

The ec2 grid type does not presently permit the explicit use of an HTTP proxy.

5.3.7 The cream Grid Type

CREAM is a job submission interface being developed at INFN for the gLite software stack. The CREAM homepage is http://grid.pd.infn.it/cream/. The protocol is based on web services.

The protocol requires an X.509 proxy for the job, so the submit description file command x509userproxy will be used.

A CREAM resource specification is of the form:

grid_resource = cream <web-services-address> <batch-system> <queue-name>

The <web-services-address> appears the same for most servers, differing only in the host name, as

<machinename[:port]>/ce-cream/services/CREAM2

Future versions of HTCondor may require only the host name, filling in other aspects of the web service for the user.

The <batch-system> is the name of the batch system that sits behind the CREAM server, into which it submits the jobs. Normal values are pbs, lsf, and condor.

The <queue-name> identifies which queue within the batch system should be used. Values for this will vary by site, with no typical values.

A full example for the specification of a CREAM grid_resource is

grid_resource = cream https://cream-12.pd.infn.it:8443/ce-cream/services/CREAM2
   pbs cream_1

This is a single line within the submit description file, although it is shown here on two lines for formatting reasons.

5.3.8 The deltacloud Grid Type

HTCondor jobs may be submitted to Deltacloud services. Deltacloud is a translation service for cloud services. Cloud services allow the rental of computers by the hour to run computation applications. Many cloud services define their own protocol for users to communicate with them. Deltacloud defines its own simple protocol and translates a user's commands into the appropriate protocol for the cloud service the user specifies. Anyone can set up a Deltacloud service and configure it to translate for a specific cloud service. See the Deltacloud web page at http://deltacloud.apache.org/ for more information about Deltacloud.

5.3.8.1 Deltacloud Job Submission

HTCondor jobs are submitted to Deltacloud using the grid universe and the grid_resource command into the submit description file following this example:

grid_resource = deltacloud https://deltacloud.foo.org/api

The URL in this example will be replaced with the URL of the Deltacloud service desired.

The VM image for the job must already be stored and registered with the cloud service. In the submit description file, provide the identifier for the image using the deltacloud_image_id command.

To authenticate with Deltacloud, HTCondor needs your credentials for the cloud service that the Deltacloud server is representing. The credentials are presented as a user name and the name of a file that holds a secret key. Both are specified in the submit description file:

deltacloud_username = your_username
deltacloud_password_file = /path/to/password/file

You can create and register an SSH key pair with the cloud service, which you can then use to securely log in to virtual machines, once running. The command deltacloud_keyname in the submit description file specifies the identifier of the SSH key pair to use.

The cloud service may have multiple locations where the virtual machine can run. The submit description file command deltacloud_realm_id selects one. If not specified, the service will select a sensible default.

The cloud service may offer several hardware configurations for instances to run on. Select which configuration to use with the deltacloud_hardware_profile submit description file command. If not specified, the cloud service will select a sensible default. The optional commands deltacloud_hardware_profile_memory, deltacloud_hardware_profile_cpu, and deltacloud_hardware_profile_storage customize the selected hardware profile.

Each virtual machine instance can be given some unique data, accessible by the instance connecting to a well-known address. This makes it easy for many instances to share the same VM image, but perform different work. This data can be specified with the submit description file command deltacloud_user_data. The amount of data that can be provided depends on the cloud service. EC2 services allow up to 16Kb of data.

5.3.8.2 Configuration for Deltacloud

The deltacloud grid type requires one configuration variable to be set, to specify the path and executable of the deltacloud_gahp:

DELTACLOUD_GAHP = $(SBIN)/deltacloud_gahp

5.3.9 Matchmaking in the Grid Universe

In a simple usage, the grid universe allows users to specify a single grid site as a destination for jobs. This is sufficient when a user knows exactly which grid site they wish to use, or a higher-level resource broker (such as the European Data Grid's resource broker) has decided which grid site should be used.

When a user has a variety of grid sites to choose from, HTCondor allows matchmaking of grid universe jobs to decide which grid resource a job should run on. Please note that this form of matchmaking is relatively new. There are some rough edges as continual improvement occurs.

To facilitate HTCondor's matching of jobs with grid resources, both the jobs and the grid resources are involved. The job's submit description file provides all commands needed to make the job work on a matched grid resource. The grid resource identifies itself to HTCondor by advertising a ClassAd. This ClassAd specifies all necessary attributes, such that HTCondor can properly make matches. The grid resource identification is accomplished by using condor_advertise to send a ClassAd representing the grid resource, which is then used by HTCondor to make matches.

5.3.9.1 Job Submission

To submit a grid universe job intended for a single, specific gt2 resource, the submit description file for the job explicitly specifies the resource:

grid_resource = gt2 grid.example.com/jobmanager-pbs

If there were multiple gt2 resources that might be matched to the job, the submit description file changes:

grid_resource   = $$(resource_name)
requirements    = TARGET.resource_name =!= UNDEFINED

The grid_resource command uses a substitution macro. The substitution macro defines the value of resource_name using attributes as specified by the matched grid resource. The requirements command further restricts that the job may only run on a machine (grid resource) that defines grid_resource. Note that this attribute name is invented for this example. To make matchmaking work in this way, both the job (as used here within the submit description file) and the grid resource (in its created and advertised ClassAd) must agree upon the name of the attribute.

As a more complex example, consider a job that wants to run not only on a gt2 resource, but on one that has the Bamboozle software installed. The complete submit description file might appear:

universe        = grid
executable      = analyze_bamboozle_data
output          = aaa.$(Cluster).out
error           = aaa.$(Cluster).err
log             = aaa.log
grid_resource   = $$(resource_name)
requirements    = (TARGET.HaveBamboozle == True) && (TARGET.resource_name =!= UNDEFINED)
queue

Any grid resource which has the HaveBamboozle attribute defined as well as set to True is further checked to have the resource_name attribute defined. Where this occurs, a match may be made (from the job's point of view). A grid resource that has one of these attributes defined, but not the other results in no match being made.

Note that the entire value of grid_resource comes from the grid resource's ad. This means that the job can be matched with a resource of any type, not just gt2.

5.3.9.2 Advertising Grid Resources to HTCondor

Any grid resource that wishes to be matched by HTCondor with a job must advertise itself to HTCondor using a ClassAd. To properly advertise, a ClassAd is sent periodically to the condor_collector daemon. A ClassAd is a list of pairs, where each pair consists of an attribute name and value that describes an entity. There are two entities relevant to HTCondor: a job, and a machine. A grid resource is a machine. The ClassAd describes the grid resource, as well as identifying the capabilities of the grid resource. It may also state both requirements and preferences (called rank) for the jobs it will run. See Section 2.3 for an overview of the interaction between matchmaking and ClassAds. A list of common machine ClassAd attributes is given in the Appendix on page .

To advertise a grid site, place the attributes in a file. Here is a sample ClassAd that describes a grid resource that is capable of running a gt2 job.

# example grid resource ClassAd for a gt2 job
MyType         = "Machine"
TargetType     = "Job"
Name           = "Example1_Gatekeeper"
Machine        = "Example1_Gatekeeper"
resource_name  = "gt2 grid.example.com/jobmanager-pbs"
UpdateSequenceNumber  = 4
Requirements   = (TARGET.JobUniverse == 9)
Rank           = 0.000000
CurrentRank    = 0.000000

Some attributes are defined as expressions, while others are integers, floating point values, or strings. The type is important, and must be correct for the ClassAd to be effective. The attributes

MyType         = "Machine"
TargetType     = "Job"

identify the grid resource as a machine, and that the machine is to be matched with a job. In HTCondor, machines are matched with jobs, and jobs are matched with machines. These attributes are strings. Strings are surrounded by double quote marks.

The attributes Name and Machine are likely to be defined to be the same string value as in the example:

Name           = "Example1_Gatekeeper"
Machine        = "Example1_Gatekeeper"

Both give the fully qualified host name for the resource. The Name may be different on an SMP machine, where the individual CPUs are given names that can be distinguished from each other. Each separate grid resource must have a unique name.

Where the job depends on the resource to specify the value of the grid_resource command by the use of the substitution macro, the ClassAd for the grid resource (machine) defines this value. The example given as

grid_resource = "gt2 grid.example.com/jobmanager-pbs"

defines this value. Note that the invented name of this variable must match the one utilized within the submit description file. To make the matchmaking work, both the job (as used within the submit description file) and the grid resource (in this created and advertised ClassAd) must agree upon the name of the attribute.

A machine's ClassAd information can be time sensitive, and may change over time. Therefore, ClassAds expire and are thrown away. In addition, the communication method by which ClassAds are sent implies that entire ads may be lost without notice or may arrive out of order. Out of order arrival leads to the definition of an attribute which provides an ordering. This positive integer value is given in the example ClassAd as

UpdateSequenceNumber  = 4

This value must increase for each subsequent ClassAd. If state information for the ClassAd is kept in a file, a script executed each time the ClassAd is to be sent may use a counter for this value. An alternative for a stateless implementation sends the current time in seconds (since the epoch, as given by the C time() function call).

The requirements that the grid resource sets for any job that it will accept are given as

Requirements     = (TARGET.JobUniverse == 9)

This set of requirements state that any job is required to be for the grid universe.

The attributes

Rank             = 0.000000
CurrentRank      = 0.000000

are both necessary for HTCondor's negotiation to proceed, but are not relevant to grid matchmaking. Set both to the floating point value 0.0.

The example machine ClassAd becomes more complex for the case where the grid resource allows matches with more than one job:

# example grid resource ClassAd for a gt2 job
MyType         = "Machine"
TargetType     = "Job"
Name           = "Example1_Gatekeeper"
Machine        = "Example1_Gatekeeper"
resource_name  = "gt2 grid.example.com/jobmanager-pbs"
UpdateSequenceNumber  = 4
Requirements   = (CurMatches < 10) && (TARGET.JobUniverse == 9)
Rank           = 0.000000
CurrentRank    = 0.000000
WantAdRevaluate = True
CurMatches     = 1

In this example, the two attributes WantAdRevaluate and CurMatches appear, and the Requirements expression has changed.

WantAdRevaluate is a boolean value, and may be set to either True or False. When True in the ClassAd and a match is made (of a job to the grid resource), the machine (grid resource) is not removed from the set of machines to be considered for further matches. This implements the ability for a single grid resource to be matched to more than one job at a time. Note that the spelling of this attribute is incorrect, and remains incorrect to maintain backward compatibility.

To limit the number of matches made to the single grid resource, the resource must have the ability to keep track of the number of HTCondor jobs it has. This integer value is given as the CurMatches attribute in the advertised ClassAd. It is then compared in order to limit the number of jobs matched with the grid resource.

Requirements   = (CurMatches < 10) && (TARGET.JobUniverse == 9)
CurMatches     = 1

This example assumes that the grid resource already has one job, and is willing to accept a maximum of 9 jobs. If CurMatches does not appear in the ClassAd, HTCondor uses a default value of 0.

For multiple matching of a site ClassAd to work correctly, it is also necessary to add the following to the configuration file read by the condor_negotiator:

NEGOTIATOR_MATCHLIST_CACHING = False
NEGOTIATOR_IGNORE_USER_PRIORITIES = True

This ClassAd (likely in a file) is to be periodically sent to the condor_collector daemon using condor_advertise. A recommended implementation uses a script to create or modify the ClassAd together with cron to send the ClassAd every five minutes. The condor_advertise program must be installed on the machine sending the ClassAd, but the remainder of HTCondor does not need to be installed. The required argument for the condor_advertise command is UPDATE_STARTD_AD.

condor_advertise uses UDP to transmit the ClassAd. Where this is insufficient, specify the -tcp option to condor_advertise to use TCP for communication.

5.3.9.3 Advanced usage

What if a job fails to run at a grid site due to an error? It will be returned to the queue, and HTCondor will attempt to match it and re-run it at another site. HTCondor isn't very clever about avoiding sites that may be bad, but you can give it some assistance. Let's say that you want to avoid running at the last grid site you ran at. You could add this to your job description:

match_list_length = 1
Rank              = TARGET.Name != LastMatchName0

This will prefer to run at a grid site that was not just tried, but it will allow the job to be run there if there is no other option.

When you specify match_list_length, you provide an integer N, and HTCondor will keep track of the last N matches. The oldest match will be LastMatchName0, and next oldest will be LastMatchName1, and so on. (See the condor_submit manual page for more details.) The Rank expression allows you to specify a numerical ranking for different matches. When combined with match_list_length, you can prefer to avoid sites that you have already run at.

In addition, condor_submit has two options to help control grid universe job resubmissions and rematching. See the definitions of the submit description file commands globus_resubmit and globus_rematch at page and page . These options are independent of match_list_length.

There are some new attributes that will be added to the Job ClassAd, and may be useful to you when you write your rank, requirements, globus_resubmit or globus_rematch option. Please refer to the Appendix on page to see a list containing the following attributes:

NumJobMatches
NumGlobusSubmits
NumSystemHolds
HoldReason
ReleaseReason
EnteredCurrentStatus
LastMatchTime
LastRejMatchTime
LastRejMatchReason

The following example of a command within the submit description file releases jobs 5 minutes after being held, increasing the time between releases by 5 minutes each time. It will continue to retry up to 4 times per Globus submission, plus 4. The plus 4 is necessary in case the job goes on hold before being submitted to Globus, although this is unlikely.

periodic_release = ( NumSystemHolds <= ((NumGlobusSubmits * 4) + 4) ) \
   && (NumGlobusSubmits < 4) && \
   ( HoldReason != "via condor_hold (by user $ENV(USER))" ) && \
   ((CurrentTime - EnteredCurrentStatus) > ( NumSystemHolds *60*5 ))

The following example forces Globus resubmission after a job has been held 4 times per Globus submission.

globus_resubmit = NumSystemHolds == (NumGlobusSubmits + 1) * 4

If you are concerned about unknown or malicious grid sites reporting to your condor_collector, you should use HTCondor's security options, documented in Section 3.6.

Next: 5.4 Dynamic Deployment Up: 5. Grid Computing Previous: 5.2 Connecting HTCondor Pools Contents Index

htcondor-admin@cs.wisc.edu