Queue jobs for execution under HTCondor
condor_submit [-terse] [-verbose] [-unused] [-file submit_file] [-name schedd_name] [-remote schedd_name] [-addr <ip:port>] [-pool pool_name] [-disable] [-password passphrase] [-debug] [-append command …][-batch-name batch_name] [-spool] [-dump filename] [-interactive] [-allow-crlf-script] [-dry-run] [-maxjobs number-of-jobs] [-single-cluster] [-stm method] [<submit-variable>=<value>] [submit description file] [-queue queue_arguments]
condor_submit is the program for submitting jobs for execution under HTCondor. condor_submit requires one or more submit description commands to direct the queuing of jobs. These commands may come from a file, standard input, the command line, or from some combination of these. One submit description may contain specifications for the queuing of many HTCondor jobs at once. A single invocation of condor_submit may cause one or more clusters. A cluster is a set of jobs specified in the submit description between queue commands for which the executable is not changed. It is advantageous to submit multiple jobs as a single cluster because:
Multiple clusters may be specified within a single submit description. Each cluster must specify a single executable.
The job ClassAd attribute ClusterId identifies a cluster.
The submit description file argument is the path and file name of the submit description file. If this optional argument is the dash character (-), then the commands are taken from standard input. If - is specified for the submit description file, -verbose is implied; this can be overridden by specifying -terse.
If no submit discription file argument is given, and no -queue argument is given, commands are taken automatically from standard input.
Note that submission of jobs from a Windows machine requires a stashed password to allow HTCondor to impersonate the user submitting the job. To stash a password, use the condor_store_cred command. See the manual page for details.
For lengthy lines within the submit description file, the backslash (\) is a line continuation character. Placing the backslash at the end of a line causes the current line’s command to be continued with the next line of the file. Submit description files may contain comments. A comment is any line beginning with a pound character (#).
condor_submit mysubmitfile -append "queue input in A, B, C"
then the entire -append command line option and its arguments are converted to
condor_submit mysubmitfile -queue input in A, B, C
The submit description file is not modified. Multiple commands are specified by using the -append
option multiple times. Each new command is given in a separate -append option. Commands with
spaces in them will need to be enclosed in double quote marks.
On a Unix command line, the shell expands file globs before parsing occurs.
Note: more information on submitting HTCondor jobs can be found here: 2.5.
As of version 8.5.6, the condor_submit language supports multi-line values in commands. The syntax is the same as the configuration language (see more details here: 3.3.5).
Each submit description file describes one or more clusters of jobs to be placed in the HTCondor execution pool. All jobs in a cluster must share the same executable, but they may have different input and output files, and different program arguments. The submit description file is generally the last command-line argument to condor_submit. If the submit description file argument is omitted, condor_submit will read the submit description from standard input.
The submit description file must contain at least one executable command and at least one queue command. All of the other commands have default actions.
Note that a submit file that contains more than one executable command will produce multiple clusters when submitted. This is not generally recommended, and is not allowed for submit files that are run as DAG node jobs by condor_dagman.
The commands which can appear in the submit description file are numerous. They are listed here in alphabetical order by category.
In the java universe, the first argument must be the name of the class containing main.
There are two permissible formats for specifying arguments, identified as the old syntax and the new syntax. The old syntax supports white space characters within arguments only in special circumstances; when used, the command line arguments are represented in the job ClassAd attribute Args. The new syntax supports uniform quoting of white space characters within arguments; when used, the command line arguments are represented in the job ClassAd attribute Arguments.
Old Syntax
In the old syntax, individual command line arguments are delimited (separated) by space characters. To allow a double quote mark in an argument, it is escaped with a backslash; that is, the two character sequence \" becomes a single double quote mark within an argument.
Further interpretation of the argument string differs depending on the operating system. On Windows, the entire argument string is passed verbatim (other than the backslash in front of double quote marks) to the Windows application. Most Windows applications will allow spaces within an argument value by surrounding the argument with double quotes marks. In all other cases, there is no further interpretation of the arguments.
Example:
Produces in Unix vanilla universe:
New Syntax
Here are the rules for using the new syntax:
Example:
Produces:
Another example:
Produces:
And yet another example:
Produces:
Notice that in the new syntax, the backslash has no special meaning. This is for the convenience of Windows users.
There are two different formats for specifying the environment variables: the old format and the new format. The old format is retained for backward-compatibility. It suffers from a platform-dependent syntax and the inability to insert some special characters into the environment.
The new syntax for specifying environment values:
Example:
Produces the following environment entries:
Under the old syntax, there are no double quote marks surrounding the environment specification. Each environment entry remains of the form
Under Unix, list multiple environment entries by separating them with a semicolon (;). Under Windows, separate multiple entries with a vertical bar (|). There is no way to insert a literal semicolon under Unix or a literal vertical bar under Windows. Note that spaces are accepted, but rarely desired, characters within parameter names and values, because they are treated as literal characters, not separators or ignored white space. Place spaces within the parameter list only if required.
A Unix example:
This produces the following:
If the environment is set with the environment command and getenv is also set to true, values specified with environment override values in the submitter’s environment (regardless of the order of the environment and getenv commands).
If no path or a relative path is used, then the executable file is presumed to be relative to the current working directory of the user as the condor_submit command is issued.
If submitting into the standard universe, then the named executable must have been re-linked with the HTCondor libraries (such as via the condor_compile command). If submitting into the vanilla universe (the default), then the named executable need not be re-linked and can be any process which can run in the background (shell scripts work fine as well). If submitting into the Java universe, then the argument must be a compiled .class file.
If the environment is set with the environment command and getenv is also set to true, values specified with environment override values in the submitter’s environment (regardless of the order of the environment and getenv commands).
Note that this command does not refer to the command-line arguments of the program. The command-line arguments are specified by the arguments command.
where the configuration variable UID_DOMAIN is specified by the HTCondor site administrator. If UID_DOMAIN has not been specified, HTCondor sends the e-mail to:
Note that if a program explicitly opens and writes to a file, that file should not be specified as the output file.
Note that the priority setting in an HTCondor submit file will be overridden by condor_dagman if the submit file is used for a node in a DAG, and the priority of the node within the DAG is non-zero (see 2.10.9 for more details).
The optional argument <int expr> specifies how many times to repeat the job submission for a given set of arguments. It may be an integer or an expression that evaluates to an integer, and it defaults to 1. All but the first form of this command are various ways of specifying a list of items. When these forms are used <int expr> jobs will be queued for each item in the list. The in, matching and from keyword indicates how the list will be specified.
The optional argument <varname> or <list of varnames> is the name or names of of variables that will be set to the value of the current item when queuing the job. If no <varname> is specified the variable ITEM will be used. Leading and trailing whitespace be trimmed. The optional argument <slice> is a python style slice selecting only some of the items in the list of items. Negative step values are not supported.
A submit file may contain more than one queue statement, and if desired, any commands may be placed between subsequent queue commands, such as new input, output, error, initialdir, or arguments commands. This is handy when submitting multiple runs into one cluster with one submit description file.
The vanilla universe is the default (except where the configuration variable DEFAULT_UNIVERSE defines it otherwise), and is an execution environment for jobs which do not use HTCondor’s mechanisms for taking checkpoints; these are ones that have not been linked with the HTCondor libraries. Use the vanilla universe to submit shell scripts to HTCondor.
The standard universe tells HTCondor that this job has been re-linked via condor_compile with the HTCondor libraries and therefore supports taking checkpoints and remote system calls.
The scheduler universe is for a job that is to run on the machine where the job is submitted. This universe is intended for a job that acts as a metascheduler and will not be preempted.
The local universe is for a job that is to run on the machine where the job is submitted. This universe runs the job immediately and will not preempt the job.
The grid universe forwards the job to an external job management system. Further specification of the grid universe is done with the grid_resource command.
The java universe is for programs written to the Java Virtual Machine.
The vm universe facilitates the execution of a virtual machine.
The parallel universe is for parallel jobs (e.g. MPI) that require multiple machines in order to run.
The docker universe runs a docker container as an HTCondor job.
asks HTCondor to find all available machines with more than 60 megabytes of memory and give to the job the machine with the most amount of memory. The HTCondor User’s Manual contains complete information on the syntax and available attributes that can be used in the ClassAd expression.
is appended to the requirements expression for the job.
For pools that enable dynamic condor_startd provisioning, specifies the minimum number of CPUs requested for this job, resulting in a dynamic slot being created with this many cores.
is appended to the requirements expression for the job.
For pools that enable dynamic condor_startd provisioning, a dynamic slot will be created with at least this much disk space.
Characters may be appended to a numerical value to indicate units. K or KB indicates KiB, 210 numbers of bytes. M or MB indicates MiB, 220 numbers of bytes. G or GB indicates GiB, 230 numbers of bytes. T or TB indicates TiB, 240 numbers of bytes.
For pools that enable dynamic condor_startd provisioning, a dynamic slot will be created with at least this much RAM.
The expression
is appended to the requirements expression for the job.
Characters may be appended to a numerical value to indicate units. K or KB indicates KiB, 210 numbers of bytes. M or MB indicates MiB, 220 numbers of bytes. G or GB indicates GiB, 230 numbers of bytes. T or TB indicates TiB, 240 numbers of bytes.
For scheduler and local universe jobs, the requirements expression is evaluated against the Scheduler ClassAd which represents the the condor_schedd daemon running on the submit machine, rather than a remote machine. Like all commands in the submit description file, if multiple requirements commands are present, all but the last one are ignored. By default, condor_submit appends the following clauses to the requirements expression:
View the requirements of a job which has already been submitted (along with everything else about the job ClassAd) with the command condor_q -l; see the command reference for condor_q on page 2025. Also, see the HTCondor Users Manual for complete information on the syntax and available attributes that can be used in the ClassAd expression.
to ensure the job is matched to a machine that is capable of encrypting the contents of the execute directory. This support is limited to Windows platforms that use the NTFS file system and Linux platforms with the ecryptfs-utils package installed.
For more information about this and other settings related to transferring files, see the HTCondor User’s manual section on the file transfer mechanism.
Note that should_transfer_files is not supported for jobs submitted to the grid universe.
When a path to an input file or directory is specified, this specifies the path to the file on the submit side. The file is placed in the job’s temporary scratch directory on the execute side, and it is named using the base name of the original path. For example, /path/to/input_file becomes input_file in the job’s scratch directory.
A directory may be specified by appending the forward slash character (/) as a trailing path separator. This syntax is used for both Windows and Linux submit hosts. A directory example using a trailing path separator is input_data/. When a directory is specified with the trailing path separator, the contents of the directory are transferred, but the directory itself is not transferred. It is as if each of the items within the directory were listed in the transfer list. When there is no trailing path separator, the directory is transferred, its contents are transferred, and these contents are placed inside the transferred directory.
For grid universe jobs other than HTCondor-C, the transfer of directories is not currently supported.
Symbolic links to files are transferred as the files they point to. Transfer of symbolic links to directories is not currently supported.
For vanilla and vm universe jobs only, a file may be specified by giving a URL, instead of a file name. The implementation for URL transfers requires both configuration and available plug-in.
For HTCondor-C jobs and all other non-grid universe jobs, if transfer_output_files is not specified, HTCondor will automatically transfer back all files in the job’s temporary working directory which have been modified or created by the job. Subdirectories are not scanned for output, so if output from subdirectories is desired, the output list must be explicitly specified. For grid universe jobs other than HTCondor-C, desired output files must also be explicitly listed. Another reason to explicitly list output files is for a job that creates many files, and the user wants only a subset transferred back.
For grid universe jobs other than with grid type condor, to have files other than standard output and standard error transferred from the execute machine back to the submit machine, do use transfer_output_files, listing all files to be transferred. These files are found on the execute machine in the working directory of the job.
When a path to an output file or directory is specified, it specifies the path to the file on the execute side. As a destination on the submit side, the file is placed in the job’s initial working directory, and it is named using the base name of the original path. For example, path/to/output_file becomes output_file in the job’s initial working directory. The name and path of the file that is written on the submit side may be modified by using transfer_output_remaps. Note that this remap function only works with files but not with directories.
A directory may be specified using a trailing path separator. An example of a trailing path separator is the slash character on Unix platforms; a directory example using a trailing path separator is input_data/. When a directory is specified with a trailing path separator, the contents of the directory are transferred, but the directory itself is not transferred. It is as if each of the items within the directory were listed in the transfer list. When there is no trailing path separator, the directory is transferred, its contents are transferred, and these contents are placed inside the transferred directory.
For grid universe jobs other than HTCondor-C, the transfer of directories is not currently supported.
Symbolic links to files are transferred as the files they point to. Transfer of symbolic links to directories is not currently supported.
name describes an output file name produced by your job, and newname describes the file name it should be downloaded to. Multiple remaps can be specified by separating each with a semicolon. If you wish to remap file names that contain equals signs or semicolons, these special characters may be escaped with a backslash. You cannot specify directories to be remapped.
Setting when_to_transfer_output equal to ON_EXIT will cause HTCondor to transfer the job’s output files back to the submitting machine only when the job completes (exits on its own).
The ON_EXIT_OR_EVICT option is intended for fault tolerant jobs which periodically save their own state and can restart where they left off. In this case, files are spooled to the submit machine any time the job leaves a remote site, either because it exited on its own, or was evicted by the HTCondor system for any reason prior to job completion. The files spooled back are placed in a directory defined by the value of the SPOOL configuration variable. Any output files transferred back to the submit machine are automatically sent back out again as input files if the job restarts.
The combination of the max_retries, retry_until, and success_exit_code commands causes an appropriate OnExitRemove expression to be automatically generated. If retry command(s) and on_exit_remove are both defined, the OnExitRemove expression will be generated by OR’ing the expression specified in OnExitRemove and the expression generated by the retry commands.
Note: non-zero values of success_exit_code should generally not be used for DAG node jobs. At the present time, condor_dagman does not take into account the value of success_exit_code. This means that, if success_exit_code is set to a non-zero value, condor_dagman will consider the job failed when it actually succeeds. For single-proc DAG node jobs, this can be overcome by using a POST script that takes into account the value of success_exit_code (although this is not recommended). For multi-proc DAG node jobs, there is currently no way to overcome this limitation.
The process by which the condor_schedd claims a condor_startd is somewhat time-consuming. To amortize this cost, the condor_schedd tries to reuse claims to run subsequent jobs, after a job using a claim is done. However, it can only do this if there is an idle job in the queue at the moment the previous job completes. Sometimes, and especially for the node jobs when using DAGMan, there is a subsequent job about to be submitted, but it has not yet arrived in the queue when the previous job completes. As a result, the condor_schedd releases the claim, and the next job must wait an entire negotiation cycle to start. When this submit command is defined with a non-negative integer, when the job exits, the condor_schedd tries as usual to reuse the claim. If it cannot, instead of releasing the claim, the condor_schedd keeps the claim until either the number of seconds given as a parameter, or a new job which matches that claim arrives, whichever comes first. The condor_startd in question will remain in the Claimed/Idle state, and the original job will be "charged" (in terms of priority) for the time in this state.
As an example, if the job is to be removed once the output is retrieved with condor_transfer_data, then use
This command has been historically used to implement a form of job start throttling from the job submitter’s perspective. It was effective for the case of multiple job submission where the transfer of extremely large input data sets to the execute machine caused machine performance to suffer. This command is no longer useful, as throttling should be accomplished through configuration of the condor_schedd daemon.
For example: Suppose a job is known to run for a minimum of an hour. If the job exits after less than an hour, the job should be placed on hold and an e-mail notification sent, instead of being allowed to leave the queue.
This expression places the job on hold if it exits for any reason before running for an hour. An e-mail will be sent to the user explaining that the job was placed on hold because this expression became True.
periodic_* expressions take precedence over on_exit_* expressions, and *_hold expressions take precedence over a *_remove expressions.
Only job ClassAd attributes will be defined for use by this ClassAd expression. This expression is available for the vanilla, java, parallel, grid, local and scheduler universes. It is additionally available, when submitted from a Unix machine, for the standard universe.
For example, suppose a job occasionally segfaults, but chances are that the job will finish successfully if the job is run again with the same data. The on_exit_remove expression can cause the job to run again with the following command. Assume that the signal identifier for the segmentation fault is 11 on the platform where the job will be running.
This expression lets the job leave the queue if the job was not killed by a signal or if it was killed by a signal other than 11, representing segmentation fault in this example. So, if the exited due to signal 11, it will stay in the job queue. In any other case of the job exiting, the job will leave the queue as it normally would have done.
As another example, if the job should only leave the queue if it exited on its own with status 0, this on_exit_remove expression works well:
If the job was killed by a signal or exited with a non-zero exit status, HTCondor would leave the job in the queue to run again.
periodic_* expressions take precedence over on_exit_* expressions, and *_hold expressions take precedence over a *_remove expressions.
Only job ClassAd attributes will be defined for use by this ClassAd expression.
periodic_* expressions take precedence over on_exit_* expressions, and *_hold expressions take precedence over a *_remove expressions.
Only job ClassAd attributes will be defined for use by this ClassAd expression. Note that, by default, this expression is only checked once every 60 seconds. The period of these evaluations can be adjusted by setting the PERIODIC_EXPR_INTERVAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE configuration macros.
Only job ClassAd attributes will be defined for use by this ClassAd expression. Note that, by default, this expression is only checked once every 60 seconds. The period of these evaluations can be adjusted by setting the PERIODIC_EXPR_INTERVAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE configuration macros.
See the Examples section of this manual page for an example of a periodic_remove expression.
periodic_* expressions take precedence over on_exit_* expressions, and *_hold expressions take precedence over a *_remove expressions. So, the periodic_remove expression takes precedent over the on_exit_remove expression, if the two describe conflicting actions.
Only job ClassAd attributes will be defined for use by this ClassAd expression. Note that, by default, this expression is only checked once every 60 seconds. The period of these evaluations can be adjusted by setting the PERIODIC_EXPR_INTERVAL, MAX_PERIODIC_EXPR_INTERVAL, and PERIODIC_EXPR_TIMESLICE configuration macros.
COMMANDS SPECIFIC TO THE STANDARD UNIVERSE
If this command is not present (defined), then the value defaults to false.
If your job attempts to access a file mentioned in this list, HTCondor will force all writes to that file to be appended to the end. Furthermore, condor_submit will not truncate it. This list uses the same syntax as compress_files, shown above.
This option may yield some surprising results. If several jobs attempt to write to the same file, their output may be intermixed. If a job is evicted from one or more machines during the course of its lifetime, such an output file might contain several copies of the results. This option should be only be used when you wish a certain file to be treated as a running log instead of a precise result.
These options only apply to standard-universe jobs.
If needed, you may set the buffer controls individually for each file using the buffer_files option. For example, to set the buffer size to 1 MiB and the block size to 256 KiB for the file input.data, use this command:
Alternatively, you may use these two options to set the default sizes for all files used by your job:
If you do not set these, HTCondor will use the values given by these two configuration file macros:
Finally, if no other settings are present, HTCondor will use a buffer of 512 KiB and a block size of 32 KiB.
If your job attempts to access any of the files mentioned in this list, HTCondor will automatically compress them (if writing) or decompress them (if reading). The compress format is the same as used by GNU gzip.
The files given in this list may be simple file names or complete paths and may include * as a wild card. For example, this list causes the file /tmp/data.gz, any file named event.gz, and any file ending in .gzip to be automatically compressed or decompressed as needed:
Due to the nature of the compression format, compressed files must only be accessed sequentially. Random access reading is allowed but is very slow, while random access writing is simply not possible. This restriction may be avoided by using both compress_files and fetch_files at the same time. When this is done, a file is kept in the decompressed state at the execution machine, but is compressed for transfer to its original location.
Directs HTCondor to use a new file name in place of an old one. name describes a file name that your job may attempt to open, and newname describes the file name it should be replaced with. newname may include an optional leading access specifier, local: or remote:. If left unspecified, the default access specifier is remote:. Multiple remaps can be specified by separating each with a semicolon.
This option only applies to standard universe jobs.
If you wish to remap file names that contain equals signs or semicolons, these special characters may be escaped with a backslash.
If your job attempts to access a file mentioned in this list, HTCondor will cause it to be read or written at the execution machine. This is most useful for temporary files not used for input or output. This list uses the same syntax as compress_files, shown above.
If globus_rematch evaluates to True, then before the job is submitted again to globus, the condor_gridmanager will request that the condor_schedd daemon renegotiate with the matchmaker (the condor_negotiator). The result is this job will be matched again.
If the expression evaluates to True, then any previous submission to the grid universe will be forgotten and this job will be submitted again as a fresh submission to the grid universe. This may be useful if there is a desire to give up on a previous submission and try again. Note that this may result in the same job running more than once. Do not treat this operation lightly.
For a grid-type-string of batch, the single parameter is the name of the local batch system, and will be one of pbs, lsf, or sge.
For a grid-type-string of condor, the first parameter is the name of the remote condor_schedd daemon. The second parameter is the name of the pool to which the remote condor_schedd daemon belongs.
For a grid-type-string of cream, there are three parameters. The first parameter is the web services address of the CREAM server. The second parameter is the name of the batch system that sits behind the CREAM server. The third parameter identifies a site-specific queue within the batch system.
For a grid-type-string of ec2, one additional parameter specifies the EC2 URL.
For a grid-type-string of gt2, the single parameter is the name of the pre-WS GRAM resource to be used.
For a grid-type-string of gt5, the single parameter is the name of the pre-WS GRAM resource to be used, which is the same as for the grid-type-string of gt2.
For a grid-type-string of lsf, no additional parameters are used.
For a grid-type-string of nordugrid, the single parameter is the name of the NorduGrid resource to be used.
For a grid-type-string of pbs, no additional parameters are used.
For a grid-type-string of sge, no additional parameters are used.
For a grid-type-string of unicore, the first parameter is the name of the Unicore Usite to be used. The second parameter is the name of the Unicore Vsite to be used.
For transferring files other than stdin, see transfer_input_files.
For transferring files other than stdout, see transfer_output_files.
x509userproxy is relevant when the universe is vanilla, or when the universe is grid and the type of grid system is one of gt2, gt5, condor, cream, or nordugrid. Defining a value causes the proxy to be delegated to the execute machine. Further, VOMS attributes defined in the proxy will appear in the job ClassAd.
COMMANDS FOR PARALLEL, JAVA, and SCHEDULER UNIVERSES
If this command is not present, the value of kill_sig is used.
An example that specifies two disk files:
COMMANDS FOR THE DOCKER UNIVERSE
The HTCondor User’s manual section on Time Scheduling for Job Execution has further details.
The HTCondor User’s manual section on Time Scheduling for Job Execution has further details.
The HTCondor User’s manual section on Time Scheduling for Job Execution has further details.
Due to implementation details, a deferral time may not be used for scheduler universe jobs.
The HTCondor User’s manual section on Time Scheduling for Job Execution has further details.
For vanilla universe jobs where there is a shared file system, it is the current working directory on the machine where the job is executed.
For vanilla or grid universe jobs where file transfer mechanisms are utilized (there is not a shared file system), it is the directory on the machine from which the job is submitted where the input files come from, and where the job’s output files go to.
For standard universe jobs, it is the directory on the machine from which the job is submitted where the condor_shadow daemon runs; the current working directory for file input and output accomplished through remote system calls.
For scheduler universe jobs, it is the directory on the machine from which the job is submitted where the job runs; the current working directory for file input and output with respect to relative path names.
Note that the path to the executable is not relative to initialdir; if it is a relative path, it is relative to the directory in which the condor_submit command is run.
The value for each introduced ClassAd is given by the value of the Name attribute from the machine ClassAd of a previous execution (match). As a job is matched, the definitions for these attributes will roll, with LastMatchName1 becoming LastMatchName2, LastMatchName0 becoming LastMatchName1, and LastMatchName0 being set by the most recent value of the Name attribute.
An intended use of these job attributes is in the requirements expression. The requirements can allow a job to prefer a match with either the same or a different resource than a previous match.
Setting this expression does not affect the job’s resource requirements or preferences. For a job to only run on a machine with a minimum MachineMaxVacateTime, or to preferentially run on such machines, explicitly specify this in the requirements and/or rank expressions.
When a resource claim is to be preempted, this expression in the submit file specifies the maximum run time of the job (in seconds, since the job started). This expression has no effect, if it is greater than the maximum retirement time provided by the machine policy. If the resource claim is not preempted, this expression and the machine retirement policy are irrelevant. If the resource claim is preempted the job will be allowed to run until the retirement time expires, at which point it is hard-killed. The job will be soft-killed when it is getting close to the end of retirement in order to give it time to gracefully shut down. The amount of lead-time for soft-killing is determined by the maximum vacating time granted to the job.
Standard universe jobs and any jobs running with nice_user priority have a default max_job_retirement_time of 0, so no retirement time is utilized by default. In all other cases, no default value is provided, so the maximum amount of retirement time is utilized by default.
Setting this expression does not affect the job’s resource requirements or preferences. For a job to only run on a machine with a minimum MaxJobRetirementTime, or to preferentially run on such machines, explicitly specify this in the requirements and/or rank expressions.
In addition to commands, the submit description file can contain macros and comments.
Two pre-defined macros are supplied by the submit description file parser. The $(Cluster) or $(ClusterId) macro supplies the value of the ClusterId job ClassAd attribute, and the $(Process) or $(ProcId) macro supplies the value of the ProcId job ClassAd attribute. These macros are intended to aid in the specification of input/output files, arguments, etc., for clusters with lots of jobs, and/or could be used to supply an HTCondor process with its own cluster and process numbers on the command line.
The $(Node) macro is defined for parallel universe jobs, and is especially relevant for MPI applications. It is a unique value assigned for the duration of the job that essentially identifies the machine (slot) on which a program is executing. Values assigned start at 0 and increase monotonically. The values are assigned as the parallel job is about to start.
Recursive definition of macros is permitted. An example of a construction that works is the following:
As a result, foo = snap bar.
Note that both left- and right- recursion works, so
has as its result foo = bar snap.
The construction
by itself will not work, as it does not have an initial base case. Mutually recursive constructions such as:
will not work, and will fill memory with expansions.
A default value may be specified, for use if the macro has no definition. Consider the example
Where E is not defined within the submit description file, the default value 24 is used, resulting in
This is of limited value, as the scope of macro substitution is the submit description file. Thus, either the macro is or is not defined within the submit description file. If the macro is defined, then the default value is useless. If the macro is not defined, then there is no point in using it in a submit command.
To use the dollar sign character ($) as a literal, without macro expansion, use
In addition to the normal macro, there is also a special kind of macro called a substitution macro that allows the substitution of a machine ClassAd attribute value defined on the resource machine itself (gotten after a match to the machine has been made) into specific commands within the submit description file. The substitution macro is of the form:
As this form of the substitution macro is only evaluated within the context of the machine ClassAd, use of a scope resolution prefix TARGET. or MY. is not allowed.
A common use of this form of the substitution macro is for the heterogeneous submission of an executable:
Values for the OpSys and Arch attributes are substituted at match time for any given resource. This example allows HTCondor to automatically choose the correct executable for the matched machine.
An extension to the syntax of the substitution macro provides an alternative string to use if the machine attribute within the substitution macro is undefined. The syntax appears as:
An example using this extended syntax provides a path name to a required input file. Since the file can be placed in different locations on different machines, the file’s path name is given as an argument to the program.
On the machine, if the attribute input_file_path is not defined, then the path /usr/foo is used instead.
A further extension to the syntax of the substitution macro allows the evaluation of a ClassAd expression to define the value. In this form, the expression may refer to machine attributes by prefacing them with the TARGET. scope resolution prefix. To place a ClassAd expression into the substitution macro, square brackets are added to delimit the expression. The syntax appears as:
An example of a job that uses this syntax may be one that wants to know how much memory it can use. The application cannot detect this itself, as it would potentially use all of the memory on a multi-slot machine. So the job determines the memory per slot, reducing it by 10% to account for miscellaneous overhead, and passes this as a command line argument to the application. In the submit description file will be
To insert two dollar sign characters ($$) as literals into a ClassAd string, use
The environment macro, $ENV, allows the evaluation of an environment variable to be used in setting a submit description file command. The syntax used is
An example submit description file command that uses this functionality evaluates the submitter’s home directory in order to set the path and file name of a log file:
The environment variable is evaluated when the submit description file is processed.
The $RANDOM_CHOICE macro allows a random choice to be made from a given list of parameters at submission time. For an expression, if some randomness needs to be generated, the macro may appear as
When evaluated, one of the parameters values will be chosen.
While processing the queue command in a submit file or from the command line, condor_submit will set the values of several automatic submit variables so that they can be referred to by statements in the submit file. With the exception of Cluster and Process, if these variables are set by the submit file, they will not be modified during queue processing.
The automatic variables below are set before parsing the submit file, and will not vary during processing unless the submit file itself sets them.
condor_submit will exit with a status value of 0 (zero) upon success, and a non-zero value upon failure.
Or you can get the same results as the above submit file by using a list of arguments with the Queue statement
Note that each of the added commands is contained within quote marks because there are space characters within the command.
Including the command
in the submit description file causes this to happen.
in the submit description file before the queue command(s).
HTCondor User Manual
Center for High Throughput Computing, University of Wisconsin–Madison
Copyright © 1990-2018 Center for High Throughput Computing, Computer Sciences Department, University of Wisconsin-Madison, Madison, WI. All Rights Reserved. Licensed under the Apache License, Version 2.0.