10.2 Development Release Series 8.5
This is the development release series of HTCondor.
The details of each version are described below.
- HTCondor version 8.5.8 released on December 13, 2016.
- On Linux, the starter now puts all jobs in a cgroup by default. The default
for CGROUP_MEMORY_LIMIT_POLICY is now "none". To disable cgroups, an admin
can set the BASE_CGROUP parameter to the empty string.
- Added first-class condor_submit commands supporting job retries.
(See section 11 for details.)
- condor_qedit now defaults to editing only jobs owned by the current user in the same way that
condor_q does. It also honors the CONDOR_Q_ONLY_MY_JOBS configuration variable.
- Added new parameter DOCKER_VOLUME_DIR_XXX_MOUNT_IF which is an expression,
evaluated in the context of the machine and job ad, which if it evaluates to a string,
becomes a docker volume mount. This allows admins to conditionally add docker volumes
for certain types of jobs.
- Added initial support for Singularity containers.
- The XferStatsLog file on the submit side now contains TCP statistics from both the
shadow's point of view and the starter's point of view. The starter side line is prefixed
with the words "peer stats from starter".
- Configuration variables of the form
SUBSYS.LOCALNAME.VARIABLE no longer work.
The use of the SUBSYS prefix before LOCALNAME never worked fully, and was only necessary for a while
as a workaround for a bug that was fixed many years ago. condor_config_val and the condor_master will
now produce warning messages when the configuration has variables that appear to be of this form and
begin with a known SUBSYS name like MASTER or COLLECTOR.
- The SLOT_WEIGHT parameter can now be set on the central manager,
instead of all the execute nodes. If the execute nodes set this parameter, it
will override the central manager setting.
- New submit command gce_json_file can be used with
grid-type gce jobs to specify a file that contains JSON object members
that should be added to the instance description submitted to the GCE
- A number of command-line tools now support bash auto-completion.
- The minimum update time for condor_dagman node status files
now defaults to 60 seconds.
- Added the new DAGMAN_REMOVE_NODE_JOBS configuration
macro, which allows users to configure whether condor_dagman itself
removes its node jobs when it is removed (note that the
node jobs are also removed by the condor_schedd).
This configuration macro defaults to True, which represents
a change in behavior compared to previous HTCondor versions.
(See section 2.10.7 for more details.)
- The -AllowLogError argument to condor_submit_dag and
condor_dagman, and the DAGMAN_ALLOW_LOG_ERROR configuration
macro, are no longer supported, and generate warnings if used.
- condor_dagman now ignores the
DAGMAN_LOG_ON_NFS_IS_ERROR configuration setting if
ENABLE_USERLOG_LOCKING is set to False.
- Added the ALL_NODES option to a number of condor_dagman
commands (see 2.10.9 for details).
- Changed the previous term "metaknob" to "configuration template"
and improved the configuration template documentation.
- The condor_schedd now receives a refreshed X.509 proxy credential
in a non-blocking fashion.
- The Job Router now performs its automatic job ad transformations
when the TRANSLATE_JOB hook is used.
These are changes that should happen to all job ads being transformed
by the Job Router.
- The $F() configuration macro has new options to support
conversions of paths to Windows style path separators or to Unix style.
When used in condor_submit files it can do path completion as well.
- The $ENV() configuration macro now supports default values.
- A certificate mapfile can now use literal values rather than regular
expressions for the second field. This is useful when only a single identity
should be matched. The use of a literal is both more secure and faster to
search. The new configuration variable CERTIFICATE_MAPFILE_ASSUME_HASH_KEYS
enables this behavior; it defaults to false. It will most likely default
to true in a future version of HTCondor.
- The ClassAd userMap function now uses only commas as the separator for the
third field of the map file. This makes it possible to have values with spaces in them.
- The condor_collector will now allow more than one condor_negotiator to be registered.
A new configuration variable, COLLECTOR_ALLOW_ONLY_ONE_NEGOTIATOR, which defaults
to false, has been added so that the old behavior can still be configured.
- The Requirements expression for Job transforms in the condor_schedd will now ignore the TARGET
prefix for attributes in the expression. This makes it easier to convert condor_job_router rules
to job transforms because the TARGET prefix is required in the condor_job_router but refers to nothing
in the job transform.
- The -better-analyze option of condor_q has been improved and the output reorganized.
- A new tool, condor_transform_ads, has been added.
(See section 11 for details.)
- A join function has been added to the ClassAd language.
- condor_who has additional options for querying the state and readiness of the various daemons.
It has a command that can be used to wait, with a timeout, for the daemons to start up.
- When submitting a job that has an associated X.509 proxy, or when
authenticating to the condor_schedd using X.509, the X.509 and VOMS
attributes are securely extracted and carried along in the job ClassAd. This
allows them to be used, for example, in matchmaking policy and job routing.
- Made condor_credd configuration easier by automatically configuring
network connections to use encryption.
- When the Google Compute Engine breaks the results of a query into
multiple pages, the gce_gahp now retrieves all of the results,
instead of just the first page.
- Fixed a bug that caused file transfer to fail when a job created
by the Job Router has a different Owner than the original job.
- Fixed a bug that could result in "orphan" node jobs staying
in the queue when an instance of condor_dagman is removed.
- Fixed a regression introduced in v8.5.7 that prevents job preemption
due to priority from occurring, because
user priority and resources in use information cannot be referenced in
- Fixed COLLECTOR_FORWARD_FILTERING so that a startd ad
update is always forwarded when any of the Claim IDs change.
- Fixed a bug that made the Requirements keyword for job transforms
in the condor_schedd only work if it was all uppercase on Red Hat 7 and
some other platforms that use a newer version of the C++ compiler.
- Fixed a bug that allowed a user to bypass the MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER
limit by specifying an accounting group or nice_user in their submit file.
- Fixed a bug in condor_c and the condor_job_router that could cause inaccurate
job totals to be reported by condor_q -batch.
- HTCondor version 8.5.7 released on September 29, 2016.
- Preemption due to job priority is likely to fail if PREEMPTION_REQUIREMENTS
attempts to reference any resource usage or priority attributes. This issue has
been fixed in v8.5.8. If you cannot upgrade to v8.5.8, a work-around for v8.5.7
is to set the configuration macro NEGOTIATOR_CROSS_SLOT_PRIOS to True.
- Added the capability for the schedd to perform job ClassAd
transformations upon job submission (see 3.7.2
- Added the capability for more flexible connections between
splices in DAGs (see 2.10.9 for details).
Also added an INCLUDE command to the DAG language (see
2.10.9 for details).
- Simplified the DAG node priority algorithm: the "effective" priority
of a node is now simply the sum of the explicit node priority and the
overall DAG priority. (See section 2.10.9 for
- Allow the second argument of the ClassAd ternary operator
(expression ? value1 : value2) to be omitted. This new syntax means:
evaluate the expression, and if it evaluates to a defined value or
error, return it. If it evaluates to undefined, return value2.
- The time is now included after the SCHEDD or SUBMITTER name
in the banner of condor_q output.
- condor_status has a new -data option that, when used with -schedd,
shows data transfer information. Similarly, -run shows information about running
jobs when used with -schedd.
- condor_q -batch will now show Total and Completed counts for non-DAG jobs
when querying a scheduler that is at least version 8.5.7.
- condor_status and condor_q now support reading and writing ClassAds
in xml, json, and "new ClassAd" form as well as the traditional long form.
- HTCondor daemons now respect <LOCALNAME>.<SUBSYSTEM>_LOG
if passed a -local-name parameter, and default to using
$(LOG)/<Localname>Log if the former is not set.
- HTCondor now automatically passes the -local-name parameter to a
DC daemon if its entry in the DAEMON_LIST is not in the default
DC_DAEMON_LIST. This should result in simpler and less
- HTCondor now detects if an entry in DAEMON_LIST shares a
binary with an entry in DC_DAEMON_LIST and marks the former as
a DC daemon if so. This should result in simpler and less error-prone
- Increased the resolution of file transfer timing statistics in
the XferStatsLog to hundredths of a second.
- The default host based security meta-knob now works in IPv6
only networks out of the box.
- Old HAD configurations, with or without replication, should now work
by default (without shared port).
- HTCondor no longer gives up if a bad networking configuration is
detected while running a tool. This allows condor_config_val to be
used to debug the problem.
- The condor_negotiator by default no longer cross advertises the
user priority and resources in use from every slot in a machine ad to every
other slot in that machine ad. NEGOTIATOR_CROSS_SLOT_PRIOS = true
re-enables the old behavior. The accounting information for the current
user of the slot remains advertised.
- New submit attribute gce_preemptible allows the
creation of preemptible Google Compute Engine (GCE) instances.
These instances have a lower price, but can be interrupted at any time.
Also added support for service accounts with GCE.
- When submitting jobs to Slurm via the grid universe, the Slurm
partition can now be specified using the batch_queue
- Some old STARTD policy helper configuration variables were moved
into two new configuration templates:
FEATURE : UWCS_DESKTOP_POLICY_VALUES
and FEATURE : TESTINGMODE_POLICY_VALUES.
- condor_submit on Windows will no longer insert the OSVERSIONINFO
fields like WindowsMajorVersion into each job automatically. This
is controlled by a new configuration variable SUBMIT_PUBLISH_WINDOWS_OSVERSIONINFO
which defaults to false.
- Added the option to cache the output of commands used in configuration
files, so that the command doesn't have to be re-run every time the
configuration file is referenced. Also added error and warning keywords
to allow configuration files to report errors and warnings.
- Fixed a bug in how the HAD daemon checks to see if it and its
corresponding replication daemon were configured to be on the same host.
- The EC2 GAHP now handles integer overflows when checking deadlines.
This prevents spurious time-outs on 32-bit systems which have been up for
more than 28 days.
- Lengthened the watchdog timeout in the systemd service file to 20 minutes.
Also, ping systemd at a third of the watchdog interval.
- Fixed a bug that could cause daemons to create a file named
dprintf_failure.SUBSYS if they failed to find the mail
- For grid-type batch jobs, improved handling of command
line arguments and environment variables that contain characters that
have meaning to the shell.
Previously, the presence of these characters would cause job execution
- Fixed a bug that caused condor_config_val to segfault when the -name
argument was used and the machine did not exist
- Fixed a bug that caused condor_q -autocluster to crash unless the -nobatch
option was also used.
- Fixed a bug in the Python bindings where a thread executed python
byte code without holding the global interpreter lock.
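As an illustration of the shortened ternary operator added in 8.5.7 (the second argument omitted, leaving expression ?: value2), the following Python sketch models the behavior described in that item. It is not ClassAd code: the UNDEFINED and ERROR sentinels and the function name are hypothetical stand-ins chosen for this example.

```python
# Hypothetical sentinels standing in for the ClassAd UNDEFINED and ERROR values.
UNDEFINED = object()
ERROR = object()


def default_if_undefined(value, fallback):
    """Model of `expression ?: fallback`.

    Any defined value (and ERROR) is returned unchanged; the fallback is
    used only when the expression evaluated to UNDEFINED.
    """
    return fallback if value is UNDEFINED else value


print(default_if_undefined(4, 1))          # prints 4: a defined value is kept
print(default_if_undefined(UNDEFINED, 1))  # prints 1: the fallback is used
```

Note that a defined but "falsy" value such as 0 is still returned as-is in this model; only UNDEFINED triggers the fallback.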
- HTCondor version 8.5.6 released on August 2, 2016.
- The default output of condor_q is now the -batch output.
To change the default back to its pre-8.5.6 value, set the new
configuration variable CONDOR_Q_DASH_BATCH_IS_DEFAULT to
- A new class, the Submit class, was added to the Python
bindings. It allows for the submission of HTCondor jobs via the
Python bindings using the same keywords and automatic behavior as
See section 6.7.1 for details.
- The ability to send condor_drain commands is now exposed
through the Python bindings.
See section 6.7.1 for details.
- The value of the configuration parameter
DOCKER_DROP_ALL_CAPABILITIES is now no longer just
true or false, but a ClassAd expression evaluated in the context
of the machine (my) and the job (target).
- When running Docker Universe containers on docker version 1.11
and newer, HTCondor now also sets -no-new-privs, to prevent
setuid and setgid programs from running in containers, unless
DOCKER_DROP_ALL_CAPABILITIES evaluates to false.
- The hostname of the container that Docker Universe jobs
run in is now set to a more useful name. Instead of a hash, it
now contains the job's owner, the cluster and proc of the job,
and the hostname of the machine the container runs on.
- New options have been added to condor_history, so that condor_history can be used as
the HISTORY_HELPER for remote condor_history. The options are:
- -since Scanning of the history file stops when an expression becomes true or a job id is read.
- -completedsince Scanning of the history file stops when a job completed earlier than this time is read.
- -scanlimit Used by remote condor_history to limit the number of jobs read from the history file.
- -attributes Used by remote condor_history to limit the attributes transferred back.
- -inherit Used by remote condor_history to define the socket to write results to.
- -stream-results Used by remote condor_history so that results can be printed as they arrive.
- condor_history will default to doing a remote query if there is a SCHEDD_HOST configured. This behavior
can be disabled by passing the new -local argument.
- The high-availability and replication daemons may now use shared port.
- ClassAds can now be represented in JSON format.
condor_q, condor_status, and condor_history have a -json
command line option, which causes their output to be printed in JSON.
- condor_dagman now allows commands to be more flexibly ordered
within a DAG file. (See section 2.10.3 for details.)
- Any accounting_group and
accounting_group_user values specified for a
DAG are now propagated to all jobs of the workflow, including sub-DAGs.
- A new configuration variable MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER
can be used to limit the number of DAGs that any single user can have running at
- Monitoring the status of PBS and SLURM jobs is now much more efficient.
Now, one query to the batch system is done for all jobs, instead of a
separate query for each job.
- Simplified how job leases are handled for grid universe jobs.
Now, all jobs going to the same remote resource share a single lease
- Added several statistics about commands issued to the GAHP server
to the grid ads that the condor_gridmanager sends to the condor_collector:
- The condor_shadow, condor_starter and condor_c-gahp daemons
now log TCP statistics for file transfers. See 3.5.2 for
- Job ads now include NumJobCompletions, which counts the
number of times a job exited of its own accord (successfully or not) and
then successfully completed file transfer (if any was requested).
- Kerberos authentication is now non-blocking, allowing an HTCondor
daemon authenticating clients with Kerberos to handle more simultaneous
- Password authentication is now non-blocking, allowing an HTCondor
daemon authenticating clients with the PASSWORD method to handle more
simultaneous incoming connections.
- The full path to the submit file is now available as an automatic
- A new function userMap() has been added to the ClassAd language
to facilitate the mapping of users to groups in the condor_schedd and
condor_job_router (see 4.1.2 for details).
- Configuration files now support the declaration of multi-line values. This is primarily of use when
configuring the condor_job_router.
- Configuration templates can now take arguments.
- Improved the performance of the condor_negotiator when running
with a large number of users or groups. The accounting data is only
written to disk when it changes, not unconditionally.
- Fixed a bug in Docker universe that required the name
of a transferred executable to begin with "./".
- Fixed a bug that prevented Docker universe jobs from reporting
their network usage correctly.
- condor_who now reports docker universe jobs more completely.
- Fixed bugs preventing HTCondor daemons from recognizing an address in
Sinful format as their own when operating in mixed (IPv4 and IPv6) mode. One
manifestation of this would be errors from the HAD daemon when specifying
hosts by name in the HAD_LIST or REPLICATION_LIST.
- condor_user_prio now more correctly shows information
about submitters flocking to a pool, but who haven't used
- HTCondor no longer leaks a file in the user's home directory each time a
job is submitted to SLURM.
- Fixed a bug that prevented HTCondor from removing jobs from SLURM.
- Fixed a bug when attempting to authenticate using multiple
methods wherein if a method failed, the remaining methods were not
- Fixed a bug that prevented the condor_schedd from reading the
job's X.509 proxy file when writing information to the
- Fixed a bug in condor_q where the SIZE column would not grow as needed to fit the data.
- Fixed a bug where the condor_schedd did not treat a user as a queue superuser when it
should have if the configuration included a map file, which is common for GSI authentication.
- Lengthened the watchdog timeout in the systemd service file to 1 minute.
The previous value of 5 seconds has taken down HTCondor for a single slow DNS
- HTCondor version 8.5.5 released on June 6, 2016.
- The EC2 GAHP now rate-limits its requests, and responds to overload
warnings with an exponential back-off. Additionally, fewer operations are
now performed on a per-job basis (as few as one in some cases). The
resulting scalability improvements have been demonstrated to permit a single
GAHP to manage ten thousand instances. Because the overload condition is
account- and region-specific, the grid manager now launches a GAHP for
each account-region pair. We therefore recommend adding D_PID to
EC2_GAHP_DEBUG, for disambiguation, and this is now the default.
- The grid manager now assigns HoldReasonCodes and
HoldReasonSubCodes to EC2 jobs when they go on hold. Values are
subject to change until the stable release.
- The grid manager now advertises some metrics from the EC2 GAHP.
- Some Linux distributions for supercomputer compute nodes and
other distributions for docker images have no /var/run/utmp. HTCondor no longer
aborts when this file is missing. When it tries to determine keyboard
idle times, it just assumes these kinds of machines have no keyboards.
- Docker Universe jobs now correctly advertise RemoteUserCpu
and RemoteSysCpu in their job ad and in the job log file.
- A batch name specified for a DAG (with the condor_submit_dag
-batch-name option) is now propagated to all jobs of that DAG,
- The batch name for a condor_dagman job (if not set) now
defaults to DagFile+cluster (where DagFile
is the primary DAG file of the condor_dagman job, and cluster
is the HTCondor cluster of the condor_dagman job).
Because the batch name is now propagated throughout a workflow, if
no batch name is specified, the batch name for all jobs in the
workflow will be DagFile+cluster of the top-level
- The files named in the submit file attributes vm_disk,
xen_kernel, and xen_initrd now refer to
locations on the execute machine.
condor_submit no longer modifies these values or checks for their
existence on the submit machine.
If these files need to be transferred by HTCondor, then they should be
listed in transfer_input_files and their presence in
these vm universe attributes shouldn't include any path information.
- In the python bindings, an ExprTree can be cast to an integer
or floating point value.
- HTCondor now supports the following systemd features: Socket Activation,
Watchdog, Status message, and journald logging. In this release, the
Socket Activation is not configured, because the security system is not
prepared to properly handle the socket passed in from outside HTCondor.
- Added config knob DEFAULT_MASTER_SHUTDOWN_SCRIPT to
specify a default program to exec as root upon condor_master exit.
See Section 3.5.7 for details.
- The python bindings now support a per-thread security context,
allowing the modification of various parameters such as the pool password
and the X509UserProxy location.
- Fixed a bug that caused file transfers to fail when using Bosco.
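The exponential back-off mentioned for the EC2 GAHP in 8.5.5 follows a widely used pattern. The Python sketch below shows that general pattern only; it is not the GAHP's implementation, and the function name, base delay, growth factor, and cap are all assumptions chosen for illustration.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Return the wait (in seconds) before each successive retry.

    The delay is multiplied by `factor` after every overload response
    and is never allowed to exceed `cap`.
    """
    delays = []
    delay = base
    for _ in range(attempts):
        delays.append(min(delay, cap))
        delay *= factor
    return delays


print(backoff_delays())  # prints [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

Capping the delay keeps a long run of overload warnings from pushing the wait time up without bound.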
- HTCondor version 8.5.4 released on May 2, 2016.
- The deltacloud type in the grid universe, which allowed
submission to Deltacloud services, has been removed.
- condor_status can now display the utilization of a condor_startd with
a single line of output for each machine rather than a line per slot.
In this release, this output is enabled by passing -compact to condor_status,
but in a future release this will be the default output of condor_status.
- Improved the performance of the condor_collector by
not computing dropped update statistics, statistics which
have never been accessible by users.
- The performance of the condor_history tool has been
- condor_user_prio now queries the condor_collector for
accounting information by default, when appropriate. This should be
much faster than the older way of querying the condor_negotiator. The
old path is still available by passing the -negotiator option to the
- The default value of DAGMAN_ALWAYS_RUN_POST has been
changed from True to False. This means that, by
default, if the PRE script of a DAG node fails, the POST script
of the node will not be run. (This had been the default
behavior until version 7.7.2. The 7.7.2-8.5.3 behavior can be
restored by setting DAGMAN_ALWAYS_RUN_POST to
True, or by passing the new -AlwaysRunPost
argument to condor_submit_dag.)
- The batch_gahp can now submit multi-core jobs to HTCondor.
- The batch_gahp's ability to generate a limited X.509 proxy
for use by the job on the execute machine can now be disabled, which is now
- The condor_schedd will now send submitter ad updates for idle submitters
less frequently than updates for submitters that have jobs in the queue. There
are two new configuration variables to control this behavior.
ABSENT_SUBMITTER_LIFETIME is the number of seconds after the last
job for that submitter leaves the queue that the submitter will continue to
send updates to the condor_collector. It defaults to 1 week.
ABSENT_SUBMITTER_UPDATE_RATE is the maximum rate in seconds at which
the condor_schedd will send updates to the condor_collector for a submitter
that has no jobs in the queue. It defaults to 5 minutes.
- Fixed a bug that caused the condor_schedd to exit when receiving
an updated X.509 proxy for a job.
- In expressions in the Job Router's configuration, attributes no
longer require a 'TARGET.' scope prefix.
- Fixed a bug in condor_q -xml that would put the XML header
after the body unless -stream was passed.
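The interaction of ABSENT_SUBMITTER_LIFETIME and ABSENT_SUBMITTER_UPDATE_RATE, introduced in 8.5.4, can be sketched as a small decision function. This Python model is one reading of the description above, not the condor_schedd's actual logic; the function and argument names are hypothetical, and only the two default values come from the release note.

```python
ABSENT_SUBMITTER_LIFETIME = 7 * 24 * 3600  # default from the note: 1 week
ABSENT_SUBMITTER_UPDATE_RATE = 5 * 60      # default from the note: 5 minutes


def should_send_absent_update(now, last_job_left, last_update,
                              lifetime=ABSENT_SUBMITTER_LIFETIME,
                              rate=ABSENT_SUBMITTER_UPDATE_RATE):
    """Decide whether to send an update for a submitter with no queued jobs.

    All arguments are times in seconds. Updates stop once `lifetime` has
    passed since the submitter's last job left the queue; before that,
    they are spaced at least `rate` seconds apart.
    """
    if now - last_job_left > lifetime:
        return False  # the submitter has been absent for too long
    return now - last_update >= rate


# 400 seconds since the last update is more than the 300 second minimum.
print(should_send_absent_update(now=1000, last_job_left=0, last_update=600))
```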
- HTCondor version 8.5.3 released on March 24, 2016.
- ENABLE_IPV4 and ENABLE_IPV6 both now accept
the special value "AUTO", which is true if an interface with the corresponding
protocol exists on the host, and false otherwise.
- ENABLE_IPV4 and ENABLE_IPV6 both now default
to the special value "AUTO". Additionally, the new configuration macro
PREFER_IPV4 is true by default. This macro causes HTCondor to
prefer IPv4 over IPv6 when choosing an address to advertise, when choosing
the address of daemon looked up in the collector, and when resolving DNS
- New configuration macros added: IPV4_ADDRESS,
- New attributes have been added to the Submitter ClassAd to indicate
the number of Idle and Running jobs for Scheduler universe and for Local
- Jobs can now be submitted to the SLURM batch scheduling system via
the new slurm type in the grid universe.
- In addition to logging to the file KERNEL_TUNING_LOG,
the default LINUX_KERNEL_TUNING_SCRIPT now also logs to
syslog and /etc/sysctl.d/99-htcondor.conf.
- condor_history -autoformat now supports the j option to print
job ids like condor_q does.
- HTCondor is now built and linked with Globus 6.0.
- Pre-size the ClassAd hash table to improve the performance of the
condor_collector when getting ClassAd updates.
- The negotiator now forwards accounting information to the collector,
where it can be easily queried and monitored.
- Fixed a bug in condor_history that could result in truncation of
the job id field.
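A rough picture of the PREFER_IPV4 behavior introduced in 8.5.3 is that, given several candidate addresses, the IPv4 ones are considered first. The Python sketch below only models that idea. The function name and the choice to leave the list unchanged when the preference is off are assumptions, not a description of HTCondor's implementation.

```python
import ipaddress


def order_addresses(candidates, prefer_ipv4=True):
    """Return the candidate addresses with IPv4 entries first.

    The sort is stable, so the original order is kept within each
    protocol. Assumption: with the preference off, nothing is reordered.
    """
    if not prefer_ipv4:
        return list(candidates)
    return sorted(
        candidates,
        key=lambda addr: ipaddress.ip_address(addr).version != 4,
    )


# Documentation-range addresses, used here purely as examples.
print(order_addresses(["2001:db8::1", "192.0.2.10"]))
```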
- HTCondor version 8.5.2 released on February 18, 2016.
- On Windows, configuring HTCondor to restrict the range of outbound
port numbers may cause substantial delays when using the command-line
tools. Since we now know that it's not free to do so, LOWPORT
and HIGHPORT no longer restrict the port numbers of outbound
connections on Windows. If you still require this functionality, use
OUT_LOWPORT and OUT_HIGHPORT.
- Fixed a bug that could cause a daemon to be in the wrong
privilege state when attempting to act as the user.
- The condor_startd history file now contains the peak
memory usage of an exited job, not the most recent usage.
- When the condor_starter evicts a job, perhaps because
it has exceeded a memory limit, it does not transfer the sandbox
of working files back to the submit machine. This is
consistent with other types of holds.
- The condor_startd now advertises the following attributes
on Linux machines: CpuFamily, CpuModelNumber, and CacheSize. These are
pulled from the /proc/cpuinfo file.
- condor_q has a new option -schedd-constraint which can be used
to constrain the queues displayed when using the -global option.
- When an HTCondor-C job is submitted to a remote condor_schedd,
the remote job ad now includes the attribute SubmitterGlobalJobId,
whose value is the same as the attribute GlobalJobId in the
original HTCondor-C job.
- The condor_schedd now sets environment variables for scheduler
universe jobs so that the jobs can more easily find the condor_schedd's
On machines where there are multiple condor_schedds running, this helps
DAGMan and similar applications contact the condor_schedd that started
- When SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION is set
to True, the related authorizations are now automatically enabled.
Previously, submit-side@matchsession and
execute-side@matchsession entries had to be added to the
ALLOW_DAEMON and ALLOW_CLIENT (if set)
authorization parameters in order for this feature to work.
- HTCondor version 8.5.0 released on October 12, 2015.
- The condor_startd history file contains two new attributes: BadputCausedByDraining and BadputCausedByPreemption. These boolean-valued attributes are true if the job was evicted for a reason other than a user request.
- The python bindings have a new Claim API, allowing Computing-On-Demand (COD) to be
invoked via python.
- The python bindings can now submit multiple distinct processes using the submitMany
method, similar to a condor_submit file with multiple queue statements.
- The python bindings now provide improved support for managing multiple concurrent queries.
- As an experimental feature, the python bindings implement the HTCondor negotiation protocol.
- Changed "Condor" to "HTCondor" in condor_dagman
output (mainly in the dagman.out file).
- The new configuration parameter JOB_SPOOL_PERMISSIONS
controls the permissions on a job's spool directory managed by the
condor_schedd on unix.
It defaults to the value user, which results in a permissions
value of 0700.
Other valid values are group (permissions 0750) and
world (permissions 0755).
Previously, all job spool directories had access permissions of 0755.
- The condor_schedd no longer changes the ownership of spooled job
files that it manages.
Now, the files are always owned by the submitting user.
The previous behavior of changing ownership to/from the condor
account can be restored by setting the new configuration parameter
CHOWN_JOB_SPOOL_FILES to True.
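The JOB_SPOOL_PERMISSIONS values listed for 8.5.0 map directly onto Unix permission bits. The Python sketch below simply restates that mapping; the dictionary and function names are hypothetical, and the code is an illustration rather than anything HTCondor itself runs.

```python
# Mapping taken from the release note: setting value -> directory mode.
SPOOL_MODES = {"user": 0o700, "group": 0o750, "world": 0o755}


def spool_mode(setting="user"):
    """Return the permission bits for a JOB_SPOOL_PERMISSIONS value."""
    try:
        return SPOOL_MODES[setting]
    except KeyError:
        raise ValueError(f"unknown JOB_SPOOL_PERMISSIONS value: {setting!r}")


print(oct(spool_mode("group")))  # prints 0o750
```

With the default of user, only the owner can read, write, or enter the spool directory, which is stricter than the 0755 mode used before this release.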