10.3 Development Release Series 8.3
This is the development release series of HTCondor.
The details of each version are described below.
- HTCondor version 8.3.2 released on December 23, 2014.
This version contains all bug fixes from HTCondor version 8.2.6.
- It is now possible to run a dual-protocol (IPv4 and IPv6) submit node,
submitting to single-protocol execute nodes. This is preliminary work.
- The port used by the condor_shared_port daemon is now
9618 by default.
- Improved the handling of vm universe jobs that fail to start.
Failures which do not appear to be the fault of the job now cause
the job to be rescheduled, and
the machine stops advertising the ability to run vm universe jobs.
The new condor_update_machine_ad tool facilitates changing
the machine ClassAd.
- The memory footprint of the condor_shadow has been
reduced when Kerberos or SSL authentication methods are not used,
as these libraries are now loaded on demand at run time.
- The responsiveness of a busy condor_schedd daemon to queries
has been improved.
- Added the ability to specify the block device mapping for EC2 jobs.
- The new python binding register() has been added
to allow python functions
to be registered with the ClassAds library. This allows python
functions to be invoked from within ClassAds.
- The new python bindings externalRefs() and
internalRefs() have been added to allow the ClassAd object
to determine internal and external references from an expression.
- When the condor_startd has a live condor_starter,
claim keepalives are sent
over the existing TCP connection between the condor_starter and condor_schedd,
rather than creating a new connection to the
condor_schedd from the condor_startd.
- Added the DAGMan feature of ALWAYS-UPDATE for updates
of a DAGMan node status file.
Specifying this causes the node status file to be overwritten,
even if no nodes have changed status since the file was last written.
- Configuration variable MAX_JOBS_RUNNING has been
modified such that it only applies to job universes that require a
Scheduler and local universe jobs are no longer affected by this
The number of running scheduler and local universe jobs can be controlled
with configuration variables START_SCHEDULER_UNIVERSE and
- The specific versions of Globus GSI libraries to be loaded at run time
are determined at compile time.
- HTCondor now sets environment variable _CONDOR_JOB_AD for
scheduler universe jobs.
Its value will be the path to a file which contains
the job ClassAd as it was when the job was started.
This feature already exists for vanilla, parallel, java, and local
universe jobs.
- The new -debug option to condor_userprio sends
debug output to stderr.
- HTCondor daemons now support a whitelist of statistics attributes to
publish from their ClassAd to the condor_collector.
This is intended to ease
configuration on systems that use ganglia for monitoring.
- New statistics have been added to the condor_schedd to monitor runtime
spent doing DNS queries, using fsync,
and rebuilding the priority list for negotiation.
Also, additional attributes for the average, maximum, and minimum
have been added to runtime statistics for command handlers
for all HTCondor daemons.
These changes are intended to help direct future scalability work.
- The new daemon logging level, D_SUB_SECOND,
enables millisecond resolution timestamps in daemon logs.
- Fixed a bug introduced in HTCondor version 8.3.1 that caused daemons to be
unreachable if they were configured to use the condor_shared_port daemon,
but the condor_master was not.
- Updated the CREAM client library used in the cream_gahp.
This fixes the delegation of RFC format proxies, in addition to other
- Fixed a bug that could cause a segmentation fault of condor_dagman
for some DAG input file syntax errors,
rather than printing an appropriate error message.
- Fixed a bug that could cause the condor_shared_port daemon
to fail on Mac OS X platforms,
if configuration variable LOCK was not explicitly set
in a configuration file.
- Fixed a bug that caused both condor_dagman and the condor_schedd
daemon to generate commands to remove condor_dagman's node jobs when
the condor_dagman job is the target of condor_rm.
Now, only the condor_schedd generates the command,
avoiding the extra load of running two identical commands.
- Fixed a bug that caused the DAGMan node status file,
as detailed in section 2.10.11,
to not reflect the final status of a DAG when the DAG is removed
by issuing a condor_rm command,
or when the DAG is
aborted due to an ABORT-DAG-ON specification in the DAG input file.
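As a sketch of how a scheduler universe job might use the _CONDOR_JOB_AD
variable described in the 8.3.2 notes above, the following Python reads the
file named by the variable and collects its attributes. It assumes the file
holds one "Name = Value" pair per line. The helper name load_job_ad is
hypothetical, and values are kept as raw strings rather than evaluated as
ClassAd expressions.

```python
import os


def load_job_ad(env=os.environ):
    """Return the job ClassAd as a dict of raw strings, or {} if unavailable.

    Assumption: the file holds one "Name = Value" pair per line.
    """
    path = env.get("_CONDOR_JOB_AD")
    if not path or not os.path.isfile(path):
        return {}
    attrs = {}
    with open(path) as ad_file:
        for line in ad_file:
            # Split at the first "=" only, so values such as
            # (OpSys == "LINUX") stay intact.
            name, sep, value = line.partition("=")
            if sep:  # skip blank or malformed lines
                attrs[name.strip()] = value.strip()
    return attrs


if __name__ == "__main__":
    ad = load_job_ad()
    print(ad.get("ClusterId"), ad.get("Owner"))
```

Because the ad is a snapshot taken when the job started, a job that needs
current attribute values would have to obtain them some other way.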
- HTCondor version 8.3.1 released on September 11, 2014.
- If cgroups are enabled on Linux platforms,
the amount of swap space used by a job is now limited to the
size specified by the machine ClassAd attribute VirtualMemory
for the slot that the job is running on.
- The new configuration variable COLLECTOR_PORT specifies
the default port used by the condor_collector daemon and command line tools.
The default value is 9618.
This default is the same port as has been used in previous HTCondor versions.
- The condor_shared_port daemon will now work
even if the default location given by configuration variable
DAEMON_SOCKET_DIR, which is $(LOCK)/daemon_sock,
is longer than 90 characters.
On Linux platforms, abstract sockets are now the primary method for
condor_shared_port to forward an incoming connection to the intended
- Improvements to CCB increase performance.
- The use of a single log file to write events and enforce the
dependencies of a DAG represented by a condor_dagman instance is mandatory.
To implement this,
the -dont_use_default_node_log command-line
option to condor_submit_dag is disabled,
and an attempt to set configuration variable
DAGMAN_ALWAYS_USE_NODE_LOG to False will generate an
- The new condor_dagman configuration variable
DAGMAN_SUPPRESS_JOB_LOGS allows users to prevent DAG node
jobs from writing to the log file specified in their submit description file.
See section 3.3.25 for details.
- New special variables @(OWNER) and @(NODE_NAME) are
available when defining configuration variable
These values make it easier to avoid log file name collisions.
- condor_submit will no longer insert an OpSys requirement
for a job
when one of OpSysAndVer, OpSysLongName, OpSysName,
or OpSysShortName is already specified by the user in
the Requirements expression of the submit description file.
- The configuration file $(HOME)/.condor/condor_config
is no longer considered for the single, initial, global configuration file.
Instead, a user-specific configuration file has been added as the
last file parsed.
The new configuration variable USER_CONFIG_FILE may change the
default file name or disable this feature.
Section 3.3.1 describes the ordering
in which configuration files are parsed.
- Daemons now authenticate many client network connections in
parallel, rather than one at a time.
This improves the scalability of daemons that receive many client
connections, like the condor_schedd and condor_collector.
The improvement is most noticeable when using the FS and GSI
- The GSI security libraries are now loaded into memory only when GSI
authentication is required.
This reduces memory usage when GSI authentication is not used.
The memory reduction will be most noticeable when there are many
condor_shadow processes running.
- Implemented fine-grained locking in the HTCondor python module to
allow other python threads to run during HTCondor calls.
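The condor_submit change to OpSys handling in the 8.3.1 notes above amounts
to checking whether the user's Requirements expression already refers to an
operating system attribute. A rough Python sketch of that decision follows.
It is not HTCondor's implementation: the function name needs_default_opsys
and the regular-expression matching are assumptions made for illustration,
as is the inclusion of OpSys itself in the attribute list.

```python
import re

# Attributes named in the release note, plus OpSys itself (an assumption).
OS_ATTRS = ("OpSys", "OpSysAndVer", "OpSysLongName", "OpSysName", "OpSysShortName")

# ClassAd attribute names are case-insensitive; match whole identifiers only.
_OS_REF = re.compile(r"\b(?:%s)\b" % "|".join(OS_ATTRS), re.IGNORECASE)


def needs_default_opsys(requirements):
    """Return True if the expression references no OS attribute.

    Limitation of this sketch: an attribute name that appears inside a
    quoted string literal is also counted as a reference.
    """
    return _OS_REF.search(requirements) is None


if __name__ == "__main__":
    for expr in ('Memory > 1024', 'OpSysAndVer == "CentOS6"'):
        print(expr, "->", needs_default_opsys(expr))
```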
- HTCondor version 8.3.0 released on August 12, 2014.
This release contains all improvements and bug fixes from
HTCondor version 8.2.2.
- When a daemon creates a child daemon process, it also creates a
security session shared with the child daemon.
This makes the initial communication between the daemons more efficient.
- Negotiation cycle performance has been improved, especially
over a wide-area network, by reducing network traffic and latency
between a submit machine and a central manager.
The new configuration variable
does performance tuning, as defined
in section 3.3.17.
- The synchronization of the job event log was improved by only
using fsync() where necessary and
fdatasync() where sufficient.
This should provide a small reduction in disk I/O to
the condor_schedd daemon.
- CPU usage by the condor_collector has been reduced when
handling normal queries from condor_status,
and CPU usage by the condor_schedd has been reduced when
handling normal queries from condor_q.
- HTCondor can now internally cache the result of Globus authorization
The caching behavior is enabled by setting configuration variable
GSS_ASSIST_GRIDMAP_CACHE_EXPIRATION to a non-zero value.
This feature will be useful for sites that use the Globus authorization
callouts based only on DN and VOMS FQAN, and for sites that have
- The job ClassAd attribute DAG_Status is included in
the dagman.out file.
- The new -DoRecovery command line option for condor_dagman
and condor_submit_dag causes condor_dagman to run in
- The new -ads option to condor_status permits a set of ClassAds
to be read from a file, processing the ClassAds as if they came from
- Daemon ClassAd hooks implementing Startd Cron functionality
can now return multiple ClassAds,
and the hooks can specify which ClassAds their output should merge into.
- Two new condor_schedd ClassAd statistics attributes are
available: JobsRunning and JobsAccumExceptionalBadputTime.
- Fixed a bug that caused condor_dagman to unnecessarily attempt
to read node job submit description files,
which could cause spurious warnings when in recovery mode.
Strictly speaking, the bug is fixed only for the
default case in which DAGMAN_ALWAYS_USE_NODE_LOG is set
- Fixed a bug in the condor_schedd daemon that caused the values
of the ClassAd attributes JobsRunningSizes and
JobsRunningRuntimes to be much larger than they should have been.
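The job event log change in the 8.3.0 notes above rests on the difference
between fsync(), which flushes file data and all metadata, and fdatasync(),
which may skip metadata that is not needed to read the data back. The
following is a minimal Python sketch of the idea, not HTCondor's code. The
function append_event and its full_sync parameter are hypothetical, and
os.fdatasync is not available on every platform, so the sketch falls back
to os.fsync.

```python
import os


def append_event(path, text, full_sync=False):
    """Append one event to a log file and flush it to disk.

    full_sync=True uses fsync(); otherwise fdatasync() is used where
    the platform provides it.
    """
    with open(path, "ab") as log:
        log.write(text.encode("utf-8"))
        log.flush()  # push Python's buffer to the operating system first
        if full_sync or not hasattr(os, "fdatasync"):
            os.fsync(log.fileno())      # data plus all file metadata
        else:
            os.fdatasync(log.fileno())  # data, plus only the metadata needed to read it


if __name__ == "__main__":
    append_event("events.log", "000 Job submitted\n")
    append_event("events.log", "005 Job terminated\n", full_sync=True)
```

Which events would warrant the full fsync() is a policy decision that this
sketch leaves to the caller.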