10.4 Development Release Series 8.3
This is the development release series of HTCondor.
The details of each version are described below.
- HTCondor version 8.3.8 released on August 27, 2015.
- On Linux platforms, the condor_master daemon now runs a script when it
starts up.
This script tunes several Linux kernel parameters to the values
we suggest for better scalability.
New configuration variables ENABLE_KERNEL_TUNING,
KERNEL_TUNING_LOG, and LINUX_KERNEL_TUNING_SCRIPT
enable the use of the script and specify file locations.
- HTCondor Python bindings are now supported on all Windows platforms when 32-bit Python 2.7 is installed.
- The new configuration variable DOCKER_IMAGE_CACHE_SIZE
controls the number of Docker images kept on the local machine.
- The use of tools such as valgrind and coverity,
as well as other static analysis tools,
permitted a clean up of code.
Several minor memory leaks were fixed, unused code was removed,
uninitialized variables are now correctly initialized,
previously ignored error codes are now checked upon return,
and several compiler warnings were fixed.
- The caching of ClassAds to save memory will now be enabled by default on
It was enabled on other platforms in earlier versions of HTCondor.
This can be expected to reduce the memory usage of
the condor_collector and condor_schedd daemons.
- The condor_shared_port daemon may be directly addressed by setting
the shared port ID to "self".
For example, a daemon using condor_shared_port and listening
on IP address 22.214.171.124, port 9618,
can now be addressed as "<126.96.36.199:9618?noUDP=true&sock=self>".
This may be useful for certain tools such as condor_config_val.
- The condor_schedd now advertises ClassAd attributes
NumJobStartsDelayed and NumPendingClaims.
- The Python bindings now work with Python's pickle module.
- The performance of the condor_schedd handling of the RESCHEDULE
command has improved by removing unnecessary work.
This increases the peak rate at which jobs can be submitted,
as well as the rate at which jobs can be removed from the queue
when the submission rate is high.
- The new configuration variable
NEGOTIATOR_MAX_TIME_PER_SCHEDD limits how much time the
condor_negotiator will spend talking to each condor_schedd during a
negotiation cycle.
The value is in seconds and defaults to the number of seconds in one year.
- Within a submit description file,
setting transfer_output_files to the empty string ("")
will indicate that no output files should be transferred,
rather than producing a syntax error.
- The condor_kbdd will now ignore small mouse movements that occur
after a long period of inactivity on both keyboard and mouse.
The value of small is defined by the new
configuration variable KBDD_BUMP_CHECK_SIZE,
and the period of inactivity that triggers this behavior is defined
- The condor_userprio tool has been enhanced
with two new forms of -long output.
The -legacy option specifies the traditional output form.
The -modular option is a new form that has a separate
ClassAd for each user and group.
This new form allows condor_userprio to support
additional new arguments: -constraint and -autoformat.
- The -better-analyze option to condor_q
has been enhanced to understand that subexpressions referring
to the ClassAd attribute CurrentTime may evaluate differently
at different times.
These subexpressions are no longer
automatically treated as irrelevant to matchmaking.
- The new configuration variable JOB_IS_FINISHED_COUNT
works with JOB_IS_FINISHED_INTERVAL to control how many jobs
can leave the queue at a time. The default value is 1.
- Performance of sending updates to the Collector over TCP has been
improved. Previously, sending multiple ads concurrently to the
Collector could result in creating and authenticating multiple
TCP connections; now concurrent collector updates are serialized
over one TCP connection.
- Fixed a bug that prevented the condor_negotiator from preempting
jobs based on user priority when configuration variable
ALLOW_PSLOT_PREEMPTION was set to True.
- Fixed a bug that caused HTCondor daemons to crash
on RHEL 7 when using more than 1024 file descriptors.
- The default value of configuration variable
SLOTS_CONNECTED_TO_KEYBOARD has changed from 1 to
the value of NUM_CPUS.
With this change,
detected keyboard activity causes all slots to have their KeyboardIdle
- Fixed a bug in condor_who that sometimes prevented it from
producing any output unless the -allpids option was set.
This bug was introduced in HTCondor version 8.3.6.
- Fixed a bug that caused slow shut down on Windows platforms
when using condor_shared_port.
- Fixed a bug that could cause HTCondor daemons to crash given
malformed strings representing daemon addresses.
- The fix for the condor_collector pausing 30 seconds on start up
now also works when the configuration has
USE_SHARED_PORT = True.
- Fixed a bug that caused the condor_schedd daemon to crash
when using the parallel universe,
partitionable slots, and dollar dollar expansion.
- Fixed a bug in which HTCondor did not correctly set the
KeyboardIdle attribute on keyboards connected to pseudo terminals.
- Fixed an issue in which docker universe jobs had DOS-style line
endings in stdout and stderr.
Both a carriage return and newline were at the end of every line.
- When the condor_startd starts up and shuts down,
it removes any left over files and directories in the execute directory.
Previous versions of HTCondor would remove all sub-directories,
including the special lost+found directory,
which the file system check program may need when recovering files.
Now it never removes that directory.
- Removing a running docker universe container would sometimes
leave the container running and the job in the X state for 10 minutes.
Containers are now removed correctly and quickly.
- Fixed a bug in which the tuning configuration variable
MAX_ACCEPTS_PER_CYCLE was ignored when
the condor_shared_port is used,
potentially causing significant delays when issuing new commands to
heavily loaded condor_schedd daemons.
- The 12 second delay in the start up of condor_dagman
has been reduced to 3 seconds.
- When the condor_schedd is reconnecting to running jobs after
it no longer starts a condor_shadow process if the job lease has expired.
This improves the performance of a busy condor_schedd.
- The condor_negotiator will now use its older mechanism for directly
preempting dynamic slots when ALLOW_PSLOT_PREEMPTION is enabled,
but the condor_startd does not support pslot preemption.
- Fixed problems with the handling of
where <N> is a slot type number.
Also when a configuration variable name that begins with
SLOT_TYPE_<N> is explicitly set to nothing,
this now overrides the value for that slot type rather than being ignored.
- Fixed a bug that caused the ClassAd attributes DetectedCpus and
DetectedMemory to be inserted into all ClassAds.
These attributes are now inserted only into daemon ClassAds.
- Fixed a bug that caused condor_q to interpret a user name that
started with a number as a job ID.
- Fixed bugs in the $REAL() and $CHOICE() macro functions
that prevented them from correctly processing their parameters.
- All Linux platform HTCondor programs have been compiled with
-fPIC (Position Independent Code)
to prevent crashes when using CLASSAD_USER_LIBS
or CLASSAD_USER_PYTHON_MODULES to load ClassAd plug-in functions.
- Fixed a memory leak caused by the ClassAd function split().
- condor_vacate will now fail with an error message when the
-constraint argument is used more than once,
or when it is used with other arguments that cannot be handled
while processing the constraint.
Previous HTCondor versions would quietly ignore all but the last constraint.
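The "self" shared port ID described in the 8.3.8 notes above can be illustrated with a small sketch that assembles an address string of the same shape as the example given there. The helper name shared_port_address and the host 192.0.2.10 (a documentation-only address) are assumptions made for illustration; they are not part of HTCondor.

```python
from urllib.parse import urlencode

def shared_port_address(host, port, sock="self", no_udp=True):
    """Build an address string of the form <host:port?noUDP=true&sock=ID>.

    Hypothetical helper, not an HTCondor API. It only mirrors the shape of
    the address shown in the release note.
    """
    params = []
    if no_udp:
        params.append(("noUDP", "true"))
    # "self" addresses the condor_shared_port daemon itself.
    params.append(("sock", sock))
    return "<{}:{}?{}>".format(host, port, urlencode(params))

# 192.0.2.10 is a placeholder address reserved for documentation.
print(shared_port_address("192.0.2.10", 9618))
```

The resulting string is the kind of address that a tool such as condor_config_val could be pointed at, as the note above suggests.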
- HTCondor version 8.3.7 released on July 27, 2015.
- The default values of the configuration variables and ClassAd attributes
listed in Table 10.2 have changed,
such that the default now represents the commonly configured value.
Table 10.2: Changes to defaults in HTCondor 8.3.7
||largest positive integer
||largest positive integer
|MAX_JOBS_RUNNING (on Windows)
- The new configuration variable MAX_JOBS_PER_SUBMISSION
limits how many jobs may be submitted simultaneously in a single
use of condor_submit. This variable may be useful in catching user errors,
and in protecting a busy condor_schedd daemon
from the excessively lengthy interruption
required to accept a very large number of jobs at one time.
- The new configuration variable MAX_JOBS_PER_OWNER
limits the total number of jobs that may be in the queue for each
owner (submitting user).
- The default value for configuration variable
UPDATE_COLLECTOR_WITH_TCP has changed to True.
The new configuration variable UPDATE_VIEW_COLLECTOR_WITH_TCP
controls whether UDP or TCP is used to forward updates to the
condor_collector daemons specified with CONDOR_VIEW_HOST.
Its default value is False,
such that UDP is the default protocol.
- A new matchmaking mode permits one or more dynamic slots to
be preempted in order to make enough resources available to their parent
partitionable slot to allow a job match.
The new mode is enabled by setting the new condor_negotiator
configuration variable ALLOW_PSLOT_PREEMPTION to True.
This variable defaults to False.
The new configuration variable ADVERTISE_PSLOT_ROLLUP_INFORMATION
controls whether a condor_startd daemon advertises additional attributes
about partitionable slot preemption to the condor_collector.
- The matchmaking optimizations enabled by CONSUMPTION_POLICY
and NEGOTIATOR_MATCHLIST_CACHING now work together properly.
Previously, machines that were configured with a consumption policy
could not benefit from the match list caching option of the
This may have led to an increased amount of time required to match
multiple jobs to a single partitionable slot.
- When using configuration variable ENCRYPT_EXECUTE_DIRECTORY
on Linux platforms,
file names are no longer encrypted by default.
Encryption of file names can be enabled by setting
the new configuration variable ENCRYPT_EXECUTE_DIRECTORY_FILENAMES to True.
- Added the ability to audit connections made through the
condor_shared_port daemon's socket directory,
logging information in the condor_shared_port audit log.
This log allows an administrator to check for abuse.
New configuration variables SHARED_PORT_AUDIT_LOG,
and MAX_NUM_SHARED_PORT_AUDIT_LOG define the
condor_shared_port audit log.
- condor_dagman now allows a maximum of two submit attempts of a FINAL
node job, if the DAG has been removed with condor_rm.
- condor_dagman VARS values can now contain single quotes,
as described in section 2.10.8.
- The new submit command concurrency_limits_expr
allows a job to specify concurrency limits in a ClassAd expression
that may reference attributes of the machine ClassAds that the job
may be matched to.
- The maximum buffer size for condor_chirp commands has increased
from 1024 bytes to 5120 bytes.
- The new -cuda option for the condor_gpu_discovery tool
only runs detection software for CUDA GPUs,
ignoring OpenCL GPUs.
- condor_q will now abbreviate the address field in the header
for each condor_schedd.
Also, the incorrect header "Submitter:" has been corrected to "Schedd:".
- The -io option to condor_q now shows transfer information for
both vanilla and standard universe jobs.
Nothing will be printed for vanilla universe jobs that are not currently
in a transfer state.
- Updated the Python classad parsing interface with methods that
are able to parse ClassAds given in either the Old or New ClassAd format.
The parsing methods can automatically detect which format is used.
- Fixed a bug that could cause the condor_starter to wait forever
if its connection to the condor_shadow is lost after the job exits.
This bug is very likely to occur if CCB is being used and the CCB server
- The condor_kbdd now works with shared port enabled.
- Fixed a Windows platform bug that caused events to be lost when
EVENT_LOG_LOCKING or ENABLE_USERLOG_LOCKING
were set to False.
- The condor_collector no longer pauses for 30 seconds on start up
under some conditions.
This fix does not work when the configuration has
USE_SHARED_PORT = True.
- A bug fix in HTCondor version 8.3.5 allows standard universe jobs
to work on a mixed mode pool with both IPv4 and IPv6 machines
and with daemons that are of earlier 8.3.x releases.
- Fixed a parallel universe bug,
in which too many slots were claimed for a job
on a pool with partitionable slots.
- Fixed a bug that caused condor_status -submitters
-wide to truncate the submitter names within the second part of
- Fixed a bug in condor_dagman in which a combination of node
retries plus a FINAL node could cause a DAG that was being aborted or
removed with condor_rm to resubmit nodes other than the FINAL node.
- Fixed a bug in condor_submit in which the use of submit command
ec2_iam_profile_arn would lead to failing EC2 requests.
- Fixed a bug in condor_dagman in which having both node retries
and ABORT-DAG-ON specified for the same node could cause the node
status to be reported incorrectly.
- On Windows platforms, fixed an incorrect Win32 default value for
configuration variable DAGMAN_CONDOR_RM_EXE.
The incorrect default caused condor_dagman
to not correctly remove running node jobs,
if the ABORT-DAG-ON feature was triggered.
- Files omitted from the new style RPM packages
for Red Hat Enterprise Linux 6 and 7 are now present.
These files include the condor_credd binary, the condor_set_shutdown binary,
the condor_set_shutdown manual page,
and the condor_vm-gahp-vmware binary.
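The 8.3.7 notes above mention that the Python ClassAd parsing methods can detect whether an ad is written in the Old or New ClassAd format. The following is only a rough sketch of how such detection might work, assuming the simplest distinguishing feature: a New ClassAd is enclosed in square brackets, while an Old ClassAd is a series of Attribute = Value lines. The function guess_classad_format is hypothetical, and the detection rules of the real bindings may differ.

```python
def guess_classad_format(text):
    """Return "new" for a bracketed ad, otherwise "old".

    Hypothetical heuristic for illustration only; it ignores comments
    and other input forms that a real parser would have to handle.
    """
    stripped = text.lstrip()
    return "new" if stripped.startswith("[") else "old"

# The same ad written in both formats (example values are invented).
old_ad = 'MyType = "Job"\nClusterId = 12\n'
new_ad = '[ MyType = "Job"; ClusterId = 12 ]'

print(guess_classad_format(old_ad))
print(guess_classad_format(new_ad))
```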
- HTCondor version 8.3.6 released on June 23, 2015.
- The new docker universe allows docker containers to be run
as HTCondor jobs on execute hosts that have docker installed and are
configured for HTCondor.
- Support has improved for IPv4/IPv6 mixed mode operation.
HTCondor daemons now advertise both an IPv4 and an IPv6 address,
if both modes are enabled.
An HTCondor tool or daemon will pick an address based on which
protocol(s) it has enabled.
This allows IPv4-only clients to function within mixed mode pools.
Older clients and daemons will use IPv4 exclusively.
- This version of HTCondor includes a full port for
Debian 8.0 (jessie) on the x86_64 architecture.
A full port includes support for the standard universe.
- HTCondor daemons now advertise an additional attribute,
AddressV1, for forward-compatibility.
- Two new command line options are implemented for condor_submit.
The -maxjobs option causes an error message to be printed and
no jobs to be submitted,
if the submission would have caused more
than a specified number of jobs to be submitted.
The -single-cluster option causes an error message to be printed and
no jobs to be submitted,
if the submission would have assigned more than a single cluster value.
- Tools now detect and report authorization failures;
previously, authorization failures were reported as network failures
or not reported at all.
With this enhancement,
an authorization denial results in an error message
that identifies who the user was mapped to and the authentication method,
and the tool's exit code indicates the failure.
- Changed the default value of configuration variables
ENABLE_USERLOG_LOCKING and EVENT_LOG_LOCKING
to False on Unix platforms and True on Windows platforms.
The Event Log is still locked by default to perform file rotation.
- Improved how DAGMan deals with node job submit description files that
use the new features of the queue command
to do file globbing.
- condor_dagman no longer creates an unused command socket.
- Various syntax errors within a DAG input file related to the
specification of a splice are now reported as fatal errors,
rather than being silently ignored.
Examples of syntax errors caught include a specification
of NOOP or DONE for a splice.
- Changed the default value of configuration variables
DAGMAN_MAX_PRE_SCRIPTS and DAGMAN_MAX_POST_SCRIPTS
from 0 to 20.
A value of 0 does not limit the number of scripts.
- Changed the timeout interval at which the condor_master daemon
must retry to configure the Windows firewall from 10 seconds to 5 seconds.
Also changed the default value of configuration variable
WINDOWS_FIREWALL_FAILURE_RETRY from 60 to 2,
such that there are many fewer retries by default.
This prevents the condor_master daemon from hanging for 10 minutes
on start up,
when it is unable to configure the Windows firewall.
- MOUNT_UNDER_SCRATCH and ASSIGN_CPU_AFFINITY
are now evaluated against the Job ClassAd,
rather than being treated as constants.
- condor_history now properly handles multiple job ids and constraints on the command line.
- When condor_gpu_discovery is passed the -config option,
it now returns the number of GPUs detected and other attributes in configuration syntax.
This makes it more convenient to use the output with a configuration script.
- When an updated X.509 proxy for a job is provided to the
condor_schedd over the network,
job ClassAd attribute x509UserProxyExpiration
is now updated in the job's ClassAd to reflect the new expiration time.
- The automatic variable $(Cluster) now has an alias of $(ClusterId),
and the automatic variable $(Process) now has an alias of $(ProcId),
such that when used within a submit description file,
these variables may have the same variable name as the corresponding
job ClassAd attribute name.
- The new -totals option to condor_q displays only the totals.
The modified -dag option
shows all of the jobs in the DAG when a DAG-ID argument is provided.
- Many of the Scheduler and Transfer statistics controlled by STATISTICS_TO_PUBLISH
are now tracked by user and published in the Submitter ClassAd by default.
- When the condor_collector queries for its own ClassAd,
it now returns the most current values for Collector statistics in that ClassAd.
New attributes have been added to the Collector's ClassAd
to represent the overall number of lost updates and the loss ratio,
as well as the largest number of unique Machine
and Submitter ClassAds the Collector has seen.
- Augmented the ClassAd function debug()
to print an appropriate error message for undefined ClassAd functions.
- Improved the compatibility of HTCondor version 8.3.6 daemons to work
with HTCondor version 8.2 daemons,
when running in an IPv4/IPv6 mixed mode pool.
- Fixed a problem in the RPM packages that prevented the
Java universe from operating.
The global configuration file is now updated in the RPM packages
to properly refer to the support files for the Java universe.
- Fixed a bug that could cause condor_ssh_to_job to hang
- The condor-externals RPM has been split into
condor-externals and condor-external-libs,
so that both the 32-bit and 64-bit external
libraries could be installed when running the 32-bit static condor_shadow
on a 64-bit system.
- HTCondor version 8.3.5 fixed a bug in which
a condor_schedd daemon would be unable to start
any job which used a condor_shadow, if both IPv6 and IPv4 were enabled.
- Fixed a bug that could cause GSI authentication or X.509 proxy
delegation to fail with an error message of
"couldn't set globus thread model".
- Fixed a bug in which requests for encrypted file transfers were not
honored when cryptographic keys were unavailable.
The default value of configuration variable
has changed to True
to support the minimum security configuration required for these transfers.
- Fixed a bug that resulted in HTCondor possibly not detecting all of
the CPU cores on Windows, when the number of cores was large.
The bug was first seen on a server with 24 real cores (48 Hyper-threaded cores).
- Fixed a bug in which Linux kernel version 4 would incorrectly report
a load average of -1.
- Fixed a bug that would prevent attributes set by
from being sent back to the condor_shadow or condor_schedd.
Also fixed a bug that would cause all uses of this condor_chirp command
to fail once 50 unique ClassAd attribute names had been set.
- Fixed a bug in which setting configuration variable
SHARED_PORT_PORT to 0
would prevent HTCondor from successfully starting up.
It now correctly
selects a random port within the permitted range on which to listen.
- Fixed a bug introduced in HTCondor version 8.3.2,
in which setting configuration variable
PRIVATE_NETWORK_INTERFACE would cause daemons to advertise a private
address which consisted of only the port number.
- Fixed an intermittent bug in condor_submit -i in which it would
return the error message "Failed to find starter for job".
- Fixed a bug in which condor_q incorrectly returned an exit code of 1,
instead of an exit code of 0.
This occurred when given the -global option
and there were no jobs in any queues.
- Fixed a bug that prevented the condor_collector daemon from forking
to handle queries about condor_startd ClassAds.
- Eliminated spurious warnings logged when the condor_shared_port
daemon was enabled.
- Fixed a bug that sometimes caused condor_dagman to not print
the list of failed nodes at the end of the dagman.out file upon
- Report an error if the condor_qsub program fails to print a job ID
when submitting a job to PBS.
- Fixed a bug that could cause a copy of a job's X.509 proxy
or file .condor_pid_ns_status to be transferred with
the job's output files.
- Fixed the ClassAd operators =?= and
=!= to match
their documented behavior.
The values 3 and 3.0 are not identical;
they are of different types.
The values abstime("2015-03-03 12:13:15+0000") and
abstime("2015-03-03 13:13:15+0100") are not identical;
they are of different time zones.
- Added an obsolete clause to the HTCondor RPM,
to prevent the ClassAd library packaged in EPEL from being selected
over the ClassAd library that we provide.
- The release tag in the RPM packaging is now set to 1 upon release.
Previously, it was set to the build ID, which violated the RPM Packaging
- When a job that invoked a script was submitted to HTCondor and
the script interpreter did not exist, HTCondor would claim that the
script file did not exist.
Now the error message properly indicates that the interpreter is invalid.
- Fixed a bug that could cause a daemon to go into an infinite loop
during authentication, consuming an ever-growing amount of memory.
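The "identical" comparison described in the 8.3.6 notes above (3 and 3.0 differ by type; two absolute times naming the same instant differ by time zone) can be modelled with ordinary Python values. This is a sketch of the documented behavior only, not the ClassAd implementation; the function names are assumptions.

```python
from datetime import datetime, timedelta, timezone

def is_identical(a, b):
    """Model the identity test: same type and same value.

    For time values, the time zone offset must match as well,
    mirroring the abstime example in the release note.
    """
    if type(a) is not type(b):
        return False
    if isinstance(a, datetime):
        return a == b and a.utcoffset() == b.utcoffset()
    return a == b

def is_not_identical(a, b):
    """Model the negated identity test."""
    return not is_identical(a, b)

# The two abstime values from the release note: the same instant,
# expressed with offsets of +0000 and +0100.
t_utc = datetime(2015, 3, 3, 12, 13, 15, tzinfo=timezone.utc)
t_plus1 = datetime(2015, 3, 3, 13, 13, 15, tzinfo=timezone(timedelta(hours=1)))

print(is_identical(3, 3.0))
print(is_identical(t_utc, t_plus1))
```

Both calls print False, although 3 == 3.0 and t_utc == t_plus1 are both True under ordinary equality.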
- HTCondor version 8.3.5 released on April 20, 2015.
- The RPM packages have been restructured to allow running a 32-bit
static shadow on Red Hat Enterprise Linux 6. The new condor-all
RPM is used to install all of the RPMs for a typical HTCondor installation.
Since the binary distribution of HTCondor for Red Hat Enterprise Linux 6 and 7
consists of more than a handful of RPMs, the RPMs are only available from our
- New features increase the power of job specification
in the submit description file.
Submit description files are now parsed the same as configuration files.
- The queue submit command may be used in
flexible and powerful new ways to specify job submissions.
See section 2.5.2 for details.
- New macro functions are supported,
and may be used in submit description files as well as in configuration.
- condor_submit has new command line options -queue
to provide flexible and powerful new ways to specify job submissions,
as well as to test what job would be submitted without submitting.
- condor_submit now supports assignment of ClassAd attributes
on the command line.
- condor_submit accepts if and include
statements in the same way that configuration files do.
- The machine ClassAd attribute VirtualMemory is now set correctly
for dynamic slots running under a partitionable slot.
- An EC2 grid universe job now advertises its access key ID in
its job ClassAd.
- For an executing job, HTCondor now sets the environment variable
OMP_NUM_THREADS to the number of cores of the slot it is running in.
This prevents OpenMP-linked jobs (including Matlab)
from attempting to use more cores than have been provisioned.
- Improved the performance of writing to a user's job event log
and the event log.
Disabling locking when writing to these files,
as controlled by configuration variables ENABLE_USERLOG_LOCKING
and EVENT_LOG_LOCKING, is now safe on Unix platforms.
The default location of the lock file for rotating the event log
that is defined by configuration variable
EVENT_LOG_ROTATION_LOCK has been changed to
- HTCondor is now more likely to be compatible with Windows systems
that have a Winsock Layered Service Provider (LSP) installed.
- The new configuration variables SUBMIT_REQUIREMENT_NAMES,
support the ability to configure HTCondor to reject the submission
of jobs that do not meet specified criteria.
- A new DAGMan feature implements the retry of a PRE or POST script
after a specified delay,
where a retry attempt is
based on the exit code from the initial execution of the script.
- condor_dagman no longer supports Stork jobs.
- condor_dagman no longer has the capability to read individual per-
job log files. This means that recovery mode will no longer work on a
DAG originally submitted with version 7.9.1 or earlier.
- An experimental new feature allows PanDA monitoring of jobs,
as documented with a link from the HTCondor wiki page,
- The condor_schedd now generates a report about its attempts to
reconnect to the condor_startd daemons of previously running jobs on start up.
The report is written to the location specified by the new configuration
Once all reconnect attempts are complete, a copy of the report is also
emailed to the HTCondor administrator account.
- The new ClassAd functions envV1ToV2() and
useful for manipulating the environment variable lists in job ClassAds.
- The new configuration variable CURB_MATCHMAKING
can be used to configure the condor_schedd daemon
to cease requesting more machine resources
from the central manager during overload situations.
- Some HTCondor-specific environment variables that are set in the
environment of a batch job, such as _CONDOR_JOB_AD,
are now also set for
condor_ssh_to_job and interactive job sessions.
- condor_q with the -constraint option will now display
a summary line.
- The new directQuery() Python binding queries a daemon
directly for its ClassAd,
instead of querying the condor_collector.
- The new send_alive() Python binding sends keep alive
messages to the condor_master daemon. This allows the Python bindings to be
used in a daemon managed by the condor_master.
- A new Python binding sets the HTCondor subsystem name and type.
This allows the bindings to initialize logging or configuration as if they
are a particular HTCondor daemon.
- The new log() Python binding provides the ability
to log messages via the HTCondor logging subsystem.
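As noted in the 8.3.5 items above, HTCondor now sets OMP_NUM_THREADS to the number of cores of the slot. A job that manages its own worker pool can read the same variable to stay within its allocation. The sketch below shows one way a Python job might do this; the function name worker_count and the fallback of one worker are assumptions.

```python
import os

def worker_count(environ=os.environ, default=1):
    """Return the number of workers a job should start.

    Reads OMP_NUM_THREADS, which HTCondor sets for an executing job.
    Falls back to `default` when the variable is unset, not an integer,
    or not positive (for example, when run outside of HTCondor).
    """
    raw = environ.get("OMP_NUM_THREADS", "")
    try:
        n = int(raw)
    except ValueError:
        return default
    return n if n > 0 else default

print(worker_count())
```

Passing the environment as a parameter keeps the function easy to exercise outside of a running job, for example worker_count({"OMP_NUM_THREADS": "4"}).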
- HTCondor version 8.3.4 released on March 5, 2015.
- Fixed a condor_schedd daemon bug
that could have prevented jobs from matching resources
without an apparent reason,
when the condor_schedd was flocking
and configuration variable SIGNIFICANT_ATTRIBUTES was not set.
- If the condor_schedd daemon is configured to enable both
IPv4 and IPv6 communication,
it will not be able to start any jobs which use a condor_shadow.
In effect, mixed mode IPv4 and IPv6 does not work.
- HTCondor version 8.3.3 released on February 19, 2015.
- Configuration variable ENCRYPT_EXECUTE_DIRECTORY
is now honored on Linux platforms.
The Linux platform must have the ecryptfs-utils package installed
and the Linux kernel must be version 2.6.29 or a more recent version.
The new submit command encrypt_execute_directory allows the user
to specify directory encryption on a per-job basis.
- By default, HTCondor will no longer have access to Linux
system credentials, such as OpenAFS tokens or eCryptFS keys.
This new behavior ensures these credentials cannot be
unintentionally obtained by user jobs.
For more information, see new configuration variable
in section 3.3.8.
- condor_q now offers the new command line option -autocluster,
which causes it to output condor_schedd daemon auto cluster information.
The information is an ID number and the number of jobs in each auto cluster.
- EC2 grid universe jobs may now specify an IAM (instance) profile.
- EC2 jobs may now specify security group IDs instead of names.
This allows the use of VPC instances with non-default security groups.
- EC2 jobs may now specify additional parameters to use when starting
the corresponding instance.
- Improved the preliminary support for IPv4 and IPv6 dual-protocol
by allowing them to work with the condor_shared_port daemon.
- The throughput of queries to the condor_collector has been
improved, as the condor_collector now never forks to handle queries
about condor_collector, condor_negotiator, and condor_schedd
- The new configuration variable ADD_SIGNIFICANT_ATTRIBUTES
lists job attributes to be added to the condor_negotiator-determined list
when considering auto clustering.
The new configuration variable REMOVE_SIGNIFICANT_ATTRIBUTES
lists job attributes to be removed from the condor_negotiator-determined
list when considering auto clustering.
- For all tools that support the -format or -autoformat option,
the new %r conversion specifier causes values to be displayed
in their unevaluated, or raw form.
- condor_dagman now prints the hold reason to the dagman.out
file when node jobs go on hold.
- New Python bindings can be used to query and set the
configuration variables of running daemons.
- Python functions can be invoked directly from ClassAd expressions
within HTCondor daemons.
A system administrator must set the new configuration variable
to specify the Python modules that are accessible from within ClassAds.
- The new Python LogReader class permits the reading and access
of individual daemon log events.
- The default value for configuration variable
DAGMAN_MAX_JOBS_IDLE has changed from 0,
which imposes no limits, to 1000.
- Configuration variable DAGMAN_USE_OLD_DAG_READER
is no longer supported.
Setting it to True will result in a warning,
and the setting will have no effect on how a DAG input file is read.
- ClassAd attributes written by the condor_schedd that
count the number of jobs in various states now include all jobs,
not only jobs that need to be matched by the condor_negotiator daemon.
These attributes include TotalRunningJobs, TotalIdleJobs,
TotalHeldJobs, and TotalRemovedJobs.
- condor_q and condor_history offer the new command line option
which limits the number of results returned.
- A new, more efficient query protocol has been added as the default
when querying a condor_schedd daemon that is version 8.3.3 or later.
To disable this new protocol, set configuration variable
CONDOR_Q_USE_V3_PROTOCOL to False.
- Fixed a bug that prevented CCB servers from running on Windows platforms.
- Configuration variables ENABLE_IPV4 or
ENABLE_IPV6 may now be safely set in any configuration file.
Previously, setting them in any file other than the first configuration file
parsed could have led to unpredictable behavior.
- Fixed a bug in the condor_startd daemon introduced in HTCondor
The condor_startd slots could get stuck
forever in the Preempting/Killing state when they were claimed by
an HTCondor version 8.3.1 or older condor_schedd.
- Fixed an issue introduced in HTCondor version 8.3.0
that could cause the condor_schedd to become unresponsive when a client
set configuration variable CONDOR_Q_USE_V3_PROTOCOL
to its non-default value of True.
- Configuration variable GRACEFULLY_REMOVE_JOBS now
controls how a running job is killed for all cases,
including when a job policy expression causes the job to be held or removed.
Previously, this configuration variable was consulted only
when the user ran condor_hold or condor_rm.
- The ability to transparently encrypt execute directories
is not supported by execute hosts using RHEL 7 and derivative distributions,
as these distributions no longer contain the eCryptfs kernel module.
- The submit commands
on_exit_hold and on_exit_remove
do not do what they are supposed to do
for local universe jobs on Windows machines.
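The auto clustering items in the 8.3.3 notes above (ADD_SIGNIFICANT_ATTRIBUTES, REMOVE_SIGNIFICANT_ATTRIBUTES, and the per-cluster job counts shown by condor_q -autocluster) can be pictured with a conceptual model: jobs whose significant attributes all have the same values fall into one auto cluster. The sketch below is not HTCondor code. The function names are hypothetical, the job values are invented, and applying removals after additions is an assumption.

```python
from collections import Counter

def significant_attributes(negotiator_attrs, add=(), remove=()):
    """Combine the negotiator-determined list with local changes.

    Assumption: removals are applied after additions.
    """
    return (set(negotiator_attrs) | set(add)) - set(remove)

def autocluster_counts(jobs, attrs):
    """Count jobs per distinct combination of significant attribute values."""
    keys = sorted(attrs)
    return Counter(tuple(job.get(k) for k in keys) for job in jobs)

# Invented example jobs, represented as plain dictionaries.
jobs = [
    {"Owner": "alice", "RequestMemory": 1024, "RequestCpus": 1},
    {"Owner": "bob", "RequestMemory": 1024, "RequestCpus": 1},
    {"Owner": "carol", "RequestMemory": 2048, "RequestCpus": 1},
]

attrs = significant_attributes(["RequestCpus", "RequestMemory"],
                               remove=["RequestCpus"])
for values, count in sorted(autocluster_counts(jobs, attrs).items()):
    print(values, count)
```

With only RequestMemory left as significant, the three jobs collapse into two groups, which is the kind of grouping and count that the -autocluster output reports.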
- HTCondor version 8.3.2 released on December 23, 2014.
This version contains all bug fixes from HTCondor version 8.2.6.
- It is now possible to run a dual-protocol (IPv4 and IPv6) submit node,
submitting to single-protocol execute nodes. This is preliminary work.
- The port used by the condor_shared_port daemon is now
9618 by default.
- Improved the handling of vm universe jobs that fail to start.
Failures which do
not appear to be the fault of the job now cause the job to be rescheduled and
the machine stops advertising the ability to run vm universe jobs.
The new condor_update_machine_ad tool facilitates changing
the machine ClassAd.
- The memory footprint of the condor_shadow has been
reduced when Kerberos or SSL authentication methods are not used,
as these libraries are now loaded on demand at run time.
- The responsiveness of a busy condor_schedd daemon to queries
has been improved.
- Added the ability to specify the block device mapping for EC2 jobs.
- The new python binding register() has been added
to allow python functions
to be registered with the ClassAds library. This allows python
functions to be invoked from within ClassAds.
- The new python bindings externalRefs() and
internalRefs() have been added to allow the ClassAd object
to determine internal and external references from an expression.
- When the condor_startd has a live condor_starter,
claim keep alives are now sent
over the existing TCP connection between the condor_starter and condor_schedd,
rather than creating a new connection to the
condor_schedd from the condor_startd.
- Added the DAGMan feature of ALWAYS-UPDATE for updates
of a DAGMan node status file.
Specifying this causes the node status file to be overwritten,
even if no nodes have changed status since the file was last written.
- Configuration variable MAX_JOBS_RUNNING has been
modified such that it only applies to job universes that require a
Scheduler and local universe jobs are no longer affected by this
The number of running scheduler and local universe jobs can be controlled
with configuration variables START_SCHEDULER_UNIVERSE and
- The specific versions of Globus GSI libraries to be loaded at run time
are determined at compile time.
- HTCondor now sets environment variable _CONDOR_JOB_AD for
scheduler universe jobs.
Its value will be the path to a file which contains
the job ClassAd as it was when the job was started.
This feature already exists for vanilla, parallel, java, and local
- The new -debug option to condor_userprio sends
debug output to stderr.
- HTCondor daemons now support a whitelist of statistics attributes to
publish from their ClassAd to the condor_collector.
This is intended to ease
configuration on systems that use ganglia for monitoring.
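
As a rough illustration of the whitelist idea, the Python sketch below keeps
only the statistics attributes named in a whitelist before they would be
published. The function name filter_published_stats, the plain-dict
representation of the statistics, and the sample values are assumptions made
for this example; it is not how the daemons implement the feature.

```python
def filter_published_stats(stats, whitelist):
    """Return only the statistics whose names appear in the whitelist.

    ClassAd attribute names are case-insensitive, so names are compared
    in lower case. Both arguments are hypothetical stand-ins for a
    daemon's statistics and its configured whitelist.
    """
    allowed = {name.lower() for name in whitelist}
    return {name: value for name, value in stats.items()
            if name.lower() in allowed}


# Sample statistics; the values are invented for illustration.
stats = {
    "JobsRunning": 12,
    "NumJobStartsDelayed": 0,
    "NumPendingClaims": 3,
}

published = filter_published_stats(stats, ["jobsrunning", "NumPendingClaims"])
print(published)
```

Only the whitelisted attributes survive, which keeps the set of values a
monitoring system has to handle small and predictable.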
- New statistics have been added to the condor_schedd to monitor runtime
spent doing DNS queries, using fsync,
and rebuilding the priority list for negotiation.
Also, additional attributes for average, maximum, and minimum
have been added to runtime statistics for command handlers
for all HTCondor daemons.
These changes are intended to help direct future scalability work.
- The new daemon logging level, D_SUB_SECOND,
enables millisecond resolution timestamps in daemon logs.
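
To show what millisecond resolution adds to a log line, here is a small
Python sketch that formats the same instant with and without a sub-second
part. The function name format_log_timestamp and the exact date layout are
assumptions for illustration and may differ from what the daemons actually
write.

```python
from datetime import datetime


def format_log_timestamp(when, sub_second):
    """Format a timestamp, optionally appending whole milliseconds."""
    base = when.strftime("%m/%d/%y %H:%M:%S")
    if sub_second:
        # microsecond // 1000 yields whole milliseconds in the range 0-999.
        return "{}.{:03d}".format(base, when.microsecond // 1000)
    return base


stamp = datetime(2014, 12, 23, 14, 5, 9, 123456)
print(format_log_timestamp(stamp, sub_second=False))  # 12/23/14 14:05:09
print(format_log_timestamp(stamp, sub_second=True))   # 12/23/14 14:05:09.123
```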
- Fixed a bug introduced in HTCondor version 8.3.1 that caused daemons to be
unreachable if they were configured to use the condor_shared_port daemon,
but the condor_master was not.
- Updated the CREAM client library used in the cream_gahp.
This fixes the delegation of RFC format proxies, in addition to other
- Fixed a bug that could cause a segmentation fault of condor_dagman
for some DAG input file syntax errors,
rather than printing an appropriate error message.
- Fixed a bug that could cause the condor_shared_port daemon
to fail on Mac OS X platforms,
if configuration variable LOCK was not explicitly set
in a configuration file.
- Fixed a bug that caused both condor_dagman and the condor_schedd
daemon to generate commands to remove condor_dagman's node jobs when
the condor_dagman job is the target of condor_rm.
Now, only the condor_schedd generates the command,
avoiding the extra load of running two identical commands.
- Fixed a bug that caused the DAGMan node status file,
as detailed in section 2.10.12,
to not reflect the final status of a DAG when the DAG is removed
by issuing a condor_rm command,
or when the DAG is
aborted due to an ABORT-DAG-ON specification in the DAG input file.
- HTCondor version 8.3.1 released on September 11, 2014.
- If cgroups are enabled on Linux platforms,
the amount of swap space used by a job is now limited to the
size specified by the machine ClassAd attribute VirtualMemory
for the slot that the job is running on.
- The new configuration variable COLLECTOR_PORT specifies
the default port used by the condor_collector daemon and command line tools.
The default value is 9618.
This default is the same port as has been used in previous HTCondor versions.
- The condor_shared_port daemon will now work
if the default location given by configuration variable
DAEMON_SOCKET_DIR, which is $(LOCK)/daemon_sock,
is longer than 90 characters.
On Linux platforms, abstract sockets are now the primary method for
condor_shared_port to forward an incoming connection to the intended
- Improvements to CCB increase performance.
- The use of a single log file to write events and enforce the
dependencies of a DAG represented by a condor_dagman instance is mandatory.
To implement this,
the -dont_use_default_node_log command-line
option to condor_submit_dag is disabled,
and an attempt to set configuration variable
DAGMAN_ALWAYS_USE_NODE_LOG to False will generate an
- The new condor_dagman configuration variable
DAGMAN_SUPPRESS_JOB_LOGS allows users to prevent DAG node
jobs from writing to the log file specified in their submit description file.
See section 3.3.24 for details.
- New special variables @(OWNER) and @(NODE_NAME) are
available when defining configuration variable
These values make it easier to avoid log file name collisions.
- condor_submit will no longer insert an OpSys requirement
for a job
when one of OpSysAndVer, OpSysLongName, OpSysName,
or OpSysShortName is already specified by the user in
the Requirements expression of the submit description file.
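
The check described above can be sketched in Python as a search of the
Requirements text for any of the four attribute names. The function name
references_opsys_variant is hypothetical, and a regular expression is a
simplification: it would also match a name that appears inside a string
literal, which a real ClassAd parser would not treat as an attribute
reference.

```python
import re

# The four attributes named in the release note.
OPSYS_VARIANTS = ("OpSysAndVer", "OpSysLongName", "OpSysName", "OpSysShortName")

# Whole-word match on any variant; ClassAd attribute names are
# case-insensitive, so the search ignores case.
_VARIANT_PATTERN = re.compile(
    r"\b(?:" + "|".join(OPSYS_VARIANTS) + r")\b", re.IGNORECASE
)


def references_opsys_variant(requirements):
    """Return True if the expression mentions one of the OpSys variants."""
    return _VARIANT_PATTERN.search(requirements) is not None


print(references_opsys_variant('OpSysAndVer == "RedHat6"'))
print(references_opsys_variant('Arch == "X86_64" && Memory > 1024'))
```

When the function returns True for a job's Requirements expression, the
behavior described in this item is that no separate OpSys requirement is
added.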
- The configuration file $(HOME)/.condor/condor_config
is no longer considered for the single, initial, global configuration file.
Instead, a user-specific configuration file has been added as the
last file parsed.
The new configuration variable USER_CONFIG_FILE may change the
default file name or disable this feature.
Section 3.3.1 describes the ordering
in which configuration files are parsed.
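
Because the user-specific file is parsed last, a value set there takes
precedence over the same variable set in an earlier file. A minimal Python
sketch of that layering follows, assuming each parsed file is represented as
a dict and that a later definition simply replaces an earlier one. The
function name merge_config and the sample values are illustrative only.

```python
def merge_config(layers):
    """Combine configuration layers in parse order; later layers win."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical contents of a global file and a user-specific file.
global_config = {"COLLECTOR_PORT": "9618", "CONDOR_Q_USE_V3_PROTOCOL": "True"}
user_config = {"CONDOR_Q_USE_V3_PROTOCOL": "False"}

effective = merge_config([global_config, user_config])
print(effective)
```

The user's setting replaces the global one, while variables the user file
does not mention keep their earlier values.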
- Daemons now authenticate many client network connections in
parallel, rather than one at a time.
This improves the scalability of daemons that receive many client
connections, like the condor_schedd and condor_collector.
The improvement is most noticeable when using the FS and GSI
- The GSI security libraries are now loaded into memory only when GSI
authentication is required.
This reduces memory usage when GSI authentication is not used.
The memory reduction will be most noticeable when there are many
condor_shadow processes running.
- Implemented fine-grained locking in the HTCondor python module to
allow other python threads to run during HTCondor calls.
- HTCondor version 8.3.0 released on August 12, 2014.
This release contains all improvements and bug fixes from
HTCondor version 8.2.2.
- When a daemon creates a child daemon process, it also creates a
security session shared with the child daemon.
This makes the initial communication between the daemons more efficient.
- Negotiation cycle performance has been improved, especially
over a wide-area network, by reducing network traffic and latency
between a submit machine and a central manager.
The new configuration variable
does performance tuning, as defined
in section 3.3.16.
- The synchronization of the job event log was improved by only
using fsync() where necessary and
fdatasync() where sufficient.
This should provide a small reduction in disk I/O to
the condor_schedd daemon.
- CPU usage by the condor_collector has been reduced when
handling normal queries from condor_status,
and CPU usage by the condor_schedd has been reduced when
handling normal queries from condor_q.
- HTCondor can now internally cache the result of Globus authorization
The caching behavior is enabled by setting configuration variable
GSS_ASSIST_GRIDMAP_CACHE_EXPIRATION to a non-zero value.
This feature will be useful for sites that use the Globus authorization
callouts based only on DN and VOMS FQAN, and for sites that have
- The job ClassAd attribute DAG_Status is included in
the dagman.out file.
- The new -DoRecovery command line option for condor_dagman
and condor_submit_dag causes condor_dagman to run in
- The new -ads option to condor_status permits a set of ClassAds
to be read from a file, processing the ClassAds as if they came from
- Daemon ClassAd hooks implementing Startd Cron functionality
can now return multiple ClassAds,
and the hooks can specify which ClassAds their output should merge into.
- Two new condor_schedd ClassAd statistics attributes are
available: JobsRunning and JobsAccumExceptionalBadputTime.
- Fixed a bug that caused condor_dagman to unnecessarily attempt
to read node job submit description files,
which could cause spurious warnings when in recovery mode.
Strictly speaking, the bug is fixed only for the
default case in which DAGMAN_ALWAYS_USE_NODE_LOG is set
- Fixed a bug in the condor_schedd daemon that caused the values
of the ClassAd attributes JobsRunningSizes and
JobsRunningRuntimes to be much larger than they should have been.