11.2 Development Release Series 8.7
This is the development release series of HTCondor.
The details of each version are described below.
- HTCondor version 8.7.2 released on June 22, 2017.
- Our current implementation of late materialization is incompatible with
condor_dagman and will cause unexpected behavior, including failing without
warning. This is a top-priority issue which we aim to resolve in an upcoming
- Improved the performance of the condor_schedd by setting the
default for the knob SUBMIT_SKIP_FILECHECKS to true. This prevents
the condor_schedd from checking the readability of all input files, and skips
the creation of the output files on the submit side at submit time.
Output files are now created either at transfer time, when file transfer
is on, or by the job itself, if a shared filesystem is used. As a result
of this change, it is possible that a job will run to completion, and only
then is put on hold because the output file on the submit machine cannot
- Changed condor_submit to not create empty stdout and stderr files before
submitting jobs by default. This caused confusion for users, and slowed down
the submission process. The older behavior, where condor_submit would fail
if it could not create these files, is available when the parameter
SUBMIT_SKIP_FILECHECKS is set to false. The default is now true.
- condor_q will now show expanded totals when querying a condor_schedd that is version 8.7.1 or later.
The totals for the current user and for all users are provided by the condor_schedd.
To get the old totals display, set the configuration parameter CONDOR_Q_SHOW_OLD_SUMMARY to true.
- The condor_annex tool now logs to the user configuration directory. Added an
audit log of condor_annex commands and their results.
- Changed condor_off so that the -annex flag implies the
-master flag, since this is more likely to be the right thing.
- Added -status flag to condor_annex, which reports on
instances which are running but not in the pool.
- If invoked with an annex name and duration (but not an instance or slot
count), condor_annex will now adjust the duration of the named annex.
- Job input files which are downloaded from http:// web addresses now
have mechanisms to recover from transfer failures. This should increase the
reliability of using web-based input files, especially under slow and/or
unstable network conditions.
- Reduced load on the condor_collector by optimizing queries performed when
an HTCondor daemon needs to look up the address of another daemon.
- Reduced load on the condor_collector by optimizing queries performed
when using condor_q with several different command-line options such as
-submitter and -global.
- Added the condor_top tool,
an automated version of the now-defunct condor_top.pl
which uses the python bindings to monitor the status of daemons.
- Added a new option -cron to condor_gpu_discovery that allows it to be
used directly as an executable of a condor_startd cron job.
- The configuration variable MAX_RUNNING_SCHEDULER_JOBS_PER_OWNER
was set to default to 100. It formerly had no default value.
- Added a parameter DEDICATED_SCHEDULER_USE_SERIAL_CLAIMS which
defaults to false. When true, allows the dedicated scheduler to use claimed/idle
slots that the serial scheduler has claimed.
- The condor_advertise tool now assumes an update command if one is not
specified on the command-line and attempts to determine the exact command by
inspecting the first ad to be advertised.
- Improved support for running several condor_negotiators in a
NEGOTIATOR_NAME now works like MASTER_NAME.
condor_userprio has a -name option to select a specific
Accounting ads from multiple condor_negotiators can co-exist in the
- Package EC2 Annex components in the condor-annex-ec2 sub RPM.
- Added configuration parameter ALTERNATE_JOB_SPOOL,
an expression evaluated against the job ad, which specifies an alternate
spool directory to use for files related to that job.
- With an empty configuration file, HTCondor would behave as if
ALLOW_ADMINISTRATOR were *. Changed the default to
$(CONDOR_HOST), which is much less insecure.
- Fixed a bug in the condor_schedd where it did not account for the initial state of
late materialize jobs when calculating the running totals of jobs by state. This bug
resulted in condor_q displaying incorrect totals when CONDOR_Q_SHOW_OLD_SUMMARY
was set to false.
- Fixed a bug where the condor_schedd would incorrectly try to check the
validity of output files and directories for late materialize jobs. The condor_schedd
will now always skip file checks for late materialize jobs.
- Changed the output of the condor_status command so that the Load Average
field now displays the load average of just the condor job running in that
slot. Previously, load originating outside of HTCondor was proportionately
distributed into the condor slots, resulting in much confusion.
- Illegal characters ('+', '.') are now prohibited in DAGMan node names.
- Improved audit log messages by including the connection ID and properly
filtering out shadow and gridmanager modifications to the job queue log.
- condor_root_switchboard has been removed from the release, since
PrivSep is no longer supported.
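
The DAGMan node-name restriction listed above for 8.7.2 can be checked before a DAG is submitted. The following is a minimal sketch, assuming only the two characters named in the release note ('+' and '.'); the helper names are hypothetical and are not part of DAGMan or the HTCondor python bindings, and DAGMan itself may apply further rules not modeled here.

```python
# Hypothetical pre-submission check; not part of DAGMan or the HTCondor
# python bindings. Only the two characters named in the 8.7.2 release
# note are treated as illegal here.
ILLEGAL_NODE_NAME_CHARS = {"+", "."}


def illegal_chars_in_node_name(name):
    """Return the sorted list of prohibited characters found in a node name."""
    return sorted(set(name) & ILLEGAL_NODE_NAME_CHARS)


def check_dag_node_names(names):
    """Map each offending node name to the prohibited characters it contains."""
    problems = {}
    for name in names:
        bad = illegal_chars_in_node_name(name)
        if bad:
            problems[name] = bad
    return problems


if __name__ == "__main__":
    # Example node names are made up for illustration.
    for name, bad in check_dag_node_names(["stage1", "stage.2", "merge+plot"]).items():
        print(f"node name {name!r} contains prohibited characters: {bad}")
```

Running such a check over the node names of a generated DAG file lets a workflow generator fail early, rather than at condor_submit_dag time.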
- HTCondor version 8.7.1 released on April 24, 2017.
- Previously, when the number of forked children processing Collector
queries surpassed the maximum set by the configuration knob COLLECTOR_QUERY_WORKERS, the
Collector handled all new incoming queries in-process (i.e. without
forking). Because processing a query and sending the result over the network
can take a long time, servicing such queries in-process is likely to make
the Collector drop a lot of updates. As of 8.7.1, such queries are instead
queued and serviced as soon as
query worker child processes become available. The configuration knob
COLLECTOR_QUERY_WORKERS_PENDING was introduced; see
- Default value for COLLECTOR_QUERY_WORKERS changed from 2 to 4.
- Introduced configuration macro
COLLECTOR_QUERY_WORKERS_RESERVE_FOR_HIGH_PRIO so that the
collector prioritizes queries that are important for the operation of the
pool (such as queries from the negotiator) ahead of servicing user
invocations of condor_status.
- Introduced configuration macro COLLECTOR_QUERY_MAX_WORKTIME to
define the maximum amount of time the collector may service a query from a
client like condor_status. See the documentation of COLLECTOR_QUERY_MAX_WORKTIME.
- Added several new statistics on collector query performance into the Collector
ClassAd, including ActiveQueryWorkers, ActiveQueryWorkersPeak,
PendingQueries, PendingQueriesPeak, DroppedQueries,
and RecentDroppedQueries. See the section on Collector ClassAd attributes.
- Further refinement and initial documentation of the HTCondor Annex.
- Docker universe jobs can now use condor_chirp command
(if it is in the image).
- In the Job Router, when a candidate job matches multiple routes,
the first route is now always selected.
The old behavior of spreading jobs across all matching routes round-robin
style can be enabled by setting the new configuration parameter
JOB_ROUTER_ROUND_ROBIN_SELECTION to True.
- The condor_schedd now keeps a count of jobs by state for each owner and submitter
and will report them to condor_q. condor_q will display these totals unless the new
configuration parameter CONDOR_Q_SHOW_OLD_SUMMARY is set to true. In 8.7.1
this parameter defaults to true.
- Milestone 1 for late materialization in the condor_schedd was completed. This milestone adds the
undocumented option -factory to condor_submit that can be used to submit a late materializing job cluster
to the condor_schedd. The condor_schedd will refuse the submission unless the configuration parameter
SCHEDD_ALLOW_LATE_MATERIALIZATION is set to true.
- Increased the default value for configuration parameter
NEGOTIATOR_SOCKET_CACHE_SIZE to 500.
- Added new DaemonCore statistics UdpQueueDepth to measure the
number of bytes in the UDP receive queue for daemons with a UDP command port.
- Improved speed of handling queries to the collector by caching
the configuration knob SHARED_PORT_ADDRESS_REWRITING.
- The condor_collector on Linux now handles some queries in process and some
by forking a child process. This allows it to avoid the overhead of forking to handle
queries that will take little time. The policy for deciding which queries to handle in process
is controlled by a new configuration parameter HANDLE_QUERY_IN_PROC_POLICY.
- Added -limit option to condor_status and changed the condor_collector to honor it.
- condor_submit was changed to use the same utility library that the submit python bindings use.
This should help ensure that submit via python bindings will give the same results as using condor_submit.
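
The change in how the Collector handles queries once COLLECTOR_QUERY_WORKERS is reached (first item in the 8.7.1 list above) can be pictured with a small model. This is a sketch for illustration only, not the Collector's implementation: the class and method names are hypothetical, and the assumption that a query arriving when the pending queue is full is dropped is inferred from the DroppedQueries statistic rather than stated in these notes.

```python
from collections import deque


class QueryWorkerPool:
    """Toy model of the 8.7.1 policy: fork while workers are free, else queue.

    max_workers stands in for COLLECTOR_QUERY_WORKERS and max_pending for
    COLLECTOR_QUERY_WORKERS_PENDING. Everything else is hypothetical.
    """

    def __init__(self, max_workers, max_pending):
        self.max_workers = max_workers
        self.max_pending = max_pending
        self.active = 0          # compare ActiveQueryWorkers
        self.pending = deque()   # compare PendingQueries
        self.dropped = 0         # compare DroppedQueries

    def submit(self, query):
        """Return how the query was handled: 'forked', 'queued', or 'dropped'."""
        if self.active < self.max_workers:
            self.active += 1
            return "forked"
        if len(self.pending) < self.max_pending:
            # Since 8.7.1 the query waits for a worker instead of being
            # serviced in-process.
            self.pending.append(query)
            return "queued"
        # Assumption: a full pending queue means the query is dropped.
        self.dropped += 1
        return "dropped"

    def worker_finished(self):
        """A worker exits; start and return the oldest pending query, if any."""
        if self.active == 0:
            raise RuntimeError("no active worker to finish")
        self.active -= 1
        if self.pending:
            self.active += 1
            return self.pending.popleft()
        return None


if __name__ == "__main__":
    pool = QueryWorkerPool(max_workers=2, max_pending=1)
    for q in ["q1", "q2", "q3", "q4"]:
        print(q, pool.submit(q))
    print("next serviced:", pool.worker_finished())
```

The point of the model is that the Collector's main process never services a slow query itself, so it stays free to receive updates.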
- HTCondor version 8.7.0 released on March 2, 2017.
- Optimized the code that reads ClassAds off the wire making the maximum possible update rate
for the Collector about 1.7 times higher than it was before.
- New statistics have been added to the Collector ad to show time spent handling queries.
- Changed the formatting of the printing of ClassAd expressions with
parentheses. Now there is no space character after every open parenthesis, or
before every close parenthesis.
This looks more natural, is somewhat faster for HTCondor to parse, and
saves space. That is, an expression that used to print like
( ( ( foo ) ) )
now will print like this
(((foo)))
- Technology preview of the HTCondor Annex. The HTCondor Annex allows one
to extend their HTCondor pool into the cloud.
- Added -annex option to condor_status and condor_off. Requires
an argument; the request is constrained to match machines whose
AnnexName ClassAd attribute matches the argument.
- A refreshed X.509 proxy is now forwarded to the remote cluster
- Added several new statistics to the Negotiator ad, mainly
detailing how time is spent in the negotiation cycle.
- Removed redundant updates to the job queue by the Job Router.
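
Scripts that compare ClassAd expressions printed by pre-8.7.0 and later versions may need to normalize the parenthesis spacing described in the 8.7.0 list above. The following is a minimal sketch of that normalization, assuming the expression contains no string literals with parentheses in them; the function name is hypothetical and this is not how HTCondor's ClassAd unparser is implemented.

```python
import re


def compact_parens(expr):
    """Remove whitespace after '(' and before ')' in a printed expression.

    Hypothetical helper. It works on raw text and does not understand
    ClassAd string literals, so quoted text containing parentheses would
    also be rewritten.
    """
    expr = re.sub(r"\(\s+", "(", expr)   # "( foo" -> "(foo"
    expr = re.sub(r"\s+\)", ")", expr)   # "foo )" -> "foo)"
    return expr


if __name__ == "__main__":
    print(compact_parens("( ( ( foo ) ) )"))
```

Applying the function to output from both versions before comparing avoids spurious differences caused only by the formatting change.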