10.4 Stable Release Series 8.4
This is a stable release series of HTCondor.
As usual, only bug fixes (and potentially, ports to new platforms)
will be provided in future 8.4.x releases.
New features will be added in the 8.5.x development series.
The details of each version are described below.
Version 8.4.7
Release Notes:
- HTCondor version 8.4.7 not yet released.
New Features:
Bugs Fixed:
- Fixed a bug in Docker universe where the job would
not run with the correct group id.
(Ticket #5649).
- Fixed a performance problem in the condor_schedd that could
cause it to become unresponsive for several minutes after the
set of significant attributes for negotiation changes.
(Ticket #5648).
Version 8.4.6
Release Notes:
- HTCondor version 8.4.6 released on April 21, 2016.
New Features:
- condor_advertise -multiple now tolerates multiple blank lines in the
input file. It no longer quits parsing on the first blank line that does not
follow a valid ClassAd.
(Ticket #5147).
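As an illustration of the parsing behavior described above, the following Python sketch splits an input text into blocks separated by one or more blank lines. It is not the condor_advertise implementation; the function name split_ads and the sample attributes are assumptions made for this example.

```python
def split_ads(text):
    """Split text into blocks of non-blank lines.

    Any run of blank lines acts as a single separator, so leading,
    trailing, or repeated blank lines never end parsing early.
    """
    ads = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the block collected so far.
            ads.append(current)
            current = []
        # A blank line with no pending block is simply skipped.
    if current:
        ads.append(current)
    return ads


# Hypothetical input with leading and repeated blank lines.
sample = '\n\nMyType = "Generic"\nName = "a"\n\n\n\nMyType = "Generic"\nName = "b"\n'
ads = split_ads(sample)
print(len(ads))
```

Treating a run of blank lines as one separator means the parser only needs to track whether a block is currently open.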
Bugs Fixed:
- Fixed a bug where, when partitionable slots were
enabled in the condor_startd, a job would be unable
to start running on that machine in some cases.
(Ticket #5626).
- Fixed a bug that would cause the condor_startd
to crash when ALLOW_PSLOT_PREEMPTION was enabled.
(Ticket #5586).
- Fixed a bug introduced in version 8.3 that
removed the attribute REMOTE_GROUP_RESOURCES_IN_USE
from the job ad in the negotiator.
(Ticket #5593).
- Fixed a bug where HTCondor would regard as invalid text representations
of IPv6 addresses which were the longest possible. This bug typically
manifested as a failure to contact hosts which were advertising IPv6 addresses
of this sort.
(Ticket #5585).
- Fixed a memory leak in the condor_negotiator when
ALLOW_PSLOT_PREEMPTION was enabled.
(Ticket #5571).
- Fixed a bug where after a condor_schedd restart
the submitter attribute WEIGHTED_JOBS_RUNNING
would be incorrectly computed.
(Ticket #5637).
- Fixed a bug when using CLAIM_PARTITIONABLE_LEFTOVERS
and flocking.
Machines from a remote pool could be treated as if they were in the local
pool.
As a result, the RemotePool attribute would not be set in the ads
of jobs running on these machines, and the FlockedJobs and
RunningJobs attributes of submitter ads would have incorrect
values.
(Ticket #5577).
- Fixed a bug that could cause a job's supplemental groups to be set
incorrectly when SOFT_UID_DOMAIN is set to True.
(Ticket #5603).
- Fixed a bug that caused supplemental groups to be set incorrectly
when executing file transfer plugins and various hooks.
(Ticket #5600).
- Fixed a bug that resulted in Windows 10 being reported as
WindowsUnknown in the OPSYSNAME attribute of the condor_startd
ClassAd.
(Ticket #5575).
- Fixed a typo in the LIMIT_JOB_RUNTIMES policy metaknob
that prevented the policy from working as intended.
(Ticket #5307).
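For reference on the IPv6 fix above (Ticket #5585), the short Python sketch below uses the standard ipaddress module to show how long a fully written-out IPv6 address is. These notes do not state which textual form triggered the HTCondor bug, so the address chosen here is only an assumed example.

```python
import ipaddress

# Fully expanded form: eight groups of four hex digits plus seven colons.
addr = ipaddress.IPv6Address("ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff")
text = addr.exploded
print(text, len(text))
```

A parser that sizes its buffer or length check for shorter, compressed forms would reject a string like this one.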
Version 8.4.5
Release Notes:
- HTCondor version 8.4.5 released on March 22, 2016.
New Features:
- The default for DAGMAN_LOG_ON_NFS_IS_ERROR has
been changed from True to False. This is the result
of changes in the 8.3 series that mean that file locking is no
longer required on user logs.
(Ticket #5516).
Bugs Fixed:
- Fixed a bug where HTCondor would unconditionally retry non-successful
DNS lookups of the local system's hostname; this could cause delays of up
to sixty seconds when using command-line tools on systems whose hostname
was not in DNS. We no longer retry on errors at all, and only retry
failures which are temporary.
(Ticket #5553).
- Fixed a bug that would cause condor_schedds flocking to remote
pools to send no jobs, or fewer jobs than possible, to the
remote pool. This was a result of not correctly setting
the submitter attribute WeightedJobsRunning for
flocked pools.
(Ticket #5539).
- Accounting group names that contain spaces are now rejected by
condor_submit and ignored by the condor_negotiator.
Previously, submitting a job with an accounting group name that contained
a space would cause the condor_negotiator to fail at startup.
(Ticket #5538).
- Fixed a bug whereby per-job history files (enabled by
the configuration setting PER_JOB_HISTORY_DIR) may briefly
appear to be empty or incomplete.
(Ticket #5562).
- Fixed a bug whereby ClassAds written into history files
may contain the same attribute multiple times.
(Ticket #5548).
- Fixed a bug that caused DAGMan to not work correctly with
some local universe node jobs. (This bug was introduced in version
8.3.0.)
(Ticket #5299).
- Fixed a bug that resulted in jobs managed by the condor_job_router
not reporting memory and disk usage of the job correctly.
(Ticket #5552).
- Reworked a bug fix from the 8.4.3 release, which was designed to allow for
more than 100 dynamic slots, to be a bit more generous in allocating Disk to
those slots.
Now, those slots are less prone to fail to match subsequent jobs.
(Ticket #5535).
- Fixed a bug in the randomization of ports within the LOWPORT to HIGHPORT range
that would sometimes generate ports outside of this range on Windows.
(Ticket #5555).
- Fixed a bug in condor_off -peaceful that could result in never
sending the "off" command to machines when at least one of the machines could
not be contacted when sending the previous "peaceful" command.
(Ticket #5504).
- When cgroups are in use, limit the amount of file system cache in the
kernel to prevent the OOM killer from killing jobs that use a large amount of
file system cache.
(Ticket #5500).
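The port randomization fix above (Ticket #5555) concerns keeping randomly chosen ports inside the LOWPORT to HIGHPORT range. The Python sketch below shows one way to select a port with both bounds inclusive. It is not HTCondor's code; the function name pick_port and the range 9600 to 9700 are assumptions made for this example.

```python
import random


def pick_port(low, high, rng=random):
    """Return a random port in the inclusive range [low, high]."""
    if not (0 < low <= high <= 65535):
        raise ValueError("expected 0 < low <= high <= 65535")
    # randint includes both endpoints, so the result never leaves the range.
    return rng.randint(low, high)


# Hypothetical range standing in for LOWPORT and HIGHPORT.
ports = [pick_port(9600, 9700) for _ in range(1000)]
print(min(ports) >= 9600 and max(ports) <= 9700)
```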
Version 8.4.4
Release Notes:
- HTCondor version 8.4.4 released on February 4, 2016.
New Features:
Bugs Fixed:
- Fixed a bug that caused the condor_collector to crash if
CONDOR_DEVELOPERS_COLLECTOR failed to resolve.
(Ticket #5492).
- Fixed a bug that caused Condor-C jobs to fail when
JobLeaseDuration was set to less than one hour (3600 seconds).
The remote job would be aborted due to the job lease not being renewed.
(Ticket #5446).
- Fixed a bug that could cause HTCondor to misreport an EC2 job as running
when it had in fact been purged from the service. Fixed bugs that could
cause a running EC2 job to be misreported as idle. HTCondor also no longer
sends EC2 services superfluous queries. (This may increase the latency
of HTCondor job status updates.)
(Ticket #4568).
- The grid manager now aborts if the GAHP hangs, which we detect by
the absence of a response after GRIDMANAGER_GAHP_RESPONSE_TIMEOUT
seconds. The EC2 GAHP now performs many fewer memory allocations in the
course of normal operations, which improves stability on some systems.
(Ticket #5442).
- Fixed a bug that caused the condor_master to fail if a shared port
daemon address file written by a version of HTCondor prior to 8.4.0
is present.
(Ticket #5488).
- Fixed a bug that caused updates to the job attribute
TimerRemove to not be respected while the job was being managed
by the condor_shadow, condor_gridmanager, or the Job Router.
(Ticket #5470).
- Fixed a bug where the job policy expression of a job could appear
in one of the Reason attributes of another job.
(Ticket #5466).
- Fixed a bug on the Windows platform that would cause
the condor_shadow to hang while trying to delete old shadow logs when the
value of MAX_NUM_SHADOW_LOG was larger than the default value of 1.
This bug would also sometimes result in the condor_schedd hanging.
(Ticket #5499).
Version 8.4.3
Release Notes:
- HTCondor version 8.4.3 released on December 16, 2015.
New Features:
Bugs Fixed:
- Fixed a bug that caused the -append option to be handled too
late to apply to the first Queue statement in a condor_submit file.
(Ticket #5414).
- Fixed a bug that prevented running more than 100 slots on a single
condor_startd with partitionable slots.
(Ticket #5398).
- Fixed a bug which caused ec2_iam_profile_name
not to work for Spot instances.
(Ticket #5410).
- Fixed a bug where the cgroup VM limit would not be set for sizes over
2 Gibibytes.
(Ticket #5434).
- Fixed bugs that prevented the HTCondor daemons from working promptly at
startup when the condor_shared_port daemon was in use on Windows platforms.
(Ticket #5283).
(Ticket #5430).
(Ticket #5431).
(Ticket #5432).
(Ticket #5433).
- Added SELinux type enforcement rules to allow the condor_schedd
to use sendmail on Enterprise Linux 7 platforms.
(Ticket #5418).
- Fixed a bug where the HTCondor service would not start if the
condor_master.pid file was empty on Linux platforms.
(Ticket #5427).
Version 8.4.2
Release Notes:
- HTCondor version 8.4.2 released on November 17, 2015.
New Features:
- condor_history no longer reports an error when run on a system that does
not have a history file.
This change was made because the history file is not created until after the
first job runs.
So, users were always seeing an error message on a fresh installation of
HTCondor.
(Ticket #5374).
Bugs Fixed:
- Fixed a bug introduced in 8.4.1 that could cause the condor_schedd
to exit.
This affected remote submit, HTCondor-CE, and HTCondor-C.
(Ticket #4522).
- The TCP_FORWARDING_HOST is now honored by
HTCondor client programs.
(Ticket #5339).
- Fixed a problem where Standard Universe jobs could not restart
from a checkpoint in the Enterprise Linux 6 RPM distribution.
(Ticket #5382).
(Ticket #5383).
- Fixed bugs in the function of the DAGMan
DAGMAN_MAX_JOBS_IDLE/-maxidle throttle,
especially for node jobs that create multiple procs.
(Ticket #5333).
- Fixed a problem where the RPMs would claim to publicly provide
Globus shared libraries that are in a private location.
(Ticket #5349).
- Added a default request_memory for condor_submit -interactive
of 512 megabytes. Formerly, the default was one, which is
insufficient in environments that strictly enforce memory
usage.
(Ticket #5344).
- Fixed a problem where the condor_classad RPM would claim to
provide a replacement for the classad RPM in EPEL.
(Ticket #5400).
- HTCondor now applies the configuration settings
GRIDMANAGER_GAHP_CALL_TIMEOUT and
GRIDMANAGER_CONNECT_FAILURE_RETRY_COUNT
when running grid universe jobs for EC2 or Google Compute Engine.
(Ticket #5300).
- Fixed a crash in the condor_schedd that happened when the
schedd was under load and being shut down in the fast mode.
(Ticket #5371).
- Added a timeout to the condor_fetchlog command so that it
will not hang forever waiting for an unresponsive daemon.
(Ticket #5325).
- Fixed a problem that prevented HTCondor from building on some 64-bit Linux
platforms such as Arm64.
This was reported by Debian maintainers as their Bug 804386.
(Ticket #5380).
- Fixed a problem where the platform string was incorrect in the RPM
packages.
(Ticket #5384).
Known Issues:
- The DAGMan workflow log file is not correctly written for local
universe DAG node jobs that have no log file specified in the submit file,
which causes DAGMan to wait forever, thinking the jobs have not completed.
Note that this problem can be worked around by specifying any
log file for the job, even log = /dev/null.
(This bug is a regression that was introduced some time since version
8.2.4.)
(Ticket #5299).
- DAG node retries do not work correctly with DAG node submit files
that create more than one proc in the resulting cluster (such nodes
cause DAGMan to hang if the retry is activated).
We believe that this bug has existed since DAGMan first supported
multi-proc node jobs.
(Ticket #5350).
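The workaround for the first known issue above (Ticket #5299) is to make sure every local universe node submit file names a log file. The Python sketch below is a hypothetical helper, not an HTCondor tool, that checks whether a submit description contains a log command. It uses a simple line-based match and ignores macros, includes, and other submit-language features.

```python
import re

# Matches a line such as "log = job.log". The command name is treated
# as case-insensitive here; spaces and tabs around "=" are allowed.
LOG_COMMAND = re.compile(r"^[ \t]*log[ \t]*=[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def has_log_command(submit_text):
    """Return True if the text contains a non-empty log command."""
    return LOG_COMMAND.search(submit_text) is not None


# Hypothetical submit descriptions for a local universe node job.
with_log = "universe = local\nexecutable = /bin/true\nlog = /dev/null\nqueue\n"
without_log = "universe = local\nexecutable = /bin/true\nqueue\n"
print(has_log_command(with_log), has_log_command(without_log))
```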
Version 8.4.1
Release Notes:
- HTCondor version 8.4.1 released on October 27, 2015.
Known Issues:
- Remote submit to an 8.4.1 condor_schedd is broken if file transfer is
used. This also means HTCondor-CE and HTCondor-C are broken. This bug will
be fixed in version 8.4.2.
(Ticket #4522).
- TCP_FORWARDING_HOST is disregarded by HTCondor clients
starting in version 8.3.6. This bug will be fixed in version 8.4.2 and 8.5.1.
(Ticket #5339).
New Features:
- Added support to allow an admin to always volume mount
certain directories into docker universe containers running
on a host.
(Ticket #5308).
- Added four policy metaknobs to simplify configuring a policy
to either preempt or hold jobs that use more memory
or CPU cores than provisioned in the slot. See the POLICY
category of metaknobs in section 3.3.1 for
additional information.
(Ticket #5250).
- Added configuration variables and documentation so that we uniformly prefer
<var>_ATTRS over <var>_EXPRS but support both. This includes
STARTD_ATTRS, STARTD_JOB_ATTRS and SUBMIT_ATTRS
which are often used by HTCondor sites which customize the configuration. These
configuration variables are now exclusively for use by HTCondor administrators;
the former default values for these variables have been moved into other configuration
which is reserved for use by HTCondor developers. This is done to prevent administrators
from accidentally removing the necessary defaults.
A warning about use of STARTD_EXPRS has been disabled unless
STARTD_ATTRS or SLOT_TYPE_<n>_STARTD_ATTRS is also used, since
the use of all three of these at the same time is not supported.
(Ticket #5326).
- When condor_reconfig and condor_restart are run as root
they will check to see if the condor user has read access to all of the
configuration files before sending the command. This is done to prevent aborting the daemons
accidentally by sending reconfig after the admin creates a new config file and
forgets to give the condor user read access to that file.
(Ticket #4506).
- Added the -natural sort option to condor_status to sort the slots
in numerical order rather than alphabetical order.
(Ticket #5131).
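To illustrate the difference between alphabetical and numerical ordering mentioned for the -natural option, the Python sketch below sorts a few slot names both ways. It is a generic natural-sort key, not the condor_status implementation, and the slot names are made up for this example.

```python
import re


def natural_key(name):
    """Split a name into text and integer pieces so digits compare numerically."""
    # re.split with a capturing group alternates text and digit runs, so
    # items at the same position in two keys always have the same type.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


# Hypothetical slot names.
slots = ["slot10@host", "slot2@host", "slot1@host"]
print(sorted(slots))                   # alphabetical order
print(sorted(slots, key=natural_key))  # numerical order
```

With the natural key, slot2 sorts before slot10 because the digit runs are compared as integers rather than character by character.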
Bugs Fixed:
Version 8.4.0
Release Notes:
- HTCondor version 8.4.0 released on September 14, 2015.
New Features:
Bugs Fixed:
- Fixed a bug introduced in HTCondor version 8.3.7 that caused the
condor_shared_port daemon to leak file descriptors.
Also made HTCondor work better when some HTCondor daemons
are using shared port, but the condor_master is not.
(Ticket #5259).
- The condor_starter lowers the OOM (out of memory) score of jobs
so the OOM killer is more likely to choose an HTCondor job rather than
an HTCondor daemon or other user process.
(Ticket #5249).
- Job submission fails if X.509 certificates are advertised with EC2
grid universe jobs.
Therefore EC2 grid universe jobs no longer advertise their access keys.
(Ticket #5252).