10.3 Stable Release Series 8.6
This is a stable release series of HTCondor.
As usual, only bug fixes (and potentially, ports to new platforms)
will be provided in future 8.6.x releases.
New features will be added in the 8.7.x development series.
The details of each version are described below.
Version 8.6.1
Release Notes:
- HTCondor version 8.6.1 released on March 2, 2017.
New Features:
- condor_q now checks whether authentication and security negotiation are enabled before attempting to
request only the current user's jobs from the condor_schedd. Prior to this change, configurations that disabled
security or authentication also needed to set CONDOR_Q_ONLY_MY_JOBS to false.
(Ticket #6125).
- The CLAIMTOBE authentication method is now in the list of methods for READ access if no list of
authentication methods for READ or DEFAULT is specified in the configuration. This change allows sites that
use the default host based security model to use condor_q -global with the only-my-jobs feature
without making changes to their security configuration.
(Ticket #6125).
- The collector now records the authentication method used to determine the authenticated identity.
(Ticket #6122).
Bugs Fixed:
- Updated the Docker interface so that, with Docker version 1.13, it can
retrieve usage information from running containers and remove containers
when certain errors occur.
(Ticket #6088).
- In Docker universe, writes to files in /tmp and /var/tmp go inside the
container by default. There is a limit on file size within the container,
and jobs that write a lot to /tmp may hit it. When a Docker universe job runs
on a system with MOUNT_UNDER_SCRATCH defined, HTCondor now adds those
mounts as volume mounts, so file writes go to the host file system instead of
the container.
(Ticket #6080).
- Fixed a bug in condor_status -format and condor_q -format that caused the
tools to truncate output to the width specified in the format specifier. The most likely manifestation of
this bug was that punctuation after the format would not be printed when the format had an explicit width.
(Ticket #6120).
- Fixed a bug that caused spurious shared port-related error
messages to appear in the dagman.out file (by adding the
new DAGMAN_USE_SHARED_PORT configuration macro).
(Ticket #6156).
- Fixed a bug that caused VM universe jobs to fail if the
vm_disk submit command contained spaces after a comma.
(Ticket #6132).
- Fixed a bug that could cause the Job Router and condor_c-gahp to
crash if they failed to submit a job due to submit transforms or
submit requirements.
(Ticket #6152).
- Fixed a bug that caused the Job Router to not route any jobs if
the JOB_ROUTER_DEFAULTS configuration parameter value
started with white space.
(Ticket #6128).
- Fixed several bugs in how the Job Router writes to job event logs.
(Ticket #6092).
- Removed Bosco's attempt to configure a default value for
grid_resource in the submit description file, as
condor_submit no longer supports this ability.
Also, Bosco now works with Slurm clusters.
(Ticket #6106).
- Changed Bosco's configuration of the condor_ft-gahp to eliminate
worrying error messages in the condor_ft-gahp's log file.
(Ticket #6107).
- Fixed a bug that could cause a grid batch job submitted to PBS or
Slurm to go on hold when the job's X.509 proxy is refreshed.
(Ticket #6136).
- Fixed a bug where the condor_gridmanager failed to put a job on
hold because the desired hold reason contained invalid characters.
(Ticket #6142).
- Improved the hold reason when submission of a grid-type batch
job fails.
(Ticket #3377).
- Updated helper scripts to work with current versions of Open MPI and MPICH2.
(Ticket #6024).
- Fixed a bug that could cause events for local universe jobs to not
be written to the global event log.
(Ticket #6100).
- Fixed a bug on execute machines that enable PID namespaces that
would generate a spurious error message in the daemon log when condor_off -fast was issued.
(Ticket #6137).
- Fixed a bug that could corrupt the job queue log file such that
the condor_schedd cannot restart.
The bug is most likely to occur if the disk becomes full.
(Ticket #6153).
- Incremented the ClassAd library version number, since the deprecated
iostream interface has been removed.
(Ticket #6050).
(Ticket #6115).
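The -format fix above concerns printf-style width handling: an explicit width is a minimum field width, so a longer value should not be cut and any literal text after the conversion should still be printed. As a rough illustration only, the following uses Python's printf-style operator rather than HTCondor's own formatter; the helper name render and the sample values are hypothetical.

```python
# Illustration only: Python's printf-style operator, not HTCondor code.
# A width such as %-8s pads short values; it must not truncate long ones,
# and literal text after the conversion (here ", ") must always be emitted.

def render(fmt, value):
    """Hypothetical helper: apply one printf-style format to one value."""
    return fmt % value

short = render("%-8s, ", "slot1")               # shorter than the width: padded
long_ = render("%-8s, ", "slot1@execute-node")  # longer than the width: kept whole

print(repr(short))
print(repr(long_))
```

In both cases the trailing punctuation survives, which is the behavior the fixed tools are expected to show for formats with an explicit width.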
Version 8.6.0
Release Notes:
- HTCondor version 8.6.0 released on January 26, 2017.
New Features:
- Added two new job ClassAd attributes, CumulativeRemoteSysCpu and
CumulativeRemoteUserCpu, which keep a running total of system and user
CPU usage, respectively, across all job restarts. Also, the attributes
RemoteSysCpu and RemoteUserCpu are now cleared immediately on job start, instead of on the first update.
(Ticket #6022).
- Added a new configuration knob, ALWAYS_REUSEADDR, which defaults
to True. When True, it tells HTCondor to set the
SO_REUSEADDR socket option, so that
the schedd can run large numbers of very short jobs without exhausting the
number of local ports needed for shadows.
(Ticket #6040).
- Changed the default value of IGNORE_LEAF_OOM to True.
(Ticket #5775).
Bugs Fixed:
- Fixed a bug causing unnecessarily slow updates from the condor_startd.
If you depend on the old behavior, set UPDATE_SPREAD_TIME to 8. A
value of 0 enables the fix.
(Ticket #6062).
- Fixed a race condition when running multiple concurrent jobs on the same claim.
When the starter exits, it notifies the shadow, which tells the startd to kill the starter.
Immediately after the shadow tells the startd, it fetches the next job and tries to start it.
If the previous starter has not completely exited yet (perhaps it needs to clean up a large sandbox),
it notices that the shadow has closed the command socket, goes into disconnected
mode, and gets confused. This has been fixed.
(Ticket #6049).
- Fixed a problem with condor_submit -i and docker universe,
where it would start an interactive shell without a container. Added an error
message stating that this combination is not currently supported.
(Ticket #6083).
- When a job claimed by the Job Router is held or removed, it is no
longer considered a failure of the job route chosen for that job.
(Ticket #5968).
- Fixed a bug in recovering a Google Compute Engine (GCE) job if the
condor_gridmanager restarts during submission of the instance request.
(Ticket #6078).
- Fixed a bug that could cause re-installation of a remote cluster
to fail in Bosco.
(Ticket #6042).
- Fixed a bug with handling the proxy files of grid-type batch jobs
when the proxy's file name is a relative path.
(Ticket #6053).
- Fixed a bug that caused the batch_gahp to crash when a job's
X.509 proxy is refreshed and the batch_gahp is configured to not
create a limited copy of the proxy.
(Ticket #6051).
- Fixed a bug in the virtual machine universe where RequestMemory
and RequestCPUs were not changing the resources assigned to the VM
created by HTCondor. Now, VM_Memory defaults to RequestMemory,
and the number of CPUs defaults to RequestCPUs.
(Ticket #5998).
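The defaulting rule in the last item can be pictured with a small sketch. This is not the HTCondor implementation: a plain dict stands in for the job ClassAd, the function name resolve_vm_resources is hypothetical, and so is the key "VM_CPUs", since the note above does not name the attribute that holds the VM's CPU count.

```python
def resolve_vm_resources(job):
    """Fill in VM sizing from the job's general resource requests.

    "VM_Memory", "RequestMemory" and "RequestCPUs" come from the note
    above; "VM_CPUs" is a hypothetical key used only in this sketch.
    """
    resolved = dict(job)  # leave the caller's dict untouched
    if "RequestMemory" in job:
        # An explicit VM_Memory wins; otherwise fall back to RequestMemory.
        resolved.setdefault("VM_Memory", job["RequestMemory"])
    if "RequestCPUs" in job:
        resolved.setdefault("VM_CPUs", job["RequestCPUs"])
    return resolved

defaulted = resolve_vm_resources({"RequestMemory": 2048, "RequestCPUs": 2})
explicit = resolve_vm_resources({"RequestMemory": 2048, "VM_Memory": 1024})

print(defaulted)
print(explicit)
```

An explicitly set value is kept as-is, and the request values only fill in what is missing.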