condor_dagman
meta scheduler of the jobs submitted as the nodes of a DAG or DAGs
condor_dagman
[-debug level]
[-rescue filename]
[-maxidle numberOfJobs]
[-maxjobs numberOfJobs]
[-maxpre NumberOfPREscripts]
[-maxpost NumberOfPOSTscripts]
[-noeventchecks]
[-allowlogerror]
[-usedagdir]
-lockfile filename
[-waitfordebug]
[-autorescue 0|1]
[-dorescuefrom number]
-csdversion version_string
[-allowversionmismatch]
[-DumpRescue]
[-verbose]
[-force]
[-notification value]
[-dagman DagmanExecutable]
[-outfile_dir directory]
[-update_submit]
[-import_env]
-dag dag_file
[-dag dag_file_2 ... -dag dag_file_n ]
condor_dagman is a meta scheduler for the Condor jobs within
a DAG (directed acyclic graph) or within multiple DAGs.
In typical usage,
a submitter of jobs that are organized into a DAG submits the
DAG using condor_submit_dag.
condor_submit_dag does error checking on aspects of the DAG
and then submits condor_dagman as a Condor job.
condor_dagman uses log files to coordinate the further
submission of the jobs within the DAG.
Because condor_dagman is part of daemoncore, the set of
command-line arguments given in
section 3.9.2
also works for condor_dagman.
Arguments to condor_dagman are either automatically set
by condor_submit_dag
or they are specified as command-line arguments to condor_submit_dag
and passed on to condor_dagman.
The description of each argument below states
how that argument is set.
condor_dagman can run multiple, independent DAGs. This is done
by specifying multiple -dag arguments.
To have condor_submit_dag set these arguments, pass multiple
DAG input files as command-line arguments to condor_submit_dag.
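As a minimal sketch of the multiple-DAG case, the Python helper below builds the repeated -dag arguments, one per DAG input file. The function name dag_arguments and the file names are hypothetical and exist only for illustration; the -dag flag itself is the one described on this page.

```python
def dag_arguments(dag_files):
    """Return the repeated -dag arguments for a list of DAG input files."""
    if not dag_files:
        # The synopsis lists at least one -dag argument.
        raise ValueError("at least one DAG input file is required")
    args = []
    for dag_file in dag_files:
        # Each DAG input file gets its own -dag flag.
        args.extend(["-dag", dag_file])
    return args


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(dag_arguments(["first.dag", "second.dag"]))
```

In normal use condor_submit_dag assembles these arguments itself, so the sketch only shows the shape of the resulting argument list.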
Debugging output may be obtained by using the
-debug level option.
The level values and the output they produce are:
- level = 0; never produce output,
except for usage info
- level = 1; very quiet, output severe errors
- level = 2; normal output, errors and warnings
- level = 3; output errors, as well as all warnings
- level = 4; internal debugging output
- level = 5; internal debugging output; outer loop debugging
- level = 6; internal debugging output; inner loop debugging;
output DAG input file lines as they are parsed
- level = 7; internal debugging output; rarely used;
output DAG input file lines as they are parsed
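The level table above can be captured as a small lookup, sketched here in Python. The names DEBUG_LEVELS, DEFAULT_DEBUG_LEVEL, and describe_debug_level are hypothetical; the descriptions are copied from the list above, and the default of 3 is the value this page says condor_submit_dag sets.

```python
# Descriptions copied from the level list on this page.
DEBUG_LEVELS = {
    0: "never produce output, except for usage info",
    1: "very quiet, output severe errors",
    2: "normal output, errors and warnings",
    3: "output errors, as well as all warnings",
    4: "internal debugging output",
    5: "internal debugging output; outer loop debugging",
    6: "internal debugging output; inner loop debugging; "
       "output DAG input file lines as they are parsed",
    7: "internal debugging output; rarely used; "
       "output DAG input file lines as they are parsed",
}

# Value set by condor_submit_dag when -debug is not given.
DEFAULT_DEBUG_LEVEL = 3


def describe_debug_level(level=DEFAULT_DEBUG_LEVEL):
    """Validate a -debug level and return its description."""
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError("level must be an integer")
    if level not in DEBUG_LEVELS:
        raise ValueError("level must be between 0 and 7 inclusive")
    return DEBUG_LEVELS[level]


if __name__ == "__main__":
    print(describe_debug_level())
    print(describe_debug_level(7))
```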
- -debug level
- Sets the level of debugging output.
level is an integer from 0 to 7 inclusive,
where 7 produces the most verbose output.
This command-line option to condor_submit_dag
is passed to condor_dagman; if it is not given,
condor_submit_dag sets the default value of 3.
- -rescue filename
- Sets the file name of
the rescue DAG to write in the case of a failure.
As passed by condor_submit_dag, the name of the file
will be the name of the DAG input file concatenated with
the string .rescue. This argument is now optional,
and it is generally preferable not to specify it, which allows
condor_dagman to generate an appropriate
rescue DAG name automatically.
- -maxidle NumberOfJobs
- Sets the maximum number of idle
jobs allowed before condor_dagman stops submitting more jobs.
If DAG nodes have a cluster with more than one job in it,
each job in the cluster is counted individually.
Once idle jobs start to run, condor_dagman will resume submitting jobs.
NumberOfJobs is a positive integer.
This command-line option to condor_submit_dag is passed to
condor_dagman.
If not specified, the number of idle jobs is unlimited.
- -maxjobs numberOfJobs
- Sets the maximum number of
clusters
within the DAG that will be submitted to Condor at one time.
numberOfJobs is a positive integer.
This command-line option to condor_submit_dag is passed to
condor_dagman.
If not specified, the default number of clusters is unlimited.
If a cluster contains more than one job,
only the cluster is counted for purposes of maxjobs.
- -maxpre NumberOfPREscripts
- Sets the maximum number
of PRE
scripts within the DAG that may be running at one time.
NumberOfPREscripts is a positive integer.
This command-line option to condor_submit_dag is passed to
condor_dagman.
If not specified,
the default number of PRE scripts is unlimited.
- -maxpost NumberOfPOSTscripts
- Sets the maximum number of
POST scripts within the DAG that may be running at one time.
NumberOfPOSTscripts is a positive integer.
This command-line option to condor_submit_dag is passed to
condor_dagman.
If not specified,
the default number of POST scripts is unlimited.
- -noeventchecks
- This argument is no longer used;
it is now ignored. Its functionality is now implemented by
the DAGMAN_ALLOW_EVENTS configuration macro
(see section 3.3.26).
- -allowlogerror
- This optional argument has
condor_dagman try to run the specified DAG, even in the case
of detected errors in the user log specification.
As of version 7.3.2, this argument has an effect only on
DAGs containing Stork job nodes.
- -usedagdir
- This optional argument causes
condor_dagman to run each specified DAG as if the directory
containing that DAG file were the current working directory. This
option is most useful when running multiple DAGs in a single
condor_dagman.
- -lockfile filename
- Names the file
created and used as a lock file.
The lock file prevents two instances of the same DAG,
as defined by a DAG input file, from executing at the same time.
A default lock file ending with the suffix .dag.lock
is passed to condor_dagman by condor_submit_dag.
- -waitfordebug
- This optional argument causes
condor_dagman to wait at startup until someone attaches to
the process with a debugger and sets the wait_for_debug
variable in main_init() to false.
- -autorescue 0|1
- Specifies whether to automatically run
the newest rescue DAG for the given DAG file, if one exists
(0 = false, 1 = true).
- -dorescuefrom number
- Forces condor_dagman to
run the specified rescue DAG number for the given DAG. A value
of 0 is the same as not specifying this option. Specifying a
nonexistent rescue DAG is a fatal error.
- -csdversion version_string
- version_string
is the version of the condor_submit_dag program. At startup,
condor_dagman checks for a version mismatch with the
condor_submit_dag version in this argument.
- -allowversionmismatch
- This optional argument causes
condor_dagman to allow a version mismatch between
condor_dagman itself and the .condor.sub file produced
by condor_submit_dag (or, in other words, between
condor_submit_dag and condor_dagman). WARNING! This option
should be used only if absolutely necessary. Allowing version
mismatches can cause subtle problems when running DAGs.
(Note that, starting with version 7.4.0, condor_dagman no longer
requires an exact version match between itself and the
.condor.sub file. Instead, a "minimum compatible version"
is defined, and any .condor.sub file of that version or
newer is accepted.)
- -DumpRescue
- This optional argument causes
condor_dagman to immediately dump a Rescue DAG and then exit,
as opposed to actually running the DAG. This feature is mainly
intended for testing. The Rescue DAG file is produced whether or not
there are parse errors reading the original DAG input file.
The name of the file differs if there was a parse error.
- -verbose
- (This argument is included only to be passed
to condor_submit_dag if lazy submit file generation is used for
nested DAGs.) Causes condor_submit_dag to give verbose error
messages.
- -force
- (This argument is included only to be passed
to condor_submit_dag if lazy submit file generation is used for
nested DAGs.) Requires condor_submit_dag to overwrite the files
that it produces, if the files already exist. Note that
dagman.out will be appended to, not overwritten. If
new-style rescue DAG mode is in effect, and any new-style rescue
DAGs exist, the -force flag will cause them to be renamed,
and the original DAG will be run. If old-style rescue DAG mode
is in effect, any existing old-style rescue DAGs will be deleted,
and the original DAG will be run. Section 2.10.7
details rescue DAGs.
- -notification value
- (This argument is only
included to be passed to condor_submit_dag if lazy submit file
generation is used for nested DAGs.) Sets the e-mail notification
for DAGMan itself. This information will be used within the Condor
submit description file for DAGMan. This file is produced by
condor_submit_dag. See notification
within the section of submit description file commands in the
condor_submit manual page
for the specification of value.
- -dagman DagmanExecutable
- (This argument is
included only to be passed to condor_submit_dag if lazy submit
file generation is used for nested DAGs.) Allows the
specification of an alternate condor_dagman executable to be
used instead of the one found in the user's path. This must be
a fully qualified path.
- -outfile_dir directory
- (This argument is included
only to be passed to condor_submit_dag if lazy submit file
generation is used for nested DAGs.) Specifies the directory in
which the .dagman.out file will be written. The
directory may be specified relative to the current
working directory as condor_submit_dag is executed, or
specified with an absolute path. Without this option, the
.dagman.out file is placed in the same directory as the
first DAG input file listed on the command line.
- -update_submit
- (This argument is included only to
be passed to condor_submit_dag if lazy submit file generation
is used for nested DAGs.) This optional argument causes an existing
.condor.sub file to not be treated as an error; rather, the
.condor.sub file will be overwritten, but the existing
values of -maxjobs, -maxidle, -maxpre, and
-maxpost will be preserved.
- -import_env
- (This argument is included only to be
passed to condor_submit_dag if lazy submit file generation is
used for nested DAGs.) This optional argument causes
condor_submit_dag to import the current environment into
the environment command of the .condor.sub file it
generates.
- -dag filename
- filename is the name of the
DAG input file that is set as an argument to condor_submit_dag,
and passed to condor_dagman.
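The -maxidle and -maxjobs options above count different things: -maxidle counts every idle job individually, while -maxjobs counts clusters regardless of how many jobs each contains. The Python sketch below illustrates that difference. The function can_submit_more and its parameters are hypothetical, None stands for an unspecified (unlimited) option, and the exact boundary behavior (blocking once a count reaches its limit) is an assumption rather than a statement about condor_dagman's implementation.

```python
def can_submit_more(idle_jobs, clusters_in_queue, maxidle=None, maxjobs=None):
    """Return True if neither throttle blocks another submission.

    None stands for "not specified", which the option descriptions
    treat as unlimited.
    """
    # -maxidle counts every idle job, even when several share a cluster.
    # Assumption: submission stops once the count reaches the limit.
    if maxidle is not None and idle_jobs >= maxidle:
        return False
    # -maxjobs counts clusters, however many jobs each one contains.
    if maxjobs is not None and clusters_in_queue >= maxjobs:
        return False
    return True


if __name__ == "__main__":
    # Two submitted clusters of five jobs each, with all ten jobs idle.
    print(can_submit_more(idle_jobs=10, clusters_in_queue=2, maxjobs=3))
    print(can_submit_more(idle_jobs=10, clusters_in_queue=2,
                          maxidle=10, maxjobs=3))
```

With two clusters of five idle jobs each, a -maxjobs value of 3 alone still permits another submission because only two clusters are counted, whereas adding a -maxidle value of 10 blocks it because all ten jobs are counted.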
condor_dagman will exit with a status value of 0 (zero) upon success,
and it will exit with the value 1 (one) upon failure.
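A script that inspects the exit status might map the two documented values as in the following Python sketch. The function name interpret_exit_status and the label for other values are hypothetical; this page documents only 0 and 1.

```python
def interpret_exit_status(status):
    """Map a condor_dagman exit status to a short label."""
    if status == 0:
        return "success"
    if status == 1:
        return "failure"
    # Only 0 and 1 are documented on this page.
    return "undocumented"


if __name__ == "__main__":
    for status in (0, 1):
        print(status, interpret_exit_status(status))
```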
condor_dagman is normally not run directly, but submitted as a Condor
job by running condor_submit_dag. See the condor_submit_dag manual
page for examples.
Center for High Throughput Computing, University of Wisconsin-Madison
Copyright © 1990-2012 Center for High Throughput Computing,
Computer Sciences Department,
University of Wisconsin-Madison, Madison, WI. All Rights Reserved.
Licensed under the Apache License, Version 2.0.
See the Condor Version 7.6.10 Manual or
http://research.cs.wisc.edu/htcondor/
for
additional notices.