Using Stork with Condor DAGMan
Condor DAGMan supports
data placement jobs using Stork, as well as traditional CPU "number crunching"
jobs. Further, CPU jobs can be run on a local Condor pool, or on remote Grid
sites using Condor-G.
Thus, you can specify a DAG to:
- Stage in your data, using Stork
- Process your data, using Condor
- Stage out your data, using Stork
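For example, a three-node DAG covering these steps might look like the sketch below. The node and file names are placeholders; DATA nodes are submitted to the Stork server, while JOB nodes are submitted to Condor (or to remote Grid sites via Condor-G).
# my_workflow.dag -- hypothetical three-node DAG
# DATA nodes are handled by Stork; JOB nodes by Condor/Condor-G
DATA stage_in stage_in.stork
JOB crunch crunch.condor
DATA stage_out stage_out.stork
PARENT stage_in CHILD crunch
PARENT crunch CHILD stage_out
Each .stork file referenced above would be a Stork submit description. A minimal transfer request, assuming hypothetical source and destination URLs, might look like:
[
    dap_type = "transfer";
    src_url  = "file:/home/user/input.dat";
    dest_url = "gsiftp://gridhost.domain/scratch/input.dat";
]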
The Condor Project presented a hands-on tutorial on using Stork, Condor-G, and DAGMan together at the 2nd International Summer School on Grid Computing in July 2004.
Important Note: The interface for specifying the Stork server to DAGMan
has changed since this tutorial was assembled. The Stork server is now set via
the STORK_SERVER macro in the Condor configuration
file in all current releases of Condor. For example, your Condor
configuration file should now contain an entry such as:
STORK_SERVER = storkhost.domain
In the above tutorial, as well as in Condor versions 6.5.5, 6.6.0 through 6.6.5
(inclusive), and 6.7.0 through 6.7.2 (inclusive), the Stork server was specified
with the -Storkserver option on the condor_submit_dag
command line. For example:
condor_submit_dag ... -Storkserver storkhost.domain -storklog /path/to/Stork.user_log ... # OBSOLETE INTERFACE!
The -storklog command-line option to condor_submit_dag has
always been required, and it remains required with the current interface.
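With the current interface, then, only the log file needs to be given on the command line; the server host comes from the configuration file. A typical invocation (the DAG file name here is a placeholder) would be:
condor_submit_dag -storklog /path/to/Stork.user_log my_workflow.dag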
Also, in the current architecture, the Stork server must be run at the user
level (not root) to be compatible with DAGMan. This may change in a future
release.