

5.1 Introduction

A goal of grid computing is to allow the utilization of resources that span many administrative domains. A Condor pool often includes resources owned and controlled by many different people. Yet collaborating researchers from different organizations may not find it feasible to combine all of their computers into a single, large Condor pool. Grid computing with Condor addresses this situation.

Because grid computing is a rapidly evolving field, Condor provides both its own native mechanisms for grid computing and interfaces to other grid systems.

Flocking is a native mechanism that allows Condor jobs submitted from within one pool to execute on another, separate Condor pool. Flocking is enabled by configuration within each of the pools. One advantage of flocking is that jobs migrate from one pool to another based on the availability of machines to execute jobs: when the local Condor pool is not able to run the job (due to a lack of currently available machines), the job flocks to another pool. A second advantage is that the user who submits the job does not need to do anything special to allow it; the submit description file (and the job's universe) are independent of the flocking mechanism.
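As an illustration, the following configuration fragments sketch how flocking might be enabled between two pools. The macro names FLOCK_TO and FLOCK_FROM are the ones Condor uses, but the host names are placeholders; the section on connecting Condor pools describes the complete configuration, including the write authorization that the receiving pool must also grant to the flocking submit machines.

  # In the condor_config of the submit machines in pool A:
  # send jobs to pool B when pool A has no available machines.
  FLOCK_TO = central-manager.pool-b.example.org

  # In the condor_config of pool B (the pool that accepts flocked jobs):
  # list the submit machines that are permitted to flock in.
  FLOCK_FROM = submit.pool-a.example.org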

Other forms of grid computing are enabled by using the grid universe and are further specified with the grid_type. Any such job is submitted from a machine in the local Condor pool; the location where it executes is identified as the remote machine or remote resource. The various grid computing mechanisms offered by Condor are distinguished by the software running on the remote resource.

When Condor is running on the remote resource, and the desired grid computing mechanism is to move the job from the local pool's job queue to the remote pool's job queue, the mechanism is called Condor-C. The job is submitted using the grid universe, and the grid_type is condor. Condor-C jobs have the advantage that once the job has moved to the remote pool's job queue, a network partition does not affect the execution of the job. A further advantage of Condor-C jobs is that the universe of the job at the remote resource is not restricted.
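A Condor-C submit description file might look like the following sketch. The remote condor_schedd and central manager host names are placeholders; the full syntax of the grid_resource command for the condor grid type is described later in this chapter.

  universe      = grid
  grid_resource = condor remote-schedd.example.org remote-cm.example.org
  executable    = analysis
  output        = analysis.out
  error         = analysis.err
  log           = analysis.log
  queue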

When other middleware, such as Globus, is running on the remote resource, Condor can still submit and manage jobs to be executed on remote resources. A grid universe job with a grid_type of gt2 or gt4 calls on Globus software to execute the job on a remote resource. As with Condor-C jobs, a network partition does not affect the execution of the job. The remote resource must have Globus software running.
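For example, a gt2 job might be described as in the following sketch; the gatekeeper host name and job manager are placeholders for a site running pre-web services Globus GRAM.

  universe      = grid
  grid_resource = gt2 gatekeeper.example.org/jobmanager-pbs
  executable    = analysis
  output        = analysis.out
  error         = analysis.err
  log           = analysis.log
  queue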

Condor also facilitates the temporary addition of a Globus-controlled resource to a local pool. This is called glidein. Globus software is utilized to execute Condor daemons on the remote resource. The remote resource appears to have joined the local Condor pool. A user submitting a job may then explicitly specify the remote resource as the execution site of a job.
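Once a resource has glided in, a job can be directed to it with an ordinary submit description file. In this sketch, the machine name is a placeholder for the glided-in resource as it appears in the local pool; the requirements expression restricts the job to that machine.

  universe     = vanilla
  executable   = analysis
  requirements = ( Machine == "node07.remote-site.example.org" )
  output       = analysis.out
  error        = analysis.err
  log          = analysis.log
  queue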

Starting with Condor Version 6.7.0, the grid universe replaces the globus universe. Further specification of a grid universe job is done within the grid_resource command in a submit description file.
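As a sketch of this change, a job that was previously submitted with the globus universe and the globusscheduler command is now written with the grid universe and the grid_resource command; the gatekeeper host name below is a placeholder.

  # Old syntax, prior to Condor Version 6.7.0:
  universe        = globus
  globusscheduler = gatekeeper.example.org/jobmanager-fork

  # Equivalent syntax using the grid universe:
  universe      = grid
  grid_resource = gt2 gatekeeper.example.org/jobmanager-fork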

