Condor-PVM

Condor has a PVM submit Universe which allows the user to submit PVM jobs to the Condor pool.

The PVM Universe

Note that Condor-PVM is an optional Condor module. To check whether it is installed at your site, enter the following command:

%  ls -l `condor_config_val PVMD`
(Note the backticks in the command above.) If this lists the file "condor_pvmd" on your system, Condor-PVM is installed. If not, ask your site administrator to download Condor-PVM from the Condor contrib area at http://www.cs.wisc.edu/condor/downloads and install it.

We have created an example PVM program, "mandelbrot", for you. This program computes the Mandelbrot set and draws it on the screen line by line. The program fits the master-worker paradigm well.

The master program first creates the workers. It then sends a block of lines to be computed to each worker. When a result comes back from a worker, the master draws the lines on the screen. The master then sends another block of lines to this worker to compute. A worker program simply waits for the lines sent by the master, does the computation, and then sends the results back to the master.
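The exchange described above can be sketched in a few lines of Python. This is only an illustration of the master-worker pattern, not the source of the "master" and "worker" executables: it uses Python threads and queues in place of PVM tasks and messages, and every name and size in it (compute_line, BLOCK, WIDTH and so on) is an assumption made for the sketch.

```python
import queue
import threading

# Illustrative sizes only; they are not taken from the example program.
WIDTH, HEIGHT, MAX_ITER = 40, 20, 50
BLOCK = 5          # number of lines sent to a worker at a time
NUM_WORKERS = 2


def compute_line(y):
    """Return the escape-iteration count for every pixel on line y."""
    row = []
    for x in range(WIDTH):
        c = complex(-2.0 + 3.0 * x / WIDTH, -1.0 + 2.0 * y / HEIGHT)
        z, n = 0j, 0
        while abs(z) <= 2.0 and n < MAX_ITER:
            z = z * z + c
            n += 1
        row.append(n)
    return row


def worker(worker_id, inbox, results):
    """Wait for a block of lines, compute it, send it back; stop on None."""
    while True:
        block = inbox.get()
        if block is None:
            return
        start, stop = block
        rows = [compute_line(y) for y in range(start, stop)]
        results.put((worker_id, start, rows))


def master():
    """Hand out blocks of lines and collect results until the image is done."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
    threads = [threading.Thread(target=worker, args=(i, inboxes[i], results))
               for i in range(NUM_WORKERS)]
    for t in threads:
        t.start()

    blocks = [(s, min(s + BLOCK, HEIGHT)) for s in range(0, HEIGHT, BLOCK)]
    pending = 0
    for inbox in inboxes:              # first, one block to each worker
        if blocks:
            inbox.put(blocks.pop(0))
            pending += 1

    image = [None] * HEIGHT
    while pending:
        worker_id, start, rows = results.get()
        pending -= 1
        image[start:start + len(rows)] = rows   # a real master would draw here
        if blocks:                     # give the worker that just finished more work
            inboxes[worker_id].put(blocks.pop(0))
            pending += 1

    for inbox in inboxes:              # tell every worker to exit
        inbox.put(None)
    for t in threads:
        t.join()
    return image


if __name__ == "__main__":
    picture = master()
    print(len(picture), "lines computed")
```

Because the master refills only the worker that has just returned a result, a faster worker automatically receives more blocks than a slower one.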

Preliminary: Please change to the example directory by typing:
%  cd ~/workbook/pvm/mandelbrot

We have created the executable files "master" and "worker" for you. The submit-description file is "submit_pvm". Let us take a look at the submit file first.

##############################################
# PVM submit file for drawing mandelbrot graph
##############################################

# submit to PVM universe
universe = PVM

# the executable name of the master pvm program is "master"
# the program "worker" will be spawned during the execution of "master"
executable = master

# arguments: number_of_workers  number_of_lines_sent_to_a_worker  display
arguments = 2 10 :0.0

# Let file "out_master" be the stdout
output = out_master

# Let file "err_master" be the stderr
error = err_master

# machine_count = min..max
machine_count = 2..2

queue

Note that "universe = PVM" tells Condor that this job should be submitted to the PVM universe.

The line "machine_count = 2..2" tells Condor not to start this program if fewer than 2 machines in architecture class 0 are available. It also tells Condor that the program needs at most 2 machines in class 0.
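One way to read the two numbers is as a lower and an upper bound on the machines the job starts with. The Python sketch below encodes only that reading. It is a simplified model, not Condor's scheduling logic: the function names are hypothetical, and capping the starting set at "max" is an assumption drawn from the description above.

```python
def parse_machine_count(value):
    """Split a 'min..max' string into two integers."""
    low, high = value.split("..")
    low, high = int(low), int(high)
    if low > high:
        raise ValueError("min must not exceed max")
    return low, high


def machines_at_start(value, available):
    """Return how many machines the job would start with, or None if it waits."""
    low, high = parse_machine_count(value)
    if available < low:
        return None                 # fewer than min: the job does not start
    return min(available, high)     # assumption: never more than max at start


if __name__ == "__main__":
    for free in (1, 2, 5):
        print(free, "free ->", machines_at_start("2..2", free))
```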

Now, let's submit this program to Condor by typing:

%  condor_submit submit_pvm

In this run, the master requires two workers, as given by the first argument in the submit file. We can look at the output file "out_master" to see the configuration of the PVM virtual machine and the activity of the master. Type

% cat out_master
to view the contents of the file "out_master". The file "work_stdout" contains the stdout of the worker program.

You can enlarge and shrink the display box by typing "e" and "s", respectively. To zoom in on a region, press the left mouse button and drag across it. To quit, type "q".

Now let's change the arguments to the master program to require more workers, say 4. We need the following change in submit_pvm:

arguments = 4 10 :0.0
We then submit the program again by typing:
%  condor_submit submit_pvm
By examining the file "out_master", we see that the program starts with 2 remote machines. The master then requests 2 more remote machines to be added to the PVM virtual machine before it spawns off the two workers.

Now let's set both the "min" and "max" fields of machine_count to 4 and resubmit the program. By examining "out_master", we see that when the master program starts, there are already four remote machines.