Note that Condor-PVM is an optional Condor module. To check and see if it has been installed at your site, enter the following command:
% ls -l `condor_config_val PVMD`
(Notice the use of backticks in the above command.) If this shows the file "condor_pvmd" on your system, Condor-PVM is installed. If not, ask your site administrator to download Condor-PVM from the Condor contrib area at http://www.cs.wisc.edu/condor/downloads and install it.
We have created an example PVM program, "mandelbrot", for you. This program computes and draws a Mandelbrot graph on the screen line by line. The program fits the "Master-Worker Paradigm" well.
The master program first creates the workers. It then sends a block of lines to be computed to each worker. When a result comes back from a worker, the master draws the lines on the screen. The master then sends another block of lines to this worker to compute. A worker program simply waits for the lines sent by the master, does the computation, and then sends the results back to the master.
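The dispatch loop described above can be sketched in a few lines of Python. This is only an illustration of the master-worker pattern, not the workbook's PVM program: it uses a thread pool in place of PVM tasks, the image size and iteration limit are made-up values, and the names `master`, `worker`, and `escape_count` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

# Illustrative sizes only; they are not taken from the workbook program.
WIDTH, HEIGHT, MAX_ITER = 60, 24, 50


def escape_count(c, max_iter=MAX_ITER):
    """Iterations of z -> z*z + c before |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def worker(first_line, num_lines):
    """Worker: compute one block of lines and send it back to the master."""
    rows = []
    for y in range(first_line, first_line + num_lines):
        im = -1.2 + 2.4 * y / (HEIGHT - 1)
        rows.append([escape_count(complex(-2.0 + 3.0 * x / (WIDTH - 1), im))
                     for x in range(WIDTH)])
    return first_line, rows


def master(num_workers, lines_per_block):
    """Master: give each worker a block, then refill whoever finishes."""
    blocks = [(start, min(lines_per_block, HEIGHT - start))
              for start in range(0, HEIGHT, lines_per_block)]
    image = [None] * HEIGHT
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Send one initial block of lines to each worker.
        pending = {pool.submit(worker, *blocks.pop(0))
                   for _ in range(min(num_workers, len(blocks)))}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                first_line, rows = future.result()
                # "Draw" the returned lines by storing them in the image.
                image[first_line:first_line + len(rows)] = rows
                if blocks:
                    # Hand the now-idle worker another block of lines.
                    pending.add(pool.submit(worker, *blocks.pop(0)))
    return image


if __name__ == "__main__":
    # Same first two arguments as the submit file: 2 workers, 10 lines each.
    for row in master(2, 10):
        print("".join("#" if n == MAX_ITER else "." for n in row))
```

Because the master only hands out a new block when a result arrives, faster workers naturally receive more blocks than slower ones.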
% cd ~/workbook/pvm/mandelbrot
We have created the executable files "master" and "worker" for you. The submit-description file is "submit_pvm". Let us take a look at the submit file first.
##############################################
# PVM submit file for drawing mandelbrot graph
##############################################

# submit to PVM universe
universe = PVM

# the executable name of the master pvm program is "master"
# the program "worker" will be spawned during the execution of "master"
executable = master

# arguments: number_of_workers number_of_lines_sent_to_a_worker display
arguments = 2 10 :0.0

# Let file "out_master" be the stdout
output = out_master

# Let file "err_master" be the stderr
error = err_master

# machine_count = min..max
machine_count = 2..2

queue
Note that "universe = PVM" tells Condor that this job should be submitted to the PVM universe.
The line "machine_count = 2..2" tells Condor not to start this program if the number of machines in architecture class 0 is less than 2. It also tells Condor that the maximum number of machines the program needs in class 0 is 2.
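To make the min..max reading of machine_count concrete, here is a small Python sketch. It is not part of Condor; `parse_machine_count` and `may_start` are hypothetical helpers that only restate the rule described above, and the sketch deliberately rejects any value that is not written in the "min..max" form.

```python
def parse_machine_count(value):
    """Split a 'min..max' machine_count value into two integers."""
    low, sep, high = value.strip().partition("..")
    if not sep:
        # This sketch only handles the 'min..max' form used in the tutorial.
        raise ValueError(f"expected 'min..max', got {value!r}")
    minimum, maximum = int(low), int(high)
    if minimum > maximum:
        raise ValueError("min must not exceed max")
    return minimum, maximum


def may_start(available_machines, machine_count):
    """The job is not started while fewer than 'min' machines are available."""
    minimum, _ = parse_machine_count(machine_count)
    return available_machines >= minimum
```

For example, `may_start(1, "2..2")` is false and `may_start(2, "2..2")` is true.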
Now, let's submit this program to Condor by typing:
% condor_submit submit_pvm
In this run, the master requires one worker. We can look at the output file "out_master" to see the configuration of the PVM virtual machine, and the activity of the master. Type
% cat out_master
to view the contents of file out_master. The file "work_stdout" displays the stdout of the worker program.
You can enlarge and shrink the display box by typing "e" and "s". To zoom in on a certain region, just "press and drag" the left mouse button. To quit, type "q".
Now let's change the arguments to the master program to require more workers, say 4. We need the following change in submit_pvm:
arguments = 4 10 :0.0
Submit the program again by typing:
% condor_submit submit_pvm
By examining the file "out_master", we see that the program starts with 2 remote machines. The master will request 2 other remote machines to be added to PVM before it spawns off the two workers.
Now let's set the "min" and "max" fields of machine_count to 4 and re-submit the program. By examining "out_master", we see that when the master program starts, there are already four remote machines.
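The arithmetic behind the two runs can be written out as a short Python sketch. It assumes one worker per remote machine, which matches the numbers reported above but is not stated as a general rule; `machines_to_add` is a hypothetical helper, not a Condor or PVM function.

```python
def machines_to_add(workers_requested, machines_at_start):
    """Extra machines the master must request before spawning all workers.

    Assumption for this sketch: each worker runs on its own remote machine.
    """
    return max(0, workers_requested - machines_at_start)


# machine_count = 2..2 with 4 workers: the job starts with 2 machines.
extra_with_2 = machines_to_add(4, 2)
# machine_count = 4..4 with 4 workers: the job starts with 4 machines.
extra_with_4 = machines_to_add(4, 4)
print(extra_with_2, extra_with_4)
```

In the first case the master has to grow the virtual machine at run time; in the second, all machines are already present when the master starts.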