Title: Condor Practical
Subtitle: Submitting a VM Universe Job
Tutor: Alain Roy and Todd Tannenbaum
Authors: Alain Roy and Ben Burnett

9.0 Submitting a VM universe job

9.1 What is the VM universe?

With the VM universe, Condor jobs can be virtual machines instead of simple executables. Virtual machines give users more flexibility in the types of jobs they can submit: an application written for one platform can run on top of an arbitrary platform without being ported to it. The VM universe supports several virtual machine applications. Today we will look at VMware Server, but similar jobs can be run using Xen and others.

9.2 Submitting a VM job

For your convenience, we have created a VM for this exercise. It is a small Linux VM. You should download the configuration file and disk image to your local test directory.

Create a submit file. Name this file simple.vm.sub.

Universe                     = vm
Executable                   = any_name_you_like
Log                          = simple.vm.log.txt
vm_type                      = vmware
vm_memory                    = 64
vmware_dir                   = C:\condor-test
vmware_should_transfer_files = Yes
Queue

Note that there is no real executable in this universe; as mentioned above, the VM itself is the job. So why do we have an executable name? It identifies the job when you run condor_q. Accordingly, you can change it to something more representative, such as linux_vm_test. You may also wish to put your name in it if you are all running as Administrator.

Now submit your job:

C:\condor-test> condor_submit simple.vm.sub
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 26.

C:\condor-test> condor_q

-- Submitter: leovinus : <128.105.48.96:50589> : leovinus
 ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
   6.0   aroy           11/20 15:31   0+00:02:46 R  0   0.0  any_name_you_like

1 jobs; 0 idle, 1 running, 0 held


-- Submitter: leovinus : <128.105.48.96:50589> : leovinus
 ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
   6.0   aroy           11/20 15:31   0+00:02:56 R  0   0.0  any_name_you_like

1 jobs; 0 idle, 1 running, 0 held


-- Submitter: leovinus : <128.105.48.96:50589> : leovinus
 ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
   6.0   aroy           11/20 15:31   0+00:03:06 R  0   0.0  any_name_you_like

1 jobs; 0 idle, 1 running, 0 held


-- Submitter: leovinus : <128.105.48.96:50589> : leovinus
 ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD
   6.0   aroy           11/20 15:31   0+00:03:16 R  0   0.0  any_name_you_like

1 jobs; 0 idle, 1 running, 0 held

...

-- Submitter: leovinus : <128.105.48.96:50589> : leovinus
 ID      OWNER/NODENAME   SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 idle, 0 running, 0 held
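Instead of rerunning condor_q by hand and reading the last line, a script can check the summary line. The following is a minimal Python sketch and not part of the original exercise. The helper names parse_queue_summary and queue_is_empty are hypothetical, and the sketch assumes the summary line has exactly the form shown in the output above; other Condor versions may print it differently.

```python
import re

# Matches the summary line at the end of the condor_q output shown above,
# e.g. "1 jobs; 0 idle, 1 running, 0 held". This format is an assumption
# taken from the transcript; it is not guaranteed for every Condor version.
SUMMARY_RE = re.compile(r"(\d+) jobs; (\d+) idle, (\d+) running, (\d+) held")


def parse_queue_summary(condor_q_output):
    """Return the job counts from condor_q text, or None if no summary line is found."""
    match = SUMMARY_RE.search(condor_q_output)
    if match is None:
        return None
    total, idle, running, held = (int(n) for n in match.groups())
    return {"total": total, "idle": idle, "running": running, "held": held}


def queue_is_empty(condor_q_output):
    """True when the summary line reports zero jobs in the queue."""
    summary = parse_queue_summary(condor_q_output)
    return summary is not None and summary["total"] == 0


if __name__ == "__main__":
    print(parse_queue_summary("1 jobs; 0 idle, 1 running, 0 held"))
    print(queue_is_empty("0 jobs; 0 idle, 0 running, 0 held"))
```

To use it, save the output of condor_q to a file or capture it from a script, then pass the text to queue_is_empty.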

The first time the image starts up, it will run its fake "job", which takes 10-15 minutes: just enough time to ask your instructors some difficult questions. The second time the image starts, the job does nothing, so that we can open the image using VMware Server and view /root/job.out for the results.

While you wait for the job to finish, it may be possible to view the running VM by launching VMware Server Console:

Start Menu > VMware > VMware Server > VMware Server Console.

We say that it may be possible, because your job may be running on a machine other than your own. However, if there is a job running, you will see something similar to this:

[Screenshot: VMware Server Console]

In this case, Ben Burnett is running a job on leovinus; it is process number 0 in cluster 2.

Note that this view gives us little to no information, except that the job is running. We'll get back to this in a while, but for now we'll explain what is going on in the background.

This is how the Linux image works: the job is run from /etc/rc.d/rc.start/60.job. It invokes /root/job in the background to do the actual work. The job itself will run if and only if /root/job.out doesn't exist. This is so you can extract the output during the next boot. By removing /root/job.out you can force the job to run again.
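The run-once behaviour described above can be illustrated in a few lines. This is a Python sketch of the logic only. It is not the actual /root/job or 60.job script from the image, whose contents are not shown here. The function name run_job_once and the temporary file standing in for /root/job.out are assumptions made for illustration.

```python
import os
import tempfile


def run_job_once(output_path, job):
    """Run job() only if output_path does not exist yet.

    Returns True if the job ran, False if it was skipped.
    """
    if os.path.exists(output_path):
        return False  # output from an earlier boot is still there: do nothing
    result = job()
    with open(output_path, "w") as out:
        out.write(result)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_file = os.path.join(tmp, "job.out")  # stands in for /root/job.out
        print(run_job_once(out_file, lambda: "done\n"))  # first boot: the job runs
        print(run_job_once(out_file, lambda: "done\n"))  # second boot: skipped
        os.remove(out_file)                              # remove the output file
        print(run_job_once(out_file, lambda: "done\n"))  # the job runs again
```

The existence of the output file is the only state that is kept, which is why deleting it is enough to make the job run again.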

After waiting 15 minutes or so, you can check on the status of your job using more:

C:\condor-test> more simple.vm.log.txt

000 (013.000.000) 06/20 01:41:16 Job submitted from host: <193.10.156.74:40295>
...
001 (013.000.000) 06/20 01:41:20 Job executing on host: <193.10.156.74:40304>
...
005 (013.000.000) 06/20 01:41:35 Job terminated.
        (1) Normal termination (return value 0)
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Run Local Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Remote Usage
                Usr 0 00:00:00, Sys 0 00:00:00  -  Total Local Usage
        0  -  Run Bytes Sent By Job
        0  -  Run Bytes Received By Job
        0  -  Total Bytes Sent By Job
        0  -  Total Bytes Received By Job
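Each event in the log starts with a three-digit event number followed by the job ID in parentheses. The following Python sketch, which is not part of the original exercise, reads those header lines and reports the most recent event. The names parse_events and job_state are hypothetical, and the pattern assumes header lines formatted exactly like the excerpt above. Only the three event numbers seen in that excerpt are mapped.

```python
import re

# Event numbers that appear in the log excerpt above.
EVENT_NAMES = {"000": "submitted", "001": "executing", "005": "terminated"}

# An event header looks like:
#   005 (013.000.000) 06/20 01:41:35 Job terminated.
EVENT_RE = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\) (\S+ \S+) (.*)$")


def parse_events(log_text):
    """Return a list of (event_code, cluster, proc, timestamp, message) tuples."""
    events = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if match:
            code, cluster, proc, _subproc, stamp, message = match.groups()
            events.append((code, int(cluster), int(proc), stamp, message))
    return events


def job_state(log_text):
    """Name of the most recent known event, or None if the log has none."""
    known = [event for event in parse_events(log_text) if event[0] in EVENT_NAMES]
    return EVENT_NAMES[known[-1][0]] if known else None


if __name__ == "__main__":
    with open("simple.vm.log.txt") as log_file:
        print(job_state(log_file.read()))
```

Indented detail lines and the "..." separators do not match the header pattern, so they are skipped.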

Congratulations, you've submitted a VM job to Condor!


9.3 VMs on your Condor pool

Condor keeps track of which computers have a functional VMware Server and which version it is. You can find this out by using condor_status:

C:\condor-test> condor_status -vm

Name               VMType Ver        State     Activity LoadAv VMMe ActvtyTime  VMNetworking

slot1@leovinus     vmware server1.0. Unclaimed Idle     0.000   256  0+00:10:04 [Not-Supported]
slot2@leovinus     vmware server1.0. Unclaimed Idle     0.120   256  0+00:10:05 [Not-Supported]
...

                     Total Owner Claimed Unclaimed Matched Preempting Backfill

       INTEL/WINNT60     7     0       0         7       0          0        0

               Total     7     0       0         7       0          0        0

C:\condor-test> condor_status -l slot1@leovinus | find "VM"
...
HasVM = TRUE
VM_AvailNum = 10000
VM_GAHP_VERSION = "0.0.1"
VM_Type = "vmware"
VM_Version = "server1.0.4"
VM_Memory = 256
VM_Networking = FALSE
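The attribute lines above are simple "Name = value" pairs, so a script can read them to check whether a slot could host the exercise VM. The following Python sketch is not part of the original exercise. The names parse_classad_lines and looks_vm_capable are hypothetical, and the check is only a rough illustration based on the attributes shown above. It is not how Condor itself matches jobs to machines.

```python
def parse_classad_lines(text):
    """Parse 'Name = value' lines into a dict of Python values."""
    attrs = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip "..." and blank lines
        name = name.strip()
        value = value.strip()
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            attrs[name] = value[1:-1]            # quoted string
        elif value.upper() in ("TRUE", "FALSE"):
            attrs[name] = value.upper() == "TRUE"  # boolean
        else:
            try:
                attrs[name] = int(value)         # integer
            except ValueError:
                attrs[name] = value              # leave anything else as text
    return attrs


def looks_vm_capable(attrs, vm_type, memory_mb):
    """Rough check: the slot advertises this VM type and enough VM memory."""
    return (
        attrs.get("HasVM") is True
        and attrs.get("VM_Type") == vm_type
        and attrs.get("VM_Memory", 0) >= memory_mb
    )


if __name__ == "__main__":
    sample = 'HasVM = TRUE\nVM_Type = "vmware"\nVM_Memory = 256\n'
    # 64 is the vm_memory value used in simple.vm.sub
    print(looks_vm_capable(parse_classad_lines(sample), "vmware", 64))
```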
