Demonstration of ping()

This notebook demonstrates using the ping() method to determine whether an HTCondor service, such as the Collector or the Schedd, is alive and responding.

First we import the htcondor module and instantiate a Collector() object.

In [1]:
import htcondor
print(htcondor.version())  # display the version of HTCondor in use
col = htcondor.Collector()
$CondorVersion: 8.7.9 May  2 2018 PRE-RELEASE-UWCS $

Next, we locate the collector and send it a ping(). The ping() method throws an exception on failure, so a ping that returns without raising means the daemon responded. Then we do the same for the schedd.

In [2]:
try:
    collector_ad = col.locate(htcondor.DaemonTypes.Collector)
    htcondor.SecMan().ping(collector_ad)
    print('Collector is responsive')
except Exception:  # locate() or ping() failed
    print('Collector is NOT responding')
Collector is responsive
In [3]:
try:
    schedd_ad = col.locate(htcondor.DaemonTypes.Schedd)
    htcondor.SecMan().ping(schedd_ad)
    print('Schedd is responsive')
except Exception:  # locate() or ping() failed
    print('Schedd is NOT responding')
Schedd is responsive
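
The two cells above differ only in the daemon type, so the pattern could be factored into a small helper. The following is a minimal sketch and not part of the htcondor module: is_responsive is a hypothetical name, and it takes the locate and ping steps as plain callables (also an assumption of this sketch) so that it can be tried without a running pool.

```python
def is_responsive(locate, ping):
    """Return True if the daemon can be located and answers a ping.

    locate: zero-argument callable returning the daemon's ad (hypothetical parameter)
    ping:   one-argument callable that raises an exception on failure (hypothetical parameter)
    """
    try:
        ping(locate())
    except Exception:
        # Any failure to locate or ping is treated as "not responding".
        return False
    return True


# With htcondor, the callables could be built from the calls used above:
#   col = htcondor.Collector()
#   is_responsive(lambda: col.locate(htcondor.DaemonTypes.Schedd),
#                 htcondor.SecMan().ping)
```

Passing callables rather than daemon types keeps the helper independent of any particular pool, at the cost of a slightly more verbose call.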

To demonstrate that ping() fails as expected when a service is down, the next cell sends a command to the condor_master telling it to shut down the schedd. WARNING: This will shut down the schedd on the local machine. Do not do this on a production setup!

In [4]:
master_ad = col.locate(htcondor.DaemonTypes.Master)
htcondor.send_command(master_ad,
                      htcondor.DaemonCommands.DaemonOff,
                      "SCHEDD")
print('Sent command to shut down the schedd')
Sent command to shut down the schedd
In [5]:
try:
    schedd_ad = col.locate(htcondor.DaemonTypes.Schedd)
    htcondor.SecMan().ping(schedd_ad)
    print('Schedd is responsive')
except Exception:  # locate() or ping() failed
    print('Schedd is NOT responding')
Schedd is NOT responding

There you have it. At this point, you may want to issue a "condor_on -schedd" command to turn the schedd back on, as we shut it off above.
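
After the schedd is turned back on, it may take a moment before it answers pings again. The following is a minimal polling sketch under stated assumptions: wait_until_responsive, timeout, interval, and sleep are hypothetical names introduced here, and check is assumed to be a zero-argument callable that returns True once the daemon responds (for example, a function wrapping the try/except cell above).

```python
import time


def wait_until_responsive(check, timeout=30.0, interval=1.0, sleep=time.sleep):
    """Call check() repeatedly until it returns True or timeout seconds pass.

    check is always called at least once. Returns True on success and
    False if the deadline passes first. All names here are hypothetical.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)  # pause before the next attempt
```

The sleep parameter exists only so the loop can be exercised without real delays; in normal use the default time.sleep is sufficient.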