How To Configure Folding At Home

Here's how to contribute your HTCondor pool's spare cycles to Folding@Home.

The basic idea is to use HTCondor's "work fetch" mechanism to start the Folding@Home client when you have idle resources. This means you don't have Folding@Home jobs cluttering up your schedulers or your accountant, and you don't have to do anything special to preserve your Folding@Home progress.

If you wish to run Folding@Home jobs as work submitted to an HTCondor schedd and tracked by HTCondor's accounting mechanism, this is not the recipe for you.

Assumptions

  • We assume that you're configuring a Linux execute machine.
  • We assume that the execute machine already has Folding@Home installed. Many of the ways to install the Folding@Home client on a machine try to configure it to run in the background; you'll want to disable that.
  • We assume that /opt/fah is a great place to put our Folding@Home bits.

Instructions

  1. Create a user named 'backfill' for Folding@Home to run as.
  2. Make directories so that Folding@Home can preserve its progress when preempted: create /opt/fah/slots/slot1, /opt/fah/slots/slot2, and so on (up to the number of cores on your machine). Make sure that the 'backfill' user can read and write to them (chown them appropriately).
  3. Create the following files:

/opt/fah/fetch_work
#!/bin/bash
# Extract SlotID from the machine ClassAd passed on stdin
eval "$(awk '/^SlotID/ {print "export _CONDOR_BACKFILL_SLOTID="$3}')"
# Build a job ClassAd from a template in the HTCondor config called FAH_JOB
condor_config_val -macro '$(FAH_JOB)'
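
To see what the awk line in fetch_work produces, it can be run by hand against a small machine ad. This is only a sketch: the helper name slotid_export and the sample ad are made up for illustration, and a real machine ad contains many more attributes.

```shell
# Hypothetical helper that mirrors the awk line in fetch_work:
# it reads a machine ClassAd on stdin and prints a shell export statement.
slotid_export() {
    awk '/^SlotID/ {print "export _CONDOR_BACKFILL_SLOTID="$3}'
}

# Made-up fragment of a machine ad (assumption, for illustration only).
sample_ad='Name = "slot3@example.domain"
SlotID = 3
State = "Unclaimed"'

# In "SlotID = 3" the fields are "SlotID", "=", and "3", so $3 is the slot number.
printf '%s\n' "$sample_ad" | slotid_export
# prints: export _CONDOR_BACKFILL_SLOTID=3
```

fetch_work passes that printed line to eval, which sets _CONDOR_BACKFILL_SLOTID in the script's environment before condor_config_val runs.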

The following configuration folds anonymously. Change it if you want credit for your contribution.

/opt/fah/config.xml
<config>
  <!-- Client Control -->
  <fold-anon v='true'/>

  <!-- Folding Slot Configuration -->
  <gpu v='false'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
</config>
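
One way to get credit is to add user and team entries to config.xml. The sketch below generates such a file; the helper name write_fah_config and the example values are hypothetical, and the user and team elements are an assumption about the Folding@Home client's configuration format that should be checked against the documentation for your client version.

```shell
# Hypothetical helper: print a config.xml that names a donor and a team.
# Usage: write_fah_config USER TEAM > /opt/fah/config.xml
# Assumption: the client reads <user> and <team> elements from config.xml.
write_fah_config() {
    fah_user=$1
    fah_team=$2
    cat <<EOF
<config>
  <!-- Client Control -->
  <fold-anon v='true'/>

  <!-- User Information -->
  <team v='$fah_team'/>
  <user v='$fah_user'/>

  <!-- Folding Slot Configuration -->
  <gpu v='false'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
</config>
EOF
}

# Example with made-up values; prints the generated file to stdout.
write_fah_config example_donor 1234
```

Everything outside the "User Information" entries is copied from the anonymous configuration above.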

  4. If you're using static slots, use this recipe for your configuration.
  5. If you're using partitionable slots, use this recipe for your configuration.
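
The user and directory steps at the top of the instructions can be scripted. The following is a minimal sketch, not a tested installer: the helper names make_slot_dirs and setup_backfill_user are made up here, plain useradd is an assumption about how you want the account created, and the second helper must run as root.

```shell
# Hypothetical helper: create one progress directory per slot.
# Usage: make_slot_dirs BASE_DIR SLOT_COUNT
make_slot_dirs() {
    base=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "$base/slots/slot$i" || return 1
        i=$((i + 1))
    done
}

# Hypothetical helper: create the account if it does not exist yet,
# then give it ownership of the slot directories. Requires root.
# Usage: setup_backfill_user USER BASE_DIR
setup_backfill_user() {
    user=$1
    base=$2
    id "$user" >/dev/null 2>&1 || useradd "$user" || return 1
    chown -R "$user" "$base/slots"
}

# Typical use, as root, with one directory per core:
#   make_slot_dirs /opt/fah "$(nproc)" && setup_backfill_user backfill /opt/fah
```

Taking the base directory and slot count as arguments lets you try the directory layout in a scratch location before touching /opt/fah.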

Example output showing one idle partitionable slot and four slots that will only run backfill jobs:

$ condor_status example.domain
Name                     OpSys      Arch   State     Activity LoadAv    Mem  ActvtyTime

backfill2@example.domain LINUX      X86_64 Claimed   Busy      0.000   1000  4+18:49:55
backfill3@example.domain LINUX      X86_64 Claimed   Busy      1.000   1000  4+18:47:15
backfill4@example.domain LINUX      X86_64 Claimed   Busy      0.000   1000  4+18:48:04
backfill5@example.domain LINUX      X86_64 Claimed   Busy      1.000   1000  4+18:48:11
slot1@example.domain     LINUX      X86_64 Unclaimed Idle      0.000 507340 11+19:08:11

               Machines Owner Claimed Unclaimed Matched Preempting  Drain

  X86_64/LINUX        5     0       4         1       0          0      0

         Total        5     0       4         1       0          0      0
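
A quick way to check that the expected number of backfill slots came up is to count slot names in that output. This is a sketch that parses the default condor_status text layout shown above; the helper name count_slots is hypothetical, and the sample lines are copied from the example output.

```shell
# Hypothetical helper: read condor_status output on stdin and count the
# slots whose name starts with the given prefix. Only lines whose first
# field contains "@" are treated as slot lines, which skips the header
# and the totals.
count_slots() {
    prefix=$1
    awk -v p="$prefix" 'index($1, p) == 1 && $1 ~ /@/ {n++} END {print n + 0}'
}

# Sample lines taken from the example output above.
sample_status='Name                     OpSys      Arch   State     Activity LoadAv    Mem  ActvtyTime

backfill2@example.domain LINUX      X86_64 Claimed   Busy      0.000   1000  4+18:49:55
backfill3@example.domain LINUX      X86_64 Claimed   Busy      1.000   1000  4+18:47:15
backfill4@example.domain LINUX      X86_64 Claimed   Busy      0.000   1000  4+18:48:04
backfill5@example.domain LINUX      X86_64 Claimed   Busy      1.000   1000  4+18:48:11
slot1@example.domain     LINUX      X86_64 Unclaimed Idle      0.000 507340 11+19:08:11'

printf '%s\n' "$sample_status" | count_slots backfill
# prints: 4
```

On a live pool the same helper can be fed directly, for example: condor_status example.domain | count_slots backfill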