- Use of the BLAH registry is enabled through configuration and tries to be
as transparent as possible.
Real life issues - specimen #2
- The dilemma: telling a fast,
completed job from a lost, disappeared job.
- 'History' commands are heavy
for all batch systems, and there can be disruptive surprises.
- Most batch systems can be configured to keep terminated jobs in their active job database for a while.
- Still, for reasons that would deserve
some investigation by Platform, LSF's command line tools (e.g. bjobs)
can be up to four times slower than obtaining the same
information via the lsbatch API!
- Ulrich Schwickerath, a friend at CERN, maintains a few
API-based tools for LSF that can be used in conjunction with BLAH, in the cernops/info.dynamic-scheduler-lsf GitHub package.
- But one needs to know that they exist: this kind of knowledge still needs to be gathered in one place.
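To make the dilemma above concrete, here is a minimal sketch of how a local job registry can narrow down, but not fully resolve, the "fast job or lost job" question. This is not BLAH's actual registry code or format: the `JobRegistry` and `RegistryEntry` classes, the `grace_seconds` parameter, and the status labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class RegistryEntry:
    """Hypothetical registry record; not BLAH's real on-disk format."""
    submitted_at: float
    last_seen_at: Optional[float] = None  # last poll in which the job was listed


@dataclass
class JobRegistry:
    # Assumed grace period before a never-seen job is considered suspicious.
    grace_seconds: float = 60.0
    entries: Dict[str, RegistryEntry] = field(default_factory=dict)

    def record_submission(self, job_id: str, now: float) -> None:
        self.entries[job_id] = RegistryEntry(submitted_at=now)

    def record_poll(self, active_ids: Set[str], now: float) -> None:
        # Remember which registered jobs the batch system listed as active.
        for job_id in active_ids:
            entry = self.entries.get(job_id)
            if entry is not None:
                entry.last_seen_at = now

    def classify(self, job_id: str, active_ids: Set[str], now: float) -> str:
        entry = self.entries.get(job_id)
        if entry is None:
            return "unknown"               # never submitted through this registry
        if job_id in active_ids:
            return "active"
        if entry.last_seen_at is not None:
            return "left-queue"            # was listed once, so it did run or queue
        if now - entry.submitted_at < self.grace_seconds:
            return "too-early-to-tell"     # may simply not be visible yet
        # Never listed and now absent: fast completion or lost job.
        # Only a (heavy) history query can tell the two apart.
        return "needs-history-lookup"
```

In this sketch the expensive history command is only needed for the last case, which is the point of keeping a registry at all: the cheap active-job listing answers most status queries.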
In the pipeline
- With just a fraction of three people at hand,
not much is boiling in terms of new features:
- A high availability mode, where
updates to the job registry are shared (optionally via
multicast) among a pool of BLAH servers so that all of them can service requests
on the same set of jobs, is in the code but hasn't seen thorough testing yet.
- Better, configurable logging (other than what
is exchanged in the command line and/or logged to stdout and stderr).
Conclusions (1)
- We could not keep it as simple and dumb as we hoped,
but the BLAH(P) is still around, now pushing around the majority of WLCG jobs, and possibly a few Higgs bosons here and there.
- Simple semantics provided (as usual) the
ability to compose new tools based on any real need at hand.
- However, pushing the scale factor up to WLCG center
production needs required measures that are normally not enabled when BLAH(P)
is used from within HTCondor.
- These can be enabled if needed, but knowledge
and documentation are still not gathered in a single place, as they mostly come
from people with firsthand expertise at managing the various batch systems.
- Now that we don't depend on decisions and priorities set by upper-tier
projects, we should perhaps start populating the GitHub wiki.
Conclusions (2)
- BLAH doesn't have much of a roadmap ahead,
but we are as usual open to suggestions, requests (feature or support), and
to directly provide info beyond the scarce documentation:
blah@mi.infn.it
- Thank you for your time. Hope it wasn't too BLAH.