HTCondor Week 2016

University of Wisconsin — Madison, Wisconsin — May 17–20, 2016


Thursday, May 19

 8:00 am  9:00 am Coffee and Registration
coffee, tea, ice water
Session Moderators: Lauren Michael, Brian Lin
 9:00 am  9:20 am Research Computing Taxonomy
Categorizing research computing into several main groups and discussing which computational approaches are best suited to each group (HTC vs. HPC workflows)
Christina Koch
Center for High Throughput Computing
 9:25 am  9:45 am Resizeable Jobs for CMS
Brian Bockelman
University of Nebraska-Lincoln
 9:50 am 10:10 am HTCondor: CMS Experience
The Compact Muon Solenoid (CMS) is a particle physics experiment at the CERN Large Hadron Collider (LHC). The talk presents an overview of how HTCondor is being used in the CMS grid computing infrastructure and highlights the scale we have managed to achieve by making use of various features provided by the software.
Farrukh Aftab Khan
10:15 am 10:35 am Using HTCondor Glideins to Run in IceCube Heterogeneous Resources
This talk will discuss how IceCube is using Condor glideins in our simulation production system. We get to most of our distributed resources via Condor glideins. For those sites which do not present a grid interface (a CE), we have recently developed a simple pilot factory which we can run from a local account at the site, as a cron job. This allows us to seamlessly integrate remote sites into our Condor pools in a very convenient way. We are running this system at several sites, including supercomputers from XSEDE where we are interested in running GPU jobs. We will describe the system and how it helped IceCube run jobs at multiple remote locations with very small operational overhead.
David Schultz
10:35 am 10:55 am Break
croissant with jam and butter, whole fruit, coffee, tea, ice water, soda
Session Moderators: Lauren Michael, Kent Wenger
10:55 am 11:15 am HTCondor-CE: For When the Grid Is Dark and Full of Terrors
With HTCondor-CE usage spreading across Europe, this talk looks at the benefits an HTCondor-on-HTCondor deployment can bring, takes an in-depth look at the CERN CE deployment, and considers how the HTCondor ecosystem can change the grid for the better.
Iain Steers
11:20 am 11:40 am OSG as a “Universal Adapter” for LIGO
Peter Couvares
Caltech (LIGO)
11:45 am 12:05 pm HTCondor Use in Operational Data Processing for Hubble/JWST
This talk will discuss how we are currently using HTCondor here at STScI in the operational system for processing and reprocessing Hubble Space Telescope (HST) data for our science users, some of the issues we are working on, and our plans for addressing them. We will also be using HTCondor in our data processing system for the James Webb Space Telescope (JWST), to launch in 2018, using similar patterns, and the differences in those plans compared with our HST architecture will also be presented.
Michael Swam
Space Telescope Science Institute (STScI)
12:10 pm 12:25 pm High Throughput Computing at BNL
Michael Ernst
Brookhaven National Laboratory
12:25 pm  1:30 pm Lunch
build-your-own flatbread buffet: sliced grilled chicken breast; hummus, tzatziki, and sweet chutney; curry-dusted roasted vegetables; herb and lemon scented farro; mixed greens with cucumbers, tomatoes, and goat cheese; grilled flatbreads; spanakopita; coffee; tea; ice water; soda
Session Moderators: Lauren Michael, Carl Edquist
 1:30 pm  1:50 pm GeoDeepDive: A Cyberinfrastructure to Support Text and Data Mining of Published Documents
The published scientific literature contains a large amount of data and information that has utility beyond the scope of the original investigation. For example, fossil occurrences are commonly described in the literature as part of local and regional field work, but literature-based syntheses of millions of fossil occurrences from around the world are required to generate an accurate history of life on Earth. Here we describe GeoDeepDive (GDD), a High Throughput cyberinfrastructure to support the reliable, scalable, and automated fetching of documents from content providers, the preprocessing of those documents by software tools that provide annotations for machine reading, and the indexing of those documents based on known vocabularies of scientific terms. The infrastructure currently contains more than 1.2 million documents from six different content providers and grows at a rate of 60K documents per week. Emphasis is currently placed on geoscience- and bioscience-related content, but the infrastructure is not domain specific. Software applications can be written by scientists to extract data and information from these documents using the GeoDeepDive application template and testing datasets. The GDD infrastructure supports the running of applications against the whole of the relevant document set and the continual updating of the result set as new relevant documents are acquired and processed. The high throughput computing capabilities of HTCondor are critical to the processing of documents and the deployment of new tools against the entire library as they are developed and as the collection of documents grows.
Ian Ross
GeoDeepDive (UW)
 1:55 pm  2:15 pm Flying HTCondor at 100gbps Over the Golden State
Creating a common submission infrastructure for the LHC participating University of California campuses.
Jeff Dost
 2:20 pm  2:40 pm Taming Local Users and Remote Clouds with HTCondor at CERN
The CERN tier-0 batch service caters for a diverse set of users, from the grid to “local”, and increasingly runs on a diverse infrastructure. We’ll report on how we’ve been using HTCondor to cater for our user groups, how recent HTCondor developments are helping us, and how we’re integrating cloud resources into our pool.
Ben Jones
 2:45 pm  3:05 pm The Fermilab HEPCloud Facility: Adding 60,000 Cores for Science!
Burt Holzman
Fermi National Accelerator Laboratory
 3:05 pm  3:25 pm Break
assorted bruschetta, pretzels, mixed dried fruits, coffee, tea, ice water, soda
Session Moderators: Lauren Michael, Christina Koch
 3:25 pm  3:55 pm Amazon AWS and the HTCondor Annex
Will St. Clair
Amazon Web Services
Todd Miller
Center for High Throughput Computing
 4:00 pm  4:20 pm Comprehensive Grid and Job Monitoring with Fifemon
Batch monitoring at Fermilab and how other Condor users can leverage it to quickly set up detailed monitoring of their own pools.
Kevin Retzke
FIFE support group (USDC) at Fermilab
 4:25 pm  4:45 pm Monitoring HTCondor: A Common One-Stop Solution?
We’ve all heard of the gangliad, soon to be the metricsd, and the Python API. Can we throw away those ad-hoc scripts that turned into critical monitoring components and work towards a common solution that can be shared across deployments?
Iain Steers
 4:50 pm  5:10 pm IDPL Tips and Tricks
Greg Thain
Center for High Throughput Computing
 5:10 pm  5:15 pm Closing Remarks
Miron Livny
Center for High Throughput Computing

Reception sponsored by Core Computational Technology at the Morgridge Institute for Research

We will have a reception following the Thursday session. It will be held in the main courtyard/atrium of the Discovery Building, just outside the DeLuca Forum where HTCondor Week sessions occur. The reception will take place from 5 pm to 7 pm. Non-alcoholic drinks and appetizers will be served; beer and wine will be available on a cash basis.

Specific talks and times are subject to change.