Get To Know Todd Tannenbaum
January 23, 2023
As the technical lead of the CHTC, how did you get started?
Long ago, I came to UW-Madison to major in computer sciences and, upon graduation, accepted a job as the Unix systems administrator in the Computer Aided Engineering Center in the College of Engineering.
While there, I was introduced to HTCondor (which went by the name of Condor at that time) and created and managed an HTCondor installation consisting of about 200 Unix workstations deployed across the College. As the years went by, I became the director of (what used to be called) “The Model Advanced Facility”, which served as a high-performance computing and visualization resource in Engineering. We had both HPC supercomputer systems and an HTCondor cluster. I found that the problems of most engineers I worked with fit the high throughput computing paradigm very well, so our HTCondor installation was more popular than our expensive HPC supercomputers. However, HTCondor didn’t quite do what I needed it to do, so I walked over to the computer sciences building and met with Miron Livny. He suggested I attend the HTCondor developers meeting, which I started doing. Ultimately, I decided that working in high throughput computing research was more personally rewarding for me than being a director. So in 1997, I switched from engineering to computer sciences to work on HTCondor full time.
What is the HTCondor software suite and why is it important to researchers?
Today, scientific research is oftentimes predicated on access to lots of computing cycles for simulations and analysis. Imagine your work requires running a computer simulation that takes an hour to complete on your nice new laptop; now imagine you have 10,000 such simulations you need to run. With just your laptop, this would take over a year to complete, but if you could effectively use 10,000 computers in an organized manner, you could be done in an hour. The HTCondor Software Suite (HTCSS) enables a researcher or engineer to easily harness the computing capacity of a large number of computers that may be geographically distributed or owned and managed by different organizations, allowing these people to submit and track very large numbers of computing jobs.
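The arithmetic in this example can be sketched in a few lines of Python. This is an idealized illustration, not how HTCSS schedules work: the function name `wall_clock_hours` is made up for this sketch, and it assumes every job takes exactly one hour, every machine runs one job at a time, and there is no queueing, data transfer, or scheduling overhead.

```python
import math


def wall_clock_hours(num_jobs: int, hours_per_job: float, num_machines: int) -> float:
    """Idealized wall-clock time when jobs run in waves, one job per machine.

    Hypothetical helper for illustration only; real pools have overheads
    and machines of differing speeds.
    """
    # Each wave runs up to num_machines jobs in parallel.
    waves = math.ceil(num_jobs / num_machines)
    return waves * hours_per_job


# 10,000 one-hour simulations on a single laptop.
laptop_hours = wall_clock_hours(10_000, 1.0, 1)

# The same 10,000 simulations spread across 10,000 machines.
pool_hours = wall_clock_hours(10_000, 1.0, 10_000)

print(f"One laptop: {laptop_hours:.0f} hours ({laptop_hours / 24:.1f} days)")
print(f"10,000 machines: {pool_hours:.0f} hour(s)")
```

Under these assumptions a single laptop needs 10,000 hours, which is more than 416 days, while 10,000 machines finish in about one hour. Real pools will land somewhere in between.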
HTCSS also provides services for the owners of the servers. It makes sure the capacity is equitably shared amongst groups of researchers and minimizes the chances that one researcher’s computing negatively impacts the computing of another researcher.
HTCSS has enjoyed widespread adoption; it has been instrumental in providing the enormous amount of computing required for two recent Nobel Prizes (and hopefully counting!), and is used not only at universities and government labs worldwide, but also in industry, including companies like SpaceX, DreamWorks, and Boeing.
How is HTCSS connected to CHTC?
The HTCondor Software Suite (HTCSS) is the product of three decades of continuous research and development on high-throughput computing within the Center for High Throughput Computing (CHTC) and the UW-Madison Computer Sciences Department. Although HTCSS is open source, all members of the core development team responsible for the support, enhancement, and evolution of HTCSS work at the CHTC. UW-Madison alone uses HTCSS as its cornerstone technology to complete nearly 250,000 compute jobs each day for research groups across the Madison campus as they work on challenges in every field. That delivers to faculty and graduate students at UW-Madison the computing equivalent of approximately 30,000 computers (CPU cores) running 24 hours every day.
In addition, the computing infrastructure at UW-Madison managed by the CHTC is a great experimental laboratory for the development of the HTCondor Software Suite itself. We heavily utilize the CHTC facilitators to provide feedback to the HTCSS developers about places where the software is working well and where improvements are needed, what researchers are finding helpful or confusing, and which new features we should add.
How has HTCSS evolved over the years?
When HTCondor was first conceived, it was used primarily at UW-Madison to deliver a few dozen compute hours per day to a handful of users. Today HTCSS is in use at universities, government labs, and commercial organizations worldwide; the software is downloaded more than 100,000 times each month from our website and has grown to over a million lines of code. We’ve made a lot of changes to deal with ever-increasing amounts of scientific data and ever-larger sets of jobs and machines. Also, as the technology of computing keeps evolving, HTCondor is evolving with it. For instance, HTCondor manages GPU resources and containers. Back when I started, there were no GPUs or containers (software that emulates another computer).
What does your day at work look like?
I split my time between management and technological duties. I talk with the other developers who work on HTCSS about any support emergencies in the user community. I also work on the design of new features or the best ways to fix bugs. I still find some time for my favorite part, which is the hands-on work of writing code and doing direct support for the community, such as answering support emails. This is something unique that we do here. At a lot of software development organizations, the people who handle support questions are different from those who write code, and the two rarely meet. But here, all the developers, including myself, take turns with first-level support, answering user support questions directly. We feel this is important so that we do not lose touch with the end users who use the software daily.
What has been your favorite memory so far?
Many years from now when I look back at my career, I think I will look back fondly on how our work here surpasses simply making shareholders more wealthy. It really is (and has been) about enabling scientific discovery via computing for the benefit of humankind. It is nice to work in academia and still have your work be relevant in “the real world”, outside of just academic papers. Another thing I will look back on fondly is the long list of colleagues I’ve had the privilege of working with over the years. A lot of fun, motivated and extremely intelligent people.
Where do you see HTCSS in the next 5 years?
I’d like to see HTCondor being more accessible to an ever wider range of researchers and engineers. I’d like HTCondor to have even more impact on the individual researcher at smaller institutions and schools, including community colleges. These are things we are already doing now, but I imagine an even bigger impact in five years.
What would you say has been the greatest impact of your job?
My greatest impact is in having HTCSS enable High Throughput Computing to maintain relevance and keep delivering computing capacity to researchers for scientific discovery. The idea that I’m helping humankind as opposed to just a group of shareholders is what I derive the most satisfaction from.
What has been the greatest challenge so far?
Drinking from the fire hose! There’s so much we could be working on, so much we should be working on to balance the needs of supporting existing communities versus building new mechanisms to attract more people to the community, all while trying to balance my own competing technology and management duties.
A lot of times ‘what to do’ is an easier problem than answering the ‘who’ and the ‘when’. There’s so much you want to do but only so many hours in a day and only so much staff effort available. Figuring out where to apply the effort to have the largest impact is probably the biggest challenge.
How do you like to spend your free time?
I like cycling (road cycling, I am not coordinated enough for hard-core mountain biking!) and sailing. Neither of these is very conducive to winter, which is very unfortunate, so I am generally a happier person in spring, summer and fall than in winter. Although in winter I get to watch the Green Bay Packers, which is usually a lot of fun, albeit not as much this year perhaps!
I enjoy playing and listening to all kinds of music. I’ve played bass guitar since high school in several bands over the years and I’m also a novice guitar player. My favorite band is The Clash.
What are some of your favorite books? What books have influenced your work?
I’m actually one of the founding members of Jordan’s Big 10 Pub Book Club. The Big 10 Pub is the closest pub to the computer sciences building, just down on Regent Street. We started the book club about fifty books ago to bring together people who like both books and beer. We most recently read ‘Rendezvous with Rama’ by Arthur C. Clarke. We’ve even had a few authors of the books we’ve read join our club discussions.
Books that have directly influenced my work are probably ones reserved for Mountain Dew-drinking software nerds, such as ‘Effective C++’ by Scott Meyers and ‘Transaction Processing’ by Jim Gray. We have applied a lot of concepts from the database community to the distributed computing world over the years.
If you could travel anywhere outside of the country, where would you go?
Probably the U.S. Virgin Islands because of the amazing sailing opportunities.
What is one of your hidden talents?
I like to cook Indian food. My family really likes my Rajma Dal recipe, a vegetarian red kidney bean curry. This past weekend I made Sambar, which is actually in a Tupperware container in my fridge for lunch. My older son is vegetarian. He decided at the age of four to be vegetarian after asking me where meat comes from. I told him meat comes from the meat aisle in the grocery store, but as an inquisitive four-year-old, he didn’t like my answer and went to ask his mom instead, who then gave him a more detailed answer. Ever since then, he’s refused to eat meat, and that really helped jumpstart my interest in Indian cooking: there are so many tasty vegetarian dishes in Indian cuisine.