BOINC in the RCC
Within the clusters HPC manages in the RCC, absolute priority is given to jobs submitted by local cluster users. They use TORQUE to submit and manage their jobs. TORQUE jobs run with a reserved set of resources for a guaranteed amount of time. They run until they finish, or until the requested amount of time is exceeded. Most of the compute jobs on Maxwell, our flagship cluster, run under TORQUE. For various reasons, there are times when one or more compute nodes in a cluster are not in use by TORQUE jobs. The period of time a node is idle may be short or relatively long, but it is always unpredictable and thus cannot be rescheduled by TORQUE.
Rather than allow compute cycles to go to waste, we decided to run a cycle-scavenging system, Condor, alongside TORQUE.
Condor is useful for certain kinds of jobs that can run intermittently, suspending and resuming repeatedly over a long period of time. The jobs use checkpointing to maintain state if they are interrupted. When they resume, they restore their previous state and continue from where they left off. As we use it, Condor also supports BOINC.
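The checkpoint-and-resume pattern described above can be illustrated with a minimal Python sketch. It is not how Condor or BOINC implement checkpointing. The file format, the function names (load_checkpoint, save_checkpoint, run), and the max_steps parameter used to simulate an interruption are all assumptions made for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_i": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the checkpoint.

    The rename is atomic, so an interruption never leaves a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n, max_steps=None):
    """Sum the squares of 0..n-1, checkpointing after every step.

    max_steps (hypothetical) simulates the job being interrupted early;
    in that case the function returns None and the work done so far
    stays in the checkpoint file.
    """
    state = load_checkpoint(path)
    steps = 0
    while state["next_i"] < n:
        if max_steps is not None and steps >= max_steps:
            return None  # interrupted before finishing
        i = state["next_i"]
        state["total"] += i * i
        state["next_i"] = i + 1
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

Calling run("ckpt.json", 10, max_steps=4) stops after four steps and returns None. Calling run("ckpt.json", 10) afterwards picks up at step 4 and returns the same total an uninterrupted run would have produced.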
BOINC jobs also run intermittently, saving their results by checkpointing. At times, BOINC will report results to a central project server and ask for more work. It is up to the project server to manage the progress of the computational problem. Usually a project server will hand out just enough work to keep a BOINC instance busy for a few hours - expecting to get the results within a few days at most. A particular BOINC installation will typically run computations for more than one research project from one run to the next.
Condor and BOINC jobs are scheduled opportunistically. If a compute node is left idle by TORQUE for even a few minutes, cycle scavenging starts. The moment TORQUE needs the node again, Condor / BOINC will back off. Even at this rate, the accumulated computing time is significant. Over time, in a grid of several thousand cores, it has amounted to hundreds of CPU years.
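The opportunistic policy above amounts to a simple decision rule, sketched here in Python. This is only an illustration and does not reflect actual Condor or TORQUE configuration. The function name scavenger_action, its parameters, and the 300-second threshold standing in for "a few minutes" are assumptions.

```python
def scavenger_action(idle_seconds, torque_wants_node, start_after=300):
    """Decide what a cycle scavenger should do on one node.

    idle_seconds      -- how long the node has been free of TORQUE jobs
    torque_wants_node -- True as soon as TORQUE needs the node again
    start_after       -- assumed idle threshold in seconds (hypothetical value)
    """
    if torque_wants_node:
        # Local cluster jobs have absolute priority: yield immediately.
        return "back_off"
    if idle_seconds >= start_after:
        # The node has been idle long enough to start scavenging.
        return "run"
    return "wait"
```

The priority check comes first so that a TORQUE request always wins, no matter how long the node has been idle.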
The UH IT HPC team is the primary contributor of cycles for research directed in the Department of Computer Science at the University of Houston: Virtual Prairie (VIP), powered by BOINC. We also contribute support to several projects at the World Community Grid (WCG). Two major current projects are Discovering Dengue Drugs and Help Fight Childhood Cancer.
Contributing Your Cycles
There are many projects in the world of volunteer computing. We believe that VIP and the projects in the WCG are very worthy of support. Should you decide to contribute your own cycles, you are welcome to join our team UH IT HPC. Of course, there are many other teams to choose from - or you could decide to create your own team. It all benefits the same cause.
Join the VIP Project
Join the World Community Grid
A Green Hint
UH staff, faculty, and students are encouraged to allow the software to run during weekdays, rather than leave University computers on at night or over weekends. Basically, use your computer as you normally would. BOINC is designed to run in the background.
Run BOINC Only on Authorized Computers
You should run BOINC only on computers which you own, or for which you have obtained the owner's permission.