This year Chameleon hosted the IndySCC competition, in which teams optimize a variety of HPC workloads to complete the most computations during a 48-hour final, all while staying below a strict power cap. Each year Supercomputing hosts the Student Cluster Competition (SCC), drawing teams from around the world to push HPC software and hardware to, and sometimes beyond, their limits. IndySCC is meant to be analogous to the in-person SCC, but uses cloud/shared resources so that more teams can participate. In the months leading up to the competition, the teams participate in a “Virtual Course”, completing assignments to familiarize themselves with cloud computing concepts as well as the HPC workloads. Five teams participated in IndySCC 2021, each consisting of up to 6 students plus an advisor. The teams were from UIUC; Monash University, Australia; ETH Zurich, Switzerland; Texas A&M; and Sun Yat-sen University, China.
Starting in July, the teams made short-term leases on Chameleon to complete each assignment, and also made a lease well in advance for the 48-hour competition run on November 6-7. Each team reserved 2 P100 nodes on CHI@TACC, in rack 11 to ensure no bandwidth restrictions. The SCC committee also reserved 2 nodes as “spares”, to be used in case of hardware issues.
In addition to Chameleon hardware, NCAR contributed 5 ARM ThunderX2 machines, operating as a Chameleon Associate site. Each team was allowed to reserve one of these for use in the competition. The intent is to make the site generally available to the community after SC21.
Before the 48-hour final competition, teams selected the specific nodes they planned to use; this could be any subset of 2x P100 and 1x ARM. The teams were required to submit results from 4 applications, among them HPCG, GROMACS, and John the Ripper, for each of which they had previously completed an assignment and benchmark. At the beginning of the competition itself, Devito: Fast Stencil Computation from Symbolic Specification was announced as the “Mystery Application”. Teams were judged privately based on how many correct answers they submitted from each application, with a penalty applied if they exceeded the 1100-watt power cap.
The Chameleon team made power monitoring Grafana displays available for both CHI@TACC and CHI@NCAR resources. CHI@TACC information was obtained from the DCMI interface of each node’s BMC, and CHI@NCAR information by querying smart PDUs, as those nodes lacked DCMI support. In both cases, the information was aggregated by a Prometheus instance on the CHI@NCAR site, then made available to the teams and the public via Grafana, using Chameleon’s Keycloak instance for single sign-on. The power usage during the competition is shown in the attached graph; the winners are yet to be announced.
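To make the power-cap check concrete, here is a minimal Python sketch of how per-node DCMI readings could be summed and compared against the 1100-watt cap. It assumes the readings arrive as text in the format printed by `ipmitool dcmi power reading`; the post does not say which tool the monitoring stack actually used. The node names, sample values, and helper functions are hypothetical, and the sketch only flags an exceedance because the penalty rule itself is not described here.

```python
import re

POWER_CAP_WATTS = 1100  # cap stated in the post

# Matches the "Instantaneous power reading" line of `ipmitool dcmi power reading`.
_READING = re.compile(r"Instantaneous power reading:\s*(\d+)\s*Watts")


def parse_dcmi_watts(output: str) -> int:
    """Extract the instantaneous power draw (in watts) from DCMI text output."""
    match = _READING.search(output)
    if match is None:
        raise ValueError("no instantaneous power reading found")
    return int(match.group(1))


def total_watts(readings: dict) -> int:
    """Sum per-node readings (node name -> watts) for one team."""
    return sum(readings.values())


def over_cap(readings: dict, cap: int = POWER_CAP_WATTS) -> bool:
    """Return True when the team's combined draw exceeds the cap."""
    return total_watts(readings) > cap


# Hypothetical sample output for one node; values are illustrative only.
SAMPLE_OUTPUT = """
    Instantaneous power reading:                   412 Watts
    Minimum during sampling period:                 96 Watts
    Maximum during sampling period:                520 Watts
    Average power reading over sample period:      405 Watts
"""

if __name__ == "__main__":
    # Hypothetical team with two P100 nodes and one ARM node.
    team = {
        "p100-a": parse_dcmi_watts(SAMPLE_OUTPUT),
        "p100-b": 430,
        "arm-1": 200,
    }
    print(total_watts(team), "W, over cap:", over_cap(team))
```

In a setup like the one described, the summing and threshold comparison would more likely live in a Prometheus query or a Grafana alert than in a standalone script; the sketch only illustrates the arithmetic.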
This effort will be presented as a talk at the HPCSYSPROS workshop during SC21, on Sunday, November 14th.