SORS: The Many Colors of Chameleon: Building a Reconfigurable Testbed for Systems Research
Objectives
Abtract: Computer Science experimental testbeds allow investigators to explore a broad range of state-of-the-art hardware options, assess scalability of their systems, and provide conditions that allow deep reconfigurability and isolation so that one user does not impact the experiments of another. An experimental testbed is also in a unique position to provide methods facilitating experiment analysis and crucially, improve repeatability and reproducibility of experiments both from the perspective of the original experimenter and those building on or extending their results. Providing these capabilities at least partially within a commodity framework improves the sustainability of systems experiments and thus makes them available to a broader range of experimenters.
Chameleon is a large-scale, deeply reconfigurable testbed built specifically to support the features described above. It currently consists of roughly 20,000 cores, a total of 5PB of total disk space hosted at the University of Chicago and TACC, and leverages 100 Gbps connection between the sites. The hardware includes a large-scale homogenous partition to support large-scale experiments, as well a diversity of configurations and architectures including Infiniband, GPUs, FPGAs, storage hierarchies with a mix of HDDs, SDDs, NVRAM, and high memory as well as non-x86 architectures such as ARMs and Atoms. To support systems experiments, Chameleon provides a configuration system giving users full control of the software stack including root privileges, kernel customization, and console access. To date, Chameleon has supported 3,000+ users working on 500+ projects.
This talk will describe the evolution of the testbed as well as the current work towards broadening the range of supported experiments. In particular, I will discuss recently deployed hardware and new networking capabilities allowing experimenters to deploy their own switch controllers and experiment with Software Defined Networking (SDN). I will also describe new capabilities targeted at improving experiment management, monitoring, and analysis as well as tying together testbed features to improve experiment repeatability. Finally, I will outline our plans for packaging the Chameleon infrastructure to allow others to reproduce its configuration easily and thereby making the process of configuring a CS testbed more sustainable.
Short bio: Kate Keahey is one of the pioneers of infrastructure cloud computing. She created the Nimbus project, recognized as the first open source Infrastructure-as-a-Service implementation, and continues to work on research aligning cloud computing concepts with the needs of scientific datacenters and applications. To facilitate such research for the community at large, Kate leads the Chameleon project, providing a deeply reconfigurable, large-scale, and open experimental platform for Computer Science research. To foster the recognition of contributions to science made by software projects, Kate co-founded and serves as co-Editor-in-Chief of the SoftwareX journal, a new format designed to publish software contributions. Kate is a Scientist at Argonne National Laboratory and a Senior Fellow at the Computation Institute at the University of Chicago.