Speaker
Kate Keahey, Senior Fellow at the Computation Institute, University of Chicago
Abstract
Computer Science experimental testbeds allow investigators to explore a broad range of state-of-the-art hardware options and assess the scalability of their systems. They also provide conditions that allow deep reconfigurability and isolation, so that one user does not impact the experiments of another. An experimental testbed is also in a unique position to provide methods that facilitate experiment analysis and, crucially, improve the repeatability and reproducibility of experiments, both from the perspective of the original experimenter and of those building on or extending their results. Providing these capabilities at least partially within a commodity framework improves the sustainability of systems experiments and thus makes them available to a broader range of experimenters.
Chameleon is a large-scale, deeply reconfigurable testbed built specifically to support the features described above. It currently consists of ~600 nodes (~15,000 cores) and a total of 5 PB of disk space hosted at the University of Chicago and TACC, and leverages a 100 Gbps connection between the sites. The hardware includes a large-scale homogeneous partition to support large-scale experiments, as well as a diversity of configurations and architectures, including InfiniBand, GPUs, FPGAs, storage hierarchies with a mix of HDDs, SSDs, NVRAM, and high memory, as well as non-x86 architectures such as ARMs and Atoms. To support systems experiments, Chameleon provides a configuration system giving users full control of the software stack, including root privileges, kernel customization, and console access. To date, Chameleon has supported over 2,000 users working on over 300 projects.
This talk will describe the current systems as well as the extensions projected for the near future that will allow us to broaden the range of supported experiments. This will be accomplished by deploying new hardware with state-of-the-art architectures and new networking capabilities that allow experimenters to deploy their own switch controllers and experiment with Software-Defined Networking (SDN). I will also describe new capabilities targeted at improving experiment monitoring and analysis, as well as tying together testbed features to improve experiment repeatability. Finally, I will outline our plans for packaging the Chameleon infrastructure to allow others to reproduce our configuration.