The Chameleon architecture consists of a set of standard cloud units (SCUs), each of which is a single rack with 42 compute nodes, 4 storage nodes attached to 128 TB of local storage in configurable arrays, and an OpenFlow-compliant network switch. In addition to the homogeneous SCUs, a variety of heterogeneous hardware types is available for experimenting with alternative technologies. The testbed also includes a shared infrastructure with a persistent storage system accessible across the testbed, a top-level network gateway that allows access to public networks, and a set of management and provisioning servers that support user access, control, monitoring, and configuration of the testbed. Chameleon is physically distributed between the Texas Advanced Computing Center (TACC) and the University of Chicago (UC), connected through 100 Gbps Internet2 links, so that users can examine the effects of a distributed cloud.
The standard cloud unit is a self-contained rack with all the components necessary to run a complete cloud infrastructure, and the capability to combine with other units to form a larger experiment. The rack consists of 42 Dell R630 servers, each with 24 cores delivered by dual-socket Intel Xeon E5-2670 v3 “Haswell” processors (12 cores each @ 2.3 GHz) and 128 GiB of RAM. In addition to the compute servers, each unit contains storage provided by four Dell FX2 servers, each with 16 attached 2 TB hard drives, for a total of 128 TB of raw disk storage per unit. These storage servers also contain dual Intel Xeon E5-2650 v3 “Haswell” processors (10 cores each @ 2.3 GHz) and 64 GiB of RAM, and can be combined across SCUs to create a significant Hadoop infrastructure with more than a PB of storage. Each node in the SCU connects to a Dell switch at 10 Gbps, with 160 Gbps of bandwidth to the core network from each SCU. The total system contains 12 SCUs (10 at TACC and 2 at UC) for a total of 13,056 cores, 66 TiB of RAM, and 1.5 PB of configurable storage in the SCU subsystem.
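The system-wide totals quoted above follow directly from the per-node figures. As a minimal sketch of that arithmetic (the constant and function names are illustrative only and are not part of any Chameleon tooling):

```python
# Per-SCU figures taken from the hardware description above.
COMPUTE_NODES_PER_SCU = 42
STORAGE_NODES_PER_SCU = 4
CORES_PER_COMPUTE_NODE = 2 * 12   # dual socket, 12 cores per socket
CORES_PER_STORAGE_NODE = 2 * 10   # dual socket, 10 cores per socket
RAM_GIB_PER_COMPUTE_NODE = 128
RAM_GIB_PER_STORAGE_NODE = 64
DRIVES_PER_STORAGE_NODE = 16
TB_PER_DRIVE = 2
SCU_COUNT = 12                    # 10 at TACC, 2 at UC


def scu_totals(scu_count=SCU_COUNT):
    """Return total cores, RAM (TiB) and raw disk (TB) across all SCUs."""
    cores_per_scu = (COMPUTE_NODES_PER_SCU * CORES_PER_COMPUTE_NODE
                     + STORAGE_NODES_PER_SCU * CORES_PER_STORAGE_NODE)
    ram_gib_per_scu = (COMPUTE_NODES_PER_SCU * RAM_GIB_PER_COMPUTE_NODE
                       + STORAGE_NODES_PER_SCU * RAM_GIB_PER_STORAGE_NODE)
    disk_tb_per_scu = (STORAGE_NODES_PER_SCU * DRIVES_PER_STORAGE_NODE
                       * TB_PER_DRIVE)
    return {
        "cores": scu_count * cores_per_scu,
        "ram_tib": scu_count * ram_gib_per_scu / 1024,
        "disk_tb": scu_count * disk_tb_per_scu,
    }


if __name__ == "__main__":
    print(scu_totals())
```

Note that the 13,056-core and 66 TiB figures include the storage servers as well as the compute servers, and that 1,536 TB of raw disk corresponds to the quoted 1.5 PB.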
Networking is changing rapidly, and the network fabric is as much a part of the research focus of Chameleon as the compute or storage. Every switch in the Chameleon research network is a fully OpenFlow-compliant, programmable Dell S6000 switch. Each node connects to this network at 10 Gbps, and from each unit four 40 Gbps uplinks provide 160 Gbps per rack to the Chameleon core network. The core switches (also Dell S6000 switches) aggregate to 100 Gbps Ethernet links, which connect to the backbone 100 Gbps services at both UC and TACC. A separate 1 Gbps Ethernet management network extends to every node, to maintain monitoring and connectivity when the research network is either isolated from the public networks or otherwise in an experimental mode. A Fourteen Data Rate (FDR) InfiniBand network (56 Gbps) is also deployed on one SCU to allow exploration of alternative networks.
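When planning network-heavy experiments, it can help to compare the aggregate node-facing bandwidth of a rack with its uplink capacity. The sketch below does this under two assumptions that the description does not state explicitly: that all 46 nodes in an SCU (42 compute plus 4 storage) attach to the rack switch at 10 Gbps, and that nothing else shares the four uplinks. The function name is hypothetical.

```python
# Figures from the description above; see the stated assumptions.
NODES_PER_SCU = 42 + 4      # compute + storage nodes (assumed to share one switch)
NODE_LINK_GBPS = 10
UPLINKS_PER_SCU = 4
UPLINK_GBPS = 40


def rack_bandwidth():
    """Return (edge Gbps, uplink Gbps, oversubscription ratio) for one SCU."""
    edge_gbps = NODES_PER_SCU * NODE_LINK_GBPS
    uplink_gbps = UPLINKS_PER_SCU * UPLINK_GBPS
    return edge_gbps, uplink_gbps, edge_gbps / uplink_gbps


if __name__ == "__main__":
    edge, uplink, ratio = rack_bandwidth()
    print(f"edge={edge} Gbps, uplink={uplink} Gbps, ratio={ratio}:1")
```

Under these assumptions, traffic that stays within a rack can use the full 10 Gbps per node, while traffic leaving the rack shares 160 Gbps, an oversubscription of roughly 2.9:1.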
While storage within the SCUs is dynamically provisioned to researchers as an experiment requires, Chameleon also provides a shared storage system. The shared storage provides more than 3.6 PB of raw disk in the initial configuration, which is partitioned between a filesystem and an object store that is persistent between experiments. The shared storage consists of four Dell R630 servers with 128 GiB of RAM, four MD3260 external drive arrays, and six MD3060e drive expansion chassis, populated by 600 6 TB nearline SAS drives. The system also includes a dozen PowerEdge R630 servers as management nodes to provide login access to the resource, data staging, system monitoring, and hosting for various OpenStack services.
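The raw capacity of the shared storage can be checked from the drive count. The following sketch computes it in decimal petabytes and, for comparison with tools that report binary units, in pebibytes. The names are illustrative, and the result is raw capacity only; usable capacity after RAID and filesystem overhead is not given in this description.

```python
# Drive population from the description above.
DRIVE_COUNT = 600
TB_PER_DRIVE = 6            # decimal terabytes (10**12 bytes)


def shared_storage_raw():
    """Return raw shared-storage capacity as (TB, PB, PiB)."""
    raw_tb = DRIVE_COUNT * TB_PER_DRIVE
    raw_pb = raw_tb / 1000
    raw_pib = raw_tb * 10**12 / 2**50
    return raw_tb, raw_pb, raw_pib


if __name__ == "__main__":
    tb, pb, pib = shared_storage_raw()
    print(f"{tb} TB = {pb} PB (about {pib:.2f} PiB)")
```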
The heterogeneous hardware consists of two storage hierarchy nodes, two K80 GPU nodes, and two M40 GPU nodes. Each of these six additional nodes is a Dell PowerEdge R730 server with the same CPUs as the SCUs. The two storage hierarchy nodes have been designed to enable experiments using multiple layers of caching: they are configured with 512 GB of memory, two Intel P3700 NVMe drives of 2.0 TB each, four Intel S3610 SSDs of 1.6 TB each, and four 15K SAS HDDs of 600 GB each. The GPU nodes target experiments that use accelerators to improve the performance of some algorithms, experiments with new visualization systems, and deep machine learning. Each K80 GPU node is upgraded with an NVIDIA Tesla K80 accelerator, consisting of two GK210 chips with 2496 cores each (4992 cores in total) and 24 GB of GDDR5 memory. Each M40 node is upgraded with an NVIDIA Tesla M40 accelerator, consisting of a GM200 chip with 3072 cores and 24 GiB of GDDR5 memory. To make it easy for users to get started with the GPU nodes, we have developed a CUDA appliance that includes NVIDIA drivers as well as the CUDA framework. These nodes can be discovered through the resource discovery interface. For more information on how you can reserve these nodes, see the heterogeneous hardware section of the bare metal user’s guide.
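For experiments that place data across caching layers, it may be useful to have the storage hierarchy node's tiers in a form that can be queried. The sketch below models the tiers described above with a small data class; the class and field names are hypothetical, capacities are per node in decimal gigabytes, and the 512 GB of memory is kept as a separate constant because it is not a block device.

```python
from dataclasses import dataclass

MEMORY_GB = 512  # main memory per storage hierarchy node


@dataclass(frozen=True)
class Tier:
    """One layer of block storage in a storage hierarchy node."""
    name: str
    devices: int
    gb_per_device: int

    @property
    def total_gb(self) -> int:
        return self.devices * self.gb_per_device


# Ordered as listed in the description above (NVMe, then SSD, then HDD).
TIERS = [
    Tier("Intel P3700 NVMe", devices=2, gb_per_device=2000),
    Tier("Intel S3610 SSD", devices=4, gb_per_device=1600),
    Tier("15K SAS HDD", devices=4, gb_per_device=600),
]


def total_block_storage_gb(tiers=TIERS) -> int:
    """Sum the capacity of all block-storage tiers in one node."""
    return sum(t.total_gb for t in tiers)


if __name__ == "__main__":
    for tier in TIERS:
        print(f"{tier.name}: {tier.devices} x {tier.gb_per_device} GB = {tier.total_gb} GB")
    print(f"Total block storage per node: {total_block_storage_gb()} GB")
```

With the figures above, each storage hierarchy node offers 4 TB of NVMe, 6.4 TB of SSD, and 2.4 TB of HDD capacity, so the SSD tier is the largest of the three.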
You can browse detailed information about the resources offered for bare metal reconfiguration in our Resource Discovery portal.