Bare Metal or KVM? Which Should You Choose and When

A detailed comparison of hardware access, reservation systems, and storage options for users

Most of the Chameleon testbed is available via bare metal instances. This is because launching a bare metal instance gives you full control over a physical server in our data centers, allowing you to, for example, collect benchmarks and power usage measurements without worrying about virtualization overhead or noisy neighbors. The drawback is that in order to provide sufficient isolation between experiments, we can only run one experiment at a time on a bare metal node. This functionality is implemented using OpenStack with Ironic, a mainstream open source Infrastructure-as-a-Service stack that we have heavily modified to suit the science use case, for example by adding support for federated identity. The most important such addition is support for advance reservations, which allow you to claim temporary ownership of specific resources in the future.

In addition to bare metal, we also support virtualized instances, currently through the KVM@TACC site. Virtualized instances introduce additional overhead, but on the other hand we can host several virtual instances on one node – and therefore several experiments executing concurrently – allowing for much better use of resources. Historically, KVM@TACC consisted of homogeneous compute nodes without GPUs or other interesting features, and while it was also operated using OpenStack, we did not support many of the features present on the bare metal partition of the testbed, including advance reservations and allocation charging. This changed over the last few months: we now offer virtualized instances with H100 GPUs and are generally looking to spruce up our virtualized offering to provide more cost-effective access to GPU resources. To ensure that you can make efficient use of those features, we've been bringing the KVM@TACC experience up to parity with bare metal.

The table below calls out the most important differences that persist between the bare metal and virtualized offerings on the system.

| Feature | Bare Metal* | KVM |
|---|---|---|
| Hardware discovery | Hardware browser | Documentation |
| Resource model | Non-fungible resources | Fungible resources |
| Availability calendar | Node calendar: individual availability per node | Flavor calendar: aggregated availability over time |
| Advance reservations | Physical nodes, networks, floating IPs | Virtualized flavors |
| Lease limits | 1 week | Depends on the flavor (1 week for GPU VMs, 6 months for other VMs) |
| Snapshots | cc-snapshot, available from within your instance only | Snapshots via the API |
| Storage | Object store, NFS shares | Persistent Cinder volumes |
| Firewall | firewalld service within your instance | Security groups |

*These features apply to CHI@TACC and CHI@UC. The status of these features at CHI@NCAR is still in progress as the site is under construction.

Bare Metal on Chameleon. To launch a bare metal instance, you must first make a reservation for a node, i.e., the physical server hardware you will be using. When making this reservation, you'll specify how many nodes you want (which determines how many instances you can provision), and you can optionally specify a property filter for those nodes (such as matching a node type, name, or other feature). You can use the hardware browser to see the specification of each node, and the hardware calendar to see when each node is free to reserve. In order to share resources effectively between users on Chameleon, we enforce limits on reservations, including a maximum reservation length of 7 days.
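As a sketch, a node reservation can be made from the CLI. The lease name, dates, and node type filter below are placeholders; the `openstack reservation` commands require the Blazar client plugin and a sourced Chameleon OpenStack RC file:

```shell
# Reserve one Skylake compute node for five days (placeholder dates/names).
# Requires python-blazarclient and your CHI@UC or CHI@TACC credentials sourced.
openstack reservation lease create \
  --start-date "2025-07-01 10:00" \
  --end-date "2025-07-06 10:00" \
  --reservation min=1,max=1,resource_type=physical:host,resource_properties='["==","$node_type","compute_skylake"]' \
  my-baremetal-lease
```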

Once your lease is active, you can provision an instance onto the reserved nodes using a Chameleon supported image (or you are free to bring your own). The Chameleon supported images come with cc-snapshot, a tool that snapshots your running bare metal instance, letting you install and configure experiment software once and reuse that work in the future.
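A minimal launch-and-snapshot workflow might look like the following sketch. The image, key, network, and instance names are examples, and the reservation ID comes from your active lease:

```shell
# Provision onto the reserved node; $RESERVATION_ID is the reservation's ID
# from your lease (e.g. via `openstack reservation lease show my-baremetal-lease`).
openstack server create \
  --image CC-Ubuntu22.04 \
  --flavor baremetal \
  --key-name my-key \
  --network sharednet1 \
  --hint reservation="$RESERVATION_ID" \
  my-experiment-node

# Later, after installing your software, run this *inside* the instance
# to save your configured environment as a reusable image:
sudo cc-snapshot my-experiment-snapshot
```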

Unique to bare metal sites, you can also reserve networks and floating IPs. These networks enable capabilities like layer 2 stitching to FABRIC. Reservable floating IPs work the same as ad-hoc floating IPs, but their lifecycle is tied to the corresponding lease's lifecycle. Only our bare metal sites support the Swift object store, which lets you store large amounts of data, meaning CHI@UC and CHI@TACC are the only sites where you can use cc-mount-object-store. Similarly, those sites are the only ones where we support the shared file system, which lets you mount a persistent NFS share on your bare metal instance.
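For illustration, a floating IP reservation can be added to a lease much like a node reservation. The exact syntax below, in particular the `virtual:floatingip` resource type and `network_id` parameter, is an assumption based on Blazar's floating IP plugin, so check it against the current Chameleon docs:

```shell
# Reserve one floating IP on the public network for the lease's duration.
# $PUBLIC_NETWORK_ID is the UUID of the site's "public" network.
openstack reservation lease create \
  --start-date "2025-07-01 10:00" \
  --end-date "2025-07-06 10:00" \
  --reservation resource_type=virtual:floatingip,network_id="$PUBLIC_NETWORK_ID",amount=1 \
  my-fip-lease
```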

KVM on Chameleon. KVM has undergone a transformation in recent months. To get started with your experiment, you'll now need to make a reservation for a flavor. On KVM, resources are virtualized and fungible, and the hardware specification is abstracted away. Instead, the flavor contains a higher-level specification: a number of CPU threads, GB of memory, GB of storage space, and in some cases GPUs. When making this reservation, you'll specify which of our existing flavors meets your requirements and how many VMs you want to launch. As with bare metal, we enforce some limits on flavor leases to ensure resources are shared between users. We permit leases of up to 6 months for standard flavors, but for flavors using GPUs, leases must be no more than 7 days in length.
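A flavor reservation might be sketched like this. The `flavor:instance` resource type and its parameters are assumptions based on the new KVM@TACC reservation system; consult the docs for the authoritative syntax:

```shell
# Reserve capacity for two VMs of a given flavor for six days (placeholder values).
# $FLAVOR_ID is the ID of a reservable flavor at KVM@TACC.
openstack reservation lease create \
  --start-date "2025-07-01 10:00" \
  --end-date "2025-07-07 10:00" \
  --reservation resource_type=flavor:instance,flavor_id="$FLAVOR_ID",amount=2 \
  my-kvm-lease
```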

When a reservation is made, you'll be assigned a slot on one of KVM@TACC's hosts that your VM will run on. If you are interested in knowing what kind of hardware is being used, you can read more in our documentation. Similar to bare metal, we have an availability calendar for flavor availability, which allows you to see the capacity of each type of resource request over time as a line graph. This is particularly useful for checking how many GPUs are available, which are relatively scarce.

Once your lease is active, you'll be able to launch an instance using a new flavor that is tied to your reservation. The same Chameleon supported images work with VMs as on bare metal, but a few things are still different. On KVM@TACC, you can create a snapshot from outside of your instance, which is a much quicker process. Additionally, you'll notice that the firewalld service from our images is not enabled by default in a VM. Instead, you manage traffic with security groups, which you'll have to set up even for SSH access.
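Since firewalld is off, opening SSH with security groups is typically the first thing you'll do. A minimal sketch, where the group and server names are examples:

```shell
# Create a security group that allows inbound SSH from anywhere,
# then attach it to a running VM.
openstack security group create allow-ssh --description "Allow inbound SSH"
openstack security group rule create allow-ssh \
  --protocol tcp --dst-port 22 --remote-ip 0.0.0.0/0
openstack server add security group my-vm allow-ssh
```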

While KVM@TACC does not have its own object store, you are able to configure an instance with authentication to CHI@TACC's object store, which would allow you to read or write data there as on bare metal. Additionally, instead of the shared file system, KVM@TACC supports persistent storage volumes, which allow you to add persistent storage devices to your instances and mount them directly, rather than through our NFS server.
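Creating and attaching a volume is a short workflow. A sketch with example names and sizes; the device path inside the VM can vary, so check `lsblk` first:

```shell
# Create a 50 GB persistent volume and attach it to a running VM.
openstack volume create --size 50 my-data
openstack server add volume my-vm my-data

# Then, inside the VM (device name may differ; verify with `lsblk`):
sudo mkfs.ext4 /dev/vdb        # first use only: this erases the device
sudo mkdir -p /mnt/data
sudo mount /dev/vdb /mnt/data
```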

The Future of KVM. As it stands, most workflows on bare metal now have an equivalent workflow on KVM, thanks to the recent addition of bounded leases and advance reservations. Over Phase 4 of Chameleon, we are looking at further improvements toward bringing bare metal and KVM usage together: reducing friction with the shared file system, improving floating IP management, improving testbed interfaces (python-chi), and eventually sharing hardware between bare metal and KVM. As always, we would love to hear which features related to Chameleon infrastructure matter to you, as this helps us prioritize the work. Please feel free to join our discussion on the Chameleon forum.

Accelerate Your Research with NVIDIA H100 GPUs on KVM@TACC

Tips and tricks for making the most of Chameleon's new GPU resources and reservation-based workflow

NVIDIA H100 GPUs are now available on KVM@TACC through a new reservation-based system. Learn how to leverage cutting-edge GPU acceleration, persistent storage, and flexible networking to maximize your research productivity within time-limited virtual machines.

