Category – Featured

Leveraging New and Improved Chameleon Images

Less Setup, More Science: Streamlined Images with Built-in Tools and Drivers

What's the secret ingredient that makes our new Chameleon images so much better? From automatic SSH configuration to built-in rclone support, these aren't your ordinary cloud images. Find out what makes them special.

Chameleon Changelog for April 2025

This month brings new OS images with AMD ROCm and Ubuntu 24 on ARM, improvements to mounting object store buckets using rclone, a new message-of-the-day, and a fix for the firewall confusion on KVM@TACC.

REPETO Releases Report on Challenges of Practical Reproducibility for Systems and HPC Computer Science

Findings from the November 2024 Community Workshop on Practical Reproducibility in HPC

View or contribute to the experiment packaging and style checklists (Appendices A and B) on our GitHub repository here.

Download the report here.

We’re excited to announce the publication of the NSF-sponsored REPETO Report on Challenges of Practical Reproducibility for Systems and HPC Computer Science, the culmination of our Community Workshop on Practical Reproducibility in HPC, held in November 2024 in Atlanta, GA (reproduciblehpc.org).

Fair-CO2: Fair Attribution for Cloud Carbon Emissions

Understanding and accurately distributing responsibility for carbon emissions in cloud computing

Leo Han, a second-year Ph.D. student at Cornell Tech, conducted pioneering research on the fair attribution of cloud carbon emissions, resulting in the development of Fair-CO2. Enabled by the unique bare-metal capabilities and flexible environment of Chameleon Cloud, this work tackles the critical issue of accurately distributing responsibility for carbon emissions in cloud computing. This research underscores the potential of adaptable testbeds like Chameleon in advancing sustainability in technology.

Faster Multimodal AI, Lower GPU Costs

HiRED: Cutting Inference Costs for Vision-Language Models Through Intelligent Token Selection

High-resolution Vision-Language Models (VLMs) offer impressive accuracy but come with significant computational costs: processing thousands of tokens per image can consume 5 GB of GPU memory and add 15 seconds of latency. The HiRED (High-Resolution Early Dropping) framework addresses this challenge by selecting only the most informative visual tokens based on attention patterns. By keeping just 20% of tokens, researchers achieved a 4.7× throughput increase and a 78% latency reduction while maintaining accuracy across vision tasks. This research, conducted on Chameleon's infrastructure using RTX 6000 and A100 GPUs, demonstrates how thoughtful optimization can make advanced AI more accessible and affordable.
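
To make the idea of attention-based token selection concrete, here is a minimal sketch in Python. It is not the HiRED implementation: it assumes each visual token already has a single attention score, and it simply keeps the top-scoring fraction. The function name `select_top_tokens`, the `keep_ratio` parameter, and the sample scores are all hypothetical and chosen for illustration only.

```python
def select_top_tokens(attention_scores, keep_ratio=0.2):
    """Return the indices of the highest-scoring tokens, in original order.

    Simplified illustration only: assumes one precomputed attention
    score per visual token (an assumption, not HiRED's actual method).
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    n = len(attention_scores)
    if n == 0:
        return []
    # Keep at least one token so downstream layers always have input.
    k = max(1, round(n * keep_ratio))
    # Rank token indices by score, highest first.
    ranked = sorted(range(n), key=lambda i: attention_scores[i], reverse=True)
    # Restore positional order so the kept tokens stay in sequence.
    return sorted(ranked[:k])


# Hypothetical attention scores for ten visual tokens.
scores = [0.02, 0.31, 0.05, 0.01, 0.27, 0.04, 0.03, 0.12, 0.09, 0.06]
kept = select_top_tokens(scores, keep_ratio=0.2)
print(kept)  # [1, 4]
```

With a 20% budget, only two of the ten tokens survive, so every later layer processes a fifth of the original visual sequence. That reduction in sequence length is the general source of the memory and latency savings described above.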