Less Setup, More Science: Streamlined Images with Built-in Tools and Drivers
May 19, 2025 by Paul Marshall
What's the secret ingredient that makes our new Chameleon images so much better? From automatic SSH configuration to built-in rclone support, these aren't your ordinary cloud images. Find out what makes them special.
This month, we have new OS images with AMD ROCm and Ubuntu 24 on ARM. Additionally, we have improvements to mounting object store buckets using rclone, a new message-of-the-day, and we’ve fixed the firewall confusion on KVM@TACC.
Findings from the November 2024 Community Workshop on Practical Reproducibility in HPC
May 1, 2025 by Marc Richardson
View or contribute to the experiment packaging and style checklists (appendix A and B) on our GitHub repository here.
Download the report here.
We’re excited to announce the publication of the NSF-sponsored REPETO Report on Challenges of Practical Reproducibility for Systems and HPC Computer Science, the culmination of our Community Workshop on Practical Reproducibility in HPC, held in November 2024 in Atlanta, GA (reproduciblehpc.org).
Understanding and accurately distributing responsibility for carbon emissions in cloud computing
April 29, 2025 by Leo Han
Leo Han, a second-year Ph.D. student at Cornell Tech, conducted pioneering research on the fair attribution of cloud carbon emissions, resulting in the development of Fair-CO2. Enabled by the unique bare-metal capabilities and flexible environment of Chameleon Cloud, this work tackles the critical issue of accurately distributing responsibility for carbon emissions in cloud computing. This research underscores the potential of adaptable testbeds like Chameleon in advancing sustainability in technology.
HiRED: Cutting Inference Costs for Vision-Language Models Through Intelligent Token Selection
High-resolution Vision-Language Models (VLMs) offer impressive accuracy but come with significant computational costs: processing thousands of tokens per image can consume 5 GB of GPU memory and add 15 seconds of latency. The HiRED (High-Resolution Early Dropping) framework addresses this challenge by selecting only the most informative visual tokens based on attention patterns. By keeping just 20% of tokens, researchers achieved a 4.7× throughput increase and a 78% latency reduction while maintaining accuracy across vision tasks. This research, conducted on Chameleon's infrastructure using RTX 6000 and A100 GPUs, demonstrates how thoughtful optimization can make advanced AI more accessible and affordable.
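To make the idea of keeping only a fraction of visual tokens concrete, here is a minimal sketch of generic top-k selection by score. It is not the HiRED implementation: the function name `select_top_tokens`, the rounding rule, and the example scores are all assumptions made for illustration, and the real framework derives its scores from the model's attention patterns in ways this summary does not describe.

```python
def select_top_tokens(attention_scores, keep_fraction=0.2):
    """Return the indices of the highest-scoring tokens, in original order.

    Hypothetical helper for illustration only; this is plain top-k
    selection, not the HiRED algorithm itself.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n = len(attention_scores)
    if n == 0:
        return []
    # Assumed rounding rule: keep the nearest whole number of tokens, at least one.
    k = max(1, round(n * keep_fraction))
    # Rank token indices from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: attention_scores[i], reverse=True)
    # Restore original positional order for the tokens that are kept.
    return sorted(ranked[:k])


# Made-up scores for ten visual tokens (not measured data).
scores = [0.05, 0.30, 0.10, 0.02, 0.25, 0.01, 0.08, 0.04, 0.12, 0.03]
kept = select_top_tokens(scores, keep_fraction=0.2)
print(kept)  # indices of the two highest-scoring tokens: [1, 4]
```

With a 20% budget, only two of the ten tokens in this toy input are passed on to later stages, which is where the memory and latency savings in such an approach would come from.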