HiRED: Cutting Inference Costs for Vision-Language Models Through Intelligent Token Selection
High-resolution Vision-Language Models (VLMs) offer impressive accuracy but come with significant computational costs: processing thousands of tokens per image can consume 5 GB of GPU memory and add 15 seconds of latency. The HiRED (High-Resolution Early Dropping) framework addresses this challenge by intelligently selecting only the most informative visual tokens based on attention patterns. By keeping just 20% of tokens, researchers achieved a 4.7× throughput increase and a 78% latency reduction while maintaining accuracy across vision tasks. This research, conducted on Chameleon's infrastructure using RTX 6000 and A100 GPUs, demonstrates how thoughtful optimization can make advanced AI more accessible and affordable.
Streamline Your Research Workflow with Trovi's New GitHub Integration
April 21, 2025, by Mark Powers
Learn how to leverage Trovi's new GitHub integration to easily create and update reproducible research artifacts. This step-by-step guide shows you how to configure your GitHub repository with RO-crate metadata and import it directly into Trovi, enabling better collaboration and adherence to FAIR principles for your experiments.
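The configuration step centers on adding an `ro-crate-metadata.json` file to the repository root. The exact fields Trovi expects are covered in the full guide; as a rough sketch, a minimal RO-Crate 1.1 metadata file (with hypothetical name and description values) looks like this:

```json
{
  "@context": "https://w3id.org/ro/crate/1.1/context",
  "@graph": [
    {
      "@id": "ro-crate-metadata.json",
      "@type": "CreativeWork",
      "conformsTo": { "@id": "https://w3id.org/ro/crate/1.1" },
      "about": { "@id": "./" }
    },
    {
      "@id": "./",
      "@type": "Dataset",
      "name": "My Chameleon experiment artifact",
      "description": "Notebooks and scripts for a reproducible experiment."
    }
  ]
}
```

With this file committed, the repository can be imported into Trovi as a research artifact and re-imported whenever the repository is updated.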
This month, we have reminders about upcoming KVM@TACC and CHI@Edge outages. We also announce version 1.1 of python-chi and improvements to reservations!
Making code edits more effective, robust, and transparent through explicit transformation rules
March 24, 2025, by Weichen Li
In this interview, Weichen Li, a PhD student at the University of Chicago, discusses research on improving code editing through explicit transformation rules. Their framework, EditLord, breaks the code editing process down into clear, step-by-step transformations, significantly improving editing performance, robustness, and functional correctness compared to existing methods.
How to Preserve Your Valuable Data on Chameleon Cloud
March 17, 2025, by Marc Richardson
Understanding how to preserve your valuable research on Chameleon Cloud is crucial for research continuity and community contribution. Here's how to extend the lifespan of your resources through smart public sharing.
This month, we are excited to announce new updates to the Trovi dashboard, and the launch of the Chameleon User Forums. Additionally, please note our new data policies, as these will take effect soon!
Chameleon-Powered Research Shows the Path to Efficient Scientific Computing
Scientific workflows often fail in unexpected ways, but traditional detection systems require massive amounts of training data. This groundbreaking approach generates just the right data needed to train anomaly detection models, improving accuracy while reducing resource consumption.
Pardon our dust! This month, we have been revising, modernizing, and upgrading to improve Chameleon services. We have updates on the upcoming KVM plans, FPGA changes, and more.
Streamlining Scientific Validation Through Automated Reproducibility Infrastructure
Jan. 27, 2025, by Klaus Kraßnitzer
The AutoAppendix project evaluates computational artifact reproducibility across SC24 conference submissions, revealing that most researchers struggle with creating truly replicable experiments despite their importance to scientific validity. By developing one-click reproduction templates for the Chameleon Cloud platform, this research aims to transform how computational scientists share and validate their work, potentially saving countless hours of frustration for both authors and reviewers.