Reported Outages

KVM@TACC Issues launching instances

Resolved Posted by Cody Hammock on February 07, 2025
Outage start Friday, February 07, 2025 10 a.m.
Expected end Friday, February 07, 2025 1:26 p.m.

Resolved: Instance launches on KVM@TACC are once again working.  

Some users are experiencing the message "not enough hosts available" when attempting to launch an instance on KVM@TACC. Staff are troubleshooting the problem.

KVM@TACC Unavailable Jan 22, 2025

Resolved Posted by Cody Hammock on January 22, 2025
Outage start Wednesday, January 22, 2025 1 p.m.
Expected end Wednesday, January 22, 2025 3:47 p.m.

Resolved: Service has been restored for KVM@TACC

Starting at approximately 1:00 PM Central, KVM@TACC's authentication service has been interrupted. Staff is working to restore the service.

Upcoming maintenance window for KVM@TACC

Resolved Posted by Michael Sherman on January 17, 2025
Outage start Tuesday, January 28, 2025 6 a.m.
Expected end Tuesday, January 28, 2025 6 p.m.

The Jan 28th maintenance window has concluded, and KVM@TACC is back to normal.

We were able to migrate the load balancer and database services to a redundant configuration, but had to roll back the migration of other control-plane services due to unforeseen complications. We'll be investigating a path forward, including a subsequent (shorter) maintenance window to finish the migration for each service.

Authentication + Trovi maintenance

Resolved Posted by Mark Powers on December 11, 2024
Outage start Thursday, December 19, 2024 9:30 a.m.
Expected end Thursday, December 19, 2024 10:15 a.m.

On Thursday, December 19 at 9:30 am CT, Chameleon's authentication service and Trovi will be down for network maintenance. During this time, users will not be able to access the Chameleon sites (CHI@UC, CHI@TACC, KVM), or authenticate to the user portal. Running nodes will be unaffected.

CHI@UC upstream network issue

Resolved Posted by Michael Sherman on December 05, 2024
Outage start Thursday, December 05, 2024 1:20 p.m.
Expected end Thursday, December 05, 2024 6 p.m.

We're receiving reports of issues reaching CHI@UC, but the issues seem to depend on the source address, so we have not been able to reproduce them consistently.

This manifests as timeouts for the Horizon dashboard at https://chi.uc.chameleoncloud.org, or as disconnections when connecting to instances over SSH.

We're working with our network provider to troubleshoot the issue.
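
Because the problem depends on the source address, a timed connection attempt from an affected network can be useful to include in a report. The following is a minimal sketch, not an official Chameleon tool: the `probe` function and its result fields are hypothetical names chosen for illustration, and it only checks whether a TCP connection can be opened within a timeout.

```python
import socket
import time


def probe(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Attempt one TCP connection and report whether it succeeded.

    `connect` is injectable only so the function can be exercised
    without network access; by default it is socket.create_connection.
    """
    start = time.monotonic()
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError as exc:
        # Timeouts, refused connections, and DNS failures are all OSError subclasses.
        return {
            "host": host,
            "port": port,
            "ok": False,
            "error": type(exc).__name__,
            "seconds": round(time.monotonic() - start, 3),
        }
    conn.close()
    return {
        "host": host,
        "port": port,
        "ok": True,
        "error": None,
        "seconds": round(time.monotonic() - start, 3),
    }
```

For example, `probe("chi.uc.chameleoncloud.org", 443)` tests the dashboard host named above, and `probe(<instance address>, 22)` tests SSH reachability of an instance. Noting the network you ran it from alongside the result makes the report easier to act on.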

CHI@UC network outage

Resolved Posted by Michael Sherman on November 20, 2024
Outage start Wednesday, November 20, 2024 3:31 p.m.
Expected end Wednesday, November 20, 2024 6 p.m.

The outage should now be resolved, but we're continuing to monitor.

Please let us know if you observe new issues connecting to bare metal nodes or accessing shared storage.

CHI@Edge down

Resolved Posted by Michael Sherman on November 05, 2024
Outage start Tuesday, November 05, 2024 12 p.m.
Expected end Wednesday, November 06, 2024 4:21 p.m.

As of around 2pm on Wednesday, CHI@Edge is back online, and device reservation and container launch are working.

We're continuing to monitor the system.

Email notification reliability issues

Resolved Posted by Mark Powers on October 25, 2024
Outage start Friday, October 25, 2024 9 a.m.
Expected end Wednesday, November 06, 2024 11:03 a.m.

Update: Email services should be working as normal again.

Email notifications from our services are not being reliably delivered to Gmail addresses. In particular, this affects lease-end reminder emails. We are working on improving this service in the near future.

Please take care to note the end time of your lease to ensure your experiment isn't terminated unexpectedly.

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Mark Powers on October 21, 2024
Outage start Tuesday, October 29, 2024 9:30 a.m.
Expected end Tuesday, October 29, 2024 9:40 a.m.

At 9:30 AM Central time on Tuesday, October 29th, there will be a brief outage affecting login to all Chameleon sites and services.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Mark Powers on September 17, 2024
Outage start Tuesday, September 24, 2024 9:30 a.m.
Expected end Tuesday, September 24, 2024 9:42 a.m.

At 9:30 AM Central time on Tuesday, September 24th, there will be a brief outage affecting login to all Chameleon sites and services.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.