Reported Outages

CHI@NU down

Posted by Michael Sherman on May 30, 2023
Outage start Tuesday, May 30, 2023 2 a.m.
Expected end Tuesday, May 30, 2023 6 p.m.

We've received alerts that the CHI@NU site is currently down, and are investigating. 

Datacenter Maintenance affecting CHI@UC May 7-12!

Resolved Posted by Michael Sherman on April 19, 2023
Outage start Sunday, May 07, 2023 5 p.m.
Expected end Friday, May 12, 2023 5:55 p.m.

05/12/23: CHI@UC is back online. All systems look stable, but we'll be keeping an eye on it over the weekend to be sure. 

Please let us know if you encounter issues!

Authentication system outage

Resolved Posted by Michael Sherman on April 03, 2023
Outage start Monday, April 03, 2023 10:06 a.m.
Expected end Monday, April 03, 2023 11:01 a.m.

Update 11:01am: The upstream issue seems to have resolved, and sites are again accessible. We'll be monitoring the situation to see if it remains stable.

Authentication to all sites is currently down due to a failure in our central authentication system.

Existing instances should be unaffected, but users won’t be able to create new instances or modify existing ones.

This seems to be caused by a failure in our upstream DNS provider; we are working with them to get an ETA for resolution.

TACC Network maintenance 26 March 2023

Resolved Posted by Cody Hammock on March 23, 2023
Outage start Sunday, March 26, 2023 8 a.m.
Expected end Sunday, March 26, 2023 1:12 p.m.

Update: Maintenance was completed at 1:12 PM (CDT) yesterday, Sunday, 26 March 2023.

Network maintenance will be carried out between 8:00 AM and 2:00 PM (CDT) on Sunday, 26 March 2023. Access to all TACC systems will be unavailable during this time, including CHI@TACC, KVM@TACC, and the Chameleon Portal. Instances will continue to run, but users will have no access to TACC services and systems until the upgrade is complete.

Please submit any questions you may have via the TACC User Portal.

CHI@EVL maintenance window April 3rd

Resolved Posted by Michael Sherman on March 22, 2023
Outage start Monday, April 03, 2023 11 a.m.
Expected end Wednesday, April 05, 2023 4 p.m.

On April 3rd, CHI@EVL will be down while we replace the controller node. We conservatively estimate that it will take about 4 hours before services are restored. Running instances and leases won't be modified, but they will not be accessible during the outage window.

temporary outage for object store at CHI@UC

Resolved Posted by Michael Sherman on March 15, 2023
Outage start Thursday, March 16, 2023 1 p.m.
Expected end Thursday, March 16, 2023 1:30 p.m.

Update 5PM CT: Object store is back online via a workaround.
There will be a blip tomorrow at 1PM so we can test a permanent fix for the issue that triggered this outage.

CHI@IIT currently down

Resolved Posted by Michael Sherman on January 27, 2023
Outage start Friday, January 27, 2023 5:01 p.m.
Expected end Monday, January 30, 2023 5:14 p.m.

CHI@IIT is back up, but we're still waiting for the arrival of replacement hardware. Currently, bringing the site back online requires in-person actions, so you may observe instability until that hardware is installed. We plan for this work to be completed by the end of this week, subject to parts availability.

Network maintenance at TACC January 25, 2023

Resolved Posted by Cody Hammock on January 25, 2023
Outage start Wednesday, January 25, 2023 11 a.m.
Expected end Wednesday, January 25, 2023 4:10 p.m.

COMPLETE: The work is complete. Please let us know via the helpdesk if you encounter any ongoing issues.

In order to perform some necessary network maintenance, there will be brief interruptions to compute instances for CHI@TACC and KVM@TACC.

chi@edge public IPs unavailable

Resolved Posted by Michael Sherman on January 18, 2023
Outage start Wednesday, January 18, 2023 5:35 p.m.
Expected end Tuesday, January 31, 2023 5:11 p.m.

Current status:
Stability issues are resolved: we observed and fixed deadlocks in both container launches and lease creation, caused by an upstream eventlet bug.
Public Floating IPs are now functional again. The network operated by our infrastructure provider was filtering some MAC addresses, which prevented bridged networks from working.