Reported Outages

CHI@NU Public network down

Resolved Posted by Michael Sherman on June 02, 2022
Outage start Wednesday, June 01, 2022 9 p.m.
Expected end Thursday, June 02, 2022 4:36 p.m.

Update: 4:36 PM. NU has restored network connectivity, and the site is back up.


The network providing access to CHI@NU has gone down, preventing access to the site. The web UI and all instances are inaccessible.

Site staff are investigating, and we'll update here with a timeline for resolution.

Baremetal Provisioning Outage for CHI@TACC

Resolved Posted by Cody Hammock on June 01, 2022
Outage start Wednesday, June 01, 2022 8 a.m.
Expected end Friday, June 03, 2022 2:22 p.m.

Resolved: The system is now operating normally. Thank you for your patience.

CHI@TACC is currently experiencing an outage in provisioning baremetal nodes. This does not affect currently running instances, but prevents the launch of new ones. The team is working to resolve the issue.

Upcoming Maintenance window at UC

Resolved Posted by Michael Sherman on May 25, 2022
Outage start Monday, June 06, 2022 8 a.m.
Expected end Tuesday, June 07, 2022 6 p.m.

Update 5:30 pm: Issues are resolved, and all nodes are usable again.


Update 4pm June 7th: Provisioning of baremetal nodes is restored. We're seeing failures to create leases for P2 nodes (types compute_skylake, gpu_rtx_6000), but reservation of P3 nodes is succeeding.

Unplanned Jupyter downtime May 13

Resolved Posted by Jason Anderson on May 13, 2022
Outage start Friday, May 13, 2022 12:06 p.m.
Expected end Friday, May 13, 2022 1:42 p.m.

We are experiencing an outage of the Jupyter environment and are working to restore service shortly. Stay tuned, and apologies for the lack of notice.

Instance Provisioning failures at UC

Resolved Posted by Michael Sherman on May 11, 2022
Outage start Wednesday, May 11, 2022 8 a.m.
Expected end Wednesday, May 11, 2022 4:13 p.m.

Update 4 PM 05/11/2022: This issue is now resolved. Provisioning and connectivity should be restored for all UC nodes.


An issue is affecting the provisioning of new instances on P3 nodes at UC. Existing nodes are unaffected.

Network switch failure for P2 nodes at UC

Resolved Posted by Michael Sherman on May 04, 2022
Outage start Wednesday, May 04, 2022 10 a.m.
Expected end Tuesday, May 10, 2022 6:31 p.m.

Update 6 pm 05/10/22: The outage is now resolved. Both switches are now functional, and P2 nodes from nc01-nc64 are back online. New instances have no issues, but existing instances may still have connectivity issues. If yours does, please try removing and re-attaching the network port to your instance.

Provisioning network failure at CHI@UC

Resolved Posted by Michael Sherman on May 02, 2022
Outage start Friday, April 29, 2022 11:20 a.m.
Expected end Monday, May 02, 2022 1:17 p.m.

Update 05/03/22: This issue is now resolved. It was caused by a combination of two factors: a misconfiguration of the DHCP behavior for out-of-band interfaces, and a failure that caused an out-of-band switch to power off.

All affected nodes should be reservable again. If you have an instance that has become inaccessible, please get in touch with us via the helpdesk.

KVM@TACC Unavailable April 22, 2022

Resolved Posted by Cody Hammock on April 22, 2022
Outage start Thursday, April 21, 2022 8 p.m.
Expected end Friday, April 22, 2022 4:08 p.m.

KVM@TACC was unavailable starting in the evening of April 21, 2022. The issue has been resolved.

CHI@UC down

Resolved Posted by Michael Sherman on March 24, 2022
Outage start Thursday, March 24, 2022 10:25 a.m.
Expected end Thursday, March 24, 2022 12 p.m.

Update: This has been resolved as of 11:42 AM, and the site is back up. Running nodes should not have been affected, aside from the temporary loss of network connectivity.


CHI@UC is currently down due to a failure of the controller node's load-balancer. We will update here with more information.