Reported Outages

Upstream networking issue for CHI@UC

Resolved Posted by Michael Sherman on March 04, 2025
Outage start Tuesday, March 04, 2025 7:05 p.m.
Expected end Wednesday, March 05, 2025 12 p.m.

11AM March 5th:

This outage is now resolved; CHI@UC appears to be working as normal.

We don't yet have a root cause from our provider, but are working with them to prevent a repeat.

Jupyter Planned Outage

Resolved Posted by Mark Powers on February 19, 2025
Outage start Monday, March 03, 2025 9 a.m.
Expected end Tuesday, March 04, 2025 4:48 p.m.

Resolved: Jupyter service should be working as normal.


Update: We are waiting on DNS updates before we can finish updating our Jupyter infrastructure, so the outage is ongoing. Thank you for your patience.


We will be updating our Jupyter infrastructure on the morning of Monday, March 3, which will take down JupyterHub and running JupyterLab instances.

JupyterHub server launch errors

Resolved Posted by Mark Powers on February 11, 2025
Outage start Tuesday, February 11, 2025 3 p.m.
Expected end Tuesday, February 11, 2025 4:30 p.m.

UPDATE: JupyterHub service is back to normal.

---

Users are experiencing errors when launching Jupyter environments on Chameleon's JupyterHub. We are working on a resolution.

TACC Network Maintenance Wednesday 19 February 2025

Resolved Posted by Cody Hammock on February 10, 2025
Outage start Wednesday, February 19, 2025 6 a.m.
Expected end Wednesday, February 19, 2025 6:47 a.m.

Resolved: Work has been completed.

TACC network infrastructure will not be available from 6:00 AM to 7:00 AM (CT) on Wednesday, February 19, 2025, while network maintenance is performed. This will impact access to the Chameleon Portal, JupyterHub, CHI@TACC, KVM@TACC, and CHI@Edge. Experiments and VMs will continue to run, but will not have connectivity outside of TACC.

KVM@TACC Issues launching instances

Resolved Posted by Cody Hammock on February 07, 2025
Outage start Friday, February 07, 2025 10 a.m.
Expected end Friday, February 07, 2025 1:26 p.m.

Resolved: Instance launches on KVM@TACC are once again working.

Some users are experiencing the message "not enough hosts available" when attempting to launch an instance on KVM@TACC. Staff are troubleshooting the problem.

KVM@TACC Unavailable Jan 22, 2025

Resolved Posted by Cody Hammock on January 22, 2025
Outage start Wednesday, January 22, 2025 1 p.m.
Expected end Wednesday, January 22, 2025 3:47 p.m.

Resolved: Service has been restored for KVM@TACC.

Starting at approximately 1:00 PM Central, KVM@TACC's authentication service has been interrupted. Staff is working to restore the service.

Upcoming maintenance window for KVM@TACC

Resolved Posted by Michael Sherman on January 17, 2025
Outage start Tuesday, January 28, 2025 6 a.m.
Expected end Tuesday, January 28, 2025 6 p.m.

The Jan 28th maintenance window has concluded, and KVM@TACC is back to normal.

We were able to migrate the load balancer and database services to a redundant configuration, but had to roll back the migration of other control-plane services due to unforeseen complications. We'll be investigating a path forward, and a subsequent (shorter) maintenance window to finish the migration for each service.

Authentication + Trovi maintenance

Resolved Posted by Mark Powers on December 11, 2024
Outage start Thursday, December 19, 2024 9:30 a.m.
Expected end Thursday, December 19, 2024 10:15 a.m.

On Thursday, December 19 at 9:30 am CT, Chameleon's authentication service and Trovi will be down for network maintenance. During this time, users will not be able to access the Chameleon sites (CHI@UC, CHI@TACC, KVM), or authenticate to the user portal. Running nodes will be unaffected.

CHI@UC upstream network issue

Resolved Posted by Michael Sherman on December 05, 2024
Outage start Thursday, December 05, 2024 1:20 p.m.
Expected end Thursday, December 05, 2024 6 p.m.

We're receiving reports of issues reaching CHI@UC, but the issues seem to depend on the source address, so we have not been able to reproduce them consistently.

This manifests as timeouts for the Horizon dashboard at https://chi.uc.chameleoncloud.org, or disconnections when connecting to instances over SSH.

We're working with our network provider to troubleshoot the issue.

CHI@UC network outage

Resolved Posted by Michael Sherman on November 20, 2024
Outage start Wednesday, November 20, 2024 3:31 p.m.
Expected end Wednesday, November 20, 2024 6 p.m.

The outage should now be resolved, but we're continuing to monitor.

Please let us know if you observe new issues connecting to baremetal nodes, or accessing shared storage.