Outage start |
Thursday, August 01, 2024 6 p.m. |
Expected end |
Tuesday, August 06, 2024 6 p.m. |
We're observing intermittent provisioning failures for a small set of nodes at CHI@UC, all of which are "phase 2" nodes, mostly rtx_6000s.
The nodes currently known to be affected are listed below and have been placed into a non-reservable maintenance mode until we can resolve the issue.
Outage start |
Thursday, July 25, 2024 12 p.m. |
Expected end |
Thursday, July 25, 2024 6 p.m. |
At CHI@UC, users submitting a request for a new lease were intermittently receiving an "enforcement failed" error message. We restarted the relevant service, and lease requests are now succeeding again.
We suspect this issue is related to a service token expiring and not being correctly renewed, and are investigating a proper fix.
Outage start |
Tuesday, June 18, 2024 10 a.m. |
Expected end |
Tuesday, June 18, 2024 10:06 a.m. |
UPDATE: All services are updated and should be working as expected.
On the morning of Tuesday, June 18th, at 9:30 AM Central Time, there will be a brief outage affecting login to all Chameleon sites and services.
We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.
Outage start |
Wednesday, May 29, 2024 10:28 a.m. |
Expected end |
Wednesday, May 29, 2024 11:28 a.m. |
CHI@UC had a brief interruption in connectivity, preventing access to chi.uc.chameleoncloud.org. The UC control-plane servers were unable to contact the Chameleon authentication server, a failure triggered by work elsewhere in the network that was expected to be unrelated.
Once the issue was discovered, the other network changes were backed out, and service has been restored.
Outage start |
Saturday, May 25, 2024 8:30 a.m. |
Expected end |
Tuesday, May 28, 2024 10 a.m. |
CHI@TACC was unavailable from May 25 through the morning of May 28, 2024. This affected the API services and web interface, but did not impact running instances.
We have corrected the problem, and service is restored.
Outage start |
Tuesday, June 04, 2024 9:30 a.m. |
Expected end |
Tuesday, June 04, 2024 10:10 a.m. |
UPDATE: All services are updated and should be working as expected.
On the morning of Tuesday, June 4th, at 9:30 AM Central Time, there will be a brief outage affecting login to all Chameleon sites and services.
We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.