Reported Outages

CHI@Edge down

Resolved Posted by Michael Sherman on November 05, 2024
Outage start Tuesday, November 05, 2024 12 p.m.
Expected end Wednesday, November 06, 2024 4:21 p.m.

As of around 2pm on Wednesday, CHI@Edge is back online, and device reservation and container launch are working.

We're continuing to monitor the system.

Email notification reliability issues

Resolved Posted by Mark Powers on October 25, 2024
Outage start Friday, October 25, 2024 9 a.m.
Expected end Wednesday, November 06, 2024 11:03 a.m.

Update: Email services should be working as normal again.

Email notifications from our services are not being reliably delivered to Gmail addresses. In particular, this affects lease end reminder emails. We are working on improving this service in the near future.

Please take note of the end time of your lease to ensure your experiment isn't terminated unexpectedly.

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Mark Powers on October 21, 2024
Outage start Tuesday, October 29, 2024 9:30 a.m.
Expected end Tuesday, October 29, 2024 9:40 a.m.

On the morning of Tuesday, October 29th, there will be a brief outage affecting login to all Chameleon sites and services at 9:30 AM central time.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Mark Powers on September 17, 2024
Outage start Tuesday, September 24, 2024 9:30 a.m.
Expected end Tuesday, September 24, 2024 9:42 a.m.

On the morning of Tuesday, September 24th, there will be a brief outage affecting login to all Chameleon sites and services at 9:30 AM central time.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.

CHI@NU outage

Resolved Posted by Michael Sherman on September 09, 2024
Outage start Sunday, September 08, 2024 12:45 p.m.
Expected end Wednesday, September 11, 2024 6 p.m.

The CHI@NU site is currently inaccessible due to an issue with its proxy server interfering with federated login.

You will observe symptoms including the message "No active allocations" from the dashboard, or an HTTP 403 error from the API. Don't worry, nothing is wrong with your account. The failure is preventing the site from fetching your list of projects correctly.

We've notified the site operator, and will post updates here.

Upcoming Maintenance window for Chameleon Auth server

Resolved Posted by Mark Powers on August 22, 2024
Outage start Tuesday, August 27, 2024 9:30 a.m.
Expected end Tuesday, August 27, 2024 9:35 a.m.

On the morning of Tuesday, August 27th, there will be a brief outage affecting login to all Chameleon sites and services at 9:30 AM central time.

We expect a 5-10 minute outage while we apply updates to the service that handles federated login.
This won't affect any running workloads or nodes, but you may need to refresh your browser once the outage ends.

DNS outage for chi.uc.chameleoncloud.org

Resolved Posted by Michael Sherman on August 22, 2024
Outage start Thursday, August 22, 2024 2:12 p.m.
Expected end Thursday, August 22, 2024 2:50 p.m.

At 2:12 PM a DNS issue took down access to chi.uc.chameleoncloud.org. We have rolled back the responsible change, and the site should come back online in the next 30 minutes.

CHI@TACC Maintenance window for OpenStack version upgrade

Resolved Posted by Cody Hammock on August 07, 2024
Outage start Wednesday, August 21, 2024 8 a.m.
Expected end Tuesday, August 27, 2024 9:44 a.m.

RESOLVED: CHI@TACC maintenance is complete.

UPDATE: The OpenStack upgrade is complete. Staff are aware of some potential issues launching new instances, and are working to resolve them. Existing instances are unaffected, and once again available.

Starting at 8am central time on August 21st, CHI@TACC will be unavailable for use while we upgrade the core OpenStack services on the controller hosts. During this time, the chi.tacc.chameleoncloud.org webpage and APIs will be inaccessible, as will network connectivity to any running instances.

CHI@UC: some rtx_6000 nodes not provisioning

Resolved Posted by Michael Sherman on August 05, 2024
Outage start Thursday, August 01, 2024 6 p.m.
Expected end Tuesday, August 06, 2024 6 p.m.

We're observing intermittent provisioning failures for a small set of nodes at CHI@UC, all of which are "phase 2" nodes, mostly rtx_6000s.

The nodes currently known to be affected are listed below, and have been placed into a non-reservable maintenance mode until we can resolve the issue.