VM service discovery fail

Incident Report for Flying Circus

Resolved

During a routine update of our Configuration Management Database (CMDB), the service‑discovery mechanism broke. Service discovery is used, for example, to tell a virtual machine which NFS server it should mount.

The CMDB schema was altered in a way that was technically backward‑compatible, but the semantic meaning of several records changed. The existing service‑discovery software continued to run the previous version, interpreting the updated records under the old schema. As a result, it returned incorrect service endpoints.

We will review the update processes within our information security management system to prevent this kind of failure the future.
Posted May 19, 2026 - 09:40 CEST

Monitoring

A fix has been implemented and we are monitoring the results.
Posted May 19, 2026 - 09:14 CEST

Update

We are continuing to work on a fix for this issue.
Posted May 19, 2026 - 09:05 CEST

Identified

Machines were not able to discover services causing outages.
Posted May 19, 2026 - 08:58 CEST
This incident affected: RZOB (production) (VM servers, VM storage cluster) and RZHAL (backup and development data center) (VM servers, VM storage cluster).