Bug: BMCSettings POST-based settings can never be verified via GET — permanent diff loop #858

@atd9876

Which component is affected?

BMCSettings controller (internal/controller/bmcsettings_controller.go) and the OEM helpers verification logic (httpBasedGetBMCSettingAttribute() in bmc/oem_helpers.go).

Describe the bug

When a BMCSettings resource contains a key with the POST HTTP method prefix (e.g., "POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/"), the controller can never verify that the setting was applied successfully. After applying the POST, verification does a GET on the same URI and compares the response to the original POST body using isSubMap(). Per the Redfish specification (DSP0266 v1.20.1 §7.9), POST creates a new collection member with a service-assigned URI, and the response structure is fundamentally different from the request body. This causes a permanent diff, and the controller re-applies the POST on every reconcile cycle (~2 minutes), creating duplicate resources on the BMC indefinitely.

To reproduce

  1. Apply a BMCSettingsSet targeting an HPE iLO BMC with a POST-based key
  2. Wait for the BMCSettings child to enter InProgress and apply settings
  3. Observe that the BMCSettings never transitions to Applied — it stays InProgress forever
  4. Check the BMC's certificate collection — a new duplicate certificate is created every ~2 minutes

Resource definitions used

apiVersion: metal.ironcore.dev/v1alpha1
kind: BMCSettingsSet
metadata:
  name: ad-ldap-hpe-ilo
spec:
  clusterScope: true
  bmcSelector:
    matchLabels:
      manufacturer: hpe
  template:
    version: "1.68"
    serverMaintenancePolicy: Enforced
    settingsMap:
      # PATCH key — works correctly (idempotent)
      "PATCH /redfish/v1/AccountService": |
        {
          "ActiveDirectory": {
            "ServiceEnabled": true,
            "ServiceAddresses": ["ldaps.example.com"]
          }
        }
      # POST key — triggers infinite loop
      "POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/": |
        {
          "CertificateType": "PEM",
          "CertificateString": "-----BEGIN CERTIFICATE-----\nMIID...snip...\n-----END CERTIFICATE-----"
        }

Expected behavior

After successfully POSTing the certificate (HTTP 201 Created), the controller should consider the POST key as applied and not attempt to re-POST on subsequent reconciles. The BMCSettings should transition to Applied state.
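One way to get this behavior is to remember which POST key and body pairs the BMC has already accepted, and to skip them on later reconciles. The following is a minimal sketch of that idea only; `appliedPosts` and `postDigest` are hypothetical names that do not exist in metal-operator, and a real controller would need to persist this record (for example in the resource status) rather than keep it in memory.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// postDigest returns a stable fingerprint of a POST key and its body.
// Hypothetical helper: a changed body yields a new digest, so an updated
// certificate would still be POSTed once.
func postDigest(key, body string) string {
	sum := sha256.Sum256([]byte(key + "\x00" + body))
	return hex.EncodeToString(sum[:])
}

// appliedPosts is a hypothetical record of POSTs the BMC already accepted.
type appliedPosts map[string]struct{}

// needsApply reports whether this key/body pair has not been applied yet.
func (a appliedPosts) needsApply(key, body string) bool {
	_, done := a[postDigest(key, body)]
	return !done
}

// markApplied records a successful POST (for example after a 201 Created).
func (a appliedPosts) markApplied(key, body string) {
	a[postDigest(key, body)] = struct{}{}
}

func main() {
	key := "POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/"
	body := `{"CertificateType": "PEM"}`

	applied := appliedPosts{}
	fmt.Println(applied.needsApply(key, body)) // true: first reconcile POSTs
	applied.markApplied(key, body)
	fmt.Println(applied.needsApply(key, body)) // false: later reconciles skip it
}
```

Keying the record on both the key and the body means an unchanged spec is applied once, while an edited spec triggers exactly one new POST.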

Actual behavior

The controller enters an infinite loop:

  1. POST certificate → succeeds (201)
  2. Verification: GET /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/ returns a collection of member links ({"Members": [{"@odata.id": "/redfish/v1/.../1"}]}) — structurally different from the POST body
  3. isSubMap(specValue, getResponse) returns false → diff detected
  4. Next reconcile: re-enters apply branch, re-POSTs the certificate
  5. Each cycle creates a new duplicate certificate resource on the BMC (or overwrites the existing one, depending on vendor behavior)
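The mismatch in steps 2 and 3 can be reproduced in isolation. The sketch below uses a stand-in `isSubMap` written for illustration; the real helper in bmc/oem_helpers.go may be implemented differently. The extra fields in the PATCH response (`Id`, `Authentication`) are made up to show that a superset still matches.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// isSubMap is a hypothetical stand-in for the helper of the same name.
// It reports whether every key in want exists in got with an equal value,
// recursing into nested JSON objects.
func isSubMap(want, got map[string]any) bool {
	for k, wv := range want {
		gv, ok := got[k]
		if !ok {
			return false
		}
		wm, wIsMap := wv.(map[string]any)
		gm, gIsMap := gv.(map[string]any)
		if wIsMap && gIsMap {
			if !isSubMap(wm, gm) {
				return false
			}
			continue
		}
		if !reflect.DeepEqual(wv, gv) {
			return false
		}
	}
	return true
}

// mustDecode parses a JSON object or panics; enough for a demo.
func mustDecode(s string) map[string]any {
	var m map[string]any
	if err := json.Unmarshal([]byte(s), &m); err != nil {
		panic(err)
	}
	return m
}

func main() {
	// PATCH: the GET response is a superset of the spec, so it matches.
	patchSpec := mustDecode(`{"ActiveDirectory": {"ServiceEnabled": true, "ServiceAddresses": ["ldaps.example.com"]}}`)
	patchGet := mustDecode(`{"Id": "AccountService", "ActiveDirectory": {"ServiceEnabled": true, "ServiceAddresses": ["ldaps.example.com"], "Authentication": {}}}`)
	fmt.Println(isSubMap(patchSpec, patchGet)) // true

	// POST: the GET on the collection returns member links, not the body.
	postSpec := mustDecode(`{"CertificateType": "PEM", "CertificateString": "-----BEGIN CERTIFICATE-----\nMIID...snip...\n-----END CERTIFICATE-----"}`)
	postGet := mustDecode(`{"Members": [{"@odata.id": "/redfish/v1/.../1"}]}`)
	fmt.Println(isSubMap(postSpec, postGet)) // false
}
```

Neither `CertificateType` nor `CertificateString` appears at the top level of the collection response, so the comparison fails regardless of whether the POST succeeded.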

Over time this either exhausts BMC resources (if certificates accumulate) or causes repeated HTTPS service restarts (if the certificate is replaced), both of which can destabilize the management controller.

Controller logs
# Every ~2 minutes, settings are re-applied:
{"level":"debug","msg":"BMC settings issued successfully","SettingKeys":["PATCH /redfish/v1/AccountService","POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/"]}

# Verification always finds a diff:
{"level":"debug","msg":"Current BMC settings fetched","SettingKeys":["PATCH /redfish/v1/AccountService","POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/"]}

# BMCSettings never reaches Applied — stays InProgress indefinitely

Environment

  • metal-operator version/commit: main (pre-PR-812)
  • Kubernetes version: v1.30
  • Deployment method: Helm
  • Hardware / BMC type: HPE iLO 6 firmware v1.68

Additional context

The root cause is in bmc/oem_helpers.go httpBasedGetBMCSettingAttribute() (~line 140). For ALL keys (including POST), it does:

resp, err := c.Get(parts[1])  // GETs the URI from the key
// ... then compares with isSubMap(specValue, getResponse)

Per the Redfish standard:

  • POST is not idempotent — it creates a new resource each time
  • The POST body (e.g., a PEM certificate string) is decomposed by the BMC into structured fields (Subject, Issuer, SerialNumber, etc.)
  • GET on the collection returns member links, not the original POST body
  • Password/key fields return null on GET (security requirement §13.2)

POST keys should instead be verified by the HTTP response code at apply time (201 Created = success). A GET on the same URI returns a different structure, so comparing it with the original POST body cannot match.
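A possible shape for method-aware verification is sketched below. It is not the project's code: `splitSettingKey`, `verifiableByGet`, and `postApplied` are hypothetical helpers, and the sketch assumes keys always have the form "METHOD URI" as in the settingsMap above. Only 201 is treated as success for POST, following the reasoning in this issue; whether other 2xx codes should count is left open.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// splitSettingKey splits a key such as "POST /redfish/v1/..." into its
// HTTP method and URI. Hypothetical helper, assuming exactly two fields.
func splitSettingKey(key string) (method, uri string, err error) {
	parts := strings.Fields(key)
	if len(parts) != 2 {
		return "", "", fmt.Errorf("malformed setting key %q", key)
	}
	return strings.ToUpper(parts[0]), parts[1], nil
}

// verifiableByGet reports whether comparing a GET response with the spec
// body is meaningful. POST creates a new member, so it is excluded.
func verifiableByGet(method string) bool {
	return method != http.MethodPost
}

// postApplied reports whether a POST response code means the resource was
// created. Other 2xx codes are deliberately not handled in this sketch.
func postApplied(statusCode int) bool {
	return statusCode == http.StatusCreated
}

func main() {
	keys := []string{
		"PATCH /redfish/v1/AccountService",
		"POST /redfish/v1/Managers/1/SecurityService/HttpsCert/Certificates/",
	}
	for _, key := range keys {
		method, uri, err := splitSettingKey(key)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s %s: verify by GET = %t\n", method, uri, verifiableByGet(method))
	}
}
```

With a split like this, the GET-and-compare path would only run for keys where `verifiableByGet` is true, and POST keys would be settled by the status code returned when they were applied.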

This issue interacts with the missing ConditionBMCResetPostSettingApply for HPE (Issue #857) — together they cause the controller to re-enter the apply branch every reconcile and re-POST indefinitely.
