Skip to content

Add TTL support for temporary BMC users#798

Open
stefanhipfel wants to merge 5 commits into
mainfrom
feature/bmcuser-ttl
Open

Add TTL support for temporary BMC users#798
stefanhipfel wants to merge 5 commits into
mainfrom
feature/bmcuser-ttl

Conversation

@stefanhipfel
Copy link
Copy Markdown
Contributor

@stefanhipfel stefanhipfel commented Apr 10, 2026

This commit adds automatic expiration functionality for BMCUser resources, enabling temporary users for debugging, maintenance windows, and temporary access scenarios.

Features:

  • Dual expiration approach: TTL (relative) and ExpiresAt (absolute)
  • Immutable expiration time (calculated once at creation)
  • Automatic cleanup of both K8s objects and BMC accounts
  • Warning period with condition changes (Active -> ExpiringSoon -> Expired)

Summary by CodeRabbit

  • New Features

    • Temporary BMC users with automatic expiration via TTL or absolute timestamp; status shows controller-calculated expiration.
    • Expiration lifecycle conditions: Active, ExpiringSoon, Expired; pre-expiration warnings and coordinated requeueing.
  • Validation

    • Server-side validation enforcing TTL vs ExpiresAt mutual exclusivity, TTL limits, and future timestamps; update-time warnings when expiration was already calculated.
  • Documentation

    • Usage guide, README features entry, and sample manifests illustrating temporary user scenarios and monitoring.

@github-actions github-actions Bot added size/XXL api-change documentation Improvements or additions to documentation enhancement New feature or request labels Apr 10, 2026
@stefanhipfel stefanhipfel marked this pull request as ready for review April 21, 2026 12:14
@stefanhipfel stefanhipfel requested a review from a team as a code owner April 21, 2026 12:14
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 21, 2026

Warning

Rate limit exceeded

@stefanhipfel has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 58 minutes and 2 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6cabc11f-d01f-4079-9579-ed475a16f6c3

📥 Commits

Reviewing files that changed from the base of the PR and between 96608ca and 9ef2da6.

⛔ Files ignored due to path filters (2)
  • dist/chart/templates/crd/metal.ironcore.dev_bmcusers.yaml is excluded by !**/dist/**
  • dist/chart/templates/webhook/webhooks.yaml is excluded by !**/dist/**
📒 Files selected for processing (15)
  • README.md
  • api/v1alpha1/bmcuser_types.go
  • api/v1alpha1/zz_generated.deepcopy.go
  • cmd/main.go
  • config/crd/bases/metal.ironcore.dev_bmcusers.yaml
  • config/samples/metal_v1alpha1_bmcuser_temporary.yaml
  • config/webhook/manifests.yaml
  • docs/.vitepress/config.mts
  • docs/api-reference/api.md
  • docs/usage/bmcuser-ttl.md
  • internal/controller/bmcuser_controller.go
  • internal/controller/bmcuser_controller_test.go
  • internal/controller/suite_test.go
  • internal/webhook/v1alpha1/bmcuser_webhook.go
  • internal/webhook/v1alpha1/bmcuser_webhook_test.go
📝 Walkthrough

Walkthrough

Adds temporary-expiration support for BMCUser: new spec.ttl and spec.expiresAt (mutually exclusive), controller logic to calculate/persist status.expiresAt, condition lifecycle updates (Active/ExpiringSoon/Expired), validating webhook enforcing constraints and emitting warnings, CRD/schema updates, samples, docs, and tests.

Changes

Temporary BMCUser expiration (TTL / ExpiresAt)

Layer / File(s) Summary
API Type & Constants
api/v1alpha1/bmcuser_types.go
Add spec.TTL *metav1.Duration, spec.ExpiresAt *metav1.Time, status.ExpiresAt *metav1.Time, status.Conditions []metav1.Condition, XValidation mutual-exclusivity, printer column, and Active condition/reason constants.
DeepCopy
api/v1alpha1/zz_generated.deepcopy.go
Auto-generated DeepCopy support for new nullable fields (TTL, ExpiresAt, and Conditions).
CRD Schema & Printer
config/crd/bases/metal.ironcore.dev_bmcusers.yaml
Add spec.ttl (duration pattern), spec.expiresAt (date-time), status.expiresAt, status.conditions, printer column, and x-kubernetes-validations mutual-exclusivity.
Controller Reconciler
internal/controller/bmcuser_controller.go
Add Conditions *conditionutils.Accessor field; compute and persist status.ExpiresAt once; check expiry and delete expired users; update Active condition (Active/ExpiringSoon/Expired); coordinate reconcile timing with rotation scheduling; add helpers: calculateExpirationTime, checkExpiration, updateActiveCondition, checkAndUpdateExpirationWarning, coordinateRequeue.
Controller Tests & Suite Wiring
internal/controller/bmcuser_controller_test.go, internal/controller/suite_test.go
Add tests asserting TTL/ExpiresAt behavior, immutability of calculated status.ExpiresAt, nil behavior for permanent users; wire Conditions accessor into test manager setup.
Validating Webhook
internal/webhook/v1alpha1/bmcuser_webhook.go, config/webhook/manifests.yaml
Add BMCUserCustomValidator and SetupBMCUserWebhookWithManager; validate mutual exclusivity, future expiresAt, positive TTL ≤ 1 week; emit warnings on updates when status.expiresAt already calculated; register webhook manifest entry.
Webhook Tests
internal/webhook/v1alpha1/bmcuser_webhook_test.go
Unit/integration tests for create/update validation, warning emission, and helper change-detection functions.
Samples & Docs
config/samples/metal_v1alpha1_bmcuser_temporary.yaml, docs/usage/bmcuser-ttl.md, docs/api-reference/api.md, docs/.vitepress/config.mts, README.md
Add sample manifests demonstrating TTL/expiresAt scenarios; add detailed usage doc and API docs; update docs navigation; add Features entry in README.
Main Wiring
cmd/main.go
Wire Conditions accessor into BMCUserReconciler setup and register BMCUser webhook when ENABLE_WEBHOOKS ≠ "false".

Sequence Diagram

sequenceDiagram
    actor Operator as Operator
    participant KAPI as Kubernetes API
    participant WH as BMCUser Webhook
    participant Ctrl as BMCUser Controller
    participant Status as Status Subresource
    participant BMC as BMC System

    Operator->>KAPI: Create BMCUser (spec.ttl / spec.expiresAt)
    KAPI->>WH: Invoke ValidateCreate
    WH->>WH: Check ttl ⊕ expiresAt, ttl bounds, expiresAt > now
    WH-->>KAPI: Admit or Reject
    KAPI-->>Operator: Create/Reject response

    KAPI->>Ctrl: Reconcile BMCUser
    Ctrl->>Ctrl: calculateExpirationTime -> sets status.expiresAt (if unset)
    Ctrl->>Status: Patch status.expiresAt
    Ctrl->>Ctrl: checkExpiration (now > status.expiresAt?)
    alt Expired
        Ctrl->>Status: updateActiveCondition(Expired)
        Ctrl->>BMC: Delete remote BMC user
        Ctrl->>KAPI: Delete BMCUser resource
    else NotExpired
        Ctrl->>Ctrl: checkAndUpdateExpirationWarning -> Active / ExpiringSoon
        Ctrl->>Status: Patch condition
        Ctrl->>Ctrl: coordinateRequeue (rotation vs expiration)
        Ctrl-->>KAPI: RequeueAfter(next)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

highlight

Suggested reviewers

  • afritzler
  • nagadeesh-nagaraja
  • davidgrun
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The PR description provides a clear overview of features and behavior but does not follow the repository's required template structure with 'Proposed Changes' section and 'Fixes #' reference. Restructure the description to follow the template: use 'Proposed Changes' section with bullet points and include 'Fixes #' issue reference.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly and concisely summarizes the main feature: adding TTL (time-to-live) support for temporary BMC users, which is the primary objective of this changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/bmcuser-ttl

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
internal/controller/suite_test.go (1)

322-333: Reuse the shared accessor variable.

All other reconcilers in this function use the accessor declared at line 160. Creating a fresh one here is harmless but inconsistent.

♻️ Proposed change
 		Expect((&BMCUserReconciler{
 			Client:             k8sManager.GetClient(),
 			Scheme:             k8sManager.GetScheme(),
 			DefaultProtocol:    metalv1alpha1.HTTPProtocolScheme,
 			SkipCertValidation: true,
-			Conditions:         conditionutils.NewAccessor(conditionutils.AccessorOptions{}),
+			Conditions:         accessor,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/suite_test.go` around lines 322 - 333, The test creates a
new condition accessor for BMCUserReconciler but other reconcilers reuse the
shared accessor variable declared earlier; update the BMCUserReconciler setup to
use the existing accessor instead of creating a fresh one by replacing the
Conditions: conditionutils.NewAccessor(...) with Conditions: accessor (reference
the accessor variable) when constructing BMCUserReconciler in the
SetupWithManager call so the test is consistent with other reconcilers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/v1alpha1/bmcuser_types.go`:
- Around line 44-49: The ExpiresAt field comment is ambiguous about permanence;
update the doc for ExpiresAt (the ExpiresAt *metav1.Time field) to mirror the
TTL comment style and make clear that when neither ExpiresAt nor TTL is set the
user is permanent and that ExpiresAt and TTL are mutually exclusive (only one
should be set). Reword the sentence "If not set along with TTL, the user is
permanent." to something like: "If neither ExpiresAt nor TTL is set, the user is
permanent." and keep the note about mutual exclusivity.

In `@config/crd/bases/metal.ironcore.dev_bmcusers.yaml`:
- Around line 139-196: status.conditions in the generated CRD is missing
type-keyed list markers so the API server treats it as a plain array; update the
Go API type (BMCUserStatus.Conditions) to add the standard kubebuilder markers
(e.g. // +listType=map and // +listMapKey=type) on the Conditions field in the
BMCUserStatus struct, then run the CRD generator (make manifests) to regenerate
the YAML rather than editing config/crd/bases/*.yaml by hand; ensure the field
name remains Conditions and the element type matches the Condition struct used
by the controller.

In `@config/samples/metal_v1alpha1_bmcuser_temporary.yaml`:
- Around line 38-39: The sample manifest uses a past absolute expiration
"expiresAt" which will fail admission; update the example by either replacing
expiresAt with a far-future placeholder timestamp (e.g., "2099-01-01T00:00:00Z")
or convert the example to use a TTL field (e.g., add a "ttl" key with a duration
like "168h") so the sample remains applyable; locate the expiresAt entry in the
manifest and make the change there.

In `@docs/usage/bmcuser-ttl.md`:
- Around line 32-43: Example BMCUser manifests use past expiresAt timestamps
causing webhook rejection; update the expiresAt fields in the BMCUser examples
(e.g., the manifest with kind: BMCUser and metadata name: maintenance-user and
any other BMCUser examples) to a future ISO8601/RFC3339 timestamp (for example
2030-01-01T00:00:00Z) so the manifests pass the “must be in the future”
validation, and apply the same replacement to any other occurrences of expired
timestamps in the examples.

In `@internal/controller/bmcuser_controller.go`:
- Around line 641-647: The requeue logic in bmcuser_controller.go currently
always sets expirationRequeue to 5 * time.Minute during the warning period which
can delay deletion past status.ExpiresAt; change the branch that runs when
timeUntilExpiration <= warningThreshold to set expirationRequeue to the sooner
of 5 minutes and timeUntilExpiration (e.g., expirationRequeue =
min(5*time.Minute, timeUntilExpiration)), ensuring you handle zero/negative
timeUntilExpiration by clamping to a small positive duration or immediate
requeue so expired BMC users are reconciled at or before status.ExpiresAt.

In `@internal/webhook/v1alpha1/bmcuser_webhook.go`:
- Around line 39-94: In ValidateUpdate of BMCUserCustomValidator apply the same
strict validations as ValidateCreate: enforce mutual exclusivity of Spec.TTL and
Spec.ExpiresAt and return an error if violated, validate Spec.TTL is >0 and <=
10 years, and validate Spec.ExpiresAt (if set) is in the future; additionally,
when oldObj.Status.ExpiresAt != nil produce warnings for any change to TTL or
ExpiresAt including removal (nil), switching between TTL and ExpiresAt, or
modification of values (not just differing non-nil values) so that removal or
switching after an expiration was calculated also yields a warning.

---

Nitpick comments:
In `@internal/controller/suite_test.go`:
- Around line 322-333: The test creates a new condition accessor for
BMCUserReconciler but other reconcilers reuse the shared accessor variable
declared earlier; update the BMCUserReconciler setup to use the existing
accessor instead of creating a fresh one by replacing the Conditions:
conditionutils.NewAccessor(...) with Conditions: accessor (reference the
accessor variable) when constructing BMCUserReconciler in the SetupWithManager
call so the test is consistent with other reconcilers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ac2cc53b-da49-4748-ae8d-07155ec7ddb6

📥 Commits

Reviewing files that changed from the base of the PR and between 87a9fe2 and c4ed27c.

⛔ Files ignored due to path filters (2)
  • dist/chart/templates/crd/metal.ironcore.dev_bmcusers.yaml is excluded by !**/dist/**
  • dist/chart/templates/webhook/webhooks.yaml is excluded by !**/dist/**
📒 Files selected for processing (14)
  • README.md
  • api/v1alpha1/bmcuser_types.go
  • api/v1alpha1/zz_generated.deepcopy.go
  • cmd/main.go
  • config/crd/bases/metal.ironcore.dev_bmcusers.yaml
  • config/samples/metal_v1alpha1_bmcuser_temporary.yaml
  • config/webhook/manifests.yaml
  • docs/.vitepress/config.mts
  • docs/api-reference/api.md
  • docs/usage/bmcuser-ttl.md
  • internal/controller/bmcuser_controller.go
  • internal/controller/bmcuser_controller_test.go
  • internal/controller/suite_test.go
  • internal/webhook/v1alpha1/bmcuser_webhook.go

Comment thread api/v1alpha1/bmcuser_types.go
Comment thread config/crd/bases/metal.ironcore.dev_bmcusers.yaml
Comment thread config/samples/metal_v1alpha1_bmcuser_temporary.yaml Outdated
Comment thread docs/usage/bmcuser-ttl.md
Comment thread internal/controller/bmcuser_controller.go
Comment thread internal/webhook/v1alpha1/bmcuser_webhook.go Outdated
Copy link
Copy Markdown
Member

@afritzler afritzler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a couple of concerns with the current expiration approach.

Hard deletion on expiration feels too aggressive. Once the TTL fires, the BMC account is removed from the hardware immediately. If someone is still mid-debugging session, they lose access with no way to recover. And since the expiration is immutable, they can't extend it either. The Expired condition is also effectively unobservable: it gets set and then r.Delete() is called in the same reconcile, so by the time any monitoring could react, the object is already gone.

I think a two-phase approach would be safer:

  1. On expiration: disable the account on the BMC, keep the K8s object around
  2. After a grace period (or manual cleanup): actually delete (I would delegate the cleanup in the first step to somebody else)

This gives operators time to react, makes the ExpiringSoon warning actually actionable, and preserves an audit trail of who had access and when.

Why a condition instead of a state? The lifecycle here is linear: Active -> ExpiringSoon -> Expired. That's a state machine, not an independent observation about the resource. A status.state field would model this more naturally than an Active condition that's True with reason ExpiringSoon and then False with reason Expired. Conditions are better suited for orthogonal signals (like Ready, Degraded) that can be true/false independently of each other.

Comment thread README.md
Comment on lines +15 to +23
## Features

- **Automated Server Discovery**: Discover and manage bare metal servers through BMC interfaces
- **Server Provisioning**: Automated server boot configuration and provisioning
- **BIOS and BMC Management**: Configure and update BIOS/BMC settings and firmware
- **Temporary Users with TTL**: Automatic expiration and cleanup of temporary BMC users for debugging and maintenance
- **Server Claims**: Reserve and claim servers for specific workloads
- **Server Maintenance**: Coordinate maintenance operations across server fleets

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave this feature out of the landing page (for now).

Comment thread docs/usage/bmcuser-ttl.md
@@ -0,0 +1,185 @@
# Temporary BMC Users with TTL
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the expiration of users to the bmsuser.md?

collapsed: false,
items: [
{ text: 'metalctl', link: '/usage/metalctl' },
{ text: 'Temporary BMC Users', link: '/usage/bmcuser-ttl' },
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the actual doc file: I would move the doc to the concept section of BMCUser.

@stefanhipfel
Copy link
Copy Markdown
Contributor Author

stefanhipfel commented Apr 23, 2026

I have a couple of concerns with the current expiration approach.

Hard deletion on expiration feels too aggressive. Once the TTL fires, the BMC account is removed from the hardware immediately. If someone is still mid-debugging session, they lose access with no way to recover. And since the expiration is immutable, they can't extend it either. The Expired condition is also effectively unobservable: it gets set and then r.Delete() is called in the same reconcile, so by the time any monitoring could react, the object is already gone.

I think a two-phase approach would be safer:

  1. On expiration: disable the account on the BMC, keep the K8s object around
  2. After a grace period (or manual cleanup): actually delete (I would delegate the cleanup in the first step to somebody else)

This gives operators time to react, makes the ExpiringSoon warning actually actionable, and preserves an audit trail of who had access and when.

Why a condition instead of a state? The lifecycle here is linear: Active -> ExpiringSoon -> Expired. That's a state machine, not an independent observation about the resource. A status.state field would model this more naturally than an Active condition that's True with reason ExpiringSoon and then False with reason Expired. Conditions are better suited for orthogonal signals (like Ready, Degraded) that can be true/false independently of each other.

TTL:
I would however expect a Deletion when using TTL, because this is exactly what it means: time to live. I am afraid of having many disabled users in our bmc after a while, that nobody knows about. (And no one dares to delete) Most would just create a new user, once the old one is deleted/disabled anyway. I don't see the benefit of disabling and needing a clean up process.
As a compromise, I would rather keep the BMCUser CRD around with state expired maybe? But I would prefer the User to be deleted on the BMC.

State:
We discussed state some time ago. I think it should come from the BMC itself, e.g. Active, Disabled etc, hence I decided to use Conditions instead, otherwise we would overwrite them.
I could just remove the conditions. BMC User states could be implemented in a separate PR which fetches the actual state from the BMC.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
internal/webhook/v1alpha1/bmcuser_webhook.go (1)

38-64: 💤 Low value

LGTM - correct expiration validation logic.

The validation correctly enforces:

  • Mutual exclusivity of TTL and ExpiresAt
  • Future-date requirement for ExpiresAt (using time.Now() is appropriate for admission webhooks)
  • TTL bounds (positive and ≤ 1 week)

Error messages follow Kubernetes API conventions.

♻️ Optional: Extract max TTL as a constant

Line 58 hardcodes the maximum TTL. Extracting it as a package constant would improve maintainability:

+const maxBMCUserTTL = 168 * time.Hour // 1 week
+
 func validateBMCUserExpirationSpec(spec *metalv1alpha1.BMCUserSpec) error {
   // ...
   if spec.TTL != nil {
     if spec.TTL.Duration <= 0 {
       return fmt.Errorf("spec.ttl must be positive")
     }
-    if spec.TTL.Duration > 168*time.Hour { // 1 week
+    if spec.TTL.Duration > maxBMCUserTTL {
       return fmt.Errorf("spec.ttl exceeds maximum of 1 week")
     }
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/webhook/v1alpha1/bmcuser_webhook.go` around lines 38 - 64, The TTL
max (168*time.Hour) in validateBMCUserExpirationSpec is hardcoded; extract it as
a package-level constant (e.g., MaxBMCUserTTL or MaxTTLDuration = 168*time.Hour)
and replace the literal in the TTL validation (the comparison in
validateBMCUserExpirationSpec and its comment/error message) with that constant
so the limit is centralized and maintainable.
internal/webhook/v1alpha1/bmcuser_webhook_test.go (1)

113-269: 💤 Low value

LGTM - thorough update validation and warning coverage.

The ValidateUpdate tests correctly verify:

  • Rejection of invalid spec values on update
  • Warning emission when TTL or ExpiresAt changes after status.ExpiresAt is calculated
  • All field-change scenarios including removal and mode switching

The test at lines 241-268 correctly expects 2 warnings when switching from TTL to ExpiresAt after expiration was calculated (both fields changed).

🧹 Optional test hygiene improvement

In real webhook admission, newObj would not include status fields (users only update spec). The tests set Status on newUser (lines 175-177, 204-206, 230-232, 258-260), which is harmless since the webhook never reads newObj.Status, but removing it would make tests more realistic:

 newUser := &metalv1alpha1.BMCUser{
   Spec: metalv1alpha1.BMCUserSpec{
     TTL: &metav1.Duration{Duration: 12 * time.Hour},
   },
-  Status: metalv1alpha1.BMCUserStatus{
-    ExpiresAt: &expiresAt,
-  },
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/webhook/v1alpha1/bmcuser_webhook_test.go` around lines 113 - 269,
Tests in ValidateUpdate set Status on newUser even though admission only reads
spec; remove assignments to newUser.Status in the warning test cases to make
them realistic while keeping assertions the same. Locate the tests named "should
warn when TTL changes after expiration calculated", "should warn when ExpiresAt
changes after expiration calculated", "should warn when TTL is removed after
expiration calculated", and "should warn when switching from TTL to ExpiresAt
after expiration calculated" and delete the Status:
metalv1alpha1.BMCUserStatus{ExpiresAt: &expiresAt} (and similar) from the
newUser declarations; leave oldUser.Status intact and keep calls to
validator.ValidateUpdate and the existing Expect assertions unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@internal/webhook/v1alpha1/bmcuser_webhook_test.go`:
- Around line 113-269: Tests in ValidateUpdate set Status on newUser even though
admission only reads spec; remove assignments to newUser.Status in the warning
test cases to make them realistic while keeping assertions the same. Locate the
tests named "should warn when TTL changes after expiration calculated", "should
warn when ExpiresAt changes after expiration calculated", "should warn when TTL
is removed after expiration calculated", and "should warn when switching from
TTL to ExpiresAt after expiration calculated" and delete the Status:
metalv1alpha1.BMCUserStatus{ExpiresAt: &expiresAt} (and similar) from the
newUser declarations; leave oldUser.Status intact and keep calls to
validator.ValidateUpdate and the existing Expect assertions unchanged.

In `@internal/webhook/v1alpha1/bmcuser_webhook.go`:
- Around line 38-64: The TTL max (168*time.Hour) in
validateBMCUserExpirationSpec is hardcoded; extract it as a package-level
constant (e.g., MaxBMCUserTTL or MaxTTLDuration = 168*time.Hour) and replace the
literal in the TTL validation (the comparison in validateBMCUserExpirationSpec
and its comment/error message) with that constant so the limit is centralized
and maintainable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6e36d72c-1cd3-4559-92aa-055de58739dc

📥 Commits

Reviewing files that changed from the base of the PR and between c4ed27c and 96608ca.

⛔ Files ignored due to path filters (1)
  • dist/chart/templates/crd/metal.ironcore.dev_bmcusers.yaml is excluded by !**/dist/**
📒 Files selected for processing (9)
  • api/v1alpha1/bmcuser_types.go
  • config/crd/bases/metal.ironcore.dev_bmcusers.yaml
  • config/samples/metal_v1alpha1_bmcuser_temporary.yaml
  • docs/api-reference/api.md
  • docs/usage/bmcuser-ttl.md
  • internal/controller/bmcuser_controller.go
  • internal/controller/suite_test.go
  • internal/webhook/v1alpha1/bmcuser_webhook.go
  • internal/webhook/v1alpha1/bmcuser_webhook_test.go
✅ Files skipped from review due to trivial changes (1)
  • docs/usage/bmcuser-ttl.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • internal/controller/suite_test.go
  • docs/api-reference/api.md
  • config/samples/metal_v1alpha1_bmcuser_temporary.yaml
  • config/crd/bases/metal.ironcore.dev_bmcusers.yaml
  • api/v1alpha1/bmcuser_types.go
  • internal/controller/bmcuser_controller.go

@stefanhipfel stefanhipfel force-pushed the feature/bmcuser-ttl branch 2 times, most recently from f025d16 to 502c84c Compare May 11, 2026 13:14
This commit adds automatic expiration functionality for BMCUser resources,
enabling temporary users for debugging, maintenance windows, and temporary
access scenarios.

Features:
- Dual expiration approach: TTL (relative) and ExpiresAt (absolute)
- Immutable expiration time (calculated once at creation)
- Automatic cleanup of both K8s objects and BMC accounts
- Warning period with condition changes (Active -> ExpiringSoon -> Expired)
- Smart requeue coordination with password rotation
- Webhook validation for mutual exclusivity and valid values
- Fully backward compatible with existing permanent users

API Changes:
- Added TTL and ExpiresAt fields to BMCUserSpec (mutually exclusive)
- Added ExpiresAt and Conditions to BMCUserStatus
- Added condition constants and types
- Added ExpiresAt printer column for kubectl output

Controller Changes:
- Implemented expiration calculation (once at creation)
- Implemented expiration checking and automatic deletion
- Implemented condition management with warning periods
- Coordinated requeue timing between rotation and expiration
- Added Conditions accessor to BMCUserReconciler

Webhook Changes:
- Created BMCUser validation webhook
- Validates mutual exclusivity of TTL/ExpiresAt
- Validates future dates and reasonable values
- Warns on changes to immutable fields

Testing:
- Added 4 comprehensive unit tests for TTL functionality
- All 105/105 tests passing
- Test coverage maintained

Documentation:
- Added comprehensive usage guide (docs/usage/bmcuser-ttl.md)
- Added examples for debugging, maintenance, and vendor access
- Updated README.md with Features section
- Linked documentation in VitePress navigation
- Added sample YAMLs for temporary users

Files modified:
- api/v1alpha1/bmcuser_types.go
- internal/controller/bmcuser_controller.go
- internal/controller/bmcuser_controller_test.go
- internal/controller/suite_test.go
- cmd/main.go
- config/crd/bases/metal.ironcore.dev_bmcusers.yaml
- config/webhook/manifests.yaml
- dist/chart/templates/* (Helm chart updates)
- README.md
- docs/.vitepress/config.mts

Files added:
- internal/webhook/v1alpha1/bmcuser_webhook.go
- config/samples/metal_v1alpha1_bmcuser_temporary.yaml
- docs/usage/bmcuser-ttl.md

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
This commit fixes two CI failures:

1. Test failures - TTL tests were not properly cleaning up BMCSecrets
   - BMCUsers create secrets with owner references
   - envtest has no garbage collector, so secrets persist after user deletion
   - Added proper cleanup using owner reference matching (like existing rotation test)
   - All 4 new TTL tests now pass cleanly

2. Code generation check failure - API docs were stale
   - Regenerated docs/api-reference/api.md to include TTL/ExpiresAt fields
   - Docs now include spec.ttl, spec.expiresAt, status.expiresAt, status.conditions

Test results:
- All 9 BMCUser tests pass (5 existing + 4 new TTL tests)
- AfterEach cleanup validates no leaked resources

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Resolves all 7 CodeRabbit AI issues from PR review:

1. Fix controller requeue bug that could delay expiration past status.ExpiresAt
   - Change warning period requeue to min(5min, timeUntilExpiration)

2. Refactor webhook validation to close validation gaps
   - Extract validateBMCUserExpirationSpec helper for DRY code
   - Add ttlChanged and expiresAtChanged helpers for nil-safe comparisons
   - Apply full validation on updates (future dates, TTL ranges)
   - Warn on all expiration field changes (removals, switches, modifications)
   - Add comprehensive test coverage (13 new test cases, 346 lines)

3. Add missing CRD list markers for proper API server merge behavior
   - Add +listType=map and +listMapKey=type to Conditions field

4. Update sample YAML timestamps from 2026 to 2099 to prevent validation failures

5. Update documentation example timestamps from 2026 to 2099

6. Clarify ExpiresAt field documentation for permanence condition

7. Fix test suite consistency by reusing shared accessor variable

Additional improvement:
- Reduce TTL maximum from 10 years to 1 week for more reasonable temporary access

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Include updated BMCUser CRD in Helm chart with:
- List type markers for conditions array
- Updated validation for 1 week TTL maximum

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Fixes the CI "Run code generation checks" failure by regenerating the
API reference documentation to match the updated ExpiresAt field comment
from commit 2f8f841.

Changes:
- Run 'make docs' to regenerate docs/api-reference/api.md
  - Updates ExpiresAt description from "If not set along with TTL"
    to "If neither ExpiresAt nor TTL is set" for clarity
- Update example timestamps in docs/usage/bmcuser-ttl.md
  - Change from past dates (2026-04-10/11) to future dates (2026-05-18/19)
  - Maintains 30-minute warning period context in examples

All functional bugs were already fixed in commit 2f8f841 - this is
purely documentation cleanup to unblock CI.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
@stefanhipfel stefanhipfel force-pushed the feature/bmcuser-ttl branch from 502c84c to 9ef2da6 Compare May 11, 2026 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change area/metal-automation documentation Improvements or additions to documentation enhancement New feature or request size/XXL

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants