Skip to content

Revert "feat(controller): pair with controller for channel/device pairing"#1179

Closed
evanjacobson wants to merge 1 commit intomainfrom
revert-1095-feature/kiloclaw-pair-with-controller
Closed

Revert "feat(controller): pair with controller for channel/device pairing"#1179
evanjacobson wants to merge 1 commit intomainfrom
revert-1095-feature/kiloclaw-pair-with-controller

Conversation

@evanjacobson
Copy link
Contributor

Reverts #1095

Auto-merge was configured and the approval triggered merge before Florian could review.

@evanjacobson
Copy link
Contributor Author

We will roll-forward if anything comes up in the post-review for #1095

@@ -588,23 +586,6 @@ export class KiloClawInstance extends DurableObject<KiloClawEnv> {
// ── Lifecycle ───────────────────────────────────────────────────────

async start(userId?: string): Promise<void> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Concurrent start() calls can create duplicate machines

This method no longer has any in-flight guard, so startAsync() and a direct RPC start() can overlap after the first await. If both calls observe flyMachineId === null, they can each fall through to machine creation and orphan a duplicate Fly machine.

}
}

await flyMachines.ensureVolume(flyConfig, this.ctx, this.s, this.env, 'start');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Missing metadata recovery reintroduces duplicate-machine starts

If DO storage loses flyMachineId but the user's machine still exists on Fly, start() now goes straight into volume/machine creation. The removed attemptMetadataRecovery(...) call was the safeguard that reattached to the existing machine instead of provisioning a second one.

// so teardown wins. We bypass loadState() because it no-ops when already loaded.
const currentStatus = await this.ctx.storage.get('status');
if (!currentStatus || currentStatus === 'destroying') {
if (currentStatus === 'destroying') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Deleted instances can be resurrected here

finalizeDestroyIfComplete() clears storage with deleteAll(), which makes ctx.storage.get('status') return undefined. This guard used to bail out on both undefined and 'destroying'; with only the 'destroying' check left, a background start() can repersist status: 'running' after the instance was fully deleted.

})
);
} else {
// Transient Fly API error — leave in 'starting', alarm will retry.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Timed-out starts can get stuck in starting forever

After STARTING_TIMEOUT_MS, a transient getMachine() failure now just logs and returns. That leaves startingAt intact and never transitions back to stopped, so the user cannot retry until Fly happens to respond again.

sql`(${kiloclaw_subscriptions.status} IS NULL OR ${kiloclaw_subscriptions.status} NOT IN ('active', 'trialing'))`,
isNull(kiloclaw_subscriptions.suspended_at)
)
sql`(${kiloclaw_subscriptions.status} IS NULL OR ${kiloclaw_subscriptions.status} NOT IN ('active', 'trialing'))`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Suspended users are now included in earlybird warning emails

Dropping the suspended_at IS NULL filter means accounts already in the suspension/destruction flow also match this query. They'll get a claw_earlybird_* warning even though they're already suspended, which is a user-visible regression from the previous lifecycle behavior.

@kilo-code-bot
Copy link
Contributor

kilo-code-bot bot commented Mar 17, 2026

Code Review Summary

Status: 5 Issues Found | Recommendation: Address before merge

Overview

Severity Count
CRITICAL 0
WARNING 5
SUGGESTION 0

Fix these issues in Kilo Cloud

Issue Details (click to expand)

WARNING

File Line Issue
kiloclaw/src/durable-objects/kiloclaw-instance/index.ts 588 Concurrent start() calls can create duplicate Fly machines.
kiloclaw/src/durable-objects/kiloclaw-instance/index.ts 613 Removing metadata recovery reintroduces duplicate-machine starts when flyMachineId is lost.
kiloclaw/src/durable-objects/kiloclaw-instance/index.ts 765 Background start() can repersist state after deleteAll() and resurrect a deleted instance.
kiloclaw/src/durable-objects/kiloclaw-instance/reconcile.ts 329 Timed-out starts now stay stuck in starting after transient Fly API failures.
src/lib/kiloclaw/billing-lifecycle-cron.ts 191 Earlybird warning sweep now includes already-suspended accounts.
Other Observations (not in diff)

N/A

Files Reviewed (17 files)
  • kiloclaw/controller/src/index.ts - 0 issues
  • kiloclaw/controller/src/pairing-cache.test.ts - 0 issues
  • kiloclaw/controller/src/pairing-cache.ts - 0 issues
  • kiloclaw/controller/src/routes/pairing.test.ts - 0 issues
  • kiloclaw/controller/src/routes/pairing.ts - 0 issues
  • kiloclaw/controller/src/supervisor.test.ts - 0 issues
  • kiloclaw/controller/src/supervisor.ts - 0 issues
  • kiloclaw/src/durable-objects/gateway-controller-types.ts - 0 issues
  • kiloclaw/src/durable-objects/kiloclaw-instance.test.ts - 0 issues
  • kiloclaw/src/durable-objects/kiloclaw-instance/gateway.ts - 0 issues
  • kiloclaw/src/durable-objects/kiloclaw-instance/index.ts - 3 issues
  • kiloclaw/src/durable-objects/kiloclaw-instance/pairing.ts - 0 issues
  • kiloclaw/src/durable-objects/kiloclaw-instance/reconcile.ts - 1 issue
  • src/app/(app)/claw/components/modelSupport.ts - 0 issues
  • src/lib/kiloclaw/billing-lifecycle-cron.ts - 1 issue
  • src/lib/kiloclaw/stripe-handlers.ts - 0 issues
  • src/routers/kiloclaw-router.ts - 0 issues

Reviewed by gpt-5.4-20260305 · 2,144,195 tokens

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant