Skip to content

Dashboard enrichment follow-ups: pending-count index + DNS-error visibility #129

Description

@jiashuoz

Follow-up from #114 review. Two medium-severity items that don't block but are worth cleaning up.

1. Missing supporting index for the PendingCount subquery in `ListAgentsByUser`

The enriched-agent path runs five correlated subqueries per agent row (identity/store.go:584-630). Four of them hit good indexes; the PendingCount subquery does not.

```sql
(SELECT count(*) FROM messages
WHERE agent_id = a.id AND status = 'pending_approval')
```

Existing indexes that the planner can use:

  • `idx_messages_agent_created(agent_id, created_at DESC)` — leading column matches, but `status` is residual.
  • `idx_messages_pending_approval(approval_expires_at) WHERE status='pending_approval'` — partial, but leading column is wrong for an agent lookup.

In practice the planner falls back to `idx_messages_agent_created` and filters residually. For users with normal pending volumes this is fine. For users with many agents × growing archival pending volume, this could degrade list-agents latency over time.

Author already acknowledged the cost concern in #114's commit body ("Switch to denormalized columns if read cost ever bites"). Two cleaner fixes:

Option A (cheapest): add a composite index.
```sql
CREATE INDEX IF NOT EXISTS idx_messages_agent_pending
ON messages (agent_id)
WHERE status = 'pending_approval';
```
Partial index on the pending subset, leading column `agent_id` — exactly what the subquery wants. Idempotent.

Option B (denormalize): cache the count on `agent_identities` with triggers or a periodic refresh job. More moving parts; defer until A proves insufficient.

Recommend A as a 5-line migration.

2. `handleVerifyDomain` silently swallows DNS lookup errors

`internal/agent/api.go` — `checkDomainRecords` was reshaped in #114 to return a per-record diagnostic. Previously, `net.LookupTXT` failures short-circuited with HTTP 400 ("DNS lookup failed for X: ..."). Now the error is swallowed:

```go
if txts, err := net.LookupTXT(domain); err == nil {
// ... process records ...
}
// no else branch — err is dropped
```

Same pattern for `net.LookupMX`. If TXT is genuinely missing, the response is 412 with `mx/spf` all "missing" — fine for the UI which renders chips. But true infrastructure failures (network down, resolver returning SERVFAIL, etc.) now look identical to "TXT record missing" → 412 with same shape. CLI and SDK callers lose the distinction.

Fix: preserve the structured diagnostic but surface the lookup error. Two options:

Option A: add a top-level field.
```json
{
"mx": "missing",
"spf": "missing",
"dkim": "deferred",
"verified": false,
"dns_error": "lookup_failed: dial tcp ..." // optional, only set on err
}
```

Option B: tri-state per record.
```go
type RecordStatus string
const (
RecordFound RecordStatus = "found"
RecordMissing RecordStatus = "missing"
RecordError RecordStatus = "error" // new
RecordDeferred RecordStatus = "deferred"
)
```
Per-record granularity (TXT errors don't affect MX lookup status, etc.) — but more surface area.

Option A is the lower-friction choice. Also: log the actual `err` server-side regardless (the current code drops it on the floor).

Effort

Both small — A1 is a 5-line migration + index, A2 is ~10 lines of Go + a new optional field in the response struct. Could land as one PR.

Test coverage gaps

  • No test that exercises the real-DNS failure path on `handleVerifyDomain`. Hard without DNS mocking — acceptable as long as the new error field is at least visible to the operator via log.
  • No explicit negative-window test for `?window=-5` on `/api/dashboard/stats` (Atoi succeeds, `<= 0` falls back to default). Behavior is correct but unpinned.

Existing mitigations

  • The PR's commit body acknowledges the cost concern on (1) — this is documented, just unaddressed.
  • DKIM "deferred" is correctly handled end-to-end (backend hardcodes; UI hides the row) so (2) doesn't propagate to that surface.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions