Fleet storage-failure cluster post-OTA (ANTHIAS-23/24/25/26/27/28/2B/1H/1M/29/1E/1F) + ASGI websocket race (1K/1N) #3035

@vpetersson

Failing-storage cluster — hardware, not a code regression

A spread of I/O errors across several distinct devices, all downstream of dying SD cards / disks (not fixable in app code; the whitenoise fix #3026 only covers the staticfiles-scan EIO):

  • pi4-64 564343794b17: ANTHIAS-23/24/25/26/27 all come from this single device. It reports [Errno 5] I/O error reading app files, unable to open database file, and redis MISCONF Errors writing to the AOF file: Input/output error. Comprehensive storage failure.
  • pi3-64 c2f30254297c: ANTHIAS-1H ([Errno 5] on /venv/lib/...) and ANTHIAS-1M (No module named 'gi'). 1M is a symptom of the same EIO (the import failed because the filesystem couldn't read the module), not a real packaging bug.
  • pi3 e6cf413fb85d: ANTHIAS-28 database disk image is malformed.
  • pi4-64 f02cb9d2cd6a: ANTHIAS-2B [Errno 5].
  • non-balena x86 e33e2b310559: ANTHIAS-1E/1F database is locked. This is the same box tracked in #3029 (Residual SQLite 'database is locked' at connect-time under heavy Channels load on x86).
  • non-balena x86 shivaguru-LO…: ANTHIAS-29 unable to open database file.

Pattern worth noting: this cluster appeared right after the OTA roll. Each OTA rewrites image layers, which stresses marginal SD cards into failure — the same mechanism as ANTHIAS-Y. Action: fleet-health triage (identify + reflash/replace these devices), not a code change. Resolving these in Sentry would be misleading; leaving open as a health signal.
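
For the triage step, a rough sketch of how events could be grouped per device so that secondary symptoms (such as ANTHIAS-1M) get attributed to the same failing disk. This is illustrative only: `storage_suspect_devices`, the `(device, issue, message)` tuple shape, and the signature list are assumptions, not anything that exists in the Anthias codebase or the Sentry API. The signatures are copied from the messages quoted above; "database is locked" is left out because lock contention on its own does not prove a disk fault.

```python
import re
from collections import defaultdict

# Signatures taken from the error messages quoted in this issue.
# The list itself is an assumption; adjust it to whatever the fleet reports.
STORAGE_SIGNATURES = [
    re.compile(r"\[Errno 5\]"),
    re.compile(r"unable to open database file"),
    re.compile(r"database disk image is malformed"),
    re.compile(r"Errors writing to the AOF file"),
]


def storage_suspect_devices(events):
    """Map device id -> sorted issue ids for devices showing a storage signature.

    `events` is an iterable of (device_id, issue_id, message) tuples
    (a hypothetical shape). Once any event on a device matches, every
    event from that device is attributed to it, since an unreadable
    filesystem can surface as unrelated-looking errors.
    """
    by_device = defaultdict(list)
    for device, issue, message in events:
        by_device[device].append((issue, message))

    suspects = {}
    for device, items in by_device.items():
        matched = any(
            sig.search(message)
            for _, message in items
            for sig in STORAGE_SIGNATURES
        )
        if matched:
            suspects[device] = sorted(issue for issue, _ in items)
    return suspects


# Sample rows built from the devices listed in this issue.
EVENTS = [
    ("c2f30254297c", "ANTHIAS-1H", "[Errno 5] on /venv/lib/..."),
    ("c2f30254297c", "ANTHIAS-1M", "No module named 'gi'"),
    ("e6cf413fb85d", "ANTHIAS-28", "database disk image is malformed"),
]

if __name__ == "__main__":
    for device, issues in storage_suspect_devices(EVENTS).items():
        print(device, issues)
```

Here ANTHIAS-1M is pulled into the c2f30254297c group even though its own message has no storage signature, which mirrors the reasoning in the list above.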

ANTHIAS-1K / 1N — ASGI websocket lifecycle race (x86)

Two mirrored errors raised in anthias_server.app.consumers: RuntimeError: Unexpected ASGI message 'websocket.send', after sending 'websocket.close', and Expected ASGI message 'websocket.send' or 'websocket.close'. The asset-update consumer sends on a socket the client has already closed. Fix: guard the send, or treat a closed socket as a normal disconnect. Low volume, seen on the new build.
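
A minimal sketch of the "guard the send" idea, assuming the consumer can route its outgoing messages through a small wrapper. The consumer code is not shown in this issue, so `safe_send`, `CLOSED_SOCKET_MARKERS`, and the fake send callables below are hypothetical names for illustration. The sketch only uses asyncio and matches on the two message fragments quoted above; any other RuntimeError is re-raised.

```python
import asyncio

# Message fragments quoted in this issue. Matching on text is an assumption
# made for the sketch; a real fix might track connection state instead.
CLOSED_SOCKET_MARKERS = (
    "Unexpected ASGI message 'websocket.send'",
    "Expected ASGI message 'websocket.send' or 'websocket.close'",
)


async def safe_send(send, payload):
    """Await send(payload); return False if the socket was already closed.

    `send` is any async callable (hypothetical stand-in for the consumer's
    send method). Unrelated RuntimeErrors propagate unchanged.
    """
    try:
        await send(payload)
        return True
    except RuntimeError as exc:
        if any(marker in str(exc) for marker in CLOSED_SOCKET_MARKERS):
            # Treat as a normal disconnect rather than an error.
            return False
        raise


async def closed_send(payload):
    """Fake send that fails the way a closed websocket does."""
    raise RuntimeError(
        "Unexpected ASGI message 'websocket.send', "
        "after sending 'websocket.close'"
    )


async def open_send(payload):
    """Fake send that succeeds."""
    return None


if __name__ == "__main__":
    print(asyncio.run(safe_send(open_send, "asset-update")))
    print(asyncio.run(safe_send(closed_send, "asset-update")))
```

The caller can use the boolean result to stop sending and run its usual disconnect cleanup, so the race no longer shows up in Sentry as an unhandled exception.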

🤖 Generated with Claude Code
