fix(server): stream the backup as an async iterator so it doesn't time out#3074

Merged
vpetersson merged 3 commits into master from fix/backup-stream-async-asgi
Jun 12, 2026

@vpetersson vpetersson commented Jun 12, 2026

Issues Fixed

Fixes issue #3073 — "Unable to Get backup since recent updates" (download dialog appears, body sits at 0 B, then times out on Pi 3B+/Pi 4B running v2026.06.3).

Description

The streaming backup added in #3005 never actually streamed under ASGI. Under ASGI, StreamingHttpResponse streams only an asynchronous iterator. Handed the synchronous stream_backup() generator, Django 5.2's __aiter__ falls into its sync-iterator branch and does await sync_to_async(list)(...), which drains the whole generator (building the entire tar.gz and buffering every chunk in an in-memory list) before the first response byte goes out. The headers, and with them the browser's download prompt, ship immediately; the body then stays at 0 B for the multi-minute build and the browser aborts. This is the exact failure #3005 set out to fix, plus an OOM risk on a 1 GB Pi with a multi-GB library.
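
The buffering behaviour described above can be reproduced without Django. The following is a minimal sketch using only the standard library: `loop.run_in_executor(None, list, gen)` stands in for `sync_to_async(list)(...)`, and `slow_chunks()` and `drained_first()` are hypothetical names, not code from this PR. It records the order of events to show that every chunk is produced before the first one is sent.

```python
import asyncio
from typing import AsyncIterator, Iterator, List

events: List[str] = []


def slow_chunks(n: int) -> Iterator[bytes]:
    """Hypothetical stand-in for a blocking producer such as stream_backup()."""
    for i in range(n):
        events.append(f"produced {i}")
        yield b"chunk-%d" % i


async def drained_first(gen: Iterator[bytes]) -> AsyncIterator[bytes]:
    """Mimic the sync-iterator branch: list() the generator in a worker
    thread, then replay the buffered chunks."""
    loop = asyncio.get_running_loop()
    # Assumption: run_in_executor stands in for asgiref's sync_to_async here.
    chunks = await loop.run_in_executor(None, list, gen)
    for chunk in chunks:
        yield chunk


async def consume() -> None:
    async for chunk in drained_first(slow_chunks(3)):
        events.append(f"sent {chunk.decode()}")


asyncio.run(consume())
print(events)
```

In this sketch all three "produced" events precede the first "sent" event, which is the same shape as the 0 B stall followed by a burst.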

The unit tests only called the helper directly, so they never traversed Django's ASGI __aiter__ and didn't catch it.

Fix:

  • Add astream_backup(), an async wrapper that pulls each chunk from the existing producer/pipe generator via sync_to_async(next, thread_sensitive=False) so bytes flow as the archive is built and the memory footprint stays flat. thread_sensitive=False keeps the blocking pipe read off Django's single shared sync executor.
  • On client disconnect Django aclose()s the async generator; we close the underlying sync generator so the producer thread stops tarring instead of leaking.
  • Point the download view at astream_backup() (so response.is_async is True and Django takes its real streaming path).
  • Add a regression test that drives aiter(StreamingHttpResponse(...)) — the real ASGI consumption path — and round-trips through recover().
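
The wrapper described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in backup_helper.py: it uses `loop.run_in_executor` from the standard library in place of `sync_to_async(..., thread_sensitive=False)`, and `astream()`, `next_or_none()` and `producer()` are hypothetical names. A `None` sentinel marks exhaustion because `StopIteration` cannot be raised into a future.

```python
import asyncio
from typing import AsyncIterator, Iterator, List, Optional

events: List[str] = []


def next_or_none(gen: Iterator[bytes]) -> Optional[bytes]:
    """Return the next chunk, or None once the generator is exhausted."""
    return next(gen, None)


async def astream(gen) -> AsyncIterator[bytes]:
    """Pull one chunk at a time in a worker thread and yield it immediately."""
    loop = asyncio.get_running_loop()
    try:
        while True:
            # Assumption: stands in for sync_to_async(next, thread_sensitive=False).
            chunk = await loop.run_in_executor(None, next_or_none, gen)
            if chunk is None:
                break
            yield chunk
    finally:
        # Runs on normal exhaustion and when the consumer calls aclose().
        gen.close()


def producer(n: int) -> Iterator[bytes]:
    """Hypothetical blocking producer that logs when it is torn down."""
    try:
        for i in range(n):
            events.append(f"produced {i}")
            yield b"x" * 4
    finally:
        events.append("producer closed")


async def main() -> None:
    async for chunk in astream(producer(3)):
        events.append(f"sent {len(chunk)}")


asyncio.run(main())
print(events)
```

Here "produced" and "sent" events alternate, so the first bytes leave as soon as the first chunk exists. This simple version does not guard against a cancellation that arrives while a `next()` call is still running in the worker thread.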

Checklist

  • I have performed a self-review of my own code.
  • New and existing unit tests pass locally with my changes.
  • I have done an end-to-end test for Raspberry Pi devices (Pi 4B — see verification comment).
  • I have tested my changes for x86 devices.
  • I have added documentation for the changes I have made (when necessary).

🤖 Generated with Claude Code

…e out

StreamingHttpResponse only streams an asynchronous iterator under ASGI.
Handed the sync stream_backup() generator, Django's __aiter__ falls back
to `await sync_to_async(list)(...)`, which drains the whole generator —
building the entire archive (buffered in RAM) before the first response
byte. That silently reintroduced the 0-bytes-then-timeout failure the
streaming path was meant to fix and risks OOM on a 1 GB Pi (issue #3073).

- add astream_backup(): async wrapper pulling each chunk via
  sync_to_async(thread_sensitive=False) so bytes flow as the tar builds
- close the underlying sync generator on disconnect to stop the producer
- point the download view at astream_backup() (is_async path)
- regression test drives aiter(StreamingHttpResponse(...)) — the real
  ASGI consumption path the unit-level test never exercised

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vpetersson vpetersson requested a review from a team as a code owner June 12, 2026 08:38
@vpetersson vpetersson self-assigned this Jun 12, 2026
@vpetersson vpetersson requested a review from Copilot June 12, 2026 08:38

Copilot AI left a comment

Pull request overview

This PR fixes backup downloads timing out under ASGI by ensuring the backup stream is an async iterator, so Django’s ASGI streaming path sends bytes as the archive is produced (instead of buffering the entire generator in memory first).

Changes:

  • Add astream_backup() as an async wrapper around the existing stream_backup() generator.
  • Update the settings backup download view to use astream_backup() so ASGI truly streams.
  • Add a regression test that iterates StreamingHttpResponse via the ASGI aiter(...) consumption path and validates recovery.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_backup_helper.py | Adds an ASGI-path regression test validating async streaming + recover() round-trip. |
| src/anthias_server/lib/backup_helper.py | Introduces astream_backup() async generator wrapper using sync_to_async to pull chunks. |
| src/anthias_server/app/views.py | Switches the backup download endpoint from stream_backup() to astream_backup() and updates docstring. |

Comment thread src/anthias_server/lib/backup_helper.py Outdated
@vpetersson

On-device verification (Raspberry Pi 4 Model B Rev 1.5, 4 GB)

Built the pi4-64 server image from this branch, deployed it to a Pi 4 testbed, populated the device to 835 MB / 17 assets (uploaded sample videos via the real /api/v2/file_asset + /api/v2/assets flow), and drove the actual POST /settings/backup/ web endpoint with a real CSRF token, sampling the downloaded file size every second.

Before (current master image blank-3f05ead-pi4-64, the broken sync path):

t= 0s..68s  size=0          ← dead air: whole ~877 MB archive built in RAM first
t=69s       size=157 MB     ← then dumped all at once
t=71s       size=757 MB
TTFB=0.012s  TOTAL=72.0s  SIZE=877 MB  HTTP=200

The body sat at 0 bytes for 69 seconds (TTFB only reflects the headers — which is why the browser shows a download dialog that then stalls). The full archive was also held in RAM. On a multi-GB library this dead air exceeds the browser's request timeout → exactly issue #3073.

After (this branch, astream_backup()):

t= 1s  size=5.9 MB
t=10s  size=103 MB
t=30s  size=316 MB
t=60s  size=646 MB
t=81s  size=873 MB   (done)
TTFB=0.013s  TOTAL=82.1s  SIZE=877 MB  HTTP=200

Bytes flow from the first second and grow steadily (~11 MB/s) for the whole build — the browser sees continuous progress and never stalls, and server memory stays flat. The streamed archive passes gzip -t and tar tzf (contains .anthias/anthias.db, .anthias/anthias.conf, anthias_assets/), and the server logs show no "StreamingHttpResponse must consume synchronous iterators" warning — confirming Django took the async streaming path.
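
The gzip -t and tar tzf checks can be approximated in Python with the standard tarfile module. The sketch below is only an illustration: it builds a tiny in-memory archive rather than reading a real backup, the helper names and the `video.mp4` member are hypothetical, and the expected entries are the three listed above. Listing the member names forces the gzip stream to be decompressed, so a corrupt archive would raise an error.

```python
import io
import tarfile
from typing import List

# Entries the verification comment reports finding in the streamed archive.
EXPECTED = (".anthias/anthias.db", ".anthias/anthias.conf", "anthias_assets")


def build_sample_archive() -> bytes:
    """Build a tiny tar.gz in memory (hypothetical contents, for illustration)."""
    members = [
        (".anthias/anthias.db", b"db"),
        (".anthias/anthias.conf", b"conf"),
        ("anthias_assets/video.mp4", b"video"),  # hypothetical asset name
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def missing_entries(archive: bytes) -> List[str]:
    """Return the expected entries that are absent from the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        names = tar.getnames()  # reads and decompresses the whole archive
    return [
        entry
        for entry in EXPECTED
        if not any(n == entry or n.startswith(entry + "/") for n in names)
    ]


sample = build_sample_archive()
missing = missing_entries(sample)
print(missing)
```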

Testbed restored to its original image and asset library afterwards.

  • End-to-end tested on a Raspberry Pi (Pi 4B).

- wrap next() in a typed helper with a None sentinel; sync_to_async(next)
  resolved next's single-arg overload, tripping mypy on the 2-arg call
- use a distinct file handle in the regression test (text- then
  binary-mode reuse of one variable is a mypy type conflict)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/anthias_server/lib/backup_helper.py
Copilot review: with next() and close() on sync_to_async's shared pool,
a client disconnect mid-next() could run gen.close() on another thread
concurrently — raising "generator already executing" and leaking the
producer thread.

- drive next() and the cleanup close() through one dedicated
  single-worker executor so they can never overlap; the queued close()
  runs only after any in-flight next() returns
- add a regression test that aclose()s the async generator mid-stream
  and asserts the producer thread exits cleanly

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
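
The serialization idea from this commit can be sketched as follows, assuming a plain `ThreadPoolExecutor(max_workers=1)` and `loop.run_in_executor` rather than whatever the merged astream_backup() actually uses. `astream_serialized()`, `next_or_none()` and `producer()` are hypothetical names. Because one worker thread runs both `next()` and `close()`, the queued `close()` can only start after any in-flight `next()` has returned.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import AsyncGenerator, Iterator, List, Optional

events: List[str] = []


def next_or_none(gen: Iterator[bytes]) -> Optional[bytes]:
    """Return the next chunk, or None once the generator is exhausted."""
    return next(gen, None)


async def astream_serialized(gen) -> AsyncGenerator[bytes, None]:
    """Drive next() and close() through one single-worker executor."""
    loop = asyncio.get_running_loop()
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        while True:
            chunk = await loop.run_in_executor(executor, next_or_none, gen)
            if chunk is None:
                break
            yield chunk
    finally:
        # Queued behind any running next(), so the two calls never overlap.
        await loop.run_in_executor(executor, gen.close)
        executor.shutdown(wait=False)


def producer() -> Iterator[bytes]:
    """Hypothetical endless producer that logs when it is torn down."""
    i = 0
    try:
        while True:
            events.append(f"produced {i}")
            yield b"data"
            i += 1
    finally:
        events.append("producer closed")


async def main() -> bytes:
    stream = astream_serialized(producer())
    first = await stream.__anext__()
    await stream.aclose()  # simulate a client disconnect mid-stream
    return first


first_chunk = asyncio.run(main())
print(first_chunk, events)
```

The usage at the bottom mirrors the shape of the regression test described above: it closes the async generator mid-stream and then checks that the producer's cleanup ran.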
@vpetersson vpetersson requested a review from Copilot June 12, 2026 10:51

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

@vpetersson vpetersson merged commit eb8909a into master Jun 12, 2026
10 checks passed
@vpetersson vpetersson deleted the fix/backup-stream-async-asgi branch June 12, 2026 11:03