Skip to content

Convert report API to ADRF views (prep for async embedding trigger)#230

Open
samuelvkwong wants to merge 29 commits into
mainfrom
feat/adrf-views
Open

Convert report API to ADRF views (prep for async embedding trigger)#230
samuelvkwong wants to merge 29 commits into
mainfrom
feat/adrf-views

Conversation

@samuelvkwong

@samuelvkwong samuelvkwong commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

Summary

Convert the report API from sync DRF to async ADRF as a 1:1 viewset translation, with the async/sync seam expressed uniformly across every handler. radis/reports/api/viewsets.py keeps its name and ReportViewSet keeps its place; the structural changes are:

  • viewsets.GenericViewSetadrf.viewsets.GenericViewSet with adrf.mixins.* async mixins.
  • def create/retrieve/update/destroyasync def acreate/aretrieve/aupdate/adestroy.
  • @action(detail=False, …) for bulk_upsert stays a single class member, now async def.
  • New radis/reports/api/operations.py — async write operations (create_report_from_validated, update_report_from_validated, delete_report) using native async ORM (aget_or_create, acreate, asave, aset, aclear, adelete). Operations do not own transactions.
  • ReportSerializer becomes adrf.serializers.ModelSerializer with async def acreate / aupdate. Each of those is a @sync_to_async(thread_sensitive=True) @transaction.atomic helper that bridges to the corresponding operation via async_to_sync(...). The serializer owns its atomicity; the view does not.
  • The _bulk_upsert_reports helper (legacy) is renamed to bulk_upsert_reports, stays in viewsets.py, and is now async def. Its phases are explicitly split:
    • Phase 1 (payload dedupe, pure CPU) — @sync_to_async(thread_sensitive=True) helper.
    • Phase 2 (Language/Modality/existing-Reports preflight) — native async ORM (async for, await abulk_create).
    • Phase 3 (build new_reports / updated_reports lists, pure CPU) — @sync_to_async(thread_sensitive=True) helper.
    • Phase 4 (atomic writes) — @sync_to_async(thread_sensitive=True) @transaction.atomic helper with inline sync ORM (bulk_create / bulk_update / through-table churn).

URLs are wired through adrf.routers.DefaultRouter (not DRF's). See "Router choice is load-bearing" below.

Why now: the follow-up PR will embed each uploaded report inline, during the upload request, by awaiting the async embedding client from inside the view so the report's vector is populated by the time the API returns (no eventual-consistency window). The embedding call is I/O-bound, so an async view lets us yield to the event loop while it's in flight instead of pinning a worker thread. This PR is the structural prerequisite — no inline embedding wiring yet.

Bridge architecture

Each view handler is a pure async coordinator:

  • await sync_to_async(serializer.is_valid)(raise_exception=True) — DRF is_valid is sync; sync_to_async-wrapped from the view.
  • report = await serializer.asave() — async-native, the serializer's acreate/aupdate owns the atomic block.
  • transaction.on_commit(on_commit) — registered after asave returns; fires immediately (atomic already committed) or queues under the test fixture (django_capture_on_commit_callbacks).
  • await sync_to_async(lambda: serializer.data)() — DRF to_representation is sync.

The atomic-owning helper inside ReportSerializer.acreate / aupdate is:

@sync_to_async(thread_sensitive=True)
@transaction.atomic
def _atomic():
    return async_to_sync(operations.create_report_from_validated)(validated_data)
return await _atomic()

thread_sensitive=True ensures the outer wrapper, the inner async_to_sync event loop, and the sync adapters Django's async ORM calls internally all run on the same Django thread, so the transaction context applies to every write the operation performs.

adestroy follows the same shape directly (no serializer involved): a @sync_to_async(thread_sensitive=True) @transaction.atomic helper calling async_to_sync(operations.delete_report)(report).

bulk_upsert_reports Phase 4 keeps inline sync ORM in its atomic helper because the writes are single-statement bulk operations that don't decompose into per-entity ops.

Router choice is load-bearing

DRF's rest_framework.routers.DefaultRouter maps HTTP methods to the sync action names (POST → create, PUT → update, GET → retrieve, DELETE → destroy). Because adrf.mixins.* inherit from DRF's sync mixins, those sync method names exist on the class as fully-functional sync methods — so DRF-router-driven dispatch silently calls the inherited sync mixin implementations and never reaches our acreate/aretrieve/aupdate/adestroy overrides. The four tests that exercise behavior added by our overrides (created/deleted on_commit handlers, ?full=true documents, ?upsert=true 201) fail invisibly under this misconfiguration.

adrf.routers.DefaultRouter rewrites the action mapping to the a-prefixed names whenever view_is_async=True, so dispatch reaches our overrides. Verified locally with resolve(...).func.actions == {"post": "acreate", "put": "aupdate", ...}.

The async-shape guard in test_report_api.py pins every dispatched method to inspect.iscoroutinefunction to catch a future contributor accidentally overriding the sync sibling — but that guard cannot catch a mis-wired router. The router choice is part of the architectural contract.

API contract — byte-for-byte unchanged

URLs (/api/reports/, /api/reports/{document_id}/, /api/reports/bulk-upsert/), route names (report-list, report-detail, report-bulk-upsert), response shapes, status codes, query params (?upsert, ?full, ?replace), 405-for-PATCH, and the 403-vs-404 semantics on PUT?upsert=true against an unknown id are all preserved. The router's default lookup_value_regex = [^/.]+ already forbids . in document_id — the legacy behavior — without explicit configuration.

The browsable API root at /api/reports/ (auto-generated DefaultRouter index view) is preserved as well.

Contract is locked by radis/reports/tests/test_report_api.py — written before the rewrite, passing against both the legacy DRF viewset (during intermediate commits) and the new ADRF viewset.

Test plumbing

HTTP-based tests use django.test.AsyncClient + async def + @pytest.mark.asyncio + @pytest.mark.django_db(transaction=True). The sync Client would dispatch the async view via async_to_sync, which nested with our database_sync_to_async deadlocks asgiref's thread executor (Django 6.0 + channels 4). AsyncClient runs the view in the test's event loop with no outer wrapping. transaction=True is needed because the database_sync_to_async thread doesn't see a TestCase-wrapped transaction.

Sync ORM helpers (UserFactory.create, Token.objects.create_token) are wrapped in await sync_to_async(...)() at the call site to satisfy Django's async_unsafe guard.

Note on Django's current async ORM surface

Every a* method on QuerySet/Manager (aget, aget_or_create, abulk_create, aset, acreate, asave, adelete, ...) is literally await sync_to_async(self.X)() in Django 6.0 / 6.1 — there is no native async DB backend in core today. Phase 2's async for / await calls dispatch to the asgiref thread pool just like our explicit sync_to_async calls. The win today is architectural clarity, not runtime concurrency. Once a native async DB backend ships, the @sync_to_async @transaction.atomic + async_to_sync(operations.X) helpers in the serializer, adestroy, and Phase 4 collapse to async with async_atomic(): return await operations.X(...). operations.py does not change; Phases 1 and 3 stay sync_to_async-wrapped because they are pure CPU.

Out of scope

  • Wiring the inline async embedding call into the create/update paths (follow-up PR).
  • Other API surfaces (radis.search, radis.chats, radis.extractions, ...).
  • Migrations, settings, env vars.

Design + plan docs

  • docs/superpowers/specs/2026-06-08-adrf-report-views-design.md
  • docs/superpowers/plans/2026-06-08-adrf-report-views.md

Both docs explain the router-choice trap explicitly so the next person reading the spec doesn't repeat the mistake this PR made and recovered from.

Test plan

  • uv run cli lint — 0 errors, 0 warnings, 0 informations (ruff + pyright + djlint)
  • uv run cli test -- --cov243 passed on the latest CI run.
  • Manual smoke — endpoint coverage is in test_report_api.py (URL resolution, POST/GET/PUT/DELETE/PATCH→405, ?upsert=true 201/403, ?full=true document fetchers, bulk-upsert reject paths, async-shape guard pinning every dispatched method to iscoroutinefunction).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Reports API now supports asynchronous request processing and async bulk-upsert for improved performance.
  • Behavior Changes / Bug Fixes

    • PATCH is no longer supported (returns 405).
    • Bulk-upsert enforces payload validation and clearer per-item error reporting with truncation.
  • Tests

    • Added end-to-end async tests covering list, detail, create, update/upsert, delete, bulk-upsert, permissions, and event handlers.
  • Documentation

    • Expanded design and migration plan with verification checklist.
  • Chores

    • Added configurable bulk-upsert DB batch size.

samuelvkwong and others added 6 commits June 8, 2026 07:11
Capture the design for replacing the DRF ReportViewSet with three
adrf.views.APIView subclasses (list/detail/bulk-upsert), preserving the
existing API contract. This is the structural prerequisite for a
follow-up PR that triggers the async embedding pipeline from the
upload path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five-task plan, TDD-flavored:
1. Move bulk_upsert_reports into its own module (pure refactor)
2. Lock the wire contract with end-to-end tests + async-shape guards
3. Add the three ADRF view classes
4. Swap urls.py to the new views; delete ReportViewSet
5. Lint + full test + manual smoke + open PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/bulk.py

Pure code move with one rename (_bulk_upsert_reports -> bulk_upsert_reports)
since it's now the only public symbol of the new module. The DRF viewset
becomes a thinner HTTP wrapper. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lock the wire-level contract for all five report endpoints before the
ADRF rewrite. The three iscoroutinefunction guards fail today and will
go green once the new ADRF view classes land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce ReportListAPIView, ReportDetailAPIView, and
ReportBulkUpsertAPIView following ADIT's adrf.views.APIView pattern.
The classes are unreachable until urls.py is swapped in the next
commit; the async-shape guards in test_report_api.py go green now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop DefaultRouter in favor of explicit path() entries wired to the
three new ADRF views. Deletes radis/reports/api/viewsets.py.

URLs, response shapes, status codes, query-param semantics, and
permission behavior are byte-for-byte identical to the prior DRF
implementation — guarded by radis/reports/tests/test_report_api.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Rewrites the Reports API to a single async ADRF ReportViewSet with async CRUD entry points and an async bulk_upsert helper/action, switches routing to ADRF router, converts serializers/operations for async writes, adds async end-to-end tests and migrated bulk-upsert tests, and introduces a bulk DB batch-size setting.

Changes

Async ADRF Report ViewSet Migration

Layer / File(s) Summary
Design & Implementation Plan
docs/superpowers/plans/2026-06-08-adrf-report-views.md, docs/superpowers/specs/2026-06-08-adrf-report-views-design.md
Design spec and plan document async migration approach, preserved API invariants, async/sync hygiene, transaction.on_commit scheduling, and rollout/test verification.
Router & URL Registration
radis/reports/api/urls.py
Router import switched to adrf.routers.DefaultRouter; ReportViewSet registered with basename="report" at the empty-string prefix to preserve endpoint routes.
Client Test Transaction Marker
radis-client/tests/test_client.py
Updated @pytest.mark.django_db to @pytest.mark.django_db(transaction=True) for async test compatibility.
Async Operations (domain writes)
radis/reports/api/operations.py
Add async domain write helpers (create_report_from_validated, update_report_from_validated, delete_report) that build/replace nested relations; functions do not manage caller transactions.
Async Serializers & Delegation
radis/reports/api/serializers.py
Convert ReportSerializer to ADRF AsyncModelSerializer; add acreate/aupdate delegating to operations inside transaction wrappers bridged for sync compatibility.
Async ViewSet Module & Bulk Helper
radis/reports/api/viewsets.py
Introduce ADRF-focused async viewset header and implement bulk_upsert_reports with async preflight, dedupe, async pre-creates, CPU-side build, and sync-wrapped atomic bulk persistence plus post-commit reloads for handlers.
ReportViewSet Methods & Actions
radis/reports/api/viewsets.py
Implement ADRF async ReportViewSet methods: acreate, aretrieve, aupdate, adestroy, and async bulk_upsert action; disable PATCH at dispatcher level and validate per-item via sync-to-async serializer calls.
End-to-end Report API Tests
radis/reports/tests/test_report_api.py
New async test suite covering URL resolution, POST/GET/PUT/DELETE/upsert/permission semantics, PATCH 405, full=true enrichment, bulk-upsert validation, on-commit handler capture, and an async-shape guard asserting coroutine methods.
Bulk-upsert Test Migration
radis/reports/tests/test_bulk_upsert.py
Tests migrated from sync to async: use AsyncClient, sync_to_async setup helpers, awaited requests, async ORM assertions, and import updated to bulk_upsert_reports.
Settings: Bulk DB Batch Size
radis/settings/base.py
Added REPORTS_BULK_DB_BATCH_SIZE env-backed setting (default 1000) to control bulk-upsert batching.

Estimated code review effort:
🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs:

  • openradx/radis#187: Related bulk-upsert implementation changes and deduping/replace handling that overlap with the async bulk helper/action work.

"a rabbit pens this little rhyme,
ADRF hops in, step in time,
async views and bulk abide,
tests stand guard and handlers ride,
commit — then indexing chime."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.32% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: converting the report API from DRF to ADRF views, with context about preparing for async embedding.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/adrf-views

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the radiology report API by replacing the synchronous DRF ReportViewSet with three explicit asynchronous ADRF APIView subclasses (ReportListAPIView, ReportDetailAPIView, and ReportBulkUpsertAPIView) and extracting the bulk upsert logic into a separate module. End-to-end tests and async-shape guards are also introduced to ensure API contract preservation. The code review identified critical concurrency and transaction safety issues in the new async views, including thread-safety concerns when separating report.adelete() from transaction.on_commit registration, potential event loop blocking from accessing request.data inside database_sync_to_async blocks, and an opportunity to optimize external document fetching by utilizing asyncio.gather for parallel execution.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread radis/reports/api/views.py Outdated
Comment thread radis/reports/api/views.py Outdated
Comment thread radis/reports/api/views.py Outdated
Comment thread radis/reports/api/views.py Outdated
@samuelvkwong samuelvkwong marked this pull request as draft June 8, 2026 08:39
samuelvkwong and others added 4 commits June 8, 2026 08:54
Four fixes raised by Gemini code-assist review:

1. delete: combine `report.delete()` and `transaction.on_commit()`
   registration into a single `database_sync_to_async` block wrapped
   in `transaction.atomic()`. Previously, `await report.adelete()`
   ran on the async connection and the on_commit registration ran on a
   separate sync connection, so the callback was not bound to the
   delete's transaction.

2. post / put / bulk-upsert: materialize `request.data` on the async
   thread (and capture as a local) before entering any
   `database_sync_to_async` wrapper. Parsing the ASGI body stream
   from a worker thread risks SynchronousOnlyOperation under ASGI and
   is the most likely cause of the failing test_report_api / test_bulk_upsert
   cases in the previous CI run.

3. get (?full=true): replace the sequential per-fetcher
   `database_sync_to_async` loop with `asyncio.gather`, so multiple
   document_fetchers run concurrently instead of one-at-a-time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Django's test Client dispatches an async view via async_to_sync, which
spawns a thread that does not see the test's wrapping atomic transaction.
The first DB query inside the async view (e.g. groups field validation
in ReportSerializer) then hits "the connection is closed".

Fix: mark every HTTP-based test (and the live_server-based radis-client
test) with @pytest.mark.django_db(transaction=True) so pytest-django
uses TransactionTestCase semantics (truncate after, no hidden
transaction). The two tests that exercise the helpers directly
(test_bulk_upsert_dedupes_metadata_keys, test_report_data_valid) keep
the default marker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The sync Client dispatches an async view via async_to_sync, which
nested with our own database_sync_to_async deadlocks asgiref's
thread executor under pytest-django (Django 6.0 + channels 4).

Switch all HTTP tests in test_report_api.py and the two HTTP-based
tests in test_bulk_upsert.py to django.test.AsyncClient + async def
+ @pytest.mark.asyncio + @pytest.mark.django_db(transaction=True).
This runs the async view in the test's event loop with no outer
async_to_sync wrapping, eliminating the deadlock.

ORM assertions inside async tests now use a* variants (aexists,
aget, acount) since we can't use sync ORM from async without
sync_to_async.

The two helper-direct tests (test_bulk_upsert_dedupes_metadata_keys,
test_report_data_valid) stay sync — they don't hit HTTP.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Django's async-unsafe sentinel raises SynchronousOnlyOperation when
sync ORM (factory_boy `.create()`, `.objects.create_token()`,
`user.groups.add()`) is called directly from an `async def` test.

Wrap each helper invocation with `await sync_to_async(...)`:
- test_report_api.py: every `_staff_user_and_token()` and
  `_non_staff_user_and_token()` call now goes through `sync_to_async`.
- test_bulk_upsert.py: factor the per-test factory setup into a
  shared `_create_staff_user_group_token(label)` helper and call it
  via `sync_to_async`.

Helper return types tightened so pyright can resolve `.pk` access.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong added a commit that referenced this pull request Jun 8, 2026
Pull in PR #230's ADRF report-API rewrite (ReportListAPIView,
ReportDetailAPIView, ReportBulkUpsertAPIView; native async ORM + serializer
wrap via channels.db.database_sync_to_async). Skips the trailing temp CI
commit bc2dc07 (marked 'revert before merge' by author) — that change
does not belong on this branch.

Sets up the inline async embedding wiring on top.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong added a commit that referenced this pull request Jun 8, 2026
Pulls in PR #230's three follow-up test commits (86ac291, 07b8751,
06cbfd3): use transaction=True on the HTTP tests, migrate them to
AsyncClient, and wrap sync ORM helpers with sync_to_async. These fix
async-safety issues that were causing report-api test failures on this
branch after the inline async embedding wiring landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong added a commit that referenced this pull request Jun 8, 2026
…trator

Replaces the periodic Job/Task orchestrator with inline async embedding
triggered directly from the ADRF report views (PR #230). When a report is
POSTed, PUT, or bulk-upserted, the view awaits AsyncEmbeddingClient via a
new helper `embed_reports_inline`, which fetches the just-created RSV rows,
calls `embed_documents` once, and bulk_updates the embeddings.

Why this is simpler: no periodic cron, no queue, no system user, no
EmbeddingJob/EmbeddingTask models, no embeddings_worker container — and
reports become semantically searchable at the moment the API call returns
(modulo a sub-second embedding round-trip).

Failure policy unchanged in spirit: if the embedding service is unreachable
or returns malformed data, the helper logs a WARNING and returns; the RSV
row keeps embedding=NULL and the report stays searchable via the FTS half.
Never raises into the request handler.

Changes:
- Add AsyncEmbeddingClient (httpx.AsyncClient sibling of EmbeddingClient)
- Refactor EmbeddingClient internals into shared helpers (_resolve_config,
  _normalize_response) so both clients reuse validation + post-processing
- Add radis/pgsearch/utils/inline_embedding.py with embed_reports_inline()
- Wire the helper into ReportListAPIView.post, ReportDetailAPIView.put,
  and ReportBulkUpsertAPIView.post (radis/reports/api/views.py)
- Override EMBEDDING_PROVIDER_URL="" in radis/settings/test.py so the
  inline path fast-fails into the documented WARNING fallback in CI
- Delete radis/pgsearch/tasks.py orchestrator entries (kept the FTS
  bulk_index_reports + enqueue_bulk_index_reports used by bulk.py)
- Delete EmbeddingJob/EmbeddingTask models
- Slim the squashed migration to schema-only (extension + embedding
  column + HNSW index); drop the Job/Task tables and system-user RunPython
- Remove EMBEDDING_DRAIN_CRON / EMBEDDING_INDEX_PRIORITY /
  EMBEDDING_SYSTEM_USERNAME settings
- Drop the embeddings_worker container from all three compose files
- Delete the now-obsolete orchestrator tests
  (test_embedding_launcher / test_process_embedding_job /
  test_process_embedding_task / test_migrations_system_user /
  test_models_embedding)

Existing reports with NULL embeddings remain searchable via the FTS half.
Operators who need to retroactively embed historical reports can re-PUT
each affected document; no backfill command is reinstated.

Verified locally: pgsearch (62 tests) and reports (20 tests) suites both
pass against the merged feat/adrf-views.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong added a commit that referenced this pull request Jun 8, 2026
§3 architecture diagram swapped from cron+Job/Task orchestrator to async
ADRF view + embed_reports_inline. §3 components table updated to reflect
AsyncEmbeddingClient + inline_embedding helper instead of orchestrator
tasks + system user. §4.2 squashed migration is schema-only. §4.5
dim-change procedure uses re-PUT rather than launcher.defer. §6 rewritten
end-to-end as inline architecture: the helper, view wiring across the
three ADRF endpoints, AsyncEmbeddingClient, failure semantics, historical
NULL handling. §8.2 drops EMBEDDING_DRAIN_CRON/INDEX_PRIORITY/
SYSTEM_USERNAME constants. §8.4 drops embeddings_worker compose block.
§9 error-handling table swaps orchestrator-era rows for the inline-call
failure. §10 testing replaces the three orchestrator test files with
test_inline_embedding. §11.3 dim-change mitigation now points to §4.5.
§12 rollout becomes a 6-step plan keyed off PR #230's ADRF views landing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong and others added 5 commits June 8, 2026 21:53
…nqueue

The follow-up PR will await the async embedding client from inside the
view to embed the report inline during the upload request, so the
vector is populated before the API returns. The previous wording
suggested enqueueing a background job, which is a different design.
ADRF is what lets the view handler `await` the I/O-bound embedding
call without holding a worker thread for the duration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Switch from three explicit `adrf.views.APIView` subclasses
(ReportListAPIView, ReportDetailAPIView, ReportBulkUpsertAPIView) to a
single `adrf.viewsets.GenericViewSet` subclass plus the four async
mixins from `adrf.mixins` and an `@action` for bulk-upsert. URL wiring
goes back to `DefaultRouter.register("", ReportViewSet, basename="report")`.

Why: the legacy class was a `GenericViewSet` + sync mixins; this is the
minimum-diff async equivalent. The router-generated URLs, route names
(`report-list` / `report-detail` / `report-bulk-upsert`), and default
`lookup_value_regex` ([^/.]+) match the legacy contract verbatim, so no
test or client change is needed beyond the async-shape guards. The
browsable API root at /api/reports/ comes back for free.

PATCH still returns 405, now via `http_method_names` instead of an
explicit `apartial_update` override.

The async-shape guard tests are folded into a single test that pins
every dispatched method (acreate / aretrieve / aupdate / adestroy /
bulk_upsert) to `inspect.iscoroutinefunction`. The risk this guards is
specific to the viewset shape: `adrf.mixins.*ModelMixin` inherits from
DRF's sync mixins, so each class has both `create` and `acreate` (etc.)
on the MRO; an accidental sync override would silently flip dispatch
back to sync.

Spec + plan updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This is a sync-DRF → async-ADRF conversion of the existing
ReportViewSet, not a file restructure. Undoes the earlier
extract-into-bulk.py + rename-to-views.py steps:

  - Move bulk_upsert_reports back into radis/reports/api/viewsets.py
    (delete radis/reports/api/bulk.py).
  - Rename radis/reports/api/views.py back to viewsets.py.
  - Point urls.py at `from .viewsets import ReportViewSet`.
  - Update test_bulk_upsert.py and radis-client/tests/test_client.py
    to import / patch under `radis.reports.api.viewsets`.
  - Update spec + plan to match: one module, no file split, no rename.

Behaviour is unchanged from the previous viewset commit (7215e2f).
CI on that commit was green; the file moves here are import-path
only and should keep CI green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…c overrides

DRF's `rest_framework.routers.DefaultRouter` maps HTTP methods to the
SYNC action names (POST → 'create', PUT → 'update', GET → 'retrieve',
DELETE → 'destroy'). Because adrf.mixins.* inherit from DRF's sync
mixins, those sync names exist on the class as fully-functional sync
methods — so router-driven dispatch silently calls the inherited
sync mixin implementations, NEVER our acreate/aretrieve/aupdate/adestroy
overrides. That broke the four tests that exercise behaviour added by
our overrides:

  - test_post_creates_report_and_fires_created_handler   (no on_commit)
  - test_get_full_includes_documents_from_fetchers       (no `?full=`)
  - test_put_upsert_creates_when_missing                 (no `?upsert=`)
  - test_delete_removes_report_and_fires_deleted_handler (no on_commit)

`adrf.routers.DefaultRouter` rewrites the action mapping to the
a-prefixed names whenever `view_is_async=True`, so POST → 'acreate',
PUT → 'aupdate', GET → 'aretrieve', DELETE → 'adestroy'. Dispatch now
hits our overrides. Verified with `resolve(...).func.actions` showing
the a-prefixed mapping.

Spec + plan call out that the router choice is load-bearing (the
async-shape guard test catches override identity but cannot catch
a mis-wired router).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@samuelvkwong samuelvkwong marked this pull request as ready for review June 9, 2026 09:42

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/superpowers/specs/2026-06-08-adrf-report-views-design.md (2)

189-189: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify “router removal” wording to “router migration.”

Given this design keeps routing via adrf.routers.DefaultRouter, “router removal” is inaccurate and may confuse regression intent for the route-resolution test.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` at line 189,
Update the test wording and assertion guidance: change the comment/commit
message that says “router removal” to “router migration” and ensure the
regression guard in radis/reports/tests/test_bulk_upsert.py asserts that the
bulk-upsert endpoint still resolves under the current routing (using
adrf.routers.DefaultRouter); specifically, add one explicit assertion that the
route resolves (e.g., client.get/post to the bulk-upsert route returns the
expected status) and update any related test text to reference “router
migration” instead of “router removal.”

19-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the architecture contradiction in “In scope.”

This section says to drop ReportViewSet/router and move to three APIViews + explicit path() routes, but later sections define the chosen design as a single ADRF ReportViewSet on adrf.routers.DefaultRouter. Keep one architecture only, or this spec will drive conflicting implementations/reviews.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` around lines
19 - 25, The spec contains a contradiction between using three
adrf.views.APIView subclasses (ReportListAPIView, ReportDetailAPIView,
ReportBulkUpsertAPIView) with explicit path() routes and using a single ADRF
ReportViewSet registered on adrf.routers.DefaultRouter; pick one approach and
make the entire document consistent: either remove all mentions of
ReportViewSet/DefaultRouter/router registration and ensure
radis/reports/api/urls.py and the “In scope” list describe the three APIViews
and the rename of _bulk_upsert_reports to bulk_upsert_reports, or keep
ReportViewSet/DefaultRouter as the single chosen design and remove the APIView
references and explicit path() instructions (also revert or document how
bulk_upsert_reports is exposed via the viewset action). Ensure every section (In
scope, endpoints list, URL wiring, and helper rename) aligns with the single
chosen architecture and reference the exact symbols above when updating text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md`:
- Around line 87-90: Update the example imports to use ADRF modules and the
viewsets module: replace the DRF router import so it uses DefaultRouter from
adrf.routers (symbol DefaultRouter) and change the ReportViewSet import to come
from .viewsets (symbol ReportViewSet) instead of rest_framework.routers and
.views; keep the rest of the wiring unchanged so the router and ReportViewSet
integration matches the async dispatch design.
- Line 27: The spec incorrectly references Django's synchronous Client for the
new async ADRF contract; update all mentions (e.g., the line describing the new
test file `radis/reports/tests/test_report_api.py` and any other occurrences) to
recommend using Django's AsyncClient and async test plumbing (async test
functions or pytest-asyncio/Django async test utilities) so tests exercise async
dispatch end-to-end rather than the synchronous Client.

In `@radis/reports/api/viewsets.py`:
- Around line 283-296: The commit callback that reindexes touched_report_ids
(calling bulk_upsert_report_search_vectors or enqueue_bulk_index_reports) is
only present in the bulk path's on_commit closure; update the single-record
handlers in acreate() and aupdate() to mirror that logic: when a single Report
is created/updated, register the same on_commit behavior (collect the single
document_id into touched_report_ids and, inside the on_commit callback, call
bulk_upsert_report_search_vectors(touched_report_ids) if
settings.PGSEARCH_SYNC_INDEXING else
enqueue_bulk_index_reports(touched_report_ids)) and still invoke
reports_created_handlers / reports_updated_handlers as before so single-record
POST/PUT commits also update the PGSEARCH index.
- Around line 354-359: The document fetches wrapped by _fetch use
database_sync_to_async with its default thread_sensitive=True which serializes
work onto one worker thread; update the calls in the _fetch coroutine to use
database_sync_to_async(fetcher.fetch, thread_sensitive=False)(report) (or
convert fetcher.fetch implementations to be async) so asyncio.gather can run
them in parallel; locate _fetch, fetcher.fetch, document_fetchers and
asyncio.gather in this block and apply the thread_sensitive=False change (or
switch the fetchers to native async) accordingly.

---

Outside diff comments:
In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md`:
- Line 189: Update the test wording and assertion guidance: change the
comment/commit message that says “router removal” to “router migration” and
ensure the regression guard in radis/reports/tests/test_bulk_upsert.py asserts
that the bulk-upsert endpoint still resolves under the current routing (using
adrf.routers.DefaultRouter); specifically, add one explicit assertion that the
route resolves (e.g., client.get/post to the bulk-upsert route returns the
expected status) and update any related test text to reference “router
migration” instead of “router removal.”
- Around line 19-25: The spec contains a contradiction between using three
adrf.views.APIView subclasses (ReportListAPIView, ReportDetailAPIView,
ReportBulkUpsertAPIView) with explicit path() routes and using a single ADRF
ReportViewSet registered on adrf.routers.DefaultRouter; pick one approach and
make the entire document consistent: either remove all mentions of
ReportViewSet/DefaultRouter/router registration and ensure
radis/reports/api/urls.py and the “In scope” list describe the three APIViews
and the rename of _bulk_upsert_reports to bulk_upsert_reports, or keep
ReportViewSet/DefaultRouter as the single chosen design and remove the APIView
references and explicit path() instructions (also revert or document how
bulk_upsert_reports is exposed via the viewset action). Ensure every section (In
scope, endpoints list, URL wiring, and helper rename) aligns with the single
chosen architecture and reference the exact symbols above when updating text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3aface82-eb2b-4979-99f9-467994501c38

📥 Commits

Reviewing files that changed from the base of the PR and between 00232c9 and 103f36b.

📒 Files selected for processing (7)
  • docs/superpowers/plans/2026-06-08-adrf-report-views.md
  • docs/superpowers/specs/2026-06-08-adrf-report-views-design.md
  • radis-client/tests/test_client.py
  • radis/reports/api/urls.py
  • radis/reports/api/viewsets.py
  • radis/reports/tests/test_bulk_upsert.py
  • radis/reports/tests/test_report_api.py
✅ Files skipped from review due to trivial changes (1)
  • docs/superpowers/plans/2026-06-08-adrf-report-views.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • radis/reports/api/urls.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/superpowers/specs/2026-06-08-adrf-report-views-design.md (2)

189-189: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify “router removal” wording to “router migration.”

Given this design keeps routing via adrf.routers.DefaultRouter, “router removal” is inaccurate and may confuse regression intent for the route-resolution test.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` at line 189,
Update the test wording and assertion guidance: change the comment/commit
message that says “router removal” to “router migration” and ensure the
regression guard in radis/reports/tests/test_bulk_upsert.py asserts that the
bulk-upsert endpoint still resolves under the current routing (using
adrf.routers.DefaultRouter); specifically, add one explicit assertion that the
route resolves (e.g., client.get/post to the bulk-upsert route returns the
expected status) and update any related test text to reference “router
migration” instead of “router removal.”

19-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve the architecture contradiction in “In scope.”

This section says to drop ReportViewSet/router and move to three APIViews + explicit path() routes, but later sections define the chosen design as a single ADRF ReportViewSet on adrf.routers.DefaultRouter. Keep one architecture only, or this spec will drive conflicting implementations/reviews.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` around lines
19 - 25, The spec contains a contradiction between using three
adrf.views.APIView subclasses (ReportListAPIView, ReportDetailAPIView,
ReportBulkUpsertAPIView) with explicit path() routes and using a single ADRF
ReportViewSet registered on adrf.routers.DefaultRouter; pick one approach and
make the entire document consistent: either remove all mentions of
ReportViewSet/DefaultRouter/router registration and ensure
radis/reports/api/urls.py and the “In scope” list describe the three APIViews
and the rename of _bulk_upsert_reports to bulk_upsert_reports, or keep
ReportViewSet/DefaultRouter as the single chosen design and remove the APIView
references and explicit path() instructions (also revert or document how
bulk_upsert_reports is exposed via the viewset action). Ensure every section (In
scope, endpoints list, URL wiring, and helper rename) aligns with the single
chosen architecture and reference the exact symbols above when updating text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md`:
- Around line 87-90: Update the example imports to use ADRF modules and the
viewsets module: replace the DRF router import so it uses DefaultRouter from
adrf.routers (symbol DefaultRouter) and change the ReportViewSet import to come
from .viewsets (symbol ReportViewSet) instead of rest_framework.routers and
.views; keep the rest of the wiring unchanged so the router and ReportViewSet
integration matches the async dispatch design.
- Line 27: The spec incorrectly references Django's synchronous Client for the
new async ADRF contract; update all mentions (e.g., the line describing the new
test file `radis/reports/tests/test_report_api.py` and any other occurrences) to
recommend using Django's AsyncClient and async test plumbing (async test
functions or pytest-asyncio/Django async test utilities) so tests exercise async
dispatch end-to-end rather than the synchronous Client.

In `@radis/reports/api/viewsets.py`:
- Around line 283-296: The commit callback that reindexes touched_report_ids
(calling bulk_upsert_report_search_vectors or enqueue_bulk_index_reports) is
only present in the bulk path's on_commit closure; update the single-record
handlers in acreate() and aupdate() to mirror that logic: when a single Report
is created/updated, register the same on_commit behavior (collect the single
document_id into touched_report_ids and, inside the on_commit callback, call
bulk_upsert_report_search_vectors(touched_report_ids) if
settings.PGSEARCH_SYNC_INDEXING else
enqueue_bulk_index_reports(touched_report_ids)) and still invoke
reports_created_handlers / reports_updated_handlers as before so single-record
POST/PUT commits also update the PGSEARCH index.
- Around line 354-359: The document fetches wrapped by _fetch use
database_sync_to_async with its default thread_sensitive=True which serializes
work onto one worker thread; update the calls in the _fetch coroutine to use
database_sync_to_async(fetcher.fetch, thread_sensitive=False)(report) (or
convert fetcher.fetch implementations to be async) so asyncio.gather can run
them in parallel; locate _fetch, fetcher.fetch, document_fetchers and
asyncio.gather in this block and apply the thread_sensitive=False change (or
switch the fetchers to native async) accordingly.

---

Outside diff comments:
In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md`:
- Line 189: Update the test wording and assertion guidance: change the
comment/commit message that says “router removal” to “router migration” and
ensure the regression guard in radis/reports/tests/test_bulk_upsert.py asserts
that the bulk-upsert endpoint still resolves under the current routing (using
adrf.routers.DefaultRouter); specifically, add one explicit assertion that the
route resolves (e.g., client.get/post to the bulk-upsert route returns the
expected status) and update any related test text to reference “router
migration” instead of “router removal.”
- Around line 19-25: The spec contains a contradiction between using three
adrf.views.APIView subclasses (ReportListAPIView, ReportDetailAPIView,
ReportBulkUpsertAPIView) with explicit path() routes and using a single ADRF
ReportViewSet registered on adrf.routers.DefaultRouter; pick one approach and
make the entire document consistent: either remove all mentions of
ReportViewSet/DefaultRouter/router registration and ensure
radis/reports/api/urls.py and the “In scope” list describe the three APIViews
and the rename of _bulk_upsert_reports to bulk_upsert_reports, or keep
ReportViewSet/DefaultRouter as the single chosen design and remove the APIView
references and explicit path() instructions (also revert or document how
bulk_upsert_reports is exposed via the viewset action). Ensure every section (In
scope, endpoints list, URL wiring, and helper rename) aligns with the single
chosen architecture and reference the exact symbols above when updating text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3aface82-eb2b-4979-99f9-467994501c38

📥 Commits

Reviewing files that changed from the base of the PR and between 00232c9 and 103f36b.

📒 Files selected for processing (7)
  • docs/superpowers/plans/2026-06-08-adrf-report-views.md
  • docs/superpowers/specs/2026-06-08-adrf-report-views-design.md
  • radis-client/tests/test_client.py
  • radis/reports/api/urls.py
  • radis/reports/api/viewsets.py
  • radis/reports/tests/test_bulk_upsert.py
  • radis/reports/tests/test_report_api.py
✅ Files skipped from review due to trivial changes (1)
  • docs/superpowers/plans/2026-06-08-adrf-report-views.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • radis/reports/api/urls.py
🛑 Comments failed to post (4)
docs/superpowers/specs/2026-06-08-adrf-report-views-design.md (2)

27-27: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Update test client guidance to AsyncClient consistently.

The spec still says Django Client, but this migration’s contract is async ADRF behavior and async test plumbing. Keeping sync-client guidance here risks false confidence on async dispatch coverage.

Also applies to: 191-191

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` at line 27,
The spec incorrectly references Django's synchronous Client for the new async
ADRF contract; update all mentions (e.g., the line describing the new test file
`radis/reports/tests/test_report_api.py` and any other occurrences) to recommend
using Django's AsyncClient and async test plumbing (async test functions or
pytest-asyncio/Django async test utilities) so tests exercise async dispatch
end-to-end rather than the synchronous Client.

87-90: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the URL wiring sample to ADRF imports.

The snippet currently shows DRF router and .views import, which contradicts the async dispatch design and the implemented wiring. It should use from adrf.routers import DefaultRouter and import ReportViewSet from .viewsets.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/superpowers/specs/2026-06-08-adrf-report-views-design.md` around lines
87 - 90, Update the example imports to use ADRF modules and the viewsets module:
replace the DRF router import so it uses DefaultRouter from adrf.routers (symbol
DefaultRouter) and change the ReportViewSet import to come from .viewsets
(symbol ReportViewSet) instead of rest_framework.routers and .views; keep the
rest of the wiring unchanged so the router and ReportViewSet integration matches
the async dispatch design.
radis/reports/api/viewsets.py (2)

283-296: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mirror the bulk-upsert PGSEARCH callback in acreate() and aupdate().

The bulk path reindexes touched_report_ids on commit, but the single-record POST/PUT paths only invoke reports_created_handlers / reports_updated_handlers. That leaves /api/reports/ and /api/reports/{document_id}/ able to commit successfully while the PGSEARCH index stays stale for those reports.

Suggested fix
         def on_commit():
             for handler in reports_created_handlers:
                 logger.debug(
                     f"{handler.name} - handle newly created reports: "
                     f"{[report.document_id]}"
                 )
                 handler.handle([report])
+            report_ids = [report.pk]
+            if settings.PGSEARCH_SYNC_INDEXING:
+                bulk_upsert_report_search_vectors(report_ids)
+            else:
+                enqueue_bulk_index_reports(report_ids)
         def on_commit():
             handlers = (
                 reports_created_handlers
                 if report is None
                 else reports_updated_handlers
             )
             event = "newly created" if report is None else "updated"
             for handler in handlers:
                 logger.debug(
                     f"{handler.name} - handle {event} reports: "
                     f"{[saved.document_id]}"
                 )
                 handler.handle([saved])
+            report_ids = [saved.pk]
+            if settings.PGSEARCH_SYNC_INDEXING:
+                bulk_upsert_report_search_vectors(report_ids)
+            else:
+                enqueue_bulk_index_reports(report_ids)

Also applies to: 327-336, 392-406

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radis/reports/api/viewsets.py` around lines 283 - 296, The commit callback
that reindexes touched_report_ids (calling bulk_upsert_report_search_vectors or
enqueue_bulk_index_reports) is only present in the bulk path's on_commit
closure; update the single-record handlers in acreate() and aupdate() to mirror
that logic: when a single Report is created/updated, register the same on_commit
behavior (collect the single document_id into touched_report_ids and, inside the
on_commit callback, call bulk_upsert_report_search_vectors(touched_report_ids)
if settings.PGSEARCH_SYNC_INDEXING else
enqueue_bulk_index_reports(touched_report_ids)) and still invoke
reports_created_handlers / reports_updated_handlers as before so single-record
POST/PUT commits also update the PGSEARCH index.

354-359: ⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Channels 4.2.0 / asgiref, does channels.db.database_sync_to_async default to thread_sensitive=True such that multiple calls awaited via asyncio.gather execute serially on the same thread executor?

💡 Result:

Yes, channels.db.database_sync_to_async (which inherits from asgiref.sync.SyncToAsync) defaults to thread_sensitive=True [1]. In this mode, multiple calls to database_sync_to_async that are awaited via asyncio.gather will execute serially because they share a single-threaded executor [2][3][4]. The thread_sensitive=True setting is designed to ensure that all synchronous functions marked as "thread-sensitive" run in the same specific thread [4][5]. This is necessary for compatibility with code that relies on thread-local storage or non-thread-safe resources, such as many database adapters used by the Django ORM [2][4][5]. When the outermost program is asynchronous, this shared resource is a single dedicated sub-thread [4]. Consequently, if you launch multiple synchronous tasks using asyncio.gather, they will queue up and execute one after the other within that single, shared thread, rather than running in parallel across multiple threads [2][4]. If you set thread_sensitive=False, each call will instead run in its own brand-new thread, which would allow for parallel execution but may lead to issues if your code depends on thread-local state or specific database connection constraints [4][5].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the exact code around the referenced lines
python3 - <<'PY'
from pathlib import Path
path = Path("radis/reports/api/viewsets.py")
start, end = 330, 390
lines = path.read_text().splitlines()
for i in range(start, end+1):
    if 1 <= i <= len(lines):
        print(f"{i}:{lines[i-1]}")
PY

# Find where document_fetchers is built and where fetcher.fetch is defined/used
rg -n "document_fetchers|_fetch\\(|fetcher\\.fetch\\(|database_sync_to_async\\(" radis/reports/api/viewsets.py

# If fetchers come from elsewhere, try to locate "document_fetchers" usage/definition across the repo
rg -n "document_fetchers" radis | head -n 50

# Locate the fetcher type/class (best-effort) by searching for ".source" and "fetch(" patterns
rg -n "class .*Fetcher|def fetch\\(|fetch\\(" radis/reports | head -n 50

Repository: openradx/radis

Length of output: 3788


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect imports and database_sync_to_async definition/usage near top of the file
python3 - <<'PY'
from pathlib import Path
p = Path("radis/reports/api/viewsets.py")
lines = p.read_text().splitlines()
for i in range(1, min(120, len(lines))+1):
    print(f"{i}:{lines[i-1]}")
PY

# Show definition of DocumentFetcher and how fetcher.fetch is constructed
python3 - <<'PY'
from pathlib import Path
p = Path("radis/reports/site.py")
lines = p.read_text().splitlines()
for i in range(1, min(140, len(lines))+1):
    print(f"{i}:{lines[i-1]}")
PY

# Find where DocumentFetcher fetch callables come from; also locate stub/fetchers usage
rg -n "DocumentFetcher\\(|fetchers|document_fetchers\\[" radis/reports | head -n 200

# Locate implementations of fetch functions (best-effort)
rg -n "def fetch\\(|async def fetch\\(" radis/reports | head -n 200

# Locate any database_sync_to_async calls that override thread_sensitive
rg -n "database_sync_to_async\\(.*thread_sensitive|thread_sensitive=" radis | head -n 200

Repository: openradx/radis

Length of output: 8683


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate all document fetcher registrations (to inspect what fetcher.fetch actually is)
rg -n "register_document_fetcher\\(" radis || true

# 2) Inspect the exact fetcher callables used in registrations (common names: fetch, get_document, etc.)
#    We search for functions passed as the 2nd arg to register_document_fetcher
rg -n "register_document_fetcher\\([^,]+,\\s*([A-Za-z0-9_\\.]+)\\)" radis || true

# 3) Look for any thread_sensitive overrides in the repo (esp. around database_sync_to_async)
rg -n "database_sync_to_async\\(.*thread_sensitive" radis || true
rg -n "thread_sensitive\\s*=" radis || true

# 4) Also list all database_sync_to_async usages inside radis/reports for context
rg -n "database_sync_to_async\\(" radis/reports || true

Repository: openradx/radis

Length of output: 461


🏁 Script executed:

#!/bin/bash
set -euo pipefail

python3 - <<'PY'
from pathlib import Path
p = Path("radis/reports/tests/test_report_api.py")
lines = p.read_text().splitlines()
start, end = 140, 240
for i in range(start, end+1):
    if 1 <= i <= len(lines):
        print(f"{i}:{lines[i-1]}")
PY

Repository: openradx/radis

Length of output: 3887


asyncio.gather() won’t parallelize these document fetches

channels.db.database_sync_to_async defaults to thread_sensitive=True, so the fetcher.fetch() calls wrapped here will queue onto the same single worker thread. As a result, ?full=true will still scale roughly linearly with the number of document_fetchers despite using asyncio.gather. If fetcher.fetch doesn’t rely on Django/thread-local constraints, switch to database_sync_to_async(..., thread_sensitive=False) (or make the fetchers native async).

async def _fetch(fetcher):
    return fetcher.source, await database_sync_to_async(fetcher.fetch)(report)

results = await asyncio.gather(*(_fetch(f) for f in document_fetchers.values()))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radis/reports/api/viewsets.py` around lines 354 - 359, The document fetches
wrapped by _fetch use database_sync_to_async with its default
thread_sensitive=True which serializes work onto one worker thread; update the
calls in the _fetch coroutine to use database_sync_to_async(fetcher.fetch,
thread_sensitive=False)(report) (or convert fetcher.fetch implementations to be
async) so asyncio.gather can run them in parallel; locate _fetch, fetcher.fetch,
document_fetchers and asyncio.gather in this block and apply the
thread_sensitive=False change (or switch the fetchers to native async)
accordingly.

@NumericalAdvantage

Copy link
Copy Markdown
Collaborator

My review :

Right now the lexical indexing (which is cheaper to calculate compared to embedding vector) is done locally (on radx server) in batches, routed through the background Procrastinate queue.
And now for semantic search (which takes longer to calculate, is done via an unreliable service) is being done asynchronously .. I dont think this is so good because it would run the expensive, remote, failure-prone operation (embedding) inline, while blocking the report upload .. while the cheap, local, reliable one (lexical) stays deferred. Right after upload a report would be findable by meaning (semantic, inline) but maybe not yet by keyword (lexical, still queued). I think for jobs like this eventual consistency is more important ..

If you are still not convinced, think of the following scenario :

today nothing in the indexing path can be "down," because there's no external service in it. The outage risk is entirely created by the proposed "inline embedding" design.

This is what I mean - you have two ways of wiring things up -

Ordering A — embed, then save. Request arrives → await embed(text) → embedding service is down → the await raises → request returns a 5xx → the report is never saved. During an embedding outage, ingestion stops dead. Every upload fails; the radis-client or the ETL pipeline either errors out or retry-loops against a dead service.

Ordering B — save, then embed. Request → save report → await embed → store vector → respond. Embedding down → the report is saved but the embed step throws. Now either the request 500s (the client thinks it failed and may retry, creating duplicate/re-upsert churn) or you swallow the error and return 201 — and either way you're left with a saved-but-unembedded report: it exists, but it's invisible to semantic search, silently inconsistent, with nothing scheduled to fix it.

(and you havent written any test cases to catch this behaviour .. ;))

All this is happening because you have chosen to do the upload and embedding together. You could just reuse the Procrastinate stuff of lexical indexing .. i think that is the more technically fitter solution ..

But ALSO:::

we dont have a lot of work to do to get ninja running for our async support. useing adrf is arguably the poorer solution .. especially now that there is nothing aprt from reportviewsets that uses DRF .. we have very little change to manage the migration to Ninja ... there would not be a better time. My strong sugesstion would be to use Ninja ..

Cheers

samuelvkwong and others added 2 commits June 10, 2026 11:25
Restructure the viewset and bulk_upsert_reports so the async/sync seam
is uniform and reads predictably across every handler:

  - Native async ORM (aget, async-for comprehensions, abulk_create) for
    anything that does NOT need to be atomic.
  - One sync helper closure per atomic block, decorated with:
        @sync_to_async(thread_sensitive=True)
        @transaction.atomic
    The sync helper owns the entire atomic write plus the
    transaction.on_commit registration, on a single thread and a
    single DB connection. `thread_sensitive=True` ensures the helper
    runs on Django's shared sync thread so the transaction semantics
    hold.

Changes:

  - bulk_upsert_reports is now `async def`. Preflight (Language /
    Modality dedupe + ensure-exists, plus the existing-Reports lookup)
    uses native async ORM. The atomic write block (Phase 4) is a
    nested sync helper with the new decorator stack.
  - acreate / aupdate / adestroy switched from
    `database_sync_to_async def _x()` to
    `@sync_to_async(thread_sensitive=True) @transaction.atomic def _x()`.
    adestroy no longer needs the explicit `with transaction.atomic`
    inside the body — the decorator does it.
  - aretrieve's serializer.data + fetcher fetches switch to plain
    sync_to_async(..., thread_sensitive=True) (no atomic needed).
  - The bulk_upsert action splits per-payload DRF validation (sync, no
    atomic) from the now-async bulk_upsert_reports call.
  - test_bulk_upsert_dedupes_metadata_keys becomes async to call the
    now-async bulk_upsert_reports.

Caveat documented in the module docstring: Django 6.0/6.1's native
async ORM methods (aget, abulk_create, ...) still wrap sync_to_async
internally — there is no native async DB backend in Django today (PR
#17275 stalled since 2024). The Phase 2 native async preflight in
bulk_upsert_reports does not parallelize at the DB layer today; the
win is architectural clarity. When Django ships native async DB
support, the code shape is positioned to benefit automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elpers

`ReportSerializer.create` and `ReportSerializer.update` already open
their own `with transaction.atomic():` block to wrap the multi-step
Language → Report → groups → Metadata → Modalities write. Decorating
the outer `_create` / `_save` helpers with another `@transaction.atomic`
opened a redundant outer transaction whose only inner work was a
savepoint from the serializer's own block — functionally a no-op but
misleading about who owns atomicity at each layer.

Remove the decorator from `acreate._create` and `aupdate._save`. The
`transaction.on_commit(...)` registration following `serializer.save()`
in those helpers now runs outside any active transaction in production
(serializer.save committed already) and fires immediately — same
end-state as before, just without claiming ownership we don't have.

`adestroy._delete_and_schedule` keeps `@transaction.atomic` because the
`report.delete()` + `transaction.on_commit` registration must be bound
to the same transaction (Gemini's original review fix).
`bulk_upsert_reports._do_atomic_writes` keeps it because the multi-row
churn must commit atomically.

Module docstring updated to describe the criterion ("does this helper
own a transaction?") rather than blanket-applying the stack.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong and others added 8 commits June 10, 2026 11:43
Restructure the report API to follow the canonical async-with-atomic
pattern: domain writes live as `async def` operations that use native
async ORM (`aget_or_create`, `acreate`, `asave`, `aset`, `aclear`,
`adelete`); view-level atomic helpers hold the transaction and invoke
the async operations via `async_to_sync(...)`. `thread_sensitive=True`
on the outer wrapper keeps the entire call chain on one Django thread,
so the transaction context applies to every write.

Files:

  - New `radis/reports/api/operations.py`:
      - `create_report_from_validated(validated_data)` — creates a
        Report + Language/get-or-create + groups + Metadata bulk +
        Modalities/get-or-create via native async ORM.
      - `update_report_from_validated(report, validated_data)` —
        replaces all mutable fields + nested associations (metadata
        delete + recreate, modalities aclear+aset, groups aset).
      - `delete_report(report)` — `await report.adelete()`.
    None of these own a transaction; the caller does.

  - `radis/reports/api/serializers.py`:
      - `ReportSerializer.create` / `.update` are sync shims that
        delegate to `async_to_sync(operations.*_report_from_validated)`.
        DRF's standard `serializer.save()` works unchanged for any
        future caller; the `with transaction.atomic():` block formerly
        owned by the serializer is gone — atomicity moves up to the
        view's helper where it belongs.

  - `radis/reports/api/viewsets.py`:
      - `acreate` / `aupdate` re-add `@transaction.atomic` on the
        sync_to_async helper now that the serializer no longer owns
        atomicity. The helper body is just `serializer.is_valid` +
        `serializer.save()` + `transaction.on_commit(...)`.
      - `adestroy` replaces the bare `report.delete()` call inside the
        atomic helper with `async_to_sync(operations.delete_report)(report)`,
        matching the pattern across the file.
      - Phase 4 of `bulk_upsert_reports` keeps inline sync ORM
        (bulk_create / bulk_update / through-table churn) because those
        are already single-statement bulk operations; decomposing them
        into per-entity async ops would be noise. The pattern applies
        uniformly to single-entity write paths.

  - Module docstring updated to describe the operations.py +
    async_to_sync architecture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ReportSerializer now subclasses adrf.serializers.ModelSerializer.
The sync `create`/`update` shims that wrapped operations in
async_to_sync are replaced by native `async def acreate`/`aupdate`
that `await` the operations directly:

    async def acreate(self, validated_data):
        return await operations.create_report_from_validated(validated_data)

    async def aupdate(self, report, validated_data):
        return await operations.update_report_from_validated(report, validated_data)

The view's atomic helpers now bridge to the async serializer via
`async_to_sync(serializer.asave)()` inside the
`@sync_to_async(thread_sensitive=True) @transaction.atomic` block.
Net bridge count is unchanged (the `async_to_sync` simply moved from
inside the serializer up to the view's atomic helper) but two things
get cleaner:

  - The serializer reads as native async — no sync shim importing
    `async_to_sync` purely to bridge back to an async operation.
  - Async callers (notably the forthcoming inline-embedding flow that
    awaits the embedding client) can `await serializer.asave()`
    directly, eliminating a double bridge (async → sync_to_async →
    sync → async_to_sync → async) that the sync-shim version would
    have required.

Other changes:

  - `cast(ReportSerializer, self.get_serializer(...))` is used inside
    the view's atomic helpers so pyright resolves `.asave` against
    the actual ADRF type rather than DRF's `BaseSerializer` stub.
  - `serializers.py` imports `operations` at module scope (no longer
    needs a function-scope import for the bridge).
  - `serializers.py` drops the now-unused `async_to_sync` import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`ReportSerializer.acreate` and `aupdate` now each open their own
`@sync_to_async(thread_sensitive=True) @transaction.atomic` helper
that invokes the corresponding operation via `async_to_sync`. The
view's `acreate` / `aupdate` handlers are now pure async orchestration:

  - `serializer = self.get_serializer(...)`
  - `await sync_to_async(serializer.is_valid)(raise_exception=True)`
  - `report = await serializer.asave()`           ← transaction lives inside
  - `transaction.on_commit(on_commit)`             ← fires post-commit
  - `await sync_to_async(lambda: serializer.data)()` for the response

Bridge accounting (per request):

  - Old shape (atomic in view): 1 sync_to_async outer + 1 async_to_sync
    inside (for serializer.asave) = 2 bridges per write.
  - New shape (atomic in serializer): 1 sync_to_async (is_valid) +
    1 sync_to_async + 1 async_to_sync inside serializer.acreate +
    1 sync_to_async (serializer.data) = 4 bridges per write.

The 2 extra bridges are sync_to_async hops with µs-level dispatch
overhead, negligible against the SQL latency they straddle. The win is
that the serializer reads as the unit that owns "save + its
transaction", and the view reads as pure orchestration around an
async-native serializer that can be awaited directly by future async
callers (notably the forthcoming inline-embedding flow).

`adestroy` and `bulk_upsert_reports` keep their existing
`@sync_to_async @transaction.atomic` shape because neither involves a
serializer.

Module docstring updated to describe the new layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phases 1 (payload dedupe) and 3 (build new_reports / updated_reports
lists) of `bulk_upsert_reports` are pure-CPU work that used to run
inline in the outer async function — which means they were running on
the event loop thread and blocking it during the iteration. Wrap them
both in `@sync_to_async(thread_sensitive=True)` helpers so they execute
on the asgiref thread pool and the event loop stays free.

  - Phase 1 (`_dedupe_payload`) — dedupes the input payload by
    `document_id` and emits a warning on duplicates. Returns the deduped
    list back to the outer scope.
  - Phase 3 (`_build_report_lists`) — iterates the validated payload,
    builds the `new_reports` / `updated_reports` lists using
    `language_by_code` and `existing_by_document_id` from Phase 2.
    Returns the four collections back to the outer scope.

`report_field_names` is hoisted above Phase 1 since both Phase 3 and
Phase 4 reference it. The dedupe helpers (`_dedupe_by_key`,
`_dedupe_metadata`, `_dedupe_groups`) move inside `_do_atomic_writes`
because that's their only call site, keeping the outer scope clean.

Performance note (documented in the module docstring): there's no
real wall-clock benefit today because Django's `aget`, `abulk_create`
and friends are still `sync_to_async`-wrapped sync ORM internally, so
Phase 2's "native async" path schedules on the same thread pool as
Phases 1/3/4. The win is purely structural: event loop stays free of
CPU work, and the code reads as "async coordination dispatching CPU
work to threads", which composes correctly the moment Django ships
native async DB support.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…xpected cleanup

The previous docstring blurb pointed at PR #17275 (stale since 2024)
as the canonical "native async DB" tracker. That's wrong — #17275
has had no maintainer engagement since mid-2024. The actual active
work is:

  - django/django PR #18408 (Arfey) — AsyncConnectionHandler,
    AsyncCursor, async_atomic, async test bases. Open, not draft,
    ~1300 LOC. Django core's stated gate is production evidence from
    an external prototype before merging.
  - `django-async-backend` on PyPI (also Arfey) — the external
    prototype. Async reads shipped in v0.0.3 (March 2026): aget,
    acount, aexists, async for, filtering, `async_atomic`. Writes are
    in-progress; select_related / prefetch_related / multi-connection
    `gather` parallelism remain unimplemented.

The comment now spells out both pointers and lists the concrete
cleanup the codebase is positioned for once async writes land:

  - ReportSerializer.acreate / aupdate collapse from
    `@sync_to_async @transaction.atomic def _atomic: async_to_sync(operations.X)`
    to `async with async_atomic(): return await operations.X(...)`
    — 6 lines per method → 3.
  - adestroy and bulk_upsert_reports Phase 4 follow the same pattern.
  - operations.py does not change; its `await` calls just stop
    being internally sync_to_async-wrapped.
  - Phases 1 and 3 of bulk_upsert_reports stay sync_to_async-wrapped
    because they're pure CPU; correct regardless of the DB backend.

Comment-only change; no behavior impact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bulk-upsert batch size for `bulk_upsert_reports` was hardcoded as
a module-level constant in viewsets.py. Move it to settings as
`REPORTS_BULK_DB_BATCH_SIZE` (env-overridable, default 1000) so it
can be tuned per-environment without a code change. Aligns with the
existing `PGSEARCH_BULK_INSERT_BATCH_SIZE` next to it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
radis/reports/api/operations.py (1)

82-87: 💤 Low value

Minor: aclear() before aset() is redundant.

aset() already replaces the entire M2M relation, so the preceding aclear() is unnecessary. This doesn't affect correctness but adds an extra query.

♻️ Optional simplification
-    await report.modalities.aclear()
     modality_instances: list[Modality] = []
     for modality in modalities:
         instance, _ = await Modality.objects.aget_or_create(**modality)
         modality_instances.append(instance)
     await report.modalities.aset(modality_instances)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radis/reports/api/operations.py` around lines 82 - 87, The code calls
report.modalities.aclear() before building modality_instances and then
report.modalities.aset(modality_instances); remove the redundant await
report.modalities.aclear() since aset() replaces the entire M2M relation itself.
Keep the loop that uses Modality.objects.aget_or_create(...) to populate
modality_instances and then call await
report.modalities.aset(modality_instances) directly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@radis/reports/api/serializers.py`:
- Around line 48-56: The docstring for ReportSerializer is incorrect about
transaction ownership; update it to state that acreate and aupdate do own/manage
their own transactions via the inner `@_atomic`() helper (used in functions
acreate and aupdate) rather than relying on the caller to provide
`@transaction.atomic`; reference ReportSerializer, acreate, aupdate, and the inner
_atomic helper when rewording the docstring so it accurately reflects that these
methods run their own atomic blocks.

In `@radis/reports/api/viewsets.py`:
- Around line 432-433: In aretrieve and adestroy replace the current exception
re-raise that does "raise Http404" inside the "except Report.DoesNotExist"
blocks with "raise Http404 from None" so the original Report.DoesNotExist is
suppressed and the traceback is not chained; locate the except blocks in the
aretrieve and adestroy methods and update the raise statements accordingly.
- Around line 512-516: In adestroy, the Report.DoesNotExist except block should
suppress the original exception per Ruff B904; change the handler to capture the
exception (e.g., "except Report.DoesNotExist as err:") and re-raise Http404 with
explicit chaining to suppress the original (use "raise Http404 from None") so
the original traceback is not shown; update the except clause in the adestroy
method accordingly.

In `@radis/settings/base.py`:
- Around line 167-168: REPORTS_BULK_DB_BATCH_SIZE is read from the environment
and must be validated at startup: after you read REPORTS_BULK_DB_BATCH_SIZE in
radis/settings/base.py add a guard that raises
django.core.exceptions.ImproperlyConfigured if REPORTS_BULK_DB_BATCH_SIZE <= 0
(ensure you import ImproperlyConfigured from django.core.exceptions at top of
the file); this guarantees a clear error during startup rather than passing an
invalid batch_size into functions like
radis.reports.api.viewsets.bulk_upsert_reports which call
abulk_create/bulk_create/bulk_update.

---

Nitpick comments:
In `@radis/reports/api/operations.py`:
- Around line 82-87: The code calls report.modalities.aclear() before building
modality_instances and then report.modalities.aset(modality_instances); remove
the redundant await report.modalities.aclear() since aset() replaces the entire
M2M relation itself. Keep the loop that uses
Modality.objects.aget_or_create(...) to populate modality_instances and then
call await report.modalities.aset(modality_instances) directly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63a41d2d-461b-4bb6-8a9f-407e069d7d9f

📥 Commits

Reviewing files that changed from the base of the PR and between 09b0a3c and dcbe6af.

📒 Files selected for processing (5)
  • radis/reports/api/operations.py
  • radis/reports/api/serializers.py
  • radis/reports/api/viewsets.py
  • radis/reports/tests/test_bulk_upsert.py
  • radis/settings/base.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • radis/reports/tests/test_bulk_upsert.py

Comment thread radis/reports/api/serializers.py Outdated
Comment on lines +432 to +433
except Report.DoesNotExist:
raise Http404

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Both aretrieve and adestroy should use raise Http404 from None.

Both methods raise Http404 within an except Report.DoesNotExist clause without exception chaining. Use from None to explicitly suppress the original exception and avoid confusing tracebacks.

🧰 Tools
🪛 Ruff (0.15.15)

[warning] 433-433: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radis/reports/api/viewsets.py` around lines 432 - 433, In aretrieve and
adestroy replace the current exception re-raise that does "raise Http404" inside
the "except Report.DoesNotExist" blocks with "raise Http404 from None" so the
original Report.DoesNotExist is suppressed and the traceback is not chained;
locate the except blocks in the aretrieve and adestroy methods and update the
raise statements accordingly.

Comment on lines +512 to +516
async def adestroy(self, request: Request, *args: Any, **kwargs: Any) -> Response:
try:
report = await Report.objects.aget(document_id=kwargs[self.lookup_field])
except Report.DoesNotExist:
raise Http404

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add exception chaining to suppress the original exception.

Per Ruff B904, within an except clause, use raise ... from None to indicate the original exception is intentionally suppressed, or raise ... from err to chain it.

🔧 Proposed fix
     async def adestroy(self, request: Request, *args: Any, **kwargs: Any) -> Response:
         try:
             report = await Report.objects.aget(document_id=kwargs[self.lookup_field])
         except Report.DoesNotExist:
-            raise Http404
+            raise Http404 from None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def adestroy(self, request: Request, *args: Any, **kwargs: Any) -> Response:
try:
report = await Report.objects.aget(document_id=kwargs[self.lookup_field])
except Report.DoesNotExist:
raise Http404
async def adestroy(self, request: Request, *args: Any, **kwargs: Any) -> Response:
try:
report = await Report.objects.aget(document_id=kwargs[self.lookup_field])
except Report.DoesNotExist:
raise Http404 from None
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 516-516: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@radis/reports/api/viewsets.py` around lines 512 - 516, In adestroy, the
Report.DoesNotExist except block should suppress the original exception per Ruff
B904; change the handler to capture the exception (e.g., "except
Report.DoesNotExist as err:") and re-raise Http404 with explicit chaining to
suppress the original (use "raise Http404 from None") so the original traceback
is not shown; update the except clause in the adestroy method accordingly.

Source: Linters/SAST tools

Comment thread radis/settings/base.py Outdated
samuelvkwong added a commit that referenced this pull request Jun 11, 2026
A single operator-driven command that drives the same async embedding code
path the ADRF views use: load NULL-embedding RSV rows, batch them, await
embed_reports_inline per batch. One embedding code path in the system, two
call sites.

Serves three operationally distinct scenarios with the same mechanics:

  1. Historical backfill — reports loaded before the inline wiring shipped.
  2. Dim or model change — after §4.5 (or
     `ReportSearchVector.objects.update(embedding=None)` for a same-dim
     model swap), every row is NULL.
  3. Outage recovery — reports whose ADRF write succeeded but whose inline
     embedding step caught EmbeddingClientError.

Properties:
  - Idempotent (filter is `embedding IS NULL`; nothing-to-do is a no-op).
  - Resumable (no checkpoint state; re-runs pick up remaining NULLs).
  - Race-tolerant with live API traffic (deterministic model => concurrent
    overlapping writes produce identical vectors; cost is one wasted HTTP
    call per overlap, no corruption).
  - Failure-tolerant within a run (embed_reports_inline catches
    EmbeddingClientError; loop continues; next run retries the NULLs).

Tests cover empty pool, full drain in batches, --limit, and service-failure
no-crash. Tests use `transaction=True` so the helper's
database_sync_to_async writes are visible across connections — same pattern
as the test_report_api.py suite from PR #230.

Spec §4.5 (dim change), §5.4 (dev/prod model swap), §6.5 (recovery),
§11.3 (model-swap mitigation), and §12 step 5 (rollout) updated to point
at the command.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
samuelvkwong and others added 4 commits June 11, 2026 12:17
  - operations.py docstring trimmed: drop the calling-pattern paragraph
    (decorator stack + async_to_sync). That belongs at the call sites,
    not in the operations module.

  - serializers.py ReportSerializer docstring corrected: the serializer
    NOW owns the atomic block via `@sync_to_async @transaction.atomic`
    inside acreate/aupdate. The previous wording claimed the view's
    helper owned the transaction — stale from before that ownership
    moved into the serializer.

  - viewsets.py module docstring cut from ~70 lines to ~15. The bullet
    list of strategy notes moves to where the relevant code lives:
      * bulk_upsert_reports gets a function docstring describing the
        4-phase model + the caveat that Django's `a*` methods are still
        internally sync_to_async-wrapped.
      * adestroy gets an inline comment explaining why it owns its
        own atomic helper instead of delegating to a serializer.
      * Per-line comments at is_valid / asave / on_commit /
        serializer.data / clone_request were already accurate and stay.

Comment-only change; no behavior impact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… constant

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/aupdate

transaction.on_commit internally calls ensure_connection(), which is
sync-only and raises SynchronousOnlyOperation when invoked from an async
handler. adestroy and bulk_upsert_reports were already safe because
they register on_commit inside a @sync_to_async @transaction.atomic
helper; acreate and aupdate registered it directly in the async body
and 500'd on every request. Wrap the registration call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop comments that reference the prior DRF code, describe how the
conversion was performed, or restate what the code already shows.
Keep the comments that capture a genuine non-obvious invariant
(the router-dispatch trap, the async-unsafe transaction.on_commit
wrap, the 403-not-404 upsert permission re-check, and the bulk
phase outline).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants