fix(script): parallelize manifest size backfill #2313
Walkthrough
The script is refactored to support configurable concurrency and retry logic for backfilling manifest file sizes from object storage. New CLI arguments control worker count, batch sizing, retry attempts, and ID bounds. Storage and database operations now include retry-with-backoff handling, queries are unified with centralized filtering, and updates are performed in bulk batches across parallel worker ranges.
Changes: Manifest backfill concurrency and retry architecture
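The walkthrough mentions parallel worker ranges and retry-with-backoff handling without showing either. The following is a minimal sketch of how those two ideas might fit together, not the script's actual code: `splitIdRange`, `withRetry`, and `runWorkers` are hypothetical names, and it assumes manifest IDs are plain numbers within an inclusive range.

```javascript
// Hypothetical helper: split the inclusive ID range [minId, maxId] into at most
// `workers` contiguous, non-overlapping sub-ranges, one per worker.
function splitIdRange(minId, maxId, workers) {
  if (maxId < minId || workers < 1) return [];
  const total = maxId - minId + 1;
  const size = Math.ceil(total / workers);
  const ranges = [];
  for (let start = minId; start <= maxId; start += size) {
    ranges.push({ start, end: Math.min(start + size - 1, maxId) });
  }
  return ranges;
}

// Hypothetical helper: retry an async operation with exponential backoff.
// The delay doubles after every failed attempt; the last error is rethrown.
async function withRetry(fn, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === attempts) break;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Hypothetical driver: process every range concurrently, retrying each range
// independently so one transient failure does not restart the others.
async function runWorkers(ranges, processRange) {
  return Promise.all(
    ranges.map((range) => withRetry(() => processRange(range))),
  );
}
```

In this sketch, `processRange` stands in for whatever reads sizes from storage and writes a batch of updates for one ID range. Retrying per range rather than per run means a dropped connection only costs the work of the range that failed.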
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
Merging this PR will not alter performance



Summary (AI generated)
Motivation (AI generated)
The existing backfill walked one ordered cursor and performed one DB update per file, which is too slow for millions of manifest entries and wastes work when the connection drops mid-run.
Business Impact (AI generated)
This makes the production repair for missing manifest file sizes finish much faster while keeping database load bounded to indexed manifest id range scans and primary-key updates. It reduces the time users see "metadata not found" in bundle manifests.
Test Plan (AI generated)
bun scripts/backfill_manifest_file_sizes.mjs --help
node --check scripts/backfill_manifest_file_sizes.mjs
bunx eslint --no-ignore scripts/backfill_manifest_file_sizes.mjs
git diff --check origin/main...HEAD
bun lint
Summary by CodeRabbit