Worker System Rewrite Background

Benjamin Capodanno edited this page May 7, 2026 · 1 revision

A high-level technical presentation walking through the rewrite of the MaveDB background job system: from a monolithic jobs.py to a managed pipeline architecture with durable job state, dependency-based coordination, and per-variant annotation tracking.

Slides

Download PDF

What It Covers

The presentation is structured around the specific problems the old system created and what each part of the new design addresses:

  • The old system's shape: a single 1,766-line file where jobs owned their own orchestration
  • Why state scattered across application tables, Redis, and logs made the workflow opaque
  • How declarative pipeline definitions replaced imperative job chaining
  • How the @with_pipeline_management decorator and manager classes separated domain logic from lifecycle concerns
  • Dependency-based coordination, fan-out, and crash recovery
  • Per-variant annotation status tracking and queryable outcomes
  • Operational scripts that reuse the same job and pipeline infrastructure rather than duplicating it

It closes with current gaps and the next steps for the system.
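
To make the contrast between imperative job chaining and declarative pipeline definitions more concrete, here is a minimal Python sketch of the general idea. It is not MaveDB's implementation: the job names, `JobSpec`, `ready_jobs`, and `run_pipeline` are all hypothetical, and the in-memory `state` dict stands in for whatever durable store the real system uses. The pipeline is declared as data, a coordinator decides what can run from recorded job state, and passing in previously recorded state resumes a run without repeating finished work.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass(frozen=True)
class JobSpec:
    """Declarative description of one job and the jobs it waits on."""
    name: str
    depends_on: tuple[str, ...] = ()


# Hypothetical pipeline; these job names are illustrative only.
PIPELINE = (
    JobSpec("create_variants"),
    JobSpec("map_variants", depends_on=("create_variants",)),
    # Fan-out: two annotation jobs become ready once mapping succeeds.
    JobSpec("annotate_clinvar", depends_on=("map_variants",)),
    JobSpec("annotate_gnomad", depends_on=("map_variants",)),
    JobSpec("publish", depends_on=("annotate_clinvar", "annotate_gnomad")),
)


def ready_jobs(pipeline, state):
    """Return names of pending jobs whose dependencies have all succeeded."""
    return [
        job.name
        for job in pipeline
        if state.get(job.name, Status.PENDING) is Status.PENDING
        and all(state.get(dep) is Status.SUCCEEDED for dep in job.depends_on)
    ]


def run_pipeline(pipeline, handlers, state=None):
    """Run ready jobs until none remain, recording each outcome in state.

    Passing previously recorded state resumes after a crash: jobs already
    marked SUCCEEDED are skipped. Jobs downstream of a FAILED job never
    become ready and are left unrecorded (still pending).
    """
    state = dict(state or {})
    while True:
        batch = ready_jobs(pipeline, state)
        if not batch:
            return state
        for name in batch:
            try:
                handlers[name]()  # domain logic only; no chaining inside jobs
                state[name] = Status.SUCCEEDED
            except Exception:
                state[name] = Status.FAILED
```

In this sketch the job functions know nothing about what runs next; ordering lives entirely in `PIPELINE`, and job outcomes can be inspected by reading `state` rather than by piecing together logs.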