Paper Methods

This page maps manuscript methods text to concrete implementation and analysis paths in the repository.

Paper Scope

Current default: a JOSS-oriented software paper backed by existing workflow examples, rather than a paper centered on a single scientific campaign result.

Software Description Map

Topic	Primary paths
Lineage and differentiation from MELODIES-MONET	`../DAVINCI-MONET/README.md`, `../DAVINCI-MONET/CLAUDE.md`, `../DAVINCI-MONET/davinci_monet/pipeline/`, `../DAVINCI-MONET/davinci_monet/pairing/`
CLI and user surface	`../DAVINCI-MONET/davinci_monet/cli/`
Config schema and migration	`../DAVINCI-MONET/davinci_monet/config/`
Pipeline stages	`../DAVINCI-MONET/davinci_monet/pipeline/`
Pairing strategies	`../DAVINCI-MONET/davinci_monet/pairing/`
Plotting system	`../DAVINCI-MONET/davinci_monet/plots/`
Statistics framework	`../DAVINCI-MONET/davinci_monet/stats/`
Tests and validation	`../DAVINCI-MONET/tests/`, `../DAVINCI-MONET/pyproject.toml`, `../DAVINCI-MONET/CHANGELOG.md`

Workflow Types To Describe

Workflow	Why it matters	Evidence paths
Paired model-vs-observation	Core atmospheric model evaluation use case	`../DAVINCI-MONET/analyses/asia-aq/`
Observation-only	Enables campaign analysis even without model fields	`../DAVINCI-MONET/analyses/dc3/`
Satellite swath-to-grid	Extends the framework beyond point and track observations	`../DAVINCI-MONET/analyses/modis-aod/`, `../DAVINCI-MONET/davinci_monet/pairing/`

Candidate Methods Subsections

For JOSS, these are working notes for the Software design and State of the field sections, not a signal that the paper needs a long standalone methods chapter.

Lineage And Differentiation

This is the most important subsection for JOSS reviewers. The differentiation must be obvious on first read.

Questions to answer:

What changed enough from MELODIES-MONET to justify a separate software paper?
Which architectural changes are easy to explain and matter to users?
Which workflow capabilities are genuinely new or substantially cleaner in DAVINCI?

Concrete differentiators to draw from (see full comparison table in Paper Outline):

Procedural → stage-based pipeline: MELODIES-MONET uses a manual .open_models() → .open_obs() → .pair_data() sequence. DAVINCI uses composable Stage Protocol objects with shared PipelineContext, enabling obs-only auto-detection and pluggable stage ordering.
No types → typed runtime with stronger static checks: MELODIES-MONET has no type hints. DAVINCI ships py.typed, uses broad type annotations, enables several stricter mypy checks (check_untyped_defs, disallow_incomplete_defs, no_implicit_optional), and validates config with Pydantic schemas at parse time.
Reader-centric → geometry-driven pairing: MELODIES-MONET pairs by data source. DAVINCI auto-detects geometry (POINT, TRACK, PROFILE, SWATH, GRID) from dataset structure and dispatches to specialized strategies.
No obs-only mode → auto-detected obs-only pipeline: Entirely new capability. ObsPlotter base class with 5 dedicated renderers.
Ad hoc satellite handling → unified swath-to-grid binning: Numba-accelerated, configurable grid modes (match_model, resolution, explicit).
1,000+ tests vs. limited coverage in MELODIES-MONET: Synthetic data fixtures cover plot types, pairing strategies, config parsing, and integration paths.
1,630x observation load speedup: Time filtering at file and data level, not available in MELODIES-MONET.
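
To make the stage-based differentiator concrete for reviewers, the idea can be sketched in a few lines. This is a minimal illustration, not the code in `davinci_monet/pipeline/`: only the names `Stage` and `PipelineContext` come from the notes above, while the field names, the three stage classes, `build_stages`, and the rule "no `model` key means obs-only" are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class PipelineContext:
    """Shared state handed from stage to stage (field names are hypothetical)."""
    config: dict[str, Any]
    data: dict[str, Any] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


class Stage(Protocol):
    """Structural interface: any object with a name and run() counts as a stage."""
    name: str

    def run(self, ctx: PipelineContext) -> PipelineContext: ...


class LoadObs:
    name = "load_obs"

    def run(self, ctx: PipelineContext) -> PipelineContext:
        ctx.data["obs"] = list(ctx.config["obs"])
        ctx.log.append(self.name)
        return ctx


class LoadModel:
    name = "load_model"

    def run(self, ctx: PipelineContext) -> PipelineContext:
        ctx.data["model"] = list(ctx.config["model"])
        ctx.log.append(self.name)
        return ctx


class Pair:
    name = "pair"

    def run(self, ctx: PipelineContext) -> PipelineContext:
        # Toy pairing: match values by position.
        ctx.data["paired"] = list(zip(ctx.data["obs"], ctx.data["model"]))
        ctx.log.append(self.name)
        return ctx


def build_stages(config: dict[str, Any]) -> list[Stage]:
    """Assumed rule: drop model-dependent stages when no model block is present."""
    if "model" not in config:
        return [LoadObs()]
    return [LoadObs(), LoadModel(), Pair()]


def run_pipeline(config: dict[str, Any]) -> PipelineContext:
    ctx = PipelineContext(config=config)
    for stage in build_stages(config):
        ctx = stage.run(ctx)
    return ctx
```

Because the stage list is ordinary data, the obs-only case is simply a shorter list, which is the point to contrast with a fixed .open_models() → .open_obs() → .pair_data() call sequence.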

Configuration And Execution

Questions to answer:

How does a YAML config map to runtime stages?
What parts of the workflow are explicit in config versus inferred by the pipeline? (Key: obs-only mode is auto-detected from config structure)
How does Pydantic validation prevent misconfiguration before runtime?
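
A short sketch may help when drafting the answer to the validation question. It uses standard-library dataclasses as a stand-in for Pydantic, so it shows the principle (reject bad config at parse time, before any stage runs) and not the project's actual schema. The three grid modes are taken from the notes above; `GridConfig`, `parse_grid`, `ConfigError`, and the `resolution_deg` key are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional

# Grid modes named in these notes; everything else here is illustrative.
GRID_MODES = {"match_model", "resolution", "explicit"}


class ConfigError(ValueError):
    """Raised while parsing, before any pipeline stage has started."""


@dataclass(frozen=True)
class GridConfig:
    mode: str
    resolution_deg: Optional[float] = None  # hypothetical key name

    def __post_init__(self) -> None:
        if self.mode not in GRID_MODES:
            raise ConfigError(
                f"grid.mode must be one of {sorted(GRID_MODES)}, got {self.mode!r}"
            )
        if self.mode == "resolution" and (
            self.resolution_deg is None or self.resolution_deg <= 0
        ):
            raise ConfigError(
                "grid.resolution_deg must be > 0 when grid.mode is 'resolution'"
            )


def parse_grid(raw: dict[str, Any]) -> GridConfig:
    """Turn a raw mapping (for example from YAML) into a validated object."""
    unknown = set(raw) - {"mode", "resolution_deg"}
    if unknown:
        # Misspelled keys fail loudly instead of being silently ignored.
        raise ConfigError(f"unknown grid keys: {sorted(unknown)}")
    if "mode" not in raw:
        raise ConfigError("grid.mode is required")
    return GridConfig(**raw)
```

The user-facing benefit worth stating in the paper is that a misspelled key or an invalid mode fails in milliseconds with a named field, instead of surfacing partway through a long data load.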

Geometry-Aware Pairing

Questions to answer:

What geometry types are handled directly by the runtime? (5 DataGeometry values: POINT, TRACK, PROFILE, SWATH, GRID)
How should swath-to-grid binning be explained without overstating it as a separate enum geometry? (Answer: it is a strategy that operates on SWATH-geometry data but produces GRID-geometry output. The runtime dispatches it through the MODIS L2 reader path, not through DataGeometry enum dispatch)
What are the performance characteristics? (numba JIT, configurable grid modes)
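
The distinction between the five geometries and swath-to-grid binning can be illustrated as follows. This is a pure-Python sketch under stated assumptions: the enum member names come from the notes above, but the member values, the function name `bin_swath_to_grid`, its signature, and the choice of a simple per-cell mean are all illustrative. It is not the Numba-accelerated implementation in `davinci_monet/pairing/`.

```python
import math
from enum import Enum


class DataGeometry(Enum):
    """The five geometry values named in these notes (string values are assumed)."""
    POINT = "point"
    TRACK = "track"
    PROFILE = "profile"
    SWATH = "swath"
    GRID = "grid"


def bin_swath_to_grid(lats, lons, values, lat0, lon0, res, nlat, nlon):
    """Average swath pixels into a regular lat/lon grid.

    (lat0, lon0) is the lower-left grid corner and res the cell size in degrees.
    Cells that receive no valid pixel are returned as NaN.
    """
    sums = [[0.0] * nlon for _ in range(nlat)]
    counts = [[0] * nlon for _ in range(nlat)]
    for lat, lon, val in zip(lats, lons, values):
        if math.isnan(val):
            continue  # skip missing retrievals
        i = math.floor((lat - lat0) / res)
        j = math.floor((lon - lon0) / res)
        if 0 <= i < nlat and 0 <= j < nlon:  # drop pixels outside the grid
            sums[i][j] += val
            counts[i][j] += 1
    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else math.nan for j in range(nlon)]
        for i in range(nlat)
    ]
```

Note that the function consumes SWATH-like input and returns GRID-like output, and that no sixth enum member is needed to express it, which matches the wording recommended above.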

Statistics And Plotting

Questions to answer:

Which metrics are first-class runtime outputs? (27 paired metrics; fixed descriptive set for obs-only)
Which plots are used as paper evidence versus user-facing examples?
How do obs-only plotters differ from paired plotters architecturally? (ObsPlotter vs. BasePlotter base class)
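
For the metrics question, it may help to have a reference definition of a few standard paired statistics on hand. The sketch below computes mean bias, RMSE, and Pearson correlation with their textbook formulas; whether and under what names these appear among the 27 paired metrics should be checked against `davinci_monet/stats/`. The function name and the returned keys are hypothetical.

```python
import math


def paired_metrics(model, obs):
    """Textbook paired statistics; pairs with a NaN on either side are dropped."""
    pairs = [
        (m, o) for m, o in zip(model, obs)
        if not (math.isnan(m) or math.isnan(o))
    ]
    n = len(pairs)
    if n == 0:
        raise ValueError("no valid model/observation pairs")

    diffs = [m - o for m, o in pairs]
    mean_bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    mean_m = sum(m for m, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    # Correlation is undefined when either series is constant.
    r = cov / math.sqrt(var_m * var_o) if var_m > 0 and var_o > 0 else math.nan

    return {"n": n, "mean_bias": mean_bias, "rmse": rmse, "pearson_r": r}
```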

Performance

Treat this as optional for JOSS unless the numbers become central to the software contribution:

Time filtering at load: 1,630x speedup (163 s → 0.1 s for a 5-month file)
Numba-accelerated grid binning vs. pure-Python baseline
Configurable Dask concurrency for pairing
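
If the numbers are kept, each one should come with a recorded setup. The following sketch shows one way to time a callable and store the result alongside machine details; it is a generic harness, not a script from the repository. The function names and record keys are hypothetical, and the commit hash is passed in by the caller instead of being looked up.

```python
import math
import platform
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over several runs of fn()."""
    best = math.inf
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_s, optimized_s):
    """Ratio of baseline to optimized time, for example 163 s / 0.1 s."""
    if optimized_s <= 0:
        raise ValueError("optimized time must be positive")
    return baseline_s / optimized_s


def benchmark_record(label, baseline_s, optimized_s, commit):
    """Bundle the timing with the context needed to reproduce it."""
    return {
        "label": label,
        "commit": commit,  # supplied by the caller, e.g. a frozen commit hash
        "machine": platform.machine(),
        "system": platform.system(),
        "python": platform.python_version(),
        "baseline_s": baseline_s,
        "optimized_s": optimized_s,
        "speedup": round(speedup(baseline_s, optimized_s), 1),
    }
```

With the timings quoted above, `speedup(163.0, 0.1)` reproduces the 1,630x figure, so the headline number and the raw timings can be kept consistent from a single record.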

Case Study Design

Questions to answer:

Why are ASIA-AQ, DC3, and MODIS AOD enough to support the paper narrative? (Each exercises a distinct workflow type and geometry coverage without redundancy)
Which workflows do they cover?
Which one or two examples are enough for the compact JOSS evidence package?

Case Study	Workflow	Geometries	Novel aspect
ASIA-AQ	Paired	POINT, TRACK	Multi-observation breadth
DC3	Obs-only	TRACK, GRID (LMA)	No-model pipeline
MODIS-AOD	Paired + swath-to-grid	SWATH -> GRID	Satellite binning

Validation And QA

Use this section to gather concise methods language about software quality:

Test coverage and test categories in ../DAVINCI-MONET/tests/
Validation evidence in ../DAVINCI-MONET/tests/, ../DAVINCI-MONET/pyproject.toml, and checked-in analysis workflows
Known limitations and gaps that should be disclosed honestly

Methods Risks

Avoid describing not-yet-implemented behavior as production capability.
Keep observation-only statistics wording accurate: computed and plotted, but not currently exported to CSV by save_results.
Be explicit about which figures come from checked-in artifacts versus regenerated outputs.
SwathGridStrategy is not dispatched via the DataGeometry enum, so do not describe it as a sixth geometry type. It is a strategy used by the MODIS reader pipeline.
Performance numbers (1,630x, etc.) need to be reproducible on a named machine with a frozen commit. Record the benchmark setup.
