Paper Methods
This page maps manuscript methods text to concrete implementation and analysis paths in the repository.
Current default: a JOSS-oriented software paper backed by existing workflow examples instead of a paper centered on one scientific campaign result.
| Topic | Primary paths |
|---|---|
| Lineage and differentiation from MELODIES-MONET | `../DAVINCI-MONET/README.md`, `../DAVINCI-MONET/CLAUDE.md`, `../DAVINCI-MONET/davinci_monet/pipeline/`, `../DAVINCI-MONET/davinci_monet/pairing/` |
| CLI and user surface | `../DAVINCI-MONET/davinci_monet/cli/` |
| Config schema and migration | `../DAVINCI-MONET/davinci_monet/config/` |
| Pipeline stages | `../DAVINCI-MONET/davinci_monet/pipeline/` |
| Pairing strategies | `../DAVINCI-MONET/davinci_monet/pairing/` |
| Plotting system | `../DAVINCI-MONET/davinci_monet/plots/` |
| Statistics framework | `../DAVINCI-MONET/davinci_monet/stats/` |
| Tests and validation | `../DAVINCI-MONET/tests/`, `../DAVINCI-MONET/pyproject.toml`, `../DAVINCI-MONET/CHANGELOG.md` |
| Workflow | Why it matters | Evidence paths |
|---|---|---|
| Paired model-vs-observation | Core atmospheric model evaluation use case | `../DAVINCI-MONET/analyses/asia-aq/` |
| Observation-only | Enables campaign analysis even without model fields | `../DAVINCI-MONET/analyses/dc3/` |
| Satellite swath-to-grid | Extends the framework beyond point and track observations | `../DAVINCI-MONET/analyses/modis-aod/`, `../DAVINCI-MONET/davinci_monet/pairing/` |
For JOSS, these are working notes for the Software design and State of the field sections, not a signal that the paper needs a long standalone methods chapter.
The differentiation from MELODIES-MONET is the most important subsection for JOSS reviewers. It must be obvious on first read.
Questions to answer:
- What changed enough from MELODIES-MONET to justify a separate software paper?
- Which architectural changes are easy to explain and matter to users?
- Which workflow capabilities are genuinely new or substantially cleaner in DAVINCI?
Concrete differentiators to draw from (see full comparison table in Paper Outline):
- Procedural → stage-based pipeline: MELODIES-MONET uses a manual `.open_models()` → `.open_obs()` → `.pair_data()` sequence. DAVINCI uses composable `Stage` Protocol objects with shared `PipelineContext`, enabling obs-only auto-detection and pluggable stage ordering.
- No types → typed runtime with stronger static checks: MELODIES-MONET has no type hints. DAVINCI ships `py.typed`, uses broad type annotations, enables several stricter mypy checks (`check_untyped_defs`, `disallow_incomplete_defs`, `no_implicit_optional`), and validates config with Pydantic schemas at parse time.
- Reader-centric → geometry-driven pairing: MELODIES-MONET pairs by data source. DAVINCI auto-detects geometry (POINT, TRACK, PROFILE, SWATH, GRID) from dataset structure and dispatches to specialized strategies.
- No obs-only mode → auto-detected obs-only pipeline: Entirely new capability. `ObsPlotter` base class with 5 dedicated renderers.
- Ad hoc satellite handling → unified swath-to-grid binning: Numba-accelerated, configurable grid modes (`match_model`, `resolution`, `explicit`).
- 1000+ tests vs. limited coverage: Synthetic data fixtures cover plot types, pairing strategies, config parsing, and integration paths.
- 1,630x observation load speedup: Time filtering at file and data level, not available in MELODIES-MONET.
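To make the first differentiator concrete for the manuscript, the stage-based design can be sketched roughly as follows. This is a minimal illustration, not the code in `davinci_monet/pipeline/`: only the names `Stage` and `PipelineContext` come from the notes above, while the concrete stage classes, the context fields, and the rule "no `model` key means obs-only" are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class PipelineContext:
    """Shared state handed from stage to stage (fields are hypothetical)."""
    config: dict[str, Any]
    data: dict[str, Any] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)


class Stage(Protocol):
    """Structural interface: any object with a name and run() is a stage."""
    name: str

    def run(self, ctx: PipelineContext) -> None: ...


class LoadModels:
    name = "load_models"

    def run(self, ctx: PipelineContext) -> None:
        ctx.data["model"] = ctx.config["model"]
        ctx.log.append(self.name)


class LoadObs:
    name = "load_obs"

    def run(self, ctx: PipelineContext) -> None:
        ctx.data["obs"] = ctx.config["obs"]
        ctx.log.append(self.name)


class Pair:
    name = "pair"

    def run(self, ctx: PipelineContext) -> None:
        # Toy pairing: element-wise (model, obs) tuples.
        ctx.data["paired"] = list(zip(ctx.data["model"], ctx.data["obs"]))
        ctx.log.append(self.name)


def build_stages(config: dict[str, Any]) -> list[Stage]:
    """Assumed rule for this sketch: no 'model' section means obs-only."""
    if "model" not in config:
        return [LoadObs()]
    return [LoadModels(), LoadObs(), Pair()]


def run_pipeline(config: dict[str, Any]) -> PipelineContext:
    ctx = PipelineContext(config=config)
    for stage in build_stages(config):
        stage.run(ctx)
    return ctx
```

Because stages only share the context object, reordering or dropping a stage is a change to the list returned by `build_stages`, which is the property the paper can contrast with a fixed call sequence.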
Questions to answer:
- How does a YAML config map to runtime stages?
- What parts of the workflow are explicit in config versus inferred by the pipeline? (Key: obs-only mode is auto-detected from config structure)
- How does Pydantic validation prevent misconfiguration before runtime?
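When drafting the answer to the validation question, a small stand-in can help show what "fail at parse time" means. The sketch below deliberately uses a standard-library dataclass instead of Pydantic so that it runs without dependencies; it is not the schema in `davinci_monet/config/`. The field names (`start_time`, `end_time`, `obs`, `model`), the `ConfigError` class, and the `obs_only` rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


class ConfigError(ValueError):
    """Raised while parsing, before any pipeline stage runs."""


@dataclass(frozen=True)
class AnalysisConfig:
    # Hypothetical fields; the real schema lives in the Pydantic models.
    start_time: str
    end_time: str
    obs: dict[str, Any]
    model: Optional[dict[str, Any]] = None

    def __post_init__(self) -> None:
        if not self.obs:
            raise ConfigError("at least one observation source is required")
        # ISO 8601 strings in the same format order correctly as text.
        if self.end_time <= self.start_time:
            raise ConfigError("end_time must be after start_time")

    @property
    def obs_only(self) -> bool:
        """Inferred from structure: no model section means obs-only."""
        return not self.model


def parse_config(raw: dict[str, Any]) -> AnalysisConfig:
    allowed = {"start_time", "end_time", "obs", "model"}
    unknown = set(raw) - allowed
    if unknown:
        # Catches typos such as 'modle' instead of silently ignoring them.
        raise ConfigError(f"unknown keys: {sorted(unknown)}")
    try:
        return AnalysisConfig(**raw)
    except TypeError as exc:  # a required key is missing
        raise ConfigError(str(exc)) from exc
```

The point to carry into the paper is the ordering: a typo or an inverted time window is rejected when the config is parsed, not minutes into a run after data has been loaded.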
Questions to answer:
- What geometry types are handled directly by the runtime? (5 DataGeometry values: POINT, TRACK, PROFILE, SWATH, GRID)
- How should swath-to-grid binning be explained without overstating it as a separate enum geometry? (Answer: it is a strategy that operates on SWATH-geometry data but produces GRID-geometry output. The runtime dispatches it through the MODIS L2 reader path, not through DataGeometry enum dispatch)
- What are the performance characteristics? (numba JIT, configurable grid modes)
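A short sketch may help keep the swath-to-grid wording honest: the enum has five members, and binning is a separate function that consumes swath pixels and returns gridded cells. The code below is an unaccelerated illustration of the `resolution` grid mode only, under stated assumptions. The enum member values, the function name `bin_swath_to_grid`, and the cell-indexing convention are hypothetical, and the repository's Numba implementation is not reproduced here.

```python
import math
from enum import Enum


class DataGeometry(Enum):
    # The five member names come from the notes; the values are placeholders.
    POINT = "point"
    TRACK = "track"
    PROFILE = "profile"
    SWATH = "swath"
    GRID = "grid"


def bin_swath_to_grid(lats, lons, values, resolution):
    """Average swath pixels into regular lat/lon cells.

    Returns {(lat_index, lon_index): mean value}, where each index is
    floor(coordinate / resolution). NaN pixels are skipped.
    """
    sums = {}
    counts = {}
    for lat, lon, value in zip(lats, lons, values):
        if math.isnan(value):
            continue
        cell = (math.floor(lat / resolution), math.floor(lon / resolution))
        sums[cell] = sums.get(cell, 0.0) + value
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Note that nothing in the function adds a sixth geometry: the input is SWATH-shaped data, the output is GRID-shaped, and the enum is unchanged.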
Questions to answer:
- Which metrics are first-class runtime outputs? (27 paired metrics; fixed descriptive set for obs-only)
- Which plots are used as paper evidence versus user-facing examples?
- How do obs-only plotters differ from paired plotters architecturally? (ObsPlotter vs. BasePlotter base class)
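The distinction between paired metrics and the obs-only descriptive set can be illustrated with a small sketch. The two metrics and the summary fields below are generic examples chosen for the illustration; whether they match entries in the 27 paired metrics or the obs-only set in `davinci_monet/stats/` is an assumption to check against the code.

```python
import math
from statistics import mean, median, pstdev


def paired_metrics(model, obs):
    """Metrics that need both a model value and an observation."""
    pairs = [
        (m, o) for m, o in zip(model, obs)
        if not (math.isnan(m) or math.isnan(o))
    ]
    if not pairs:
        raise ValueError("no valid model/observation pairs")
    diffs = [m - o for m, o in pairs]
    return {
        "n": len(pairs),
        "mean_bias": mean(diffs),
        "rmse": math.sqrt(mean([d * d for d in diffs])),
    }


def descriptive_stats(obs):
    """Summary of observations alone; no model field is required."""
    valid = [o for o in obs if not math.isnan(o)]
    if not valid:
        raise ValueError("no valid observations")
    return {
        "n": len(valid),
        "mean": mean(valid),
        "median": median(valid),
        "std": pstdev(valid),
        "min": min(valid),
        "max": max(valid),
    }
```

The architectural point for the paper is that the second function has no model argument at all, which mirrors the separate `ObsPlotter` and `BasePlotter` hierarchies.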
Treat this as optional for JOSS unless the numbers become central to the software contribution:
- Time filtering at load: 1,630x speedup (163s -> 0.1s for 5-month file)
- Numba-accelerated grid binning vs. pure-Python baseline
- Configurable Dask concurrency for pairing
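If the load-time filtering result does make it into the paper, the two levels can be explained with a sketch like the one below. It assumes file names carry a `YYYYMMDD` stamp and that records are `(time, value)` tuples; both are assumptions for illustration, as are the function names, and the repository's readers may work differently.

```python
import re
from datetime import datetime

_DATE_IN_NAME = re.compile(r"(\d{8})")


def select_files(paths, start, end):
    """File level: skip files whose YYYYMMDD stamp is outside [start, end]."""
    kept = []
    for path in paths:
        match = _DATE_IN_NAME.search(path)
        if match is None:
            kept.append(path)  # no stamp: cannot rule it out, so keep it
            continue
        stamp = datetime.strptime(match.group(1), "%Y%m%d").date()
        if start.date() <= stamp <= end.date():
            kept.append(path)
    return kept


def filter_records(records, start, end):
    """Data level: keep (time, value) records inside [start, end]."""
    return [(t, v) for t, v in records if start <= t <= end]
```

File-level selection avoids opening files that cannot contribute, while data-level filtering is what helps when a single file spans months, as in the 5-month case quoted above.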
Questions to answer:
- Why are ASIA-AQ, DC3, and MODIS AOD enough to support the paper narrative? (Each exercises a distinct workflow type and geometry coverage without redundancy)
- Which workflows do they cover?
- Which one or two examples are enough for the compact JOSS evidence package?
| Case Study | Workflow | Geometries | Novel aspect |
|---|---|---|---|
| ASIA-AQ | Paired | POINT, TRACK | Multi-observation breadth |
| DC3 | Obs-only | TRACK, GRID (LMA) | No-model pipeline |
| MODIS-AOD | Paired + swath-to-grid | SWATH -> GRID | Satellite binning |
Use this section to gather concise methods language about software quality:
- Test coverage and test categories in `../DAVINCI-MONET/tests/`
- Validation evidence in `../DAVINCI-MONET/tests/`, `../DAVINCI-MONET/pyproject.toml`, and checked-in analysis workflows
- Known limitations and gaps that should be disclosed honestly

- Avoid describing not-yet-implemented behavior as production capability.
- Keep observation-only statistics wording accurate: computed and plotted, but not currently exported to CSV by `save_results`.
- Be explicit about which figures come from checked-in artifacts versus regenerated outputs.
- `SwathGridStrategy` is not dispatched via the `DataGeometry` enum, so do not describe it as a sixth geometry type. It is a strategy used by the MODIS reader pipeline.
- Performance numbers (1,630x, etc.) need to be reproducible on a named machine with a frozen commit. Record the benchmark setup.
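One way to satisfy the reproducibility requirement in the last item is to store the machine and commit alongside every timing. The helper below is a minimal sketch using only the standard library; the function names and record fields are hypothetical, and the commit lookup falls back to `"unknown"` when `git` is unavailable or the script is run outside a checkout.

```python
import json
import platform
import subprocess
import sys
import time


def current_commit():
    """Return the checked-out commit hash, or 'unknown' if git fails."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def benchmark(func, repeats=5):
    """Time func() several times and attach the environment to the result."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        func()
        timings.append(time.perf_counter() - t0)
    return {
        "best_seconds": min(timings),
        "repeats": repeats,
        "machine": platform.node(),
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        "commit": current_commit(),
    }


if __name__ == "__main__":
    # Placeholder workload; substitute the observation-loading call under test.
    print(json.dumps(benchmark(lambda: sum(range(100_000))), indent=2))
```

Committing the JSON output next to the figure that quotes the number makes it clear which machine and revision produced it.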
Related pages:
- Implementation Plan
- Code Review
- Tech Debt
- TODO
- Derecho
- Plotting Alternatives
- Plans
- Design Docs
- Paper (internal)