-
Notifications
You must be signed in to change notification settings - Fork 1
Code Review
Migrated from REVIEW_CODEX.md in the main repository.
Reviewer: Codex (GPT-5)
Reviewed repository structure and key runtime paths end-to-end. Attempted full test run, but execution was blocked by missing runtime dependencies.
Reviewer: Claude Opus 4.6
Verified all 11 Codex findings against source code with line-level inspection. Identified additional issues not covered by original review.
Verdict on Codex Findings: 11/11 confirmed valid
-
Plot pipeline is disconnected from validated config schema - Plot schema defines
data: list[str]but runtime readspairsinstead. No mapping logic exists. -
Requested statistics metrics can be silently ignored - Config has default
stat_listthat shadows user-suppliedmetrics, affecting all production configs. -
ModelData.extract_surface()ignores CESM vertical convention - Always takesisel({level_dim: 0})when should check if pressure increases with index.
-
Relative file paths resolved against process CWD, not config file directory.
-
Skipped stages not finalized in runner UI/log lifecycle —
SKIPPEDstatus leaves live animation running. -
Time filtering has off-by-one bug for ISO format timestamps — regex doesn't match
Tseparator. -
time_toleranceis largely not enforced during pairing — all 5 strategies accept but none enforce it.
-
NME_%fallback formula in CSV summary is incorrect — uses RMSE instead of ME. -
CLI
--strictvalidation mode is not implemented — flag shown but not wired through. -
Swath pairing collapses residual dimensions by averaging — should interpolate instead.
-
Statistics stage silently swallows per-pair computation errors — not logged to console.
-
Variable name collision risk in paired dataset assembly.
-
Pairing stage can report success even when all pairings fail — returns COMPLETED when
paired_count == 0.
-
Statistics summary CSV loses pair identity — no
Paircolumn. -
README example passes wrong config type to
PipelineContext. -
Inconsistent Dask worker configuration between stages.
-
compute_tropospheric_column()defined but never used — dead code.
- No end-to-end tests for plotting using schema-native
plots.datareferences - No tests asserting
--strictvalidation behavior - No tests for
time_toleranceenforcement across pairing strategies - No tests for relative-path resolution from non-CWD locations
- No tests for
ModelData.extract_surface()with CESM-convention vertical coordinates - No tests for ISO-format (
Tseparator) time filtering behavior
Strong structural organization and substantial test coverage (792 tests passing) around core primitives. Clean, well-decomposed architecture.
Most impactful issues are integration mismatches between config schema and runtime pipeline stages.
Reviewer: Codex (GPT-5)
M1) Critical: MODISL2AODReader does not match the real monetio MODIS API.
M2) High: The standard pipeline still has no MODIS-specific load-and-bin path.
M3) High: SwathGridStrategy is implemented but unreachable from pipeline.
M4) Medium: The proposed MODIS YAML contract not compatible with current runtime contract.
M5) Medium: Test coverage misses real MODIS integration path.
- Implementation Plan
- Code Review
- Tech Debt
- TODO
- Derecho
- Plotting Alternatives
- Plans
- Design Docs
- Paper (internal)