docxology · Jun 12, 2026
@@ -3,7 +3,7 @@
 > Auto-generated by `scripts/generate_counts.py` from live repo state. Do not edit
 > manually — run `uv run python scripts/generate_counts.py --write` to refresh.
 
-**Generated from live repo state on 2026-06-11 (UTC).** Volatile literals are re-derived on every run: tracked `infrastructure/` Python-file count via `git ls-files infrastructure | grep .py` (**551**), project-scope + publishing test collection via `pytest --collect-only` (**228** / **395**), the public exemplar roster, and the importable module list. The per-exemplar test/coverage snapshot table is a measured snapshot (see Test Status).
+**Generated from live repo state on 2026-06-11 (UTC).** Volatile literals are re-derived on every run: tracked `infrastructure/` Python-file count via `git ls-files infrastructure | grep .py` (**565**), project-scope + publishing test collection via `pytest --collect-only` (**228** / **395**), the public exemplar roster, and the importable module list. The per-exemplar test/coverage snapshot table is a measured snapshot (see Test Status).
 
 This file aggregates verifiable facts from discovery scripts, CI configuration, and test execution. Human-written documentation should link here rather than duplicate lists or numbers.
 
@@ -89,7 +89,7 @@ Tracked Python modules (matches the drift gate):
 git ls-files infrastructure | grep -c '\.py$'
 ```
 
-(Last refreshed count: **551** on 2026-06-11 UTC — point-in-time; re-derive with the command above, the literal drifts as the tree changes.)
+(Last refreshed count: **565** on 2026-06-11 UTC — point-in-time; re-derive with the command above, the literal drifts as the tree changes.)
 
 See `infrastructure/AGENTS.md` for module-specific function signatures and entry points.
 

@@ -0,0 +1,42 @@
+# Documentation mega-file decomposition policy
+
+Human-authored guides above **800 lines** are tracked as **P1 watch** items in
+[`infrastructure/AGENTS.md`](../../infrastructure/AGENTS.md). They are not CI
+failures; decomposition is done when a guide's edit churn or navigation cost
+justifies the split.
+
+## When to split
+
+Split a mega guide when **any** of the following hold:
+
+1. Two or more distinct audiences (operator vs author vs API consumer) share one file.
+2. More than three unrelated TOC sections are edited in the same release cycle.
+3. The file contains more than ~40 internal anchor links (count occurrences of `](#`).
+4. A new leaf would drop the parent below **650 lines** without losing narrative flow.
+
+Do **not** split generated inventories (`docs/_generated/*`, `api-reference.md`);
+those are refreshed by scripts and are exempt.
+
+## Current P1 watch list (2026-06-11)
+
+| Path | Lines | Suggested leaf topics |
+| --- | ---: | --- |
+| [`docs/reference/api-reference.md`](../reference/api-reference.md) | 3245 | Generated — no split; refresh via `scripts/generate_api_reference_doc.py` |
+| [`docs/rules/manuscript_style.md`](../rules/manuscript_style.md) | 1145 | LaTeX math · citations · figures · accessibility |
+| [`docs/guides/figures-and-analysis.md`](../guides/figures-and-analysis.md) | 860 | Registry figures · analysis scripts · manifest hooks |
+| [`docs/reference/common-workflows.md`](../reference/common-workflows.md) | 813 | Pipeline · validation · publishing |
+| [`docs/rules/llm_standards.md`](../rules/llm_standards.md) | 800 | Prompt hygiene · Ollama workflow · review templates |
+
+## Leaf naming
+
+- Place operational splits under `docs/operational/<topic>/`.
+- Place author-facing splits under `docs/guides/<topic>-*.md`.
+- Keep the parent as a **hub** with a short intro + links; do not duplicate prose.
+
+## Verification
+
+After splitting:
+
+1. Run `uv run python scripts/lint_docs.py`.
+2. Update hub links in [`docs/documentation-index.md`](../documentation-index.md).
+3. Refresh measured counts: `uv run python scripts/generate_counts.py --write`.
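
Reviewer note: the two mechanical criteria in the policy above (the 800-line watch threshold and the roughly 40 internal anchors counted via `](#`) could be measured with a few lines of Python. This is a minimal sketch, not part of the repository: `GuideStats`, `measure_guide`, and the constant names are hypothetical, and it assumes "above 800 lines" means strictly greater than 800.

```python
import re
from dataclasses import dataclass

# Thresholds taken from the policy text; the names are hypothetical.
WATCH_LINES = 800    # guides above this many lines go on the P1 watch list
ANCHOR_LIMIT = 40    # approximate ceiling for internal anchor links

# Matches the `](#` that ends the link text of a same-file Markdown link.
INTERNAL_LINK = re.compile(r"\]\(#")


@dataclass(frozen=True)
class GuideStats:
    lines: int
    internal_anchors: int

    @property
    def on_watch(self) -> bool:
        # Assumption: "above 800" is read as strictly greater than 800.
        return self.lines > WATCH_LINES

    @property
    def anchor_heavy(self) -> bool:
        return self.internal_anchors > ANCHOR_LIMIT


def measure_guide(text: str) -> GuideStats:
    """Count lines and internal anchor links in one Markdown document."""
    return GuideStats(
        lines=len(text.splitlines()),
        internal_anchors=len(INTERNAL_LINK.findall(text)),
    )


# Small synthetic document: 41 lines, each holding one internal link.
sample = "\n".join(f"[section {i}](#section-{i})" for i in range(41))
stats = measure_guide(sample)
print(stats.lines, stats.internal_anchors, stats.on_watch, stats.anchor_heavy)
```

On a real guide the text would come from something like `Path("docs/rules/manuscript_style.md").read_text(encoding="utf-8")`. The audience and edit-churn criteria still need human judgment; the sketch covers only the countable ones.
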
@@ -70,7 +70,9 @@ NEXT STEPS
 
 This block is the canonical end-of-run summary. It is rendered by
 `format_multi_project_detailed_report` in
-[`infrastructure/core/pipeline/multi_project.py`](../../../infrastructure/core/pipeline/multi_project.py),
+[`infrastructure/reporting/multi_project_report.py`](../../../infrastructure/reporting/multi_project_report.py)
+(re-exported from
+[`infrastructure/core/pipeline/multi_project.py`](../../../infrastructure/core/pipeline/multi_project.py)),
 emitted by the orchestrator in
 [`infrastructure/orchestration/pipeline_runner.py`](../../../infrastructure/orchestration/pipeline_runner.py),
 and persisted verbatim to `docs/_generated/last-run-summary.md` after every

@@ -298,8 +298,9 @@ fi
 ## Pipeline Summary Format
 
 The end-of-run terminal summary block is rendered by **`format_multi_project_detailed_report`**
-in [`infrastructure/core/pipeline/multi_project.py`](../../infrastructure/core/pipeline/multi_project.py).
-This is the **canonical pipeline-completion reporting surface** — every full-run option
+in [`infrastructure/reporting/multi_project_report.py`](../../infrastructure/reporting/multi_project_report.py)
+(re-exported from [`infrastructure/core/pipeline/multi_project.py`](../../infrastructure/core/pipeline/multi_project.py)
+for backward compatibility). This is the **canonical pipeline-completion reporting surface** — every full-run option
 (interactive menu, `./run.sh --pipeline`, and direct `infrastructure.orchestration` invocations)
 prints this block via the orchestrator in
 [`infrastructure/orchestration/pipeline_runner.py`](../../infrastructure/orchestration/pipeline_runner.py).
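
Reviewer note: the re-export arrangement described in this hunk can be illustrated in isolation. The sketch below is an assumption-laden stand-in, not the repository's code: `reporting_demo`, `legacy_demo`, and `format_report` are hypothetical names, and the modules are built in memory with `types.ModuleType` so the example runs on its own. It shows that a function defined in a new module and re-exported from the old one stays importable from both paths and is the same object.

```python
import sys
import types


def format_report(results: dict[str, bool]) -> str:
    """Hypothetical formatter: one status line per project, sorted by name."""
    return "\n".join(
        f"{name}: {'ok' if ok else 'FAILED'}" for name, ok in sorted(results.items())
    )


# Hypothetical new home of the function (stands in for the reporting module).
new_home = types.ModuleType("reporting_demo")
new_home.format_report = format_report
sys.modules["reporting_demo"] = new_home

# Hypothetical legacy module: it defines nothing itself and only re-exports
# the name, so existing `from legacy_demo import format_report` keeps working.
legacy = types.ModuleType("legacy_demo")
legacy.format_report = new_home.format_report
legacy.__all__ = ["format_report"]
sys.modules["legacy_demo"] = legacy

from legacy_demo import format_report as via_legacy  # old import path
from reporting_demo import format_report as via_new  # new import path

# Both names are bound to the same function object, so behavior cannot diverge.
same_object = via_legacy is via_new
report = via_legacy({"beta": False, "alpha": True})
print(same_object)
print(report)
```

In a real package the legacy module would typically hold a single `from ... import format_report` line plus an `__all__` entry. Callers can then migrate to the new path at their own pace.
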

@@ -3073,7 +3073,7 @@ Scan extracted text for common rendering issues.
 
 ### `validate_citations`
 
-*function — defined in `infrastructure.validation.content.markdown_validator`*
+*function — defined in `infrastructure.validation.content.validator_citations`*
 
 ```python
 validate_citations(md_paths: list[str], repo_root: str | Path, bib_file: str | Path | list[str | Path] | None=None) -> list[DiagnosticEvent]
@@ -3103,7 +3103,7 @@ Validate figure registry against manuscript references.
 
 ### `validate_images`
 
-*function — defined in `infrastructure.validation.content.markdown_validator`*
+*function — defined in `infrastructure.validation.content.validator_images`*
 
 ```python
 validate_images(md_paths: list[str], repo_root: str | Path, extra_search_dirs: list[str | Path] | None=None) -> list[DiagnosticEvent]
@@ -3123,7 +3123,7 @@ Validate all markdown files in a directory.
 
 ### `validate_math`
 
-*function — defined in `infrastructure.validation.content.markdown_validator`*
+*function — defined in `infrastructure.validation.content.validator_math`*
 
 ```python
 validate_math(md_paths: list[str], repo_root: str | Path) -> list[DiagnosticEvent]
@@ -3143,7 +3143,7 @@ Validate complete output directory structure.
 
 ### `validate_pandoc_pitfalls`
 
-*function — defined in `infrastructure.validation.content.markdown_validator`*
+*function — defined in `infrastructure.validation.content.validator_pitfalls`*
 
 ```python
 validate_pandoc_pitfalls(md_paths: list[str], repo_root: str | Path) -> list[DiagnosticEvent]
@@ -3163,7 +3163,7 @@ Perform comprehensive validation of PDF rendering.
 
 ### `validate_refs`
 
-*function — defined in `infrastructure.validation.content.markdown_validator`*
+*function — defined in `infrastructure.validation.content.validator_refs`*
 
 ```python
 validate_refs(md_paths: list[str], repo_root: str | Path, labels: set[str], anchors: set[str]) -> list[DiagnosticEvent]

@@ -103,8 +103,8 @@ Tracked after the P0 composability pass (stage registry, unified markdown discov
 | `validation/integrity/link_extract.py` | 446 | **Done** (2026-06-11 close-out) — path helpers in `_link_normalize.py` (96 LOC); skip policy in `link_skip_policy.py` (144 LOC) |
 | `validation/integrity/_link_normalize.py` | 96 | **Done** (2026-06-11 close-out) — project-root + template path resolution for link validation |
 | `validation/integrity/link_skip_policy.py` | 144 | **Done** (2026-06-11) — `PATH_SKIP_*` tables + `should_validate_path()` |
-| `rendering/pipeline.py` | 665 | **Partial** (2026-06-11) — DOCX metadata via `build_pandoc_metadata()`; P2: `_manuscript_source.py` + `_combined_exports.py` |
-| `validation/content/markdown_validator.py` | 607 | Extract image/ref/math validators + pitfalls/citations leaves (discovery in `content/discovery.py`) |
+| `rendering/pipeline.py` | ~180 | **Done** (2026-06-11) — orchestrator; leaves `_manuscript_source.py`, `_combined_exports.py` |
+| `validation/content/markdown_validator.py` | ~75 | **Done** (2026-06-11) — facade; leaves `validator_{images,refs,math,pitfalls,citations}.py` |
 | `search/literature/backends.py` | — | **Done** (2026-05-29 Wave 5) — package `search/literature/backends/` |
 | `doctor/detectors.py` | — | **Done** (2026-05-29 Wave 6) — package `doctor/detectors/` |
 | `reporting/_dashboard_charts.py` | 43 | **Done** (2026-05-29 Wave 7) — facade; chart families in `_dashboard_charts_*.py` |
@@ -123,7 +123,10 @@ Tracked after the P0 composability pass (stage registry, unified markdown discov
 | `project/drift/checks_boundary.py` | 95 | **Done** (2026-06-11 v2) — src/ ↔ infrastructure import boundary |
 | `publishing/archival.py` | 669 | P1 watch: split provider adapters before next archival feature |
 | `autoresearch/validation_checks.py` | 661 | P1 watch: monitor before next autoresearch feature wave |
-| `rendering/render_all_cli.py` | — | Remove `sys.path.insert`; use `--project` / discovery like other CLIs |
+| `rendering/render_all_cli.py` | — | **Done** (2026-06-11) — `--project` + `resolve_project_root`; legacy CWD `manuscript/` retained |
+| `documentation/generate_glossary_cli.py` | — | **Done** (2026-06-11) — top-level imports; no `sys.path.insert` |
+| Doc megas (>800 LOC) | — | Policy: [`docs/maintenance/doc-mega-decomposition.md`](../docs/maintenance/doc-mega-decomposition.md) |
+| Test module line count | — | Advisory: `scripts/gates/module_line_count_check.py --include-tests` (warn ≥800, no fail) |
 | Package barrels | — | Lazy `__getattr__` on wide `__init__.py` hubs (`validation`, `reporting`, `publishing`, `doctor`) |
 
 ## Function Signatures

@@ -9,6 +9,11 @@
 from pathlib import Path
 
 from infrastructure.core.logging.utils import get_logger
+from infrastructure.documentation.glossary_gen import (
+    build_api_index,
+    generate_markdown_table,
+    inject_between_markers,
+)
 
 logger = get_logger(__name__)
 
@@ -67,17 +72,6 @@ def main() -> int:
 
     _ensure_glossary_file(glossary_md)
 
-    sys.path.insert(0, str(repo))
-    try:
-        from infrastructure.documentation.glossary_gen import (
-            build_api_index,
-            generate_markdown_table,
-            inject_between_markers,
-        )
-    except Exception as exc:  # noqa: BLE001 — dynamic import; any import error is handled identically
-        logger.error(f"Failed to import glossary_gen from infrastructure/documentation/: {exc}")
-        return 1
-
     text = glossary_md.read_text(encoding="utf-8")
 
     entries = build_api_index(str(src_dir))