This document provides a detailed overview of the DARIL repository organization.
```
DARIL/
├── README.md       # Main project documentation
├── .gitignore      # Git ignore patterns
├── STRUCTURE.md    # This file
├── configs/        # Configuration files
├── scripts/        # Executable scripts
├── src/            # Core source code
├── notebooks/      # Interactive demos
├── docs/           # Documentation
├── outputs/        # Generated outputs (gitignored)
├── data/           # Raw datasets (user-provided)
├── docker/         # Container configurations
└── archive/        # Historical versions
```
## `configs/`

Contains YAML configuration files for different experiments:

- `config_dgx_all_v8.yaml` - Main experimental configuration
- `config_dgx_all.yaml` - Alternative configurations
- Defines hyperparameters, model settings, and data paths
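A minimal sketch of how an experiment script might load one of these configs. The keys shown are hypothetical, and the snippet assumes PyYAML is installed (a common choice for `.yaml` configs like these):

```python
# Hypothetical sketch: load a YAML experiment config into a nested dict.
# Keys below are illustrative, not the actual schema of config_dgx_all_v8.yaml.
import yaml

EXAMPLE = """
experiment: daril_v8
model:
  encoder: mha
  decoder: gpt2
training:
  lr: 0.0003
  epochs: 100
data:
  root: data/cholect50
"""

def load_config(text: str) -> dict:
    """Parse a YAML config string into a nested dict."""
    return yaml.safe_load(text)

cfg = load_config(EXAMPLE)
print(cfg["model"]["decoder"])  # -> gpt2
print(cfg["training"]["lr"])    # -> 0.0003
```

In practice the script would read the file given via `--config` instead of an inline string.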
Usage: Reference these files when running experiments:

```bash
python scripts/run_experiment_v8.py --config configs/config_dgx_all_v8.yaml
```

## `scripts/`

Entry points for training, evaluation, and analysis:
- `run_experiment_v8.py` - Primary training/evaluation pipeline
- `run_paper_generation.py` - Generate all paper figures
- `runai.sh` - GPU cluster job submission
- `run_file.sh` - Batch processing utilities
- `delete_jobs.sh` - Cluster job management
Usage: Run experiments from the project root:

```bash
python scripts/run_experiment_v8.py --method daril
```

## `src/`

Main implementation modules organized by functionality:
### `src/training/`

- `autoregressive_il_trainer.py` - DARIL model trainer (main IL approach)
- `world_model_trainer.py` - World model learning (Dreamer-inspired)
- `world_model_rl_trainer.py` - RL training in learned world models
- `irl_direct_trainer.py` - Inverse Reinforcement Learning
- `irl_next_action_trainer.py` - IRL for next-action prediction
- `archive/` - Previous implementations and experimental variants
### `src/evaluation/`

- Metrics computation (mAP, precision, recall)
- Multi-horizon evaluation (1s, 2s, 3s, 5s, 10s, 20s)
- Comparison frameworks for IL vs RL
- Performance analysis tools
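To make the metric concrete, here is a minimal pure-Python sketch of multi-horizon mAP, the kind of computation the evaluation modules perform. Function names and the toy data are illustrative, not the repository's actual API:

```python
# Illustrative mAP over prediction horizons; names and data are hypothetical.

def average_precision(scores, labels):
    """AP for one class: mean precision at each true-positive rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

def mean_average_precision(per_class):
    """mAP: average AP over (scores, labels) pairs, one per class."""
    return sum(average_precision(s, l) for s, l in per_class) / len(per_class)

# Evaluate predictions at several future horizons (seconds).
horizons = {1: [([0.9, 0.2, 0.7], [1, 0, 1])],
            5: [([0.4, 0.8, 0.1], [1, 0, 0])]}
for h, per_class in horizons.items():
    print(f"{h}s mAP = {mean_average_precision(per_class):.3f}")
```

The real evaluation presumably runs this per IVT triplet class at each of the horizons listed above (1s through 20s).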
### `src/models/`

- MHA (Multi-Head Attention) encoder for current action recognition
- GPT-2 decoder for autoregressive future prediction
- World model architectures (VAE-based dynamics models)
- RL policy networks (PPO, SAC implementations)
- Baseline models for comparison
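As a rough illustration of the MHA encoder idea, here is a multi-head self-attention forward pass in NumPy. Shapes, head slicing, and the absence of learned projections are simplifications for readability; the actual encoder presumably uses a deep-learning framework with trained weights:

```python
# Simplified multi-head self-attention forward pass (no learned projections).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads):
    """x: (seq_len, d_model) -> (seq_len, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        q = k = v = x[:, h * d_head:(h + 1) * d_head]  # per-head slice
        attn = softmax(q @ k.T / np.sqrt(d_head))      # (seq_len, seq_len)
        heads.append(attn @ v)
    return np.concatenate(heads, axis=-1)

x = np.random.default_rng(0).normal(size=(4, 8))  # 4 frames, 8-dim features
out = multi_head_attention(x, num_heads=2)
print(out.shape)  # (4, 8)
```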
### `src/environment/`

- Surgical action planning environment (Gym-compatible)
- State representations from video features
- Action space definitions (100 IVT triplet classes)
- Reward functions (expert-similarity, outcome-based)
- Episode management and trajectory handling
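The pieces above can be sketched as a minimal Gym-style environment. This is a deliberate simplification under stated assumptions: states are precomputed feature vectors, and the reward is a binary expert-similarity signal (1.0 when the agent matches the expert's next triplet). The class and attribute names are hypothetical:

```python
# Hypothetical Gym-style sketch of the surgical action-planning environment.
import numpy as np

NUM_ACTIONS = 100  # IVT triplet classes

class SurgicalActionEnv:
    def __init__(self, features, expert_actions):
        self.features = features              # (T, feat_dim) video features
        self.expert_actions = expert_actions  # expert triplet class per step
        self.t = 0

    def reset(self):
        self.t = 0
        return self.features[self.t]

    def step(self, action):
        # Expert-similarity reward: 1.0 iff the agent matches the expert.
        reward = 1.0 if action == self.expert_actions[self.t] else 0.0
        self.t += 1
        done = self.t >= len(self.expert_actions)
        obs = self.features[min(self.t, len(self.features) - 1)]
        return obs, reward, done, {}

env = SurgicalActionEnv(np.zeros((3, 16)), expert_actions=[5, 12, 99])
obs = env.reset()
obs, r, done, _ = env.step(5)
print(r, done)  # 1.0 False
```

An outcome-based reward (also listed above) would replace the per-step match with a terminal signal.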
### `src/utils/`

- `metrics.py` - Evaluation metrics
- `logger.py` - Training logging
- `visualization.py` - Plotting utilities
- `optimizer_scheduler.py` - Learning rate schedules
- `rl_optimization.py` - RL-specific utilities
- Various plotting and analysis tools
### `src/debugging/`

- `rl_debug_tools.py` - RL training diagnostics
- `rl_diagnostic_script.py` - Troubleshooting utilities
## `notebooks/`

HTML-based interactive demos and Jupyter notebooks:
- `enhanced_interactive_viz.html` - Interactive surgical action visualizer
- `interactive_surgical_grid.html` - Grid-based action timeline
- `updated_visualization.html` - Updated visualization dashboard
- `visualization/` - Visualization module code
- `surgical_action_visualizer.py` - Visualization generation
- `map_horizon_plotter.py` - Multi-horizon mAP plots
- `archive/` - Previous visualization versions
Usage: Open the HTML files in a browser for interactive exploration
## `docs/`

### `docs/paper_manuscript/`

- LaTeX source files
- MICCAI 2025 COLAS Workshop submission
- Figures, tables, and supplementary materials
### `docs/paper_notes/`

- `rl_mechanics_explanation.md` - RL implementation details
- `strategic_improvement_plan.md` - Future directions
- `safety_guardrails_framework.md` - Clinical safety considerations
- `repo_structure_design.md` - This repository's design philosophy
- Various planning and analysis documents
### `docs/paper_generation/`

- `paper_generator.py` - Automated figure generation
- Scripts to create publication-ready visualizations
## `outputs/`

Contains all experiment outputs; not tracked by Git:
### `outputs/results/`

- Experiment logs organized by timestamp
- JSON files with metrics
- Evaluation reports
- Comparative analysis data
### `outputs/models_saved/`

- Trained model weights (`.pt` files)
- Best model checkpoints
- Training state for resuming
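The checkpoint/resume pattern can be sketched as below. The real trainer presumably uses `torch.save` to write the `.pt` files; `pickle` stands in here so the snippet runs without PyTorch, and the field names are hypothetical:

```python
# Hypothetical checkpoint save/resume sketch (pickle in place of torch.save).
import os
import pickle
import tempfile

def save_checkpoint(path, epoch, weights, optimizer_state):
    with open(path, "wb") as f:
        pickle.dump({"epoch": epoch, "weights": weights,
                     "optimizer": optimizer_state}, f)

def load_checkpoint(path):
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "best_model.pt")
save_checkpoint(path, epoch=17, weights={"w": [0.1]}, optimizer_state={"lr": 3e-4})
ckpt = load_checkpoint(path)
print(ckpt["epoch"])  # 17 -> resume training from this epoch
```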
### `outputs/logs/`

- TensorBoard event files
- Text logs
- Training progress tracking
### `outputs/figures/`

- Publication figures
- Training curves
- Evaluation visualizations
- `enhanced_data/` - Augmented dataset versions
- `datasets/` - Processed features
- Cached computations
Note: These directories are created automatically during training and are excluded from version control due to size.
## `data/`

User-provided raw data (not included in the repository):
```
data/
└── cholect50/      # CholecT50 dataset
    ├── video_01/   # Swin features per video
    ├── video_02/
    └── ...
```
Setup: Download CholecT50 from CAMMA and place the extracted features here.
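Once the features are in place, discovering the per-video directories is straightforward. A minimal sketch (the helper name is hypothetical; the demo builds a throwaway tree mimicking the layout above rather than touching `data/`):

```python
# Hypothetical helper for discovering per-video feature directories.
import tempfile
from pathlib import Path

def list_video_dirs(root):
    """Return sorted per-video directory names, e.g. video_01, video_02."""
    return sorted(p.name for p in Path(root).glob("video_*") if p.is_dir())

# Demo on a temporary tree mimicking data/cholect50/.
root = Path(tempfile.mkdtemp())
for name in ("video_02", "video_01"):
    (root / name).mkdir()
print(list_video_dirs(root))  # ['video_01', 'video_02']
```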
## `docker/`

Container definitions for reproducible environments:
- Dockerfile for training environment
- Docker Compose configurations
- GPU-enabled container setup
## `archive/`

Legacy implementations and experimental code:
- Previous experiment configurations
- Deprecated model implementations
- Research explorations
- Backup code
Note: Code here is kept for reference but may not be maintained.
## Quick Navigation

**Train a model?**
→ `scripts/run_experiment_v8.py` + `configs/config_dgx_all_v8.yaml`

**Understand the DARIL model?**
→ `src/training/autoregressive_il_trainer.py` + `src/models/`

**Evaluate results?**
→ `src/evaluation/` + check `outputs/results/`

**Create visualizations?**
→ `notebooks/visualization/` + `docs/paper_generation/`

**Modify RL environments?**
→ `src/environment/`

**Debug training issues?**
→ `src/debugging/` + `outputs/logs/`
## Naming Conventions

- Scripts: Descriptive, action-based names (`run_experiment_v8.py`)
- Modules: Noun-based names (`trainer.py`, `evaluator.py`)
- Configs: Context + version (`config_dgx_all_v8.yaml`)
- Outputs: Timestamped directories (`2025-07-07_00-11-00/`)
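The timestamped-directory convention can be generated with a one-line `strftime` pattern. A minimal sketch (the helper name is hypothetical; the format string is inferred from the example `2025-07-07_00-11-00/`):

```python
# Hypothetical helper for timestamped output directories.
from datetime import datetime
from pathlib import Path

def make_run_dir(base="outputs", now=None):
    """Build an outputs/<YYYY-MM-DD_HH-MM-SS> path for one experiment run."""
    stamp = (now or datetime.now()).strftime("%Y-%m-%d_%H-%M-%S")
    return Path(base) / stamp

print(make_run_dir(now=datetime(2025, 7, 7, 0, 11, 0)))
# -> outputs/2025-07-07_00-11-00
```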
## Data Flow

```
data/cholect50
      ↓
[Feature Extraction]
      ↓
src/training/autoregressive_il_trainer.py
      ↓
outputs/models_saved/best_model.pt
      ↓
src/evaluation/
      ↓
outputs/results/metrics.json
      ↓
docs/paper_generation/paper_generator.py
      ↓
outputs/figures/publication_figures/
```
## Typical Workflow

1. Configure: Edit `configs/config_dgx_all_v8.yaml`
2. Train: Run `python scripts/run_experiment_v8.py`
3. Monitor: Check `outputs/logs/` and TensorBoard
4. Evaluate: Metrics are saved to `outputs/results/`
5. Visualize: Generate figures with `scripts/run_paper_generation.py`
6. Analyze: Review results in `outputs/figures/`
## File Size Guidelines

- Small files (< 1 MB): Track in Git (code, configs, docs)
- Medium files (1-100 MB): Exclude from Git; store in `outputs/`
- Large files (> 100 MB): User-provided in `data/` or external storage
## Best Practices

- Never commit `outputs/` - Auto-generated content
- Version-control configs - Track experiment setups
- Document in `docs/paper_notes/` - Research decisions
- Archive old code - Move it to `archive/` rather than deleting it
- Use absolute imports from the project root: `from src.models import ...`
## Related Resources

- `README.md` - Main project documentation
- `docs/paper_manuscript/` - Paper LaTeX source
- Paper (arXiv) - Published work
Last Updated: October 23, 2025
Repository: DARIL-When-Imitation-Learning-outperforms-Reinforcement-Learning-in-Surgical-Action-Planning