Add DTW, nDTW, and SDTW trajectory metrics #12

Open
nalinraut wants to merge 2 commits into AmeyaWagh:main from nalinraut:feature/dtw-metrics

@nalinraut

Add Dynamic Time Warping (DTW) based metrics for evaluating trajectories that may differ in length or temporal alignment. These metrics are particularly useful for evaluating vision-language-action (VLA) models and policies that use action chunking (e.g., ACT, Diffusion Policy).

New metrics:

  • DTWDistance: Raw DTW distance using dynamic programming (lower=better)
  • NormalizedDTW: Mapped to (0,1] using exp(-DTW/(|R|*d)) (higher=better)
  • SuccessWeightedDTW: nDTW weighted by task success (SDTW = nDTW * Success)
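
For reference, the three quantities can be written out as below. This follows my reading of Ilharco et al. and is a sketch, not a transcript of the code in this PR: the symbols P (predicted trajectory), R (reference trajectory), and δ (pointwise distance, assumed Euclidean) are introduced here for illustration, and d is assumed to be the success-threshold distance used as the normalization factor.

```latex
\begin{aligned}
D(0,0) &= 0, \quad D(i,0) = D(0,j) = \infty \quad (i, j > 0) \\
D(i,j) &= \delta(p_i, r_j) + \min\{D(i-1,j),\ D(i,j-1),\ D(i-1,j-1)\} \\
\mathrm{DTW}(P,R) &= D(|P|, |R|) \\
\mathrm{nDTW}(P,R) &= \exp\!\left(-\frac{\mathrm{DTW}(P,R)}{|R|\, d}\right) \\
\mathrm{SDTW}(P,R) &= \mathrm{nDTW}(P,R) \cdot \mathrm{Success}, \quad \mathrm{Success} \in \{0, 1\}
\end{aligned}
```

Since DTW is non-negative, nDTW equals 1 only for a zero-cost alignment and decays toward 0 as the alignment cost grows.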

Key features:

  • Support for trajectories of different lengths (core advantage over MSE/ATE)
  • Tolerates temporal misalignment (hesitation, speed differences)
  • Optional custom normalization factor
  • Full torchmetrics.Metric compatibility with distributed training support
  • Comprehensive test suite and example usage
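
To illustrate the first two points, here is a minimal pure-Python sketch (standard library only, not the tensor implementation in this PR). The function names `dtw_distance` and `normalized_dtw`, the `threshold` parameter, and the Euclidean point cost are assumptions made for the example. It shows a trajectory that pauses at a waypoint, and therefore has a different length from the reference, still aligning at zero cost.

```python
import math


def dtw_distance(predicted, reference):
    """DTW cost between two sequences of points (Euclidean point cost assumed)."""
    n, m = len(predicted), len(reference)
    # cost[i][j]: best alignment cost of predicted[:i] against reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(predicted[i - 1], reference[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # predicted advances, reference point repeats
                cost[i][j - 1],      # reference advances, predicted point repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def normalized_dtw(predicted, reference, threshold=1.0):
    """exp(-DTW / (|R| * d)), with d passed as the hypothetical `threshold`."""
    return math.exp(-dtw_distance(predicted, reference) / (len(reference) * threshold))


reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# Same path, but the policy hesitates at the middle waypoint (5 steps vs. 3).
hesitant = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
# Same timing, but shifted one unit sideways.
offset = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

print(dtw_distance(hesitant, reference))  # 0.0
print(dtw_distance(offset, reference))    # 3.0
print(normalized_dtw(offset, reference))  # exp(-1), roughly 0.368
```

A pointwise metric such as MSE is not defined for the 5-step versus 3-step pair without resampling, whereas DTW absorbs the repeated waypoint and only penalizes the spatial offset.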

Reference: Ilharco et al., "General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping," arXiv:1907.05446, NeurIPS ViGIL Workshop, 2019.

from torchmetrics import Metric


def _compute_dtw(predicted: Tensor, reference: Tensor) -> Tensor:

Owner

Does this need to be an independent function? Can this be part of the metric class?

Author

All three classes (DTWDistance, NormalizedDTW, SuccessWeightedDTW) use it. Making it a method of one class would require the other two to call DTWDistance._compute_dtw(...) without otherwise depending on that class, and copying it into all three would violate DRY. I can either make a base class with this as a static method or leave it as is in the module.
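
The base-class option could look roughly like the sketch below. It is deliberately stripped down: `_DTWBase`, the `score` methods, and the `threshold` argument are hypothetical names for illustration, the classes here do not subclass torchmetrics.Metric, and the DTW body is a plain-Python stand-in for the tensor version in this PR.

```python
import math


class _DTWBase:
    """Hypothetical shared base holding the DTW routine as a static method."""

    @staticmethod
    def _compute_dtw(predicted, reference):
        # Rolling-row dynamic program; prev[j] is the cost of aligning the
        # predicted points seen so far against reference[:j].
        prev = [0.0] + [math.inf] * len(reference)
        for p in predicted:
            curr = [math.inf] * (len(reference) + 1)
            for j, r in enumerate(reference, start=1):
                curr[j] = math.dist(p, r) + min(prev[j], curr[j - 1], prev[j - 1])
            prev = curr
        return prev[-1]


class DTWDistance(_DTWBase):
    def score(self, predicted, reference):
        return self._compute_dtw(predicted, reference)


class NormalizedDTW(_DTWBase):
    def __init__(self, threshold=1.0):
        self.threshold = threshold  # assumed meaning of d in exp(-DTW/(|R|*d))

    def score(self, predicted, reference):
        dtw = self._compute_dtw(predicted, reference)
        return math.exp(-dtw / (len(reference) * self.threshold))


class SuccessWeightedDTW(NormalizedDTW):
    def score(self, predicted, reference, success):
        # SDTW = nDTW * Success, with success treated as 0 or 1
        return super().score(predicted, reference) * float(success)


path = [(0.0, 0.0), (1.0, 0.0)]
print(DTWDistance().score(path, path))                # 0.0
print(NormalizedDTW().score(path, path))              # 1.0
print(SuccessWeightedDTW().score(path, path, False))  # 0.0
```

The trade-off is an extra class in the hierarchy versus a private module-level helper; behavior is the same either way, so this is mostly a question of which layout the repository prefers.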

Comment thread src/robometric_frame/trajectory_quality/dtw.py Outdated
@nalinraut force-pushed the feature/dtw-metrics branch from cc8933e to 88aa716 (June 20, 2026 02:22)
@codecov-commenter

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

3 participants