Analytical FK backward: 9x faster gradient computation #70
Merged
Replace autograd's graph-based backward through forward_kinematics_tensor with an analytical geometric Jacobian. The change is transparent: the analytical backward is used automatically when th.requires_grad=True.

- Extract _fk_impl as standalone FK function
- Add _FKAnalyticalBackward custom autograd Function
- Precompute DOF-to-frame mapping and ancestor masks at chain init
- Compatible with torch.compile and torch.vmap
- Add 3 tests validating gradient correctness against finite differences
- Bump version to 0.10.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allows callers to opt out of the analytical backward by passing analytical_grad=False, falling back to standard autograd. This restores support for higher-order gradients (create_graph=True) and gradients w.r.t. chain parameters when needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move FK computation out of autograd.Function.forward() so apply() only receives plain tensors. The old design passed bfs_levels (list[Tensor]) through apply(), which torch._dynamo (PyTorch 2.4+) cannot trace.

New design: compute FK on th.detach() first, then use the Function purely to attach the analytical backward, so all apply() args are plain tensors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
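The "compute on detached inputs, then attach the backward" pattern described in this commit can be illustrated on a toy problem. The sketch below is not this repository's code: it assumes a planar serial chain of revolute joints, and the names toy_fk, _toy_fk_impl, and _AttachBackward are hypothetical. Only the analytical_grad flag mirrors the option introduced in this PR.

```python
import torch


def _toy_fk_impl(q, lengths):
    """Plain planar FK. q: (B, n) joint angles, lengths: (n,) link lengths.

    Returns joint origins (B, n, 2) and the end-effector position (B, 2).
    """
    angles = torch.cumsum(q, dim=-1)  # absolute angle of each link
    direction = torch.stack([angles.cos(), angles.sin()], dim=-1)
    steps = lengths.unsqueeze(-1) * direction  # (B, n, 2) link vectors
    tips = torch.cumsum(steps, dim=-2)  # tip position of each link
    origins = tips - steps  # origin of each joint
    return origins, tips[..., -1, :]


class _AttachBackward(torch.autograd.Function):
    """Does no FK work in forward; it only attaches an analytical backward."""

    @staticmethod
    def forward(ctx, q, origins, tip):
        # Every argument is a plain tensor (no lists of tensors).
        ctx.save_for_backward(origins, tip)
        return tip.clone()

    @staticmethod
    def backward(ctx, grad_tip):
        origins, tip = ctx.saved_tensors
        r = tip.unsqueeze(-2) - origins  # (B, n, 2) lever arms
        # Planar revolute joint: d(tip)/d(q_j) = z x r_j = (-r_y, r_x).
        grad_q = -r[..., 1] * grad_tip[..., 0:1] + r[..., 0] * grad_tip[..., 1:2]
        return grad_q, None, None


def toy_fk(q, lengths, analytical_grad=True):
    if not (analytical_grad and q.requires_grad):
        # Fallback: let standard autograd record the graph.
        return _toy_fk_impl(q, lengths)[1]
    # Run FK outside the autograd graph, then attach the analytical backward.
    origins, tip = _toy_fk_impl(q.detach(), lengths.detach())
    return _AttachBackward.apply(q, origins, tip)
```

Because the Function only sees q and the precomputed origins and tip, nothing other than plain tensors crosses the apply() boundary in this sketch. Whether that is sufficient for tracing in a real chain depends on details this toy does not model.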
Summary
- Replace autograd's graph-based backward through forward_kinematics_tensor with an analytical geometric Jacobian, yielding ~9x faster gradient computation on GPU and ~2.7x on CPU
- forward_kinematics_tensor(th) automatically uses the analytical backward when th.requires_grad=True. No API changes for existing callers
- Extract _fk_impl as a standalone FK function and add _FKAnalyticalBackward custom autograd Function
- Compatible with torch.compile and torch.vmap
- analytical_grad=False escape hatch for callers that need higher-order gradients (create_graph=True) or gradients w.r.t. chain parameters

Benchmark (RTX 4070, H=10000 configs, 30-DOF robot)
How it works
The analytical backward computes d(loss)/d(joint_angles) directly from the geometric Jacobian rather than replaying the autograd computation graph. For each DOF j, it sums contributions from all descendant links l using a precomputed ancestor mask:

d(loss)/d(q_j) = z_j · Σ_l (τ_l + (t_l - o_j) × ∂L/∂t_l)   (revolute joints)
d(loss)/d(q_j) = z_j · Σ_l ∂L/∂t_l   (prismatic joints)

where τ_l captures the rotation gradient and the cross product term captures the translation gradient. In standard geometric-Jacobian notation, z_j is the axis of joint j, o_j its origin, and t_l the translation of link l.

Breaking changes
The analytical backward only computes gradients w.r.t. joint angles. Two features from the old autograd path are not supported by default:

- Higher-order gradients (create_graph=True)
- Gradients w.r.t. chain parameters

Both are restored by passing analytical_grad=False to forward_kinematics_tensor.

Test plan
- torch.compile and torch.vmap compatibility verified

🤖 Generated with Claude Code
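This PR validates the analytical backward against finite differences. The sketch below shows that kind of check on a toy problem; it is not the repository's test code. It assumes a small planar tree of revolute joints and a loss that depends only on link tip positions, so the rotation term τ_l is zero and the revolute formula reduces to the cross-product term. PARENT, LENGTH, and all function names are hypothetical.

```python
import math

# Hypothetical planar tree: joint 1 branches into joints 2 and 3.
PARENT = [-1, 0, 1, 1]
LENGTH = [1.0, 0.8, 0.5, 0.6]


def ancestor_mask(parent):
    """mask[j][l] is True when joint j is link l itself or one of its ancestors."""
    n = len(parent)
    mask = [[False] * n for _ in range(n)]
    for l in range(n):
        j = l
        while j != -1:
            mask[j][l] = True
            j = parent[j]
    return mask


def fk(q):
    """Return (origins, tips) as lists of (x, y). Parents must precede children."""
    origins, tips, angles = [], [], []
    for j, p in enumerate(PARENT):
        base = (0.0, 0.0) if p == -1 else tips[p]
        a = q[j] + (0.0 if p == -1 else angles[p])
        angles.append(a)
        origins.append(base)
        tips.append((base[0] + LENGTH[j] * math.cos(a),
                     base[1] + LENGTH[j] * math.sin(a)))
    return origins, tips


def loss(q, targets):
    """Sum of half squared distances between link tips and their targets."""
    _, tips = fk(q)
    return sum(0.5 * ((t[0] - g[0]) ** 2 + (t[1] - g[1]) ** 2)
               for t, g in zip(tips, targets))


def analytical_grad(q, targets, mask):
    """dL/dq_j = sum over descendants l of z . ((t_l - o_j) x dL/dt_l)."""
    origins, tips = fk(q)
    dL_dt = [(t[0] - g[0], t[1] - g[1]) for t, g in zip(tips, targets)]
    grad = []
    for j in range(len(q)):
        total = 0.0
        for l in range(len(q)):
            if mask[j][l]:
                rx = tips[l][0] - origins[j][0]
                ry = tips[l][1] - origins[j][1]
                total += rx * dL_dt[l][1] - ry * dL_dt[l][0]  # z-component of r x g
        grad.append(total)
    return grad


def finite_difference_grad(q, targets, eps=1e-6):
    """Central finite differences, one joint at a time."""
    grad = []
    for j in range(len(q)):
        qp, qm = list(q), list(q)
        qp[j] += eps
        qm[j] -= eps
        grad.append((loss(qp, targets) - loss(qm, targets)) / (2 * eps))
    return grad
```

The mask is built once from the tree topology and reused for every gradient evaluation, which is the role the precomputed ancestor masks play in the description above.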