A curated reading list tracing the development trajectory of learning-based control — from classical adaptive control through Gaussian-process and neural-network-based approaches, safe learning, Neural ODEs, and representation learning for dynamics.
Scope. This list focuses exclusively on learning-based control. For classical nonlinear / optimal / MPC foundations, see A-make/awesome-control-theory and alti3/Awesome-Control-Theory.
Curation principle. Quality and intellectual lineage over novelty. Each entry is either a foundational work, a turning point that reframed the problem, or a canonical reference for a topic. Entries are ordered chronologically so the trajectory is visible at a glance.
Legend. 📄 paper · 📘 book / monograph · 🎓 course / lecture notes · 💻 code · ⭐ personal must-read
Existing awesome-repos cover control theory learning resources beautifully — A-make/awesome-control-theory points newcomers to free textbooks, lectures, and simulators, while chauncygu/Safe-Reinforcement-Learning-Baselines and bitzhangcy/Safe-Deep-Reinforcement-Learning collect safe-RL papers comprehensively.
What is still missing is a repo that traces the development trajectory of learning-based control itself — the line running from MRAC and adaptive backstepping through GP-MPC, Neural ODEs, CBF-based safe learning, and Koopman-style representation learning — organized in the paper + code + project link format that has become standard in computer-vision awesome-lists (e.g. the 3DGS / diffusion paper trackers). When the author wanted to understand how learning-based control got here, the lineage had to be reconstructed from scratch; this repo tries to save the next reader that work.
We welcome contributions. If you know a paper that belongs on this trajectory, or an open-source implementation worth linking, please open a PR or an issue. Follow the inclusion criteria — foundational works and turning points only, not incremental variants. Missing or broken links, incorrect DOIs, and wrong citations are especially welcome as issues.
This list was organized with the assistance of Anthropic's Claude for literature search, citation checking, and structural drafting. All entries were reviewed by the author, and every included link has been manually verified; entries where a canonical link could not be confirmed are explicitly marked TBD rather than left with a guessed URL. Any remaining errors — misattributed authors, incorrect years, broken links, or questionable curation choices — are the author's responsibility, not the tool's.
If you use AI-assisted tooling to prepare your own PRs, we ask that you follow the same policy: verify every link before submitting, and mark unverified entries as TBD.
- 1. Adaptive Control — Classical Foundations
- 2. Learning-Based Control with Gaussian Processes
- 3. Safe Learning & Control Barrier Functions
- 4. Learning-Based MPC
- 5. Reinforcement Learning Meets Control
- 6. Neural ODEs & Continuous-Depth Models
- 7. Representation Learning for Dynamics
- 8. Physics-Informed & Structure-Preserving Learning
- 9. Emerging Paradigms (2023–2026)
- 10. Open Problems
- 11. Surveys & Roadmap Papers
- 12. Software & Benchmarks
- 13. Courses & Lecture Notes
- 14. How to Contribute
Note on links. Every link in this list has been verified at the time of the entry's addition. Entries without a verified URL are marked TBD and are open contributions; if you have a canonical link, please open an issue or PR.
Note on recency. A development trajectory includes where the field is going, not only where it came from. Section 9 tracks paradigm shifts from 2023 onward (diffusion policies, VLA models, neural certificates, behavior foundation models); Section 10 collects the open problems these shifts expose. Both sections will be updated aggressively; please submit PRs for missed paradigm-shift papers.
The pre-learning era. Parameter uncertainty, certainty equivalence, and Lyapunov-based adaptation. Included because the conceptual move from unknown parameter to unknown function is the origin story of learning-based control.
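For readers new to the area, the idea of Lyapunov-based adaptation can be sketched on a scalar plant. This is a minimal illustration, not taken from any of the works listed here: the plant, the reference model, the gain `gamma`, the sinusoidal reference, and the function name `simulate_mrac` are all assumptions chosen for the example, and the continuous-time adaptation law is integrated with plain forward Euler.

```python
import math

def simulate_mrac(a=1.0, b=2.0, am=2.0, bm=2.0, gamma=5.0, dt=1e-3, T=60.0):
    """Scalar model-reference adaptive control with a Lyapunov-based update law.

    Plant:      x_dot  = a*x + b*u      (a is unknown to the controller, sign(b) is known)
    Reference:  xm_dot = -am*xm + bm*r
    Control:    u = kx*x + kr*r
    Adaptation: kx_dot = -gamma*e*x*sign(b), kr_dot = -gamma*e*r*sign(b), e = x - xm
    Returns the largest |e| over the last 10% of the run and the final gains.
    """
    x = xm = kx = kr = 0.0
    sgn_b = math.copysign(1.0, b)
    steps = round(T / dt)
    tail_start = int(0.9 * steps)
    tail_err = 0.0
    for k in range(steps):
        r = math.sin(k * dt)           # assumed reference signal
        e = x - xm                     # tracking error
        u = kx * x + kr * r            # certainty-equivalence control law
        dx = a * x + b * u             # true plant (open-loop unstable for a > 0)
        dxm = -am * xm + bm * r        # stable reference model
        dkx = -gamma * e * x * sgn_b   # update law that cancels the cross term in V_dot
        dkr = -gamma * e * r * sgn_b
        if k >= tail_start:
            tail_err = max(tail_err, abs(e))
        # Forward-Euler integration of all states.
        x += dt * dx
        xm += dt * dxm
        kx += dt * dkx
        kr += dt * dkr
    return tail_err, kx, kr

tail_err, kx, kr = simulate_mrac()
print(f"late tracking error: {tail_err:.4f}, kx = {kx:.3f}, kr = {kr:.3f}")
```

With V = e²/2 + |b|/(2·gamma)·(k̃x² + k̃r²), this update law gives V̇ = -am·e², which is the argument behind the tracking error going to zero even though the plant parameter `a` is never identified explicitly.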
Design of Model-Reference Adaptive Control Systems for Aircraft (Whitaker, Yamron & Kezer, 1958)
- Paper: TBD (MIT Instrumentation Lab Report R-164)
- Note: the MIT rule; origin of MRAC.
Lyapunov Redesign of Model Reference Adaptive Control Systems (Parks, 1966) ⭐
- Paper: TBD
- Note: replaced the unstable MIT rule with Lyapunov-based adaptation.
On Self-Tuning Regulators (Åström & Wittenmark, 1973)
- Paper: TBD
- Note: STR; the other main branch alongside MRAC.
Stable Adaptive Systems (Narendra & Annaswamy, 1989)
- Book: Prentice Hall
- Note: rigorous MRAC stability theory.
Adaptive Control (2nd ed.) (Åström & Wittenmark, 1995) ⭐
- Book: Addison-Wesley
- Note: the standard graduate reference.
Robust Adaptive Control (Ioannou & Sun, 1996)
- Book: Prentice Hall (PDF: https://flyingv.ucsd.edu/krstic/teaching/282/ioannousun.pdf)
- Note: certainty equivalence + robustness modifications (σ-mod, dead zone, projection).
Nonlinear and Adaptive Control Design (Krstić, Kanellakopoulos & Kokotović, 1995) ⭐
- Book: Wiley
- Note: the adaptive-backstepping book; merges Lyapunov design with adaptation.
L1 Adaptive Control Theory (Hovakimyan & Cao, 2010)
- Book: SIAM
- Note: decouples estimation and control bandwidth.
The key shift: the unknown is no longer a parameter vector but an unknown function. GPs provide calibrated uncertainty, which closes the loop to safety and optimal exploration.
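As a rough illustration of what "calibrated uncertainty over an unknown function" means in practice, the sketch below computes a Gaussian-process posterior mean and variance in plain Python. It assumes a zero-mean prior, a unit-variance squared-exponential kernel with a hypothetical `length` parameter, and a small Gaussian elimination helper in place of a numerical library; none of this is taken from the listed papers.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel with unit prior variance (assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_star, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single test input."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star) for a in xs]
    alpha = solve(K, ys)                      # K^{-1} y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                      # K^{-1} k_*
    var = rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)

xs = [-1.0, 0.0, 1.0]
ys = [-0.84, 0.0, 0.84]                       # made-up samples of an unknown function
for query in (1.0, 10.0):
    m, s2 = gp_posterior(xs, ys, query)
    print(f"x = {query}: mean = {m:.3f}, std = {math.sqrt(s2):.3f}")
```

Near the data the variance collapses, while far from the data it returns to the prior variance. That growing variance is the quantity that safe-exploration and cautious-MPC schemes typically use to tighten constraints, for example by requiring mean plus a multiple of the standard deviation to stay inside the constraint set.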
Gaussian Process Model Based Predictive Control (Kocijan, Murray-Smith, Rasmussen & Girard, 2004)
- Paper: TBD
- Note: the first mainstream GP-MPC formulation.
Bayesian Nonparametric Adaptive Control Using Gaussian Processes (Chowdhary, Kingravi, How & Vela, 2014)
- Paper: TBD
- Note: reinstates adaptive control in a Bayesian-nonparametric frame.
Reachability-Based Safe Learning with Gaussian Processes (Akametalu, Fisac, Gillula, Kaynama, Zeilinger & Tomlin, 2014)
- Paper: TBD
- Note: first to combine HJ reachability with GP uncertainty.
Safe Controller Optimization for Quadrotors with Gaussian Processes (Berkenkamp, Schoellig & Krause, 2016)
- Paper: TBD
- Note: safe Bayesian optimization demonstrated on hardware.
Safe Model-Based Reinforcement Learning with Stability Guarantees (Berkenkamp, Turchetta, Schoellig & Krause, 2017)
- Paper: https://arxiv.org/abs/1705.08551
- Code: https://github.com/befelix/safe_learning
- Note: Lyapunov stability verification combined with GP dynamics.
Cautious Model Predictive Control Using Gaussian Process Regression (Hewing, Kabzan & Zeilinger, 2020) ⭐
- Paper: https://arxiv.org/abs/1705.10702
- Note: chance-constrained GP-MPC; the reference formulation.
Forward-invariance-based safety filters. CBF-QP became the dominant way to wrap any learned / RL policy in a safety certificate.
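A minimal sketch of the safety-filter idea, under simplifying assumptions that are not from the listed papers: a 2D single integrator (x_dot = u), one circular obstacle, the barrier h(x) = ||x - c||² - r², and a proportional go-to-goal law standing in for the learned policy. With a single affine constraint the QP has a closed-form solution (a projection onto a half-space), so no solver is needed here. All names and numeric values are illustrative.

```python
def cbf_filter(x, u_nom, center, radius, alpha):
    """Closed-form solution of min ||u - u_nom||^2  s.t.  grad_h . u >= -alpha * h."""
    dx = (x[0] - center[0], x[1] - center[1])
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2   # h >= 0 is the safe set
    a = (2.0 * dx[0], 2.0 * dx[1])              # gradient of h, so h_dot = a . u
    b = -alpha * h                              # CBF condition: a . u >= b
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        return (u_nom[0], u_nom[1])             # nominal input already satisfies the condition
    scale = -slack / (a[0] ** 2 + a[1] ** 2)    # minimal correction along the gradient
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])

def simulate(use_filter, start=(-2.0, 0.1), goal=(2.0, 0.0),
             center=(0.0, 0.0), radius=0.5, alpha=1.0, dt=0.01, steps=2000):
    """Roll out the single integrator; return the smallest h seen and the final state."""
    x = start
    min_h = float("inf")
    for _ in range(steps):
        h = (x[0] - center[0]) ** 2 + (x[1] - center[1]) ** 2 - radius ** 2
        min_h = min(min_h, h)
        u = (goal[0] - x[0], goal[1] - x[1])    # stand-in for an arbitrary nominal policy
        if use_filter:
            u = cbf_filter(x, u, center, radius, alpha)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
    return min_h, x

print("min h without filter:", simulate(False)[0])
print("min h with filter:   ", simulate(True)[0])
```

The nominal policy drives straight through the obstacle (negative minimum h), while the filtered rollout keeps h non-negative. For general dynamics and multiple constraints the same condition is usually posed as a QP and handed to a solver at each control step.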
Control Barrier Function Based Quadratic Programs for Safety Critical Systems (Ames, Xu, Grizzle & Tabuada, 2017) ⭐
- Paper: TBD
- Note: defines modern CBFs; the QP safety filter.
End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks (Cheng, Orosz, Murray & Burdick, 2019)
Control Barrier Functions: Theory and Applications (Ames, Coogan, Egerstedt, Notomista, Sreenath & Tabuada, 2019)
- Paper: https://arxiv.org/abs/1903.11199
- Note: the definitive CBF survey.
Learning for Safety-Critical Control with Control Barrier Functions (Taylor, Singletary, Yue & Ames, 2020)
- Paper: https://arxiv.org/abs/1912.10099
- Note: closes the loop — learn residual dynamics inside the CBF framework.
The branch where model-based control absorbs data-driven model learning. Distinct from GP-MPC (which is a special case) because it also covers learned dynamics from neural networks and iterative learning.
(This section is intentionally short — substantial contributions overlap with Section 2 (GP-MPC) and Section 3 (CBF-based safety). PRs with canonical additions welcome.)
Learning-Based Model Predictive Control for Autonomous Racing (Kabzan, Hewing, Liniger & Zeilinger, 2019)
- Paper: TBD
- Note: GP-augmented MPC deployed on a real race car.
The bridge from adaptive/optimal control to RL. Only entries on the control-theoretic side of the bridge are included here — for a comprehensive safe-RL list, see the repos linked in the preface.
Reinforcement Learning Is Direct Adaptive Optimal Control (Sutton, Barto & Williams, 1992) ⭐
- Paper: TBD
- Note: explicit conceptual bridge between RL and adaptive control.
Deep learning as a continuous dynamical system. The mathematical object closest to classical control in the modern deep-learning toolbox.
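The correspondence can be made concrete in a few lines: a residual update x ← x + h·f(x) is one forward-Euler step of x_dot = f(x), so a stack of residual blocks with shared weights approximates the flow of the ODE over a fixed horizon. The sketch below assumes a scalar state and a linear vector field f(x) = w·x with a hypothetical weight `w`, chosen only because the exact solution exp(w·T)·x0 is available for comparison.

```python
import math

def vector_field(x, w=-1.0):
    """Stand-in for a learned layer; linear so the exact flow is known."""
    return w * x

def resnet_forward(x0, depth, horizon=1.0, w=-1.0):
    """Stack of `depth` residual blocks, each one forward-Euler step of size horizon/depth."""
    h = horizon / depth
    x = x0
    for _ in range(depth):
        x = x + h * vector_field(x, w)   # residual connection == Euler update
    return x

exact = math.exp(-1.0)                    # solution of x_dot = -x at t = 1 from x0 = 1
for depth in (10, 100, 1000):
    approx = resnet_forward(1.0, depth)
    print(f"depth {depth:5d}: output {approx:.5f}, gap to ODE flow {abs(approx - exact):.5f}")
```

The gap shrinks roughly in proportion to 1/depth, which is the first-order convergence expected of forward Euler. Neural ODE libraries replace the fixed-step loop with an adaptive solver and compute gradients through it.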
A Proposal on Machine Learning via Dynamical Systems (Weinan E, 2017) ⭐
- Paper: TBD
- Note: reframes deep learning as a control / dynamical-systems problem.
Stable Architectures for Deep Neural Networks (Haber & Ruthotto, 2017)
- Paper: https://arxiv.org/abs/1705.03341
- Note: stability-motivated ResNet-like architectures.
Neural Ordinary Differential Equations (Chen, Rubanova, Bettencourt & Duvenaud, 2018) ⭐
- Paper: https://arxiv.org/abs/1806.07366
- Code: https://github.com/rtqichen/torchdiffeq
- Note: NeurIPS 2018 best paper; defines the continuous-depth paradigm.
Augmented Neural ODEs (Dupont, Doucet & Teh, 2019)
- Paper: https://arxiv.org/abs/1904.01681
- Code: https://github.com/EmilienDupont/augmented-neural-odes
- Note: topological obstructions of vanilla Neural ODEs and the fix.
How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization (Finlay, Jacobsen, Nurbekyan & Oberman, 2020)
- Paper: https://arxiv.org/abs/2002.02798
- Note: OT-style regularization for stable training.
Neural Controlled Differential Equations for Irregular Time Series (Kidger, Morrill, Foster & Lyons, 2020)
- Paper: https://arxiv.org/abs/2005.08926
- Note: controlled-path perspective on sequence modeling.
Large-Time Asymptotics in Deep Learning (Esteve, Geshkovski, Pighin & Zuazua, 2020)
- Paper: https://arxiv.org/abs/2008.02491
- Note: turnpike-theoretic view of residual networks.
Scalable Gradients for Stochastic Differential Equations (Li, Wong, Chen & Duvenaud, 2020)
- Paper: https://arxiv.org/abs/2001.01328
- Code: https://github.com/google-research/torchsde
- Note: adjoint method for SDEs.
Neural ODE Control for Classification, Approximation and Transport (Ruiz-Balet & Zuazua, 2021 / SIAM Review 2023)
- Paper: https://arxiv.org/abs/2104.05278
- Note: control-theoretic analysis — controllability and simultaneous control of NODEs as the basis of classification and universal approximation.
System identification as representation learning. Koopman, SINDy, latent world models.
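The core Koopman idea, that a nonlinear system can become linear in a well-chosen set of observables, can be checked by hand on a toy discrete-time system. The example below is an assumption made for illustration (a discrete-time variant of a standard textbook example, with made-up coefficients): the map x1 ← a·x1, x2 ← b·x2 + c·x1² is nonlinear in (x1, x2) but exactly linear in the lifted coordinates (x1, x2, x1²). Methods such as EDMD or DeepKoopman estimate the lift or the matrix from data; here both are written down directly.

```python
def nonlinear_step(x, a=0.9, b=0.5, c=1.0):
    """One step of the original nonlinear system."""
    x1, x2 = x
    return (a * x1, b * x2 + c * x1 * x1)

def lift(x):
    """Observables that close under the dynamics: (x1, x2, x1^2)."""
    return [x[0], x[1], x[0] * x[0]]

def koopman_matrix(a=0.9, b=0.5, c=1.0):
    """Finite-dimensional Koopman matrix acting on the lifted state."""
    return [[a, 0.0, 0.0],
            [0.0, b, c],
            [0.0, 0.0, a * a]]

def linear_step(K, z):
    """Matrix-vector product K z."""
    return [sum(K[i][j] * z[j] for j in range(len(z))) for i in range(len(K))]

def max_gap(x0, steps=20):
    """Largest mismatch between the nonlinear rollout and the lifted linear rollout."""
    K = koopman_matrix()
    x, z, gap = x0, lift(x0), 0.0
    for _ in range(steps):
        x = nonlinear_step(x)
        z = linear_step(K, z)
        gap = max(gap, abs(x[0] - z[0]), abs(x[1] - z[1]))
    return gap

print("max mismatch over 20 steps:", max_gap((1.0, -0.5)))
```

Because the lifted model is linear, linear tools such as LQR or linear MPC can be applied to it, which is the motivation behind the Koopman-MPC entries in this section. For most systems no exact finite lift exists and the linear model is only an approximation.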
Hamiltonian Systems and Transformation in Hilbert Space (Koopman, 1931) ⭐
- Paper: TBD
- Note: the original infinite-dimensional linear lift.
Spectral Properties of Dynamical Systems, Model Reduction and Decompositions (Mezić, 2005) ⭐
- Paper: TBD
- Note: revived Koopman for applied dynamics.
A Data-Driven Approximation of the Koopman Operator: Extended Dynamic Mode Decomposition (Williams, Kevrekidis & Rowley, 2015)
- Paper: https://arxiv.org/abs/1408.4408
- Note: EDMD; the workhorse algorithm.
Dynamic Mode Decomposition with Control (Proctor, Brunton & Kutz, 2016)
Linear Predictors for Nonlinear Dynamical Systems: Koopman Operator Meets MPC (Korda & Mezić, 2018)
Deep Learning for Universal Linear Embeddings of Nonlinear Dynamics (Lusch, Kutz & Brunton, 2018)
- Paper: https://arxiv.org/abs/1712.09707
- Code: https://github.com/BethanyL/DeepKoopman
- Note: learn the Koopman lift end-to-end.
Modern Koopman Theory for Dynamical Systems (Brunton, Budišić, Kaiser & Kutz, 2022) ⭐
- Paper: https://arxiv.org/abs/2102.12086
- Note: the current authoritative survey.
Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems (Brunton, Proctor & Kutz, 2016) ⭐
- Paper: TBD
- Code: https://github.com/dynamicslab/pysindy
Data-Driven Discovery of Partial Differential Equations (Rudy, Brunton, Proctor & Kutz, 2017)
Data-Driven Discovery of Coordinates and Governing Equations (Champion, Lusch, Kutz & Brunton, 2019)
Embed to Control: a Locally Linear Latent Dynamics Model for Control from Raw Images (Watter, Springenberg, Boedecker & Riedmiller, 2015)
World Models (Ha & Schmidhuber, 2018)
- Paper: https://arxiv.org/abs/1803.10122
- Project: https://worldmodels.github.io/
Dream to Control: Learning Behaviors by Latent Imagination (Hafner, Lillicrap, Ba & Norouzi, 2020)
Building inductive bias — energy conservation, symplectic structure, Lagrangian / Hamiltonian form — directly into the network architecture.
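Why structure matters can be seen without any learning at all. In the sketch below a scalar Hamiltonian H(q, p) defines the dynamics through q_dot = ∂H/∂p and p_dot = -∂H/∂q, which is the parameterization a Hamiltonian network would learn. Everything specific here is an assumption for illustration: H is the known harmonic oscillator rather than a trained network, gradients come from central finite differences as a stand-in for automatic differentiation, and the symplectic Euler step as written is explicit only for separable Hamiltonians.

```python
def hamiltonian(q, p):
    """Harmonic oscillator energy; a learned model would replace this function."""
    return 0.5 * (q * q + p * p)

def grad_H(H, q, p, eps=1e-5):
    """Central finite differences for (dH/dq, dH/dp)."""
    dHdq = (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)
    dHdp = (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)
    return dHdq, dHdp

def explicit_euler_step(H, q, p, dt):
    """Structure-agnostic update: both gradients evaluated at the old state."""
    dHdq, dHdp = grad_H(H, q, p)
    return q + dt * dHdp, p - dt * dHdq

def symplectic_euler_step(H, q, p, dt):
    """Update p first, then q with the new p (explicit for separable H)."""
    dHdq, _ = grad_H(H, q, p)
    p_new = p - dt * dHdq
    _, dHdp = grad_H(H, q, p_new)
    return q + dt * dHdp, p_new

def final_energy(step, steps=10000, dt=0.01):
    """Energy after integrating from (q, p) = (1, 0), where H = 0.5."""
    q, p = 1.0, 0.0
    for _ in range(steps):
        q, p = step(hamiltonian, q, p, dt)
    return hamiltonian(q, p)

print("explicit Euler energy:  ", final_energy(explicit_euler_step))
print("symplectic Euler energy:", final_energy(symplectic_euler_step))
```

The explicit scheme lets the energy grow steadily from 0.5, while the symplectic scheme keeps it within a small band around 0.5 over the same horizon. Structure-preserving models aim for the same behavior by construction rather than through extra loss terms.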
Physics-Informed Neural Networks (Raissi, Perdikaris & Karniadakis, 2019) ⭐
- Paper: TBD
- Note: the PINN paper.
Hamiltonian Neural Networks (Greydanus, Dzamba & Yosinski, 2019)
Lagrangian Neural Networks (Cranmer, Greydanus, Hoyer, Battaglia, Spergel & Ho, 2020)
The trajectory extends into the present. This section tracks paradigm-shift papers from the past three years — works that introduce a new way of formulating the control problem, not just better performance on an existing one. Entries here are chronological; some are already becoming canonical, others may not survive the decade.
A shift from deterministic / Gaussian policies to generative policies that sample from a learned action-distribution score function. The key observation: multimodal demonstrations and high-dimensional action spaces had been poorly served by unimodal policy classes.
Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (Chi, Xu, Feng, Cousineau, Du, Burchfiel, Tedrake & Song, 2023 / IJRR 2024) ⭐
- Paper: https://arxiv.org/abs/2303.04137
- Project: https://diffusion-policy.cs.columbia.edu/
- Code: https://github.com/real-stanford/diffusion_policy
- Note: the paradigm-defining paper; formulates visuomotor policies as conditional DDPMs with receding-horizon control.
A shift from task-specific policies to pretrained generalist controllers that share weights across robots, tasks, and embodiments. Open problem: how to reconcile foundation-model scale with control-theoretic safety and real-time constraints.
OpenVLA: An Open-Source Vision-Language-Action Model (Kim, Pertsch, Karamcheti, Xiao, Balakrishna, Nair, Rafailov, Foster, Lam, Sanketi, Vuong, Kollar, Burchfiel, Tedrake, Sadigh, Levine, Liang & Finn, 2024) ⭐
- Paper: https://arxiv.org/abs/2406.09246
- Code: https://github.com/openvla/openvla
- Project: https://openvla.github.io/
- Note: the first fully open-source 7B VLA; anchor point for community research.
π0: A Vision-Language-Action Flow Model for General Robot Control (Black, Brown, Driess, Esmail, Equi, Finn, Fusai, Groom, Hausman, Ichter et al., 2024)
- Paper: https://arxiv.org/abs/2410.24164
- Note: flow-matching action head on a VLM backbone; representative of the industrial VLA line.
The emerging counter-paradigm: take the large policy as given, then add a control-theoretic layer that certifies safety or stability. Not a replacement for neural-CBF / certificate work below, but a distinct line focused specifically on foundation-model-scale policies (diffusion, VLA).
AEGIS / VLSA: Vision-Language-Action Models with Plug-and-Play Safety Constraint Layer (Liu et al., 2025)
- Paper: https://arxiv.org/abs/2512.11891
- Note: CBF-QP safety filter wrapping a VLA; introduces the SafeLIBERO benchmark. The filter is the certified object; the VLA itself remains uncertified.
Safe and Stable Control via Lyapunov-Guided Diffusion Models (S²Diff) (Cheng, Yang & colleagues, NeurIPS 2025) ⭐
- Paper: https://arxiv.org/abs/2509.25375
- Note: first direct stability result for diffusion-sampled policies via Almost-Lyapunov theory; trades pointwise Lie-derivative descent for a small-violation-measure condition. A genuine weakening of the classical stability notion, which is what makes the result achievable.
PACS: From Demonstrations to Safe Deployment — Path-Consistent Safety Filtering for Diffusion Policies (Römer et al., 2025 / v2 2026-03)
- Paper: https://arxiv.org/abs/2511.06385
- Note: set-based reachability analysis as the safety layer around diffusion policies. Reported to outperform CBF-based filters by up to 68 % in task success on human-robot interaction tasks — evidence that reachability-based filtering respects the training distribution better than reactive projection.
A shift from hand-designed certificates to learned neural CBFs / Lyapunov functions with formal verification. The CBF community's response to the scaling limits of analytic design.
How to Train Your Neural Control Barrier Function: Learning Safety Filters for Complex Input-Constrained Systems (So, Serlin, Mann, Gonzales, Rutledge, Roy & Fan, 2023 / ICRA 2024)
- Paper: https://arxiv.org/abs/2310.15478
- Note: policy neural CBFs (PNCBFs); value function of a nominal policy as a CBF.
Learning Neural Network Barrier Functions with Termination Guarantees (Chen, Molu & Fazlyab, 2024)
- Paper: https://arxiv.org/abs/2403.07308
- Note: CEGIS-based fine-tuning with formal termination guarantees.
Learning Conservative Neural Control Barrier Functions from Offline Data (Tabbara et al., 2025)
- Paper: https://arxiv.org/abs/2505.00908
- Note: CQL-inspired offline CBF training; OOD states become safety costs.
Scalable Verification of Neural Control Barrier Functions Using Linear Bound Propagation (Vertovec et al., 2025)
- Paper: https://arxiv.org/abs/2511.06341
- Note: LBP + McCormick relaxation to scale NCBF verification beyond small networks.
A shift from single-task RL policies to pretrained primitive skill libraries that enable zero-shot or few-shot task adaptation — the behavioral analog of VLA for low-level control.
A Survey of Behavior Foundation Model: Next-Generation Whole-Body Control System of Humanoid Robots (Yuan, Yu, Ge, Yao, Wang, Chen, Li, Zhang, Zeng, Chen & Jin, 2025)
- Paper: https://arxiv.org/abs/2506.20487
- Note: the first comprehensive survey of BFMs for humanoid WBC; defines the category.
A curated set of open research questions visible in the current trajectory. These are drawn from recent surveys and position papers, cross-referenced against the paradigm shifts in Section 9. PRs adding new open problems (with a reference paper that articulates the problem clearly) are welcome.
Inclusion policy for the "control-theoretic perspective" entries below. We list only works that analyze, optimize, or provide guarantees for learning-based systems using tools from systems and control theory — Lyapunov / ISS stability, forward invariance / CBFs, Hamilton-Jacobi reachability, passivity / energy tanks, robust and distributionally robust control, conformal prediction coupled to a feedback loop. ML papers that merely add a regularization term to a loss do not qualify and are excluded. The classical-Lyapunov-for-a-7B-VLA question is, as of 2026-04, still open; what the literature offers instead are partial results — certificates for safety filters wrapping the FM, relaxed stability notions (almost-Lyapunov, UUB), or verification for small neural certificates. We label each entry accordingly.
10.1 Safety guarantees for foundation-model-driven control. VLA models and diffusion policies inherit the control loop but not the safety guarantees of the classical control stack. Classical Lyapunov / ISS arguments require closed-form dynamics and a closed-form policy.
- Problem statement: Foundation Models in Robotics: Applications, Challenges, and the Future (Firoozi et al., 2023 / IJRR 2025) — https://arxiv.org/abs/2312.07843
- Recent work (control-theoretic perspective):
- AEGIS / VLSA (Liu et al., 2025) — plug-and-play CBF-QP safety filter wrapping a VLA; certifies the filter, not the VLA — https://arxiv.org/abs/2512.11891
- S²Diff (Cheng, Yang & colleagues, NeurIPS 2025) — first direct stability proof for diffusion-sampled policies, via Almost-Lyapunov theory; weakens the stability notion from pointwise Lie-derivative descent to small-violation-measure — https://arxiv.org/abs/2509.25375
- PACS — Path-Consistent Safety Filtering for Diffusion Policies (Römer et al., v2 2026-03) — set-based reachability analysis as the safety layer; outperforms reactive CBF safety filters on human-robot interaction tasks — https://arxiv.org/abs/2511.06385
- Manifold-Guided Lyapunov Control with Diffusion (Mukherjee & colleagues, 2024) — uses diffusion to generate Lyapunov functions as a way to scale certificate synthesis — https://arxiv.org/abs/2403.17692
10.2 Physical risk in open-world deployment. Classical safe-learning assumed closed industrial environments; FM-enabled robots operate alongside humans where physical interaction is unavoidable. The question is not just avoid collision but robust constrained control under nonlinear dynamics with humans in the loop.
- Problem statement: A Comprehensive Survey on Physical Risk Control in the Era of Foundation Model-enabled Robotics (Kojima et al., 2025) — https://arxiv.org/abs/2505.12583
- Recent work (control-theoretic perspective):
- Safe Physics-informed Machine Learning for Dynamics and Control (Drgoňa et al., 2025, tutorial) — unifies Lyapunov, CBF, reachability-analysis, and safety-filter perspectives for physics-constrained learning — https://arxiv.org/abs/2504.12952
- On Safety and Liveness Filtering Using Hamilton–Jacobi Reachability Analysis (2024) — HJ-reachability for joint safety + liveness filter synthesis; applicable as a wrapper around learned policies — https://arxiv.org/abs/2312.15347
10.3 Uncertainty quantification under distribution shift. GP-based methods provide calibrated uncertainty but don't scale; deep ensembles scale but aren't calibrated under closed-loop shift (the policy's own actions move the distribution). The control-theoretic response: conformal prediction coupled to a predictive controller, giving distribution-free probabilistic safety bounds.
- Recent work (control-theoretic perspective):
- Safe Planning in Dynamic Environments Using Conformal Prediction (Lindemann, Cleaveland, Shim & Pappas, RA-L 2023) — MPC with conformal-prediction regions around learned trajectory predictors, with formal collision-probability bounds — see their publication page; arxiv preprint: https://arxiv.org/abs/2210.10254
- Formal Verification and Control with Conformal Prediction (Lindemann, Zhao, Yu, Pappas & Deshmukh, 2024) — extends conformal coupling to STL specifications for runtime verification — https://arxiv.org/abs/2409.00536
- Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model (Sun, Jiang, Qiu, Nobel, Kochenderfer & Schwager, NeurIPS 2023) — CP-calibrated safety bounds for diffusion-based world models used inside a planner.
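The conformal step shared by these approaches can be sketched in a few lines. This is a generic split-conformal computation, not the exact procedure of any paper listed above: given held-out prediction errors (nonconformity scores) that are exchangeable with the test-time error, the ⌈(n+1)(1-δ)⌉-th smallest score bounds the next error with probability at least 1-δ. The function names, the clearance parameter, and the sample numbers are hypothetical.

```python
import math

def conformal_radius(scores, delta):
    """Split-conformal quantile of calibration scores at miscoverage level delta."""
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - delta))
    if k > n:
        return math.inf          # too few calibration samples for this delta
    return sorted(scores)[k - 1]

def is_step_safe(robot_xy, predicted_obstacle_xy, radius, clearance):
    """Constraint a planner could impose: keep clear of the inflated prediction."""
    gap = math.hypot(robot_xy[0] - predicted_obstacle_xy[0],
                     robot_xy[1] - predicted_obstacle_xy[1])
    return gap >= radius + clearance

# Made-up calibration errors of a trajectory predictor.
calibration_errors = [3, 1, 4, 1, 5, 9, 2]
radius = conformal_radius(calibration_errors, delta=0.25)
print("prediction-region radius:", radius)
print("safe at distance 10:", is_step_safe((0.0, 0.0), (10.0, 0.0), radius, clearance=1.0))
```

The guarantee is distribution-free but relies on exchangeability between calibration and deployment data. Closed-loop distribution shift, where the policy's own actions change the data, is exactly what strains that assumption.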
10.4 Real-time inference with foundation-scale policies. A standard diffusion policy runs at ~1.5 Hz; a 7B VLA eats GPU memory. Classical control wants ≥50 Hz for manipulation and ≥100 Hz for flight / contact-rich. The control-theoretic framing: what is the minimal inference cost that still preserves closed-loop stability / tracking margins of the original policy?
- Recent work (control-theoretic perspective):
- One-Step Diffusion Policy (OneDP) (Wang et al., ICLR 2025) — KL-divergence distillation; 1.5 Hz → 62 Hz on a Franka. The closed-loop-margin preservation question is raised but not fully answered — https://arxiv.org/abs/2410.21257
- Consistency Policy (Prasad, Lin, Wu, Zhou & Bohg, RSS 2024) — self-consistency distillation of a pretrained diffusion policy to few-step inference.
- Open: a formal result bounding the closed-loop performance gap between a multi-step teacher and a one-step student, in terms of a control-theoretic metric (ISS gain, tracking error bound).
10.5 Bridging learned and analytic certificates. Neural CBFs / Lyapunov functions scale where analytic constructions fail, but verification is the bottleneck: the verifier's cost grows super-linearly with network width.
- Recent work (control-theoretic perspective):
- Scalable Verification of Neural CBFs Using Linear Bound Propagation (Vertovec et al., 2025) — LBP + McCormick relaxation; extends verifiable network size by about an order of magnitude — https://arxiv.org/abs/2511.06341
- Verification of Neural CBFs with Symbolic Derivative Bounds Propagation (Hu, Yang, Wei & Liu, CoRL 2024) — symbolic bounds on the derivative, not just the value — https://arxiv.org/abs/2410.16281
- Certifying Stability of RL Policies using Generalized Lyapunov Functions (Long, Cortés & Atanasov, NeurIPS 2025) — generalized Lyapunov (multi-step weighted descent) enlarges certifiable regions for RL policies, including swing-up regimes where classical pointwise descent fails.
- Latent Representations for Control Design with Provable Stability and Safety Guarantees (2025) — dynamics-aware approximate conjugacy conditions that transfer latent-space Lyapunov/barrier guarantees back to the original state space — https://arxiv.org/abs/2505.23210
10.6 Generalization across embodiments. Open X-Embodiment made cross-robot pretraining possible but not morphology-transfer. From a control standpoint the question is geometric: what invariances must the policy respect to be embodiment-agnostic?
- Status (2026-04): mostly an ML-engineering problem so far; control-theoretic treatment is thin. The cleanest candidate is the Koopman / latent-dynamics line (Section 7), which gives a representation-learning objective compatible with downstream stability analysis. We have no high-quality control-theoretic paper to list here yet — PRs welcome.
10.7 Data scarcity for contact-rich and dexterous manipulation. Internet-scale text and image data contain no force, torque, or contact signals. The control-theoretic response: combine imitation / RL with passivity-based low-level control, so the learned policy inherits energetic-stability guarantees from the controller structure.
- Recent work (control-theoretic perspective):
- Diffusion-Based Impedance Learning for Contact-Rich Manipulation (Geiger, Asfour, Hogan & Lachner, 2025) — diffusion policy generates impedance trajectories; passivity guaranteed via energy-tank construction when stiffness decreases, combinable with Hogan-style impedance shaping otherwise — https://arxiv.org/abs/2509.19696
- Learning Variable Impedance Skills from Demonstrations with Passivity Guarantee (Zhang et al., 2024) — Lyapunov-based stability condition for learned variable-stiffness profiles — https://arxiv.org/abs/2306.11308
- Unified Force-Impedance Control (Shahriari & Haddadin, IJRR 2024) — passivity-based framework for rigid and flexible-joint robots via energy tanks; a canonical low-level layer to wrap learned high-level policies.
More open problems are welcome — please submit a PR pointing to a paper that articulates the problem. For the control-theoretic perspective lists, contributions must satisfy the inclusion policy at the top of this section.
Start here if you want a structured overview before diving into primary sources.
Learning-Based Model Predictive Control: Toward Safe Learning in Control (Hewing, Wabersich, Menner & Zeilinger, 2020) ⭐
- Paper: TBD (Annual Review of Control, Robotics, and Autonomous Systems 3:269–296)
- Note: the reference survey for learning-based MPC.
A Historical Perspective of Adaptive Control and Learning (Annaswamy & Fradkov, 2021)
- Paper: https://arxiv.org/abs/2108.11336
- Note: bridges classical adaptive control and the learning era.
Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning (Brunke, Greeff, Hall, Yuan, Zhou, Panerati & Schoellig, 2022) ⭐
- Paper: https://arxiv.org/abs/2108.06266
- Code: https://github.com/utiasDSL/safe-control-gym
- Note: unifies the control and RL vocabulary for safe learning.
Foundation Models in Robotics: Applications, Challenges, and the Future (Firoozi, Tucker, Tian, Majumdar, Sun, Liu, Zhu, Song, Kapoor, Hausman, Ichter, Driess, Wu, Lu & Schwager, 2023 / IJRR 2025)
- Paper: https://arxiv.org/abs/2312.07843
- Code: https://github.com/robotics-survey/Awesome-Robotics-Foundation-Models
- Note: the canonical statement of the FM-in-robotics open-problem list (safety, UQ, real-time, data scarcity, embodiment).
A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions (Kushwaha & Biron, 2025; v3 2026-01)
- Paper: https://arxiv.org/abs/2508.09128
- Note: most recent systematic survey dedicated to Lyapunov- and barrier-function-based safe RL; explicitly takes the control-theoretic viewpoint.
- 💻 safe-control-gym — https://github.com/utiasDSL/safe-control-gym
- Unified benchmark for safe learning-based control and RL.
- 💻 GPyTorch — https://github.com/cornellius-gp/gpytorch
- 💻 torchdiffeq — https://github.com/rtqichen/torchdiffeq
- 💻 torchsde — https://github.com/google-research/torchsde
- 💻 Diffrax — https://github.com/patrick-kidger/diffrax
- 💻 PySINDy — https://github.com/dynamicslab/pysindy
- 💻 KoopmanMPC — https://github.com/MilanKorda/KoopmanMPC
- 💻 DeepKoopman — https://github.com/BethanyL/DeepKoopman
- 🎓 Underactuated Robotics (MIT 6.832) — Russ Tedrake. https://underactuated.mit.edu/ ⭐
- Lyapunov, trajectory optimization, policy search, with executable notebooks.
- 🎓 Data-Driven Science & Engineering (UW AMATH 563) — Steven Brunton. https://databookuw.com/
- Koopman, SINDy, DMD, PDEs from data.
Inclusion criteria. An entry earns its place only if it satisfies at least one of:
- Foundational — introduces a concept, formalism, or tool the field still builds on.
- Turning point — reframes the problem, unifies prior threads, or opens a new line of work.
- Canonical reference — the single place to point a student for a topic.
Exclusion criteria. Incremental benchmark improvements, minor variants, short-lived trends, and entries whose only merit is recency. Classical control that is not part of the learning-based trajectory belongs in A-make/awesome-control-theory, not here.
Entry format.
**Paper Title** (Authors, Year)
- Paper: https://... (or `TBD` if uncertain)
- Code: https://... (if available)
- Project: https://... (if available)
- Note: one-sentence reason it is on the trajectory (optional)
Link policy. We prefer verified direct links (arxiv abs page, official project page, publisher DOI) over aggregator pages. If you are not 100% sure a link points to the correct paper, please leave it as TBD rather than guessing — a missing link is always better than a wrong one.
Self-review checklist before adding:
- Does this entry belong to a visible development trajectory in its section?
- Could it be replaced by something already listed without losing information?
- Is the note about the idea, not about results or benchmarks?
- Have I verified that every URL I am providing actually points to this exact paper?
If you find a wrong link, incorrect year, or misattributed author, please open an issue — corrections are as valuable as additions.
Last updated: 2026-04-16