Awesome Learning-Based Control

A curated reading list tracing the development trajectory of learning-based control — from classical adaptive control through Gaussian-process and neural-network-based approaches, safe learning, Neural ODEs, and representation learning for dynamics.

Scope. This list focuses exclusively on learning-based control. For classical nonlinear / optimal / MPC foundations, see A-make/awesome-control-theory and alti3/Awesome-Control-Theory.

Curation principle. Quality and intellectual lineage over novelty. Each entry is either a foundational work, a turning point that reframed the problem, or a canonical reference for a topic. Entries are ordered chronologically so the trajectory is visible at a glance.

Legend. 📄 paper · 📘 book / monograph · 🎓 course / lecture notes · 💻 code · ⭐ personal must-read

Why this list?

Existing awesome-repos cover control theory learning resources beautifully — A-make/awesome-control-theory points newcomers to free textbooks, lectures, and simulators, while chauncygu/Safe-Reinforcement-Learning-Baselines and bitzhangcy/Safe-Deep-Reinforcement-Learning collect safe-RL papers comprehensively.

What is still missing is a repo that traces the development trajectory of learning-based control itself — the line running from MRAC and adaptive backstepping through GP-MPC, Neural ODEs, CBF-based safe learning, and Koopman-style representation learning — organized in the paper + code + project link format that has become standard in computer-vision awesome-lists (e.g. the 3DGS / diffusion paper trackers). When the author wanted to understand how learning-based control got here, the lineage had to be reconstructed from scratch; this repo tries to save the next reader that work.

We welcome contributions. If you know a paper that belongs on this trajectory, or an open-source implementation worth linking, please open a PR or an issue. Follow the inclusion criteria — foundational works and turning points only, not incremental variants. Missing or broken links, incorrect DOIs, and wrong citations are especially welcome as issues.

Acknowledgment

This list was organized with the assistance of Anthropic's Claude for literature search, citation checking, and structural drafting. All entries were reviewed by the author, and every included link has been manually verified; entries where a canonical link could not be confirmed are explicitly marked TBD rather than left with a guessed URL. Any remaining errors — misattributed authors, incorrect years, broken links, or questionable curation choices — are the author's responsibility, not the tool's.

If you use AI-assisted tooling to prepare your own PRs, we ask that you follow the same policy: verify every link before submitting, and mark unverified entries as TBD.

Table of Contents

1. Adaptive Control — Classical Foundations
2. Learning-Based Control with Gaussian Processes
3. Safe Learning & Control Barrier Functions
4. Learning-Based MPC
5. Reinforcement Learning Meets Control
6. Neural ODEs & Continuous-Depth Models
7. Representation Learning for Dynamics
8. Physics-Informed & Structure-Preserving Learning
9. Emerging Paradigms (2023–2026)
10. Open Problems
11. Surveys & Roadmap Papers
12. Software & Benchmarks
13. Courses & Lecture Notes
14. How to Contribute

Note on links. Every link in this list has been verified at the time of the entry's addition. Entries without a verified URL are marked TBD and are open contributions — if you have a canonical link, please open an issue or PR.

Note on recency. A development trajectory includes where the field is going, not only where it came from. Section 9 tracks paradigm shifts from 2023 onward (diffusion policies, VLA models, neural certificates, behavior foundation models); Section 10 collects the open problems these shifts expose. Both sections will be updated aggressively — please submit PRs for missed paradigm-shift papers.

1. Adaptive Control — Classical Foundations

The pre-learning era. Parameter uncertainty, certainty equivalence, and Lyapunov-based adaptation. Included because the conceptual move from unknown parameter to unknown function is the origin story of learning-based control.
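To make the "unknown parameter" setting concrete, here is a minimal sketch of Lyapunov-based MRAC for a scalar plant. It is not taken from any of the works listed below: the plant and reference-model numbers, the adaptation gain `gamma`, the sinusoidal reference, and the forward-Euler integration are all illustrative assumptions. Only the sign of the input gain `b` is treated as known to the controller.

```python
import math

def simulate_mrac(a=1.0, b=2.0, am=2.0, bm=2.0, gamma=5.0, dt=1e-3, T=40.0):
    """Scalar MRAC sketch (all numeric values are illustrative assumptions).

    Plant:            x_dot  = a * x + b * u        (a unknown, sign of b known)
    Reference model:  xm_dot = -am * xm + bm * r
    Control law:      u = kx * x + kr * r           (kx, kr adapted online)
    """
    x = xm = 0.0                      # plant and reference-model states
    kx = kr = 0.0                     # adaptive gains, started far from ideal
    sgn_b = math.copysign(1.0, b)
    errors = []
    for k in range(int(T / dt)):
        r = math.sin(k * dt)          # reference command
        e = x - xm                    # tracking error
        u = kx * x + kr * r           # certainty-equivalence control
        # Adaptation laws chosen so that V = e^2/2 + |b|/(2*gamma) * (gain errors)^2
        # satisfies V_dot = -am * e^2 in continuous time.
        dkx = -gamma * e * x * sgn_b
        dkr = -gamma * e * r * sgn_b
        # Forward-Euler step for plant, reference model, and gains.
        x += dt * (a * x + b * u)
        xm += dt * (-am * xm + bm * r)
        kx += dt * dkx
        kr += dt * dkr
        errors.append(e)
    return errors, kx, kr

errors, kx, kr = simulate_mrac()
n = len(errors)
rms_head = math.sqrt(sum(e * e for e in errors[: n // 4]) / (n // 4))
rms_tail = math.sqrt(sum(e * e for e in errors[3 * n // 4:]) / (n - 3 * n // 4))
print("RMS tracking error, first vs last quarter:", rms_head, rms_tail)
print("adapted gains kx, kr:", kx, kr)
```

The open-loop plant is unstable here (`a > 0`), so the sketch also shows that the adaptation law, not prior knowledge of `a`, is what stabilizes the loop. Replacing the two scalar gains with a function approximator is the step that the later sections of this list trace.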

Design of Model-Reference Adaptive Control Systems for Aircraft (Whitaker, Yamron & Kezer, 1958)

Paper: TBD (MIT Instrumentation Lab Report R-164)
Note: the MIT rule; origin of MRAC.

Lyapunov Redesign of Model Reference Adaptive Control Systems (Parks, 1966) ⭐

Paper: TBD
Note: replaced the unstable MIT rule with Lyapunov-based adaptation.

On Self-Tuning Regulators (Åström & Wittenmark, 1973)

Paper: TBD
Note: STR; the other main branch alongside MRAC.

Stable Adaptive Systems (Narendra & Annaswamy, 1989)

Book: Prentice Hall
Note: rigorous MRAC stability theory.

Adaptive Control (2nd ed.) (Åström & Wittenmark, 1995) ⭐

Book: Addison-Wesley
Note: the standard graduate reference.

Nonlinear and Adaptive Control Design (Krstić, Kanellakopoulos & Kokotović, 1995) ⭐

Book: Wiley
Note: the adaptive-backstepping book; merges Lyapunov design with adaptation.

Robust Adaptive Control (Ioannou & Sun, 1996)

Book: Prentice Hall (PDF: https://flyingv.ucsd.edu/krstic/teaching/282/ioannousun.pdf)
Note: certainty equivalence + robustness modifications (σ-mod, dead zone, projection).

L1 Adaptive Control Theory (Hovakimyan & Cao, 2010)

Book: SIAM
Note: decouples estimation and control bandwidth.

2. Learning-Based Control with Gaussian Processes

The key shift: the unknown is no longer a parameter vector but a function. GPs provide calibrated uncertainty estimates, and those estimates are what make safety constraints and uncertainty-aware exploration tractable.
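As a rough illustration of what a GP posterior gives a controller, the sketch below fits a one-dimensional GP to a handful of samples of a hypothetical residual-dynamics term and compares the predictive variance inside and outside the data. The RBF kernel, its hyperparameters, the noise level, and the `sin(3x)` residual are assumptions chosen for illustration; the small linear solver is included only to keep the example dependency-free.

```python
import math

def rbf(x1, x2, ell=0.5, sf=1.0):
    """Squared-exponential kernel (hyperparameters are illustrative)."""
    return sf ** 2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def gp_predict(X, y, xs, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at the query point xs."""
    K = [[rbf(a, b) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    ks = [rbf(xs, a) for a in X]
    alpha = solve(K, y)               # (K + noise^2 I)^{-1} y
    v = solve(K, ks)                  # (K + noise^2 I)^{-1} k_*
    mean = sum(k * a for k, a in zip(ks, alpha))
    var = rbf(xs, xs) - sum(k * vi for k, vi in zip(ks, v))
    return mean, max(var, 0.0)

# Hypothetical residual dynamics g(x) = sin(3x), observed at five states.
X = [-1.0, -0.5, 0.0, 0.5, 1.0]
y = [math.sin(3.0 * x) for x in X]
for xs in (0.5, 0.25, 3.0):
    mean, var = gp_predict(X, y, xs)
    print(f"x = {xs}: mean = {mean:.3f}, std = {math.sqrt(var):.3f}")
```

Far from the data the predictive variance returns to the prior variance, which is the signal that GP-MPC and safe-exploration methods use to tighten constraints or avoid a region.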

Gaussian Process Model Based Predictive Control (Kocijan, Murray-Smith, Rasmussen & Girard, 2004)

Paper: TBD
Note: the first mainstream GP-MPC formulation.

Bayesian Nonparametric Adaptive Control Using Gaussian Processes (Chowdhary, Kingravi, How & Vela, 2014)

Paper: TBD
Note: reinstates adaptive control in a Bayesian-nonparametric frame.

Reachability-Based Safe Learning with Gaussian Processes (Akametalu, Fisac, Gillula, Kaynama, Zeilinger & Tomlin, 2014)

Paper: TBD
Note: first to combine HJ reachability with GP uncertainty.

Safe Controller Optimization for Quadrotors with Gaussian Processes (Berkenkamp, Schoellig & Krause, 2016)

Paper: TBD
Note: safe Bayesian optimization demonstrated on hardware.

Safe Model-Based Reinforcement Learning with Stability Guarantees (Berkenkamp, Turchetta, Schoellig & Krause, 2017)

Paper: https://arxiv.org/abs/1705.08551
Code: https://github.com/befelix/safe_learning
Note: Lyapunov stability verification combined with GP dynamics.

Cautious Model Predictive Control Using Gaussian Process Regression (Hewing, Kabzan & Zeilinger, 2020) ⭐

Paper: https://arxiv.org/abs/1705.10702
Note: chance-constrained GP-MPC; the reference formulation.

3. Safe Learning & Control Barrier Functions

Forward-invariance-based safety filters. CBF-QP became the dominant way to wrap any learned / RL policy in a safety certificate.
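A minimal sketch of the safety-filter idea, under strong simplifying assumptions: single-integrator dynamics `x_dot = u` in the plane, one circular obstacle, and a linear class-K function with gain `alpha`. With a single affine constraint the CBF-QP has a closed-form solution (a projection onto a half-space), so no QP solver is needed. The function names and all numbers are hypothetical.

```python
def cbf_filter(x, u_nom, center=(0.0, 0.0), radius=1.0, alpha=1.0):
    """Closed-form solution of
           min ||u - u_nom||^2   s.t.   grad_h(x) . u + alpha * h(x) >= 0
    for x_dot = u and h(x) = ||x - center||^2 - radius^2 (h >= 0 is the safe set).
    """
    dx = (x[0] - center[0], x[1] - center[1])
    h = dx[0] ** 2 + dx[1] ** 2 - radius ** 2
    a = (2.0 * dx[0], 2.0 * dx[1])                 # gradient of h
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] + alpha * h
    if slack >= 0.0:
        return u_nom, h                            # nominal input already satisfies the constraint
    # Project u_nom onto the constraint boundary; a != 0 whenever x is not the obstacle center.
    lam = -slack / (a[0] ** 2 + a[1] ** 2)
    return (u_nom[0] + lam * a[0], u_nom[1] + lam * a[1]), h

def rollout(x0=(-2.0, 0.1), goal=(2.0, 0.0), dt=0.01, steps=2000):
    """Drive toward the goal with a proportional nominal policy wrapped by the filter."""
    x, h_min = x0, float("inf")
    for _ in range(steps):
        u_nom = (goal[0] - x[0], goal[1] - x[1])   # stand-in for any learned policy
        u, h = cbf_filter(x, u_nom)
        h_min = min(h_min, h)
        x = (x[0] + dt * u[0], x[1] + dt * u[1])
    return x, h_min

x_final, h_min = rollout()
print("final state:", x_final)
print("smallest barrier value along the rollout:", h_min)
```

The nominal policy here is a plain proportional controller, but the filter never inspects how `u_nom` was produced, which is why the same wrapper can be placed around an RL or imitation-learned policy.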

Control Barrier Function Based Quadratic Programs for Safety Critical Systems (Ames, Xu, Grizzle & Tabuada, 2017) ⭐

Paper: TBD
Note: defines modern CBFs; the QP safety filter.

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks (Cheng, Orosz, Murray & Burdick, 2019)

Paper: https://arxiv.org/abs/1903.08792
Code: https://github.com/rcheng805/RL-CBF

Control Barrier Functions: Theory and Applications (Ames, Coogan, Egerstedt, Notomista, Sreenath & Tabuada, 2019)

Paper: https://arxiv.org/abs/1903.11199
Note: the definitive CBF survey.

Learning for Safety-Critical Control with Control Barrier Functions (Taylor, Singletary, Yue & Ames, 2020)

Paper: https://arxiv.org/abs/1912.10099
Note: closes the loop — learn residual dynamics inside the CBF framework.

4. Learning-Based MPC

The branch where model-based control absorbs data-driven model learning. Distinct from GP-MPC (which is a special case) because it also covers learned dynamics from neural networks and iterative learning.

(This section is intentionally short — substantial contributions overlap with Section 2 (GP-MPC) and Section 3 (CBF-based safety). PRs with canonical additions welcome.)

Learning-Based Model Predictive Control for Autonomous Racing (Kabzan, Hewing, Liniger & Zeilinger, 2019)

Paper: TBD
Note: GP-augmented MPC deployed on a real race car.

5. Reinforcement Learning Meets Control

The bridge from adaptive/optimal control to RL. Only entries on the control-theoretic side of the bridge are included here — for a comprehensive safe-RL list, see the repos linked in the preface.

Reinforcement Learning Is Direct Adaptive Optimal Control (Sutton, Barto & Williams, 1992) ⭐

Paper: TBD
Note: explicit conceptual bridge between RL and adaptive control.

6. Neural ODEs & Continuous-Depth Models

Deep learning as a continuous dynamical system. The mathematical object closest to classical control in the modern deep-learning toolbox.
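The link between residual networks and ODEs can be sketched in a few lines: a stack of residual blocks `x <- x + h * f(x)` is a forward-Euler discretization of `x_dot = f(x)`, so deeper stacks with smaller `h` approach the continuous flow. The scalar tanh "layer", its fixed weights, and the RK4 reference below are illustrative assumptions, not the setup of any listed paper.

```python
import math

def f(x, w=-1.5, b=0.5):
    """One 'layer': a scalar tanh unit with fixed, illustrative weights."""
    return math.tanh(w * x + b)

def resnet_forward(x, depth, T=1.0):
    """Residual stack with weight sharing: depth forward-Euler steps of size T / depth."""
    h = T / depth
    for _ in range(depth):
        x = x + h * f(x)
    return x

def ode_reference(x, T=1.0, steps=2000):
    """Classical RK4 integration of x_dot = f(x), used as the reference flow."""
    h = T / steps
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

x0 = 1.0
ref = ode_reference(x0)
for depth in (2, 8, 32, 128):
    err = abs(resnet_forward(x0, depth) - ref)
    print(f"depth {depth:4d}: |resnet - ode flow| = {err:.5f}")
```

The gap shrinks roughly in proportion to the step size, as expected for a first-order scheme. Neural ODE methods replace the fixed Euler stack with an adaptive solver and differentiate through it.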

A Proposal on Machine Learning via Dynamical Systems (Weinan E, 2017) ⭐

Paper: TBD
Note: reframes deep learning as a control / dynamical-systems problem.

Stable Architectures for Deep Neural Networks (Haber & Ruthotto, 2017)

Paper: https://arxiv.org/abs/1705.03341
Note: stability-motivated ResNet-like architectures.

Neural Ordinary Differential Equations (Chen, Rubanova, Bettencourt & Duvenaud, 2018) ⭐

Paper: https://arxiv.org/abs/1806.07366
Code: https://github.com/rtqichen/torchdiffeq
Note: NeurIPS 2018 best paper; defines the continuous-depth paradigm.

Augmented Neural ODEs (Dupont, Doucet & Teh, 2019)

Paper: https://arxiv.org/abs/1904.01681
Code: https://github.com/EmilienDupont/augmented-neural-odes
Note: topological obstructions of vanilla Neural ODEs and the fix.

How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization (Finlay, Jacobsen, Nurbekyan & Oberman, 2020)

Paper: https://arxiv.org/abs/2002.02798
Note: OT-style regularization for stable training.

Neural Controlled Differential Equations for Irregular Time Series (Kidger, Morrill, Foster & Lyons, 2020)

Paper: https://arxiv.org/abs/2005.08926
Note: controlled-path perspective on sequence modeling.

Large-Time Asymptotics in Deep Learning (Esteve, Geshkovski, Pighin & Zuazua, 2020)

Paper: https://arxiv.org/abs/2008.02491
Note: turnpike-theoretic view of residual networks.

Scalable Gradients for Stochastic Differential Equations (Li, Wong, Chen & Duvenaud, 2020)

Paper: https://arxiv.org/abs/2001.01328
Code: https://github.com/google-research/torchsde
Note: adjoint method for SDEs.

Neural ODE Control for Classification, Approximation and Transport (Ruiz-Balet & Zuazua, 2021 / SIAM Review 2023)

Paper: https://arxiv.org/abs/2104.05278
Note: control-theoretic analysis — controllability and simultaneous control of NODEs as the basis of classification and universal approximation.

7. Representation Learning for Dynamics

System identification as representation learning. Koopman, SINDy, latent world models.
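A small sketch of the Koopman idea in its EDMD form: lift the state through a dictionary of observables, then fit a linear map between lifted snapshots by least squares. The discrete-time toy system below is chosen because the dictionary `[x1, x2, x1^2]` is exactly invariant, so the fitted matrix can be checked against the known answer. The system, the eigenvalues `LAM` and `MU`, and the dictionary are illustrative assumptions.

```python
import random

LAM, MU = 0.9, 0.5   # illustrative constants

def step(x1, x2):
    """Nonlinear map that becomes linear in the lifted coordinates [x1, x2, x1^2]."""
    return LAM * x1, MU * x2 + (LAM ** 2 - MU) * x1 ** 2

def lift(x1, x2):
    return [x1, x2, x1 * x1]

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def edmd(pairs):
    """Least-squares K such that lift(x_next) ~= K @ lift(x), via the normal equations."""
    n = 3
    G = [[0.0] * n for _ in range(n)]   # sum of psi(x) psi(x)^T
    C = [[0.0] * n for _ in range(n)]   # C[i][r] = sum of psi_i(x) * psi_r(x_next)
    for x, x_next in pairs:
        p, q = lift(*x), lift(*x_next)
        for i in range(n):
            for j in range(n):
                G[i][j] += p[i] * p[j]
                C[i][j] += p[i] * q[j]
    # Row r of K solves G k_r = C[:, r].
    return [solve(G, [C[i][r] for i in range(n)]) for r in range(n)]

rng = random.Random(0)
states = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
K = edmd([(x, step(*x)) for x in states])
for row in K:
    print([round(v, 6) for v in row])
```

For this system the lifted dynamics are exactly linear, with rows `[LAM, 0, 0]`, `[0, MU, LAM^2 - MU]`, `[0, 0, LAM^2]`. For generic nonlinear systems no finite dictionary closes, which is the gap that learned lifts (deep Koopman, latent models) try to fill.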

Koopman operator theory

Hamiltonian Systems and Transformation in Hilbert Space (Koopman, 1931) ⭐

Paper: TBD
Note: the original infinite-dimensional linear lift.

Spectral Properties of Dynamical Systems, Model Reduction and Decompositions (Mezić, 2005) ⭐

Paper: TBD
Note: revived Koopman for applied dynamics.

A Data-Driven Approximation of the Koopman Operator: Extended Dynamic Mode Decomposition (Williams, Kevrekidis & Rowley, 2015)

Paper: https://arxiv.org/abs/1408.4408
Note: EDMD; the workhorse algorithm.

Dynamic Mode Decomposition with Control (Proctor, Brunton & Kutz, 2016)

Paper: https://arxiv.org/abs/1409.6358

Linear Predictors for Nonlinear Dynamical Systems: Koopman Operator Meets MPC (Korda & Mezić, 2018)

Paper: https://arxiv.org/abs/1611.03537
Code: https://github.com/MilanKorda/KoopmanMPC

Deep Learning for Universal Linear Embeddings of Nonlinear Dynamics (Lusch, Kutz & Brunton, 2018)

Paper: https://arxiv.org/abs/1712.09707
Code: https://github.com/BethanyL/DeepKoopman
Note: learn the Koopman lift end-to-end.

Modern Koopman Theory for Dynamical Systems (Brunton, Budišić, Kaiser & Kutz, 2022) ⭐

Paper: https://arxiv.org/abs/2102.12086
Note: the current authoritative survey.

Sparse identification — SINDy

Discovering Governing Equations from Data by Sparse Identification of Nonlinear Dynamical Systems (Brunton, Proctor & Kutz, 2016) ⭐

Paper: TBD
Code: https://github.com/dynamicslab/pysindy

Data-Driven Discovery of Partial Differential Equations (Rudy, Brunton, Proctor & Kutz, 2017)

Paper: https://arxiv.org/abs/1609.06401

Data-Driven Discovery of Coordinates and Governing Equations (Champion, Lusch, Kutz & Brunton, 2019)

Paper: https://arxiv.org/abs/1904.02107
Code: https://github.com/kpchamp/SindyAutoencoders

Latent dynamics & world models

Embed to Control: a Locally Linear Latent Dynamics Model for Control from Raw Images (Watter, Springenberg, Boedecker & Riedmiller, 2015)

Paper: https://arxiv.org/abs/1506.07365

World Models (Ha & Schmidhuber, 2018)

Paper: https://arxiv.org/abs/1803.10122
Project: https://worldmodels.github.io/

Dream to Control: Learning Behaviors by Latent Imagination (Hafner, Lillicrap, Ba & Norouzi, 2020)

Paper: https://arxiv.org/abs/1912.01603
Code: https://github.com/danijar/dreamer

8. Physics-Informed & Structure-Preserving Learning

Building inductive bias — energy conservation, symplectic structure, Lagrangian / Hamiltonian form — directly into the network architecture.
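One way to see why structure matters: if the vector field is derived from a scalar energy `H(q, p)` and integrated with a structure-preserving scheme, the energy error stays bounded over long rollouts, while an unstructured explicit update drifts. The sketch below uses a known pendulum Hamiltonian as a stand-in for a learned energy network (an assumption), finite differences in place of automatic differentiation, and symplectic Euler as the structure-preserving integrator.

```python
import math

def H(q, p):
    """Pendulum energy; stands in for a learned scalar energy model."""
    return 0.5 * p * p + (1.0 - math.cos(q))

def grad_H(q, p, eps=1e-6):
    """Central finite differences in place of automatic differentiation."""
    dHdq = (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)
    dHdp = (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)
    return dHdq, dHdp

def max_energy_drift(q, p, dt, steps, symplectic):
    """Largest |H - H0| along a rollout of q_dot = dH/dp, p_dot = -dH/dq."""
    e0, worst = H(q, p), 0.0
    for _ in range(steps):
        dHdq, dHdp = grad_H(q, p)
        if symplectic:
            p = p - dt * dHdq            # update momentum first ...
            _, dHdp = grad_H(q, p)
            q = q + dt * dHdp            # ... then position with the new momentum
        else:
            q, p = q + dt * dHdp, p - dt * dHdq   # plain explicit Euler
        worst = max(worst, abs(H(q, p) - e0))
    return worst

drift_explicit = max_energy_drift(1.0, 0.0, dt=0.05, steps=2000, symplectic=False)
drift_symplectic = max_energy_drift(1.0, 0.0, dt=0.05, steps=2000, symplectic=True)
print("max energy drift, explicit Euler:  ", drift_explicit)
print("max energy drift, symplectic Euler:", drift_symplectic)
```

Hamiltonian and Lagrangian networks apply the same principle one level up: the network outputs the scalar energy, and the dynamics are obtained from its derivatives rather than predicted directly.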

Physics-Informed Neural Networks (Raissi, Perdikaris & Karniadakis, 2019) ⭐

Paper: TBD
Note: the PINN paper.

Hamiltonian Neural Networks (Greydanus, Dzamba & Yosinski, 2019)

Paper: https://arxiv.org/abs/1906.01563
Code: https://github.com/greydanus/hamiltonian-nn

Lagrangian Neural Networks (Cranmer, Greydanus, Hoyer, Battaglia, Spergel & Ho, 2020)

Paper: https://arxiv.org/abs/2003.04630
Code: https://github.com/MilesCranmer/lagrangian_nns

9. Emerging Paradigms (2023–2026)

The trajectory extends into the present. This section tracks paradigm-shift papers from the past three years — works that introduce a new way of formulating the control problem, not just better performance on an existing one. Entries here are chronological; some are already becoming canonical, others may not survive the decade.

Diffusion-based policies

A shift from deterministic / Gaussian policies to generative policies that sample from a learned action-distribution score function. The key observation: multimodal demonstrations and high-dimensional action spaces had been poorly served by unimodal policy classes.
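The multimodality argument can be illustrated with a deliberately simplified stand-in for a diffusion policy: suppose demonstrations contain two equally good actions (steer left or steer right), so the action density is a two-mode Gaussian mixture. The sketch below samples actions by Langevin dynamics on the analytic score of that mixture. This is not DDPM sampling and involves no learned network; the modes, the noise scale `SIGMA`, the step size, and the iteration count are all illustrative assumptions.

```python
import math
import random

MODES, SIGMA = (-1.0, 1.0), 0.2   # two demonstrated actions and their spread (illustrative)

def score(a):
    """Gradient of the log-density of the equal-weight two-mode Gaussian mixture."""
    logw = [-0.5 * ((a - m) / SIGMA) ** 2 for m in MODES]
    top = max(logw)                               # subtract the max for numerical stability
    w = [math.exp(l - top) for l in logw]
    return sum(wi * (m - a) for wi, m in zip(w, MODES)) / (sum(w) * SIGMA ** 2)

def sample_action(rng, steps=200, eps=0.005):
    """Unadjusted Langevin dynamics: start from noise, follow the score plus noise."""
    a = rng.gauss(0.0, 1.0)
    for _ in range(steps):
        a += 0.5 * eps * score(a) + math.sqrt(eps) * rng.gauss(0.0, 1.0)
    return a

rng = random.Random(0)
samples = [sample_action(rng) for _ in range(400)]
mean_action = sum(samples) / len(samples)
near_mean = sum(abs(s - mean_action) < 0.3 for s in samples)
print("fraction of samples in the right-hand mode:", sum(s > 0 for s in samples) / len(samples))
print("average action (what a mean-regressing policy would output):", mean_action)
print("samples within 0.3 of that average:", near_mean)
```

The average of the samples lies between the two modes, where almost no sampled action falls. A unimodal policy trained with a squared-error loss tends toward that average, which is the failure mode the paragraph above describes.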

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (Chi, Xu, Feng, Cousineau, Du, Burchfiel, Tedrake & Song, 2023 / IJRR 2024) ⭐

Paper: https://arxiv.org/abs/2303.04137
Project: https://diffusion-policy.cs.columbia.edu/
Code: https://github.com/real-stanford/diffusion_policy
Note: the paradigm-defining paper; formulates visuomotor policies as conditional DDPMs with receding-horizon control.

Vision-Language-Action (VLA) foundation models

A shift from task-specific policies to pretrained generalist controllers that share weights across robots, tasks, and embodiments. Open problem: how to reconcile foundation-model scale with control-theoretic safety and real-time constraints.

OpenVLA: An Open-Source Vision-Language-Action Model (Kim, Pertsch, Karamcheti, Xiao, Balakrishna, Nair, Rafailov, Foster, Lam, Sanketi, Vuong, Kollar, Burchfiel, Tedrake, Sadigh, Levine, Liang & Finn, 2024) ⭐

Paper: https://arxiv.org/abs/2406.09246
Code: https://github.com/openvla/openvla
Project: https://openvla.github.io/
Note: the first fully open-source 7B VLA; anchor point for community research.

π0: A Vision-Language-Action Flow Model for General Robot Control (Black, Brown, Driess, Esmail, Equi, Finn, Fusai, Groom, Hausman, Ichter et al., 2024)

Paper: https://arxiv.org/abs/2410.24164
Note: flow-matching action head on a VLM backbone; representative of the industrial VLA line.

Control-theoretic safety & stability for foundation-model policies

The emerging counter-paradigm: take the large policy as given, then add a control-theoretic layer that certifies safety or stability. Not a replacement for neural-CBF / certificate work below, but a distinct line focused specifically on foundation-model-scale policies (diffusion, VLA).

AEGIS / VLSA: Vision-Language-Action Models with Plug-and-Play Safety Constraint Layer (Liu et al., 2025)

Paper: https://arxiv.org/abs/2512.11891
Note: CBF-QP safety filter wrapping a VLA; introduces the SafeLIBERO benchmark. The filter is the certified object; the VLA itself remains uncertified.

Safe and Stable Control via Lyapunov-Guided Diffusion Models (S²Diff) (Cheng, Yang & colleagues, NeurIPS 2025) ⭐

Paper: https://arxiv.org/abs/2509.25375
Note: first direct stability result for diffusion-sampled policies via Almost-Lyapunov theory; trades pointwise Lie-derivative descent for a small-violation-measure condition. A genuine weakening of the classical stability notion, which is what makes the result achievable.

PACS: From Demonstrations to Safe Deployment — Path-Consistent Safety Filtering for Diffusion Policies (Römer et al., 2025 / v2 2026-03)

Paper: https://arxiv.org/abs/2511.06385
Note: set-based reachability analysis as the safety layer around diffusion policies. Reported to outperform CBF-based filters by up to 68 % in task success on human-robot interaction tasks — evidence that reachability-based filtering respects the training distribution better than reactive projection.

Neural certificates (CBF / Lyapunov) with verification

A shift from hand-designed certificates to learned neural CBFs / Lyapunov functions with formal verification. The CBF community's response to the scaling limits of analytic design.

How to Train Your Neural Control Barrier Function: Learning Safety Filters for Complex Input-Constrained Systems (So, Serlin, Mann, Gonzales, Rutledge, Roy & Fan, 2023 / ICRA 2024)

Paper: https://arxiv.org/abs/2310.15478
Note: policy neural CBFs (PNCBFs); value function of a nominal policy as a CBF.

Learning Neural Network Barrier Functions with Termination Guarantees (Chen, Molu & Fazlyab, 2024)

Paper: https://arxiv.org/abs/2403.07308
Note: CEGIS-based fine-tuning with formal termination guarantees.

Learning Conservative Neural Control Barrier Functions from Offline Data (Tabbara et al., 2025)

Paper: https://arxiv.org/abs/2505.00908
Note: CQL-inspired offline CBF training; OOD states become safety costs.

Scalable Verification of Neural Control Barrier Functions Using Linear Bound Propagation (Vertovec et al., 2025)

Paper: https://arxiv.org/abs/2511.06341
Note: LBP + McCormick relaxation to scale NCBF verification beyond small networks.

Behavior foundation models for whole-body control

A shift from single-task RL policies to pretrained primitive skill libraries that enable zero-shot or few-shot task adaptation — the behavioral analog of VLA for low-level control.

A Survey of Behavior Foundation Model: Next-Generation Whole-Body Control System of Humanoid Robots (Yuan, Yu, Ge, Yao, Wang, Chen, Li, Zhang, Zeng, Chen & Jin, 2025)

Paper: https://arxiv.org/abs/2506.20487
Note: the first comprehensive survey of BFMs for humanoid WBC; defines the category.

10. Open Problems

A curated set of open research questions visible in the current trajectory. These are drawn from recent surveys and position papers, cross-referenced against the paradigm shifts in Section 9. PRs adding new open problems (with a reference paper that articulates the problem clearly) are welcome.

Inclusion policy for the "control-theoretic perspective" entries below. We list only works that analyze, optimize, or provide guarantees for learning-based systems using tools from systems and control theory — Lyapunov / ISS stability, forward invariance / CBFs, Hamilton-Jacobi reachability, passivity / energy tanks, robust and distributionally robust control, conformal prediction coupled to a feedback loop. ML papers that merely add a regularization term to a loss do not qualify and are excluded. The classical-Lyapunov-for-a-7B-VLA question is, as of 2026-04, still open; what the literature offers instead are partial results — certificates for safety filters wrapping the FM, relaxed stability notions (almost-Lyapunov, UUB), or verification for small neural certificates. We label each entry accordingly.

10.1 Safety guarantees for foundation-model-driven control. VLA models and diffusion policies inherit the control loop but not the safety guarantees of the classical control stack. Classical Lyapunov / ISS arguments require closed-form dynamics and a closed-form policy $u = \pi(x)$; foundation models break both assumptions (iterative denoising, autoregressive token generation, implicit dynamics from pixels). How do we formally certify, or at least runtime-monitor, such policies?

Problem statement: Foundation Models in Robotics: Applications, Challenges, and the Future (Firoozi et al., 2023 / IJRR 2025) — https://arxiv.org/abs/2312.07843
Recent work (control-theoretic perspective):
- AEGIS / VLSA (Liu et al., 2025) — plug-and-play CBF-QP safety filter wrapping a VLA; certifies the filter, not the VLA — https://arxiv.org/abs/2512.11891
- S²Diff (Cheng, Yang & colleagues, NeurIPS 2025) — first direct stability proof for diffusion-sampled policies, via Almost-Lyapunov theory; weakens the stability notion from pointwise Lie-derivative descent to small-violation-measure — https://arxiv.org/abs/2509.25375
- PACS — Path-Consistent Safety Filtering for Diffusion Policies (Römer et al., v2 2026-03) — set-based reachability analysis as the safety layer; outperforms reactive CBF safety filters on human-robot interaction tasks — https://arxiv.org/abs/2511.06385
- Manifold-Guided Lyapunov Control with Diffusion (Mukherjee & colleagues, 2024) — uses diffusion to generate Lyapunov functions as a way to scale certificate synthesis — https://arxiv.org/abs/2403.17692

10.2 Physical risk in open-world deployment. Classical safe-learning assumed closed industrial environments; FM-enabled robots operate alongside humans where physical interaction is unavoidable. The question is not just avoid collision but robust constrained control under nonlinear dynamics with humans in the loop.

Problem statement: A Comprehensive Survey on Physical Risk Control in the Era of Foundation Model-enabled Robotics (Kojima et al., 2025) — https://arxiv.org/abs/2505.12583
Recent work (control-theoretic perspective):
- Safe Physics-informed Machine Learning for Dynamics and Control (Drgoňa et al., 2025, tutorial) — unifies Lyapunov, CBF, reachability-analysis, and safety-filter perspectives for physics-constrained learning — https://arxiv.org/abs/2504.12952
- On Safety and Liveness Filtering Using Hamilton–Jacobi Reachability Analysis (2024) — HJ-reachability for joint safety + liveness filter synthesis; applicable as a wrapper around learned policies — https://arxiv.org/abs/2312.15347

10.3 Uncertainty quantification under distribution shift. GP-based methods provide calibrated uncertainty but don't scale; deep ensembles scale but aren't calibrated under closed-loop shift (the policy's own actions move the distribution). The control-theoretic response: conformal prediction coupled to a predictive controller, giving distribution-free probabilistic safety bounds.

Recent work (control-theoretic perspective):
- Safe Planning in Dynamic Environments Using Conformal Prediction (Lindemann, Cleaveland, Shim & Pappas, RA-L 2023): MPC with conformal-prediction regions around learned trajectory predictors, with formal collision-probability bounds. arXiv preprint: https://arxiv.org/abs/2210.10254
- Formal Verification and Control with Conformal Prediction (Lindemann, Zhao, Yu, Pappas & Deshmukh, 2024) — extends conformal coupling to STL specifications for runtime verification — https://arxiv.org/abs/2409.00536
- Conformal Prediction for Uncertainty-Aware Planning with Diffusion Dynamics Model (Sun, Jiang, Qiu, Nobel, Kochenderfer & Schwager, NeurIPS 2023) — CP-calibrated safety bounds for diffusion-based world models used inside a planner.
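The core calibration step behind these methods is short enough to sketch. Split conformal prediction takes held-out prediction errors of a learned predictor and returns a radius that covers a fresh error with probability at least `1 - delta`, provided calibration and test errors are exchangeable. The error model below is a hypothetical stand-in for a trajectory predictor, and the function names are not from any of the listed papers.

```python
import math
import random

def conformal_radius(scores, delta):
    """Split-conformal quantile: the ceil((n + 1) * (1 - delta))-th smallest score.

    Returns infinity when the calibration set is too small for the requested level.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1.0 - delta))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]

def prediction_error(rng):
    """Hypothetical |true position - predicted position| of a learned predictor."""
    return abs(rng.gauss(0.0, 0.3)) + rng.expovariate(5.0)

rng = random.Random(1)
calibration = [prediction_error(rng) for _ in range(1000)]
radius = conformal_radius(calibration, delta=0.1)

# A planner would inflate each predicted obstacle position by `radius`.
fresh = [prediction_error(rng) for _ in range(5000)]
coverage = sum(e <= radius for e in fresh) / len(fresh)
print("conformal radius:", radius)
print("empirical coverage on fresh errors:", coverage)
```

The guarantee is distribution-free but rests on exchangeability. In closed loop the policy's own actions can shift the error distribution, which is the part of the open problem stated above that the plain split-conformal recipe does not address.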

10.4 Real-time inference with foundation-scale policies. A standard diffusion policy runs at ~1.5 Hz, and a 7B VLA is memory- and compute-bound at inference time. Classical control typically requires ≥50 Hz for manipulation and ≥100 Hz for flight and contact-rich tasks. The control-theoretic framing: what is the minimal inference cost that still preserves the closed-loop stability and tracking margins of the original policy?

Recent work (control-theoretic perspective):
- One-Step Diffusion Policy (OneDP) (Wang et al., ICLR 2025) — KL-divergence distillation; 1.5 Hz → 62 Hz on a Franka. The closed-loop-margin preservation question is raised but not fully answered — https://arxiv.org/abs/2410.21257
- Consistency Policy (Prasad, Lin, Wu, Zhou & Bohg, RSS 2024) — self-consistency distillation of a pretrained diffusion policy to few-step inference.
- Open: a formal result bounding the closed-loop performance gap between a multi-step teacher and a one-step student, in terms of a control-theoretic metric (ISS gain, tracking error bound).

10.5 Bridging learned and analytic certificates. Neural CBFs / Lyapunov functions scale where analytic constructions fail, but verification is the bottleneck: the verifier's cost grows super-linearly with network width.

Recent work (control-theoretic perspective):
- Scalable Verification of Neural CBFs Using Linear Bound Propagation (Vertovec et al., 2025) — LBP + McCormick relaxation; extends verifiable network size by about an order of magnitude — https://arxiv.org/abs/2511.06341
- Verification of Neural CBFs with Symbolic Derivative Bounds Propagation (Hu, Yang, Wei & Liu, CoRL 2024) — symbolic bounds on the derivative, not just the value — https://arxiv.org/abs/2410.16281
- Certifying Stability of RL Policies using Generalized Lyapunov Functions (Long, Cortés & Atanasov, NeurIPS 2025) — generalized Lyapunov (multi-step weighted descent) enlarges certifiable regions for RL policies, including swing-up regimes where classical pointwise descent fails.
- Latent Representations for Control Design with Provable Stability and Safety Guarantees (2025) — dynamics-aware approximate conjugacy conditions that transfer latent-space Lyapunov/barrier guarantees back to the original state space — https://arxiv.org/abs/2505.23210

10.6 Generalization across embodiments. Open X-Embodiment made cross-robot pretraining possible but not morphology-transfer. From a control standpoint the question is geometric: what invariances must the policy respect to be embodiment-agnostic?

Status (2026-04): mostly an ML-engineering problem so far; control-theoretic treatment is thin. The cleanest candidate is the Koopman / latent-dynamics line (Section 7), which gives a representation-learning objective compatible with downstream stability analysis. We have no high-quality control-theoretic paper to list here yet — PRs welcome.

10.7 Data scarcity for contact-rich and dexterous manipulation. Internet-scale text and image data contain no force, torque, or contact signals. The control-theoretic response: combine imitation / RL with passivity-based low-level control, so the learned policy inherits energetic-stability guarantees from the controller structure.

Recent work (control-theoretic perspective):
- Diffusion-Based Impedance Learning for Contact-Rich Manipulation (Geiger, Asfour, Hogan & Lachner, 2025) — diffusion policy generates impedance trajectories; passivity guaranteed via energy-tank construction when stiffness decreases, combinable with Hogan-style impedance shaping otherwise — https://arxiv.org/abs/2509.19696
- Learning Variable Impedance Skills from Demonstrations with Passivity Guarantee (Zhang et al., 2024) — Lyapunov-based stability condition for learned variable-stiffness profiles — https://arxiv.org/abs/2306.11308
- Unified Force-Impedance Control (Shahriari & Haddadin, IJRR 2024) — passivity-based framework for rigid and flexible-joint robots via energy tanks; a canonical low-level layer to wrap learned high-level policies.

More open problems are welcome — please submit a PR pointing to a paper that articulates the problem. For the control-theoretic perspective lists, contributions must satisfy the inclusion policy at the top of this section.

11. Surveys & Roadmap Papers

Start here if you want a structured overview before diving into primary sources.

Learning-Based Model Predictive Control: Toward Safe Learning in Control (Hewing, Wabersich, Menner & Zeilinger, 2020) ⭐

Paper: TBD (Annual Review of Control, Robotics, and Autonomous Systems 3:269–296)
Note: the reference survey for learning-based MPC.

A Historical Perspective of Adaptive Control and Learning (Annaswamy & Fradkov, 2021)

Paper: https://arxiv.org/abs/2108.11336
Note: bridges classical adaptive control and the learning era.

Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning (Brunke, Greeff, Hall, Yuan, Zhou, Panerati & Schoellig, 2022) ⭐

Paper: https://arxiv.org/abs/2108.06266
Code: https://github.com/utiasDSL/safe-control-gym
Note: unifies the control and RL vocabulary for safe learning.

Foundation Models in Robotics: Applications, Challenges, and the Future (Firoozi, Tucker, Tian, Majumdar, Sun, Liu, Zhu, Song, Kapoor, Hausman, Ichter, Driess, Wu, Lu & Schwager, 2023 / IJRR 2025)

Paper: https://arxiv.org/abs/2312.07843
Code: https://github.com/robotics-survey/Awesome-Robotics-Foundation-Models
Note: the canonical statement of the FM-in-robotics open-problem list (safety, UQ, real-time, data scarcity, embodiment).

A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions (Kushwaha & Biron, 2025; v3 2026-01)

Paper: https://arxiv.org/abs/2508.09128
Note: most recent systematic survey dedicated to Lyapunov- and barrier-function-based safe RL; explicitly takes the control-theoretic viewpoint.

12. Software & Benchmarks

Benchmarks & environments

💻 safe-control-gym — https://github.com/utiasDSL/safe-control-gym
- Unified benchmark for safe learning-based control and RL.

13. Courses & Lecture Notes

🎓 Underactuated Robotics (MIT 6.832) — Russ Tedrake. https://underactuated.mit.edu/ ⭐
- Lyapunov, trajectory optimization, policy search, with executable notebooks.
🎓 Data-Driven Science & Engineering (UW AMATH 563) — Steven Brunton. https://databookuw.com/
- Koopman, SINDy, DMD, PDEs from data.

14. How to Contribute

Inclusion criteria. An entry earns its place only if it satisfies at least one of:

- Foundational: introduces a concept, formalism, or tool the field still builds on.
- Turning point: reframes the problem, unifies prior threads, or opens a new line of work.
- Canonical reference: the single place to point a student for a topic.

Exclusion criteria. Incremental benchmark improvements, minor variants, short-lived trends, and entries whose only merit is recency. Classical control that is not part of the learning-based trajectory belongs in A-make/awesome-control-theory, not here.

Entry format.

**Paper Title** (Authors, Year)
- Paper: https://...           (or `TBD` if uncertain)
- Code: https://...            (if available)
- Project: https://...         (if available)
- Note: one-sentence reason it is on the trajectory (optional)

Link policy. We prefer verified direct links (arxiv abs page, official project page, publisher DOI) over aggregator pages. If you are not 100% sure a link points to the correct paper, please leave it as TBD rather than guessing — a missing link is always better than a wrong one.

Self-review checklist before adding:

- Does this entry belong to a visible development trajectory in its section?
- Could it be replaced by something already listed without losing information?
- Is the note about the idea, not about results or benchmarks?
- Have I verified that every URL I am providing actually points to this exact paper?

If you find a wrong link, incorrect year, or misattributed author, please open an issue — corrections are as valuable as additions.

Last updated: 2026-04-16
