Abstract
Numerical methods for the optimal feedback control of high-dimensional dynamical systems typically suffer from the curse of dimensionality. In this work, we devise a mesh-free, data-based approximation method for the value function of optimal control problems, which partially mitigates the dimensionality problem. The method is based on a greedy Hermite kernel interpolation scheme and incorporates context knowledge through its structure: the value function surrogate is enforced to vanish at the target state, to be non-negative, and it is constructed as a correction of a linearized model. The algorithm admits a matrix-free formulation, which ensures efficient offline construction and online evaluation of the surrogate and circumvents the large-matrix problem of multivariate Hermite interpolation. Additionally, an incremental Cholesky factorization is utilized in the offline generation of the surrogate. For finite time horizons, we prove both convergence of the surrogate to the value function and convergence of the surrogate-controlled dynamical system to the optimally controlled one. Numerical experiments support the effectiveness of the scheme, using, among others, a new academic model with an explicitly known value function, which may also be useful for the community to validate other optimal control approaches.
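To make the core building block more concrete, the following minimal sketch illustrates Hermite (gradient-enhanced) kernel interpolation with a Gaussian kernel: a surrogate that matches both the values and the gradients of a target function at a set of centers, fitted via a Cholesky solve of the block Gram matrix. This is only an illustration under assumed choices (Gaussian kernel, one-shot direct solve, NumPy, all function names invented here); it is not the authors' implementation and omits the greedy center selection, the structural constraints (non-negativity, zero value at the target state, correction of a linearized model), and the matrix-free and incremental-Cholesky machinery described in the paper.

```python
# Minimal sketch of Hermite (value + gradient) kernel interpolation
# with a Gaussian kernel; names and parameter choices are illustrative only.
import numpy as np

def gaussian_kernel_blocks(X, Y, sigma):
    """Return k(x_m, y_n), d/dy_j k, d/dx_i k and d^2/(dx_i dy_j) k
    for all pairs of rows of X (M x d) and Y (N x d)."""
    M, d = X.shape
    diff = X[:, None, :] - Y[None, :, :]                     # r = x - y, shape (M, N, d)
    K = np.exp(-np.sum(diff**2, axis=2) / (2 * sigma**2))    # (M, N)
    dK_dy = K[:, :, None] * diff / sigma**2                  # d/dy_j k =  k * r_j / sigma^2
    dK_dx = -dK_dy                                           # d/dx_i k = -k * r_i / sigma^2
    # d^2/(dx_i dy_j) k = k * (delta_ij / sigma^2 - r_i r_j / sigma^4)
    eye = np.eye(d)
    d2K = K[:, :, None, None] * (eye[None, None] / sigma**2
                                 - diff[:, :, :, None] * diff[:, :, None, :] / sigma**4)
    return K, dK_dy, dK_dx, d2K

def hermite_fit(X, f_vals, f_grads, sigma, nugget=1e-8):
    """Solve the block Gram system so the interpolant reproduces values and gradients."""
    N, d = X.shape
    K, dK_dy, dK_dx, d2K = gaussian_kernel_blocks(X, X, sigma)
    # Assemble the symmetric block matrix [[K, dK_dy], [dK_dx, d2K]] of size N(1+d).
    top = np.hstack([K, dK_dy.reshape(N, N * d)])
    bottom = np.hstack([dK_dx.transpose(0, 2, 1).reshape(N * d, N),
                        d2K.transpose(0, 2, 1, 3).reshape(N * d, N * d)])
    A = np.vstack([top, bottom]) + nugget * np.eye(N * (1 + d))
    rhs = np.concatenate([f_vals, f_grads.ravel()])
    # One-shot Cholesky solve of the (regularized) positive definite Gram matrix.
    L = np.linalg.cholesky(A)
    coeffs = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
    return coeffs[:N], coeffs[N:].reshape(N, d)

def hermite_eval(x_new, X, alpha, beta, sigma):
    """Evaluate the Hermite interpolant at new points x_new (M x d)."""
    K, dK_dy, _, _ = gaussian_kernel_blocks(np.atleast_2d(x_new), X, sigma)
    return K @ alpha + np.einsum('mnj,nj->m', dK_dy, beta)

# Toy usage: interpolate f(x) = ||x||^2, which is 0 at the origin like a value function.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 2))
alpha, beta = hermite_fit(X, np.sum(X**2, axis=1), 2 * X, sigma=0.8)
print(hermite_eval(np.array([[0.3, -0.2]]), X, alpha, beta, sigma=0.8))  # close to 0.13
```

In the scheme proposed in the paper, the interpolation centers are instead selected greedily and the Cholesky factor of the Gram matrix is extended incrementally as centers are added, rather than refactorizing the full system in one shot as in this sketch.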
Funding
Open Access funding enabled and organized by Projekt DEAL. The authors gratefully acknowledge the financial support of this project by the International Research Training Group 2198 (IRTG) "Soft Tissue Robotics." Further, we thank the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) for supporting this work under Germany's Excellence Strategy - EXC 2075 - 390740016.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by: Tobias Breiten
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ehring, T., Haasdonk, B. Hermite kernel surrogates for the value function of high-dimensional nonlinear optimal control problems. Adv Comput Math 50, 36 (2024). https://doi.org/10.1007/s10444-024-10128-5
DOI: https://doi.org/10.1007/s10444-024-10128-5