DOI: 10.1109/IVS.2019.8814124

Controlling an Autonomous Vehicle with Deep Reinforcement Learning

Published: 09 June 2019

Abstract

We present a control approach for autonomous vehicles based on deep reinforcement learning. A neural network agent is trained to map its estimated state to acceleration and steering commands given the objective of reaching a specific target state while considering detected obstacles. Learning is performed using state-of-the-art proximal policy optimization in combination with a simulated environment. Training from scratch takes five to nine hours. The resulting agent is evaluated within simulation and subsequently applied to control a full-size research vehicle. For this, the autonomous exploration of a parking lot is considered, including turning maneuvers and obstacle avoidance. Altogether, this work is among the first examples to successfully apply deep reinforcement learning to a real vehicle.
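The setup described above can be illustrated with a minimal sketch. Note that all names, network sizes, and state features below are hypothetical placeholders, not taken from the paper: a small policy network maps an estimated vehicle state to bounded acceleration and steering commands, and training would optimize PPO's clipped surrogate objective (Schulman et al., 2017) over such rollouts.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for sampled actions
    advantage: advantage estimates (e.g. computed via GAE)
    eps: clip range limiting how far the new policy may move
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Pessimistic bound: take the element-wise minimum, then average.
    return np.minimum(unclipped, clipped).mean()

# Hypothetical tiny policy: a single linear layer mapping a
# 4-dimensional estimated state to 2 actions, squashed by tanh so
# acceleration and steering stay in [-1, 1] (scaled to physical
# limits outside the network).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(2, 4))  # 2 actions, 4 state features

def policy_mean(state):
    return np.tanh(W @ state)

# Example state: e.g. position error, heading error, speed, obstacle cue.
state = np.array([1.0, 0.5, -0.2, 0.0])
accel, steer = policy_mean(state)
```

The clipping term is the core of PPO: when the probability ratio leaves `[1 - eps, 1 + eps]`, the objective stops rewarding further movement in that direction, which keeps policy updates conservative without a trust-region constraint.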


Cited By

  • A Hierarchical Imitation Learning-based Decision Framework for Autonomous Driving. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pp. 4695–4701, Oct 2023. DOI: 10.1145/3583780.3615454
  • A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy. Wireless Communications & Mobile Computing, 2022. DOI: 10.1155/2022/9665421
  • The OPA3L System and Testconcept for Urban Autonomous Driving. 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), pp. 1949–1956, Oct 2022. DOI: 10.1109/ITSC55140.2022.9922416


Published In

2019 IEEE Intelligent Vehicles Symposium (IV), Jun 2019, 2358 pages
Publisher: IEEE Press
