DOI: 10.1109/IVS.2019.8814124

Controlling an Autonomous Vehicle with Deep Reinforcement Learning

Published: 09 June 2019

Abstract

We present a control approach for autonomous vehicles based on deep reinforcement learning. A neural network agent is trained to map its estimated state to acceleration and steering commands given the objective of reaching a specific target state while considering detected obstacles. Learning is performed using state-of-the-art proximal policy optimization in combination with a simulated environment. Training from scratch takes five to nine hours. The resulting agent is evaluated within simulation and subsequently applied to control a full-size research vehicle. For this, the autonomous exploration of a parking lot is considered, including turning maneuvers and obstacle avoidance. Altogether, this work is among the first examples to successfully apply deep reinforcement learning to a real vehicle.
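The setup described above can be illustrated with a minimal sketch. Note that all names, network sizes, and state features below are hypothetical placeholders, not taken from the paper: a small policy network maps an estimated vehicle state to bounded acceleration and steering commands, and training would optimize PPO's clipped surrogate objective (Schulman et al., 2017) over such rollouts.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for sampled actions
    advantage: advantage estimates (e.g. computed via GAE)
    eps: clip range limiting how far the new policy may move
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Pessimistic bound: take the element-wise minimum, then average.
    return np.minimum(unclipped, clipped).mean()

# Hypothetical tiny policy: a single linear layer mapping a
# 4-dimensional estimated state to 2 actions, squashed by tanh so
# acceleration and steering stay in [-1, 1] (scaled to physical
# limits outside the network).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(2, 4))  # 2 actions, 4 state features

def policy_mean(state):
    return np.tanh(W @ state)

# Example state: e.g. position error, heading error, speed, obstacle cue.
state = np.array([1.0, 0.5, -0.2, 0.0])
accel, steer = policy_mean(state)
```

The clipping term is the core of PPO: when the probability ratio leaves `[1 - eps, 1 + eps]`, the objective stops rewarding further movement in that direction, which keeps policy updates conservative without a trust-region constraint.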


Cited By

  • A Hierarchical Imitation Learning-based Decision Framework for Autonomous Driving. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pp. 4695–4701, Oct 2023. DOI: 10.1145/3583780.3615454
  • A Deep Coordination Graph Convolution Reinforcement Learning for Multi-Intelligent Vehicle Driving Policy. Wireless Communications & Mobile Computing, 2022. DOI: 10.1155/2022/9665421
  • The OPA3L System and Testconcept for Urban Autonomous Driving. 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), pp. 1949–1956, Oct 2022. DOI: 10.1109/ITSC55140.2022.9922416


Published In

2019 IEEE Intelligent Vehicles Symposium (IV), Jun 2019, 2358 pages
Publisher: IEEE Press
