research-article

Synthesizing Decentralized Controllers With Graph Neural Networks and Imitation Learning

Published: 01 January 2022

Abstract

Dynamical systems consisting of a set of autonomous agents face the challenge of having to accomplish a global task while relying only on local information. While centralized controllers are readily available, they face limitations in terms of scalability and implementation, as they do not respect the distributed information structure imposed by the network system of agents. Given the difficulties in finding optimal decentralized controllers, we propose a novel framework using graph neural networks (GNNs) to learn these controllers. GNNs are well-suited for the task since they are naturally distributed architectures and exhibit good scalability and transferability properties. We show that GNNs learn appropriate decentralized controllers by means of imitation learning and leverage their permutation invariance properties to successfully scale to larger teams and transfer to unseen scenarios at deployment time. The problems of flocking and multi-agent path planning are explored to illustrate the potential of GNNs in learning decentralized controllers.

Published In

IEEE Transactions on Signal Processing, Volume 70, 2022, 2441 pages

Publisher

IEEE Press
