research-article

Synthesizing Decentralized Controllers With Graph Neural Networks and Imitation Learning

Authors:

Ekaterina Tolstaya,

Alejandro Ribeiro

IEEE Transactions on Signal Processing, Volume 70

Pages 1932 - 1946

https://doi.org/10.1109/TSP.2022.3166401

Published: 01 January 2022

Abstract

Dynamical systems consisting of a set of autonomous agents face the challenge of accomplishing a global task while relying only on local information. While centralized controllers are readily available, they face limitations in scalability and implementation, since they do not respect the distributed information structure imposed by the network of agents. Given the difficulty of finding optimal decentralized controllers, we propose a novel framework that uses graph neural networks (GNNs) to learn these controllers. GNNs are well suited for the task because they are naturally distributed architectures and exhibit good scalability and transferability properties. We show that GNNs learn appropriate decentralized controllers by means of imitation learning and leverage their permutation invariance properties to scale successfully to larger teams and transfer to unseen scenarios at deployment time. The problems of flocking and multi-agent path planning are explored to illustrate the potential of GNNs in learning decentralized controllers.
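To make the idea of a naturally distributed architecture concrete, the following is a minimal sketch, not the authors' implementation, of a single-feature graph convolutional filter acting as a controller. The function names, the path graph, and the tap values are all hypothetical; in the framework described above the taps would be learned by imitating an expert, which this sketch does not do. It shows two things: each output is built from repeated neighbor exchanges only, and relabeling the agents relabels the outputs in the same way (the property the abstract calls permutation invariance, often termed permutation equivariance).

```python
# Illustrative sketch only; every name and number here is an assumption.

def shift(S, x):
    """One communication round: agent i aggregates its neighbors' values, y = S x."""
    n = len(x)
    return [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]

def graph_filter(S, x, taps):
    """Compute u = sum_k taps[k] * S^k x.

    With K taps, agent i's output depends only on agents within K-1 hops,
    so it can be computed with K-1 exchanges between neighbors.
    """
    u = [0.0] * len(x)
    z = list(x)  # z holds S^k x at iteration k
    for k, h in enumerate(taps):
        if k > 0:
            z = shift(S, z)
        u = [ui + h * zi for ui, zi in zip(u, z)]
    return u

def permute(S, x, perm):
    """Relabel agents so that new agent i is old agent perm[i]."""
    Sp = [[S[pi][pj] for pj in perm] for pi in perm]
    xp = [x[pi] for pi in perm]
    return Sp, xp

# Hypothetical 4-agent path graph 0-1-2-3 with one scalar state per agent.
S = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x = [1.0, 2.0, 3.0, 4.0]
taps = [0.5, 0.25, 0.125]  # placeholder values standing in for learned taps

u = graph_filter(S, x, taps)
print(u)  # [1.5, 3.0, 3.875, 3.5]

# Relabeling the agents permutes the control actions identically.
perm = [2, 0, 3, 1]
Sp, xp = permute(S, x, perm)
up = graph_filter(Sp, xp, taps)
print(up == [u[p] for p in perm])  # True
```

Because the same taps are shared by every agent and never depend on the number of agents, the filter can in principle be applied unchanged to a larger team, which is the intuition behind the scalability claim.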

ISSN:1053-587X

1053-587X © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Publisher

IEEE Press
