A structured prediction approach for robot imitation learning

Published: 21 February 2024

Abstract

We propose a structured prediction approach for robot imitation learning from demonstrations. Among the various tools for robot imitation learning, supervised learning plays a prominent role. Structured prediction is a form of supervised learning that enables learning models to operate on output spaces with complex structures. Through the lens of structured prediction, we show how robots can learn to imitate trajectories that belong not only to Euclidean spaces but also to Riemannian manifolds. Exploiting ideas from information theory, we propose a class of loss functions based on the f-divergence to measure the information loss between the demonstrated and reproduced probabilistic trajectories. Different choices of f-divergence result in different policies, which we call imitation modes. Furthermore, our approach enables the incorporation of spatial and temporal trajectory modulation, which is necessary for robots to adapt to changing working conditions. We benchmark our algorithm against state-of-the-art methods in terms of trajectory reproduction and adaptation. The quantitative evaluation shows that our approach outperforms other algorithms in both accuracy and efficiency. We also report real-world experimental results on learning manifold trajectories in a polishing task with a KUKA LWR robot arm, illustrating the effectiveness of our algorithmic framework.
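To make the role of the f-divergence loss concrete, the sketch below computes the Kullback-Leibler divergence, one member of the f-divergence family (corresponding to f(t) = t log t), between two Gaussian distributions that could represent a demonstrated and a reproduced trajectory point. This is a minimal illustrative example under our own assumptions about the probabilistic trajectory representation, not the paper's implementation; all variable and function names are hypothetical.

```python
# Minimal sketch (not the paper's implementation): closed-form KL divergence,
# a member of the f-divergence family, between two multivariate Gaussians
# standing in for a demonstrated and a reproduced trajectory point.
import numpy as np

def gaussian_kl(mu_p, cov_p, mu_q, cov_q):
    """KL(P || Q) for multivariate Gaussians P = N(mu_p, cov_p), Q = N(mu_q, cov_q)."""
    k = mu_p.shape[0]
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    trace_term = np.trace(cov_q_inv @ cov_p)          # tr(Sigma_q^{-1} Sigma_p)
    quad_term = diff @ cov_q_inv @ diff               # Mahalanobis term
    logdet_term = np.log(np.linalg.det(cov_q) / np.linalg.det(cov_p))
    return 0.5 * (trace_term + quad_term - k + logdet_term)

# Hypothetical 2D example: demonstrated vs. reproduced trajectory point.
mu_demo, cov_demo = np.array([0.0, 0.0]), np.eye(2)
mu_repro, cov_repro = np.array([0.1, -0.2]), 0.8 * np.eye(2)
print(gaussian_kl(mu_demo, cov_demo, mu_repro, cov_repro))
```

Other choices of the convex function f (e.g., the ones underlying the total variation or Hellinger distances) would yield different divergences and, in the paper's terminology, different imitation modes.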

Published In

International Journal of Robotics Research, Volume 43, Issue 2, February 2024, 126 pages

Publisher

Sage Publications, Inc., United States

Author Tags

1. Imitation learning
2. structured prediction
3. learning and adaptive systems
4. kernel methods
5. Riemannian manifolds

Qualifiers

• Research-article
