DOI: 10.5555/3635637.3662926 · AAMAS Conference Proceedings · Research Article

MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning

Published: 06 May 2024

Abstract

The visual world provides an abundance of information, but many of the pixels an agent receives contain distracting stimuli. Autonomous agents need the ability to distinguish useful information from task-irrelevant perceptions, enabling them to generalize to unseen environments with new distractions. Existing works approach this problem using data augmentation or large auxiliary networks with additional loss functions. We introduce MaDi, a novel algorithm that learns to mask distractions using only the reward signal. In MaDi, the conventional actor-critic structure of deep reinforcement learning agents is complemented by a small third sibling, the Masker. This lightweight neural network generates a mask that determines what the actor and critic receive, so that they can focus on learning the task. The masks are created dynamically, depending on the current input. We run experiments on the DeepMind Control Generalization Benchmark, the Distracting Control Suite, and a real UR5 robotic arm. Our algorithm improves the agent's focus with useful masks, while its efficient Masker network adds only 0.2% more parameters to the original structure, in contrast to previous work. MaDi consistently achieves generalization results better than or competitive with state-of-the-art methods.
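
The abstract describes the mechanism only at a high level: a lightweight Masker network produces a per-pixel mask that is applied to the observation before it reaches the actor and critic. The sketch below illustrates that idea; it is a hypothetical PyTorch rendering, not the authors' implementation, and the class name, layer sizes, and frame shape are assumptions chosen for illustration.

import torch
import torch.nn as nn

class Masker(nn.Module):
    """Tiny conv net producing a soft per-pixel mask in (0, 1).

    Hypothetical sketch: layer sizes are illustrative, not the paper's.
    """
    def __init__(self, in_channels: int = 3, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),  # squash mask values into (0, 1)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        mask = self.net(obs)  # (B, 1, H, W); broadcasts over color channels
        return mask * obs     # masked observation for the actor and critic

masker = Masker()
obs = torch.rand(8, 3, 84, 84)  # batch of 84x84 RGB frames
masked_obs = masker(obs)        # actor/critic losses backpropagate through
                                # the mask, so the Masker trains from reward alone

Because the mask simply multiplies the input, no auxiliary loss is required in this sketch: gradients from the reward-driven actor and critic objectives flow through the mask and train the Masker end to end, consistent with the abstract's claim of learning from the reward signal only.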

Cited By

  • (2024) Large Learning Agents: Towards Continually Aligned Robots with Scale in RL. Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2746-2748. DOI: 10.5555/3635637.3663274. Online publication date: 6 May 2024.

Published In

AAMAS '24: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems
May 2024
2898 pages
ISBN: 9798400704864

Publisher

International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC

Author Tags

  1. deep reinforcement learning
  2. generalization
  3. robotics

Qualifiers

  • Research-article

Funding Sources

  • Dutch Research Council (NWO)
  • Alberta Machine Intelligence Institute (Amii); a Canada CIFAR AI Chair, Amii; Compute Canada; Huawei; Mitacs; and NSERC
  • SURF Cooperative

Conference

AAMAS '24

Acceptance Rates

Overall acceptance rate: 1,155 of 5,036 submissions (23%)
