DOI: 10.5555/1625275.1625278

Article

Learning and multiagent reasoning for autonomous agents

Published: 06 January 2007

Abstract

One goal of Artificial Intelligence is to enable the creation of robust, fully autonomous agents that can coexist with us in the real world. Such agents will need to be able to learn, both in order to correct and circumvent their inevitable imperfections, and to keep up with a dynamically changing world. They will also need to be able to interact with one another, whether they share common goals, they pursue independent goals, or their goals are in direct conflict. This paper presents current research directions in machine learning, multiagent reasoning, and robotics, and advocates their unification within concrete application domains. Ideally, new theoretical results in each separate area will inform practical implementations while innovations from concrete multiagent applications will drive new theoretical pursuits, and together these synergistic research approaches will lead us towards the goal of fully autonomous agents.
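To make the kind of learning discussed above concrete, here is a minimal illustrative sketch (not from the paper itself) of tabular Q-learning, a standard value-based reinforcement learning algorithm of the sort an autonomous agent could use to adapt its behavior from experience. The toy chain-world MDP, the function name, and all parameters are hypothetical choices for this example.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain MDP.

    States are 0..n_states-1; action 0 moves left, action 1 moves right
    (clipped at the ends). A reward of 1.0 is given only on reaching the
    rightmost state, which ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection balances exploration/exploitation
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard one-step Q-learning update toward the bootstrapped target
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q
```

After training, the greedy policy at every non-terminal state prefers moving right, and the learned values decay geometrically (by the discount factor) with distance from the goal. Real robotic domains require function approximation and far more care, which is part of what motivates the research directions the paper surveys.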



Published In

IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence
January 2007, 2953 pages
Editors: Rajeev Sangal, Harish Mehta, R. K. Bagga

Sponsors

The International Joint Conferences on Artificial Intelligence, Inc.

Publisher

Morgan Kaufmann Publishers Inc., San Francisco, CA, United States
