DOI: 10.5555/1625275.1625278

Article

Learning and multiagent reasoning for autonomous agents

Published: 06 January 2007

Abstract

One goal of Artificial Intelligence is to enable the creation of robust, fully autonomous agents that can coexist with us in the real world. Such agents will need to be able to learn, both in order to correct and circumvent their inevitable imperfections, and to keep up with a dynamically changing world. They will also need to be able to interact with one another, whether they share common goals, they pursue independent goals, or their goals are in direct conflict. This paper presents current research directions in machine learning, multiagent reasoning, and robotics, and advocates their unification within concrete application domains. Ideally, new theoretical results in each separate area will inform practical implementations while innovations from concrete multiagent applications will drive new theoretical pursuits, and together these synergistic research approaches will lead us towards the goal of fully autonomous agents.
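To make the kind of learning discussed above concrete, here is a minimal illustrative sketch (not from the paper itself) of tabular Q-learning, a standard value-based reinforcement learning algorithm of the sort an autonomous agent could use to adapt its behavior from experience. The toy chain-world MDP, the function name, and all parameters are hypothetical choices for this example.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain MDP.

    States are 0..n_states-1; action 0 moves left, action 1 moves right
    (clipped at the ends). A reward of 1.0 is given only on reaching the
    rightmost state, which ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # Q[state][action]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection balances exploration/exploitation
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # standard one-step Q-learning update toward the bootstrapped target
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q
```

After training, the greedy policy at every non-terminal state prefers moving right, and the learned values decay geometrically (by the discount factor) with distance from the goal. Real robotic domains require function approximation and far more care, which is part of what motivates the research directions the paper surveys.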



Published In

IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence
January 2007, 2953 pages
Editors: Rajeev Sangal, Harish Mehta, R. K. Bagga

Sponsors

The International Joint Conferences on Artificial Intelligence, Inc.

Publisher

Morgan Kaufmann Publishers Inc., San Francisco, CA, United States
