DOI: 10.5555/1625135.1625265

Training and tracking in robotics

Published: 18 August 1985

Abstract

We explore the use of learning schemes in training and adapting performance on simple coordination tasks. The task is 1-D pole balancing, which several programs incorporating learning have already achieved (1, 3, 8): the problem is to move a cart along a short piece of track so as to keep a pole balanced on its end; the pole is hinged to the cart at its bottom, and the cart is moved either to the left or to the right by a force of constant magnitude. The form of the task considered here, after (3), involves a genuinely difficult credit-assignment problem. We use a learning scheme previously developed and analysed (1, 7) to achieve performance through reinforcement, and extend it to include changing and new requirements. For example, the length or mass of the pole can change, as can the bias of the force, its strength, and so on; and the system can be tasked to avoid certain regions altogether. In this way we explore the learning system's ability to adapt to changes and to profit from a selected training sequence, both of which are of obvious utility in practical robotics applications.
The results described here were obtained using a computer simulation of the pole-balancing problem. A movie will be shown of the performance of the system under the various requirements and tasks.
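The cart-pole dynamics underlying such a simulation are standard in this literature. Below is a minimal Euler-integration sketch of the equations of motion (the parameter values, function name, and state layout are illustrative assumptions, not taken from this paper):

```python
import math

# Illustrative parameters -- assumed values, not from the paper.
GRAVITY = 9.8
CART_MASS = 1.0
POLE_MASS = 0.1
POLE_HALF_LENGTH = 0.5   # half the pole length, in metres
FORCE_MAG = 10.0         # constant-magnitude push, applied left or right
DT = 0.02                # Euler integration time step, in seconds

def step(state, push_right):
    """Advance the state (x, x_dot, theta, theta_dot) by one Euler step.

    x is the cart position, theta the pole angle from vertical
    (positive = leaning right); push_right selects the force direction.
    """
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if push_right else -FORCE_MAG
    total_mass = CART_MASS + POLE_MASS
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Effective acceleration of the pivot from the push and the pole's swing.
    temp = (force + POLE_MASS * POLE_HALF_LENGTH
            * theta_dot ** 2 * sin_t) / total_mass
    # Angular acceleration of the pole.
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH
        * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass))
    # Linear acceleration of the cart.
    x_acc = temp - (POLE_MASS * POLE_HALF_LENGTH
                    * theta_acc * cos_t / total_mass)
    return (x + DT * x_dot, x_dot + DT * x_acc,
            theta + DT * theta_dot, theta_dot + DT * theta_acc)
```

Changing the task as the abstract describes (pole mass or length, force bias or strength) amounts to varying these constants while the learning system continues to run.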

References

[1]
Barto, A. G., Sutton, R. S., & Anderson, C. W. Neuronlike elements that can solve difficult learning control problems. IEEE Trans. on Systems, Man, and Cybernetics, vol. SMC-13, 1983, 834-846.
[2]
Cannon, R. H. Dynamics of Physical Systems. New York: McGraw-Hill, 1967.
[3]
Michie, D., & Chambers, R. A. BOXES: An experiment in adaptive control. In Dale, E., & Michie, D. (Eds.), Machine Intelligence 2. Edinburgh: Oliver and Boyd, 1968, 137-152.
[4]
Minsky, M. L., & Selfridge, O. G. Learning in random nets. In C. Cherry (Ed.), Information Theory: Fourth London Symposium. London: Butterworths, 1961.
[5]
Mitchell, T. M. Learning and problem solving. Proc. IJCAI 83, 1983, 1139-1151.
[6]
Samuel, A. L. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 1959, 3, 210-229.
[7]
Sutton, R. S. Temporal aspects of credit assignment in reinforcement learning. Univ. of Massachusetts Technical Report 84-2, 1984.
[8]
Widrow, B., & Smith, F. W. Pattern-recognizing control systems. In Tou, J. T., & Wilcox, R. H. (Eds.), Computer and Information Sciences. Washington, D.C.: Spartan Books, 1964, 288-317.
[9]
Winston, P. H. Learning structural descriptions from examples. In Winston, P. H. (Ed.) The Psychology of Computer Vision. New York: McGraw-Hill, 1975, 157-209.


Published In

IJCAI'85: Proceedings of the 9th international joint conference on Artificial intelligence - Volume 1
August 1985
700 pages
ISBN: 0934613028

Sponsors

  • The International Joint Conferences on Artificial Intelligence, Inc.

Publisher

Morgan Kaufmann Publishers Inc.

San Francisco, CA, United States

