Where Do Rewards Come From? on Dec 16, 2009 @ MLSS, CSE IIT Madras

Where Do Rewards Come From?

Prof. Andy Barto (Department of CSE, University of Massachusetts - Amherst)
Dec 16, 2009 @ 11:00 pm
BSB 361, Dept. of CSE, IIT Madras

Abstract

I describe a series of computational experiments recently carried out by Satinder Singh, Rick Lewis, and me that elucidate aspects of the relationship between ultimate goals (e.g., reproductive success for an animal) and the primary rewards that drive learning. Among the lessons of these experiments are a clarification of the traditional notions of extrinsically and intrinsically motivated behavior, and the observation that the precise form of an optimal reward function need not bear a transparent relationship to an agent's ultimate goal.

Bio

Andrew Barto is Professor of Computer Science at the University of Massachusetts Amherst. He has been Chair of the UMass Department of Computer Science since 2007. He received his B.S. with distinction in mathematics from the University of Michigan in 1970, and his Ph.D. in Computer Science in 1975, also from the University of Michigan. He joined the Computer Science Department of the University of Massachusetts Amherst in 1977 as a Postdoctoral Research Associate, became an Associate Professor in 1982, and has been a Full Professor since 1991. He is Co-Director of the Autonomous Learning Laboratory and a core faculty member of the Neuroscience and Behavior Program of the University of Massachusetts. His research centers on learning in natural and artificial systems, and he has studied machine learning algorithms since 1977, contributing to the development of the computational theory and practice of reinforcement learning. His current research centers on what psychologists call intrinsically motivated behavior, meaning behavior that is done for its own sake rather than as a step toward solving a specific problem. Recent work is aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that form the building blocks for open-ended learning. He currently serves as an associate editor of Neural Computation and as a member of the editorial boards of the Journal of Machine Learning Research, Adaptive Behavior, and Theoretical Computer Science-C: Natural Computing. Professor Barto is a Fellow of the American Association for the Advancement of Science, a Fellow and Senior Member of the IEEE, and a member of the American Association for Artificial Intelligence and the Society for Neuroscience. He received the 2004 IEEE Neural Networks Society Pioneer Award for contributions to the field of reinforcement learning. He has published over one hundred papers or chapters in journals, books, and conference and workshop proceedings.
He is co-author with Richard Sutton of the book "Reinforcement Learning: An Introduction," MIT Press, 1998, and co-editor with Jennie Si, Warren Powell, and Don Wunsch II of the "Handbook of Learning and Approximate Dynamic Programming," Wiley-IEEE Press, 2004.