“Goals” in reinforcement learning

I want to create a mind that, through its own learning, makes real-time decisions. These decisions would be optimized to accomplish a goal, which in reinforcement learning can be represented as maximizing cumulative reward. In this sense, goals shape both behavior and decision making.
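As a concrete illustration of "goal as reward", here is a minimal sketch of an agent whose decisions are shaped entirely by a reward signal, in a simple multi-armed bandit setting. Everything here (`N_ACTIONS`, `estimates`, `choose_action`) is illustrative, not from any particular library.

```python
import random

# Minimal sketch: the agent's only "goal" is the reward signal.
# It keeps a running estimate of each action's reward and mostly
# picks whichever action has looked best so far.
N_ACTIONS = 4
EPSILON = 0.1                    # chance of exploring a random action
estimates = [0.0] * N_ACTIONS    # running estimate of each action's reward
counts = [0] * N_ACTIONS

def choose_action():
    # The goal enters only through the reward estimates.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: estimates[a])

def update(action, reward):
    # Incremental average: pull the estimate toward the observed reward.
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]
```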

Constraining this intelligence around goals, however, seems limiting. Imagine instead an intelligent mind that could take an input stream and project all possible futures based on its actions. The ability to make such predictions would produce the ultimate mind.

However, projecting an unconstrained future in this way is impractical, at least for now, for two reasons: performance and learning. For performance, an agent without a goal is essentially performing a random walk. It is akin to a water skier with no goal, performing random actions; without a goal, the skier will fall almost immediately. Learning without a goal is equally problematic. With a goal, an agent can try different actions and assess how successful each was toward the end goal. The goal acts as a pruning mechanism that determines which temporally extended actions are worth continuing to learn about. A water skier learning to slalom can keep trying different levels of aggression in their edge change; if the goal were instead to learn trick skiing, the actions tried would be very different. A sketch of this pruning idea follows.
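Here is a hypothetical sketch of that pruning mechanism using tabular Q-learning: the reward signal (the goal) scores each action tried, and actions whose estimated value stays low stop being selected. The environment interface (`env.reset` / `env.step`) and the `Q` table are assumptions for illustration, not a specific library's API.

```python
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1

def q_learning_episode(env, Q):
    # Q maps each state to a list of per-action value estimates.
    state = env.reset()
    done = False
    while not done:
        # Mostly exploit the best-valued action; occasionally explore.
        if random.random() < EPSILON:
            action = random.randrange(len(Q[state]))
        else:
            action = max(range(len(Q[state])), key=lambda a: Q[state][a])
        # Assumed interface: step returns (next_state, reward, done).
        next_state, reward, done = env.step(action)
        # The goal enters here: reward plus the value of the best next
        # action measures how well this action served the end goal.
        target = reward + (0 if done else GAMMA * max(Q[next_state]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = next_state
```

Actions that consistently produce low targets see their Q-values sink and are chosen less often, which is the pruning: the goal, expressed as reward, decides which behaviors remain worth refining.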
