Dynamic Horde

I’m just finishing up a research project on an architecture for a dynamic set of general value functions. I’ll share the documentation and source on http://github.com/dquail as usual. In the meantime, I wanted to share the abstract.

General value functions (GVFs) have proven effective at answering predictive questions about the future. However, answering a single predictive question in isolation has limited utility. Others have demonstrated further utility by using GVFs to dynamically compose more abstract questions (Ring 2017) or to optimize control (Modayil & Sutton 2014); in other words, by feeding the predictions back into the system. But these demonstrations have relied on a static set of GVFs, handcrafted by a human designer.

In this paper, we look to extend the Horde architecture (Sutton et al. 2011) to not only feed the GVFs back into the system, but to do so dynamically. In doing so, we explore ways to control the lifecycle of the GVFs contained in a Horde, mainly by creating, testing, culling, and recreating them, in an attempt to maximize some objective.
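
To make the create, test, cull, and recreate lifecycle a bit more concrete, here is a minimal Python sketch. It is not the implementation from the paper. Every name in it (`GVF`, `DynamicHorde`, `cull_and_recreate`, `min_age`) is hypothetical, each GVF is assumed to be a simple linear TD(0) learner, new GVFs are assumed to be generated at random, and a running average of the absolute TD error stands in for the unspecified objective.

```python
import random


class GVF:
    """Linear TD(0) predictor for one question (hypothetical, for illustration)."""

    def __init__(self, n_features, cumulant_index, gamma, alpha=0.1):
        self.w = [0.0] * n_features
        self.cumulant_index = cumulant_index  # which observed signal to predict
        self.gamma = gamma                    # discount, i.e. the prediction timescale
        self.alpha = alpha                    # step size
        self.avg_error = 0.0                  # running mean of |TD error| (the "test" score)
        self.age = 0                          # number of updates seen so far

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, obs_next, x_next):
        cumulant = obs_next[self.cumulant_index]
        delta = cumulant + self.gamma * self.predict(x_next) - self.predict(x)
        self.w = [wi + self.alpha * delta * xi for wi, xi in zip(self.w, x)]
        self.avg_error += 0.05 * (abs(delta) - self.avg_error)
        self.age += 1
        return delta


class DynamicHorde:
    """A fixed-size set of GVFs whose members are periodically replaced."""

    def __init__(self, size, n_features, n_signals, rng=None, min_age=50):
        self.n_features = n_features
        self.n_signals = n_signals
        self.rng = rng or random.Random()
        self.min_age = min_age  # young GVFs are protected from culling
        self.gvfs = [self.create() for _ in range(size)]

    def create(self):
        # Assumption: new questions are sampled at random.
        return GVF(
            self.n_features,
            cumulant_index=self.rng.randrange(self.n_signals),
            gamma=self.rng.choice([0.0, 0.5, 0.9]),
        )

    def step(self, x, obs_next, x_next):
        # Every GVF learns from the same transition.
        for gvf in self.gvfs:
            gvf.update(x, obs_next, x_next)

    def cull_and_recreate(self, n=1):
        # Test: rank mature GVFs by running error; cull the worst n; recreate.
        mature = [g for g in self.gvfs if g.age >= self.min_age]
        worst = sorted(mature, key=lambda g: g.avg_error, reverse=True)[:n]
        for gvf in worst:
            self.gvfs[self.gvfs.index(gvf)] = self.create()
        return len(worst)
```

In a sketch like this, `step` would be called on every transition and `cull_and_recreate` every so often. Protecting GVFs younger than `min_age` is one simple way to avoid discarding a prediction before it has had time to learn; a real system would likely score GVFs by how useful they are to the rest of the system instead of by raw error.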
