Markov Decision Process in Reinforcement Learning

I’ve been reading “Reinforcement Learning: An Introduction” by Sutton and Barto and taking notes on each chapter as I go.

Chapter 3, on Markov Decision Processes, took more time to understand than I’d like to admit. I’ll blame it on the larger-than-usual amount of statistics and math, combined with 10+ years away from university.

Nonetheless, I took notes as I slogged my way to a somewhat better understanding. Hopefully someone finds the notes attached below useful.

cmput-609-chapter-3

Perception as Prediction

What is perception? What does the world believe perception is? How does supervised learning model perception? Presumably through labeled data? What does Reinforcement Learning suggest perception is? State and action combinations? Where are these two fields right? Where are they wrong? Where do they overlap? Is perception the same as prediction? Is perception just an opportunity for action? Or perhaps an opportunity for interaction (action implies I’m acting on an object, when perhaps the object could act on me)? Furthermore, maybe I only perceive things that do in fact offer an opportunity to interact. A near-infinite number of things pass by in day-to-day life without notice.

These questions warrant consideration given the current state of artificial intelligence (and my understanding of it!).

Paleo … again

Since I competed at the Canadian Ultimate Championships in the middle of August, my diet has been … pretty terrible. This is a fairly rare occurrence for me … I’m usually a stickler for eating healthy. Chocolate, trail mix, granola … I have a love-hate relationship with you!

So tomorrow … I’m trying paleo again. In the past I’ve felt great – mentally and physically – while on it.

“Blogging” more

I obviously use “blogging” loosely here.

Since starting grad school a few weeks ago, I’ve allowed myself to believe more in crystallizing my thoughts and putting pen to paper. Not surprisingly, I’ve noticed thoughts of my own, or concepts I’ve read about, finally coming together when I try to describe them to others.

So I’m going to try to write more, and use this blog as a landing spot. For the most part, I won’t be spending as much time, thought, or editorial effort on these posts. Consider them somewhere between a Facebook status update and a well-thought-out blog post. Most of them will be related to AI, and I’d guess a few will be about family, fitness, sport, and other random ideas.

David Quail

Random thoughts and rants

Month: September 2016
