Hypothesis

5 Matching Annotations

Feb 2023
arxiv.org

2010.03950.pdf

4
1. mark.crowley 16 Feb 2023
  
  in Public
  
  Definition 3.2 (simple reward machine).
  
  The MDP does not change: its dynamics are the same with or without the RM, just as they are with or without a standard reward model. Additionally, the rewards from the RM can be non-Markovian with respect to the MDP, because the RM inherently has a kind of memory of where you've been. That memory is limited to the agent's "movement" (almost "in its mind") along the goals for this task.
  
  reinforcement-learning reward-machines
2. mark.crowley 16 Feb 2023
  
  in Public
  
  We then show that an RM can be interpreted as specifying a single reward function over a larger state space, and consider types of reward functions that can be expressed using RMs
  
  So by specifying a reward machine you are augmenting the state space of the MDP with higher-level goals, subgoals, and concepts that provide structure about what is good and what isn't.
  
  reinforcement-learning reward-machines
3. mark.crowley 16 Feb 2023
  
  in Public
  
  However, an agent that had access to the specification of the reward function might be able to use such information to learn optimal policies faster.
  
  Fascinating idea. Why not? Why are we really hiding the reward from the agent?
  
  reinforcement-learning reward-machines
4. mark.crowley 02 Feb 2023
  
  in Public
  
  Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning
  
  [Icarte, JAIR, 2022] "Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning"
  
  reinforcement-learning reward-machines
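
The notes on Definition 3.2 and on the "larger state space" can be made concrete with a small sketch. This is not the paper's code: the class name `SimpleRewardMachine`, the helper `run`, the coffee/office task, and the state names `u0`, `u1`, `done` are all hypothetical, and the machine is simplified to fire on single named events rather than on full truth assignments over propositions. It illustrates two points from the annotations: the environment's states are untouched by the RM, and the same environment state can earn different rewards depending on the RM state, which is why the reward becomes Markovian only over pairs (environment state, RM state).

```python
class SimpleRewardMachine:
    """Illustrative reward machine: a finite-state machine that emits rewards.

    transitions maps (rm_state, event) -> (next_rm_state, reward).
    This is a simplification made for the sketch, not the paper's exact
    definition.
    """

    def __init__(self, initial_state, transitions):
        self.initial_state = initial_state
        self.transitions = transitions

    def step(self, u, events):
        """Advance from RM state u given the set of events true this step."""
        for event in sorted(events):
            if (u, event) in self.transitions:
                return self.transitions[(u, event)]
        # No matching event: stay in the same RM state with zero reward.
        return u, 0.0


def run(rm, labelled_trajectory):
    """Pair each environment state with the RM state and the reward earned."""
    u = rm.initial_state
    trace = []
    for env_state, events in labelled_trajectory:
        u_next, reward = rm.step(u, events)
        trace.append(((env_state, u), reward))  # augmented state (s, u)
        u = u_next
    return trace


# Hypothetical task: fetch coffee first, then deliver it to the office.
rm = SimpleRewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("done", 1.0),
    },
)

# The environment trajectory is independent of the RM; only rewards differ.
trajectory = [
    ("office", {"office"}),   # office visited before coffee
    ("kitchen", {"coffee"}),  # coffee picked up
    ("office", {"office"}),   # office visited after coffee
]

trace = run(rm, trajectory)
for augmented_state, reward in trace:
    print(augmented_state, reward)
```

The first and last steps share the environment state `office` but receive different rewards, so the reward is non-Markovian in the environment state alone. Once the RM state is carried alongside it, the pairs `("office", "u0")` and `("office", "u1")` are distinct and each has a single well-defined reward.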

Tags

reinforcement-learning

reward-machines

Annotators

mark.crowley

URL

arxiv.org/pdf/2010.03950.pdf
proceedings.mlr.press

Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning

1
1. mark.crowley 16 Feb 2023
  
  in Public
  
  Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning
  
  [Icarte, PMLR, 2018] "Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning"
  
  reinforcement-learning reward-machines

Tags

reinforcement-learning

reward-machines

Annotators

mark.crowley

URL

proceedings.mlr.press/v80/icarte18a/icarte18a.pdf
