Hypothesis

6 Matching Annotations

Dec 2022
Local file

Untitled document

1
1. Kasperkpedersen 27 Dec 2022
  
  in Public
  
  The log_lik matrix
  
  I skipped this part; it is something about showing the comparison of two models.
Annotators

Kasperkpedersen
Nov 2022
Local file

Untitled document

1
1. Kasperkpedersen 12 Nov 2022
  
  in Public
  
  There is no back-door path through Q, as you can see. But there is a non-causal path from Q to W through U: Q → E ← U → W.
  
  We don't know which side of a basketball game is the right one: it could be the underdog, the favorite, or any team. Anything can happen.
Annotators

Kasperkpedersen
Sep 2022
incompleteideas.net

RLbook2020.pdf

4
1. Kasperkpedersen 28 Sep 2022
  
  in Public
  
  aphically represents the Bellman optimality equation (3.19) and the backup diagram on the right graphically represents (3.20)
  
  I got lost here
2. Kasperkpedersen 28 Sep 2022
  
  in Public
  
  the agent selects all four actions with equal probability in all states
  
  Is this a policy? So if we changed the probabilities with which actions are chosen, would we alter the policy?
3. Kasperkpedersen 28 Sep 2022
  
  in Public
  
  they satisfy recursive relationships simila
  
  How can I see that these functions are recursive?
4. Kasperkpedersen 28 Sep 2022
  
  in Public
  
  $= \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma v_\pi(s') \right], \quad \text{for all } s \in \mathcal{S},$
  
  I have trouble understanding this formula and how it can easily be read as an expected value. Why do we merge the two sums, one over all values of s' and the other over all values of r? What are we trying to accomplish here?
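
One possible way to read the highlighted formula as an expected value, sketched in Python. Everything here is an assumption made for illustration: the two-state MDP in `P`, the policy table `PI`, the discount `GAMMA = 0.9`, and the function names are hypothetical and not taken from the book. The sketch suggests that s' and r share a single sum because they are drawn together from one joint distribution p(s', r | s, a), and that the nested sums give the same number as one probability-weighted average over all (a, s', r) triples. It also shows that a policy is just the table of probabilities pi(a | s), and that v_pi appears on both sides of the equation, which is the recursive relationship.

```python
GAMMA = 0.9  # discount factor (assumed value for this sketch)

# Hypothetical toy MDP: P[(s, a)] maps (next_state, reward) -> probability.
# Each inner dict is a joint distribution over (s', r), so it sums to 1.
P = {
    ("A", "left"):  {("A", 0.0): 0.5, ("B", 1.0): 0.5},
    ("A", "right"): {("B", 2.0): 1.0},
    ("B", "left"):  {("A", 0.0): 1.0},
    ("B", "right"): {("B", -1.0): 0.25, ("A", 3.0): 0.75},
}

# The policy pi(a | s). Changing these probabilities changes the policy.
PI = {
    "A": {"left": 0.5, "right": 0.5},
    "B": {"left": 0.5, "right": 0.5},
}


def backup_nested(s, v):
    """Right-hand side of the formula, written as two nested sums."""
    total = 0.0
    for a, prob_a in PI[s].items():
        inner = 0.0
        for (s_next, r), prob_sr in P[(s, a)].items():
            inner += prob_sr * (r + GAMMA * v[s_next])
        total += prob_a * inner
    return total


def backup_joint(s, v):
    """Same quantity as one expectation over all (a, s', r) triples."""
    return sum(
        PI[s][a] * prob_sr * (r + GAMMA * v[s_next])
        for a in PI[s]
        for (s_next, r), prob_sr in P[(s, a)].items()
    )


def evaluate(sweeps=500):
    """Apply the backup repeatedly until v stops changing (approximately)."""
    v = {"A": 0.0, "B": 0.0}
    for _ in range(sweeps):
        v = {s: backup_nested(s, v) for s in v}
    return v


if __name__ == "__main__":
    guess = {"A": 1.0, "B": 2.0}
    print(backup_nested("A", guess), backup_joint("A", guess))
    print(evaluate())
```

After enough sweeps, the values returned by `evaluate` reproduce themselves under `backup_nested`, which is what it means for v_pi to satisfy the equation for all s.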
Annotators

Kasperkpedersen

URL

incompleteideas.net/book/RLbook2020.pdf
