Hypothesis

1 Matching Annotations

Mar 2015
www.youtube.com

Deep Learning Lecture 15: Deep Reinforcement Learning - Policy search

1. gerben 19 Mar 2015
  
  in Public
  
  Too long; didn't watch: The lecture starts with an introduction to reinforcement learning. From 20:30 he formalises the problem, and in the last ten minutes (from 45:00) he derives how to compute policy gradients using backprop, supposedly the same method DeepMind used to learn to play arcade games.
  
  tl;dw
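
For readers who skip the video, here is a minimal sketch of what "computing a policy gradient" can look like. It is a REINFORCE-style update on a toy two-armed bandit with a softmax policy, chosen for brevity; it is not taken from the lecture and is not claimed to be DeepMind's method. The names `softmax`, `grad_log_prob`, and `train_bandit`, the reward values, and the learning rate are all assumptions made for illustration. For this one-layer policy, the gradient of the log-probability with respect to the logits is what backprop would produce, so it is written out by hand.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(probs, action):
    """Gradient of log pi(action) w.r.t. the logits of a softmax policy.

    For a softmax this is one_hot(action) - probs, which is the quantity
    backprop would compute for a single-layer policy.
    """
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def train_bandit(rewards=(0.0, 1.0), steps=2000, lr=0.1, seed=0):
    """REINFORCE on a toy bandit (hypothetical setup, fixed rewards per arm)."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Policy gradient estimate: reward * grad log pi(action).
        grad = grad_log_prob(probs, action)
        logits = [w + lr * reward * g for w, g in zip(logits, grad)]
    return softmax(logits)


if __name__ == "__main__":
    print(train_bandit())
```

With these assumed rewards the policy should shift most of its probability onto the second arm. A deep network would replace the list of logits, with backprop carrying the same gradient back through the layers.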

URL

youtube.com/watch