- Jul 2023
-
arxiv.org
-
This is the paper that introduced the PPO algorithm. PPO is, in a way, a response to TRPO: it keeps TRPO's core idea of limiting how far each update moves the policy, but implements it with a simpler and more efficient algorithm.
TRPO frames each policy update as an explicit constrained optimization problem, rather than as a plain gradient-based learning step.
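To make the contrast concrete, here is a minimal sketch of PPO's clipped surrogate objective, which replaces TRPO's explicit constraint with a simple clip on the probability ratio. The function name and the list-based inputs are illustrative assumptions, not code from the paper; the default `clip_eps=0.2` is the value the paper reports using in its experiments.

```python
import math

def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. All names here are hypothetical.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the objective stops improving once the ratio leaves the clip range in the favorable direction, ordinary stochastic gradient ascent can be run for several epochs on the same batch, with no constrained solver needed.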