Hypothesis

1 Matching Annotations

May 2026
openai.com

https://openai.com/index/where-the-goblins-came-from/

1. fxp007 01 May 2026
  
  in Public
  
  We unknowingly gave particularly high rewards for metaphors with creatures.
  
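  This points to a best practice: when training a model, design the reward mechanism carefully so that it does not inadvertently encourage undesired behavior.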
  
  best-practice reward-mechanism
