2 Matching Annotations
  1. Aug 2023
    1. Outer alignment asks the question - "What should we aim our model at?" In other words, is the model optimizing for the correct reward such that there are no exploitable loopholes? It is also known as the reward misspecification problem.

      [!NOTE] What does Outer Alignment / the Reward Misspecification Problem refer to?

      flashcard

      Whether the model is optimizing toward the goal humans actually intend (see the sketch below).
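
      A minimal, hypothetical sketch (not from the annotated source) of the exploitable-loophole idea: the intended objective is a clean room, but the proxy reward only counts cleaning actions, so an optimizer can score highly while leaving the room dirty. The function names and numbers are illustrative assumptions.

      ```python
      # Hypothetical toy example of a misspecified (exploitable) proxy reward.

      def intended_objective(dirt_remaining: int) -> float:
          """What humans actually want: as little dirt left as possible."""
          return -float(dirt_remaining)

      def proxy_reward(cleaning_actions: int) -> float:
          """What the agent is actually rewarded for: the number of cleaning actions."""
          return float(cleaning_actions)

      # An aligned policy cleans the 5 existing units of dirt and stops.
      print(proxy_reward(5), intended_objective(0))      # 5.0 -0.0

      # A reward-hacking policy makes new messes and re-cleans them:
      # higher proxy reward, yet the room is left dirtier than under the aligned policy.
      print(proxy_reward(100), intended_objective(5))    # 100.0 -5.0
      ```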

    1. Inner alignment asks the question - "Is the model trying to do what humans want it to do?", or, in other words, can we robustly aim our AI optimizers at any objective function at all?

      [!NOTE] What is the basic idea of Inner Alignment?

      flashcard

      Making sure the model is actually optimizing toward the specified objective (see the sketch below).
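
      A hypothetical sketch of inner misalignment (closely related to goal misgeneralization): a policy that learned the proxy objective "move toward green" behaves identically to the intended objective "move toward the goal" on the training distribution, and only diverges once that correlation breaks. All names and observations below are made up for illustration.

      ```python
      # Hypothetical illustration: the learned (mesa) objective matches the
      # intended (base) objective in training, then diverges at deployment.

      def learned_policy(observation: dict) -> str:
          """Objective the model actually acquired: head toward the green tile."""
          return observation["direction_of_green"]

      def intended_action(observation: dict) -> str:
          """Objective humans intended: head toward the goal tile."""
          return observation["direction_of_goal"]

      # Training distribution: the goal tile is always green, so the two agree.
      train_obs = {"direction_of_green": "left", "direction_of_goal": "left"}
      assert learned_policy(train_obs) == intended_action(train_obs)

      # Deployment: the goal is no longer green; the policy still chases green.
      deploy_obs = {"direction_of_green": "right", "direction_of_goal": "left"}
      print(learned_policy(deploy_obs), "vs intended", intended_action(deploy_obs))
      # -> right vs intended left
      ```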