- Nov 2023
-
proceedings.mlr.press
-
Reading this one on Nov 27, 2023 for the reading group.
-
-
proceedings.neurips.cc
-
Reading this one on Nov 27, 2023 for the reading group.
-
- Oct 2023
-
arxiv.org
-
(Chen, NeurIPS, 2021) Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, and Mordatch. "Decision Transformer: Reinforcement Learning via Sequence Modeling". arXiv preprint arXiv:2106.01345v2, June 2021.
Quickly became a very influential paper, with a new idea for learning generative models of action prediction: train on (return-to-go, state, action) sequences from demonstration trajectories. There is no optimization of actions or rewards; instead, the target return is given as an input.
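A minimal sketch of that training signal as I understand it (not the authors' code): trajectories are flattened into (return-to-go, state, action) tokens and the model is trained with a purely supervised loss on actions. The tiny per-step MLP here is a stand-in for the GPT backbone, which attends over the whole token history.

```python
import torch
import torch.nn as nn

def returns_to_go(rewards):
    # R_t = r_t + r_{t+1} + ... + r_T: the return still to be collected.
    return torch.flip(torch.cumsum(torch.flip(rewards, [0]), 0), [0])

class TinyDecisionModel(nn.Module):
    """Stand-in for the GPT backbone: maps one (return-to-go, state)
    pair to an action. The real model attends over the whole history."""
    def __init__(self, state_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1 + state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, rtg, states):
        return self.net(torch.cat([rtg.unsqueeze(-1), states], dim=-1))

# A fake demonstration trajectory: T steps, 4-dim states, 2-dim actions.
T, state_dim, act_dim = 20, 4, 2
states, actions = torch.randn(T, state_dim), torch.randn(T, act_dim)
rewards = torch.randn(T)

model = TinyDecisionModel(state_dim, act_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Purely supervised: predict the demonstrated action given state and
# target return. No reward is maximized; the return is just an input.
pred = model(returns_to_go(rewards), states)
loss = nn.functional.mse_loss(pred, actions)
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```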
-
-
arxiv.org
-
Wu, Prabhumoye, Min, Bisk, Salakhutdinov, Azaria, Mitchell, and Li. "SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning". arXiv preprint arXiv:2305.15486v2, May 2023.
-
"Quantitatively, SPRING with GPT-4 outperforms all state-of-the-art RL baselines, trained for 1M steps, without any training."
Them's fightin' words!
I haven't read it yet, but we're putting it on the list for this fall's reading group. Seriously, a strong result with a very strong implied claim; they are careful to say it comes from their empirical results. Very worth a look. I suspect the amount of implicit knowledge in the paper's text and in the DAG is helping to do this (see the sketch below).
The Big Question: is their comparison to RL baselines fair, given that the baselines are trained from scratch? What does a fair comparison mean between any from-scratch model (RL or supervised) and an LLM approach (or any approach built on a foundation model), when that model is not really starting from scratch?
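Since I haven't read the paper, this is only a guess at the mechanics implied by the title and abstract: a DAG of questions answered in topological order, each prompt conditioned on its parents' answers, ending in an action choice. The example questions and `ask_llm` are placeholders, not SPRING's actual prompts or API.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def ask_llm(prompt: str) -> str:
    # Placeholder for a GPT-4 API call; canned answer so the sketch runs.
    return f"(model answer to: {prompt.splitlines()[-1]})"

# Maps each question to the questions it depends on, so a question is
# asked only after its parents are answered. These example questions are
# made up for illustration.
dag = {
    "What objects are visible?": [],
    "What does the paper say about crafting?": [],
    "What should the agent do next?": [
        "What objects are visible?",
        "What does the paper say about crafting?",
    ],
}

def answer_dag(dag, context):
    answers = {}
    # TopologicalSorter takes {node: predecessors}, yields parents first.
    for q in TopologicalSorter(dag).static_order():
        notes = "\n".join(f"- {p}: {answers[p]}" for p in dag[q])
        answers[q] = ask_llm(f"{context}\n{notes}\nQuestion: {q}")
    return answers

print(answer_dag(dag, "Context: game observation plus paper excerpts.")
      ["What should the agent do next?"])
```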
-
-
www.nature.com
-
Wang et al. "Scientific discovery in the age of artificial intelligence". Nature, 2023.
A paper about the current state of using AI/ML for scientific discovery, connected with the AI4Science workshops at major conferences.
(NOTE: since Springer/Nature don't allow public PDFs to be linked without a paywall, we can't use Hypothesis directly on the PDF of the paper; this link is to the website version, which is what we'll use to guide discussion during the reading group.)
-
Petersen, B. K. et al. Deep symbolic regression: recovering mathematical expressions from data via risk-seeking policy gradients. In International Conference on Learning Representations (2020).
Description: Reinforcement learning uses a neural network to generate a mathematical expression sequentially, adding symbols from a predefined vocabulary, with the learned policy deciding which symbol to add next. The formula is represented as a parse tree; the policy takes the partial parse tree as input to determine which leaf node to expand and which symbol (from the vocabulary) to add.
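A minimal sketch of that generation loop (not Petersen et al.'s implementation), under the assumption that symbols are emitted in prefix order so the parse tree is implicit; the uniform random `policy` and the length cap stand in for the learned policy and the paper's expression constraints:

```python
import random

# Vocabulary: symbol -> arity (0 = a leaf, i.e. a variable or constant).
VOCAB = {"+": 2, "*": 2, "sin": 1, "x": 0, "1.0": 0}
LEAVES = [s for s, a in VOCAB.items() if a == 0]

def sample_expression(policy=lambda partial: random.choice(list(VOCAB)),
                      max_len=32):
    # An arity-k symbol fills one open slot in the parse tree and opens
    # k new ones for its arguments; generation ends when no slot is open.
    tokens, open_slots = [], 1
    while open_slots > 0 and len(tokens) < max_len:
        sym = policy(tokens)          # the policy sees the partial tree
        tokens.append(sym)
        open_slots += VOCAB[sym] - 1
    while open_slots > 0:             # cap reached: close with leaves
        tokens.append(random.choice(LEAVES))
        open_slots -= 1
    return tokens                     # e.g. ['+', 'x', '1.0'] == x + 1.0

print(sample_expression())
```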
-
-
arxiv.org
-
Zečević, Willig, Singh Dhami, and Kersting. "Causal Parrots: Large Language Models May Talk Causality But Are Not Causal". Transactions on Machine Learning Research, August 2023.
-