Hypothesis

12 Matching Annotations

Jan 2024
cdn.openai.com cdn.openai.com

gpt-4-system-card.pdf

1
1. mark.crowley 06 Jan 2024
  
  in Public
  
  GPT-4 System Card, OpenAI, March 23, 2023
  
  chat-gpt large-language-models openai system-cards transformers toread reading_group_crowley

Tags

large-language-models

transformers

openai

system-cards

toread

chat-gpt

reading_group_crowley

Annotators

mark.crowley

URL

cdn.openai.com/papers/gpt-4-system-card.pdf
Oct 2023
arxiv.org arxiv.org

RoBERTa: A Robustly Optimized BERT Pretraining Approach

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  Introduces RoBERTa, an improved analysis of and training approach for BERT NLP models.
  
  large-language-models nlp transformers rdgrp-s23 reading_group_crowley

Tags

rdgrp-s23

large-language-models

transformers

nlp

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/1907.11692
arxiv.org arxiv.org

2106.01345.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  (Chen, NeurIPS, 2021) Chen, Lu, Rajeswaran, Lee, Grover, Laskin, Abbeel, Srinivas, and Mordatch. "Decision Transformer: Reinforcement Learning via Sequence Modeling". arXiv preprint arXiv:2106.01345v2, June 2021.
  
  A paper that quickly became very influential, with a new idea for learning generative models of action prediction: train on sequences of (return-to-go, state, action) taken from demonstration trajectories. There is no explicit optimization of actions or rewards; instead, the target return is given as an input.
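
To make the "target reward is an input" idea concrete, here is a minimal sketch of how a demonstration trajectory could be turned into return-conditioned training tokens. The function names (`returns_to_go`, `to_sequence`) and the undiscounted sum are illustrative assumptions for this note, not code taken from the paper.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_sequence(
    states: Sequence[Any], actions: Sequence[Any], rewards: Sequence[float]
) -> List[Tuple[float, Any, Any]]:
    """Pair each timestep's return-to-go with its state and action.

    A sequence model would be trained to predict the action given the
    preceding (return-to-go, state) context. (Hypothetical helper.)
    """
    rtg = returns_to_go(rewards)
    return list(zip(rtg, states, actions))


if __name__ == "__main__":
    # Toy trajectory with made-up states, actions, and rewards.
    print(to_sequence(["s0", "s1", "s2"], ["a0", "a1", "a2"], [1.0, 0.0, 2.0]))
```

At test time, under this framing, the first return-to-go token would be set to the desired total return rather than computed from data.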
  
  reinforcement-learning transformers generative-models minecraft minerl rdgrp-f23 reading_group_crowley

Tags

generative-models

reinforcement-learning

minecraft

transformers

reading_group_crowley

rdgrp-f23

minerl

Annotators

mark.crowley

URL

arxiv.org/pdf/2106.01345
www.nature.com www.nature.com

Scientific discovery in the age of artificial intelligence

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  Wang et al. "Scientific discovery in the age of artificial intelligence", Nature, 2023.
  
  A paper about the current state of using AI/ML for scientific discovery, connected with the AI4Science workshops at major conferences.
  
  (NOTE: Springer/Nature don't allow public PDFs to be linked without a paywall, so we can't use Hypothesis directly on the PDF of the paper. This link is to the website version, which is what we'll use to guide discussion during the reading group.)
  
  machine-learning deep-learning ai-for-science artificial-intelligence reading_group_crowley rdgrp-f23

Tags

ai-for-science

machine-learning

deep-learning

artificial-intelligence

rdgrp-f23

reading_group_crowley

Annotators

mark.crowley

URL

nature.com/articles/s41586-023-06221-2
arxiv.org arxiv.org

2308.13067.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  Zecevic, Willig, Singh Dhami and Kersting. "Causal Parrots: Large Language Models May Talk Causality But Are Not Causal". In Transactions on Machine Learning Research, Aug, 2023.
  
  transformers large-language-models nlp reading_group_crowley rdgrp-f23

Tags

large-language-models

transformers

rdgrp-f23

nlp

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/2308.13067.pdf
cdn.openai.com cdn.openai.com

Language Models are Unsupervised Multitask Learners

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  GPT-2 Introduction paper
  
  A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. "Language Models are Unsupervised Multitask Learners" (2019).
  
  large-language-models nlp machine-learning transformers gpt reading_group_crowley rdgrp-s23

Tags

machine-learning

rdgrp-s23

large-language-models

transformers

gpt

nlp

reading_group_crowley

Annotators

mark.crowley

URL

cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
arxiv.org arxiv.org

1706.03762.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  "Attention Is All You Need": the foundational paper introducing the Transformer architecture.
  
  transformers reading_group_crowley rdgrp-s23 large-language-models nlp

Tags

rdgrp-s23

large-language-models

transformers

nlp

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/1706.03762
papers.nips.cc papers.nips.cc

NeurIPS-2020-language-models-are-few-shot-learners-Paper.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  GPT-3 introduction paper
  
  large-language-models nlp machine-learning transformers gpt reading_group_crowley rdgrp-s23

Tags

machine-learning

rdgrp-s23

large-language-models

transformers

gpt

nlp

reading_group_crowley

Annotators

mark.crowley

URL

papers.nips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
arxiv.org arxiv.org

2105.03322.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  "Are Pre-trained Convolutions Better than Pre-trained Transformers?"
  
  transformers deep-learning nlp large-language-models reading_group_crowley rdgrp-s23

Tags

rdgrp-s23

deep-learning

large-language-models

transformers

nlp

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/2105.03322.pdf
arxiv.org arxiv.org

2201.08239.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  LaMDA: Language Models for Dialog Applications
  
  Google's introduction of the LaMDA v1 large language model.
  
  transformers reading_group_crowley rdgrp-s23 large-language-models nlp

Tags

rdgrp-s23

large-language-models

transformers

nlp

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/2201.08239.pdf
arxiv.org arxiv.org

2305.15486.pdf

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  Quantitatively, SPRING with GPT-4 outperforms all state-of-the-art RL baselines, trained for 1M steps, without any training.
  
  Them's fightin' words!
  
  I haven't read it yet, but we're putting it on the list for this fall's reading group. Seriously, it's a strong result with a very strong implied claim. They are careful to say it comes from their empirical results, so it's very worth a look. I suspect that the amount of implicit knowledge in the papers, text, and DAG is helping to do this.
  
  The Big Question: is their comparison to RL baselines fair? Are the baselines being trained from scratch? What does a fair comparison with any from-scratch model (RL or supervised) mean for an LLM approach (or any approach using a foundation model), when that model is not really from scratch?
  
  reinforcement-learning rdgrp-f23 reading_group_crowley nlp larg deep-learning self-supervised supervised-learning evaluation-methods

Tags

self-supervised

supervised-learning

rdgrp-f23

nlp

larg

deep-learning

reinforcement-learning

evaluation-methods

reading_group_crowley

Annotators

mark.crowley

URL

arxiv.org/pdf/2305.15486.pdf
osf.io osf.io

Attention Mechanism, Transformers, BERT, and GPT: Tutorial and Survey

1
1. mark.crowley 25 Oct 2023
  
  in Public
  
  Benyamin Ghojogh and Ali Ghodsi. "Attention Mechanism, Transformers, BERT, and GPT: Tutorial and Survey"
  
  reading_group_crowley transformers rdgrp-s23 nlp large-language-models

Tags

rdgrp-s23

large-language-models

transformers

nlp

reading_group_crowley

Annotators

mark.crowley

URL

osf.io/m6gcn/