Hypothesis

2 Matching Annotations

Jun 2023
cdn.openai.com cdn.openai.com

Language Models are Unsupervised Multitask Learners

1
1. mark.crowley 28 Jun 2023
  
  in Public
  
  Recent work in computer vision has shown that common im-age datasets contain a non-trivial amount of near-duplicateimages. For instance CIFAR-10 has 3.3% overlap betweentrain and test images (Barz & Denzler, 2019). This results inan over-reporting of the generalization performance of ma-chine learning systems.
  
  CIFAR-10 performance results are overestimates since some of the training data is essentially in the test set.
  
  image-processing convolutional-neural-networks deep-learning machine-learning datasets
Visit annotations in context

Tags

datasets

image-processing

deep-learning

machine-learning

convolutional-neural-networks

Annotators

mark.crowley

URL

cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
Aug 2017
arxiv.org arxiv.org

1708.04347.pdf

1
1. hodapp 31 Aug 2017
  
  in Public
  
  This is a very easy paper to follow, but it looks like their methodology is a simple way to improve performance on limited data. I'm curious how well this is reproduced elsewhere.
  
  convolutional neural networks neural networks deep learning
Visit annotations in context

Tags

neural networks

deep learning

convolutional neural networks

Annotators

hodapp

URL

arxiv.org/pdf/1708.04347.pdf