3 Matching Annotations
  1. Jun 2023
    1. Recent work in computer vision has shown that common image datasets contain a non-trivial amount of near-duplicate images. For instance CIFAR-10 has 3.3% overlap between train and test images (Barz & Denzler, 2019). This results in an over-reporting of the generalization performance of machine learning systems.

      CIFAR-10 performance results are overestimates since some of the training data is essentially in the test set.
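
To make the overlap idea concrete, here is a minimal sketch of how one might estimate the fraction of test images that have a near-duplicate in the training set. It is an illustration under stated assumptions, not necessarily the procedure used by Barz & Denzler: the function name `near_duplicate_fraction`, the use of cosine similarity on raw flattened pixels, and the `threshold` value of 0.99 are all hypothetical choices made for this example.

```python
import numpy as np

def near_duplicate_fraction(train, test, threshold=0.99):
    """Fraction of test images whose most similar training image has
    cosine similarity >= threshold (hypothetical helper, raw-pixel features)."""
    # Flatten every image to a vector.
    tr = train.reshape(len(train), -1).astype(np.float64)
    te = test.reshape(len(test), -1).astype(np.float64)
    # L2-normalise so that a dot product equals cosine similarity.
    tr = tr / (np.linalg.norm(tr, axis=1, keepdims=True) + 1e-12)
    te = te / (np.linalg.norm(te, axis=1, keepdims=True) + 1e-12)
    # Similarity of every test image to every training image.
    sims = te @ tr.T
    # Keep only the best match per test image and compare to the threshold.
    best = sims.max(axis=1)
    return float((best >= threshold).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.random((100, 8, 8, 3))   # synthetic stand-in for a training set
    test = rng.random((20, 8, 8, 3))     # synthetic stand-in for a test set
    # Plant two slightly perturbed copies of training images in the test set.
    test[:2] = train[:2] + rng.normal(0, 0.001, size=(2, 8, 8, 3))
    print(near_duplicate_fraction(train, test))
```

The full similarity matrix is fine for small arrays like these; for a dataset the size of CIFAR-10 one would likely process the test set in batches, and compare learned features rather than raw pixels, since pixel-level similarity misses crops and colour shifts.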

  2. May 2020
  3. Aug 2017
    1. This is a very easy paper to follow, but it looks like their methodology is a simple way to improve performance on limited data. I'm curious how well this is reproduced elsewhere.