  1. Aug 2017
    1. The takeaway is that you should not be using smaller networks because you are afraid of overfitting. Instead, you should use as big of a neural network as your computational budget allows, and use other regularization techniques to control overfitting.

      What about the rule of thumb that you need roughly 5-10 times as many data points as weights to avoid overfitting?
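
      To make the tension concrete, here is a rough sketch of what that rule of thumb would demand for two fully connected networks. The layer sizes are hypothetical, chosen only for illustration, and the helper names `count_weights` and `rule_of_thumb_samples` are my own, not from the quoted text. Biases are counted as weights here, which is an assumption; the heuristic itself is stated loosely.

```python
def count_weights(layer_sizes):
    """Count trainable parameters (weights plus biases) in a fully
    connected network given its layer widths, input layer first."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def rule_of_thumb_samples(n_weights, low=5, high=10):
    """Range of training-set sizes implied by the 5-10x heuristic."""
    return low * n_weights, high * n_weights


# Hypothetical architectures: 2 inputs, one hidden layer, 1 output.
for sizes in ([2, 3, 1], [2, 20, 1]):
    n = count_weights(sizes)
    lo, hi = rule_of_thumb_samples(n)
    print(f"layers={sizes}: {n} parameters, heuristic asks for {lo}-{hi} samples")
```

      Under this heuristic the wider network would need several times more data than the narrow one, whereas the quoted passage suggests keeping the larger network and controlling overfitting through regularization instead of shrinking the model.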