- Aug 2017
The takeaway is that you should not be using smaller networks because you are afraid of overfitting. Instead, you should use as big of a neural network as your computational budget allows, and use other regularization techniques to control overfitting.
What about the rule of thumb stating that you should have roughly 5-10 times as many data points as weights in order to not overfit?
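To make the question concrete, here is a rough sketch of what the 5-10x rule of thumb would imply for a fully connected network. The layer sizes are hypothetical, chosen only for illustration, and the count includes biases along with weights, which is an assumption about how the rule is meant to be applied.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the units per layer, input first, output last.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def samples_suggested(layer_sizes, low=5, high=10):
    """Range of data points the 5-10x rule of thumb would ask for."""
    n_params = count_parameters(layer_sizes)
    return low * n_params, high * n_params


# Hypothetical architectures, not taken from the quoted passage.
small_net = [2, 3, 1]
large_net = [2, 20, 20, 1]

for name, sizes in [("small", small_net), ("large", large_net)]:
    lo, hi = samples_suggested(sizes)
    print(f"{name}: {count_parameters(sizes)} parameters, "
          f"rule of thumb suggests {lo} to {hi} data points")
```

Under these assumptions the small network has 13 parameters and the large one 501, so the rule would ask for roughly 40 times more data for the larger network. That is the tension with the quoted advice, which says to control overfitting through regularization rather than by shrinking the network.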