Validation Error Shape
Most learning algorithms make many small changes to the model, each of which improves the model’s fit to the training data. But once the model starts fitting the training data too well, its validation and test errors begin to rise. That is the point at which we stop training, because additional training would improve the training error at the expense of the validation and test errors.
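This stopping rule can be sketched in code. Below is a minimal early-stopping loop on a toy overparameterized regression problem; the problem sizes, seed, learning rate, and the patience threshold are all illustrative assumptions, not something from the post. Training halts once the validation error has gone a fixed number of steps without improving, and the best weights seen so far are kept.

```python
import numpy as np

# Toy overparameterized regression: few noisy samples, many features
# (all sizes and the seed are illustrative assumptions).
rng = np.random.default_rng(0)
n_train, n_val, d = 20, 200, 50
w_true = rng.normal(size=d) / np.sqrt(d)
X_tr, X_va = rng.normal(size=(n_train, d)), rng.normal(size=(n_val, d))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=n_train)
y_va = X_va @ w_true + 0.5 * rng.normal(size=n_val)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Gradient descent with early stopping: halt when the validation
# error has not improved for `patience` consecutive steps.
w = np.zeros(d)
lr, max_steps, patience = 0.01, 5000, 200
best_val, best_w, since_best = np.inf, w.copy(), 0
train_errs, val_errs = [], []
for step in range(max_steps):
    train_errs.append(mse(X_tr, y_tr, w))
    val_errs.append(mse(X_va, y_va, w))
    if val_errs[-1] < best_val:
        best_val, best_w, since_best = val_errs[-1], w.copy(), 0
    else:
        since_best += 1
        if since_best >= patience:
            break  # validation error stopped improving: stop training
    w -= lr * 2 * X_tr.T @ (X_tr @ w - y_tr) / n_train

print(f"stopped after {len(val_errs)} steps")
print(f"train error {train_errs[0]:.3f} -> {train_errs[-1]:.3f}")
print(f"best validation error {best_val:.3f}")
```

The loop keeps `best_w`, the weights at the validation minimum, which is what early stopping would actually return; the patience counter just guards against stopping on a brief plateau.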
Many machine learning courses depict a cartoon of the validation error:
Note that both the training and the validation errors are convex functions of time (if we ignore the initial transient). However, if we train the model for longer, we discover the following picture:
This is the real shape of the validation error. The error is almost always bounded from above (a classifier’s error rate, for instance, cannot exceed 100%), so the validation error cannot keep rising forever: it must eventually inflect and converge. So the training error curve is convex, but the validation error curve isn’t!
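The plateau is easy to reproduce. The sketch below (a toy problem under the same kind of illustrative assumptions as above: sizes, seed, and learning rate are made up) trains gradient descent far past the early-stopping point; because the iterates converge, the validation error also converges and barely moves over the final stretch of training instead of growing without bound.

```python
import numpy as np

# Toy overparameterized regression, trained far past the point
# where early stopping would halt (sizes and seed are illustrative).
rng = np.random.default_rng(1)
n_train, n_val, d = 20, 200, 50
w_true = rng.normal(size=d) / np.sqrt(d)
X_tr, X_va = rng.normal(size=(n_train, d)), rng.normal(size=(n_val, d))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=n_train)
y_va = X_va @ w_true + 0.5 * rng.normal(size=n_val)

w = np.zeros(d)
lr, steps = 0.01, 20000
val_errs = []
for _ in range(steps):
    val_errs.append(float(np.mean((X_va @ w - y_va) ** 2)))
    w -= lr * 2 * X_tr.T @ (X_tr @ w - y_tr) / n_train

# Gradient descent converges, so the validation error converges too:
# over the last thousand steps it is essentially flat.
tail = val_errs[-1000:]
print(f"validation error spread over final 1000 steps: {max(tail) - min(tail):.2e}")
```

The spread over the tail of training is tiny, which is the inflect-and-converge behavior the real curve shows.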