Double descent is a phenomenon observed in machine learning where the test error of a model first decreases, then increases, and then decreases again as model complexity grows. This contrasts with the traditional U-shaped bias-variance tradeoff curve, which predicts that test error rises monotonically once model complexity exceeds the point of best fit.
Double descent arises from the interplay between model complexity, dataset size, and noise. As complexity grows toward the interpolation threshold, the point at which the model has just enough capacity to fit the training data exactly, the model increasingly fits the noise in the training set and test error rises. Beyond that threshold, many different solutions interpolate the training data, and common training procedures tend to select smoother, lower-norm solutions among them; these solutions generalize better, so test error falls again.
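The following is a minimal sketch of this behavior, assuming NumPy and an illustrative random-feature regression setup (the dataset sizes, noise level, and feature counts are arbitrary choices, not drawn from any particular paper). It fits minimum-norm least-squares solutions with an increasing number of random features and prints the test error, which in runs of this kind typically peaks when the number of features is close to the number of training samples and then declines as the model grows further.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative sizes and noise level).
n_train, n_test, d = 50, 500, 10
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
w_true = rng.normal(size=d)
noise = 0.5
y_train = X_train @ w_true + noise * rng.normal(size=n_train)
y_test = X_test @ w_true + noise * rng.normal(size=n_test)

def random_features(X, W):
    """Nonlinear random-feature map (ReLU of random projections)."""
    return np.maximum(X @ W, 0.0)

# Sweep model size past the interpolation threshold (p ~ n_train).
for p in [5, 20, 40, 50, 60, 100, 400, 1000]:
    W = rng.normal(size=(d, p)) / np.sqrt(d)
    Phi_train = random_features(X_train, W)
    Phi_test = random_features(X_test, W)
    # Minimum-norm least-squares fit; interpolates the training data once p >= n_train.
    beta = np.linalg.pinv(Phi_train) @ y_train
    test_mse = np.mean((Phi_test @ beta - y_test) ** 2)
    print(f"features={p:5d}  test MSE={test_mse:.3f}")
```

The pseudoinverse returns the minimum-norm interpolating solution in the overparameterized regime, which is the implicit selection effect described above; swapping it for a regularized solver would flatten the peak.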
Double descent has been observed in a range of settings, including linear and random-feature regression, decision trees and random forests, and deep neural networks.
Double descent challenges traditional notions of overfitting and model selection in machine learning. It suggests that increasing model capacity beyond the point of overfitting can sometimes improve generalization, contrary to the conventional advice to stop growing a model once it begins to overfit. This has implications for how machine learning models are designed, selected, and trained, and for how their generalization behavior is understood.