저의 주관적인 해석이나 오역이 있을 수 있습니다. 댓글로 피드백 해주시면 수정하도록 하겠습니다! https://arxiv.org/abs/2201.02177 Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and speed of learning can be studied in grea..