This is a minimal code release for the paper "Pay Attention to Small Weights" (NeurIPS 2025).
Most of the implementation is based on the MicroAdam repository. Huge thanks to its authors.
If this code is helpful to you, please cite:
@misc{zhou2025payattentionsmallweights,
title={Pay Attention to Small Weights},
author={Chao Zhou and Tom Jacobs and Advait Gadhikar and Rebekka Burkholz},
year={2025},
eprint={2506.21374},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2506.21374},
}