2605.25134
2026-06-01
cs.LG
cs.AI
Theoretical Analysis of Sparse Optimization with Reparameterization, Weight Decay, and Adaptive Learning Rate
Huangyu Xu, Jingqin Yang, Qianqian Xu, Jiaye Teng
Affiliations
*
State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
;
School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
;
Beijing Academy of Artificial Intelligence (BAAI), Beijing, China
;
IIIS, Tsinghua University, Beijing, China
;
School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
;
Institute of Data Science and Statistics, Shanghai University of Finance and Economics, Shanghai, China
AI Summary
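To address instability in sparse optimization, the paper proposes ReWA, a method based on reparameterization, weight decay, and adaptive learning rates. By improving the optimization landscape, ReWA achieves better sparsity than ℓ1 regularization while maintaining test accuracy.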