2606.05868
2026-06-05
cs.CL
YouZhi: Towards High-Concurrency Financial LLMs via Adaptive GQA-to-MLA Transition
PSBC LLM Team, Huawei LLM Team, Ruihan Long, Junjie Wu, Tianan Zhang, Duo Zhang, Yaozong Wu, Jinbin Fu, Chang Liu, Zhentao Tang, Wenshuang Yang, Xin Wang, Zhihao Song, Ning Huang, Wenjing Xu, Shuai Zong, Shupei Sun, Sen Wang, Jing Hu, Bin Wang, Xinyu Wang, Junkui Ju, Zequn Ding, Jie Ran, Man Luo, Shixiong Kai, Linkai Hou, Kaichao Liang, Hu Zhao, Yang Zhao, Shucheng Lin, Wei Yu, Chenghan Jiang, Jingjing Ding, Jiahui Zhang, Tian Jin, Yuhang Zhang, Dong Guo, Wei Sun, Jun Xie, Jianwei Li, Lei Cao, Pei Li, Jiabin Li, Jia Yuan, Rui Yuan, Jing Zhu, Mingxuan Yuan, Zhangcheng Lv, Xin Jiang, Xiuhong Fei, Xiaozhe Ren, Yulong Li, Zhipeng Zhang, Hang Wang, Zhaohui Xu, Rui Zhao, Yibo He, Xinzhuang Niu
Affiliations
Postal Savings Bank of China & Huawei LLM Team; Postal Savings Bank of China; Huawei Technologies
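AI Summary
Proposes YouZhi-LLM, which combines a layer-adaptive GQA-to-MLA transition framework with an Ascend-based training pipeline to substantially compress the KV cache and improve high-concurrency inference efficiency in the financial domain.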