ReSum: Synergizing LLM Reasoning and Summarization with Reinforcement Learning
Affiliations: University of Science and Technology of China; AMAP, Alibaba Group
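AI summary: Proposes the ReSum framework, which uses a self-summarization mechanism to let the LLM compress and organize its reasoning trajectory, and adaptively triggers summarization through comparative evaluation, improving performance by 4% while reducing reasoning length by 18.6%.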
Comments: 24 pages, including 13 pages of main text and 11 pages of appendix