FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention
Institutions Independent Researchers; Tencent; The Hong Kong University of Science and Technology (Guangzhou); Tsinghua University
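AI summary Proposes Lookahead Sparse Attention (LSA), a neural memory indexer built on the DeepSeek-V4 architecture. By predicting which context future tokens will need, it retains only the critical KV blocks. In ultra-long-context settings this compresses the physical KV cache to 13.5% of the full-context size while maintaining or slightly improving downstream accuracy.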
Comments Technical report. 11 pages. Code and model available at https://github.com/libertywing/FlashMemory-Deepseek-V4 and https://huggingface.co/libertywing/FlashMemory-Deepseek-V4