arXivDaily arXiv daily academic digest, updated Monday through Friday
2606.08477 2026-06-09 cs.AI New submission

A Variability-Based Framework for Interpretable Naming in Formal and Relational Concept Analysis

Alain Gutierrez, Marianne Huchard, Pierre Martin, André Miralles, Violaine Prince

Affiliations LIRMM, Univ. Montpellier, CNRS; CIRAD, UPR AIDA; AIDA, CIRAD, Univ. Montpellier; INRAE - UMR TETIS - Territoires, Environnement

AI summary To address the lack of interpretability in concept naming in Formal Concept Analysis and Relational Concept Analysis, the paper proposes a variability-based, LLM-assisted naming framework that generates readable names by controlling which information sources are exposed, and illustrates it as a proof of concept on a small pizzeria dataset.

Abstract

Knowledge extraction from symbolic data often produces abstractions that are formally defined but not immediately interpretable by users. Formal Concept Analysis (FCA) and Relational Concept Analysis (RCA) provide representative settings for this issue: they generate explicit conceptual structures, implications, and relational dependencies from object descriptions and relations. Although these structures are explainable by design, their concepts are often identified by technical labels, which limits their use as human-interpretable knowledge units. Assigning meaningful names to such concepts is therefore a key issue for interpretation, navigation, validation, and reuse by domain experts. This paper investigates concept naming in FCA and RCA from a symbolic knowledge representation perspective. We first characterize the linguistic and terminological challenges involved in naming generated symbolic abstractions, including ambiguity, discrimination, concision, and consistency across related concepts. We then propose a configurable framework for LLM-assisted concept naming. The framework relies on a variability model that controls which sources of information are exposed during naming, such as intent, extent, inherited information, neighboring concepts, implications, and relational attributes. It thereby makes explicit the semantic choices involved in moving from formal concept descriptions to human-readable names. The approach is illustrated as a proof of concept on a small relational dataset in the pizzeria domain. This illustration shows how different configurations influence the names suggested by an LLM, and how naming variability can reveal interpretation choices, relational dependencies, and possible modeling issues in the underlying symbolic data.
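
As a rough illustration of the "intent" and "extent" that the abstract lists among the information sources for naming, the sketch below enumerates the formal concepts of a tiny context. The pizza data and all function names are illustrative assumptions, not the paper's dataset or code.

```python
from itertools import combinations

# Hypothetical toy context: object -> set of attributes (not the paper's data).
CONTEXT = {
    "margherita": {"tomato", "mozzarella"},
    "regina": {"tomato", "mozzarella", "ham"},
    "bianca": {"mozzarella"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Intent: the attributes shared by every object in the set."""
    shared = set(ALL_ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return shared


def objects_having(attributes):
    """Extent: the objects that own every attribute in the set."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Each printed pair is a concept that would otherwise carry only a technical label; a naming step would receive some subset of this information, depending on the configuration.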

2606.08473 2026-06-09 cs.LG New submission

Physically Consistent Null Space Alignment for Detection of Low-Magnitude False Data Injection Attacks

Xin Li, Chenhan Xiao, Jonathan Cohen, Aviad Elyashar, Yang Weng, Rami Puzis

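Affiliations Ben-Gurion-University

AI summary Proposes the Physically Consistent Null Space Alignment (PCNSA) framework, which uses pseudo-null-space-conserving preprocessing to preserve the geometric correspondence between the physical null space and the measurement-derived pseudo-null space, and thereby detects stealthy false data injection attacks that are low in magnitude but high in impact.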

Comments 12 pages, 13 figures

Abstract

False data injection attacks (FDIAs) introducing small measurement perturbations can still cause large deviations in power system state estimation when the injected signals align with the pseudo-null space of the system model. Existing model- and data-driven detectors may fail to identify such low-magnitude but high-impact attacks because residual tests ignore changes hidden in the pseudo-null space, while subspace learning methods capture correlation patterns without enforcing physical consistency. This paper proposes Physically Consistent Null Space Alignment (PCNSA), a framework that detects stealthy FDIAs by preserving, through preprocessing, the geometric correspondence between the physical null space and the measurement-derived pseudo-null space. The key point is a Pseudo-null Space Conserved data Preprocessing (PSCP) step that re-expresses measurements in the physical coordinate frame before subspace extraction. We prove that PSCP preserves the separation between row space and its orthogonal complement, a property that conventional per-feature standardization violates. This keeps the singular value decomposition (SVD)-derived pseudo-null subspace aligned with the physical residual space without explicit knowledge of H. Experiments on IEEE 14-, 30-, 57-, and 118-bus systems confirm this principle in practice: stealthy attacks that evade XTM, LSTM, AE and Isolation Forest baselines appear as clear deviations in the aligned subspace, yielding higher F1-score and detection accuracy while remaining robust under partial observability and realistic PMU noise.
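
To make concrete why residual tests can miss such attacks, the sketch below shows the textbook stealth condition for least-squares state estimation: an injection of the form a = H c shifts the estimate by c but leaves the residual untouched. This is not the PCNSA method itself; the single-state model, the vector `H`, and the readings are made-up assumptions.

```python
# Toy measurement model z = H * x + noise with one state and three sensors.
H = [1.0, 2.0, 0.5]  # hypothetical measurement vector (assumption)


def estimate_state(z):
    """Least-squares estimate for a single state: x_hat = (H.z) / (H.H)."""
    return sum(h * zi for h, zi in zip(H, z)) / sum(h * h for h in H)


def residual_norm(z):
    """Euclidean norm of z - H * x_hat, the quantity a residual test checks."""
    x_hat = estimate_state(z)
    return sum((zi - h * x_hat) ** 2 for h, zi in zip(H, z)) ** 0.5


clean = [1.02, 1.97, 0.51]  # noisy readings near the true state x = 1
shift = 0.3                 # state bias the attacker wants to induce
attacked = [zi + h * shift for h, zi in zip(H, clean)]  # injection a = H * c

# The estimate moves by the chosen shift, while the residual norm stays put.
print(estimate_state(attacked) - estimate_state(clean))
print(residual_norm(attacked) - residual_norm(clean))
```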

2606.08471 2026-06-09 cs.CL cs.AI New submission

More Yap Less Meaning: Uncovering Self-Improvement Behavior in SLMs

Marina Igitkhanian, Erik Arakelyan

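Affiliations American University of Armenia; NVIDIA

AI summary By constructing a sufficiency test, this study finds that small language models gain only 4.4 percent in accuracy from self-correction and that longer hints are positively correlated with incorrect answers, indicating limited reasoning ability.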

Comments GEM Workshop at ACL 2026

Abstract

Recently, language models have made rapid progress across various domains and applications. However, their capability for self-improvement, i.e., whether they are adept at recognising and correcting flaws in their own reasoning, remains dubious. In this study, we address this question by constructing a sufficiency test to rigorously examine the self-correction capabilities of small language models (SLMs). We propose a minimal three-step self-correction pipeline that collects initial SLM answers, prompts the same model to generate hints for its incorrect responses given the ground truth, and feeds the model the same question with its own feedback to refine the initial answer. We evaluate a variety of instruction-tuned and reasoning SLMs in this experimental setup on arithmetic and logical reasoning benchmarks. Our findings show that SLMs with injected hint sentences yield only a 4.4 percent gain over initial question-answering accuracy. Even though the correct answer was provided alongside the model's incorrect reasoning, the evaluated SLMs fail to understand what was missing in their reasoning and show minimal semantic difference between hints that lead to corrections and ones that do not. Furthermore, our experiments show that longer hints are positively correlated with incorrect final answers, suggesting that longer deliberation on problems can hinder the reasoning process, meaning that SLMs do not necessarily scale in performance with a larger compute budget.
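
The three-step pipeline described above can be sketched roughly as follows. The prompt wording, the `self_correct` function, and the `stub_model` stand-in are all assumptions made for illustration; the abstract does not give the actual prompts or evaluation code.

```python
def self_correct(model, question, ground_truth):
    """Three steps: initial answer, self-generated hint, refined answer.

    `model` is any callable mapping a prompt string to a response string.
    """
    first = model(f"Question: {question}\nAnswer:")
    if first.strip() == ground_truth:
        return first, None, first  # already correct, so no hint is requested
    hint = model(
        f"Question: {question}\nYour answer: {first}\n"
        f"Correct answer: {ground_truth}\nWrite a short hint:"
    )
    refined = model(f"Question: {question}\nHint: {hint}\nAnswer:")
    return first, hint, refined


def stub_model(prompt):
    """Deterministic stand-in for an SLM so the sketch runs offline."""
    if "Correct answer:" in prompt:
        return "Recheck the carry in the tens column."
    if "Hint:" in prompt:
        return "42"
    return "32"


print(self_correct(stub_model, "17 + 25 = ?", "42"))
```

Note that the ground truth appears only in the hint-generation prompt; the refinement step sees the question and the model's own hint.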

2606.08470 2026-06-09 cs.RO New submission

LUNA-AD: Lightweight Uncertainty-Aware Language Model with Lifelong Learning for Autonomous Driving

Ruoyu Yao, Pei Liu, Ruiguo Zhong, Mingxing Peng, Rui Yang, Jun Ma

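Affiliations The Hong Kong University of Science and Technology (Guangzhou)

AI summary Proposes LUNA-AD, a lightweight uncertainty-aware language model that combines a tri-system architecture, multi-agent analysis, a dual-head lightweight model, and reflection-driven lifelong learning, achieving high success rates and low inference latency on nuPlan.

Comments 16 pages, 9 figures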

Abstract

While large language models (LLMs) offer promising reasoning capabilities, their integration into safety-critical driving systems is hindered by limited reasoning diversity, high computational overhead, and static learning paradigms. To address these challenges, we propose LUNA-AD, a lightweight uncertainty-aware language model with lifelong learning for autonomous driving (AD). LUNA-AD features a tri-system architecture that reconciles complex multimodal behavioral reasoning, efficient deployment, and continual refinement. We design a multi-agent analytical system to generate uncertainty-aware decision-making demonstrations through diverse hypothesis exploration. A dual-head lightweight heuristic model is distilled to unify the inference of decision distributions and textual explanations while enabling efficient deployment. Furthermore, a reflection-driven lifelong learning mechanism operates on multimodal decision outputs and preserves strategic diversity, allowing for the refinement of candidate decisions and rationales via closed-loop feedback to enhance driving robustness. Extensive experiments on nuPlan benchmarks demonstrate that LUNA-AD achieves state-of-the-art success rates under both non-reactive and reactive modes, with drastically reduced inference latency compared to existing knowledge-driven AD frameworks.

2606.08467 2026-06-09 cs.LG cs.AI New submission

The Confidence Trap: Calibration Attacks for Graph Neural Networks

Cuong Dang, Jiahao Zhang, Hieu Ta Quang, Dung Le, Lu Cheng, Suhang Wang

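Affiliations Virginia Polytechnic Institute and State University; The Pennsylvania State University; VinUniversity; University of Illinois at Chicago

AI summary Proposes the Unified Graph Calibration Attack (UGCA) framework, which uses a KL-divergence loss, a reranking mechanism, and a hybrid loss to substantially raise Expected Calibration Error while preserving classification accuracy, and shows that models with higher accuracy or more classes are more vulnerable.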

Abstract

While confidence calibration is essential for trustworthy decision-making in safety-critical applications, the robustness of calibrated GNNs to adversarial structural perturbations remains largely unexplored. However, studying calibration attacks on graphs presents unique technical challenges: (1) the discrete nature of graph structures complicates gradient-based optimization, (2) existing underconfidence objectives fail to drive predictions toward uniform distributions, and (3) GNNs are highly sensitive to edge perturbations, often causing unintended label changes that violate attack constraints. To address these challenges, we propose a Unified Graph Calibration Attack (UGCA) framework designed for worst-case (white-box) analysis of GNN calibration robustness. UGCA introduces a KL-divergence loss to encourage uniform predictive distributions, a reranking mechanism to reduce label flipping, a hybrid loss to recover labels when violations occur, and beam search to explore a broader adversarial search space. We further provide theoretical insights linking model generalization, dataset complexity, and calibration vulnerability, showing that models with higher accuracy or trained on datasets with more classes are more susceptible under this threat model. Extensive experiments demonstrate that UGCA substantially increases Expected Calibration Error while preserving classification accuracy. Our code is publicly available at https://github.com/CaptainCuong/Graph-Calibration-Attack.git.
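
For readers unfamiliar with the metric, the sketch below computes Expected Calibration Error with the usual equal-width confidence bins and shows how it can rise while accuracy stays fixed. The data are made up, and this illustrates only the metric, not the UGCA attack.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """ECE: bin-size-weighted average of |accuracy - mean confidence|."""
    total = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(accuracy - mean_conf)
    return ece


correct = [1, 1, 1, 0]                 # 75% accuracy in both scenarios
before = [0.9, 0.9, 0.9, 0.9]          # made-up confidences before an attack
after = [0.3, 0.3, 0.3, 0.3]           # same labels, pushed to underconfidence

print(expected_calibration_error(before, correct))
print(expected_calibration_error(after, correct))
```

The predicted labels, and hence the accuracy, are identical in both cases; only the confidence scores differ, which is the kind of change a calibration attack targets.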

2606.08464 2026-06-09 cs.CV New submission

TVI-CoT: Text-Visual Interleaved Chain-of-Thought Reasoning for Multimodal Understanding

Lianyu Hu, Xiaoyu Ma, Zeqin Liao, Yang Liu

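Affiliations University of Science and Technology of China

AI summary Proposes the TVI-CoT framework, which uses learnable control tokens to dynamically interleave textual reasoning with visual feature access, addressing the inability of multimodal LLMs to access visual features during reasoning, and achieves state-of-the-art results among MLLM-based CoT methods on multiple benchmarks.

Comments ICML 2026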

Abstract

Chain-of-thought (CoT) reasoning has proven effective for enhancing problem-solving in large language models. However, when applied to multimodal LLMs (MLLMs), existing CoT approaches suffer from a fundamental limitation: they perform reasoning entirely in text without accessing visual features during the reasoning process. After initial visual encoding, image information becomes inaccessible, forcing models to reason based solely on whatever was captured in the initial description, which forms a 'vision-blind reasoning' paradigm that limits fine-grained visual extraction, error verification, and adaptive attention. We propose Text-Visual Interleaved Chain-of-Thought (TVI-CoT), a framework that enables explicit interleaving of textual reasoning and visual feature access through the learnable control tokens <THINK>, <LOOK>, and <ANSWER>. These tokens allow dynamic switching between reasoning and visual grounding, attending to relevant image regions conditioned on the evolving reasoning state. Experiments on eight benchmarks demonstrate state-of-the-art results among MLLM-based CoT methods and a notable performance boost over the baseline: +6.1% on MMMU, +3.8% on MathVerse, +3.4% on MathVista, and +3.4% on ScienceQA. Code is available at https://github.com/hulianyuyy/TVI-CoT.

2606.08458 2026-06-09 cs.RO New submission

Personalized and Robust Proactive Robot Assistance with Uncertainty-Guided LLM Reasoning

Alvaro Gonzalez, M. H. Hasan Shovo, Ali Ayub

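Affiliations Concordia University

AI summary Proposes the GLOBE framework, which combines n-gram Markov models with uncertainty-guided LLM reasoning for efficient and robust proactive robot assistance in household environments, and evaluates its performance and efficiency on the HOMER-Noise dataset.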

Comments Accepted to the 2026 IEEE 35th International Conference on Robot and Human Interactive Communication (RO-MAN)

Abstract

Proactive robot assistance in household environments requires accurate prediction of human activities and object usage under dynamic and noisy conditions. Existing approaches often rely on complex spatio-temporal models, which can be computationally expensive and sensitive to environmental variability. In this paper, we propose GLOBE, a lightweight framework that combines n-gram Markov models for capturing temporal behavioral patterns with uncertainty-guided large language model (LLM) reasoning. The framework performs sequential prediction efficiently while selectively invoking LLM reasoning only when the model confidence is low. To evaluate performance under realistic conditions, we introduce HOMER-Noise, a noisy extension of the HOMER+ dataset that simulates structured disturbances such as object movements caused by humans, pets, and toddlers. Experimental results show that GLOBE achieves competitive performance with state-of-the-art methods while improving robustness and computational efficiency across both clean and noisy settings. The framework is further validated through a proof-of-concept integration with a Stretch 3 mobile manipulator, demonstrating its potential application in real-world human-robot interaction scenarios.
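
The idea of predicting with an n-gram model and escalating only when confidence is low can be sketched as below. The class, the threshold value, the activity names, and `ask_llm_stub` (a placeholder for an LLM call) are hypothetical; the abstract does not specify GLOBE's actual confidence measure or interface.

```python
from collections import Counter, defaultdict


class NGramPredictor:
    """Counts which event follows each context of `order` previous events."""

    def __init__(self, order=2):
        self.order = order
        self.counts = defaultdict(Counter)

    def fit(self, sequence):
        for i in range(len(sequence) - self.order):
            context = tuple(sequence[i:i + self.order])
            self.counts[context][sequence[i + self.order]] += 1

    def predict(self, history):
        """Return (most frequent next event, its relative frequency)."""
        options = self.counts.get(tuple(history[-self.order:]))
        if not options:
            return None, 0.0
        event, count = options.most_common(1)[0]
        return event, count / sum(options.values())


def predict_next(predictor, history, fallback, threshold=0.6):
    """Use the n-gram prediction unless its confidence is below threshold."""
    event, confidence = predictor.predict(history)
    if confidence >= threshold:
        return event, "ngram"
    return fallback(history), "fallback"


def ask_llm_stub(history):
    """Placeholder for the expensive LLM reasoning step."""
    return "ask_llm"


events = ["wake", "coffee", "read", "wake", "coffee", "read",
          "wake", "coffee", "walk"]
predictor = NGramPredictor(order=2)
predictor.fit(events)

print(predict_next(predictor, ["coffee", "read"], ask_llm_stub))
print(predict_next(predictor, ["wake", "coffee"], ask_llm_stub, threshold=0.8))
```

Raising the threshold trades more fallback calls for fewer low-confidence n-gram predictions, which is where the efficiency gain of selective invocation would come from.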

2606.08454 2026-06-09 cs.LG cs.CL New submission

Beyond Linear Activation Steering: Invertible Latent Transformations for Controlling LLM Behavior

Tuc Nguyen, Thai Le

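Affiliations Indiana University Bloomington

AI summary Proposes the INNSteer framework, which maps LLM activations into a latent space through an invertible neural network, applies linear control there, and maps back through the inverse, yielding nonlinear, input-dependent activation steering that outperforms existing methods across multiple models and benchmarks.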

Comments 36 pages, 7 figures

Abstract

Activation steering provides a lightweight inference-time mechanism for controlling large language models (LLMs) by modifying their internal activation vectors toward desired behaviors. Most existing methods compute a fixed steering direction in the original activation space, typically from pairs of contrastive examples using mean differences, linear probes, or arbitrary separability criteria. While effective to a certain extent, these methods treat behavioral control as a global, linear, additive offset: the same direction is applied across inputs, and behaviors are linearly separable. This can be restrictive when behavioral features vary nonlinearly across the activation space or lie on curved and anisotropic manifolds, where the optimal intervention may be input-dependent. To address this limitation, we propose INNSteer, a nonlinear activation steering framework based on invertible latent transformations. Rather than searching for a better steering vector in the original representation space, INNSteer learns a lightweight invertible neural network $ϕ$ that maps an LLM's activations into a latent space where behavioral classes are more amenable to linear control. At inference time, activations are mapped through $ϕ$, steered in the latent space, and mapped back through the exact inverse transformation $ϕ^{-1}$. This makes a simple latent-space translation become a nonlinear, input-dependent intervention in the original activation space. Across experiment settings on multiple LLM families, scales, behavioral traits, and safety benchmarks, INNSteer consistently improves model control over linear, transport-based, and nonlinear steering baselines while largely preserving generation fluency.
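
A minimal way to see how a fixed latent shift becomes input-dependent is an additive coupling layer, a standard invertible building block. The sketch below uses a two-dimensional "activation" and a fixed `tanh` in place of a learned sub-network; both are assumptions, since the abstract does not describe the architecture of $ϕ$.

```python
import math


def coupling(x1):
    """Fixed nonlinear function standing in for a learned sub-network."""
    return math.tanh(2.0 * x1)


def forward(x):
    """phi: keep the first coordinate, shift the second by coupling(x1)."""
    x1, x2 = x
    return (x1, x2 + coupling(x1))


def inverse(z):
    """phi^{-1}: undo the shift exactly using the unchanged first coordinate."""
    z1, z2 = z
    return (z1, z2 - coupling(z1))


def steer(x, delta):
    """Map to latent space, add a constant offset, and map back."""
    z1, z2 = forward(x)
    return inverse((z1 + delta[0], z2 + delta[1]))


delta = (0.5, 0.0)  # the same latent offset for every input
for x in [(0.0, 1.0), (2.0, 1.0)]:
    steered = steer(x, delta)
    print(x, "->", steered, "change in x2:", steered[1] - x[1])
```

Both inputs receive the same latent offset, yet the resulting change in the second coordinate differs, because it equals coupling(x1) - coupling(x1 + 0.5) and therefore depends on the input.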

2606.08452 2026-06-09 cs.LG New submission

Theoretical Foundations of Continual Learning via Drift-Plus-Penalty

Nazreen Shah, Govinda Arya, Bharath B. N., Ranjitha Prasad

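Affiliations IIIT Delhi; IIT Dharwad

AI summary Proposes the COLD framework, which uses the Drift-Plus-Penalty principle to regulate the stability-plasticity trade-off, controls forgetting through a virtual queue, comes with theoretical convergence guarantees, and outperforms existing methods in experiments.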

Comments Accepted to Transactions on Machine Learning Research (TMLR)

Abstract

In many real-world settings, data streams are nonstationary and arrive sequentially, requiring learning systems to adapt continuously without retraining from scratch. Continual learning (CL) addresses this challenge by incorporating new tasks while mitigating catastrophic forgetting, where learning new information degrades performance on previously acquired knowledge. We introduce a control-theoretic perspective on CL that explicitly regulates the evolution of forgetting, framing adaptation as a controlled process subject to long-term stability constraints. We focus on replay-based CL, where a finite memory buffer stores representative samples from prior tasks. We propose COntinual Learning with Drift-Plus-Penalty (COLD), a continual learning framework based on the Drift-Plus-Penalty (DPP) principle from stochastic optimization. To facilitate analysis, we also consider an oracle variant, COLD-ORACLE, as a reference benchmark. At each task, both methods minimize the current task loss while maintaining a virtual queue that tracks deviations from long-term stability on previously learned tasks, capturing the stability-plasticity trade-off as a regulated dynamical process. We establish stability and convergence guarantees that characterize this trade-off through a tunable control parameter. Experiments on standard benchmarks demonstrate that COLD consistently outperforms a broad range of state-of-the-art CL methods while providing competitive and controllable forgetting behavior through explicit regulation of stability and plasticity.
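
The virtual-queue mechanism can be illustrated with the generic Drift-Plus-Penalty recipe from Lyapunov optimization: the queue grows when a constraint is violated and weights that constraint in the next step's objective. The forgetting values, the budget, and the function names below are hypothetical; how COLD measures forgetting and sets its constraint is not stated in the abstract.

```python
def update_queue(queue, forgetting, budget):
    """Virtual queue: accumulate constraint violation, never below zero."""
    return max(queue + forgetting - budget, 0.0)


def dpp_objective(task_loss, forgetting, budget, queue, tradeoff):
    """Per-step objective: tradeoff * penalty + queue * constraint term."""
    return tradeoff * task_loss + queue * (forgetting - budget)


budget = 0.2                           # tolerated forgetting per task (made up)
observed = [0.3, 0.4, 0.1, 0.0]        # made-up forgetting after each task

queue = 0.0
queue_history = []
for forgetting in observed:
    queue = update_queue(queue, forgetting, budget)
    queue_history.append(queue)

print(queue_history)
print(dpp_objective(task_loss=1.0, forgetting=0.4, budget=budget,
                    queue=0.1, tradeoff=2.0))
```

A larger `tradeoff` emphasizes the current task loss (plasticity), while a large queue value pushes the learner to reduce forgetting (stability); this is the sense in which a single control parameter tunes the balance.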

2606.08451 2026-06-09 cs.CL cs.AI New submission

Sycophancy as a Multilingual Alignment Failure: How Safety Degrades Across Languages, Topics, and Models

Arya Shah, Himanshu Beniwal, Mayank Singh, Chaklam Silpasuwanchai

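Affiliations IIT Gandhinagar; Asian Institute of Technology

AI summary Studies sycophancy in multilingual models and finds that sycophancy rates spike in low-resource languages regardless of topic, identifies tokenizer fertility as a driver, and shows that alignment methods generalize poorly beyond high-resource languages.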

Comments 19 pages, 9 figures, 7 tables

Abstract

Safety-aligned large language models often exhibit sycophancy, which is the tendency to affirm users' opinions regardless of factual accuracy. Although well-studied in English, its manifestation in other languages remains largely unexamined, leaving billions of non-English speakers potentially vulnerable to model-validated misinformation. We present the first large-scale, multi-model evaluation of cross-lingual sycophancy, benchmarking six instruction-tuned models across 1.1 million instances spanning 38 languages and 33 topic categories. We identify a consistent resource-tier effect: sycophancy rates spike sharply in low-resource and zero-shot language settings. Critically, this degradation is topic-agnostic, as models fail uniformly across both benign and safety-critical prompts, offering no additional protection where it is most needed. We further identify tokenizer fertility as a structural driver of this alignment collapse. Collectively, our results demonstrate that prevailing alignment methodologies generalize poorly beyond high-resource languages, underscoring the urgent need for equitable multilingual safety techniques.
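
Tokenizer fertility is commonly measured as the average number of subword tokens produced per word. The sketch below computes it with a toy tokenizer that splits unknown words into characters; the tokenizer, vocabulary, and example sentences are assumptions, and the abstract does not say exactly how the paper measures fertility.

```python
def toy_tokenize(word, vocab=frozenset({"the", "cat", "sat"})):
    """Stand-in tokenizer: one token if the word is known, else characters."""
    return [word] if word in vocab else list(word)


def fertility(text, tokenize):
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    return sum(len(tokenize(word)) for word in words) / len(words)


print(fertility("the cat sat", toy_tokenize))     # every word is in the vocab
print(fertility("the kucing sat", toy_tokenize))  # one word falls apart
```

Text in a language that the vocabulary covers poorly is fragmented into many more tokens per word, which is the property the abstract links to weaker alignment.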

2606.08450 2026-06-09 cs.AI New submission

GIFT: LLM-Guided State-Reward Interface for Financial Reinforcement Learning

Yanyan Wu, Boyi Zhang, Yanlin Liu, Xinyu Fang, Jining Luan, Meiqi Zhang, Jiacheng Liu, Hao Zeng, Dexu Yu, Chang Liu, Hanwen Du, Yongxin Ni, Youhua Li

Affiliations East China University of Science and Technology; University of Science and Technology of China; Southwestern University of Finance and Economics; University of Sydney; City University of Hong Kong; Northeastern University; The Ohio State University; National University of Singapore

AI summary Proposes the GIFT framework, which uses a large language model to guide state enhancement and reward shaping in PPO-based reinforcement learning, improving the out-of-sample risk-adjusted returns of financial trading strategies.

Comments 25 pages, 7 figures. Code and data are available at https://github.com/KAG778/GIFT. Equal contribution: Yanyan Wu and Boyi Zhang. Corresponding author: Youhua Li

Abstract

Financial portfolio trading is naturally formulated as a reinforcement learning problem, where an agent sequentially rebalances assets under changing market conditions to balance return, risk, and transaction costs. Yet in non-stationary markets, raw OHLCV states and short-horizon return rewards often provide an under-specified learning interface, motivating large language models as a way to inject financial knowledge into state and reward design while constraining open-ended generation. To this end, we propose GIFT, an LLM-guided framework for state-reward interface design in PPO-based financial reinforcement learning. Rather than using the LLM to make trading decisions, GIFT uses Factor-guided State Enhancement to generate state features from financial-factor primitives, Risk-rule-guided Reward Shaping to generate auxiliary rewards from portfolio-risk rules, and Diagnostic-guided Refinement to revise candidate interfaces using PPO rollout diagnostics. After refinement, GIFT fixes the selected state-reward interface before evaluation, with no further LLM queries or interface updates at test time. Comprehensive rolling-window experiments across diverse market regimes and portfolio scenarios demonstrate that GIFT improves learning-signal quality and out-of-sample risk-adjusted portfolio performance over baselines. Code and data are available at: https://github.com/KAG778/GIFT .

2606.08447 2026-06-09 cs.LG cs.AI New submission

Not Just After One: Sleep-Inspired Replay Prevents Catastrophic Forgetting After Sequential Tasks

Anthony Bazhenov, Jean Erik Delanois, Giri P. Krishnan

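Affiliations Department of Neuroscience, University of California, San Diego, CA, USA

AI summary Proposes a sleep-inspired unsupervised replay mechanism that is applied after several new tasks have been trained sequentially, partially restoring performance on all previously learned tasks and mitigating catastrophic forgetting.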

Abstract

One of the critical limitations of artificial neural networks is their lack of ability to continually learn: training on new tasks often leads to interference and forgetting of the previous ones. While several algorithms have been proposed to protect old memories from interference, they are typically applied during or immediately after each new episode of training. In contrast, humans and animals can learn continuously, acquiring multiple new memories during active learning before consolidating all of them into long-term storage. Here we show that multiple new tasks can be trained sequentially before an unsupervised sleep-like replay phase is applied to partially restore performance across all previously learned tasks. Our study further suggests that task-specific information remains resilient to new training but decays gradually as the network is trained on new tasks. These findings point to novel principles for developing a broad range of continual learning AI solutions.

2606.08446 2026-06-09 cs.LG cs.AI New submission

Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models

Yang Zhou, Ranajoy Sadhukhan, Zhaofeng Sun, Zhuoming Chen, Souvik Kundu, Saket Dingliwal, Sai Muralidhar Jayanthi, Aram Galstyan, Haizhong Zheng, Beidi Chen

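Affiliations Carnegie Mellon University; Cornell University; Intel; Amazon AGI

AI summary To address the high cost of long-context rollouts in RLVR, proposes Sparrow, which uses a dynamic sparsity schedule to keep the lower-tail statistic of token-level policy mismatch stable, achieving 2.0x to 2.4x rollout speedups on Qwen3 models and generalizing to a larger model and the coding domain.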

Abstract

Despite being powerful, reinforcement learning with verifiable rewards (RLVR) induces extremely long chains of thought (CoT), making it computationally expensive. Since RLVR per-step cost is dominated by long-context rollout generation, sparse attention offers a promising way to accelerate dense rollout. However, sparse rollouts require a delicate stability-efficiency tradeoff: overly aggressive sparsity causes collapse, while overly lenient sparsity gives insufficient speedup. In this work, we study this tradeoff through sparse-to-dense actor-policy mismatch. We first observe that sparse rollout collapse is not driven by uniform degradation across tokens: most sparse tokens align perfectly with their dense counterparts even under aggressive sparsity. Motivated by this, we hypothesize that sparse rollout training remains stable if the lower tail of per-token actor-policy mismatch stays above a critical threshold throughout the trajectory. We introduce a dynamic sparsity schedule that keeps this tail statistic constant during generation and validate our hypothesis. Across Qwen3 thinking-family models, keeping the tail mismatch statistic near a consistent threshold generally enables stable training. We then use a cost model to find the sparsity schedule for maximum speedup under this mismatch threshold, achieving 2.2x, 2.4x, and 2.0x rollout speedups when training Qwen3-1.7B, Qwen3-4B, and Qwen3-8B, respectively. Empirically, we show the thresholds generalize to a larger model (Qwen3-14B) and another RL domain (coding). Finally, our analysis naturally motivates DistillSparse: lightweight LoRA-based distillation on sparse rollout lets more aggressive sparsity reach the same sparse-to-dense mismatch threshold, yielding higher speedup.

2606.08445 2026-06-09 cs.CL cs.AI New submission

Segment-level Tree Search for Long Meeting Document Summarization

Sangwon Ryu, Heejin Do, Jun Seo, Daehui Kim, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok

Affiliations GSAI, POSTECH; CSE, POSTECH; ETH Zurich; ETH AI Center; Agentic AI Lab, KT; LILT

AI summary Proposes S3, a segment-level summarization framework based on Monte Carlo Tree Search that composes segment-level candidate summaries without training and reaches performance comparable to 72B models using a 7B model.

Comments INTERSPEECH 2026

Abstract

Meeting documents are challenging to summarize due to their length and complex conversational structure. Existing approaches typically adopt multi-stage pipelines that extract information prior to summarization; however, these approaches often suffer from cumulative error propagation without intermediate validation, a limitation further amplified by short and low-quality reference summaries. We propose segment-level summarization via Monte Carlo Tree Search (S3), a training-free framework that constructs a final summary by composing segment-level summary candidates. S3 partitions a long document into segments and generates multiple summary candidates per segment, forming nodes of a search tree. The best-scoring combination is selected via self-reward-guided tree search and refined into the final output. Despite using a 7B model, S3 achieves performance comparable to larger 72B models while producing length-appropriate summaries.

2606.08440 2026-06-09 cs.RO cs.CV 新提交

GraspFoM: Towards Reconstruction-Driven Robotic Grasping with 3D Foundation Priors

GraspFoM:基于3D基础先验的重建驱动机器人抓取

Dongli Wu, Xiaobao Wei, Hao Wang, Qiaochu Dong, Ying Li, Qingpo Wuwu, Ming Lu, Wufan Zhao

发表机构 * The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) Peking University(北京大学) The Hong Kong University of Science and Technology(香港科技大学)

AI总结 提出GraspFoM框架,利用3D基础先验(SAM3D)构建共享3D物体潜变量,联合优化重建与抓取姿态预测,通过锚点初始化的截断姿态推理扩散器生成连续多模态抓取,实现高保真重建与最优抓取。

详情
AI中文摘要

机器人抓取是机器人操作中的基本能力。然而,在部分观测下抓取仍然具有挑战性。可靠的抓取依赖于局部接触线索和物体级3D结构。现有的几何感知抓取方法认识到重建的价值,但通常将几何视为中间预测,而不是可重用的抓取物体先验。在本文中,我们提出了GraspFoM,一个统一的框架,利用3D基础先验(SAM3D)为重建和抓取姿态预测构建共享的3D物体潜变量。基于这个共享的物体潜变量,我们引入了一个锚点初始化的截断姿态推理扩散器,它预测连续且多模态的抓取姿态,而不直接依赖离散的抓取候选。我们进一步通过一个重建感知评分器和残差潜变量更新器来研究重建与抓取之间的相互作用。重建提供基于几何的线索,而抓取监督则使共享的物体潜变量向与抓取相关的可操作性区域细化。GraspFoM联合预测抓取姿态并以网格和3DGS形式重建高保真3D资产。综合实验表明,GraspFoM在重建和抓取上都达到了最先进的结果。值得注意的是,这些改进只需要少量额外的可训练参数。组件消融研究也证明了每个组件的贡献。

英文摘要

Robotic grasping is a fundamental capability in robotic manipulation. Yet grasping remains challenging under partial observations. Reliable grasping depends on both local contact cues and object-level 3D structure. Existing geometry-aware grasping methods recognize the value of reconstruction, but they typically treat geometry as an intermediate prediction rather than a reusable object prior for grasping. In this paper, we present GraspFoM, a unified framework that leverages 3D foundation priors (SAM3D) to build a shared 3D object latent for both reconstruction and grasp pose prediction. Built on this shared object latent, we introduce an anchor-initialized truncated pose-reasoning diffuser that predicts continuous and multimodal grasp poses without directly relying on discrete grasp candidates. We further investigate the interaction between reconstruction and grasping through a reconstruction-aware scorer and a residual latent updater. Reconstruction provides grounded geometric cues, while grasp supervision refines the shared object latent toward grasp-relevant affordances. GraspFoM jointly predicts grasp poses and reconstructs high-fidelity 3D assets in mesh and 3DGS forms. Comprehensive experiments demonstrate that GraspFoM achieves state-of-the-art results on both reconstruction and grasping. Notably, these improvements require only a small number of additional trainable parameters. Component-wise ablation studies also demonstrate the contribution of each component.

2606.08432 2026-06-09 cs.AI 新提交

Trajectory-Refined Distillation

轨迹精炼蒸馏

Li Jiang, Haoran Xu, Yichuan Ding, Amy Zhang

发表机构 * McGill University(麦吉尔大学) Mila Quebec AI Institute(米拉魁北克人工智能研究所) UT Austin(德克萨斯大学奥斯汀分校)

AI总结 提出轨迹精炼蒸馏(TRD),通过教师指导修正学生轨迹中的前缀错误,解决在线策略蒸馏中的前缀失败问题,提升大语言模型的单次准确率和推理覆盖。

Comments under review

详情
AI中文摘要

在线策略蒸馏(OPD)已成为大型语言模型(LLM)的重要后训练工具,它沿着学生自身的生成轨迹提供密集的逐词教师监督。在这项工作中,我们识别出OPD中一个常见的结构性问题,称为前缀失败。在前缀失败下,密集的逐词监督会导致双峰教师混合和碎片化梯度,而词级损失截断或重加权无法解决这一问题。这一观察促使我们超越词级损失干预,转向轨迹级输出修正。因此,我们提出轨迹精炼蒸馏(TRD),一种轨迹级修正方法,在教师指导下,于在线策略支持范围内修正学生的生成轨迹。通过在蒸馏前修正有问题的前缀,TRD从根源上缓解了前缀失败。此外,即使原始轨迹已经正确,TRD也能通过教师指导让学生接触到替代的有效推导,从而改善探索。TRD还可应用于在线策略自蒸馏(OPSD),这是一种使用基于特权信息的学生模型作为教师的参数共享变体。在多个尺度的广泛基准和基础模型上,TRD始终优于先前基线,提高了单次尝试准确率并扩展了推理覆盖范围。代码可在 https://github.com/louieworth/trd 获取。

英文摘要

On-policy distillation (OPD) has become a central post-training tool for large language models (LLMs), providing dense per-token teacher supervision along the student's own rollouts. In this work, we identify a common structural issue in OPD, which we call prefix failure. Under prefix failure, dense per-token supervision induces a bimodal teacher mixture and fragmented gradients that token-level loss truncation or reweighting fails to address. This observation motivates us to move beyond token-level loss interventions toward trajectory-level output corrections. We thus propose Trajectory-Refined Distillation (TRD), a trajectory-level correction method that revises the student's rollout under teacher guidance while staying within on-policy support. By correcting problematic prefixes before distillation, TRD mitigates prefix failure at its source. Moreover, TRD improves exploration by exposing the student to alternative valid derivations under teacher guidance, even when the original rollouts are already correct. TRD can also be applied to on-policy self-distillation (OPSD), a parameter-sharing variant that uses the student model conditioned on privileged information as the teacher. Across a wide range of benchmarks and base models at multiple scales, TRD consistently outperforms prior baselines, improving single-attempt accuracy and broadening reasoning coverage. Code is available at https://github.com/louieworth/trd

2606.08425 2026-06-09 cs.SD cs.CL eess.AS 新提交

TinyGiantALM: A Compact Audio-Language Model for Intent-Aware Reasoning under Resource Constraints

TinyGiantALM:面向资源约束下意图感知推理的紧凑型音频-语言模型

Vinh-Thuan Ly

发表机构 * University of Science, VNU-HCM(胡志明市国立大学下属理科大学) Vietnam National University, Ho Chi Minh City(胡志明市国立大学)

AI总结 提出紧凑型1.5B参数音频-语言模型TinyGiantALM,通过指令感知特征精炼框架(查询引导投影器+语义门控)过滤用户意图相关声学信号,在MMAR基准上零样本准确率46.4%,超越7B-13B基线,并优于8倍大模型。

Comments Accepted to Interspeech 2026. Project page: https://interspeech-tinygiant-alm.vercel.app

详情
AI中文摘要

当前音频推理的进展依赖于大规模音频-语言模型(LALMs),阻碍了在资源受限环境中的部署。我们提出了TinyGiantALM,一个紧凑的1.5B参数效率导向替代方案。不同于暴力扩展规模,我们提出了一种指令感知特征精炼框架,使用查询引导投影器和语义门控,基于用户意图过滤声学信号。在MMAR基准上,TinyGiantALM实现了46.4%的零样本准确率,显著优于7B-13B基线。虽然在逻辑叙事推理方面与30B+模型存在差距,且在过于密集或空间场景中存在某些权衡,但我们的方法在解耦混合模态环境方面显著优于高达8倍大小的模型。这些发现表明,架构精度为在边缘友好规模上获得稳健感知能力提供了一条切实可行的路径。

英文摘要

Current advancements in Audio Reasoning rely on massive Large Audio-Language Models (LALMs), hindering deployment in resource-constrained environments. We introduce TinyGiantALM, a compact 1.5B efficiency-oriented alternative. Instead of brute-force scaling, we propose an Instruction-Aware Feature Refinement framework using a Query-guided Projector and Semantic Gating to filter acoustic signals based on user intent. On the MMAR benchmark, TinyGiantALM achieves 46.4% zero-shot accuracy, significantly outperforming 7B-13B baselines. While a reasoning gap in logical narrative remains versus 30B+ models and certain trade-offs exist in overly dense or spatial scenes, our approach notably surpasses models up to 8x larger in disentangling mixed-modality environments. These findings demonstrate that architectural precision offers a tangible pathway to secure robust perception capabilities on edge-friendly scales.

2606.08421 2026-06-09 cs.CV 新提交

Segmentation-Assisted Brain MRI Synthesis with Cross-Image Multi-Contrast Feature Memory Bank Retrieval Augmentation

基于跨图像多对比度特征记忆库检索增强的分割辅助脑MRI合成

Wenwei Huang, Jia Wei, Jianlong Zhou

发表机构 * South China University of Technology(华南理工大学) University of Technology Sydney(悉尼科技大学)

AI总结 提出分割辅助的闭环生成对抗框架,通过辅助分割分支和双库检索增强策略,提高多对比度脑MRI中肿瘤区域的合成保真度。

详情
AI中文摘要

多对比度脑MRI提供互补的软组织特征,有助于疾病的筛查和诊断。然而,有限的扫描时间、图像损坏和各种成像协议常常导致多对比度图像不完整。虽然当前方法在图像合成方面表现出色,但它们通常难以合成关键的肿瘤区域,并且无法有效利用多对比度脑MRI中的上下文信息。为了解决这个问题,我们提出了一种以合成为中心、分割辅助的闭环框架,结合检索增强合成。我们的方法整体采用生成对抗架构,旨在通过单一模型从任何可用对比度的组合中合成缺失的对比度。为了显式捕获肿瘤语义并将合成聚焦于肿瘤区域,我们添加了一个辅助分割分支,该分支预测肿瘤掩膜并将其作为语义条件反馈给合成分支,从而在模型中学习肿瘤感知表示并提高合成保真度。此外,我们提出了一种双库检索增强策略。它动态查询两个外部知识库,即用于关键肿瘤上下文的肿瘤掩膜记忆库和用于全局风格信息的跨图像对比度特征记忆库,以增强合成。在两个公开的多对比度磁共振脑数据集:BraTs2020和UCSF-BMSR上验证,所提出的方法在处理医学脑图像合成任务方面有效,并且与先前方法相比表现出优越的性能。代码可在 https://github.com/iBizzard/SSCF.git 获取。

英文摘要

Multi-contrast brain MRI provides complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption, and varying imaging protocols often result in incomplete multi-contrast images. While current approaches excel in image synthesis, they often struggle to synthesize critical tumor regions and to exploit contextual information in multi-contrast brain MRI effectively. To address this issue, we propose a synthesis-centric, segmentation-assisted closed-loop framework with retrieval-augmented synthesis. Overall, our method adopts a generative adversarial architecture that aims to synthesize missing contrasts from any combination of available ones with a single model. To explicitly capture tumor semantics and focus synthesis on tumor regions, we add an auxiliary segmentation branch that predicts tumor masks and feeds them back as semantic conditioning to the synthesis branch, thereby learning tumor-aware representations in the model and improving synthesis fidelity. Furthermore, we propose a dual-bank retrieval augmentation strategy. It dynamically queries two external knowledge bases, namely a tumor-mask memory bank for crucial tumor context and a cross-image contrast feature memory bank for global style information, to augment synthesis. Validated on two public multi-contrast magnetic resonance brain datasets, BraTs2020 and UCSF-BMSR, the proposed method is effective in handling medical brain image synthesis tasks and shows superior performance compared to previous methods. Code is available at: https://github.com/iBizzard/SSCF.git

2606.08420 2026-06-09 cs.CV 新提交

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

CheXanatomy: 面向胸部X光片的解剖感知视觉-语言建模

Sergios Gatidis, Curtis Langlotz, Christian Bluethgen

发表机构 * Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University(斯坦福大学医学与影像人工智能中心) Department of Radiology, Stanford University(斯坦福大学放射学系)

AI总结 提出CheXanatomy框架,通过自回归令牌空间监督将解剖知识融入预训练视觉-语言模型,实现解剖分割,在合成和真实X光片上性能媲美U-Net,并提升域迁移鲁棒性和样本效率。

详情
AI中文摘要

在大规模图像-文本对上预训练的视觉-语言模型(VLM)表现出强大的图像级理解能力,但主要针对全局对齐进行优化,并未显式编码细粒度解剖结构,限制了其在分割等空间精确任务中的适用性。我们提出CheXanatomy,一个通过自回归令牌空间监督将显式解剖知识融入预训练VLM的框架。该模型无需添加任务特定的解码器头,而是通过下一个令牌预测训练生成解剖分割掩码。为了实现可扩展的监督,我们从CT体积合成逼真的胸部X光片,并前向投影CT分割标签以获得解剖一致的2D掩码。我们在合成和真实胸部X光片上评估该方法,与U-Net基线进行比较,包括模型规模、输入分辨率和视觉编码器微调的消融实验。自回归解剖监督在分布内实现了与专用卷积模型相当的性能,并在向真实CXR数据的域迁移下表现出改进的几何鲁棒性。此外,在有限监督下适应新定位任务时,解剖预训练模型展现出更好的样本效率。更大的模型和更高的输入图像分辨率提升了性能,而视觉编码器微调效果有限。这些结果表明,将解剖结构直接嵌入生成目标促进了空间有根据的表征,并支持解剖感知的医学视觉-语言建模。

英文摘要

Vision-language models (VLMs) pretrained on large-scale image-text pairs demonstrate strong image-level understanding, but are primarily optimized for global alignment and do not explicitly encode fine-grained anatomical structure, limiting their suitability for spatially precise tasks such as segmentation. We introduce CheXanatomy, a framework that integrates explicit anatomical knowledge into a pretrained VLM through autoregressive token-space supervision. Instead of adding task-specific decoder heads, the model is trained to generate anatomical segmentation masks via next-token prediction. To enable scalable supervision, we synthesize realistic chest radiographs from CT volumes and forward-project CT segmentation labels to obtain anatomically consistent 2D masks. We evaluate the approach on synthetic and real chest radiographs against a U-Net baseline, including ablations on model scale, input resolution, and vision encoder fine-tuning. Autoregressive anatomical supervision achieves performance comparable to specialized convolutional models in-distribution and demonstrates improved geometric robustness under domain shift to real CXR data. In addition, anatomy-pretrained models exhibit improved sample efficiency when adapting to novel localization tasks under limited supervision. Larger models and higher input image resolution improve performance, while vision encoder fine-tuning has limited effect. These results show that embedding anatomical structure directly into the generative objective promotes spatially grounded representations and supports anatomy-aware medical vision-language modeling.

2606.08414 2026-06-09 cs.RO cs.AI 新提交

PACT: Self-Evolving Physical Safety Alignment for Diffusion Policies in Embodied Manipulation

PACT: 具身操作中扩散策略的自我演化物理安全对齐

Lingxuan Wu, Zijian Zhu, Lizhong Wang, Chengyang Ying, Huayu Chen, Xiao Yang, Fangming Liu, Jun Zhu

发表机构 * Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua-Bosch Joint ML Center, THBI Lab, BNRist Center, Tsinghua University, Beijing, 100084, China(计算机科学与技术系,人工智能研究院,清华-博世联合机器学习中心,THBI实验室,BNRist中心,清华大学,北京,100084,中国) Peng Cheng Laboratory, 518108, China(鹏城实验室,518108,中国)

AI总结 提出PACT框架,通过自演化后训练将预训练扩散策略投影到约束可行区域,无需演示数据或任务奖励,在降低31.0%安全违规的同时提升30.7%任务成功率。

详情
AI中文摘要

扩散策略在机器人操作中取得了显著成功,但常常无法满足安全部署所需的严格物理约束。现有方法要么在训练期间过早施加安全约束,要么在测试时通过外部护栏被动应对,限制了策略的表达能力和整体可扩展性。我们提出物理安全对齐约束轨迹(PACT),这是一个自我演化的后训练框架,将预训练扩散策略投影到约束可行区域,无需访问演示数据或任务奖励。PACT通过跨时间步密集监督的反向KL目标将约束梯度蒸馏到扩散模型中。它采用课程学习逐步收紧约束,同时保持理论上界定的策略偏移和单调改进,减轻了灾难性遗忘带来的安全-性能权衡。在模拟和真实世界的具身操作基准测试中,PACT平均减少31.0%的安全违规,同时将任务成功率提升30.7%。

英文摘要

Diffusion policies have achieved remarkable success in robotic manipulation, yet they often fail to satisfy strict physical constraints required for safe deployment. Existing approaches impose safety either prematurely during training or reactively via external guardrails at test time, limiting policy expressivity and overall scalability. We propose Physical safety Alignment for Constrained Trajectories (PACT), a self-evolving post-training framework that projects pretrained diffusion policies onto constraint-feasible regions without accessing demonstration data or task rewards. PACT distills constraint gradients into the diffusion model through a reverse-KL objective with dense supervision across timesteps. It incorporates a curriculum that progressively tightens constraints while maintaining theoretically bounded policy shift and monotone improvement, mitigating the safety-performance trade-off from catastrophic forgetting. On simulated and real-world embodied manipulation benchmarks, PACT significantly reduces safety violations by 31.0% on average while improving task success by 30.7%.

2606.08411 2026-06-09 cs.CL 新提交

AsyncLane: Decoupling Refinement from Advancement in Diffusion Language Model Decoding

AsyncLane: 扩散语言模型解码中精炼与推进的解耦

Yingxuan Ren, Yuxuan Lou, Yong Liu, Pengcheng Fang, Ziming Wang, Pengfei Zhou, Yang You

发表机构 * National University of Singapore(新加坡国立大学) University of Southampton(南安普顿大学)

AI总结 提出AsyncLane,一种无需训练的解码调度器,通过将生成过程分叉为精炼和推进两个通道,解耦块间依赖,在保持质量的同时显著提升扩散语言模型的解码吞吐量。

详情
AI中文摘要

块级半自回归解码是扩散大语言模型(DLMs)的标准推理范式,但它强制块之间存在严格依赖:当前块完全解码或去噪预算耗尽之前,下一个块无法开始。我们观察到,一旦一个块暴露出可靠的分隔符边界或稳定的语义前缀,续写生成无需等待每个残差标记被解析。我们提出AsyncLane,一种无需训练的解码调度器,将精炼与推进解耦。AsyncLane在观察到的分隔符边界处将生成通道分叉为精炼通道和续写生成通道:前缀保持可编辑,而续写在前缀精炼完成之前推进。由此产生的通道树记录解码依赖关系和输出顺序,而执行则在活跃通道集上进行。为了使这种异步调度在双向注意力下高效,AsyncLane结合了共享前缀通道批处理、前瞻草稿重用、级联终止以及带有刷新-逻辑重用的紧凑缓存刷新,防止模型调用成本随通道数量线性增长。AsyncLane是块级DLM采样器的即插即用替代品,无需重新训练。在数学推理和代码生成实验表明,AsyncLane在保持竞争性质量的同时持续提高吞吐量。在LLaDA和Dream骨干网络上,AsyncLane在所有评估的基准长度设置中实现了最高的TPS;相对于最快的竞争基线,它在LLaDA上达到2.95倍峰值加速,在Dream上达到3.04倍,在较长生成预算下增益尤为显著。

英文摘要

Block-wise semi-autoregressive decoding is the standard inference paradigm for diffusion large language models (DLMs), but it imposes a strict dependency between blocks: the next block cannot begin until the current block is fully decoded or its denoising budget is exhausted. We observe that once a block exposes a reliable delimiter boundary or stable semantic prefix, continuation generation need not wait for every residual token to be resolved. We propose AsyncLane, a training-free decoding scheduler that decouples refinement from advancement. AsyncLane forks a generate lane at observed delimiter boundaries into a refine lane and a continuation generate lane: the prefix remains editable, while the continuation advances before prefix refinement finishes. The resulting lane tree records decoding dependencies and output order, while execution proceeds over the active lane set. To make this asynchronous schedule efficient under bidirectional attention, AsyncLane combines shared-prefix lane batching, lookahead draft reuse, cascading termination, and compact cache refresh with refresh-logit reuse, preventing model-call cost from scaling directly with the number of lanes. AsyncLane is a drop-in replacement for block-wise DLM samplers and requires no retraining. Experiments on mathematical reasoning and code generation show that AsyncLane consistently improves throughput while maintaining competitive quality. Across LLaDA and Dream backbones, AsyncLane achieves the highest TPS in all evaluated benchmark-length settings; relative to the fastest competing baseline, it reaches peak speedups of 2.95x on LLaDA and 3.04x on Dream, with especially large gains under longer generation budgets.

2606.08410 2026-06-09 cs.LG cs.AI 新提交

Provably Efficient Personalized Multi-Objective Bandits with Proactive Conversational Queries

具有主动对话查询的可证明高效个性化多目标老虎机

Linfeng Cao, Ming Shi, Ness B. Shroff

发表机构 * The Ohio State University(俄亥俄州立大学) University at Buffalo(布法罗大学)

AI总结 提出MO-PQUCB算法,通过主动查询获取用户偏好信号,结合Plackett-Luce模型和正则化UCB,解决多目标老虎机中偏好与奖励的耦合问题,实现更优的遗憾界。

Comments UAI 2026

详情
AI中文摘要

多目标老虎机中的个性化决策需要学习用户在不同竞争目标之间的特定权衡。由于臂的效用既取决于未知奖励又取决于未知偏好,现有方法仅从效用反馈中推断偏好,将偏好学习与奖励探索纠缠在一起。然而,在实践中,用户通常通过主动对话查询(例如,“便宜且干净的酒店”)揭示他们的优先级,但这种结构化信号未被利用。我们形式化了一个基于主动查询的框架,其中用户查询提供结构化的偏好信号。通过Plackett-Luce子集选择模型对这些信号进行建模,我们证明了由于基本的平移不变性障碍,仅查询学习是不够的。为了解决这个问题,我们引入了MO-PQUCB,一种混合算法,通过平移不变正则化和双探索UCB将基于查询的偏好锚定与老虎机反馈相结合。我们证明了主动查询加速了偏好估计,并相比先前偏好感知的MO-MAB方法实现了改进的遗憾缩放。在查询被破坏的情况下,我们进一步刻画了统计极限,并设计了一个鲁棒估计器,在破坏稀疏时实现接近最优的性能。实验验证了理论和实际收益。

英文摘要

Personalized decision-making in multi-objective bandits requires learning user-specific trade-offs among competing objectives. Since arm utility depends on both unknown rewards and unknown preferences, existing methods infer preferences only from utility feedback, entangling preference learning with reward exploration. In practice, however, users often reveal their priorities through proactive conversational queries (e.g., "cheap and clean hotel"), yet this structured signal is not leveraged. We formalize a proactive query-based framework in which user queries provide structured preference signals. Modeling these signals via a Plackett-Luce subset choice model, we show that query-only learning is insufficient due to a fundamental shift-invariance barrier. To resolve this, we introduce MO-PQUCB, a hybrid algorithm that integrates query-based preference anchoring with bandit feedback through shift-invariant regularization and dual-exploration UCB. We prove that proactive queries accelerate preference estimation and yield improved regret scaling over prior preference-aware MO-MAB methods. Under corrupted queries, we further characterize statistical limits and design a robust estimator achieving near-optimal performance when the corruption is sparse. Experiments validate both theoretical and practical gains.
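The shift-invariance barrier mentioned above can be illustrated with a minimal sketch. It assumes the standard Plackett-Luce ranking form, in which items are picked one at a time with probability proportional to the exponential of their score; the subset-choice variant used in the paper may differ in detail. The objective names and score values are hypothetical. Adding the same constant to every score leaves the likelihood unchanged, so query data alone cannot pin down the absolute level of the preference parameters.

```python
import math

def pl_ranking_prob(scores, ranking):
    """Plackett-Luce probability of observing `ranking` (most preferred first)."""
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        # Choose `item` among the objectives not yet ranked.
        denom = sum(math.exp(scores[j]) for j in remaining)
        prob *= math.exp(scores[item]) / denom
        remaining.remove(item)
    return prob

# Hypothetical preference scores over three objectives (illustration only).
scores = {"price": 1.2, "cleanliness": 0.7, "location": -0.4}
# The same scores shifted by a constant.
shifted = {name: value + 5.0 for name, value in scores.items()}

ranking = ["price", "cleanliness", "location"]
p = pl_ranking_prob(scores, ranking)
p_shifted = pl_ranking_prob(shifted, ranking)
print(p, p_shifted)  # the two probabilities agree up to rounding
```

Because both parameter vectors explain every query equally well, some additional anchor is needed to resolve the ambiguity, which is the role the abstract assigns to bandit feedback and shift-invariant regularization.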

2606.08408 2026-06-09 cs.CL cs.AI 新提交

TimpaTeks: Automatic In-place Text Sequence Modification via Diffusion Language Model Steering

TimpaTeks: 通过扩散语言模型引导实现自动原地文本序列修改

Ryandito Diandaru, Ikhlasul Akmal Hanif, Fadli Aulawi Al Ghiffari, Ahmed Elshabrawy, Alham Fikri Aji

发表机构 * MBZUAI(穆罕默德·本·扎耶德人工智能大学)

AI总结 提出TimpaTeks方法,将激活引导扩展到扩散语言模型,实现原地文本修改以改变概念,在情感和概念引导任务上降低困惑度并保持句子结构。

Comments 16 pages

详情
AI中文摘要

我们将激活引导扩展到扩散语言模型(DLM),并研究了一个由于DLM推理机制而产生的新问题:原地修改文本以呈现不同的概念。我们提出了TimpaTeks,一种使用DLM的自动原地文本修改机制。在IMDB电影评论(情感)和合成的猫狗数据集(任意、更非常规的概念引导)上的实验表明,TimpaTeks提供了一种可行的新机制来原地引导扩散语言模型的输出。TimpaTeks实现了原地修改,同时降低了句子困惑度并保留了原始句子结构,无需指令调优模型。与基于提示的DLM引导相比,TimpaTeks计算成本更低,因为它执行原地去噪,而不是构建额外的提示条件输出序列。

英文摘要

We extend activation steering to diffusion language models (DLMs) and study a novel problem that arises from the inference mechanism of DLMs: modifying a text in-place to manifest a different concept. We propose TimpaTeks, an automatic in-place text modification mechanism using DLMs. Experiments on IMDB movie reviews (sentiment) and a synthetic Cats and Dogs Dataset (arbitrary, more unconventional concept steering) show that TimpaTeks provides a feasible novel mechanism to steer diffusion language model outputs in-place. TimpaTeks enables in-place modification while simultaneously lowering sentence perplexity and retaining the original sentence structure, without the need for instruction-tuned models. TimpaTeks is also computationally cheaper than prompt-based DLM steering, as it performs denoising in-place rather than constructing an additional prompt-conditioned output sequence.

2606.08405 2026-06-09 cs.AI physics.flu-dyn 新提交

Self-Evolving Scientific Agent Discovers Generalizable Physically-Reasoned Fluid Control

自进化科学智能体发现可泛化的物理推理流体控制

Boai Sun, Wenjin Guo, Zongmin Yu, Liu Yang

发表机构 * National University of Singapore(新加坡国立大学)

AI总结 提出一种由大语言模型驱动的自进化科学智能体工作流,通过迭代代码生成和物理仿真诊断,自动构建可解释的控制器,并在欠驱动双关节狗鲨游泳器目标到达任务中实现零样本泛化。

详情
AI中文摘要

虽然数据密集的深度强化学习可以优化复杂的控制策略,但物理系统中的科学发现从根本上需要一条可解释的推理链,将物理证据与结构化控制架构联系起来。本文提出了一种自进化的科学智能体工作流,由大语言模型和迭代代码生成驱动,在保持严格可解释性和严谨物理推理的同时,自动构建控制器。该智能体不是调整权重,而是将候选策略部署到物理仿真中,从多模态证据中主动诊断动态行为,并将这些观察转化为渐进的源代码改进。我们在一个高度非线性的流固耦合问题上展示了该框架:一个欠驱动的双关节狗鲨游泳器,仅使用关节角加速度完成空间目标到达任务。从表现出单侧转向偏差的推进种子策略开始,智能体自主发现并改进了一个统一控制器,稳健地捕获所有典型目标。值得注意的是,无需任何重新训练或特定目标分支,合成的控制策略就能泛化到未见过的静态目标和动态曲线追踪轨迹。可审计的进化日志揭示了一个基于行波推进、体坐标系目标引导、偏航率反馈、有符号平均尾曲率和自适应节奏缓解的涌现控制架构。我们的结果表明,自主科学智能体能够成功地将累积的物理证据转化为稳健、数学可读的控制策略,同时保持完全可追溯的科学发现过程。

英文摘要

While data-intensive deep reinforcement learning can optimize complex control policies, scientific discovery in physical systems fundamentally requires an interpretable chain of reasoning that connects physical evidence to structured control architectures. Here, we present a self-evolving scientific-agent workflow, driven by large language models and iterative code generation, that automates controller construction while preserving strict interpretability and rigorous physical reasoning. Instead of adjusting weights, the agent deploys candidate strategies into physical simulations, actively diagnoses dynamic behaviors from multimodal evidence, and translates these observations into progressive source-code refinements. We demonstrate this framework on a highly non-linear fluid-structure interaction problem: an underactuated, two-joint dogfish swimmer tasked with spatial target reaching using only joint angular accelerations. Starting from a propulsive seed policy that exhibits a one-sided steering bias, the agent autonomously discovers and refines a unified controller that robustly captures all canonical targets. Remarkably, without any retraining or target-specific branching, the synthesized control policy generalizes to unseen static targets and dynamically curved pursuit trajectories. The auditable evolution log reveals an emergent control architecture built upon traveling-wave propulsion, body-frame target guidance, yaw-rate feedback, signed mean-tail curvature, and adaptive cadence relief. Our results show that an autonomous scientific agent can successfully transform accumulated physical evidence into a robust, mathematically readable control policy, while maintaining a fully traceable process of scientific discovery.

2606.08404 2026-06-09 cs.CV 新提交

Geometry-Driven Flow Analysis of Brain Sulcal Pattern

脑沟模式的几何驱动流分析

Moo K. Chung, Luigi Maccotta, Aaron Struck

发表机构 * GitHub

AI总结 提出基于泊松方程的几何驱动流框架,通过平均曲率建模皮层折叠,生成光滑势场梯度定义物理通量,用于分析青少年肌阵挛癫痫的皮层结构异常。

详情
AI中文摘要

皮层折叠反映了协调的神经发育过程,并日益被认为是神经系统疾病的敏感标志。然而,现有大多数分析依赖于间接的标量摘要,并未明确建模折叠几何本身。在青少年肌阵挛癫痫(JME)中,一种常见的遗传性癫痫,皮层异常通常是微妙的、空间分布的,并且难以使用传统的形态测量指标检测。我们引入了一个基于泊松方程的框架,将皮层折叠建模为源自皮层流形上平均曲率的几何驱动流。通过将折叠模式视为静态的源-汇结构,所提出的方法产生了一个光滑的、全局平衡的势场,其表面梯度定义了物理上可解释的通量。该框架能够对脑沟-脑回折叠组织进行空间连贯的分析,并为JME中几何驱动的皮层结构提供了原则性的表示。

英文摘要

Cortical folding reflects coordinated neurodevelopmental processes and is increasingly recognized as a sensitive marker of neurological disease. However, most existing analyses rely on indirect scalar summaries that do not explicitly model folding geometry itself. In juvenile myoclonic epilepsy (JME), a common genetic epilepsy, cortical abnormalities are often subtle, spatially distributed, and difficult to detect using conventional morphometric measures. We introduce a Poisson-equation-based framework that models cortical folding as a geometry-driven flow derived from mean curvature on the cortical manifold. By treating folding patterns as a stationary source-sink structure, the proposed approach yields a smooth, globally balanced potential field whose surface gradient defines a physically interpretable flux. This framework enables spatially coherent analysis of sulcal-gyral folding organization and provides a principled representation of geometry-driven cortical structure in JME.

2606.08397 2026-06-09 cs.CL cs.IR 新提交

TrustMargin: Training-Free Arbitration between Parametric Memory and Retrieved Evidence in Large Language Models

TrustMargin: 大语言模型中参数化记忆与检索证据之间的无训练仲裁

Jingyan Xu, Hong Shi, Yi Shan, Penghui Liu, Yunhao Bai, Ningyuan Li, Xueyang Liu

发表机构 * Peking University(北京大学)

AI总结 针对大语言模型在知识问答中参数记忆与检索证据冲突的问题,提出无训练仲裁层TrustMargin,利用模型自身似然度评分选择更可信的答案,无需微调或外部评判。

Comments 13 pages, 6 figures, 9 tables. Code and data are available at https://github.com/mojixu/TrustMargin.git

详情
AI中文摘要

大语言模型通过参数化记忆和检索证据回答知识密集型问题,但两种来源并非都可靠。检索可以填补知识空白,但干扰性段落可能覆盖正确的闭卷答案。我们将这种生成后冲突视为答案级源仲裁:给定来自同一冻结模型的直接和RAG答案,决定信任哪个源。我们提出TRUSTMARGIN,一个无训练、即插即用的仲裁层,它使用模型自身的似然度对两个现有候选答案进行评分。它结合了参数先验边际(测试记忆是否接受检索答案)和证据绑定边际(折扣仅段落显著性并衡量问题特定支持)。TRUSTMARGIN在直接和RAG之间进行选择,无需微调、外部评判或额外生成。在2WIKIMQA和CWQA上使用三种LLaMA规模,TRUSTMARGIN一致优于直接生成和BM25-RAG,恢复了直接/RAG oracle差距的一部分,并推广到多个无训练RAG流水线。

英文摘要

Large language models answer knowledge-intensive questions using both parametric memory and retrieved evidence, but neither source is uniformly reliable. Retrieval can fill knowledge gaps, yet distracting passages may override correct closed-book answers. We study this post-generation conflict as answer-level source arbitration: given Direct and RAG answers from the same frozen model, decide which source to trust. We propose TRUSTMARGIN, a training-free, plug-and-play arbitration layer that scores the two existing candidates with the model's own likelihoods. It combines a parametric-prior margin, which tests whether memory accepts the retrieved answer, with an evidence-binding margin, which discounts passage-only salience and measures question-specific support. TRUSTMARGIN selects between Direct and RAG without fine-tuning, external judges, or additional generation. Across 2WIKIMQA and CWQA with three LLaMA scales, TRUSTMARGIN consistently improves over Direct generation and BM25-RAG, recovers part of the Direct/RAG oracle gap, and generalizes to multiple training-free RAG pipelines.

2606.08394 2026-06-09 cs.CL 新提交

When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models

当正确决策隐藏内部压力:多模态语言模型中的决策状态探测

Haoran Zhao, Soyeon Caren Han, Eduard Hovy

发表机构 * The University of Melbourne(墨尔本大学)

AI总结 提出S³E框架,通过正锚定A/B强制选择任务和隐藏状态分析,发现多模态语言模型在正确行为下仍存在语义压力导致的决策状态位移。

详情
AI中文摘要

多模态语言模型通常通过外部行为进行评估:选择正确的图像-文本匹配、拒绝无支持的标题或正确回答视觉查询。然而,仅凭正确行为并不能证明模型的内部决策状态在受控语义压力下保持稳定。我们通过S$^3$E(结构化语义压力评估)框架研究这一差距,该框架用于分析多模态语言模型中行为-内部解耦。S$^3$E使用正锚定的A/B强制选择设置,其中图像支持的标题与语义压力候选进行对比,并在原始和交换选项顺序下进行,同时在回答前的决策状态提取隐藏状态。我们专注于严格正确的试验,即模型在两种顺序下都一致选择正确标题。我们不将任意的隐藏状态变化视为不稳定的证据,而是测量语义冲突候选是否相对于保持意义的控制项导致过度的决策状态位移。在Qwen3VL、Gemma3和InternVL3上,尽管强制选择行为正确,语义压力相对于词汇控制项始终产生显著的正选定层过度位移,而与随机负样本的比较则依赖于模型。我们将此解释为有范围的决策状态压力敏感性信号,而非下游失败或幻觉的证据。我们的结果表明,仅凭强制选择正确性不足以证明内部决策几何的不变性。

英文摘要

Multimodal language models are typically evaluated through external behavior: selecting the correct image--text match, rejecting unsupported captions, or answering visual queries correctly. However, correct behavior alone does not show that the model's internal decision state remains stable under controlled semantic stress. We study this gap through S$^3$E (Structured Semantic Stress Evaluation), a framework for analyzing behavior-internal decoupling in multimodal language models. S$^3$E uses a positive-anchored A/B forced-choice setup in which an image-supported caption is contrasted against semantic stress candidates under both original and swapped option orders, while hidden states are extracted at the pre-answer decision state. We focus on strict-correct trials, where the model consistently selects the correct caption across both orders. Rather than treating arbitrary hidden-state variation as evidence of instability, we measure whether semantic-conflict candidates induce excess decision-state displacement relative to meaning-preserving controls. Across Qwen3VL, Gemma3, and InternVL3, semantic stress consistently produces positive selected-layer excess displacement over lexical controls despite correct forced-choice behavior, while comparisons against random negatives are model-dependent. We interpret this as a scoped decision-state stress-sensitivity signal rather than evidence of downstream failure or hallucination. Our results suggest that forced-choice correctness alone is not a sufficient certificate of invariant internal decision geometry.

2606.08390 2026-06-09 cs.LG stat.ML 新提交

When Are Neural Interaction Discoveries Real? Identifiability, Recoverability, and a Pre-Fit Diagnostic

神经交互发现何时是真实的?可辨识性、可恢复性与拟合前诊断

Valentina Kuskova, Dmitry Zaytsev, Michael Coppedge

发表机构 * University of Washington(华盛顿大学)

AI总结 研究神经时间序列模型中交互发现的真实性问题,提出基于输入支持几何的可辨识性理论,并给出有效秩作为拟合前诊断工具。

Comments 11 pages, 3 figures

详情
AI中文摘要

当神经时间序列模型报告一个变量调节另一个变量对目标的影响时,发现的交互是数据的属性还是模型灵活性的伪影?我们认为这本质上是一个可辨识性问题,由观测输入支持的几何结构决定,而非特定的神经架构。我们在神经加性向量自回归(GNAVAR)的乘法门控扩展中研究该问题,其中源贡献由其他滞后变量调节。我们表明表示能力不等于可辨识性:依赖输入会在边特定交互项之间引入泄漏,低维支持允许不同的交互分解,这些分解在观测数据上一致但在其他地方不同。然后,我们在显式支持条件下(包括共享调节器设置)证明了归一化最小GNAVAR分解的总体可辨识性定理。该理论产生了一个简单的面向实践者的诊断:联合滞后块协方差的有效秩在拟合前预测对于给定候选集交互恢复是否可行。当候选集未知时,双种子稳定性检查提供了实用的操作测试。相同的支持条件将经验结果组织成理论预测的三种状态。我们的结果表明,交互可恢复性取决于支持几何,有效秩提供了实用的拟合前诊断,并且独立拟合之间的不稳定性是非可辨识交互发现的特征标志。可辨识性现象、支持条件和不稳定性标志是模型无关的;GNAVAR是使它们可证明的载体。

英文摘要

When a neural time-series model reports that one variable modulates another's effect on a target, is the discovered interaction a property of the data or an artifact of model flexibility? We argue that this is fundamentally a question of identifiability, governed by the geometry of the observed input support rather than by the specific neural architecture. We study the problem in a multiplicative-gating extension of neural additive vector autoregression (GNAVAR), in which source contributions are modulated by other lagged variables. We show that representational capacity is not identifiability: dependent inputs induce leakage between edge-specific interaction terms, and low-dimensional support permits distinct interaction decompositions that agree on the observed data while differing elsewhere. We then prove a population identifiability theorem for normalized minimal GNAVAR decompositions under explicit support conditions, including settings with shared modulators. The theory yields a simple practitioner-facing diagnostic: the effective rank of the joint lag-block covariance predicts, before fitting, whether interaction recovery is feasible for a given candidate set. When the candidate set is unknown, a two-seed stability check provides a practical operational test. The same support condition organizes empirical outcomes into the three states predicted by the theory. Our results show that interaction recoverability depends on support geometry, that effective rank provides a practical pre-fit diagnostic, and that instability across independent fits is a characteristic signature of non-identifiable interaction discovery. The identifiability phenomenon, the support condition, and the instability signature are model-agnostic; GNAVAR is the vehicle that makes them provable.
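As a rough illustration of the pre-fit diagnostic idea, the sketch below computes an effective rank of an input covariance matrix before any model is fitted. The abstract does not say which definition of effective rank is used; the participation ratio, (tr C)^2 / ||C||_F^2, is assumed here because it needs no eigendecomposition. The function names and the synthetic data are hypothetical.

```python
import random

def covariance(rows):
    """Sample covariance matrix of a list of equal-length observation rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def effective_rank(cov):
    """Participation ratio (tr C)^2 / ||C||_F^2; an assumed definition."""
    trace = sum(cov[i][i] for i in range(len(cov)))
    frob_sq = sum(x * x for row in cov for x in row)
    return trace * trace / frob_sq

rng = random.Random(0)

# Three independent lagged inputs: the support fills all three dimensions.
independent = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]

# Three perfectly dependent inputs: the support collapses onto a line.
dependent = []
for _ in range(500):
    x = rng.gauss(0, 1)
    dependent.append([x, 2 * x, -x])

er_independent = effective_rank(covariance(independent))
er_dependent = effective_rank(covariance(dependent))
print(er_independent, er_dependent)
```

Under this definition the dependent design has an effective rank of about 1 while the independent design sits close to 3, which is the kind of gap a practitioner could check before trusting any discovered interaction.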

2606.08388 2026-06-09 cs.LG math.OC stat.ML 新提交

The Spectral Dynamics and Noise Geometry of Muon

Muon的谱动力学与噪声几何

Pierfrancesco Beneventano, Mahmoud Abdelmoneum, Tomaso Poggio

发表机构 * Massachusetts Institute of Technology(麻省理工学院)

AI总结 研究Muon优化器通过极分解替换矩阵梯度,证明其偏置为平坦谱,在欠定回归中导出奇异值动力学,实验表明其效果依赖于谱方向活跃度。

Comments 24 pages, 11 figures

详情
AI中文摘要

Muon将矩阵梯度$G=UΣV^\top$替换为其极因子$UV^\top$。这保留了梯度选择的奇异方向,但使更新谱平坦。我们研究此操作产生的优化偏置。在显式对齐假设下,我们证明在利用梯度奇异方向且不适应当前权重谱的有界更新中,极更新是单步熵最大化的选择。在欠定回归模型中,我们推导了连续时间Muon的精确奇异值动力学,并识别出一个依赖于测量的条件,在该条件下归一化谱趋向于相等的非零奇异值。这种几何也排除了常见的低秩解释:在固定Frobenius范数下,Muon的区分状态具有平坦谱,而核范数最小化则偏好谱集中。受控矩阵感知实验将效应与简单梯度缩放分离,表明范数匹配的梯度下降不能复现Muon,并在广泛消融中恢复预测的平坦化趋势。在小型NanoGPT预训练中,Muon保持稳定秩,具有宽学习率平台,并相对于AdamW改善验证损失;在匹配的小型ViT对照中,排名反转。由此得出的图景是依赖于区域的:Muon并非普遍优越,但其平坦谱偏置在需要保持许多谱方向活跃时可能有所帮助。

英文摘要

Muon replaces a matrix gradient $G=UΣV^\top$ by its polar factor $UV^\top$. This keeps the singular directions selected by the gradient, but makes the update spectrum flat. We study the optimization bias created by this operation. Under explicit alignment assumptions, we prove that the polar update is the one-step entropy-maximizing choice among bounded updates that use the gradient singular directions and do not adapt to the current weight spectrum. In an underdetermined regression model, we derive exact singular-value dynamics for continuous-time Muon and identify a measurement-dependent condition under which the normalized spectrum moves toward equal nonzero singular values. This geometry also rules out a common low-rank interpretation: at fixed Frobenius norm, Muon's distinguished state has a flat spectrum, whereas nuclear-norm minimization favors spectral concentration. Controlled matrix-sensing experiments separate the effect from simple gradient rescaling, show that norm-matched gradient descent does not reproduce Muon, and recover the predicted flattening trend across broad ablations. In small NanoGPT pretraining, Muon preserves stable rank, has a broad learning-rate plateau, and improves validation loss relative to AdamW; in a matched small-ViT control, the ranking reverses. The resulting picture is regime-dependent: Muon is not universally superior, but its flat-spectrum bias can help when many spectral directions need to remain active.
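The polar-factor replacement can be made concrete with a small sketch. To avoid an explicit SVD, it uses the classical cubic Newton-Schulz iteration, X <- 1.5 X - 0.5 X X^T X, which converges to $UV^\top$ for a full-rank input once the singular values are scaled into (0, sqrt(3)). This iteration is a stand-in chosen for illustration and is not claimed to be the procedure analyzed in the paper; the helper names and the example gradient are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def polar_factor(g, steps=30):
    """Approximate U V^T for a full-rank matrix g via cubic Newton-Schulz."""
    # Dividing by the Frobenius norm puts every singular value in (0, 1].
    frob = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / frob for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
             for i in range(len(x))]
    return x

# Example gradient: a 90-degree rotation times diag(1, 2),
# so its singular values are 2 and 1.
grad = [[0.0, -2.0],
        [1.0, 0.0]]
update = polar_factor(grad)
print(update)  # close to the rotation [[0, -1], [1, 0]]
```

The update keeps the rotation, which carries the singular directions of the gradient, and discards the magnitudes 1 and 2: every singular value of the update equals 1, which is the flat update spectrum described in the abstract.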

2606.08382 2026-06-09 cs.LG cs.AI New submission

STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control

STAR-KV: Low-Rank KV Cache Compression with Adaptive Rank Control via Soft Thresholding

Priyansh Bhatnagar, Ashkan Moradifirouzabadi, Se-Hyun Yang, SeungJae Lee, Jungwook Choi, Mingu Kang

Affiliations * University of Washington

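AI summary Proposes STAR-KV, a framework that uses a differentiable thresholding mechanism for adaptive rank selection at the attention-head and block levels, combined with hybrid decomposition and low-rank-aware mixed-precision quantization. Across several LLMs it achieves up to 75% KV cache compression and up to 20x reduction when combined with quantization, with up to 6.9x attention-module speedup and 3.1x end-to-end generation throughput.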

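Details
AI abstract (translated from Chinese)

Low-rank projection, which exploits redundancy in the hidden dimension, has become a promising way to compress the KV cache. Prior methods, however, rely on fixed or heuristic rank selection and struggle to compress aggressively with minimal accuracy loss. We propose STAR-KV, an adaptive low-rank KV cache compression framework with fine-grained rank control. STAR-KV comprises: 1) a differentiable thresholding mechanism that enables optimal rank selection at both the attention-head and block levels; 2) a hybrid decomposition strategy that applies different low-rank factorizations according to the sensitivity of the key and value projections; and 3) low-rank-aware mixed-precision quantization that uses data statistics to achieve near-lossless low-bit quantization. Evaluated across multiple LLMs and benchmarks, STAR-KV achieves up to 75% KV cache compression and up to 20x overall KV cache reduction when combined with quantization. With custom Triton-based GPU kernels, STAR-KV delivers up to 6.9x speedup for the attention module and 3.1x end-to-end generation throughput. Our code is publicly available at: https://github.com/PriyanshBhatnagar/STAR-KV.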

English abstract

Low-rank projection has emerged as a promising approach for compressing the KV cache by exploiting hidden-dimension redundancy. However, prior methods rely on fixed or heuristic rank selection and struggle to achieve aggressive compression with minimal accuracy degradation. We propose STAR-KV, an adaptive low-rank KV cache compression framework with fine-grained rank control. STAR-KV encompasses 1) a differentiable thresholding mechanism that enables optimal rank selection at both attention-head and block levels, 2) a hybrid decomposition strategy that applies different low-rank factorizations according to the sensitivity of key and value projections, and 3) a low-rank-aware mixed precision quantization that leverages data statistics for near lossless low-bit quantization. Evaluated across multiple LLMs and benchmarks, STAR-KV achieves up to 75% KV cache compression and up to 20x overall KV cache reduction when combined with quantization. Enabled by custom Triton-based GPU kernels, STAR-KV delivers up to 6.9x speedup for the attention module and 3.1x end-to-end generation throughput. Our code is publicly available at: https://github.com/PriyanshBhatnagar/STAR-KV.
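To make the idea of threshold-driven rank selection concrete, here is a minimal sketch of the generic soft-thresholding operator $s \mapsto \max(s - \tau, 0)$ applied to per-head singular values. The abstract does not spell out STAR-KV's actual mechanism, so everything here is an assumption for illustration: the function names, the per-head spectra, and the single shared threshold `tau` are all hypothetical.

```python
def soft_threshold(singular_values, tau):
    """Shrink each singular value by tau and clip at zero.

    The operator is piecewise linear in tau, so a threshold could in principle be
    tuned by gradient-based training (an assumption, not a claim about STAR-KV).
    """
    return [max(s - tau, 0.0) for s in singular_values]


def retained_rank(singular_values, tau):
    """Count the singular values that survive the threshold."""
    return sum(1 for s in soft_threshold(singular_values, tau) if s > 0.0)


# Hypothetical singular values for two attention heads of width 4.
heads = {
    "head_0": [9.0, 4.0, 0.5, 0.1],  # energy concentrated in two directions
    "head_1": [3.0, 2.5, 2.0, 1.5],  # energy spread over all four directions
}

# One shared threshold yields a different rank per head.
ranks = {name: retained_rank(s, tau=1.0) for name, s in heads.items()}
full = sum(len(s) for s in heads.values())
saved = 1.0 - sum(ranks.values()) / full

print(ranks)  # {'head_0': 2, 'head_1': 4}
print(saved)  # 0.25
```

The point of the sketch is that a head with a concentrated spectrum ends up with a lower rank than a head with a spread-out spectrum without any per-head rank being fixed by hand, in contrast to the fixed or heuristic rank selection the abstract attributes to prior methods.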