arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
2605.31021 2026-06-01 cs.AI cs.CL cs.LG

A Persona-Based Evaluation Framework for Pluralistic Alignment in Generative AI

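Atahan Karagoz

Affiliations * Atahan Karagöz

AI summary Proposes a state-space constrained emulation framework that replaces singular assessment functions with synthetic cognitive profiles, enabling pluralistic, perspective-dependent benchmarking that reflects real-world consensus variability. It also analyzes the stability of the simulated evaluators and argues that dynamic regulatory mechanisms are necessary.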

Abstract

Current alignment paradigms for generative artificial intelligence rely predominantly on monolithic benchmarking frameworks that reduce the plurality of human judgment to aggregated statistical baselines, thereby obscuring cultural, demographic, and contextual variability in evaluation. We introduce a state-space constrained emulation framework for AI evaluation that replaces singular assessment functions with a structured manifold of synthetic cognitive profiles representing diverse human perspectives. We show that modern generative architectures can instantiate and maintain these evaluative personas with high consistency, enabling a form of pluralistic, perspective-dependent benchmarking that more closely reflects real-world consensus variability. However, we further analyze the stability of these simulated evaluators under sequential inference and stochastic prompt perturbations, revealing systematic degradation in persona coherence that manifests as state-space drift and semantic inconsistency. These findings suggest that static alignment constraints are insufficient for sustaining robust evaluative behavior over time. Instead, we argue for the necessity of embedding dynamic, viability-driven regulatory mechanisms within generative systems to preserve coherent cognitive emulation. By framing persona-based evaluation as a structured dynamical system over latent representation manifolds, this study provides a foundation for more adaptive, human-aligned, and context-sensitive approaches to AI evaluation.

2605.31016 2026-06-01 cs.LG

An Efficient and Scalable Graph Condensation with Structure-Preserving

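Yulin Hu, Fuyan Ou, Ye Yuan

Affiliations * Southwest University

AI summary Proposes SP-ESGC, a structure-preserving graph condensation method that decouples node condensation from graph structure generation. It condenses graphs efficiently via heat kernel feature propagation and a hybrid clustering strategy, and uses a pre-trained edge predictor to generate transferable structural patterns, improving generalization across GNN architectures while keeping computational efficiency high.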

Abstract

Graph condensation (GC) is pivotal for deploying Graph Neural Networks (GNNs) in resource-constrained scenarios by compressing large-scale graphs into compact synthetic counterparts. Existing GC methods commonly suffer from computational inefficiency due to coupled optimization and generalize poorly across GNN architectures. To address these challenges, this study proposes Efficient and Scalable Graph Condensation with Structure-Preserving (SP-ESGC), a decoupled design that separates node condensation from graph structure generation. Specifically, it first employs heat kernel feature propagation to generate node representations via spectral graph theory-inspired diffusion. Next, a novel hybrid clustering strategy extracts discriminative intra-class centroids from the node representations. Finally, a pre-trained edge predictor infers transferable structural patterns from the original graph, ensuring accurate synthetic graph generation. Extensive experiments on real-world graph datasets demonstrate that SP-ESGC achieves precise graph condensation with high computational efficiency. Moreover, SP-ESGC generalizes well across diverse GNN architectures.
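The heat kernel feature propagation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the symmetric normalized Laplacian, a dense adjacency matrix, and a truncated Taylor series for exp(-tL); the function names and the parameters `t` and `order` are hypothetical.

```python
import math

def laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} (assumed choice)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[(1.0 if i == j else 0.0) - inv_sqrt[i] * adj[i][j] * inv_sqrt[j]
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def heat_kernel_propagate(adj, feats, t=1.0, order=20):
    """Approximate exp(-t L) @ feats with a truncated Taylor series."""
    lap = laplacian(adj)
    term = [row[:] for row in feats]   # k = 0 term of the series: X
    out = [row[:] for row in feats]
    for k in range(1, order + 1):
        # term_k = (-t / k) * L @ term_{k-1}
        term = [[-t / k * v for v in row] for row in matmul(lap, term)]
        out = [[o + v for o, v in zip(ro, rt)] for ro, rt in zip(out, term)]
    return out

# Path graph 0 - 1 - 2 with a one-hot feature on node 0.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0], [0.0], [0.0]]
smoothed = heat_kernel_propagate(adj, feats, t=1.0)
```

Larger `t` diffuses the feature further from node 0; at `t = 0` the features are returned unchanged.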

2605.31013 2026-06-01 cs.LG

Physics-Informed Coarsening for Multigrid Graph Neural Surrogates

Amir Bazzi, David Cardinaux, Ramy Nemer, Jose Alaves, Arjun Kalkur Matpadi Raghavendra, Elie Hachem

Affiliations * Amir Bazzi; David Cardinaux; Ramy Nemer; José Alves; Arjun Kalkur Matpadi Raghavendra; Elie Hachem

AI summary Targeting nonlinear elasticity, plasticity, and transient behavior in solid mechanics, proposes a multigrid graph neural network with a physics-informed coarsening strategy: a residual-based local activity score retains high-strain and high-stress regions, enabling hierarchical message passing and improving long-rollout stability and accuracy.

Comments Accepted at ICML 2026. 16 pages, 5 figures

Abstract

Learning-based surrogates for partial differential equations have recently matched the accuracy of classical solvers while achieving orders-of-magnitude speedups, predominantly in fluid settings and structured geometries. In contrast, robust surrogates for deformable solids remain underexplored, despite the presence of nonlinear elasticity, plasticity, and transient behavior that challenge standard architectures. We introduce a multigrid graph neural network for solid mechanics that couples an encoder-processor-decoder backbone with a physics-informed coarsening strategy. Instead of downsampling via geometric heuristics, our method scores nodes using a residual-based measure of local physical activity and preferentially retains regions of high strain or stress concentration, allocating multiscale capacity where it is most needed. This preserves long-range interactions through hierarchical message passing while improving stability over long rollouts. We evaluate on multiple datasets covering linear, nonlinear, and transient regimes, and observe consistent gains in accuracy and rollout stability compared to standard sampling baselines. Our results highlight the importance of physics-informed coarsening for scalable surrogate modeling in solid mechanics.
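As a rough illustration of score-based coarsening, the sketch below keeps the nodes with the highest activity scores when building a coarser level. The abstract does not specify the selection rule, so deterministic top-k selection, the `keep_ratio` parameter, and the example scores are all assumptions.

```python
def coarsen_by_activity(scores, keep_ratio=0.5):
    """Return the sorted indices of nodes kept at the coarser level.

    `scores` holds one non-negative activity value per node, for example a
    residual magnitude; higher means more physically active.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, int(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

# Hypothetical per-node scores: nodes 1 and 3 sit in a stress concentration.
node_scores = [0.02, 0.90, 0.10, 0.75, 0.05, 0.01]
kept = coarsen_by_activity(node_scores, keep_ratio=0.5)  # [1, 2, 3]
```

A geometric heuristic would ignore `node_scores` entirely; here the retained set follows the physically active region.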

2605.31010 2026-06-01 cs.CL

MoG: Mixture of Experts for Graph-based Retrieval-Augmented Generation

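Zheng Yuan, Chuang Zhou, Linhao Luo, Siyu An, Di Yin, Xing Sun, Xiao Huang

Affiliations * The Hong Kong Polytechnic University; Monash University; Tencent Youtu Lab

AI summary Proposes the MoG framework, which organizes knowledge into central hub graphs and sparsely activated expert graphs and uses a topology-aware router to dynamically select relevant expert graphs, addressing the irrelevant information that a unified knowledge base introduces into retrieval-augmented generation. It achieves over 20% relative improvement on MuSiQue.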

Abstract

Retrieval-augmented generation is intensively studied to ground large language models on external evidence. However, retrieving from a unified knowledge base can introduce irrelevant information that may mislead generation for complex reasoning. Inspired by the conditional computation of mixture of experts (MoE), where a router sparsely selects specialized experts alongside shared ones for each input, we propose Mixture of experts for Graph-based Retrieval-Augmented Generation, i.e., MoG. It organizes knowledge into two core components: (i) diverse, always-accessible hub graphs that encode semantically and structurally central knowledge and provide contextual clues for expert activation, and (ii) sparsely activated expert graphs that contain domain-specific evidence. MoG first accesses hub graphs to identify general evidence and derive contextual clues. Then, a topology-aware router dynamically activates a limited set of expert graphs conditioned on the query, thereby confining retrieval to a focused evidence subspace. Extensive experiments on challenging benchmarks show that MoG consistently outperforms strong baselines, with over 20% relative improvement on MuSiQue. Our code is available at https://github.com/DEEP-PolyU/MoG.
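The MoE-style selection described above, with always-on shared components plus a sparse top-k choice, can be sketched as follows. This is an illustrative assumption rather than MoG's router: the scores are given as plain numbers, the topology-aware part is not modeled, and `route`, `hub_ids`, and `k` are hypothetical names.

```python
import math

def route(query_scores, hub_ids, k=2):
    """Select the always-on hub graphs plus the top-k expert graphs.

    `query_scores` maps expert-graph id -> router score for the current query.
    Returns the hub ids and softmax weights over the selected experts only.
    """
    top = sorted(query_scores, key=query_scores.get, reverse=True)[:k]
    # Subtract the max score before exponentiating for numerical stability.
    m = max(query_scores[e] for e in top)
    exps = {e: math.exp(query_scores[e] - m) for e in top}
    z = sum(exps.values())
    weights = {e: v / z for e, v in exps.items()}
    return list(hub_ids), weights

# Hypothetical router scores for three domain-specific expert graphs.
scores = {"medicine": 2.0, "law": 0.1, "sports": 1.0}
hubs, weights = route(scores, hub_ids=["hub0"], k=2)
```

Retrieval would then run only over `hubs` and the keys of `weights`, which is what confines it to a focused evidence subspace.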

2605.31007 2026-06-01 cs.LG cs.AI

DEM: A Distilled Explanation Model for Interpretable Anomaly Detection in Physiological Sensor Networks

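Jyotirmoy Singh, Anushka Roy, Shreea Bose, Chittaranjan Hota

Affiliations * Department of Computer Science and Information Systems; Department of Electrical and Electronics Engineering

AI summary Proposes DEM, a three-stage glass-box framework that distills the knowledge of a gradient boosting expert into a decision tree operating on residuals relative to a linear baseline, delivering anomaly detection with high accuracy and intrinsic interpretability, and introduces a distillation fidelity metric to quantify explanation trustworthiness.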

Comments 21 pages, 10 figures, 7 tables. Code: https://github.com/Jyotirmoy17/dem-model

Abstract

Anomalies in physiological sensor data from Wireless Body Area Networks (WBANs) can be caused by sensor faults, network disruptions, or missing data, leading to false alarms. Anomaly detection in this setting therefore demands both high predictive accuracy and clinically interpretable explanations. Existing approaches rely either on black-box models that achieve strong performance but offer no transparency, or on post-prediction explanation methods such as SHAP and LIME. In this paper, we propose the Distilled Explanation Model (DEM), a three-stage glass-box framework that distills the non-linear knowledge of a gradient boosting expert into an interpretable decision tree operating on residuals relative to a linear baseline, so that the explanation is not an approximation but the prediction itself. DEM introduces a novel distillation fidelity metric that quantifies how faithfully the explanation tree captures the expert model's non-linear contribution, providing a principled measure of explanation trustworthiness absent from prior interpretable models. Evaluated across four physiological datasets, including MIMIC-IV, WESAD, eICU, and an in-house SmartNet WBAN corpus, DEM achieves an AUC of 0.9964 on clinical contextual anomaly detection and 0.9047 on wearable stress detection while producing human-readable if-then rules at a controllable depth. Inference requires 0.17 ms per 1000 samples, rendering DEM 1235x faster than SHAP-based post-hoc explanation and suitable for real-time physiological monitoring. Ablation studies confirm that the XGBoost distillation step provides measurable gains over naive residual fitting, and depth-sensitivity analysis demonstrates an explicit, user-controlled accuracy-interpretability trade-off unique to DEM among existing intrinsically interpretable models.

2605.31005 2026-06-01 cs.LG

Learning Multi-Agent Coordination via Sheaf-ADMM

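Jeffrey Seely, Bartłomiej Cupiał, Llion Jones

Affiliations * University of Warsaw

AI summary Proposes a differentiable optimization framework that achieves multi-agent coordination using cellular sheaves and ADMM, validates it on maze pathfinding, image classification, and Sudoku, and shows better interpretability and robustness than standard message-passing architectures.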

Comments 17 pages, 8 figures, 6 tables. Accepted at ICML 2026

Abstract

We present a differentiable optimization framework for multi-agent coordination. An input is decomposed into overlapping local views, each processed by an agent that solves a convex subproblem parameterized by a neural encoder. Agents coordinate through the Alternating Direction Method of Multipliers (ADMM) with inter-agent constraints specified by a cellular sheaf. The sheaf specifies which aspects of neighboring solutions must agree, allowing for heterogeneous notions of global consensus. Backpropagating through the unrolled optimization jointly trains all components of the multi-agent system. We evaluate on maze pathfinding, image classification, and Sudoku, where agents with individually insufficient local views learn to coordinate to produce correct global outputs. On MNIST, the local-view decomposition yields improved robustness to distribution shifts relative to a standard CNN. On Sudoku, the optimization-derived structure yields markedly higher solve rates than parameter-matched MPNN baselines. Finally, the ADMM structure exposes distinct primal, consensus, and dual state variables, opening the coordination dynamics to direct analysis and intervention -- a property unavailable in standard message-passing architectures.
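The primal, consensus, and dual variables mentioned in the abstract can be seen in the simplest consensus instance of ADMM. The sketch below is a toy under stated assumptions, not the paper's method: each agent holds a scalar quadratic cost 0.5 * (x - a_i)^2, all agents must agree on one shared value (which corresponds to identity restriction maps rather than a general cellular sheaf), and nothing is learned.

```python
def consensus_admm(targets, rho=1.0, iters=100):
    """Scaled-form consensus ADMM for min sum_i 0.5*(x_i - a_i)^2 s.t. x_i = z.

    Returns (x, z, u): per-agent primal variables, the shared consensus
    variable, and per-agent scaled dual variables.
    """
    n = len(targets)
    x = [0.0] * n
    u = [0.0] * n
    z = 0.0
    for _ in range(iters):
        # Primal step: closed-form minimizer of f_i(x) + (rho/2)(x - z + u_i)^2.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(targets, u)]
        # Consensus step: average of the agents' proposals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each agent's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return x, z, u

# Three agents with different local preferences agree on their mean.
x, z, u = consensus_admm([1.0, 2.0, 6.0])
```

Because `x`, `z`, and `u` are explicit state, one can inspect or overwrite any of them between iterations, which is the kind of direct analysis and intervention the abstract refers to.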

2605.31001 2026-06-01 cs.CV

Iterative Framework For Data Augmentation Of Segmented Fingerprints

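João Leonardo H. D. Agnol, Wesley Augusto de Bona, Erick Oliveira Rodrigues, Luiz Fernando Puttow Southier, Jefferson Oliva, Marcelo Filipak, Dalcimar Casanova

Affiliations * Federal University of Technology (UTFPR), Pato Branco, Parana, Brazil

AI summary To address the scarcity of infant fingerprint data, proposes an iterative data augmentation method that generates diverse variants of segmented fingerprints by inducing errors in a convolutional neural network trained to extract fingerprint ridges and valleys. Experiments show that the method expands fingerprint variability while retaining visual similarity.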

Journal ref Anais do XV Workshop de Sistemas de Informação 2024

Abstract

Infant biometrics presents unique challenges due to the physiological differences between infants and adults, compounded by the scarcity of available data for research that limits the development of robust matching systems. This paper proposes a novel data augmentation method that uses iterative techniques to generate diverse variants of segmented fingerprints by inducing errors in a convolutional neural network trained to extract fingerprint ridges and valleys. Experiments on real infant fingerprints demonstrate the method's effectiveness in expanding fingerprint variability, with augmentations exhibiting significant fluctuations in minutiae counts while still retaining visual similarity to the originals. The study also highlights the method's customizable nature for applying varying levels of changes to fingerprint segmentations. Future research includes training segmentation and matching neural networks using datasets augmented by the proposed framework.

2605.30992 2026-06-01 cs.LG

Eigenvectors of Experts are Training-free Non-collapsing Routers

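Giang Do, Hung Le, Truyen Tran

Affiliations * Applied Artificial Intelligence Initiative (A2I2), Deakin University, Victoria, Australia

AI summary To address expert collapse in sparse mixture-of-experts models, proposes SSMoE, a training-free routing framework based on the eigenvectors of expert weight matrices, which uses singular value decomposition to exploit spectral properties and improve model performance.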

Comments 24 pages

Journal ref ICML 2026

Abstract

Sparse Mixture of Experts (SMoE) architectures improve the training efficiency of Large Language Models (LLMs) by routing input tokens to a selected subset of specialized experts. Despite their remarkable success, both training and inference in SMoE models suffer from the expert collapse issue (Chi et al., 2022), which degrades model performance. Prior studies primarily focus on improving the router; however, such methods rely on training from scratch or fine-tuning, which requires high computational and data-processing costs. Furthermore, we demonstrate that, despite these efforts, the issue persists when advancing well-pretrained SMoE models, as evidenced by both theoretical and empirical results. To fill that gap, we analyze the advanced SMoE models and observe that the eigenvectors of expert weight matrices encode rich semantic information, pointing to an effective alternative to conventional routing strategies. Building on this insight, we propose Singular Value Decomposition SMoE (SSMoE), a novel and training-free framework that leverages spectral properties of the expert weights to address the collapse issue and enhance model performance. Extensive experiments across diverse language and vision tasks, under both clean and corrupt data settings, demonstrate the strong generalization and robustness of SSMoE. Our findings highlight how a deeper understanding of model internals can guide the development of more effective SMoE architectures. Our implementation is publicly available at https://github.com/giangdip2410/SSMoE.

2605.30991 2026-06-01 cs.LG cs.CV

Parallel Tempering Initial Sampling in Inference-Time Reward Alignment

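Myeongjun Oh, Gwangho Kim, Sungyoon Lee

Affiliations * Department of Artificial Intelligence; Department of Computer Science

AI summary To address standard SMC methods getting trapped in local modes because of their initial sampling in inference-time reward alignment, proposes PATHS, a parallel-tempering method that couples multiple tempered chains for efficient exploration and improves alignment quality.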

Comments 31 pages, 11 figures

Abstract

Inference-time reward alignment steers pretrained diffusion and flow-based generative models to satisfy user-specified rewards without retraining. Recently, Sequential Monte Carlo (SMC) has emerged as a powerful framework for this task by iteratively filtering and propagating multiple particles. However, we show that standard SMC-based methods often suffer from poor performance because they initialize particles from a standard prior, whereas high-reward regions in complex reward landscapes are extremely rare. Further, we show that even recent reward-aware initial sampling approaches remain vulnerable to getting trapped in local modes, as complex reward landscapes are often multi-modal. To overcome these limitations, we propose PATHS (PArallel Tempering for High-complexity reward Sampling), a novel initialization method that couples multiple sampling chains through parallel tempering. PATHS maintains a ladder of reward-tempered chains and periodically performs Metropolis swaps, enabling efficient exploration across flattened reward landscapes, thereby mitigating the mode-trapping issues. Our analysis reveals that this mechanism substantially enhances the finite-budget exploration of rare, high-reward regions that are typically challenging to sample. Experiments on layout-to-image and quantity-aware generation show that PATHS achieves consistent gains in alignment quality, particularly on complex prompts.
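The Metropolis swap between neighbouring rungs of a tempering ladder can be sketched as below. The sketch assumes each chain targets a density proportional to p(x) * exp(beta * r(x)), which is the textbook reward-tempered form and may differ from what PATHS uses; the function names are hypothetical. Under that assumption the swap acceptance ratio reduces to exp((beta_i - beta_j) * (r(x_j) - r(x_i))).

```python
import math
import random

def swap_probability(beta_i, beta_j, reward_i, reward_j):
    """Metropolis acceptance probability for swapping the states of chains i and j."""
    log_ratio = (beta_i - beta_j) * (reward_j - reward_i)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)

def swap_sweep(states, rewards, betas, rng):
    """One pass of swap proposals between neighbouring rungs of the ladder."""
    for i in range(len(betas) - 1):
        p = swap_probability(betas[i], betas[i + 1], rewards[i], rewards[i + 1])
        if rng.random() < p:
            states[i], states[i + 1] = states[i + 1], states[i]
            rewards[i], rewards[i + 1] = rewards[i + 1], rewards[i]
    return states, rewards

# The high-beta chain holds a low-reward state, so the swap is always accepted.
states, rewards = swap_sweep(["a", "b"], [0.0, 2.0], [1.0, 0.5], random.Random(0))
```

Low-beta chains see a flattened reward landscape and move freely; swaps let a high-reward state they discover migrate up to the high-beta chain.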

2605.30989 2026-06-01 cs.RO

A study on a Real-Time VR-Based Teleoperation Framework for Manipulator in Dynamic Environment

InGyu Choi, GeonYeong Go, SunWoo Ahn, HyoJae Kang, Min-Sung Kang

Affiliations * Department of Robotics, Hanyang University; Department of Smart Construction Engineering, Hanyang University; Department of Interdisciplinary Robot Engineering Systems, Hanyang University; School of Smart Convergence Engineering, Hanyang University, Ansan

AI summary Proposes a VR teleoperation framework integrating GPU-accelerated inverse kinematics and trajectory optimization, achieving low-latency, collision-aware real-time manipulator control in environments with static and dynamic obstacles.

Comments This manuscript has been submitted for possible publication

Abstract

Robot teleoperation enables safe, non-contact task execution in hazardous environments where direct human access is difficult, and its application has expanded with recent VR technologies. Many VR teleoperation studies, however, have primarily served as data-collection tools for robot imitation learning, so they often do not explicitly address dynamic obstacles, workspace changes, or collision risks during operation. For real deployment aimed at operator safety, teleoperation must react to dynamic situations with low latency and remain robust to mistakes made by inexperienced operators. This paper presents a VR teleoperation framework that supports real-time manipulation while handling collisions with both static and moving obstacles. The framework integrates GPU-accelerated inverse kinematics and trajectory optimization within a VR interface to generate feasible joint commands at each control cycle under robot constraints. Experiments with a 7-DoF manipulator demonstrate stable online behavior and collision-aware motion generation across three scenarios: obstacle-free, static-obstacle, and moving-obstacle environments. The results indicate that the proposed approach generates motion consistent with the operator's command while producing safe detours when obstacles interfere with the commanded path.

2605.30987 2026-06-01 cs.CV

Benchmarking Single-Step Inpainting Methods for Multi-Object 3D Gaussian Splatting Scenes

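Finn Dröge, Cecilia Curreli, Abhishek Saroha, Daniel Cremers

Affiliations * Technical University of Munich; Munich Center for Machine Learning

AI summary For object removal and inpainting in 3D Gaussian Splatting scenes, compares 2D inpainters on 3D consistency and finds that reconstruction-based inpainters outperform generative diffusion models and that initializing the scene from scratch works better than finetuning the existing scene. Also introduces a new multi-object scene with ground-truth data.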

Comments Accepted as an extended abstract to the CVEU Workshop at CVPR 2026

Abstract

The tasks of object removal and inpainting 3D Gaussian Splatting (3DGS) scenes face challenges such as 3D consistency across camera views. In comparing 2D inpainters and their suitability for the 3D domain, we find that reconstruction-based inpainters outperform generative diffusion models in 3D consistency. Integrating these 2D inpainters into different single-step methods for creating and finetuning 3DGS scenes, our results indicate that initializing the scene from scratch produces higher quality results than finetuning the existing scene. Using a state-of-the-art generative 2D inpainter, we create a straightforward baseline to underline the importance of object removal before inpainting in the 3D setting. Since 360° datasets rarely include real-world ground truths, and challenging occlusion scenarios are equally sparse, we introduce a novel multi-object scene with recorded ground truth data and many views with object occlusions.

2605.30984 2026-06-01 cs.CV cs.AI cs.CL

Generating Reports or Repeating Templates? Measuring and Mitigating Template Collapse in 3D CT Report Generation

Tom Maye-Lasserre, Yitong Li, Bailiang Jian, Morteza Ghahremani, Benedikt Wiestler, Christian Wachinger

Affiliations * Technical University of Munich (TUM); TUM Hospital; Munich Center for Machine Learning (MCML)

AI summary To address template collapse in 3D CT report generation, where model outputs show low diversity and poor pathology detection, proposes CLarGen, a decoupled framework that separates clinical detection from language synthesis, substantially improving clinical accuracy while keeping reports fluent.

Abstract

Modern 3D medical vision-language models (VLMs) can generate fluent radiology-style text while exhibiting critically low pathology detection and output diversity, collapsing to generic templates that under-report rare yet critical findings. We identify this failure mode as Template Collapse. This failure stems from the unique constraints of 3D medical imaging, e.g., limited data, severe label imbalance, and weak signals from volumetric encoders. Under these constraints, text-generation objectives encourage shortcut learning and fluent but weakly grounded reports. We systematically diagnose Template Collapse through clinical fidelity, output diversity, normal-template bias, and rare-finding survival. To mitigate it, we propose CLarGen, a decoupled framework that separates what to say (clinical detection) from how to say it (language synthesis). CLarGen uses (i) a Latent Query Transformer for multi-label pathology detection, (ii) pathology-guided retrieval for clinically matched exemplars, and (iii) a medical language model to synthesize the final report from detected findings and retrieved context. Across state-of-the-art 3D CT report generation baselines, CLarGen mitigates Template Collapse and substantially improves clinical accuracy (macro-F1 0.487 vs. 0.189; CRG 0.472 vs. 0.368) while maintaining fluent reporting. Our results suggest that explicit, measurable clinical grounding is essential for template-collapse-resistant 3D CT report generation. Code will be released upon acceptance.
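To see why macro-F1 is sensitive to template collapse, the following sketch computes the unweighted mean of per-label F1 on a toy multi-label example. The data are invented for illustration, and scoring a label with no true or predicted positives as 0 is an assumed convention; the paper's exact evaluation protocol is not given in the abstract.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 for 0/1 label matrices (rows = reports)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label with no positives at all scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels

# Columns: [common finding, rare finding]. The template never reports the rare one.
y_true = [[1, 0], [1, 0], [1, 0], [0, 1]]
template = [[1, 0]] * 4
collapsed_score = macro_f1(y_true, template)  # (6/7 + 0) / 2 = 3/7
```

The template gets the common label mostly right yet scores only 3/7, because the missed rare label contributes an F1 of 0 with equal weight.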

2605.30983 2026-06-01 cs.CV

Can BEV Perception Gracefully Degrade under Sensor Failures?

Haifa Zhang, Yijing Wang, Haoyu Wang, Zheng Li, Zhiqiang Zuo

Affiliations * Tianjin Key Laboratory of Intelligent Unmanned Swarm Technology and System; School of Electrical and Information Engineering, Tianjin University; Key Laboratory of System Control and Information Processing, Ministry of Education of China

AI summary To address the sharp performance drop of multi-modal BEV perception under sensor corruption, proposes the Grace-BEV framework, which achieves graceful degradation through active reliability assessment and dynamic feature recalibration, restoring mAP from 0.0% to 34.7% under extreme LiDAR failure.

Abstract

Despite the remarkable success of multi-modal bird's-eye view (BEV) perception in autonomous driving, current systems exhibit a critical vulnerability: existing fusion mechanisms are highly brittle to sensor corruptions, often causing catastrophic performance degradation. This vulnerability largely stems from the fact that standard fusion frameworks typically integrate multi-modal representations in a static manner, leading to a precipitous performance collapse under missing or corrupted modalities. In contrast, we show that graceful degradation is achievable through active modality reliability assessment. To this end, we present Grace-BEV, a lightweight and plug-and-play framework that enforces active reliability awareness during multi-modal fusion. Instead of relying on computationally expensive cross-modal interactions, Grace-BEV leverages the aligned BEV space to explicitly assess modality trustworthiness via a TrustGate Router and dynamically recalibrate feature integration using the FailSafe Fusion Block. Furthermore, we devise a Three-Phase Training strategy with Modality Dropout to prevent modality dominance and encourage balanced cross-modal learning under unreliable inputs. Extensive experiments on nuScenes-R and nuScenes-C show that Grace-BEV maintains robust performance across diverse corruption settings. Notably, under catastrophic LiDAR failures where standard baselines collapse to 0.0% mean Average Precision (mAP), Grace-BEV restores performance to as high as 34.7% mAP. Moreover, it improves clean accuracy by up to 1.4%, achieving a strong trade-off between robustness and efficiency.

2605.30981 2026-06-01 cs.CL cs.LG

Cognitive Fatigue in Autoregressive Transformers: Formalization and Measurement

Riju Marwah, Ritvik Garimella, Vishal Pallagani, Atishay Jain, Michael Stewart, Amit Sheth

Affiliations * Guru Gobind Singh Indraprastha University, India; Artificial Intelligence Institute, University of South Carolina, USA; Indian Institute of Technology, Kanpur, India; Indian AI Research Organization, India

AI summary Formalizes the degradation of autoregressive language models during long-horizon generation as cognitive fatigue and proposes the Fatigue Index (FI), a lightweight diagnostic that aggregates three signals (attention decay, representational drift, and entropy miscalibration) for real-time monitoring. Experiments show that FI predicts task degradation and repetitive generation with high accuracy.

Comments 9 pages, 7 figures. Accepted at the 43rd International Conference on Machine Learning (ICML 2026)

Abstract

Autoregressive language models frequently degrade during long-horizon generation, producing repetitive text, losing instruction adherence, and exhibiting unstable entropy. Despite the prevalence of these failures, practitioners lack online diagnostics to detect them in real time. We formalize this degradation as cognitive fatigue, a measurable generation-time state characterized by decay in attention to the original prompt, representational drift, and entropy miscalibration. We introduce the Fatigue Index (FI), a lightweight, model-agnostic diagnostic that aggregates these three signals under explicit axioms (monotonicity, boundedness, interpretability), enabling reliable runtime monitoring. Across nine models (1B-13B parameters), FI trajectories exhibit structured temporal dynamics, predict task degradation (AUROC = 0.95) and repetition (Spearman rho = 0.94), and reveal non-monotonic scaling behavior: instruction-tuned models below 3B exhibit faster collapse than base models, with this trend reversing at 7B. Stress analyses further show that FI onset accelerates under longer contexts, middle-positioned evidence, and reduced numerical precision. These results establish cognitive fatigue as a coherent and measurable phenomenon, and position FI as a principled tool for runtime reliability monitoring in production LLM systems.
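One simple aggregation that satisfies monotonicity and boundedness is a weighted mean of clipped signals, sketched below. This is not the paper's FI formula, which the abstract does not give: the assumption that each signal is already normalized to [0, 1], the equal default weights, and the function name are all hypothetical.

```python
def fatigue_index(attention_decay, drift, entropy_miscal,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    """Aggregate three signals, each assumed normalized to [0, 1], into one score.

    Clipping keeps the result in [0, 1] (boundedness); non-negative weights
    make it non-decreasing in every signal (monotonicity).
    """
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    signals = (attention_decay, drift, entropy_miscal)
    clipped = [min(1.0, max(0.0, s)) for s in signals]
    return sum(w * s for w, s in zip(weights, clipped)) / sum(weights)

# Hypothetical readings early and late in a long generation.
early = fatigue_index(0.05, 0.10, 0.02)
late = fatigue_index(0.60, 0.45, 0.30)
```

Each weighted term is also a per-signal contribution to the score, which gives a basic form of interpretability when the index rises.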

2605.30972 2026-06-01 cs.CV

BiSegMamba: Efficient Bidirectional Tri-Oriented Mamba for 3D Medical Image Segmentation

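Bakht Zada, Chao Tong, Qile Su, Shuai Zhang

Affiliations * School of Computer Science and Engineering, Beihang University; State Key Laboratory of Virtual Reality Technology and Systems, Beihang University

AI summary Proposes BiSegMamba, an efficient 3D medical image segmentation network based on a bidirectional tri-oriented Mamba, which improves segmentation accuracy while lowering computational cost through a progressive compacting stem, a multi-scale spatial mixer, bidirectional tri-oriented Ortho Mamba blocks, and adaptive directional fusion.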

Comments 10 pages, 7 figures, 5 tables. Code is available at: https://github.com/bakhtzadaabshare/BiSegMamba

详情
AI中文摘要

精确的3D医学图像分割需要长程体积上下文和精细边界保持。基于CNN的方法全局依赖建模有限,而基于Transformer的模型对于密集3D输入通常计算成本高昂。最近的基于Mamba的方法提供了一种高效替代方案,但现有的体积设计仍依赖于重复的高分辨率扫描、仅前向的顺序建模和固定的方向求和,导致高成本、扫描顺序偏差和次优的方向聚合。我们提出BiSegMamba,一种用于3D医学图像分割的高效双向三向Mamba网络。BiSegMamba遵循紧凑到细节的设计,其中渐进压缩主干(PCS)能够进行高效的潜在空间推理,同时保留浅层高分辨率特征用于重建。多尺度空间混合器(MSSM)在早期阶段捕获局部解剖模式,而提出的双向三向正交Mamba(Bi-ToOM)块使用联合处理的前向和后向扫描序列,从多个正交视图建模长程依赖。自适应方向融合(ADF)学习跨扫描方向的输入相关通道权重,用方向感知融合替代固定求和。在收集的颈动脉CTA数据集和三个公共基准BraTS2023、ACDC和AMOS-CT上的实验表明,BiSegMamba在血管、心脏、脑肿瘤和腹部多器官分割任务中具有良好的泛化能力。与SegMamba-V2相比,BiSegMamba在BraTS2023上性能略有提升,在ACDC和颈动脉数据集上显著改进,同时计算成本降低高达77.9% FLOPs,展示了在通用3D医学图像分割中强大的精度-效率平衡。

英文摘要

Accurate 3D medical image segmentation requires both long-range volumetric context and fine boundary preservation. CNN-based methods have limited global dependency modeling, while Transformer-based models are often computationally expensive for dense 3D inputs. Recent Mamba-based methods provide an efficient alternative, but existing volumetric designs still depend on repeated high-resolution scanning, forward-only sequential modeling, and fixed directional summation, causing high cost, scan-order bias, and suboptimal directional aggregation. We propose BiSegMamba, an efficient bidirectional tri-oriented Mamba network for 3D medical image segmentation. BiSegMamba follows a compact-to-detail design, where a progressive compacting stem (PCS) enables efficient latent-space reasoning while retaining shallow high-resolution features for reconstruction. A multi-scale spatial mixer (MSSM) captures local anatomical patterns in early stages, and the proposed bidirectional tri-oriented Ortho Mamba (Bi-ToOM) block models long-range dependencies from multiple orthogonal views using jointly processed forward and backward scan sequences. Adaptive directional fusion (ADF) learns input-dependent channel-wise weights across scan orientations, replacing fixed summation with orientation-aware fusion. Experiments on a collected carotid CTA dataset and three public benchmarks, BraTS2023, ACDC, and AMOS-CT, show that BiSegMamba generalizes well across vascular, cardiac, brain tumor, and abdominal multi-organ segmentation tasks. Compared with SegMamba-V2, BiSegMamba achieves slightly better performance on BraTS2023 and clear improvements on ACDC and the carotid dataset, while reducing computational cost by up to 77.9% FLOPs, demonstrating a strong accuracy-efficiency balance for general 3D medical image segmentation.
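The contrast between fixed summation and adaptive directional fusion can be sketched as follows. This is a simplified illustration, not the BiSegMamba implementation: it assumes one feature value per channel and orientation, and it takes the per-channel logits as given, whereas in the network they would come from a learned, input-dependent layer. The names `softmax` and `fuse_orientations` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_orientations(features, logits):
    """Fuse per-orientation features with channel-wise softmax weights.

    features[o][c] is the feature of channel c from scan orientation o;
    logits[o][c] is the (assumed, externally supplied) score for that pair.
    Returns one fused value per channel.
    """
    n_channels = len(features[0])
    fused = []
    for c in range(n_channels):
        weights = softmax([row[c] for row in logits])
        fused.append(sum(w * row[c] for w, row in zip(weights, features)))
    return fused
```

With all logits equal, the weights are uniform and the result is the fixed sum divided by the number of orientations; unequal logits let a channel favour the orientation that carries the most useful signal.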

2605.30969 2026-06-01 cs.CV

Omni-Supervised Motion Editing: Balancing Change and Invariance through Positive-Negative Learning

全监督运动编辑:通过正负学习平衡变化与不变性

Zhenwu Shi, Jingyu Gong, Peiwei Wang, Xingzan Wang, Tianwen Qian, Wenxi Li, Yuan Fang, Jiao Xie, Lizhuang Ma, Shaohui Lin

发表机构 * Shanghai Institute of Artificial Intelligence for Education, East China Normal University, China(上海人工智能教育研究院,华东师范大学,中国) School of Computer Science and Technology, East China Normal University, China(华东师范大学计算机科学与技术学院,中国) School of Statistics, East China Normal University, China(华东师范大学统计学院,中国) The 27th Research Institute of CETC, Zhengzhou, China(中国电子科技集团第27研究所,郑州,中国) Key Laboratory of Advanced Theory and Application in Statistics and Data Science, MOE, China(教育部统计与数据科学先进理论与应用重点实验室,中国) Shanghai Key Laboratory of Computer Software Evaluating and Testing, China(上海计算机软件评测测试重点实验室,中国) School of Computer Science, Shanghai Jiao Tong University, China(上海交通大学计算机科学学院,中国)

AI总结 提出OmniME框架,通过正负学习结合回顾特征监督、运动保持机制和三元组语义对齐,平衡运动编辑中的变化与不变性,在MotionFix和STANCE Adjustment数据集上达到最优性能。

详情
AI中文摘要

基于文本的人体运动编辑旨在根据自然语言指令修改现有运动序列,同时保持原始运动的一致性。现有的基于扩散的方法通常依赖启发式相似性线索或粗糙的全局条件,导致运动失真和次优的语义对齐。关键挑战在于平衡变化(即精确编辑目标区域)和不变性(即保留未编辑部分)。为应对这一挑战,我们提出了一个全监督正负学习框架,名为OmniME。我们的方法集成了三个互补组件:(1)回顾特征监督,在Transformer层之间强制执行从粗到细的一致性;(2)运动保持机制,根据源-目标相似性关注细微变化;(3)基于三元组的语义对齐,增强文本-运动对应关系。这些组件共同形成了一个统一的监督范式,平衡变化与不变性。在MotionFix和STANCE Adjustment数据集上的大量实验表明,OmniME在编辑对齐方面达到了最先进的性能,验证了我们统一学习框架的有效性。我们的源代码和模型已发布在:https://github.com/rocket-ycyer/OmniME.git

英文摘要

Text-based human motion editing aims to modify existing motion sequences according to natural language instructions while maintaining the consistency of the original motion. Existing diffusion-based approaches often rely on heuristic similarity cues or coarse global conditioning, leading to motion distortion and suboptimal semantic alignment. The key challenge lies in balancing change (i.e., precisely editing target regions) and invariance (i.e., preserving unedited parts). To address this challenge, we propose an Omni-Supervised Positive-Negative Learning framework, named OmniME. Our method integrates three complementary components: (1) retrospective feature supervision that enforces coarse-to-fine consistency across transformer layers, (2) a motion preservation mechanism that focuses on subtle variations according to the source-target similarity, and (3) triplet-based semantic alignment that strengthens text-motion correspondence. Together, these components form a unified supervision paradigm that balances change and invariance. Extensive experiments on the MotionFix and STANCE Adjustment datasets demonstrate that OmniME achieves state-of-the-art performance in editing alignment, validating the effectiveness of our unified learning framework. Our source code and models have been released at: https://github.com/rocket-ycyer/OmniME.git

2605.30968 2026-06-01 cs.CV cs.AI

Variational Adapter for Cross-modal Similarity Representation

变分适配器用于跨模态相似性表示

WenZhang Wei, Zhipeng Gui, Dehua Peng, Tiandi Ye, Huayi Wu

发表机构 * School of Remote Sensing and Information Engineering(遥感与信息工程学院) Wuhan University(武汉大学) School of Data Science and Engineering(数据科学与工程学院) East China Normal University(华东师范大学) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing(测绘遥感信息工程国家重点实验室)

AI总结 针对跨模态匹配中细粒度标注稀缺导致二元分类边界压缩和假负样本问题,提出变分适配器VACSR,将匹配任务重构为变分推断问题,通过构建潜在相似性空间和正则化缓解过拟合,在图像-文本检索、域泛化和基类到新类泛化任务上验证了有效性。

Comments Accepted by the 43rd International Conference on Machine Learning (ICML 2026)

详情
AI中文摘要

视觉-语言模型的核心在于在统一表示空间中度量跨模态相似性。然而,大多数图像-文本匹配或多类图像分类数据集缺乏细粒度的跨模态匹配标注,迫使连续的相似性空间压缩为二元分类边界。这种压缩引入了假负样本,并严重损害了跨模态任务的泛化性能。尽管先前的研究试图通过建模模态内模糊性来缓解这一问题,但往往忽略了固有的标注缺陷,导致不确定性分配次优。为了解决这些挑战,我们提出了一种变分适配器用于跨模态相似性表示(VACSR)。该方法将具有细粒度语义稀缺性的图像-文本匹配重新表述为变分推断问题。它构建了一个跨模态相似性的潜在空间,并使用正则化技术来减轻对二元标注的过拟合。在图像-文本检索、域泛化和基类到新类泛化上的实验证明了所提出方法的有效性和鲁棒的泛化能力。

英文摘要

The core of vision-language models lies in measuring cross-modal similarity within a unified representation space. However, most image-text matching or multi-class image classification datasets lack fine-grained cross-modal matching annotations, forcing the continuous similarity space into binary classification boundaries. This compression induces false negative samples and significantly impairs the generalization performance of cross-modal tasks. While prior research has attempted to mitigate this by modeling intra-modal ambiguity, it often overlooks inherent annotation flaws, leading to suboptimal uncertainty allocation. To address these challenges, we propose a Variational Adapter for Cross-modal Similarity Representation (VACSR). This approach reformulates image-text matching with fine-grained semantic scarcity as a variational inference problem. It constructs a latent space for cross-modal similarity and uses regularization techniques to mitigate overfitting to binary annotations. Experiments on image-text retrieval, domain generalization, and base-to-novel generalization demonstrate the proposed method's effectiveness and robust generalization ability.

2605.30961 2026-06-01 cs.CL

EvoGens: A Population-Based Heuristic Search Framework for Scientific Idea Generation

EvoGens:一种基于种群的启发式搜索框架用于科学思想生成

Xu Li, Hanzhe Tu, Xinyi Li, Kuncheng Zhao, Xun Han, Zhonghui Liu

发表机构 * Southwest Petroleum University(西南石油大学) Sichuan Police College(四川警察学院)

AI总结 针对现有LLM方法生成科学思想时语义趋同、多样性和新颖性不足的问题,提出EvoGens框架,通过进化搜索(变异、交叉、选择)增强思想探索,显著提升新颖性和多样性。

Comments 21 pages, 6 figures

详情
AI中文摘要

生成新颖的研究思想是科学进步的基础。虽然大型语言模型(LLM)在辅助这一过程中显示出潜力,但现有方法常表现出语义趋同,导致多样性和新颖性有限。为解决这一问题,我们引入了EvoGens,一个受进化启发的框架,将科学思想生成重新构想为对思想种群的进化搜索。EvoGens迭代地应用基于排名的变异与差异化检索规划以融入外部知识,以及语义感知的交叉以融合互补概念进行概念重组。一个轻量级的评估信号指导选择过程,鼓励持续探索同时缓解过早收敛。大量实验表明,与最先进的基线相比,EvoGens显著增强了探索能力。具体而言,在当前的自动评估协议下,它将新颖性从0.1提升到0.4,多样性从0.24提升到0.55,同时保持了可比的思想质量。这些发现表明,进化机制可以作为面向探索的研究构思的有用框架,特别是在共享自动评估设置下拓宽候选思想的新颖性和多样性。

英文摘要

Generating novel research ideas is fundamental to scientific progress. While Large Language Models (LLMs) show promise in assisting this process, existing approaches often exhibit semantic convergence, resulting in limited diversity and novelty. To address this, we introduce EvoGens, an evolution-inspired framework that recasts scientific idea generation as an evolutionary search over a population of ideas. EvoGens iteratively applies rank-based mutation with differentiated retrieval planning to incorporate external knowledge, and semantic-aware crossover to fuse complementary concepts for conceptual reorganization. A lightweight evaluation signal guides the selection process, encouraging sustained exploration while mitigating premature convergence. Extensive experiments demonstrate that EvoGens substantially enhances exploration capabilities compared to state-of-the-art baselines. Specifically, it improves the Novelty from 0.1 to 0.4 and the Diversity from 0.24 to 0.55, while maintaining comparable idea quality under the current automatic evaluation protocol. These findings suggest that evolutionary mechanisms can serve as a useful framework for exploration-oriented research ideation, especially for broadening the novelty and diversity of candidate ideas under a shared automatic evaluation setting.

2605.30960 2026-06-01 cs.LG

Revisiting Zeroth-Order Hessian Approximation: A Single-Step Policy Optimization Lens

重新审视零阶Hessian近似:单步策略优化视角

Junbin Qiu, Zhaowei Hong, Renzhe Xu, Yao Shu

发表机构 * Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) Shanghai University of Finance and Economics(上海财经大学)

AI总结 本文通过单步策略优化视角统一零阶Hessian估计,提出方差缩减的ZoVH框架,实现全Hessian矩阵、正则化逆及偏差校正逆Hessian-梯度积的高效估计。

详情
AI中文摘要

精确的零阶Hessian估计是无导数方法的基石,对于双层优化、贝叶斯推断和不确定性量化等任务至关重要。然而,在高维设置中获取完整的低方差Hessian及其逆估计器仍然是一个重大挑战。为了解决这一问题,我们提出了一个统一框架,通过单步策略优化的视角重新解释零阶Hessian近似。该视角建立了通用零阶Hessian估计器与平滑策略优化目标Hessian之间的理论等价性,将不同的经典随机估计器统一为基线选择的特定实例。在此基础上,我们引入了ZoVH,一个针对全Hessian矩阵、其正则化逆以及偏差校正的逆Hessian-梯度积的方差缩减估计器套件。ZoVH利用两种关键技术:(1) 推导出的唯一最优基线,可证明最小化方差;(2) 一种查询重用策略,结合历史函数查询以提高样本效率而不增加成本。我们严格的理论分析证实了Hessian估计器的无偏性,验证了基线的方差最优性,提供了整个ZoVH套件的误差界,并为由此产生的曲率感知零阶算法建立了收敛保证。广泛的实证结果验证了我们的理论发现,表明ZoVH在实际应用中实现了卓越的估计精度和收敛性能。代码可在 https://github.com/Qjbtiger/ZoVH 获取。

英文摘要

Accurate Zeroth-Order (ZO) Hessian estimation is a cornerstone of derivative-free methods, essential for tasks such as bilevel optimization, Bayesian inference, and uncertainty quantification. However, obtaining a complete suite of low-variance estimators for the Hessian and its inverse in high-dimensional settings remains a significant challenge. To address this, we propose a unified framework that reinterprets ZO Hessian approximation through the lens of single-step Policy Optimization (PO). This perspective establishes a theoretical equivalence between general ZO Hessian estimators and the Hessian of a smoothed PO objective, unifying distinct classical randomized estimators as specific instances of baseline selection. Building on this foundation, we introduce ZoVH, a comprehensive suite of variance-reduced estimators for the full Hessian matrix, its regularized inverse, and the bias-corrected inverse Hessian-gradient product. ZoVH leverages two key techniques: (1) a unique optimal baseline derived to provably minimize variance, and (2) a query reuse strategy that incorporates historical function queries to enhance sample efficiency without inflating costs. Our rigorous theoretical analysis confirms the unbiasedness of the Hessian estimator, validates the variance optimality of our baseline, provides error bounds for the entire ZoVH suite, and establishes convergence guarantees for the resulting curvature-aware ZO algorithm. Extensive empirical results validate our theoretical findings, demonstrating that ZoVH achieves superior estimation accuracy and convergence performance in real-world applications. Code is available at https://github.com/Qjbtiger/ZoVH
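The role of a baseline in a zeroth-order Hessian estimator can be illustrated with the textbook Gaussian-smoothing (Stein-identity) estimator. This is a sketch under stated assumptions, not ZoVH: the paper's optimal baseline and query-reuse strategy are not reproduced, and the function name `zo_hessian`, the antithetic pairing, and the test quadratic are choices made here for illustration. Because E[u uᵀ − I] = 0 for u ~ N(0, I), subtracting any constant baseline b leaves the expectation unchanged and only affects the variance.

```python
import random


def zo_hessian(f, x, mu=0.1, n_samples=100_000, baseline=None, seed=0):
    """Monte Carlo Gaussian-smoothing Hessian estimate with a baseline.

    Averages 0.5 * [(f(x + mu*u) - b) + (f(x - mu*u) - b)] * (u u^T - I) / mu^2
    over u ~ N(0, I), using only function evaluations. If no baseline is
    given, b = f(x) is used (a common choice, assumed here).
    """
    rng = random.Random(seed)
    d = len(x)
    b = f(x) if baseline is None else baseline
    acc = [[0.0] * d for _ in range(d)]
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        f_plus = f([xi + mu * ui for xi, ui in zip(x, u)])
        f_minus = f([xi - mu * ui for xi, ui in zip(x, u)])
        scale = 0.5 * ((f_plus - b) + (f_minus - b)) / (mu * mu)
        for i in range(d):
            for j in range(d):
                acc[i][j] += scale * (u[i] * u[j] - (1.0 if i == j else 0.0))
    return [[v / n_samples for v in row] for row in acc]


# Test problem (hypothetical): a quadratic whose exact Hessian is A.
A = [[2.0, 0.5], [0.5, 1.0]]


def quadratic(x):
    """f(x) = 0.5 * x^T A x."""
    return 0.5 * sum(A[i][j] * x[i] * x[j] for i in range(2) for j in range(2))


H = zo_hessian(quadratic, [1.0, -1.0])  # entries should be close to A
```

Passing `baseline=0.0` instead keeps the estimator unbiased but adds a zero-mean term proportional to f(x)/mu², which is the kind of variance that a well-chosen baseline is meant to remove.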

2605.30942 2026-06-01 cs.CV

PRISM: Progressive Reasoning through Iterative Slot Memory for Vision

PRISM: 通过迭代槽记忆进行渐进推理的视觉架构

Ziyu Wang, Shuangpeng Han, Mengmi Zhang

发表机构 * Deep NeuroCognition Lab, Nanyang Technological University, Singapore(深度神经认知实验室,南洋理工大学,新加坡)

AI总结 提出PRISM架构,通过迭代槽记忆进行渐进推理,在图像分类、目标检测和语义分割等任务上取得竞争性能,并在遮挡等不完整观测下展现出更强的鲁棒性。

详情
AI中文摘要

现代视觉模型通过单次前馈传递处理图像,这限制了它们在观测不完整时恢复缺失证据或细化不确定表示的能力。受人类感知迭代性质的启发,我们引入了PRISM(通过迭代槽记忆进行渐进推理),这是一种通过迭代细化对图像进行推理的金字塔视觉架构。在高层次上,PRISM将视觉特征分组为以对象为中心的表示,从学习到的记忆中检索相关模式,并迭代细化表示以解决歧义和恢复缺失信息。这种组织-回忆-细化过程在多个尺度上循环运行,实现了视觉表示的渐进改进。在包括图像分类、目标检测和语义分割在内的标准视觉任务中,PRISM取得了竞争性能,同时在遮挡等不完整观测下展现出更强的鲁棒性。这些结果表明,使用结构化表示和记忆进行迭代推理是构建更具弹性和适应性的视觉模型的一个有前景的方向。源代码和模型将发布。

英文摘要

Modern vision models process images in a single feed-forward pass, which limits their ability to recover missing evidence or refine uncertain representations under incomplete observations. Inspired by the iterative nature of human perception, we introduce PRISM (Progressive Reasoning through Iterative Slot Memory), a pyramid vision architecture that reasons over images through iterative refinement. At a high level, PRISM groups visual features into object-centric representations, retrieves relevant patterns from a learned memory, and iteratively refines the representation to resolve ambiguity and recover missing information. This organize-recall-refine process operates recurrently across multiple scales, enabling progressive improvement of visual representations. Across standard vision tasks, including image classification, object detection, and semantic segmentation, PRISM achieves competitive performance while demonstrating improved robustness under incomplete observations such as occlusion. These results suggest that iterative reasoning with structured representations and memory is a promising direction for building more resilient and adaptive vision models. Source code and models will be released.

2605.30939 2026-06-01 cs.CV

IAF-Net: Illumination-Adaptive Fusion for Low-Light Urban Road Segmentation

IAF-Net:用于低光照城市道路分割的照明自适应融合网络

Bingtao Wang, Daojie Peng, Fulong Ma, Jun Ma, Liang Zhang

发表机构 * Shandong University(山东大学) The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州))

AI总结 提出IAF-Net,通过照明自适应融合模块动态调整RGB与几何特征的融合权重,并利用亮度调制注意力解码器增强低光照特征选择,实现不同光照条件下鲁棒的道路分割。

详情
AI中文摘要

语义道路分割对于自动驾驶至关重要,但现有方法在低光照条件下性能严重下降。许多现有的多模态融合方法没有显式适应模态可靠性的光照依赖性变化,这可能在夜间将退化的RGB特征传播到融合表示中。我们提出IAF-Net(照明自适应融合网络),一种端到端框架,具有照明自适应融合功能,可在不同光照条件下实现鲁棒的道路分割。它通过核心的照明自适应融合(IAF)模块动态调整RGB和几何特征的融合权重,并使用亮度调制注意力解码器增强低光照特征选择。我们还构建了两个专用数据集:nuScenes夜间道路分割(nuScenes-NRS)和CARLA多天气道路分割(CARLA-MWRS)。在nuScenes-NRS上的实验显示,在比较方法中整体性能达到最先进水平,而CARLA-MWRS进一步验证了在恶劣天气条件下的鲁棒性。在40%训练子集上的消融研究进一步强调了IAF模块的重要性,该模块在MaxF中提供了最大的个体增益0.70%。

英文摘要

Semantic road segmentation is important for autonomous driving, but existing methods suffer severe performance degradation under low-light conditions. Many existing multi-modal fusion methods do not explicitly adapt to illumination-dependent changes in modality reliability, which can propagate degraded RGB features into the fused representation at night. We propose IAF-Net (Illumination-Adaptive Fusion Network), an end-to-end framework with illumination-adaptive fusion for robust road segmentation across different lighting conditions. It dynamically adjusts fusion weights of RGB and geometric features via the core Illumination-Adaptive Fusion (IAF) module, and enhances low-light feature selection with a brightness-modulated attention decoder. We also construct two dedicated datasets: nuScenes Nighttime Road Segmentation (nuScenes-NRS) and CARLA Multi-Weather Road Segmentation (CARLA-MWRS). Experiments on nuScenes-NRS show state-of-the-art overall performance among the compared methods, while CARLA-MWRS further validates robustness across adverse weather conditions. Ablation studies on a 40% training subset further highlight the importance of the IAF module, which provides the largest individual gain of 0.70% in MaxF.

2605.30936 2026-06-01 cs.LG math.OC stat.ML

Local linear convergence of gradient methods for overparameterized Gaussian mixtures

过参数化高斯混合模型梯度方法的局部线性收敛性

Jingxing Wang, Vasileios Charisopoulos, Maryam Fazel

发表机构 * Electrical & Computer Engineering, University of Washington(华盛顿大学电气与计算机工程系) National Institute for Theory and Mathematics in Biology(生物理论与数学国家研究所) Amazon, Inc.(亚马逊公司)

AI总结 针对过参数化高斯混合模型,提出一种交替使用短梯度步和长Polyak步的方法,实现局部线性收敛速率,克服了过参数化导致的慢收敛问题。

Comments 45 pages, 7 figures

详情
AI中文摘要

我们研究了过参数化下学习高斯混合模型的问题。先前的工作表明,虽然过参数化对于避免虚假局部最优和通过梯度EM算法实现全局恢复真实模型至关重要,但它会显著减慢局部收敛速度。在混合权重的某些假设下,我们证明了统计学习过程最小化的标准散度度量具有一个缓慢增长的流形,在该流形上著名的Polyak步长可以几何级地减少损失,并设计了一种基于梯度的方法,该方法以局部线性速率收敛到极小值点。此外,我们表明,对于具有任意权重的混合模型,我们的方法收敛到接近最优的解——直到一个自然的误设阈值。在高层次上,该方法在接近流形的几个“短”梯度下降步和收缩到极小值点距离的“长”Polyak步之间交替。我们的结果表明,慢收敛不是过参数化的内在挑战,而是可以通过利用损失景观的有利结构来克服。

英文摘要

We study the problem of learning Gaussian mixture models under overparameterization. Prior work has shown that while overparameterization is essential for avoiding spurious local optima and enables global recovery of the ground-truth model using the gradient-EM (expectation-maximization) algorithm, it can dramatically slow down the local rate of convergence. Under certain assumptions on the mixture weights, we show that a standard divergence measure minimized by statistical learning procedures possesses a manifold of slow growth on which the well-known Polyak stepsize reduces the loss geometrically, and design a gradient-based method that converges to minimizers at a locally linear rate. Additionally, we show that our method converges to nearly optimal solutions -- up to a natural misspecification threshold -- for mixtures with arbitrary weights. At a high level, the method alternates between several "short" gradient descent steps that approach the manifold and "long" Polyak steps that contract the distance to minimizers. Our results suggest that slow convergence is not an intrinsic challenge of overparameterization, but can be overcome by exploiting the favorable structure of the loss landscape.
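The difference between fixed-stepsize gradient steps and Polyak steps on a slow-growth loss can be seen on a toy problem. The sketch below uses f(x) = ‖x‖⁴ with known minimum value f* = 0 as a stand-in; it is not the Gaussian-mixture objective from the paper, and it runs the two step types separately rather than alternating them. The Polyak stepsize is the standard (f(x) − f*) / ‖∇f(x)‖². All function names are illustrative.

```python
import math


def loss(x):
    """Toy slow-growth loss f(x) = ||x||^4, minimised at 0 with f* = 0."""
    return sum(v * v for v in x) ** 2


def grad(x):
    """Gradient of ||x||^4, namely 4 * ||x||^2 * x."""
    sq = sum(v * v for v in x)
    return [4.0 * sq * v for v in x]


def gradient_step(x, eta=0.01):
    """One fixed-stepsize ("short") gradient descent step."""
    return [v - eta * g for v, g in zip(x, grad(x))]


def polyak_step(x, f_star=0.0):
    """One Polyak ("long") step with stepsize (f(x) - f*) / ||grad f(x)||^2."""
    g = grad(x)
    g_sq = sum(v * v for v in g)
    if g_sq == 0.0:
        return list(x)  # already stationary
    step = (loss(x) - f_star) / g_sq
    return [v - step * gi for v, gi in zip(x, g)]


def norm(x):
    """Euclidean norm."""
    return math.sqrt(sum(v * v for v in x))


x_gd = x_polyak = [0.6, -0.8]  # unit-norm starting point
for _ in range(50):
    x_gd = gradient_step(x_gd)
    x_polyak = polyak_step(x_polyak)
```

On this loss each Polyak step maps x to 0.75·x, so the distance to the minimiser shrinks geometrically, while the fixed-stepsize iterate slows down as the gradient flattens near the minimiser.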

2605.30934 2026-06-01 cs.CL cs.AI

Do Large Language Models Encode Institutional Experience? Evidence from Cross-Linguistic Moral Reasoning Under Ambiguity

大型语言模型是否编码了制度经验?来自跨语言模糊道德推理的证据

Nattavudh Powdthavee

发表机构 * Nanyang Technological University(南洋理工大学)

AI总结 通过跨语言道德困境实验,研究大型语言模型在模糊情境下是否通过语言编码制度经验,发现隐含制度线索会放大跨语言道德分歧,而明确框架则抑制这种差异。

Comments 44 pages

详情
AI中文摘要

大型语言模型(LLMs)在不同语言中表现出系统性的道德推理差异,但这种差异的来源尚不清楚。我们检验了一个假设:语言编码了其使用时所处制度环境的某些方面,使得LLMs通过训练继承了特定制度的道德先验。我们在覆盖较宽制度质量梯度的九种语言、六个前沿LLM和两项预注册研究中,考察了其可接受性取决于制度运行状况的道德困境。在研究1中,明确的制度框架一致地得到了零效应结果:跨语言道德分歧在制度依赖场景中没有增加,也未随语言社区之间的制度差异而变化。在研究2中,我们引入了制度模糊场景,其中制度层面的利害关系存在但未明确说明。在这些条件下,跨语言道德分歧相对于制度无关的对照场景有所增加,并且除一个具有理论启示意义的例外之外,与语言社区之间的现实世界制度差异相关。明确的框架再次减弱了这些效应。这些发现表明,制度经验可能在语言中留下可检测的痕迹,塑造LLM的道德推理,同时也表明明确的制度线索可以抑制这些差异的表达。

英文摘要

Large language models (LLMs) exhibit systematic differences in moral reasoning across languages, yet the source of this variation remains unclear. We test the hypothesis that languages encode aspects of the institutional environments in which they are spoken, allowing LLMs to inherit institution-specific moral priors through training. Across nine languages spanning a broad gradient of institutional quality, six frontier LLMs, and two preregistered studies, we examine moral dilemmas whose acceptability depends on institutional functioning. In Study 1, explicit institutional framing produced uniformly null results: cross-linguistic moral divergence did not increase in institutionally contingent scenarios, nor did it track institutional differences between language communities. In Study 2, we introduced institutionally ambiguous scenarios in which institutional stakes were present but not explicitly stated. Under these conditions, cross-linguistic moral divergence increased relative to institutionally inert controls and, with one theoretically informative exception, was associated with real-world institutional differences between language communities. Explicit framing again attenuated these effects. These findings suggest that institutional experience may leave detectable traces in language that shape LLM moral reasoning, while also indicating that explicit institutional cues can suppress the expression of those differences.

2605.30928 2026-06-01 cs.RO

Enhancing Human-Likeness in Reinforcement Learning Agents via Hierarchical Macro Action Quantization

通过分层宏动作量化增强强化学习智能体的人类相似性

Usman Nizamani, M. Shaheer Luqman, Fawad Javed Fateh, Ali Shah Ali, Murad Popattia, M. Zeeshan Zia, Quoc-Huy Tran

发表机构 * Retrocausal, Inc.(Retrocausal公司)

AI总结 提出一种分层宏动作量化框架(HiMAQ),通过两级向量量化将人类演示编码为宏动作,使强化学习智能体在保持高回报的同时生成更接近人类的行为序列,在D4RL基准上优于非分层基线并兼容多种RL算法。

详情
AI中文摘要

人类化智能体是人工智能的长期目标。尽管性能强劲,大多数强化学习(RL)智能体仍以奖励驱动,且常表现出与人类不同的行为,限制了可解释性和可靠性。在这项工作中,我们引入了一种新颖的人类化RL框架,该框架在最大化奖励的同时预测与人类行为紧密对齐的动作序列。具体来说,我们使用一种分层宏动作量化方法(称为HiMAQ)将人类演示编码为宏动作,该方法包含两个连续的向量量化层级。低层量化将输入动作映射到细粒度的子动作簇,而高层量化将这些子动作簇聚合成动作簇。在D4RL基准上的广泛评估表明,我们的分层方法优于非分层基线(MAQ),在保持与先前RL智能体相当或更高成功率的同时,获得了更好的人类相似性分数。这些改进泛化到与各种RL算法(即IQL、SAC和RLPD)的集成中。

英文摘要

Human-like agents are a long-standing goal of artificial intelligence. Despite strong performance, most reinforcement learning (RL) agents remain reward-driven and often exhibit behaviors that differ from humans, limiting interpretability and reliability. In this work, we introduce a novel human-like RL framework that predicts action sequences closely aligned with human behaviors while maximizing rewards. Specifically, we encode human demonstrations into macro actions using a hierarchical macro action quantization approach (termed HiMAQ) consisting of two successive levels of vector quantization. The lower quantization level maps input actions to fine-grained subaction clusters, while the higher quantization level aggregates these subaction clusters into action clusters. Extensive evaluations on the D4RL benchmarks show that our hierarchical approach outperforms the non-hierarchical baseline (MAQ), achieving better human-likeness scores while maintaining comparable or better success rates than previous RL agents. The improvements generalize across integrations with various RL algorithms, namely IQL, SAC, and RLPD.
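The two successive quantization levels can be sketched with plain nearest-neighbour lookups. This is a simplified illustration rather than HiMAQ itself: it quantizes single 2-D actions instead of macro-action sequences, and both codebooks are hard-coded hypothetical values, whereas in the paper they would be learned from human demonstrations.

```python
def nearest(codebook, vector):
    """Index of the codeword closest to `vector` in squared Euclidean distance."""
    def sq_dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


# Hypothetical codebooks for 2-D actions; real ones would be learned.
SUBACTION_CODEBOOK = [(0.0, 0.1), (0.1, 0.0), (1.0, 0.9), (0.9, 1.0)]
ACTION_CODEBOOK = [(0.0, 0.0), (1.0, 1.0)]


def quantize(action):
    """Return (subaction cluster id, action cluster id) for one raw action.

    Lower level: map the raw action to its nearest fine-grained subaction.
    Higher level: map that subaction codeword to its nearest action cluster.
    """
    sub_id = nearest(SUBACTION_CODEBOOK, action)
    act_id = nearest(ACTION_CODEBOOK, SUBACTION_CODEBOOK[sub_id])
    return sub_id, act_id
```

For example, `quantize((0.95, 0.8))` selects subaction 2 and then action cluster 1, so several nearby subactions share one coarse action label.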

2605.30925 2026-06-01 cs.CV cs.GR

MultiAct: Text-to-Motion Generation from Composite Text via Tailored Attention Guidance

MultiAct: 通过定制注意力引导从复合文本生成动作

Nathan Sala, Ofir Abramovich, Ariel Shamir, Daniel Cohen-Or, Andreas Aristidou, Sigal Raab

发表机构 * Tel Aviv University(特拉维夫大学) Reichman University(雷赫曼大学) CYENS Centre of Excellence(CYENS卓越中心) University of Cyprus(塞浦路斯大学)

AI总结 提出MultiAct,一种无需重新训练或修改架构的推理时框架,通过自适应增强未充分表示提示组件的交叉注意力分数,解决复合文本到动作生成中语义覆盖不全的问题。

Comments Accepted to SIGGRAPH 2026 conference. Project page: https://natsala13.github.io/multiact.github.io

详情
AI中文摘要

近年来,文本到动作生成发展迅速,为动画和人机交互提供了富有表现力的界面。然而,当前模型在处理描述同时发生的多个动作的提示时仍然脆弱。模型常常优先考虑单个主导动作而忽略其余部分,导致动作不完整或模糊,而不是实现复合描述的所有组成部分。我们提出MultiAct,一种无需配对、推理时的组合文本到动作合成框架,可直接作用于预训练的动作生成器,无需重新训练或架构修改。我们的方法通过自适应增强与未充分表示提示组件相关的交叉注意力分数来对抗语义崩溃。我们注意到有效调制取决于提示特定的选择,例如要定位的令牌和层,并引入一个轻量级辅助决策方案,以确定最有效的注意力增强参数化。广泛的定量和定性评估表明,MultiAct在复合提示上持续优于现有基线,在保持动作真实感的同时实现了改进的语义覆盖。项目页面:https://natsala13.github.io/multiact.github.io。

英文摘要

Text-to-motion generation has progressed rapidly in recent years, offering an expressive interface for animation and human-computer interaction. However, current models remain brittle when handling prompts that describe multiple actions occurring at the same time. Rather than realizing all components of a composite description, models frequently prioritize a single dominant action and neglect the rest, leading to incomplete or ambiguous motion. We present MultiAct, an unpaired, inference-time framework for compositional text-to-motion synthesis that operates directly on pretrained motion generators without retraining or architectural modification. Our method counteracts semantic collapse by adaptively amplifying cross-attention scores associated with underrepresented prompt components. We note that effective modulation depends on prompt-specific choices, such as which tokens and layers to target, and introduce a lightweight auxiliary decision scheme that determines the most effective attention-strengthening parametrization. Extensive quantitative and qualitative evaluations demonstrate that MultiAct consistently outperforms existing baselines on composite prompts, achieving improved semantic coverage while preserving motion realism. Project page: https://natsala13.github.io/multiact.github.io.

2605.30924 2026-06-01 cs.CL

EMBGuard: Constructing Hazard-Aware Guardrails for Safe Planning in Embodied Agents

EMBGuard:为具身智能体安全规划构建危险感知护栏

Dongwook Choi, Taeyoon Kwon, Bogyung Jeong, Minju Kim, Yeonjun Hwang, Hyojun Kim, Byungchul Kim, Young Kyun Jang, Jinyoung Yeo

发表机构 * Independent Researcher(独立研究者) Department of Biomedical Engineering(生物医学工程系) Department of Intelligent Precision Healthcare Convergence, Sungkyunkwan University(智能精准医疗融合系,成均馆大学) Department of Artificial Intelligence, Yonsei University(人工智能系,延世大学)

AI总结 提出首个基于MLLM的具身安全护栏EMBGuard,通过解耦物理风险推理与智能体策略,评估(视觉观察,动作)对来识别危险配置并提供自然语言解释,同时构建训练数据集EMBHazard和基准测试EMBGuardTest,在紧凑模型尺寸下达到与专有MLLM竞争的性能并降低误报率。

Comments Accepted at ICML 2026

详情
AI中文摘要

部署在真实环境中的MLLM驱动的具身智能体会遇到物理危险。然而,现有方法缺乏识别危险和推理动作条件风险的内在机制,导致智能体要么错过危险交互,要么过度识别风险。为解决此问题,我们提出EMBGuard,这是首个基于MLLM的具身智能体安全护栏,旨在将物理风险推理与智能体策略解耦。通过评估(视觉观察,动作)对,EMBGuard识别危险配置并提供潜在风险的自然语言解释。伴随EMBGuard,我们贡献了EMBHazard,一个包含15.1K个动作条件对的训练数据集,以及EMBGuardTest,一个包含329个手动策划的真实世界场景的基准测试,涵盖七种物理风险类别。通过危险和动作的组合变化,我们生成了智能体在规划过程中可能遇到的各种危险和良性场景。尽管模型尺寸紧凑(2B,4B),EMBGuard达到了与专有MLLM(例如GPT-5.1,Gemini-2.5-Pro)竞争的性能,同时显著降低了阻碍实时部署的误报率。我们在https://github.com/dongwxxkchoi/EMBGuard公开了代码、数据和模型。

英文摘要

MLLM-powered embodied agents deployed in real-world environments encounter physical hazards. However, existing approaches lack explicit mechanisms for identifying hazards and reasoning about action-conditioned risks, leading agents to either miss risky interactions or over-identify risks. To address this, we propose EMBGuard, the first MLLM-based safety guardrail for embodied agents designed to decouple physical risk reasoning from agent policy. By evaluating a (visual observation, action) pair, EMBGuard identifies hazardous configurations and provides natural language explanations of potential risks. Alongside EMBGuard, we contribute EMBHazard, a training dataset of 15.1K action-conditioned pairs, and EMBGuardTest, a benchmark of 329 manually curated real-world scenarios spanning seven physical risk categories. Through compositional variation of hazards and actions, we generate diverse risky and benign scenarios that agents may encounter during planning. Despite its compact size (2B, 4B), EMBGuard achieves performance competitive with proprietary MLLMs (e.g., GPT-5.1, Gemini-2.5-Pro) while significantly reducing the false-positive rates that hinder real-time deployment. We make the code, data, and models publicly available at https://github.com/dongwxxkchoi/EMBGuard

2605.30919 2026-06-01 cs.LG cs.AI

De-attribute to Forget for LLM Unlearning

Xinyang Lu, Jiabao Pan, Rachael Hwee Ling Sim, See-Kiong Ng, Anthony Kum Hoe Tung, Bryan Kian Hsiang Low

发表机构 * Department of Computer Science, National University of Singapore(新加坡国立大学计算机科学系)

AI总结 本文提出基于数据归因奖励的LLM遗忘框架DareU,通过强化学习降低生成响应与遗忘数据的归因分数,实现有效遗忘并平衡模型效用。

详情
AI中文摘要

大型语言模型(LLM)的快速发展引发了对使用不当数据进行训练的担忧,这导致了对LLM遗忘研究的兴趣日益增长。许多现有的LLM遗忘方法依赖于优化预测损失,例如最大化遗忘集上的损失,但常常面临过度遗忘和模型效用差等关键问题。为了解决这些问题,本文创新地将LLM遗忘的优化目标定义为归零数据归因。具体而言,我们提出了第一个基于数据归因奖励的LLM遗忘框架,称为DareU,该框架通过强化学习来更新LLM,通过降低其生成响应与遗忘数据所有者的归因分数(即去归因)来实现遗忘。使用LLM分类器作为归因的有效近似进行的实证评估表明,DareU在实现有效遗忘的同时,很好地平衡了遗忘质量和模型效用,优于现有基线。

英文摘要

The rapid development of large language models (LLMs) has raised concerns about the use of inappropriate data for training, which has led to growing interest in LLM unlearning. Many existing LLM unlearning approaches rely on optimizing prediction losses, such as maximizing the loss on the forget set, but often face critical issues like over-forgetting and poor model utility. To address these issues, this paper instead frames the optimization objective for LLM unlearning as one of zeroing out data attribution. In particular, we propose the first LLM unlearning framework based on data attribution rewards, called DareU, which performs reinforcement learning to update the LLM by reducing the attribution score of its generated responses to the forget data owners (i.e., de-attributing). Empirical evaluation using an LLM classifier as an efficient approximation of attribution shows that DareU outperforms existing baselines by achieving effective unlearning while balancing forget quality and model utility well.

2605.30916 2026-06-01 cs.LG cs.GT econ.TH

Welfare, Improvability, and Variance: A Principal-Agent Approach to Optimal Benchmark Item Aggregation

福利、可改进性与方差:最优基准测试项聚合的主-代理方法

Andreas Haupt, Justin Hartenstein, Anka Reuel, Mykel Kochenderfer, Sanmi Koyejo

发表机构 * Department of Economics & Computer Science(经济与计算机科学系) Institute for Computational and Mathematical Engineering(计算与数学工程研究所) Department of Computer Science(计算机科学系) Department of Aeronautics & Astronautics(航空与航天系)

AI总结 提出将基准测试建模为多任务主-代理博弈,通过福利、可改进性和方差三个维度评估项目,并应用于OLMES数据集识别帕累托劣势项目。

详情
AI中文摘要

AI基准测试存在记录完善的局限性,先前研究探讨了污染、饱和以及构造不明确等问题。聚合受到的关注要少得多:基准测试通常通过统一平均项目级分数来总结,隐含地将每个测试项目视为同等重要。我们将基准测试建模为多任务主-代理博弈,并表明基准测试的福利损失由三个项目级原始要素共同决定:与规范性福利优先级的一致性、边际可改进性和性能方差。我们将该理论转化为一个审计框架,沿这三个轴对项目进行排序,并使用WORKBank(福利)、EvoLM 4B套件(可改进性)和PolyPythias 410M面板(方差)将其应用于OLMES项目。该框架揭示了在OLMES中,在亲工人福利操作化下帕累托劣势的项目。所有代码可在 https://github.com/stair-lab/principal-agent-benchmarks 获取。

英文摘要

AI benchmarks have well-documented limitations, with prior work examining contamination, saturation, and construct underspecification. Aggregation has received far less attention: benchmarks are typically summarized by uniformly averaging item-level scores, implicitly treating every test item as equally valuable. We model benchmarking as a multitask principal-agent game and show that the welfare loss from a benchmark is determined jointly by three item-level primitives: alignment with normative welfare priorities, marginal improvability, and performance variance. We translate the theory into an audit framework that ranks items along each of these three axes, and apply it to OLMES items using WORKBank for welfare, the EvoLM 4B suite for improvability, and the PolyPythias 410M panel for variance. The framework surfaces items that are Pareto-inferior within OLMES subject to a pro-worker welfare operationalization. All code is available at https://github.com/stair-lab/principal-agent-benchmarks.

2605.30914 2026-06-01 cs.LG cs.SE

Automating Formal Verification with Reinforcement Learning and Recursive Inference

用强化学习和递归推理自动化形式验证

Max Tan

发表机构 * Department of Electrical Engineering and Computer Science(电气工程与计算机科学系) Massachusetts Institute of Technology(麻省理工学院)

AI总结 研究通过可验证奖励的强化学习和验证器引导的推理搜索,提升大语言模型生成验证程序和证明的能力,在Dafny和Lean上取得显著进展。

Comments Master's thesis, 140 pages, 16 figures, 17 tables

详情
AI中文摘要

自动化形式验证对大语言模型仍然具有挑战性,因为证明助手和验证感知语言的数据稀缺,且正确性取决于满足精确的机器可检查规范,而非生成合理的代码。本文研究验证器环境如何通过可验证奖励的强化学习(RLVR)和验证器引导的推理时搜索,改进大语言模型生成验证程序和证明的能力。首先,我们使用组相对策略优化(GRPO)及相关变体,在Dafny中训练开源模型,将生成的候选程序组装成完整程序,并根据编译器和验证器的结果进行评分。在APPS衍生的Dafny数据集上的初步实验将验证奖励从2.2%提升至58.1%,但发现了规范破解问题,即模型利用弱形式规范而非实现预期解决方案。在过滤掉欠规范和易受攻击的任务后,多轮RLVR在改进的基准上将验证通过率从9.7%提升至31.1%。其次,我们在Lean中开发了一个验证器引导的推理框架,将证明生成视为对分解子目标、验证器反馈、诊断和修复的结构化搜索。使用固定的基础模型,包含证明修订器的完整框架在初始VeriCoding试点集上将通过率从直接修复的46.2%提升至69.2%。在更大的VERINA数据集上,整体任务分解加上证明修订器解决了42个先前未解决任务中的7个。我们还引入了Dalek-Bench,一个从Rust $\texttt{curve25519-dalek}$验证项目派生的仓库级Lean基准;初步结果仍然较弱,表明仍需更强的进度评估和特定任务的工具使用策略。

English abstract

Automated formal verification remains challenging for large language models because data for proof assistants and verification-aware languages is scarce, and correctness depends on satisfying precise machine-checkable specifications rather than producing plausible code. This thesis studies how verifier environments can improve LLM generation of verified programs and proofs through reinforcement learning from verifiable rewards (RLVR) and verifier-guided inference-time search. First, we train open-source models in Dafny with RLVR using Group Relative Policy Optimization (GRPO) and related variants, assembling generated candidates into complete programs and scoring them with compiler and verifier outcomes. Initial experiments on an APPS-derived Dafny dataset increased verified reward from 2.2% to 58.1%, but revealed specification hacking, where models exploit weak formal specifications instead of implementing the intended solutions. After filtering underspecified and vulnerable tasks, multi-turn RLVR on the refined benchmark improves the verified pass rate from 9.7% to 31.1%. Second, we develop a verifier-guided inference scaffold in Lean that treats proof generation as structured search over decomposed subgoals, verifier feedback, diagnostics, and repair. With a fixed base model, the full scaffold with proof reviser improves pass rate on an initial VeriCoding pilot set from 46.2% under direct repair to 69.2%. On the larger VERINA dataset, whole-task decomposition plus proof reviser solves 7 of 42 previously unsolved tasks. We also introduce Dalek-Bench, a repository-scale Lean benchmark derived from the Rust $\texttt{curve25519-dalek}$ verification project; preliminary results remain weak, indicating that stronger progress evaluation and task-specific tool-use policies are still needed.
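
To make the training signal described in this abstract more concrete, here is a minimal Python sketch of a verifier-based reward combined with the group-relative advantage normalization commonly associated with GRPO (each sampled candidate's reward minus the group mean, divided by the group standard deviation). It is not the thesis's implementation: the reward values, the function names, the use of the population standard deviation, and the `eps` stabilizer are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def verifier_reward(compiles: bool, verifies: bool) -> float:
    """Hypothetical reward from compiler and verifier outcomes.

    The specific values (0.0, 0.1, 1.0) are assumed for illustration.
    """
    if not compiles:
        return 0.0
    return 1.0 if verifies else 0.1


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt.

    Uses the population standard deviation; implementations differ on
    this detail, so treat it as an assumption.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four hypothetical candidates for one task: (compiles, verifies).
outcomes = [(True, True), (True, False), (False, False), (True, True)]
rewards = [verifier_reward(c, v) for c, v in outcomes]
print(rewards)
print(group_relative_advantages(rewards))
```

Candidates that verify receive positive advantages relative to their group, while candidates that fail to compile or verify receive negative ones. If every candidate in a group gets the same reward, all advantages are zero and the group contributes no learning signal.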

2605.30913 2026-06-01 cs.CL cs.AI cs.CY cs.HC

Toxic HallucinAItions: Perturbing Prompts and Tracing LLM Circuits

Soorya Ram Shimgekar, Agam Goyal, Amruta Parulekar, Joshua Chen, Yian Wang, Navin Kumar, Hari Sundaram, Eshwar Chandrasekharan, Koustuv Saha

Affiliations: University of Illinois Urbana-Champaign; Nimblemind

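AI summary: Studies how toxic-language perturbations affect the factual reliability of LLMs, finding that toxic wording lowers accuracy and increases uncertainty, and uses attribution-graph analysis to reveal the internal mechanisms.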


English abstract

Large language models (LLMs) are increasingly deployed in conversational settings where user tone ranges from polite to adversarial or toxic, yet less is known about whether toxic language in otherwise semantically equivalent prompts can degrade factual reliability. We study how lexical and tone-based prompt perturbations affect the factual reliability of LLMs. Using controlled prompt variations across polite, random, and three toxicity levels, we evaluate five LLMs on ARC-Easy, GSM8K, and MMLU. We find that toxic lexical perturbations consistently reduce factual accuracy and increase uncertainty, while polite phrasing yields limited and inconsistent changes. To examine whether these answer inconsistencies correspond to internal changes, we conduct attribution-graph analyses of model activations and influences. We find that increasing toxicity selectively amplifies perturbation-sensitive variant nodes while relatively stable core reasoning nodes remain more invariant. These findings position prompt tone as a critical dimension of LLM reliability and provide behavioral and mechanistic evidence that surface-level lexical variation can alter factual outputs and internal computation.
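
The behavioral comparison described in this abstract can be sketched as a per-condition accuracy table and its shift against a reference condition. The Python below is only an illustration under stated assumptions: the condition labels, the `"original"` baseline, and the toy records are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Compute accuracy per prompt condition.

    records: iterable of (condition, is_correct) pairs, one per answered item.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for condition, is_correct in records:
        totals[condition] += 1
        correct[condition] += int(is_correct)
    return {c: correct[c] / totals[c] for c in totals}


def accuracy_shift(acc, baseline="original"):
    """Accuracy of each condition minus the (assumed) baseline condition."""
    return {c: a - acc[baseline] for c, a in acc.items() if c != baseline}


# Toy records with hypothetical condition names; not data from the paper.
records = [
    ("original", True), ("original", True),
    ("polite", True), ("polite", True),
    ("toxic_high", True), ("toxic_high", False),
]
acc = accuracy_by_condition(records)
print(acc)
print(accuracy_shift(acc))
```

A negative shift for a condition means the model answered fewer items correctly under that perturbation than under the reference phrasing of the same questions.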