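arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.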

2605.28742 2026-06-08 cs.AI Updated version

CORE: Contrastive Reflection Enables Rapid Improvements in Reasoning

Linas Nasvytis, Simon Jerome Han, Ben Prystawski, Satchel Grant, Noah D. Goodman, Judith E. Fan

Affiliations: Stanford University

AI summary: Proposes Contrastive Reflection (CORE), a non-parametric learning algorithm that generates natural-language insights by contrasting successful and failed reasoning traces, improving reasoning performance faster than parametric (GRPO) and non-parametric (GEPA, episodic RAG, MemRL) methods with few training samples and rollouts.

Abstract

Language models can use verifiable rewards to improve at a wide variety of reasoning tasks. However, both parametric (e.g. RLVR) and non-parametric (e.g. prompt optimization) approaches to doing so typically require hundreds of training samples and thousands of model rollouts, making them expensive in the best case and intractable in the worst. To address this challenge, we introduce Contrastive Reflection (CORE), a non-parametric learning algorithm that compares past reasoning traces to generate insights: short natural-language descriptions of reasoning strategies and constraints that capture differences between successful and unsuccessful problem attempts. Across four reasoning tasks, we demonstrate that CORE enables more rapid improvement than both parametric (GRPO) and non-parametric (GEPA, episodic RAG, and MemRL) methods, while using fewer rollouts. Under fixed rollout budgets with as few as five training samples, CORE achieves the strongest performance in most task-data regimes. Finally, we highlight how CORE is substantially more context-efficient than non-parametric baselines, requiring fewer prompt tokens while storing learned knowledge as compact, interpretable natural-language insights. Our results therefore suggest that distilling contrasts between successful and unsuccessful reasoning traces into abstract and useful insights can provide a more efficient and interpretable route to model self-improvement than weight updates, prompt optimization, or direct reuse of stored reasoning traces.
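
The contrast step described in the abstract can be illustrated with a minimal sketch. It only shows one plausible way to pair successful and unsuccessful traces on the same problem and to phrase a reflection request. The `Trace` class, `contrastive_pairs`, `reflection_prompt`, and the prompt wording are all hypothetical names chosen for illustration, not the authors' implementation, and the model call itself is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Trace:
    problem_id: str
    reasoning: str
    correct: bool  # outcome of a verifiable reward check


def contrastive_pairs(traces):
    """Pair each successful trace with each failed trace on the same problem."""
    by_problem = defaultdict(lambda: {True: [], False: []})
    for t in traces:
        by_problem[t.problem_id][t.correct].append(t)
    pairs = []
    for groups in by_problem.values():
        for good in groups[True]:
            for bad in groups[False]:
                pairs.append((good, bad))
    return pairs


def reflection_prompt(good, bad):
    """Build a (hypothetical) prompt asking a model for one short insight."""
    return (
        "Successful attempt:\n" + good.reasoning + "\n\n"
        "Unsuccessful attempt:\n" + bad.reasoning + "\n\n"
        "In one sentence, state the strategy or constraint that separates them."
    )


# Illustrative toy traces; problem p2 has no success, so it yields no pair.
traces = [
    Trace("p1", "Checked units before dividing.", True),
    Trace("p1", "Divided without checking units.", False),
    Trace("p2", "Guessed the answer.", False),
]
pairs = contrastive_pairs(traces)
prompt = reflection_prompt(*pairs[0])
```

Problems with only failures or only successes contribute no pair in this sketch; how the actual method handles such cases is not stated in the abstract.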

2605.26099 2026-06-08 cs.CL cs.AI Updated version

Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference

Sangyun Lee, Sean McLeish, Tom Goldstein, Giulia Fanti

Affiliations: Carnegie Mellon University; University of Maryland

AI summary: Proposes a sleep-like consolidation mechanism that converts context into fast weights through offline recurrence, addressing the poor scaling of Transformer attention with context length, and validates it on synthetic tasks and a math reasoning task.

Abstract

Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with context length. To handle this, we study a sleep-like consolidation mechanism in which a model periodically converts recent context into persistent fast weights before clearing its key-value cache. During sleep, the model performs $N$ offline recurrent passes over the accumulated context and updates the fast weights in its state-space model (SSM) blocks through a learned local rule. During inference, this shifts extra computation to sleep while preserving the latency of wake-time prediction. We test our method on controlled synthetic tasks, including cellular automata and multi-hop graph retrieval, as well as a realistic math reasoning task, on which a regular transformer as well as SSM-attention hybrid models fail. We then show that increasing sleep duration $N$ for our models improves performance, with the largest gains on examples that require deeper reasoning.
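
As a rough intuition for why more offline passes can help, the sketch below writes key-value pairs into a fast-weight matrix with a classic delta rule and repeats the pass N times. This is a stand-in under stated assumptions: the paper's update is a learned local rule inside SSM blocks, which the abstract does not specify, and the names `consolidate` and `recall_error` as well as the toy data are hypothetical.

```python
def matvec(W, k):
    """Multiply matrix W (list of rows) by vector k."""
    return [sum(w * x for w, x in zip(row, k)) for row in W]


def consolidate(context, n_passes, lr=1.0):
    """Run n_passes offline passes over (key, value) pairs with a delta rule."""
    d_v, d_k = len(context[0][1]), len(context[0][0])
    W = [[0.0] * d_k for _ in range(d_v)]  # fast-weight matrix, starts empty
    for _ in range(n_passes):
        for key, value in context:
            pred = matvec(W, key)
            for i in range(d_v):
                err = value[i] - pred[i]
                for j in range(d_k):
                    W[i][j] += lr * err * key[j]  # W += lr * (v - W k) k^T
    return W


def recall_error(W, context):
    """Sum of squared errors when reading each value back from its key."""
    return sum(
        (v_i - p_i) ** 2
        for key, value in context
        for v_i, p_i in zip(value, matvec(W, key))
    )


# Two unit-norm, non-orthogonal keys (illustrative toy data).
context = [([1.0, 0.0, 0.0], [1.0, 0.0]), ([0.6, 0.8, 0.0], [0.0, 1.0])]
err_1 = recall_error(consolidate(context, 1), context)
err_20 = recall_error(consolidate(context, 20), context)
```

Because the two keys overlap, a single pass leaves interference between them, while repeated passes shrink the recall error geometrically in this toy setting.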

2605.30119 2026-06-08 cs.LG cs.AI cs.NE Updated version

Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

Thalea Schlender, Peter A. N. Bosman, Tanja Alderliesten

Affiliations: Leiden University Medical Center; Centrum Wiskunde & Informatica

AI summary: Uses genetic programming to multi-objectively evolve inspectable feature sets and to jointly optimise survival tree structure and non-linear split logic, improving the predictive performance and interpretability of shallow survival trees.

Abstract

Survival analysis concerns the task of predicting the time until an event occurs. Often used in the medical field, survival analysis deals with incomplete (i.e., censored) data, for instance, from patients who did not experience the event during the duration of the study. For practical use, both accuracy and interpretability are important. Survival trees are easy-to-follow survival models that split the patient cohort recursively into discrete patient groups. Whilst survival trees can capture complex relationships, they typically need to grow large, threatening interpretability. Moreover, survival trees are often built using greedy approaches that may overlook globally optimal split combinations, limiting predictive performance. Shallow survival trees require expressive, higher-order feature combinations to achieve competitive accuracy. We therefore use genetic programming to multi-objectively evolve inherently inspectable feature sets and study how they interact with different tree induction strategies. We further introduce an evolutionary approach that jointly optimises the survival tree structure and the non-linear split logic. Our findings demonstrate that evolutionary feature construction improves predictive performance across different tree induction strategies on two real-world datasets and two different survival tree depths. Given its speed and flexible presentation, the multi-objective evolution of entire trees likely holds the most future promise.

2605.26974 2026-06-08 cs.RO Updated version

Trust, Geometry, and Rules: A Credibility-Aware Reinforcement Learning Framework for Safe USV Navigation under Uncertainty

Yuhang Zhang, Shuqi Chai, Yukang Zhang, Liusha Yang, Mingchuan Zhang, Wei Wang, Qingjiang Shi, Quanbo Ge

Affiliations: School of Information Engineering, Henan University of Science and Technology; Shenzhen Research Institute of Big Data; School of Logistics Engineering, Shanghai Maritime University; Shenzhen Technology University; School of Computer Science, Wuhan University; School of Software Engineering, Tongji University

AI summary: Proposes a reinforcement learning framework that integrates credibility-aware learning, geometric safety shielding, and continuous rule-aware embedding to address safety and COLREGs compliance for USV navigation in dynamic maritime environments.

Abstract

Autonomous navigation of Unmanned Surface Vehicles (USVs) that is safe and compliant with the International Regulations for Preventing Collisions at Sea (COLREGs) remains a formidable challenge in dynamic maritime environments, particularly when perception systems exhibit miscalibrated uncertainty. Existing Reinforcement Learning (RL)-based methods often falter because state-estimation errors induce unreliable belief states that mislead the value function, while discrete traffic rules introduce discontinuity in the learning objective. To address these challenges, we propose a framework integrating credibility-aware learning, geometric safety shielding, and continuous rule-aware embedding. First, Credibility-Weighted Value Learning (CW-VL) introduces a dynamic trust factor derived from the discrepancy between filter-estimated covariance and empirical error statistics to modulate the critic's heteroscedastic loss, preventing policy overfitting to noisy samples. Second, the Covariance-Inflated Velocity Obstacle (CI-VO) maps position-estimation uncertainty into set-wise angular margins, forming a conservative geometric shield that overrides hazardous exploratory actions. Third, Risk-Aware COLREGs Duty Embedding relaxes binary encounter duties into continuous rule-aware signals, providing smooth sector-transition information and suppressing oscillation from sparse rule rewards. Simulated encounter studies demonstrate improved training robustness against perceptual inconsistency and superior collision avoidance and COLREGs compliance over baselines.
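
The idea of turning position uncertainty into an angular margin can be pictured with a simplified velocity-obstacle check. The sketch below assumes a circular obstacle whose radius is inflated by `k` times the largest standard deviation of a 2x2 position covariance. That inflation rule, the default `k = 2.0`, and the function names are assumptions made for illustration, since the abstract does not give the CI-VO formulas.

```python
import math


def max_std(cov):
    """Square root of the largest eigenvalue of a symmetric 2x2 covariance."""
    (a, b), (_, c) = cov
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    return math.sqrt(lam)


def cone_half_angle(distance, radius):
    """Half-angle of the collision cone for a disc of the given radius."""
    return math.asin(min(1.0, radius / distance))


def is_blocked(rel_pos, rel_vel, radius, cov=None, k=2.0):
    """True if the relative velocity points inside the (inflated) cone."""
    speed = math.hypot(*rel_vel)
    if speed == 0.0:
        return False  # no relative motion, so no predicted collision
    distance = math.hypot(*rel_pos)
    if cov is not None:
        radius += k * max_std(cov)  # assumed inflation rule, not from the paper
    cos_angle = (rel_pos[0] * rel_vel[0] + rel_pos[1] * rel_vel[1]) / (distance * speed)
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle < cone_half_angle(distance, radius)


rel_pos = (10.0, 0.0)                      # obstacle 10 units ahead
rel_vel = (math.cos(0.2), math.sin(0.2))   # heading 0.2 rad off the line of sight
nominal = is_blocked(rel_pos, rel_vel, radius=1.0)
inflated = is_blocked(rel_pos, rel_vel, radius=1.0, cov=[[1.0, 0.0], [0.0, 1.0]])
```

With these toy numbers the nominal cone half-angle is about 0.10 rad, so the heading passes, while the inflated cone is about 0.30 rad wide and the same heading is rejected. A shield of this kind trades some freedom of action for conservatism when the estimate is uncertain.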

2605.29223 2026-06-08 cs.LG Updated version

Inferring the Size of Large Language Models From Popular Text Memorization

Ivica Nikolic

Affiliations: National University of Singapore

AI summary: Proposes a black-box method that infers lower bounds on LLM parameter counts from generated text alone by analyzing how accurately models have memorized popular texts, evaluated on open-weight and closed-weight models.

Abstract

The parameter counts of the most widely used large language models (LLMs) are often withheld by their developers, leaving model size -- a primary reference point for interpreting capabilities and costs -- largely undisclosed. We propose a black-box method to infer conservative lower bounds on LLM size from generated text outputs alone, requiring nothing beyond the ability to submit text fragments and observe next-token predictions. Our approach is grounded in a key observation: popular, widely-circulated texts -- such as classical literature, religious texts, and foundational documents -- are present in virtually every large-scale pretraining corpus, and how accurately a model predicts the next word across text fragments of varying length is a reliable signal of how much it has memorized them, which in turn is fundamentally limited by its total parameter count. We aggregate this memorization signal across a diverse corpus of texts and fragment lengths into a single accuracy profile vector per model, and build two complementary inference methods on top of it: a pairwise statistical test that determines which of two models is larger, and a scaling-law estimator that extracts a one-dimensional latent index from these vectors via Principal Component Analysis (PCA) to map the aggregated signal to a parameter count. Validated on a broad set of open-weight models, both methods produce accurate and reliable lower bounds. When applied to popular closed-weight models, our framework recovers internal product hierarchies and reveals a clear divergence in industry scaling strategies: while some developers yield significantly higher bounds indicative of large generational parameter growth, others operate under strict parameter ceilings, demonstrating that hidden design choices can be systematically probed even under strict API limitations.
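
To make the pairwise comparison concrete, here is a minimal sketch of one possible test over two accuracy profile vectors: a one-sided sign test. The abstract does not say which statistical test the author uses, so the choice of a sign test is an assumption, the function name is hypothetical, and the accuracy values are made up for illustration.

```python
from math import comb


def sign_test_larger(profile_a, profile_b):
    """One-sided sign test: is model A more accurate than B across the profile?

    Returns (wins, trials, p_value), where p_value is P(X >= wins) for
    X ~ Binomial(trials, 0.5). Ties are dropped.
    """
    diffs = [a - b for a, b in zip(profile_a, profile_b) if a != b]
    trials = len(diffs)
    wins = sum(d > 0 for d in diffs)
    tail = sum(comb(trials, x) for x in range(wins, trials + 1))
    return wins, trials, tail / 2 ** trials


# Made-up next-word accuracies over 8 (text, fragment-length) cells.
model_a = [0.91, 0.84, 0.77, 0.69, 0.88, 0.80, 0.71, 0.66]
model_b = [0.85, 0.75, 0.70, 0.60, 0.80, 0.72, 0.65, 0.58]
wins, trials, p_value = sign_test_larger(model_a, model_b)
```

In this toy case model A wins all 8 cells, giving a p-value of 1/256. A small p-value would support the reading that A has memorized more of the probed texts, which the paper links to a larger parameter count.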

2605.25413 2026-06-08 cs.LG cs.AI cs.NA math.NA Updated version

Autoregression-Free Neural Operators for Time-Dependent PDEs

Jiaquan Zhang, Caiyan Qin, Haoyu Bian, Libin Cai, Yi Lu, Chaoning Zhang, Wei Dong, Yuanfang Guo, Yang Yang, Heng Tao Shen

Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China; School of Robotics and Advanced Manufacture, Harbin Institute of Technology; School of Mathematical Sciences, Capital Normal University; College of Information and Control Engineering, Xi’an University of Architecture and Technology; Laboratory of Intelligent Recognition and Image Processing, School of Computer Science and Engineering, Beihang University; School of Computer Science and Technology, Tongji University

AI summary: Proposes AFNO, which maps PDE time evolution into a latent space and uses flow matching to learn a continuous-time vector field, avoiding autoregressive rollout and yielding stable long-horizon prediction.

Comments: 23 pages, 18 figures

Abstract

Neural operators learn mappings from function-dependent inputs to solutions, providing an effective framework for solving partial differential equations (PDEs). For time-dependent PDEs, existing methods typically perform long-horizon prediction through autoregressive rollout directly in high-dimensional physical field spaces, where each predicted state is recursively fed back as the input for the next step. Although effective for short-term prediction, this autoregressive rollout and the lack of continuous-time modeling lead to progressive error accumulation over long-horizon rollouts. In this work, we propose Autoregression-Free Neural Operators (AFNO), which map the time evolution of PDEs into a latent space and model continuous-time vector fields within it. AFNO uses flow matching to learn the latent vector field, thereby enabling continuous evolution over extended horizons, avoiding autoregressive rollout and capturing dynamics under varying parameter configurations through explicit conditioning on physical parameters. Theoretical analysis and extensive experiments on six PDEs demonstrate that AFNO improves long-horizon prediction stability and consistently reduces rollout errors compared with the baselines.

2605.25806 2026-06-08 cs.CV Updated version

An Analysis Focused on Womens Safety: Can VAD Models Be Enhanced by a Multi-modal Dataset?

Sangeeta ., Maddikuntla Sai Prajwal, Debi Prosad Dogra, Kamalakar Vijay Thakare, Hyungjoo Jung, Ig-Jae Kim, Heeseung Choi

Affiliations: Indian Institute of Technology Bhubaneswar; Artificial Intelligence and Robotics Institute, Korea Institute of Science and Technology; Yonsei-KIST Convergence Research Institute, Yonsei University

AI summary: To address the lack of women-centric anomaly samples in existing video anomaly detection datasets, proposes ExtrAnom, a multi-modal benchmark of 1001 videos with textual descriptions covering 5 crime types, and validates the effectiveness of multi-modal methods for detecting women-centric anomalies.

Comments: 7 pages, 6 figures, 4 tables

Abstract

Women's safety and security are paramount for a modern society. Crimes against women occur in daylight as well as in low-light conditions. Often, such events are captured through real-world surveillance cameras that operate at lower resolutions. Despite substantial progress in CV-related research, video anomaly detection (VAD) focused on women's safety has not yet been adequately addressed. Existing video anomaly datasets contain well-lit, high-resolution, close-shot videos, and fail to represent women-centric anomalies such as chain snatching, stalking, inappropriate touch, and other subtle forms of crime against women. To address these problems, we propose the ExtrAnom dataset, a new multi-modal benchmark containing 1001 videos with textual descriptions, 500 normal and 501 anomalous, classified into 5 different types of women-centric crimes. The dataset comprises low-light (8%), low-resolution (13%), long-shot (15%), and daylight (64%) anomalous videos. It covers anomalous events such as stalking (3.9%), chain snatching (17.6%), kidnapping (7.3%), assassinations (2.3%), and harassment (18.9%), alongside normal videos (50%). Each video is supplemented with 4 textual annotations, including one human-generated and three LLM-generated descriptions, enabling cross-modal and VLM-based validations. The aim of creating a women-centric dataset is to accurately detect women-centric anomaly patterns that can be observed visually. The dataset supplements the VLMs to accurately generate video-level descriptions. ExtrAnom has been benchmarked against popular unimodal and multi-modal VAD datasets (e.g., XD-Violence, UCF-Crime, and UCA) and SOTA methods. Experiments reveal that the existing datasets are insufficient to train models for detecting women-centric anomalies.

2605.25757 2026-06-08 cs.CV Updated version

Broadband Hyperspectral 3D Imaging using Dispersed Structured Light

Suhyun Shin, Yunseong Moon, Ryota Maeda, David B. Lindell, Kiriakos N. Kutulakos, Seung-Hwan Baek

Affiliations: POSTECH, South Korea; University of Hyogo, Japan; University of Toronto, Canada

AI summary: Proposes a single-spectrograph broadband hyperspectral 3D imaging method that uses a stereo setup of visible and SWIR cameras with dispersed structured light to jointly reconstruct dense broadband hyperspectral reflectance and accurate 3D geometry, overcoming the narrow spectral range and system complexity of conventional approaches.

Abstract

Hyperspectral 3D imaging enables the capture of dense spectral information and scene geometry but has traditionally been confined to narrow spectral windows, typically the visible range. In this work, we introduce a broadband hyperspectral 3D imaging (BH3D) method to extend this capability across the full visible-near-infrared and short-wavelength infrared (SWIR) spectrum (450-1500 nm). This broad coverage is critical as it captures complementary physical cues: visible wavelengths reveal surface appearance, while SWIR bands provide insight into subsurface properties and material composition. However, realizing BH3D is challenging due to fundamental sensor constraints between visible-spectrum silicon and SWIR-spectrum InGaAs sensors, which necessitate complex multi-spectrograph designs. Here we propose a single-spectrograph BH3D system, using a stereo setup comprising visible and SWIR cameras, that reconstructs dense broadband hyperspectral reflectance together with accurate 3D geometry. Our key idea is to extend dispersed structured light to the broadband regime using a single spectrograph. We model the image formation of broadband dispersed structured light, and estimate hyperspectral reflectance and depth. We validate our approach on diverse real-world scenes, demonstrating accurate reconstruction with a mean spectral angle mapper of 0.13 rad, root mean square error of 0.03, and mean depth error of 4.5 mm. We further demonstrate identifying metameric materials, performing imaging through opaque layers, uncovering hidden features on banknotes, and revealing blood vessels.
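
For readers unfamiliar with the two spectral metrics quoted above, the sketch below computes them under their standard definitions: the spectral angle is the arccosine of the normalized dot product between two spectra, and RMSE is the root of the mean squared difference. The reflectance values are illustrative and are not taken from the paper.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return math.acos(max(-1.0, min(1.0, dot / norm)))  # clamp for rounding


def rmse(a, b):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


reference = [0.20, 0.40, 0.60, 0.80]       # illustrative reflectance values
scaled = [2 * r for r in reference]        # same shape, double the brightness
angle_scaled = spectral_angle(reference, scaled)
error_scaled = rmse(reference, scaled)
```

The spectral angle ignores overall brightness, so the scaled spectrum has an angle near zero even though its RMSE is large. This is why the two metrics are often reported together.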

2605.25638 2026-06-08 cs.CL cs.LG Updated version

Reinforcement Learning from Denoising Feedback

Qi He, Huan Chen, Ya Guo, Huijia Zhu, Yi R. Fung, Baojian Zhou

Affiliations: Fudan University; Ant Group; Hong Kong University of Science and Technology

AI summary: Proposes RLDF, which uses denoising feedback for policy loss estimation by optimizing from intermediate noisy states toward the clipped clean state with weighted timestep sampling, improving performance and generalizability on diffusion language models.

Abstract

Policy loss estimation remains a fundamental and long-standing challenge in reinforcement learning (RL) for diffusion language models (DLMs). We introduce Reinforcement Learning from Denoising Feedback (RLDF), a novel training paradigm that leverages feedback obtained from rollout and training processes to facilitate accurate and efficient policy loss estimation. To balance the trade-off between computational efficiency and estimation effectiveness, RLDF optimizes the model toward the clipped clean state from intermediate noisy states, combined with weighted timestep sampling over denoising timesteps. Extensive experiments demonstrate that RLDF achieves consistent and substantial improvements in both performance and generalizability across two representative DLM architectures, LLaDA and Dream, on multiple reasoning benchmarks. Our work lays a principled foundation for scalable reinforcement learning in diffusion language models. We build Drift, a training framework for DLMs, available at https://github.com/ant-research/Drift.

2605.25451 2026-06-08 cs.LG Updated version

BigMac: Breaking the Pareto Frontier of Compute and Memory in Multimodal LLM Training

Zili Zhang, Chengxu Yang, Shenglong Zhang, Chenyu Wang, Yufan Zhang, Tuo Dai, Zhouyang Li, Yuhong Ge, Chao Jin, Xin Jin, Yuliang Liu

Affiliations: Peking University; Independent Researcher; Xiaohongshu, Inc

AI summary: Proposes BigMac, a training pipeline that nests encoder and generator computation into the LLM pipeline to optimize compute efficiency and memory usage simultaneously, breaking the Pareto frontier between them.

Abstract

Training multimodal large language models (MLLMs) is challenged by both model and data heterogeneity. Existing systems redesign the training pipeline to address these challenges, but remain bound by a Pareto frontier between compute and memory efficiency, improving one only at the expense of the other. We present BigMac, a new training pipeline for multimodal LLMs. The core idea of BigMac is to elegantly nest the encoder and generator computation into the original LLM pipeline, forming a dependency-safe nested pipeline structure. With this design, BigMac reduces the activation memory complexity of the encoder and generator to O(1) while keeping the activation memory complexity of the LLM unchanged. At the same time, it achieves the same computational efficiency as the idealized setting with unlimited memory. As a result, BigMac breaks the Pareto frontier between computational efficiency and memory usage, enabling simultaneous optimization of both computation and memory in MLLM training. We evaluate BigMac on multiple MLLMs and training workloads. Experimental results show that BigMac achieves a 1.08$\times$-1.9$\times$ training speedup over baseline systems while maintaining stable memory usage as batch size increases.

2605.25171 2026-06-08 cs.CL Updated version

Re-defining Humor Data Objects for AI Humor Research

Anna Arnett, Bang Nguyen, Meng Jiang

Affiliations: Department of Computer Science and Engineering, University of Notre Dame

AI summary: Treats humor as a social interaction with context and explanations, defines a humor reasoning data object, and improves prompting so that LLMs generate higher-quality humor explanations, laying groundwork for data synthesis and augmentation in AI humor research.

Comments: Added link to code and data

Abstract

In most existing AI humor research, humor was treated as either "present" or "not present." We explore the concept of humor as a social interaction with context and explanations. During this project, we defined a humor reasoning data object and developed a way to prompt LLMs to generate an explanation of humor effective for the general population. We iterated from an earlier prompt to an improved prompt, found that the later version reduced important errors, and then scaled generation to a large number of data objects which have the potential to enable data synthesis and data augmentation for AI humor research. Our main takeaway is that better prompting of an LLM improves humor explanation quality, especially by handling missing context, multi-modality, and transcript issues more carefully. These results establish a strong foundation for future work on AI understanding of humor as social behavior. All code and data are available at: https://github.com/anna-arnett/ai-humor/ .

2605.25054 2026-06-08 cs.LG cs.AI Updated version

Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training

Ayush K. Varshney, Konstantinos Vandikas, Šarūnas Girdzijauskas, Adam Orucu, Aneta Vulgarakis Feljan

Affiliations: University of California, Berkeley; DeepMind; University of Cambridge

AI summary: Proposes Neuron-Level Mixed-Precision QAT (NMP-QAT), in which each neuron independently learns its own discrete precision via differentiable surrogates and straight-through estimators, expanding bit-width only when needed and achieving better compression-accuracy trade-offs on MLP and tabular foundation models.

Comments: Accepted at ICML - GlobalSouthML workshop, 2026

Abstract

Deploying deep neural networks on resource-constrained 6G edge devices demands aggressive compression with minimal accuracy loss. Quantization-Aware Training (QAT) has emerged as a leading compression approach; however, existing mixed-precision methods typically operate at coarse layer- or channel-level granularity. These methods often rely on heuristic or search-based bit-allocation strategies, which may overlook fine-grained variability at the neuron level. We propose Neuron-Level Mixed-Precision QAT (NMP-QAT), where each neuron independently learns its own discrete precision during training. Starting from low-bit precision, NMP-QAT expands bit-width only when training signals demand it, via differentiable surrogates and straight-through estimators, while preserving a fully discrete inference graph. This adaptability extends to both weights and activations, reducing memory movement. Evaluated on telecom and non-telecom datasets across MLP and tabular foundation model architectures, NMP-QAT achieves superior compression-accuracy trade-offs over mixed-precision QAT baselines, making it well-suited for Green AI deployments at the network edge.
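
The two building blocks named in the abstract, discrete quantization and a straight-through estimator, can be sketched in a few lines. This is a generic illustration for a single scalar weight with a fixed, assumed step size `scale`. It does not reproduce NMP-QAT's per-neuron bit-width learning or its differentiable surrogates, and the function names are hypothetical.

```python
def fake_quantize(w, bits, scale):
    """Round w to a signed `bits`-bit grid with step `scale`, then dequantize."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = max(q_min, min(q_max, round(w / scale)))
    return q * scale


def ste_grad(w, bits, scale, upstream=1.0):
    """Straight-through gradient: identity inside the clip range, zero outside."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    inside = q_min * scale <= w <= q_max * scale
    return upstream if inside else 0.0


w, scale = 0.9, 0.25
low = fake_quantize(w, bits=2, scale=scale)    # grid {-0.5, -0.25, 0.0, 0.25}
high = fake_quantize(w, bits=4, scale=scale)   # grid from -2.0 to 1.75
```

At 2 bits the weight 0.9 saturates at the top of the grid and the clipped straight-through gradient vanishes, while at 4 bits it is represented as 1.0 and the gradient passes through. A saturation signal of this kind is one intuitive reason a neuron might need more bits, although the paper's actual expansion criterion is not given in the abstract.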

2605.24011 2026-06-08 cs.CV cs.AI Updated version

ActQuant: Sub-4-bit Action-Guided Quantization for Vision-Language-Action Models

Arash Akbari, Arman Akbari, Masih Eskandar, Qitao Tan, Yixiao Chen, Jingwu Luo, Bertha Pangaribuan, Liyun Zhang, Jennifer Dy, Geng Yuan, Xue Lin, Gaowen Liu, Stratis Ioannidis, Yanzhi Wang

Affiliations: Northeastern University; University of California, Berkeley

AI summary: Proposes ActQuant, an action-guided mixed-precision post-training quantization framework that preserves VLA model performance under sub-4-bit weight quantization, and introduces OmniModel.cpp for efficient deployment.

Abstract

Vision-Language-Action (VLA) models exhibit remarkable action generation for embodied intelligence, but their heavy compute makes deployment on edge platforms impractical. Aggressive, sub-4-bit weight quantization is the natural solution, yet existing post-training quantization (PTQ) methods suffer severe performance degradation in this regime. To address this, we introduce ActQuant, an action-guided mixed-precision PTQ framework that operates in two stages: (1) an inter-tensor bit allocator that assigns each weight matrix a single bit-width based on how much it contributes to predicting the agent's actions; (2) an intra-tensor scale optimizer that tunes per-block quantization scales using action-aware curvature, so that dynamic range is concentrated on the weights most influential for control. To deliver the on-device benefits of our aggressive quantization, we further introduce OmniModel.cpp, an agentic conversion pipeline that ports architectures into a native C/C++ runtime with efficient low-bit kernels. We evaluate ActQuant both in simulation and on a real-world 6-DoF UR3 arm, with all models deployed through OmniModel.cpp. On the LIBERO benchmark, ActQuant is the only method that operates at or below 3 bits-per-weight, retaining 95.0% on OpenVLA-OFT and 94.8% on $π_{0.5}$. Pushed further, ActQuant reaches 2.5 bpw at 90.1% on OpenVLA-OFT, compressing the backbone from 14.3 GB to 2.7 GB (5.3$\times$). On the physical UR3 arm, $π_{0.5}$ quantized with ActQuant retains the baseline's success rate while reducing the memory footprint by 2.5$\times$.

2605.22882 2026-06-08 cs.CV cs.RO Updated version

GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation

Kaichen Zhou, Yuzhen Chen, Fangneng Zhan, Hang Hua, Grace Chen, Xinhai Chang, Ao Qu, Yilun Du, Zhuang Liu, Paul Pu Liang, Mengyu Wang

Affiliations: Harvard AI and Robotics Lab; Harvard University; Media Lab and EECS; MIT; Princeton University; MIT-IBM Watson AI Lab

AI summary: Proposes GEM-4D, which strengthens the geometric consistency of video world models by injecting dense 4D correspondence supervision distilled from a pretrained geometry foundation model, and adds an inverse dynamics module that converts video rollouts into executable robot trajectories, raising manipulation success rates.

Comments: Robotic World Model, Video Generative Model

Abstract

Video world models can generate realistic futures from a single instruction, but they often fail to track the same physical points consistently across time. As a result, the generated videos appear plausible, yet lack the physical grounding required for reliable action execution, such as robot manipulation. We present GEM-4D, a geometry-grounded video world model that resolves this limitation by injecting dense 4D correspondence supervision distilled from a pretrained geometry foundation model into the video generative backbone during training. This supervision enables the model to jointly capture appearance and geometric structure while retaining a single-stream architecture with no additional inference cost. We further introduce an inverse dynamics module that converts correspondence-consistent video rollouts into executable robot trajectories, enabling direct deployment in both real-world and simulated manipulation. GEM-4D achieves state-of-the-art performance on both video prediction and geometric consistency across both simulation and realistic scenarios and improves real-world manipulation success from 61% to 81%. Additional results are available at https://gem-4d.github.io/.

2605.21347 2026-06-08 cs.AI cs.LG cs.SE 版本更新

Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

Insights Generator: 面向LLM代理的系统性语料库级轨迹诊断

Akshay Manglik, Apaar Shanker, Kaustubh Deshpande, Jason Qin, Yash Maurya, Veronica Chatrath, Vijay S. Kalmath, Levi Lentz, Yuan Xue

发表机构 * Scale AI, Inc.

AI总结本文提出Insights Generator，一种多智能体系统，通过在语料库中提出和测试假设来生成基于证据的洞察报告，从而系统性地诊断LLM代理的行为模式。

AI中文摘要

诊断LLM代理的故障目前仍主要依赖人工。从业者检查一小部分执行轨迹，形成临时假设并反复迭代。这一过程会遗漏只有在轨迹群体层面才显现的模式，也无法扩展到生产语料库，其中单条轨迹可长达数万个token。我们形式化了语料库级轨迹诊断问题：给定一个执行轨迹语料库，目标是生成有据可依的自然语言洞察，刻画跨轨迹组的系统性行为模式，且每条洞察都关联到支持证据。我们提出Insights Generator（IG），一种多智能体系统，通过在轨迹语料库中提出并检验假设来回答诊断问题，从而生成有证据支持的洞察报告。我们从定性和客观两个维度评估IG，包括基于评分标准的报告评估，以及落实IG洞察后获得的下游性能提升。使用IG报告的人类专家将脚手架（scaffold）性能相对未修改的基线脚手架提高了30.4个百分点，而利用IG所得洞察的编码代理也表现出一致且稳定的提升。在多个基准测试中，IG的scout-investigator架构产生的发现在检测覆盖率上与竞争方法相当，同时领域专家评定IG报告在深度和证据质量方面领先。

英文摘要

Diagnosing failures in LLM agents remains largely manual. Practitioners inspect a small subset of execution traces, form ad-hoc hypotheses, and iterate. This process misses patterns that only emerge across trace populations and does not scale to production corpora where individual traces span tens of thousands of tokens. We formalize the problem of corpus-level trace diagnostics. Given a corpus of execution traces, the goal is to produce grounded natural-language insights that characterize systematic behavioral patterns across trace groups, each linked to supporting evidence. We present the Insights Generator (IG), a multi-agent system that answers diagnostic questions by proposing and testing hypotheses across the trace corpus to produce an evidence-backed insights report. We evaluate IG across qualitative and objective dimensions, spanning rubric-based report assessment and downstream performance improvements achieved by implementing IG insights. Human experts using IG reports improve scaffold performance by 30.4pp over the unmodified baseline scaffold, and coding agents leveraging IG-derived insights show consistent and stable gains. Across benchmarks, IG's scout-investigator architecture produces findings comparable in detection coverage to competing approaches, while domain experts rated IG reports as leading in depth and evidence quality.

2605.21731 2026-06-08 cs.LG 版本更新

I-SAFE: Wasserstein Coherence Metrics for Structural Auditing of Scientific AI Models

I-SAFE：基于Wasserstein一致性度量的科学AI模型结构审计

Barbara Tarantino, Gennaro Auricchio, Paolo Giudici

发表机构 * Department of Economics, University of Pavia（帕维亚大学经济学系）； Department of Mathematics, University of Padua（帕多瓦大学数学系）

AI总结本文提出I-SAFE框架，通过Wasserstein一致性度量对科学AI模型进行结构审计，揭示模型在分布响应上的差异，为科学AI模型提供更可靠的评估方法。

AI中文摘要

深度学习模型越来越多地用于科学预测任务，其中强大的基准性能常被解读为模型行为具有科学意义的证据。然而，这种解读是脆弱的，因为模型可能利用捷径特征、数据集特有的规律或分布偏差，这些在留出数据上具有预测性，却与领域相关的结构不一致。为了解决这一限制，我们引入了I-SAFE（Interventional Secure, Accurate, Fair and Explainable）框架，这是一个面向科学AI模型的事后（post-hoc）分布审计框架，核心是Wasserstein一致性度量（WCM）。给定一个训练好的黑盒预测器，以及一个编码了任务相关输入结构领域知识的外部结构先验，I-SAFE评估模型原始输出在结构引导的输入扰动下的表现。所提出的审计通过三个互补的指标衡量输出分布的一致性：基于分位数的度量（QBM）用于位置层面的一致性，WCM用于序关系一致性，以及一个平移不变的WCM变体用于形状一致性。我们在药物-靶点相互作用（DTI）预测上实例化I-SAFE，使用Davis激酶基准、KLIFS（Kinase-Ligand Interaction Fingerprints and Structures）结合口袋注释，以及三个基于序列的DTI模型：DeepConvDTI、DeepDTA和TAPB。尽管这些模型的预测性能处于可比水平，I-SAFE揭示了显著不同的分布响应特征，而这种差异在基于准确性的评估中是不可见的。该框架与模型无关，适用于任何输入可进行结构化分解且有外部先验可用的领域。

英文摘要

Deep learning models are increasingly used in scientific prediction tasks where strong benchmark performance is often interpreted as evidence of scientifically meaningful behavior. This interpretation is fragile, as models may exploit shortcut features, dataset-specific regularities, or distributional biases that are predictive on held-out data but not aligned with domain-relevant structure. To address this limitation, we introduce the I-SAFE (Interventional Secure, Accurate, Fair and Explainable) framework, a post-hoc distributional auditing framework for scientific AI models centered on the Wasserstein Coherence Metric (WCM). Given a trained black-box predictor and an external structural prior encoding domain knowledge about task-relevant input structure, I-SAFE evaluates raw model outputs under structurally guided perturbations of the input. The proposed audit measures output-distribution coherence through three complementary metrics: a Quantile-Based Metric (QBM) for location-level coherence, the WCM for ordinal coherence, and a translation-invariant WCM variant for shape coherence. We instantiate I-SAFE on drug-target interaction (DTI) prediction using the Davis kinase benchmark, KLIFS (Kinase-Ligand Interaction Fingerprints and Structures) binding-pocket annotations, and three sequence-based DTI models: DeepConvDTI, DeepDTA, and TAPB. Although the models operate in a comparable predictive regime, I-SAFE reveals substantially different distributional response profiles, a distinction invisible to accuracy-based evaluation. The framework is model-agnostic and applicable to any domain where inputs admit a structured decomposition and an external prior is available.
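
The distributional comparison described in this abstract can be illustrated with a rough sketch: an empirical 1-Wasserstein distance between model outputs before and after an input perturbation, plus a mean-centered variant that ignores pure shifts. The function names, the equal-sample-size simplification, and the use of mean-centering as the "translation-invariant" form are assumptions made for illustration; the abstract does not give the exact WCM definition.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical W1 between two equal-size 1-D samples (sorted-sample form)."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.mean(np.abs(a - b)))

def centered_wasserstein_1d(a, b):
    """W1 after removing each sample's mean (assumed shift-invariant variant)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return wasserstein_1d(a - a.mean(), b - b.mean())

# Hypothetical predicted affinities before and after a perturbation
# that only shifts every output by the same amount.
baseline = np.array([5.0, 5.5, 6.0, 7.0])
perturbed = baseline - 0.5

shift_sensitive = wasserstein_1d(baseline, perturbed)
shape_only = centered_wasserstein_1d(baseline, perturbed)
```

In this toy case the plain distance registers the shift while the centered variant does not, which is the kind of location-versus-shape distinction the three metrics are meant to separate.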

2605.21706 2026-06-08 cs.AI 版本更新

Latent-space Attacks for Refusal Evasion in Language Models

潜在空间攻击用于语言模型的拒绝规避

Giorgio Piras, Raffaele Mura, Fabio Brau, Maura Pintor, Luca Oneto, Fabio Roli, Battista Biggio

发表机构 * University of Cagliari（卡利亚里大学）； University of Genova（热那亚大学）

AI总结本文研究了如何通过潜在空间攻击来规避语言模型的拒绝行为，提出了一种受控的潜在空间攻击方法，以提高攻击成功率，优于现有方法。

AI中文摘要

安全对齐的语言模型被训练为拒绝有害请求，但通过引导其内部表示可以抑制拒绝行为。现有方法通过从模型激活中消融拒绝方向来实现这一点，旨在从模型的残差流中移除拒绝。尽管这些方法在经验上取得了成功，它们却缺乏对其所诱导的潜在空间变换以及该变换为何能抑制拒绝的原理性解释。在本文中，我们将拒绝抑制重新表述为一种潜在空间规避攻击，其攻击对象是为区分被拒绝与被回答的提示而训练的线性探针。在这一视角下，先前工作的均值差方向自然地定义了这样一个探针，而对其进行消融恰好等价于向其决策边界的投影，即最小置信度规避攻击。这一视角不仅解释了先前工作的经验成功，还揭示了一个关键局限：规避在决策边界处停止，因此有必要将表示进一步推入合规区域，即模型会作答的区域。我们据此提出受控潜在空间规避（Controlled Latent-space Evasion）攻击，以优化得到的置信度将表示投影到边界之外。我们在15个指令微调、多模态和推理模型上实现了最先进的攻击成功率，优于现有的拒绝消融基线和专门的越狱（jailbreak）攻击。

英文摘要

Safety-aligned language models are trained to refuse harmful requests, yet refusal behavior can be suppressed by steering their internal representations. Existing methods do so by ablating a refusal direction from model activations, aiming to remove refusal from the model's residual stream. Despite their empirical success, these methods lack a principled account of the latent-space transformation they induce and why it suppresses refusal. In this work, we recast refusal suppression as a latent-space evasion attack against linear probes trained to separate refused from answered prompts. Under this view, prior work's difference-in-means direction naturally defines such a probe, and its ablation is exactly a projection onto its decision boundary, i.e., a minimum-confidence evasion attack. This perspective not only explains the empirical success of prior work but also admits a key limitation: evasion stops at the decision boundary, motivating the need to push representations further into the compliant region, i.e., where the model answers. We leverage this by proposing a Controlled Latent-space Evasion attack that projects representations past the boundary with an optimized confidence. We achieve state-of-the-art attack success rate across 15 instruction-tuned, multimodal, and reasoning models, outperforming existing refusal-ablation baselines and specialized jailbreak attacks.
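
The geometric reading in this abstract (ablation as a projection onto a linear probe's decision boundary, and a controlled attack that moves past it) can be sketched on synthetic vectors. Everything below is an assumption for illustration: the probe bias placed midway between the class means, the sign convention (positive score means "refused"), and the fixed `margin` standing in for the optimized confidence. It is not the authors' implementation.

```python
import numpy as np

def diff_in_means_probe(refused, answered):
    """Linear probe w.x + b whose boundary bisects the two class means (assumed form)."""
    mu_r, mu_a = refused.mean(axis=0), answered.mean(axis=0)
    w = mu_r - mu_a
    b = -float(w @ (mu_r + mu_a)) / 2.0
    return w, b

def evade(x, w, b, margin=0.0):
    """Move x along w so that its probe score becomes exactly -margin.

    margin=0 lands on the decision boundary (minimum-confidence evasion);
    margin>0 pushes the representation past it into the 'answered' side.
    """
    score = np.asarray(x @ w + b)
    return x - ((score + margin) / (w @ w))[..., None] * w

# Synthetic stand-ins for residual-stream activations.
rng = np.random.default_rng(0)
refused = rng.normal(loc=1.0, size=(32, 8))
answered = rng.normal(loc=-1.0, size=(32, 8))
w, b = diff_in_means_probe(refused, answered)

on_boundary = evade(refused, w, b)              # score == 0
past_boundary = evade(refused, w, b, margin=2.0)  # score == -2
```

With a zero bias the `margin=0` case reduces to removing the component along `w`, which is the directional-ablation picture; the nonzero margin is what distinguishes the controlled variant in this sketch.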

2605.06890 2026-06-08 cs.AI cs.MA 版本更新

Beyond the Black Box: Interpretability of Agentic AI Tool Use

超越黑箱：代理AI工具使用的可解释性

Hariom Tatsat, Ariye Shater

发表机构 * GitHub

AI总结本文提出了一种基于稀疏自编码器（SAEs）和线性探针的机制可解释性工具包，旨在提升代理AI在长周期任务中对工具调用的可观测性和可解释性，通过分析模型内部状态来识别工具决策的关键特征，从而揭示代理失败的深层原因。

Comments 12 pages, 4 figures, 17 tables

AI中文摘要

AI代理在高风险企业工作流中前景可观，但由于工具使用失败难以诊断和控制，其可靠部署仍然受限。代理可能跳过必需的工具调用、不必要地调用工具，或执行后果只有在执行之后才显现的动作。现有的可观测性方法都是外部的：提示揭示相关性，评估对输出打分，而日志只有在模型已经行动之后才出现。在长周期设置中，这些失败代价高昂，因为早期的工具错误会改变轨迹的其余部分，增加token消耗，并带来下游的安全（safety）与安保（security）风险。我们引入了一个机制可解释性工具包，它基于稀疏自编码器（SAEs）和线性探针构建：前者将激活分解为稀疏的内部特征，后者是从这些特征中读取信号的轻量级分类器。该框架在每次行动前读取模型状态，并推断是否需要工具以及下一步工具动作的风险有多大。它识别与工具决策最相关的模型层和特征，并通过特征消融检验其功能重要性。我们在NVIDIA Nemotron函数调用数据集的多步轨迹上训练探针，并将相同的工作流程应用于GPT-OSS 20B和Gemma 3 27B模型。目标不是取代外部评估，而是补上缺失的一层：了解模型在行动之前内部发出了什么信号。这有助于揭示代理失败的深层原因，尤其是在长周期运行中，早期错误会影响后续的代理行为。更广泛地说，本文展示了机制可解释性如何支持内部可观测性，用于监控代理系统中的工具调用和风险。

英文摘要

AI agents are promising for high-stakes enterprise workflows, but dependable deployment remains limited because tool-use failures are difficult to diagnose and control. Agents may skip required tool calls, invoke tools unnecessarily, or take actions whose consequence becomes visible only after execution. Existing observability methods are external: prompts reveal correlations, evaluations score outputs, and logs arrive only after the model has already acted. In long-horizon settings, these failures are costly because an early tool mistake can alter the rest of the trajectory, increase token consumption, and create downstream safety and security risk. We introduce a mechanistic-interpretability toolkit built on Sparse Autoencoders (SAEs), which decompose activations into sparse internal features, and linear probes, lightweight classifiers that read signals from those features. The framework reads model states before each action and infers whether a tool is needed and how risky the next tool action is. It identifies the model layers and features most associated with tool decisions and tests their functional importance through feature ablation. We train the probes on multi-step trajectories from the NVIDIA Nemotron function-calling dataset and apply the same workflow to GPT-OSS 20B and Gemma 3 27B models. The goal is not to replace external evaluation, but to add a missing layer: visibility into what the model signaled internally before action. This helps surface deeper causes of agent failure, especially in long-horizon runs where an early mistake can impact subsequent agent behavior. More broadly, the paper shows how mechanistic interpretability can support internal observability for monitoring tool calls and risk in agent systems.

2512.23292 2026-06-08 cs.AI cs.LG 版本更新

Agentic Physical AI toward a Domain-Specific Foundation Model for Energy Systems: A Case Study on Nuclear Reactor Control

面向能源系统领域特定基础模型的代理式物理AI（Agentic Physical AI）：以核反应堆控制为例

Yoon Pyo Lee, Samrendra Roy, Kazuma Kobayashi, Sajedul Talukder, Diab Abueidda, Seid Koric, Souvik Chakraborty, Syed Bahauddin Alam

发表机构 * The Grainger College of Engineering, Nuclear, Plasma & Radiological Engineering, University of Illinois Urbana-Champaign（伊利诺伊大学厄巴纳-香槟分校格兰杰工程学院核、等离子体与放射工程系）； Department of Nuclear Engineering, Hanyang University（汉阳大学核工程系）； University of Texas - El Paso（德克萨斯大学埃尔帕索分校）； National Center for Supercomputing Applications（国家超级计算应用中心）； Department of Applied Mechanics, Indian Institute of Technology Delhi（印度理工学院德里分校应用力学系）； Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi（印度理工学院德里分校Yardi人工智能学院）

AI总结本研究提出让紧凑语言模型作为代理式物理AI（Agentic Physical AI）运行，以基于物理模拟器验证的策略优化替代感知推理，在核反应堆控制任务中探索领域特定基础模型，并展示了规模扩展带来的可靠性提升和策略集中化行为。

AI中文摘要

当前物理系统人工智能的主流范式，即将通用基础模型扩展至通用多模态推理，在控制接口处面临障碍。前沿视觉-语言模型在基本定量物理任务上仅达到50-53%的准确率，其行为近似于猜测：在保持语义合理性的同时违反物理约束。安全关键控制要求对已执行动作提供结果空间上的保证，而非参数空间上的模仿。本文提出了一条通向领域特定基础模型的路径，让紧凑语言模型作为代理式物理AI（Agentic Physical AI）运行：由基于物理的模拟器验证驱动策略优化，而非依赖感知推理。我们在合成核反应堆场景上训练了一个360M参数的模型，样本规模从10^3扩展到10^5。规模扩展在标称模拟条件下带来了强烈的、依赖于工况的可靠性提升，方差缩小约500倍，并在采样分布上消除了超过10%的终端功率偏移。尽管模型均衡地接触了四类执行机构，它在运行时将95%的执行集中于单一棒组策略，且无需强化学习或奖励工程。表征可在不同模拟器间迁移，无需改变架构。我们将该系统定位为验证、监控和纵深防御架构中的候选决策组件，而非独立的安全解决方案：所展示的行为仅说明模拟中单步任务的闭环可靠性，尚未涉及非标称运行、传感器故障或不确定性量化。

英文摘要

The prevailing paradigm in AI for physical systems, scaling general-purpose foundation models toward universal multimodal reasoning, confronts a barrier at the control interface. Frontier vision-language models achieve only 50-53% accuracy on basic quantitative physics tasks, behaving as approximate guessers that preserve semantic plausibility while violating physical constraints. Safety-critical control demands outcome-space guarantees over executed actions, not parameter-space imitation. Here we present a pathway toward domain-specific foundation models through compact language models operating as Agentic Physical AI: policy optimization driven by physics-based simulator validation rather than perceptual inference. We train a 360M-parameter model on synthetic nuclear reactor scenarios scaled from 10^3 to 10^5 examples. Scaling produces strong, regime-dependent reliability gains under nominal simulated conditions, with variance collapse of approximately 500x and elimination of >10% terminal-power excursions on the sampled distribution. Despite balanced exposure to four actuation families, the model concentrates 95% of runtime execution on a single-bank strategy, without reinforcement learning or reward engineering. Representations transfer across simulators without architectural change. We position the system as a candidate decision component within a verification, monitoring, and defense-in-depth architecture, not as a stand-alone safety solution: the demonstrated behavior speaks to closed-loop reliability on a single-step task in simulation and does not yet address off-nominal operation, sensor faults, or uncertainty quantification.

2605.20950 2026-06-08 cs.CV cs.AI 版本更新

Focus-then-Context: Subject-Centric Progressive Visual Token Reduction for Vision-Language Models

先聚焦后上下文：面向视觉-语言模型的以主体为中心的渐进式视觉token缩减

Yulin Zhao, Zheng Zhang

发表机构 * Harbin Institute of Technology, Shenzhen, China（哈尔滨工业大学（深圳））； ShenZhen Loop Area Institute（深圳河套学院）

AI总结本文提出了一种主体导向的渐进视觉标记缩减方法SPpruner，通过模拟人类视觉感知系统的'聚焦-然后-上下文'机制，有效减少视觉标记数量，提升视觉-语言模型的推理效率，实验表明其在速度和资源消耗上均优于现有方法。

AI中文摘要

视觉-语言模型（VLMs）在推理过程中面临由于大规模视觉标记序列带来的计算成本瓶颈。现有的视觉标记缩减方法虽然减轻了这一负担，但无意中保留了与用户查询严格对齐的孤立视觉主体，无法充分探索显著主体及其上下文关系。本文提出SPpruner，一种以主体为中心的渐进缩减范式，模拟人类视觉感知系统的'聚焦-然后-上下文'机制。具体而言，我们首先构建了一个聚焦识别模块，以显式建模视觉显著性与语义相关性之间的相互作用。在此基础上，它可以挖掘全面的视觉主体光谱，确保视觉输入的高保真表示。随后，开发了一个上下文感知的结构扫描模块，用于聚合邻近区域的上下文线索。因此，它可以有效恢复全局关系依赖，以维持保留主体的结构完整性。大量实验表明，我们的范式在速度和资源消耗上均优于现有方法，在Qwen2.5-VL中仅保留22.2%的视觉标记即可实现2.53倍的加速，在LLaVA中实现67%的FLOPs减少，仅导致0.6%的精度下降。

英文摘要

Vision-Language Models (VLMs) face a bottleneck of prohibitive computational costs arising from massive visual token sequences during inference. Existing vision token reduction methods alleviate this burden, but they unintentionally preserve the isolated visual subject strictly aligned with the user's query, which fails to substantially explore salient subjects and their contextual relationships. In this paper, we propose SPpruner, a subject-centric progressive reduction paradigm that emulates the Focus-then-Context mechanism of the human visual perception system. Specifically, we first construct a focus identification module to explicitly model the interplay between visual saliency and semantic relevance. This module excavates the comprehensive visual subject spectrum to ensure a high-fidelity representation of visual input. Subsequently, a context-aware structural scanning module is developed to aggregate contextual cues from neighboring regions. It thereby restores global relational dependencies to uphold the structural integrity of the preserved subjects. Extensive experiments demonstrate that our paradigm consistently outperforms SOTA methods, achieving up to 2.53 times speedup with only 22.2% of visual tokens retained in Qwen2.5-VL and a 67% FLOPs reduction on LLaVA with a negligible 0.6% accuracy drop.

2605.19611 2026-06-08 cs.CV cs.ET 版本更新

Physics Guided Conditional Diffusion Framework for Generative Inverse Design of Manufacturable Metasurface based Absorbers

面向可制造超表面吸收体生成式逆向设计的物理引导条件扩散框架

Vineetha Joy, Jamshed Palai, Satwik Sahu, Anshuman Kumar, Amit Sethi, Hema Singh

发表机构 * Centre for Electromagnetics, CSIR-National Aerospace Laboratories（CSIR-国家航空航天实验室电磁学中心）； Birla Institute of Technology and Science, Pilani（博拉理工学院皮拉尼校区）； Indian Institute of Technology, Bombay（印度理工学院孟买分校）

AI总结本文提出了一种物理引导的条件扩散框架，用于设计具有特定电磁响应的超表面吸收体，通过逐特征线性调制（FiLM）和预训练的代理电磁模拟器提高设计效率和条件准确性，实验表明该方法能够在2-18GHz频率范围内快速生成可实际制造的超表面结构。

AI中文摘要

针对特定电磁响应的超表面逆向设计需要生成既满足严格频谱约束又可制造的几何结构。传统设计方法依赖全波仿真进行迭代优化，对于大设计空间来说非常耗时且计算密集。此外，常用的生成方法往往条件保真度有限，生成的设计通常包含精细或不规则特征，难以制造。为此，我们提出了一种物理引导的条件质量增强扩散框架，用于超表面吸收体的逆向设计。其中，由目标反射特性构成的条件信息通过逐特征线性调制（FiLM）整合到模型中。此外，为了确保符合目标频谱，框架嵌入了预训练的代理电磁模拟器，通过频谱级损失函数引入物理感知的正则化。通过在2至18GHz频率范围内为不同类型的反射特性生成可实际制造的超表面结构，证明了所提模型的有效性。该框架在目标频谱与生成设计频谱之间取得了0.0006的平均频谱均方误差和0.958的频段对齐精度，显示出高条件准确性。此外，模型可为相同条件生成多种几何结构，从而为工程师提供多样化的设计选择。所提模型在约30秒内生成合适的设计，而传统方法在同等计算资源下可能需要数月时间。模型的效率还通过实验测量得到验证。

英文摘要

Inverse design of metasurfaces under continuous electromagnetic constraints requires generation of geometries that simultaneously satisfy stringent spectral specifications and remain manufacturable. Conventional approaches based on iterative full-wave simulations are computationally prohibitive for large design spaces, while existing generative models often suffer from poor conditional controllability and limited fabrication awareness. To address this, we propose a physics-guided, condition-quality-enhanced diffusion framework for the inverse design of metasurface-based absorbers. Fabrication-aware constraints are incorporated to ensure practical realizability of the generated designs. The framework introduces a conditioning mechanism for continuous spectral specifications, wherein feature-wise linear modulation propagates the condition across the denoising hierarchy, enabling stable and accurate generation with improved spectral controllability. Further, to embed EM consistency directly into the generative learning process, a pre-trained surrogate EM simulator is integrated within the diffusion training pipeline. The proposed framework generates physically realizable metasurface designs for diverse reflection characteristics in the frequency range of 2 to 18 GHz, achieving a very low average spectral mean squared error of 0.0006 and a high band alignment accuracy of 0.958. The framework also addresses the fundamentally non-unique nature of inverse EM design by enabling structured multimodal generation of geometrically distinct yet spectrally consistent metasurface designs for the same target response. The proposed model produces a suitable design in approximately 30 seconds, whereas the conventional approach can take several months under comparable computational resources. The efficiency of the model is also established via experimental measurements.
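
Feature-wise linear modulation, the conditioning mechanism named in this abstract, scales and shifts each feature channel by values computed from the condition. The sketch below shows that operation on a single feature map. The shapes, the single linear projection per parameter, and the 16-point target spectrum are hypothetical choices for illustration; the abstract does not describe how the framework wires FiLM into its denoiser.

```python
import numpy as np

def film(features, condition, w_gamma, b_gamma, w_beta, b_beta):
    """Apply gamma(c) * features + beta(c), one (gamma, beta) pair per channel.

    features:  (C, H, W) feature map from a denoising layer
    condition: (D,) conditioning vector, e.g. a sampled target spectrum
    """
    gamma = w_gamma @ condition + b_gamma   # (C,)
    beta = w_beta @ condition + b_beta      # (C,)
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)
C, H, W, D = 4, 8, 8, 16                    # hypothetical sizes
features = rng.normal(size=(C, H, W))
target_spectrum = rng.uniform(size=D)       # stand-in for reflection values

# Zero weights with gamma bias 1 and beta bias 0 leave the features unchanged,
# a common neutral starting point for the modulation.
identity_out = film(features, target_spectrum,
                    np.zeros((C, D)), np.ones(C), np.zeros((C, D)), np.zeros(C))

# Random projections give a spectrum-dependent scale and shift per channel.
modulated = film(features, target_spectrum,
                 rng.normal(size=(C, D)), np.ones(C),
                 rng.normal(size=(C, D)), np.zeros(C))
```

Because the same condition vector can feed a separate `(gamma, beta)` pair at every layer, this is one plausible way a condition could be propagated "across the denoising hierarchy".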

2605.17333 2026-06-08 cs.LG 版本更新

Leveraging Error Diversity in Group Rollouts for Reinforcement Learning

利用组内 rollout 的错误多样性改进强化学习

Wenpu Liu, Yuqi Xu, Weichu Xie, Yongfu Zhu, Shuai Dong, Ziyue Wang, Wenqi Shao, Xiaoying Zhang, Tong Yang, Nan Duan, Jiaqi Wang

发表机构 * Peking University（北京大学）； JD.COM（京东公司）； Shanghai Innovation Institute（上海创新研究院）

AI总结本文提出EDAS方法，利用组内 rollout 的错误多样性来提升强化学习效果：根据错误多样性调整错误 rollout 的优势信号，鼓励模型保持多样化的推理路径，从而提升训练效果。

Comments Code available at https://github.com/EDAS-jd/EDAS

AI中文摘要

基于可验证奖励的强化学习（RLVR）通常为每个提示采样多个响应，并根据各自的正确性分配二元奖励，但组输出的整体结构，特别是错误的分布，基本上被丢弃了。我们认为这是一个被错失的机会：实证分析表明，组内错误多样性是训练成功的强预测因素，那些引出多样化错误答案的问题从RLVR中的获益明显多于产生同质性失败的问题。受此启发，我们提出了错误多样性优势塑造（EDAS），一种轻量、与算法无关的技术，根据组内错误多样性调节错误 rollout 的优势信号。EDAS放大对主导性、重复性错误的惩罚，减弱对罕见的、探索性错误的惩罚，从而鼓励模型保持多样化的推理路径，并抑制错误固着。关键在于，EDAS是一种简单的事后调整，可以无缝集成到任何RLVR算法中。我们在多个主流RLVR方法之上、在一系列模型和七个具有挑战性的数学基准上验证了EDAS，展示了一致的改进。值得注意的是，在Qwen3-8B上，EDAS在七个基准上相对DAPO平均提升6.29分，证实了利用组内 rollout 中的潜在信息是增强RLVR的普遍有效策略。

英文摘要

Reinforcement Learning from Verifiable Rewards (RLVR) typically samples multiple responses per prompt and assigns binary rewards based on individual correctness, yet the collective structure of the group output, specifically the distribution of errors, is largely discarded. We identify this as a missed opportunity: empirical analysis reveals that error diversity within a group is a strong predictor of training success, with problems eliciting diverse wrong answers benefiting substantially more from RLVR than those producing homogeneous failures. Motivated by this observation, we propose Error Diversity Advantage Shaping (EDAS), a lightweight, algorithm-agnostic technique that modulates the advantage signal for incorrect rollouts based on intra-group error diversity. EDAS amplifies penalties for dominant, repeated errors and attenuates penalties for rare, exploratory ones, thereby encouraging the model to maintain diverse reasoning paths and discouraging error perseveration. Crucially, EDAS operates as a simple post-hoc adjustment that can be seamlessly integrated into any RLVR algorithm. We validate EDAS on top of several mainstream RLVR methods across a series of models and seven challenging math benchmarks, demonstrating consistent improvements. Notably, EDAS yields an average improvement of 6.29 points over DAPO on Qwen3-8B across seven benchmarks, confirming that exploiting the latent information in group rollouts is a broadly effective strategy for strengthening RLVR.
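
The shaping idea in this abstract (heavier penalties for dominant repeated errors, lighter ones for rare errors) can be made concrete with a small sketch on top of group-normalized advantages. The weighting rule below, which scales each wrong rollout by how far its answer's share of the errors sits above or below a uniform share, and the `alpha` strength parameter are hypothetical choices for illustration; the abstract does not state the EDAS formula.

```python
from collections import Counter
import statistics

def shaped_advantages(answers, correct, alpha=1.0):
    """Group-normalized advantages with error-frequency scaling for wrong rollouts."""
    rewards = [1.0 if a == correct else 0.0 for a in answers]
    mean = sum(rewards) / len(rewards)
    std = statistics.pstdev(rewards) or 1.0   # avoid dividing by zero when all rewards match
    base = [(r - mean) / std for r in rewards]

    wrong = [a for a in answers if a != correct]
    counts = Counter(wrong)
    shaped = []
    for answer, adv in zip(answers, base):
        if answer == correct:
            shaped.append(adv)                # correct rollouts are left untouched
            continue
        share = counts[answer] / len(wrong)   # how dominant this error is
        uniform = 1.0 / len(counts)           # share if all errors were equally common
        shaped.append(adv * (1.0 + alpha * (share - uniform)))
    return shaped

# One group of five rollouts: "7" is a repeated error, "3" a rare one.
answers = ["7", "7", "7", "3", "12"]
adv = shaped_advantages(answers, correct="12")
```

Because the adjustment only rescales advantages after rewards are computed, a rule of this kind can sit behind any group-based advantage estimator, which matches the post-hoc framing in the abstract.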

2601.06600 2026-06-08 cs.CL 版本更新

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

探究多模态大语言模型在中国短视频虚假信息中的认知偏差

Jen-tse Huang, Chang Chen, Shiyang Lai, Wenxuan Wang, Michelle R. Kaufman, Mark Dredze

发表机构 * Johns Hopkins University（约翰霍普金斯大学）； Chinese University of Hong Kong（香港中文大学）； University of Chicago（芝加哥大学）； Renmin University of China（中国人民大学）

AI总结本文通过200个短视频数据集评估8种多模态大语言模型在健康领域虚假信息中的表现，发现Gemini-2.5-Pro表现最佳（信念分数71.5/100），而模型易受权威频道ID等社会线索影响。

Comments Accepted to ACL 2026 (Findings)

AI中文摘要

短视频平台已成为虚假信息的主要传播渠道，其中欺骗性声明常利用视觉实验和社会线索。尽管多模态大语言模型（MLLMs）展示了令人印象深刻的推理能力，但它们对与认知偏差纠缠的虚假信息的鲁棒性仍未得到充分探索。本文使用一个高质量、手动标注的200个短视频数据集，涵盖四个健康领域，引入了一个全面的评估框架。该数据集为三种欺骗模式——实验错误、逻辑谬误和捏造声明——提供了细粒度标注，每种模式均由国家标准和学术文献等证据验证。我们评估了八个前沿MLLMs在五种模态设置下的表现。实验结果表明，Gemini-2.5-Pro在多模态设置中取得了最高性能，信念分数为71.5/100，而o3表现最差，为35.2。此外，我们研究了视频中诱导错误信念的社会线索，发现模型易受权威频道ID等偏差影响。

英文摘要

Short-video platforms have become major channels for misinformation, where deceptive claims frequently leverage visual experiments and social cues. While Multimodal Large Language Models (MLLMs) have demonstrated impressive reasoning capabilities, their robustness against misinformation entangled with cognitive biases remains under-explored. In this paper, we introduce a comprehensive evaluation framework using a high-quality, manually annotated dataset of 200 short videos spanning four health domains. This dataset provides fine-grained annotations for three deceptive patterns (experimental errors, logical fallacies, and fabricated claims), each verified by evidence such as national standards and academic literature. We evaluate eight frontier MLLMs across five modality settings. Experimental results demonstrate that Gemini-2.5-Pro achieves the highest performance in the multimodal setting with a belief score of 71.5/100, while o3 performs the worst at 35.2. Furthermore, we investigate social cues that induce false beliefs in videos and find that models are susceptible to biases like authoritative channel IDs.

2605.15888 2026-06-08 cs.LG cs.AI 版本更新

CHoE: Cross-Domain Heterogeneous Graph Prompt Learning via Structure-Conditioned Experts

CHoE: 基于结构条件专家的跨域异构图提示学习

Peiyuan Li, Yongqi Huang, Jitao Zhao, Dongxiao He, Di Jin, Weixiong Zhang

发表机构 * School of Computer Science and Technology, Tianjin University（天津大学计算机科学与技术学院）； Department of Health Technology and Informatics, and Department of Data Science and Artificial Intelligence, The Hong Kong Polytechnic University（香港理工大学健康科技与信息学系、数据科学与人工智能系）

AI总结提出CHoE方法，通过结构条件专家网络和结构感知路由机制，实现跨域异构图提示学习，在少样本跨域任务中优于基线方法。

Comments accepted by IJCAI 2026, 9 pages, 4 figures

AI中文摘要

异构图提示学习（HGPL）已成为弥合预训练基础模型目标与其在下游异构图中应用之间差距的有前景范式。然而，现有HGPL方法主要针对域内场景设计，而实际部署通常跨越多个域，且预训练和下游任务的数据可能来自不同分布。因此，当前HGPL方法的适用性仅限于域内设置，当应用域发生变化时，其性能通常会下降。为解决这一严重限制，我们开发了CHoE，一种基于专家网络的跨域HGPL方法。在预训练期间，我们引入并训练结构条件专家；在提示调优期间，我们采用结构感知的专家路由和负载均衡机制，为每个元路径视图选择结构兼容的专家。此外，我们设计了一个基于提示的语义融合模块，以整合多个视图的表示用于下游预测。大量实验表明，CHoE在少样本跨域应用中持续提升性能，优于所有基线方法。

英文摘要

Heterogeneous Graph Prompt Learning (HGPL) has emerged as a promising paradigm for bridging the gap between the objectives of pre-training foundation models and their downstream applications in heterogeneous graph settings. However, existing HGPL methods are primarily designed for in-domain scenarios, whereas real-world deployments often span multiple domains, and the data used for pre-training and downstream tasks may originate from different distributions. Consequently, the applicability of current HGPL approaches is limited to in-domain settings, and their performance typically degrades when application domains shift. To address this serious limitation, we develop CHoE, a cross-domain HGPL method built upon an expert network. During pre-training, we introduce and train structure-conditioned experts, and during prompt tuning, we adopt a structure-aware expert routing and load balancing mechanism to select structurally compatible experts for each meta-path view. In addition, we design a prompt-based semantic fusion module to integrate representations across multiple views for downstream prediction. Extensive experiments show that CHoE consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches.

2605.15354 2026-06-08 cs.LG 版本更新

Controllable Molecular Generative Foundation Models

可控分子生成基础模型

Yihan Zhu, Yuhan Liu, Weijiang Li, Tengfei Luo, Meng Jiang

发表机构 * University of Notre Dame（圣母大学）

AI总结提出CoMole，一种基于基团感知图扩散的统一框架，结合强化学习优化条件反向策略，在材料与药物设计的九个目标上均实现最优可控性，MAE最高降低48.2%，且无需规则修正。

AI中文摘要

尽管基础模型在语言和视觉领域取得了成功，分子图生成仍然缺乏一个统一的框架来处理异构设计任务并具有可靠的可控性。虽然强化学习（RL）为任务特定优化提供了一种自然的后训练机制，但将其应用于图生成模型受到巨大的原子级动作空间和化学无效中间状态的阻碍。我们提出了Controllable Molecular Generative Foundation Models（CoMole），该模型基于统一的基团感知图扩散流程构建。通过学习基团感知图空间，CoMole将预训练的结构先验转化为可控生成，其中RL优化基于化学有意义决策的条件反向策略。我们从理论上刻画了原子级RL的瓶颈，并论证了基团感知策略优化的合理性。在涵盖材料和药物发现的三个异构基准测试中，CoMole在所有九个目标上均排名第一，相对于最强基线，MAE最高降低48.2%，并且在不依赖规则修正或事后过滤的情况下，有效性保持在0.94以上。我们进一步证明，CoMole通过仅优化任务嵌入而冻结生成器，将可控性迁移到未见属性，其性能与强大的任务特定基线相当。

英文摘要

Despite the success of foundation models in language and vision, molecular graph generation still lacks a unified framework for heterogeneous design tasks with reliable controllability. While reinforcement learning (RL) offers a natural post-training mechanism for task-specific optimization, applying it to graph generative models is hindered by the vast atom-wise action spaces and chemically invalid intermediate states. We propose Controllable Molecular Generative Foundation Models (CoMole), built with a unified motif-aware graph diffusion pipeline. By learning a motif-aware graph space, CoMole transfers pretrained structural priors into controllable generation, where RL optimizes conditional reverse policies over chemically meaningful decisions. We theoretically characterize the bottleneck of atom-level RL and justify motif-aware policy optimization. Across three heterogeneous benchmarks spanning materials and drug discovery, CoMole ranks first in controllability on all nine targets, reduces MAE by up to 48.2% relative to the strongest baselines, and maintains validity above 0.94 without rule-based correction or post-hoc filtering. We further show that CoMole transfers controllability to unseen properties by optimizing only task embeddings with the generator frozen, achieving performance competitive with strong task-specific baselines.

2605.14194 2026-06-08 cs.CL 版本更新

GradShield: Alignment Preserving Finetuning

GradShield: 保持对齐的微调

Zhanhao Hu, Xiao Huang, Patrick Mendoza, Emad A. Alghamdi, Basel Alomair, Raluca Ada Popa, David Wagner

发表机构 * University of California, Berkeley（加州大学伯克利分校）； HUMAIN ； King Abdulaziz City for Science and Technology（国王阿卜杜勒阿齐兹科学与技术城）； University of Washington, Seattle（华盛顿大学（西雅图））

AI总结提出GradShield过滤方法，通过计算微调隐式危害分数并采用自适应阈值，在微调前移除有害数据，保持LLM安全对齐，攻击成功率低于6%。

AI中文摘要

大型语言模型（LLM）在微调后存在安全对齐的重大风险，因为模型可能被显式和隐式有害数据破坏。即使一些看似良性的数据也可能无意中引导模型走向未对齐的行为。为了解决这个问题，我们引入了GradShield，一种原则性的过滤方法，通过在有害数据破坏模型对齐之前识别并移除它们，从而在微调过程中保护LLM。它通过计算每个数据点的微调隐式危害分数（FIHS）并采用自适应阈值算法来移除潜在有害数据。我们将GradShield应用于多个不同有害数据水平的实用微调任务，并使用各种指标评估所得LLM的安全性和实用性。结果表明，GradShield优于所有基线方法，在保持实用性能的同时，始终将攻击成功率（ASR）维持在6%以下。

英文摘要

Large Language Models (LLMs) pose a significant risk of safety misalignment after finetuning, as models can be compromised by both explicitly and implicitly harmful data. Even some seemingly benign data can inadvertently steer a model towards misaligned behaviors. To address this, we introduce GradShield, a principled filtering method that safeguards LLMs during finetuning by identifying and removing harmful data points before they corrupt the model's alignment. It removes potentially harmful data by computing a Finetuning Implicit Harmfulness Score (FIHS) for each data point and employs an adaptive thresholding algorithm. We apply GradShield to multiple utility finetuning tasks across varying levels of harmful data and evaluate the safety and utility performance of the resulting LLMs using various metrics. The results show that GradShield outperforms all baseline methods, consistently maintaining an Attack Success Rate (ASR) below 6% while preserving utility performance.

2605.14166 2026-06-08 cs.CV 版本更新

You Only Landmark Once: Lightweight U-Net Face Super Resolution with YOLO-World Landmark Heatmaps

只需标定一次关键点：基于YOLO-World关键点热图的轻量级U-Net人脸超分辨率

Riccardo Carraro, Anna Briotto, Endi Hysa, Marco Fiorucci, Lamberto Ballan

发表机构 * Università degli Studi di Milano（米兰大学）； Istituto Italiano di Tecnologia（意大利理工学院）

AI总结提出轻量级U-Net，利用YOLO-World生成的地标热图作为监督，无需额外训练辅助网络，实现8倍人脸超分辨率重建，提升关键区域细节。

Comments Accepted for publication at IEEE AVSS 2026 (Notification date: June 5, 2026)

AI中文摘要

人脸图像超分辨率旨在从严重退化的输入中恢复高分辨率人脸图像。在极端放大因子下，精细的面部细节常常丢失，使得准确重建具有挑战性。现有方法通常依赖重型网络架构、对抗训练方案或单独的对齐网络，增加了模型复杂度和计算成本。为解决这些问题，我们提出了一种基于轻量级U-Net的架构，旨在从严重退化的$16 \times 16$输入重建$128 \times 128$面部图像，实现$8\times$放大。一个关键贡献是一种新颖的无需辅助训练的监督策略，利用YOLO-World（一种开放词汇目标检测器）生成的热图来定位关键面部特征，如眼睛、鼻子和嘴巴。这些热图被转换为空间权重，形成热图引导的损失，强调语义重要区域的重建误差。与先前需要专用地标或对齐网络的方法不同，我们的方法直接重用检测器输出作为监督，保持高效的训练和推理流程。在对齐的CelebA数据集上的实验表明，所提出的损失一致地改善了定量指标，并产生了更清晰、更逼真的重建。总体而言，我们的结果表明，轻量级网络可以有效地利用检测驱动的先验进行感知上令人信服的极端放大，而无需对抗训练或增加计算成本。

英文摘要

Face image super-resolution aims to recover high-resolution facial images from severely degraded inputs. Under extreme upscaling factors, fine facial details are often lost, making accurate reconstruction challenging. Existing methods typically rely on heavy network architectures, adversarial training schemes, or separate alignment networks, increasing model complexity and computational cost. To address these issues, we propose a lightweight U-Net-based architecture designed to reconstruct $128 \times 128$ facial images from severely degraded $16 \times 16$ inputs, achieving an $8\times$ magnification. A key contribution is a novel auxiliary-training-free supervision strategy that leverages heatmaps generated by YOLO-World, an open-vocabulary object detector, to localize key facial features such as eyes, nose, and mouth. These heatmaps are converted into spatial weights to form a heatmap-guided loss that emphasizes reconstruction errors in semantically important regions. Unlike prior methods that require dedicated landmark or alignment networks, our approach directly reuses detector outputs as supervision, maintaining an efficient training and inference pipeline. Experiments on the aligned CelebA dataset demonstrate that the proposed loss consistently improves quantitative metrics and produces sharper, more realistic reconstructions. Overall, our results show that lightweight networks can effectively exploit detection-driven priors for perceptually convincing extreme upscaling, without adversarial training or increased computational cost.
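
The heatmap-guided loss described in this abstract turns landmark heatmaps into per-pixel weights on the reconstruction error. A minimal sketch of that idea follows, assuming an L1 error, weights of the form `1 + lam * heatmap`, and normalization by the total weight; the `lam` value and these specific choices are hypothetical, since the abstract does not give the loss formula.

```python
import numpy as np

def heatmap_weighted_l1(sr, hr, heatmap, lam=4.0):
    """Weighted mean absolute error; pixels under the heatmap count more.

    sr, hr:  arrays of shape (H, W) or (C, H, W) with values in the same range
    heatmap: (H, W) array in [0, 1], high near eyes, nose, and mouth
    """
    weights = np.broadcast_to(1.0 + lam * heatmap, sr.shape)
    return float(np.sum(weights * np.abs(sr - hr)) / np.sum(weights))

# Toy 128x128 target with one "landmark" pixel marked in the heatmap.
hr = np.zeros((128, 128))
heatmap = np.zeros((128, 128))
heatmap[40, 40] = 1.0

sr_landmark_error = hr.copy()
sr_landmark_error[40, 40] = 1.0      # error on the landmark pixel
sr_background_error = hr.copy()
sr_background_error[0, 0] = 1.0      # same error on a background pixel

loss_landmark = heatmap_weighted_l1(sr_landmark_error, hr, heatmap)
loss_background = heatmap_weighted_l1(sr_background_error, hr, heatmap)
```

The same-sized error costs more when it falls on a landmark, which is the emphasis on semantically important regions that the abstract describes; the detector itself is only needed to produce `heatmap`, not inside the loss.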

2605.10832 2026-06-08 cs.CL 版本更新

Towards On-Policy Data Evolution for Visual-Native Multimodal Deep Search Agents

面向视觉原生多模态深度搜索智能体的在策略数据演化

Shijue Huang, Hangyu Guo, Guanting Dong, Chenxin Li, Junting Lu, Xinyu Geng, Zhaochen Su, Zhenyu Li, Shuang Chen, Hongru Wang, Yi R. Fung

发表机构 * Hong Kong University of Science and Technology（香港科技大学）； Renmin University of China（中国人民大学）； The Chinese University of Hong Kong（香港中文大学）； Peking University（北京大学）； Tsinghua University（清华大学）； University of Edinburgh（爱丁堡大学）

AI总结提出在策略数据演化（ODE）框架，通过图像库引用协议和闭环数据生成器，解决多模态深度搜索中视觉证据不可复用和训练数据静态问题，在8个基准上显著提升性能。



Abstract

Multimodal deep search requires an agent to solve open-world problems by chaining search, tool use, and visual reasoning over evolving textual and visual context. Two bottlenecks limit current systems. First, existing tool-use harnesses treat images returned by search, browsing, or transformation as transient outputs, so intermediate visual evidence cannot be re-consumed by later tools. Second, training data is usually built by fixed curation recipes that cannot track the target agent's evolving capability. To address these challenges, we first introduce a visual-native agent harness centered on an image bank reference protocol, which registers every tool-returned image as an addressable reference and makes intermediate visual evidence reusable by later tools. On top of this harness, On-policy Data Evolution (ODE) runs a closed-loop data generator that refines itself across rounds from rollouts of the policy being trained. This per-round refinement makes each round's data target what the current policy still needs to learn. The same framework supports both diverse supervised fine-tuning data and policy-aware reinforcement learning data curation, covering the full training lifecycle of the target agent. Across 8 multimodal deep search benchmarks, ODE improves the Qwen3-VL-8B agent from 24.9% to 39.0% on average, surpassing Gemini-2.5 Pro in the standard agent-workflow setting (37.9%). At 30B, ODE raises the average score from 30.6% to 41.5%. Further analyses validate the effectiveness of image-bank reuse, especially on complex tasks requiring iterative visual refinement, while rollout-feedback evolution yields more grounded SFT traces and better policy-matched RL tasks than static synthesis.
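The image bank reference protocol in the abstract above (every tool-returned image becomes an addressable reference that later tools can consume) might look roughly like the following sketch. The class `ImageBank`, the `img_N` id scheme, and the tools `fake_search` and `crop` are all hypothetical names invented here. Images are stood in for by byte strings, and nothing below is taken from the authors' harness.

```python
from typing import Dict, Tuple


class ImageBank:
    """Hypothetical registry mapping reference ids to image payloads."""

    def __init__(self) -> None:
        # ref id -> (image bytes, name of the tool that produced it)
        self._entries: Dict[str, Tuple[bytes, str]] = {}

    def register(self, image: bytes, source_tool: str) -> str:
        """Store an image and return an addressable reference id."""
        ref = f"img_{len(self._entries)}"
        self._entries[ref] = (image, source_tool)
        return ref

    def resolve(self, ref: str) -> bytes:
        """Look up the image behind a reference id."""
        return self._entries[ref][0]


def fake_search(query: str) -> bytes:
    """Stand-in for an image search tool; returns fake pixel data."""
    return f"pixels-for:{query}".encode()


def crop(bank: ImageBank, ref: str, start: int, end: int) -> str:
    """Stand-in for a transformation tool that consumes a reference
    and registers its output as a new reference."""
    cropped = bank.resolve(ref)[start:end]
    return bank.register(cropped, source_tool="crop")


bank = ImageBank()
ref_a = bank.register(fake_search("red bridge"), source_tool="search")
ref_b = crop(bank, ref_a, 0, 6)  # a later tool reuses the earlier image
```

The design point this sketch tries to capture is that tools exchange short reference ids instead of raw images, so an intermediate result stays available for any later step.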


2605.08732 2026-06-08 cs.RO cs.LG Updated version

Latent Geometry Beyond Search: Amortizing Planning in World Models


Hoang Nguyen, Xiaohao Xu, Xiaonan Huang

Affiliations: Department of Robotics, University of Michigan, Ann Arbor

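AI summary: Proposes that, under a regularized latent geometry, planning can be amortized into a latent inverse-dynamics mapping, replacing online search with a lightweight GC-IDM. The controller matches or exceeds CEM in seven of eight environment-protocol settings while cutting per-decision cost by 100-130x.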

Comments 31 pages



Abstract

Modern vision-based world models can represent observations as compact yet expressive latent manifolds, but fast goal-oriented planning in these spaces remains challenging. This raises a central question: when does a learned representation simplify control, rather than merely enabling prediction? We study this question in a pretrained LeWorldModel, whose latent geometry is regularized for smoothness and uniformity. Our key insight is that, under such geometry, planning can be amortized into a latent inverse-dynamics mapping instead of requiring online search. We therefore replace iterative planning with a lightweight Goal-Conditioned Inverse Dynamics Model (GC-IDM) that maps the current latent state, goal latent state, and remaining horizon directly to the next action. Empirically, across four benchmark environments spanning navigation, contact-rich manipulation, and continuous control, our controller matches or exceeds CEM in seven of eight environment-protocol settings while reducing per-decision cost by 100-130x. A broader sweep over test-time planners (CEM, MPPI, iCEM, and gradient-based methods) shows that this result is not specific to a particular optimizer. These findings suggest that much of the structure recovered by test-time planning is already locally encoded in the latent representation. More broadly, our results indicate that sufficiently structured latent spaces can shift part of the planning burden from online optimization to learned inference. Our code is publicly available at https://github.com/hdnndh/Latent-Geometry-Beyond-Search-Amortizing-Planning-in-World-Models .
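The controller interface named in the abstract above (current latent state, goal latent state, and remaining horizon in, next action out) can be sketched on a toy problem. This is not the authors' model: the real GC-IDM is a learned network over LeWorldModel latents, whereas `toy_gc_idm` below is a closed-form stand-in for an assumed 1-D latent with additive dynamics. All function names are hypothetical.

```python
from typing import Callable, Tuple

Policy = Callable[[float, float, int], float]


def toy_dynamics(z: float, action: float) -> float:
    """Toy 1-D latent transition (assumption): the action shifts the latent."""
    return z + action


def toy_gc_idm(z: float, z_goal: float, remaining: int) -> float:
    """Closed-form stand-in for a learned goal-conditioned inverse dynamics
    model: spread the remaining distance evenly over the remaining steps,
    clipped to an assumed action range of [-1, 1]."""
    action = (z_goal - z) / max(remaining, 1)
    return max(-1.0, min(1.0, action))


def rollout(policy: Policy, z0: float, z_goal: float,
            horizon: int) -> Tuple[float, int]:
    """Run the amortized controller: one policy call per decision, no search."""
    z = z0
    policy_calls = 0
    for step in range(horizon):
        action = policy(z, z_goal, horizon - step)
        policy_calls += 1
        z = toy_dynamics(z, action)
    return z, policy_calls


final_z, calls = rollout(toy_gc_idm, z0=0.0, z_goal=1.0, horizon=4)
```

Each decision here costs a single policy evaluation. A sampling-based planner such as CEM instead evaluates many candidate action sequences over several iterations per decision, which is the kind of cost that amortization avoids.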


2605.08692 2026-06-08 cs.LG cs.CL Updated version

AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization


Beshr IslamBouli, David Jin

Affiliations: University of Waterloo

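AI summary: Proposes AAAC, which replaces the fixed scalar codebook with two small learned codebooks per layer (64 bytes) and has each weight group select the codebook that minimizes activation-weighted reconstruction error. This gives 4-bit weight quantization with zero extra storage overhead, completes in 3-30 minutes, and outperforms the compared baselines.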



Abstract

Post-training weight-only quantization to 4 bits is widely used to reduce the memory and compute costs of large language model inference. Existing PTQ methods, such as AWQ and GPTQ, improve how weights are mapped onto a fixed 4-bit grid through scaling, clipping, or error compensation. To further improve accuracy, methods such as OmniQuant and QuIP# use gradient-assisted algorithms at the cost of hours of quantization time. In this work, we propose AAAC (Activation-Aware Adaptive Codebooks), a lightweight method for 4-bit LLM weight quantization. AAAC replaces the fixed scalar codebook used in standard quantization with two small learned scalar codebooks (64 bytes) per layer. Each group of weights selects the codebook that minimizes activation-weighted reconstruction error, encoding the choice in the unused sign bit of the group's positive scale and adding zero storage overhead. AAAC completes in 3-30 minutes on a single GPU and adds no memory beyond the model itself. We evaluate against AWQ, GPTQ, IF4, GPTVQ, OmniQuant, SqueezeLLM, and QuIP# across model families. AAAC outperforms baselines at orders-of-magnitude less quantization time.
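The per-group codebook selection in the abstract above can be sketched as follows. Several details are assumptions made for illustration and are not specified in the abstract: the two codebooks are hand-written here rather than learned, the scale is the group's maximum absolute weight, the error is an importance-weighted squared error, and the names `quantize_group`, `encode_group`, and `decode_group` are hypothetical. The abstract's 64-byte figure would be consistent with two 16-entry codebooks stored in 16-bit floats, but that layout is an inference as well.

```python
import math
from typing import List, Sequence, Tuple

# Two illustrative 16-entry codebooks, indexed by a 4-bit code.
UNIFORM = [-1.0 + 2.0 * i / 15 for i in range(16)]
DENSE_NEAR_ZERO = [math.copysign(v * v, v) for v in UNIFORM]
CODEBOOKS = [UNIFORM, DENSE_NEAR_ZERO]


def quantize_group(weights: Sequence[float], importance: Sequence[float],
                   scale: float, codebook: Sequence[float]
                   ) -> Tuple[List[int], float]:
    """Map each weight to its nearest scaled level and return the indices
    plus the importance-weighted squared reconstruction error."""
    indices: List[int] = []
    error = 0.0
    for w, a in zip(weights, importance):
        idx = min(range(len(codebook)),
                  key=lambda k: abs(w - scale * codebook[k]))
        indices.append(idx)
        error += a * (w - scale * codebook[idx]) ** 2
    return indices, error


def encode_group(weights: Sequence[float], importance: Sequence[float]
                 ) -> Tuple[float, List[int]]:
    """Pick the lower-error codebook and store the choice in the scale's sign."""
    scale = max(abs(w) for w in weights)  # non-negative by construction
    results = [quantize_group(weights, importance, scale, cb)
               for cb in CODEBOOKS]
    choice = min(range(len(CODEBOOKS)), key=lambda c: results[c][1])
    signed_scale = -scale if choice == 1 else scale
    return signed_scale, results[choice][0]


def decode_group(signed_scale: float, indices: Sequence[int]) -> List[float]:
    """Recover weights: the sign bit selects the codebook, abs() the scale."""
    use_second = math.copysign(1.0, signed_scale) < 0
    codebook = CODEBOOKS[1] if use_second else CODEBOOKS[0]
    scale = abs(signed_scale)
    return [scale * codebook[i] for i in indices]


ones = [1.0, 1.0, 1.0, 1.0]  # equal activation importance per weight
# Weights clustered near zero with one outlier favor the dense codebook.
scale_a, idx_a = encode_group([0.02, -0.03, 0.01, 1.0], ones)
# Evenly spread weights favor the uniform codebook.
scale_b, idx_b = encode_group([-1.0, -0.6, 0.2, 0.6], ones)
decoded_a = decode_group(scale_a, idx_a)
```

Because the scale is always non-negative, its sign bit carries no information in a conventional format, so reusing it as a one-bit codebook selector adds no storage in this sketch.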
