arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.15741 2026-06-04 cs.CV

HyperDiT: Hyper-Connected Transformers for High-Fidelity Pixel-Space Diffusion

Yu He, Lichen Ma, Zipeng Guo, Xinyuan Shan, Jingling Fu, Dong Chen, Junshi Huang, Yan Li

Affiliations: University of Science and Technology of China

AI summary: To address the granularity dilemma in pixel-space diffusion models, where global semantics and fine-grained detail are hard to capture together, the paper proposes HyperDiT. It combines hyper-connected cross-scale interactions and scale-aware rotary position embedding with dense semantics from a pretrained visual foundation model to achieve high-fidelity generation in pixel space, reaching a SoTA FID of 1.56 on ImageNet 256×256.

Abstract

Pixel-space diffusion models bypass the reconstruction bottleneck of Variational Autoencoders (VAEs) but face a fundamental "granularity dilemma": capturing global semantics favors large patch scales, while generating high-fidelity details demands fine-grained inputs. To address this issue, we propose HyperDiT, a unified framework establishing Hyper-Connected Cross-Scale Interactions to bridge the semantic and pixel manifolds. Rather than injecting semantics through AdaLN, HyperDiT utilizes Cross-Attention mechanisms, enabling fine-grained tokens to query multi-level semantic anchors globally. To resolve the spatial mismatch during multi-scale interactions, we introduce Scale-Aware Rotary Position Embedding (SA-RoPE) to ensure precise geometric alignment among tokens of varying patch sizes. Furthermore, we incorporate Registers to learn the dense semantics from a pretrained Visual Foundation Model (VFM), effectively reducing generation hallucination and artifacts. Extensive experiments demonstrate that HyperDiT achieves a state-of-the-art (SoTA) FID of 1.56 on ImageNet 256×256 directly within the pixel space. By combining the fine-grained stream with semantic guidance, HyperDiT offers a superior paradigm for high-fidelity pixel generation.
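The abstract does not define SA-RoPE, so the following is only a minimal sketch of one way tokens with different patch sizes could share a coordinate frame: each token is indexed by the pixel-space center of its patch, and the standard RoPE frequency schedule is applied to that coordinate. The function names `patch_centers` and `rope_angles`, the patch sizes 4 and 16, and the base 10000 are illustrative assumptions, not details from the paper.

```python
def patch_centers(image_size: int, patch_size: int) -> list[float]:
    """Pixel-space centers of the patches along one image axis."""
    n = image_size // patch_size
    return [(i + 0.5) * patch_size for i in range(n)]


def rope_angles(position: float, dim: int, base: float = 10000.0) -> list[float]:
    """Rotation angles for one position under the usual RoPE schedule.

    Pair k of the embedding is rotated by position * base**(-2k/dim).
    """
    return [position / base ** (2 * k / dim) for k in range(dim // 2)]


# Hypothetical two-scale setup on a 256-pixel axis.
fine = patch_centers(256, 4)     # 64 fine-grained tokens per axis
coarse = patch_centers(256, 16)  # 16 coarse tokens per axis

# A coarse token and the fine tokens it covers now live in the same frame:
# the coarse center equals the mean of the four fine centers beneath it.
aligned = coarse[0] == sum(fine[0:4]) / 4
```

With this indexing, relative rotary phases between a fine query and a coarse key depend on their true pixel distance rather than on their per-scale token indices, which is the kind of geometric alignment the abstract describes.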

2605.15152 2026-06-04 cs.LG cs.AI

Widening the Gap: Exploiting LLM Quantization via Outlier Injection

Xiaohua Zhan, Kazuki Egashira, Robin Staab, Mark Vero, Martin Vechev

Affiliations: ETH Zurich

AI summary: This paper proposes the first quantization-conditioned attack targeting multiple advanced quantization methods (AWQ, GPTQ, GGUF I-quants). Injected outliers cause weight collapse, inducing malicious behavior in the model after quantization.

Abstract

LLM quantization has become essential for memory-efficient deployment. Recent work has shown that quantization schemes can pose critical security risks: an adversary may release a model that appears benign in full precision but exhibits malicious behavior once quantized by users. However, existing quantization-conditioned attacks have been limited to relatively simple quantization methods, where the attacker can estimate weight regions that remain invariant under the target quantization. Notably, prior attacks have consistently failed to compromise more popular and sophisticated schemes, limiting their practical impact. In this work, we introduce the first quantization-conditioned attack that consistently induces malicious behavior that can be triggered by a broad range of advanced quantization techniques, including AWQ, GPTQ, and GGUF I-quants. Our attack exploits a simple property shared by many modern quantization methods: large outliers can cause other weights to be rounded to zero. Consequently, by injecting outliers into specific weight blocks, an adversary can induce a targeted, predictable weight collapse in the model. This effect can be used to craft seemingly benign full-precision models that exhibit a wide range of malicious behaviors after quantization. Through extensive evaluation across three attack scenarios and LLMs, we show that our attack achieves high success rates against a broad range of quantization methods on which prior attacks fail. Our results demonstrate, for the first time, that the security risks of quantization are not restricted to simpler schemes but are broadly relevant across complex, widely-used quantization methods.
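The property the abstract relies on, that a large outlier can push the other weights in a block to zero, can be seen in a few lines of arithmetic. The sketch below uses plain symmetric absmax round-to-nearest quantization with one scale per block; this is an assumption chosen for clarity and is not the AWQ, GPTQ, or GGUF I-quant algorithm. The function name `absmax_quantize` and the example weights are hypothetical.

```python
def absmax_quantize(block, bits=4):
    """Quantize and dequantize a block with a single absmax scale.

    Assumes the block contains at least one non-zero weight.
    """
    qmax = 2 ** (bits - 1) - 1                  # 7 for signed 4-bit values
    scale = max(abs(w) for w in block) / qmax   # one scale shared by the block
    return [round(w / scale) * scale for w in block]


block = [0.025, -0.03, 0.01, 0.04]

# Without an outlier, every weight maps to a non-zero level.
clean = absmax_quantize(block)

# One large value stretches the shared scale; the small weights now fall
# below half a quantization step and round to zero.
with_outlier = absmax_quantize(block + [8.0])
```

The shared scale grows from 0.04/7 to 8/7, so each original weight is less than 4% of one step and collapses to zero, while the outlier itself is preserved.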

2605.14091 2026-06-04 cs.CV

Venus-DeFakerOne: Unified Fake Image Detection & Localization

GuangJian Team

Affiliations: Ant Group

AI summary: To address the mismatch between increasingly unified fake-image generation mechanisms and fragmented detection and localization research, the paper proposes DeFakerOne, a data-driven unified foundation model built on InternVL2 and SAM2. It performs image-level detection and pixel-level localization across scenarios and achieves the best performance on 39 detection and 9 localization benchmarks.

Abstract

In recent years, the rapid evolution of generative AI has fundamentally reshaped the paradigm of image forgery, breaking the traditional boundaries between document editing, natural image manipulation, DeepFake generation, and full-image AIGC synthesis. Despite this shift toward unified forgery generation, existing research in Fake Image Detection and Localization (FIDL) remains fragmented. This creates a mismatch between increasingly unified forgery generation mechanisms and the domain-specific detection paradigm. Bridging this mismatch poses two key challenges for FIDL: understanding cross-domain artifact transfer and interference, and building a high-capacity unified foundation model for joint detection and localization. To address these challenges, we propose DeFakerOne, a data-centric, unified FIDL foundation model integrating InternVL2 and SAM2. DeFakerOne enables simultaneous image-level detection and pixel-level forgery localization across diverse scenarios. Extensive experiments demonstrate that DeFakerOne achieves state-of-the-art performance, outperforming baselines on 39 forgery detection benchmarks and 9 localization benchmarks. Furthermore, the model exhibits superior robustness against real-world perturbations and state-of-the-art generators such as GPT-Image-2. Finally, we provide a systematic analysis of data scaling laws, cross-domain artifact transfer-interference patterns, the necessity of fine-grained supervision, and original-resolution artifact preservation, highlighting the design principles for scalable, robust, and unified FIDL.

2605.14054 2026-06-04 cs.AI cs.CV

Bad Seeing or Bad Thinking? Rewarding Perception for Multimodal Reasoning

Haozhe Wang, Qixin Xu, Changpeng Wang, Taofeng Xue, Chong Peng, Wenhu Chen, Fangzhen Lin

Affiliations: University of Science and Technology of China

AI summary: Proposes a reinforcement-learning-based Modality-Aware Credit Assignment framework (MoCA) that uses Perception Verification and Structured Verbal Verification to resolve the perception-reasoning trade-off in vision-language models, improving performance across multiple tasks.

Comments: Accepted by ICML 2026 as Oral

Abstract

Achieving robust perception-reasoning synergy is a central goal for advanced Vision-Language Models (VLMs). Recent advancements have pursued this goal via architectural designs or agentic workflows. However, these approaches are often limited by static textual reasoning or complicated by the significant compute and engineering burden of external agentic complexity. Worse, this heavy investment does not yield proportional gains, and a "seesaw effect" between perception and reasoning is often observed. This motivates a fundamental rethinking of the true bottleneck. In this paper, we argue that the root cause of this trade-off is an ambiguity in modality credit assignment: when a VLM fails, is it due to flawed perception ("bad seeing") or flawed logic ("bad thinking")? To resolve this, we introduce a reinforcement learning framework that improves perception-reasoning synergy by reliably rewarding perception fidelity. We explicitly decompose the generation process into interleaved perception and reasoning steps. This decoupling enables targeted supervision on perception. Crucially, we introduce Perception Verification (PV), leveraging a "blindfolded reasoning" proxy to reward perceptual fidelity independently of reasoning outcomes. Furthermore, to scale training across free-form VL tasks, we propose Structured Verbal Verification, which replaces high-variance LLM judging with structured algorithmic execution. These techniques are integrated into a Modality-Aware Credit Assignment (MoCA) mechanism, which routes rewards to the specific source of error (either bad seeing or bad thinking), enabling a single VLM to achieve simultaneous performance gains across a wide task spectrum.

2605.13672 2026-06-04 cs.CV cs.AI cs.LG

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

Giries Abu Ayoub, Morad Tukan, Loay Mualem

Affiliations: Department of Computer Science, University of Haifa; Independent Researcher; University of Stuttgart, Germany; IMPRS-IS, Germany

AI summary: Proposes the SpurAudio benchmark, which controls the association between foreground and background in audio to evaluate how sensitive few-shot classification models are to spurious correlations, and finds that existing methods degrade significantly when the background changes.

Abstract

Few-shot classification (FSC) is widely used for learning from limited labeled data, yet most evaluations implicitly assume that target concepts are independent of contextual cues. In real-world settings, however, examples often appear within rich contexts, allowing models to exploit spurious correlations between foreground content and background signals. While such effects have been studied in few-shot image classification, their role in few-shot audio classification remains largely unexplored, and existing audio benchmarks offer limited control over contextual structure. We introduce SpurAudio, a benchmark that leverages the natural separability of foreground events and background environments in audio to enable controlled, multi-level evaluation of contextual shifts across support and query sets. Using this benchmark, we show that many state-of-the-art few-shot methods suffer severe performance degradation when background correlations are disrupted, despite achieving similar accuracy under standard evaluation protocols. Crucially, this vulnerability persists even in large pretrained audio foundation models, ruling out limited backbone capacity as an explanation. Moreover, methods that appear comparable under conventional benchmarks can exhibit markedly different sensitivity to spurious correlations, revealing systematic algorithmic strengths and vulnerabilities tied to how feature representations interact with classifier heads at inference time. These findings provide new insight into the behavior of few-shot methods in audio and highlight the need for benchmarks that explicitly probe context dependence when evaluating FSC models.

2605.00182 2026-06-04 cs.LG

Towards A Generative Protein Evolution Machine with DPLM-Evo

Xinyou Wang, Liang Hong, Jiasheng Ye, Zaixiang Zheng, Yu Li, Shujian Huang, Quanquan Gu

Affiliations: Nanjing University; CUHK; Fudan University; ByteDance

AI summary: Proposes DPLM-Evo, an evolutionary discrete diffusion framework that explicitly models substitution, insertion, and deletion operations. It achieves state-of-the-art protein mutation effect prediction in the single-sequence setting and supports variable-length simulated evolution as well as protein editing and optimization.

Comments: A peer-reviewed version was accepted to ICML 2026

Abstract

Proteins are shaped by gradual evolution under biophysical and functional constraints. Protein language models learn rich evolutionary constraints from large-scale sequences, and discrete diffusion-based protein language models (e.g., DPLMs) are promising for both understanding and generation. However, existing DPLMs typically rely on masked diffusion that contradicts a simple biological intuition: proteins evolve through accumulated edits, not by emerging from masks. Consequently, these frameworks lack explicit pretraining objectives for substitution and insertion/deletion (indel) operations, limiting both optimization-style post-editing and flexible guided generation. To address these limitations, we present DPLM-Evo, an evolutionary discrete diffusion framework that explicitly predicts substitution, insertion, and deletion operations during denoising. DPLM-Evo decouples an upsampled-length latent alignment space from the variable-length observed sequence space, which makes indel-aware generation tractable. To better align substitutions with real evolution, we further introduce a contextualized evolutionary noising kernel that produces biologically informed, context-dependent mutation patterns. Across tasks, DPLM-Evo improves sequence understanding and achieves state-of-the-art mutation effect prediction performance on ProteinGym in the single-sequence setting. It also enables variable-length simulated evolution, and post-editing/optimization of existing proteins via explicit edit trajectories.
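To make the notion of an explicit edit trajectory concrete, here is a minimal sketch of the three operation types applied to an amino-acid string. It only illustrates substitution, insertion, and deletion on the observed sequence; it does not model the latent alignment space or the denoising network, and the names `Edit`, `apply_edit`, and `apply_trajectory` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    op: str            # "sub", "ins", or "del"
    pos: int           # index into the sequence as it is before this edit
    residue: str = ""  # new amino acid for "sub" and "ins"; unused for "del"


def apply_edit(seq: str, edit: Edit) -> str:
    """Return a new sequence with one edit applied."""
    if edit.op == "sub":
        return seq[:edit.pos] + edit.residue + seq[edit.pos + 1:]
    if edit.op == "ins":
        return seq[:edit.pos] + edit.residue + seq[edit.pos:]
    if edit.op == "del":
        return seq[:edit.pos] + seq[edit.pos + 1:]
    raise ValueError(f"unknown edit operation: {edit.op}")


def apply_trajectory(seq: str, edits: list[Edit]) -> list[str]:
    """Apply edits in order and keep every intermediate sequence."""
    states = [seq]
    for edit in edits:
        seq = apply_edit(seq, edit)
        states.append(seq)
    return states


trajectory = apply_trajectory(
    "MKTA",
    [Edit("sub", 1, "R"), Edit("ins", 4, "G"), Edit("del", 0)],
)
```

Insertions and deletions change the sequence length at each step, which is why a fixed-length masked formulation cannot represent them directly.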

1902.10607 2026-06-04 cs.RO cs.SY eess.SY

Necessary and Sufficient Conditions for Passivity of Velocity-Sourced Impedance Control of Series Elastic Actuators

Fatih Emre Tosun, Volkan Patoglu

Affiliations: Faculty of Engineering and Natural Sciences, Sabancı University

AI summary: This paper studies passivity conditions for the velocity-sourced impedance control architecture of series elastic actuators, proposes non-conservative design guidelines for haptic rendering of null impedance and pure springs, and emphasizes the importance of including physical damping when integral controllers are used.

Comments: Submitted to IEEE T-RO, 12 pages, 10 figures, 7 tables

Abstract

Series Elastic Actuation (SEA) has become prevalent in applications involving physical human-robot interaction as it provides considerable advantages over traditional stiff actuators in terms of stability robustness and fidelity of force control. Several impedance control architectures have been proposed for SEA. Among these alternatives, the cascaded controller with an inner-most velocity loop, an intermediate torque loop and an outer-most impedance loop is particularly favoured for its simplicity, robustness, and performance. In this paper, we derive the necessary and sufficient conditions to ensure the passivity of this cascade-controller architecture for rendering the two most common virtual impedance models. Based on the newly established passivity conditions, we provide non-conservative design guidelines to haptically display a null impedance and a pure spring while ensuring the passivity of interaction. We also demonstrate the importance of including physical damping in the actuator model during derivation of passivity conditions, when integral controllers are utilized. In particular, we show the adverse effect of physical damping on system passivity.
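For reference, the standard frequency-domain passivity test for a linear time-invariant one-port can be written as follows. This is the textbook positive-realness condition, not the paper's derived conditions; the symbols $Z(s)$, $F(s)$, $V(s)$ and the virtual stiffness $K$ are notation assumed here, and sign conventions vary between papers.

```latex
% Impedance seen at the interaction port (force over velocity).
Z(s) = \frac{F(s)}{V(s)}

% Z(s) is passive if and only if it is positive real:
\begin{enumerate}
  \item $Z(s)$ has no poles in the open right half-plane;
  \item $\operatorname{Re}\, Z(j\omega) \ge 0$ for every real $\omega$
        at which $j\omega$ is not a pole of $Z$;
  \item any poles on the imaginary axis are simple, with real and positive residues.
\end{enumerate}

% Target virtual impedances named in the abstract:
Z_{\text{null}}(s) = 0, \qquad Z_{\text{spring}}(s) = \frac{K}{s}, \quad K > 0
```

The test applies to the closed-loop impedance that the cascaded controller actually renders at the port, which differs from the ideal target because of actuator dynamics and controller gains.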

2304.10891 2026-06-04 cs.LG cs.AI cs.CV cs.RO cs.SY eess.SY

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

Juan Zhong, Yuhang Shi, Zukang Xu, Xi Chen

Affiliations: Renmin University of China; Artificial Intelligence Innovation and Incubation Institute, Fudan University; Shanghai Academy of AI for Science; Department of houmo.ai

AI summary: This paper surveys Transformer-based autonomous driving models and analyzes, from a deployment perspective, how compression and acceleration strategies (quantization, pruning, knowledge distillation, and others) affect model design, deployability, robustness, and safety.

Abstract

Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. At the same time, their deployment in real vehicles remains difficult because high-capacity attention-based architectures impose substantial latency, memory, and energy overhead. This survey reviews representative Transformer-based autonomous driving models and organizes them by task role, sensing configuration, and architectural design. More importantly, it examines these models from a deployment-oriented perspective and analyzes how efficiency constraints reshape model design choices in practice. We further review compression and acceleration strategies relevant to Transformer-based driving systems, including quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention, and discuss their benefits, limitations, and task-dependent applicability. Rather than treating compression as an isolated post-processing step, we highlight it as a system-level design consideration that directly affects deployability, robustness, and safety. Finally, we identify open challenges and future research directions toward standardized, safety-aware, and hardware-conscious evaluation of efficient autonomous driving systems.

2605.10246 2026-06-04 cs.AI

SciIntegrity-Bench: A Benchmark for Evaluating Academic Integrity in AI Scientist Systems

Zonglin Yang, Xingtong Liu, Xinyan Xu

Affiliations: Tongji University; University of Tübingen

AI summary: Proposes the SCIINTEGRITY-BENCH benchmark, which uses a dilemmatic evaluation paradigm to test the academic integrity of 7 LLMs across 33 scenarios. The overall integrity problem rate reaches 34.2%, every model exhibits failures, and in missing-data scenarios all models generate synthetic data rather than acknowledging infeasibility.

Abstract

AI scientist systems are increasingly deployed for autonomous research, yet their academic integrity has never been systematically evaluated. We introduce SCIINTEGRITY-BENCH, the first benchmark designed around a dilemmatic evaluation paradigm: each of its 33 scenarios across 11 trap categories is constructed so that honest acknowledgment of failure is the only correct response, while task completion requires misconduct. Across 231 evaluation runs spanning 7 state-of-the-art LLMs, the overall integrity problem rate reaches 34.2%, and no model achieves zero failures. Most strikingly, across missing-data scenarios, all seven models generate synthetic data rather than acknowledging infeasibility, differing only in whether they disclose the substitution. A further prompt ablation study separates two drivers: removing explicit completion pressure sharply reduces undisclosed fabrication from 20.6% to 3.2%, while the underlying synthesis rate remains unchanged, revealing an intrinsic completion bias that persists independent of prompt-level instructions. These findings point to the absence of honest refusal as a trained disposition as the primary driver of observed failures. We release SCIINTEGRITY-BENCH at https://github.com/liuxingtong/Sci-Integrity-Bench.
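The headline numbers can be cross-checked with a few lines of arithmetic. The run count follows directly from the abstract; the implied number of problematic runs assumes the 34.2% rate is a fraction of all runs, which the abstract does not state, so treat that figure as a rough reading rather than a reported result.

```python
scenarios = 33
models = 7
runs = scenarios * models              # 231, matching the abstract

# Assumption: the integrity problem rate is computed per evaluation run.
problem_rate = 0.342
implied_problem_runs = round(problem_rate * runs)

# Relative drop in undisclosed fabrication once completion pressure is removed.
before, after = 20.6, 3.2              # percentages from the abstract
relative_drop = (before - after) / before
```

Under that assumption roughly 79 of the 231 runs show an integrity problem, and removing explicit completion pressure cuts undisclosed fabrication by about 84% in relative terms.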

2602.02834 2026-06-04 cs.LG cs.AI

What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA

Jonas Petersen, Camilla Mazzoleni, Gian-Alessandro Lombardi, Federico Martelli, Riccardo Maggioni

Affiliations: ETH Zurich

AI summary: Through ablations of a minimal Transformer modification, the paper finds that sparse adjacency masking is the main structural inductive bias driving multi-hop reasoning, while relation parameters contribute little.

Comments: Accepted at GFM, ICML 2026

Abstract

What structural inductive bias helps transformers reason over knowledge graphs? Through controlled ablations of a minimal transformer modification with four independently removable components (sparse adjacency masking, edge-type biases, query scaling, value gating), we isolate which structural signals drive multi-hop reasoning. Our finding is sharp: sparse adjacency masking alone accounts for the dominant share of improvement over unmasked transformers (+72.5pp on 3-hop MetaQA, +45.5pp on WebQSP, +53.9pp on CWQ), while learned relation parameters add only modest refinement and can actively hurt without structural guidance. A zero-shot experiment provides architecturally independent corroboration: masking-based attention degrades 4.0x less than relation-specific weights when edge types are held out. The useful inductive bias for multi-hop KGQA is predominantly topological, not relational.
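Sparse adjacency masking can be sketched as a softmax that is restricted to graph neighbors. The code below is a generic illustration in pure Python, not the paper's implementation; the function name `masked_attention_weights` and the three-node chain graph are assumptions, and every node is assumed to have a self-loop so that no row is fully masked.

```python
import math


def masked_attention_weights(scores, adjacency):
    """Row-wise softmax over scores, keeping only entries where adjacency is True.

    scores[i][j] is the raw attention score from node i to node j.
    adjacency[i][j] is True when j is a neighbor of i (self-loops included).
    """
    weights = []
    for row_scores, row_adj in zip(scores, adjacency):
        allowed = [s for s, a in zip(row_scores, row_adj) if a]
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if a else 0.0
                for s, a in zip(row_scores, row_adj)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Chain graph 0 - 1 - 2 with self-loops.
adjacency = [
    [True, True, False],
    [True, True, True],
    [False, True, True],
]
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
weights = masked_attention_weights(scores, adjacency)
```

Node 0 places no attention on node 2 in a single layer, so information from node 2 can only reach it through node 1 across two layers, which ties attention depth to hop count.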

2602.22779 2026-06-04 cs.CV

TrajTok: Learning Trajectory Tokens enables better Video Understanding

Chenhao Zheng, Jieyu Zhang, Jianing Zhang, Weikai Huang, Ashutosh Kumar, Quan Kong, Oncel Tuzel, Chun-Liang Li, Ranjay Krishna

Affiliations: University of Washington; Allen Institute for Artificial Intelligence; Apple; Woven by Toyota, Inc

AI summary: Proposes TrajTok, an end-to-end video tokenization module that generates object trajectory tokens through implicit spatiotemporal clustering, improving the efficiency and performance of video understanding.

Comments: CVPR 2026

Abstract

Tokenization in video models, typically through patchification, generates an excessive and redundant number of tokens. This severely limits video efficiency and scalability. While recent trajectory-based tokenizers offer a promising solution by decoupling video duration from token count, they rely on complex external segmentation and tracking pipelines that are slow and task-agnostic. We propose TrajTok, an end-to-end video tokenizer module that is fully integrated and co-trained with video models for a downstream objective, dynamically adapting its token granularity to semantic complexity, independent of video duration. TrajTok contains a unified segmenter that performs implicit clustering over pixels in both space and time to directly produce object trajectories in a single forward pass. By prioritizing downstream adaptability over pixel-perfect segmentation fidelity, TrajTok is lightweight and efficient, yet empirically improves video understanding performance. With TrajTok, we implement a video CLIP model trained from scratch (TrajViT2). It achieves the best accuracy at scale across both classification and retrieval benchmarks, while maintaining efficiency comparable to the best token-merging methods. TrajTok also proves to be a versatile component beyond its role as a tokenizer. We show that it can be seamlessly integrated as either a probing head for pretrained visual features (TrajAdapter) or an alignment connector in vision-language models (TrajVLM) with especially strong performance in long-video reasoning.

2605.08665 2026-06-04 cs.CL

Hint Tuning: Less Data Makes Better Reasoners

Siqi Fan, Minghao Li, Xiaoqian Ma, Xiusheng Huang, Zhuo Chen, Bowen Qin, Liujie Zhang, Shuo Shang, Weihang Chen

Affiliations: University of Electronic Science and Technology of China; Xiaohongshu Inc.; National University of Singapore

AI summary: Proposes Hint Tuning, which automatically constructs training data in three hint states (No-Hint, Sparse-Hint, Full-Hint) so that reasoning models calibrate reasoning depth to problem difficulty. With only 1K samples it reduces token generation by 31.5% on average across several mainstream reasoning models while maintaining competitive accuracy.

Abstract

Large reasoning models achieve high accuracy through extended chain-of-thought but generate 5-8× more tokens than necessary, applying verbose reasoning uniformly regardless of problem difficulty. We propose Hint Tuning, a data-efficient approach that teaches models to calibrate reasoning depth. Our key insight: the corresponding instruct model serves as an ideal difficulty probe. By testing what the instruct model can solve with varying guidance, we automatically construct training data across three states: No-Hint (direct answer), Sparse-Hint (minimal prefix), and Full-Hint (complete reasoning). This converts the abstract challenge of difficulty labeling into a measurable consistency check between the instruct and reasoning models. With only 1K self-annotated samples, Hint Tuning achieves 24-66% token reduction (31.5% average) across mainstream reasoning models (Qwen3-Thinking, DeepSeek-R1-Distill) at multiple scales (4B-32B) while maintaining competitive accuracy on five benchmarks. Unlike methods requiring massive distillation datasets or expensive RL, we achieve superior efficiency through simple alignment with the instruct model's capabilities. Code and data are available at https://github.com/redai-infra/hint-tuning.
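The three-state labeling can be pictured as a short cascade of checks against the instruct model. The sketch below is a guess at the control flow based only on the abstract: the callable `instruct_answer`, the character-based prefix length, and exact string matching are all hypothetical stand-ins, and the released code at the URL above may differ.

```python
from typing import Callable


def label_hint_state(
    question: str,
    reference_answer: str,
    reasoning_trace: str,
    instruct_answer: Callable[[str, str], str],
    sparse_prefix_chars: int = 200,
) -> str:
    """Assign a hint state by probing what the instruct model can solve.

    instruct_answer(question, hint) is assumed to return the instruct
    model's final answer given an optional reasoning prefix as a hint.
    """
    # Easy: the instruct model is already right with no guidance.
    if instruct_answer(question, "") == reference_answer:
        return "No-Hint"
    # Medium: a short prefix of the reasoning trace is enough.
    prefix = reasoning_trace[:sparse_prefix_chars]
    if instruct_answer(question, prefix) == reference_answer:
        return "Sparse-Hint"
    # Hard: keep the complete reasoning.
    return "Full-Hint"


def stub_instruct(question: str, hint: str) -> str:
    """Toy stand-in for an instruct model, used only to exercise the cascade."""
    if question == "easy":
        return "42"
    if question == "medium" and hint:
        return "42"
    return "unknown"


labels = [
    label_hint_state(q, "42", "first add, then multiply", stub_instruct)
    for q in ("easy", "medium", "hard")
]
```

The label then decides how much of the reasoning trace is kept in the training target for that question.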

2510.17281 2026-06-04 cs.LG cs.AI cs.IR

MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems

Qingyao Ai, Yichen Tang, Changyue Wang, Jianming Long, Weihang Su, Yiqun Liu

Affiliations: Department of Computer Science and Technology, Tsinghua University, Beijing, China

AI summary: Proposes a user feedback simulation framework and MemoryBench, a comprehensive benchmark spanning domains, languages, and task types, to evaluate how well LLM systems learn continually from accumulated user feedback. Experiments show that existing methods are unsatisfactory in both effectiveness and efficiency.

Abstract

Scaling up data, parameters, and test-time computation has been the mainstream approach to improving LLM systems (LLMsys), but its upper bound has almost been reached due to the gradual depletion of high-quality data and the marginal gains obtained from greater computational resource consumption. Inspired by the ability of humans and traditional AI systems to learn from practice, constructing memory and continual learning frameworks for LLMsys has become an important and popular research direction in recent literature. Yet, existing benchmarks for LLM memory often focus on evaluating the system on homogeneous reading comprehension tasks with long-form inputs rather than testing its ability to learn from accumulated user feedback at service time. Therefore, we propose a user feedback simulation framework and a comprehensive benchmark covering multiple domains, languages, and types of tasks to evaluate the continual learning abilities of LLMsys. Experiments show that the effectiveness and efficiency of state-of-the-art baselines are far from satisfactory, and we hope this benchmark can pave the way for future studies on LLM memory and optimization algorithms. Website: https://memorybench.thuir.cn Code: https://github.com/THUIR/MemoryBench Data: https://huggingface.co/datasets/THUIR/MemoryBench Data-Full: https://huggingface.co/datasets/THUIR/MemoryBench-Full

2506.01250 2026-06-04 cs.LG stat.ML

Neural Variance-aware Dueling Bandits with Deep Representation and Shallow Exploration

Youngmin Oh, Jinje Park, Taejin Paik

Affiliations: InfiniTree Samsung Electronics; Samsung Electronics

AI summary: Proposes the first variance-aware contextual dueling bandit algorithms, combining shallow exploration with nonlinear utility approximation by neural networks. Iterative self-improvement and spectral analysis reduce the network width requirement from Ω̃(T^{14}) to Ω̃(T^{6}) while achieving sublinear regret.

Comments: Accepted at AISTATS 2026; code at https://github.com/youngmin0oh/NVLDB-AISTATS2026

Abstract

We introduce the first variance-aware algorithms for contextual dueling bandits that leverage shallow exploration strategies with neural networks for nonlinear utility approximation. A key theoretical challenge is the absence of a closed-form estimator, which led prior work to require an extremely large network width $m$ (i.e., $m = \widetilde{\Omega}(T^{14})$). We address this constraint with a novel analytical approach that combines iterative self-improvement with spectral analysis. Our analysis significantly reduces the network width requirement to $m = \widetilde{\Omega}(T^{6})$, and shows that our algorithms achieve a sublinear regret of $\widetilde{\mathcal{O}}(d\sqrt{\sum_{t=1}^{T} \sigma_t^2} + \sqrt{dT})$ under both UCB and TS frameworks. Empirical results show that the proposed algorithms are not only computationally efficient and exhibit sublinear regret in practical settings, but also achieve state-of-the-art performance on both synthetic and real-world tasks.
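A short worked reading of the regret bound shows what variance awareness buys. The symbols $R_T$ for cumulative regret and $\bar{\sigma}$ for a uniform bound on the per-round standard deviation are notation assumed here; the abstract states only the bound itself.

```latex
% Bound stated in the abstract:
R_T = \widetilde{\mathcal{O}}\!\left( d\sqrt{\sum_{t=1}^{T} \sigma_t^2} + \sqrt{dT} \right)

% If \sigma_t^2 \le \bar{\sigma}^2 for all t, then
% \sum_{t=1}^{T} \sigma_t^2 \le \bar{\sigma}^2 T, and therefore
R_T = \widetilde{\mathcal{O}}\!\left( d\,\bar{\sigma}\sqrt{T} + \sqrt{dT} \right)

% Two limiting cases:
\bar{\sigma} \to 0: \quad R_T = \widetilde{\mathcal{O}}\!\left( \sqrt{dT} \right)
\qquad
\bar{\sigma} = \Theta(1): \quad R_T = \widetilde{\mathcal{O}}\!\left( d\sqrt{T} \right)
```

When comparisons are nearly deterministic the dependence on the dimension improves from $d$ to $\sqrt{d}$, while in the worst case the bound falls back to the familiar $d\sqrt{T}$ rate.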

2605.07724 2026-06-04 cs.LG cs.AI

Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences

Ali Falahati, Mohammad Mohammadi Amiri, Kate Larson, Lukasz Golab

Affiliations: University of Washington

AI summary: Theoretical analysis proves that recursive training with curation based on multiple reward functions can avoid generative model collapse and converge to a stable distribution satisfying a weighted Nash bargaining solution.

Comments: Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026

Abstract

Recursive retraining of generative models poses a critical representation challenge: when synthetic outputs are curated based on a fixed reward signal, the model tends to collapse onto a narrow set of outputs that over-optimize that objective. Prior work suggests that such collapse is unavoidable without adding real data into the mix. We revisit this conclusion from an alignment perspective and show that collapse can be mitigated through curation based on multiple reward functions. We formalize the dynamics of recursive training under heterogeneous preferences and prove that, under certain conditions, the model converges to a stable distribution that allocates probability mass across competing high-reward regions. The limiting distribution preserves diversity and provably satisfies a weighted Nash bargaining solution, offering a formal interpretation of value aggregation in synthetic retraining loops.
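
For reference, the textbook form of a weighted Nash bargaining solution is sketched below. The symbols are assumptions for illustration: $S$ is a feasible set of utility vectors, $d$ a disagreement point, and $w_i$ the bargaining weights. The abstract does not say how the paper maps reward functions and curation onto these quantities.

```latex
u^\star \in \arg\max_{u \in S,\; u \ge d} \; \prod_{i=1}^{n} (u_i - d_i)^{w_i}
\;=\;
\arg\max_{u \in S,\; u > d} \; \sum_{i=1}^{n} w_i \log (u_i - d_i),
\qquad w_i \ge 0,\quad \sum_{i=1}^{n} w_i = 1 .
```

The logarithmic form shows why such a solution preserves diversity: driving any single party's gain $u_i - d_i$ toward zero sends the objective to $-\infty$ whenever $w_i > 0$.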

2605.07032 2026-06-04 cs.LG cs.AI

A Systematic Investigation of RL-Jailbreaking in LLMs

LLMs中RL越狱的系统性研究

Montaser Mohammedalamen, Kevin Roice, Reginald McLean, Alyssa Lefaivre Škopac

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 本文首次系统分解RL越狱框架,通过分析奖励函数、动作空间、回合长度等环境形式化因素和算法措施,发现密集奖励和延长回合长度是越狱成功的主要驱动因素,并提供了提升RL越狱效率及强化模型防御的工具。

Comments Warning: This paper may contain unfiltered and potentially offensive jailbreaking examples. Accepted at the Second Workshop on Agents in the Wild: Safety, Security, and Beyond (AIWILD) at ICML 2026

AI中文摘要

生成模型从下一个词预测器演变为复杂系统的自主引擎,这要求严格的安全加固。对抗性越狱,即通过策略性操纵模型以产生有害输出,仍然是安全部署的主要威胁。虽然强化学习(RL)通过顺序优化将越狱视为多步攻击,但对该框架为何成功的机制理解仍不完整。为填补这一空白,我们首次对RL越狱进行了系统分解。我们将框架解构为问题形式化(奖励函数、动作空间、回合长度)和算法措施(RL算法、训练数据、奖励塑造),以识别对抗成功的结构决定因素。我们的结果表明,RL越狱者成功攻破了所有目标模型和安全措施。通过这种首次分析,我们证明环境形式化,特别是密集奖励和延长回合长度,是越狱成功的主要驱动因素。这项工作为提高RL越狱效率提供了工具,并最终强化生成模型以抵御基于RL的攻击。

英文摘要

The evolution of generative models from next-token predictors to autonomous engines of complex systems necessitates rigorous safety hardening. Adversarial jailbreaking, the strategic manipulation of models to elicit harmful output, remains a primary threat to safe deployment. While Reinforcement Learning (RL) frames jailbreaking as a multi-step attack through sequential optimization, a mechanistic understanding of why the framework succeeds remains incomplete. To fill this gap, we present the first systematic decomposition of RL jailbreaking. We deconstruct the framework into problem formalization (reward function, action space, episode length) and algorithmic measures (RL algorithm, training data, reward shaping) to identify the structural determinants of adversarial success. Our results reveal that the RL-jailbreaker successfully compromised all targeted models and safeguards. Through this first-of-its-kind analysis, we demonstrate that environment formalization, specifically dense rewards and extended episode lengths, is the primary driver of jailbreaking success. This work provides a tool for improving RL-jailbreaker efficiency and, ultimately, for hardening generative models against RL-based attacks.

2605.06637 2026-06-04 cs.CV

DPM++: Dynamic Masked Metric Learning for Occluded Person Re-identification

DPM++:用于遮挡行人重识别的动态掩码度量学习

Lei Tan, Yingshi Luan, Pincong Zou, Pingyang Dai, Liujuan Cao

发表机构 * Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University(中国教育部多媒体可信感知与高效计算重点实验室,厦门大学)

AI总结 提出DPM++动态掩码度量学习框架,通过自适应掩码选择可靠身份子空间,结合CLIP两阶段监督和显著性引导的补丁转移策略,在遮挡和整体场景下均达到最优性能。

AI中文摘要

尽管行人重识别取得了显著进展,但障碍物造成的遮挡在实际应用中仍是一个未解决的问题。困难在于不完整的遮挡样本与整体身份表示之间的不匹配。严重遮挡会移除判别性身体线索并引入背景杂波和遮挡物的干扰,使得全局度量学习不可靠。现有方法主要依赖额外的预训练模型来估计可见部分以进行对齐,或通过数据增强构建遮挡样本,但仍缺乏一个统一的框架来学习在真实遮挡模式下鲁棒的可见性一致匹配。本文提出了DPM++,一种用于遮挡行人重识别的动态掩码度量学习框架。DPM++学习一种输入自适应的掩码度量,动态地为每个遮挡实例选择可靠的身份子空间,使匹配能够强调可见性一致的证据,同时抑制不可靠的组件。基于分类器-原型空间,DPM++引入了基于CLIP的两阶段监督方案,其中ID级语义先验从文本分支学习并转移到分类器-原型空间中进行动态掩码匹配。为了增强掩码度量,我们引入了一种显著性引导的补丁转移策略,在训练过程中合成可控且逼真的遮挡样本。利用真实场景先验,该策略使模型暴露于真实的部分观察中,并提供比随机擦除更丰富的监督。此外,遮挡感知的样本配对和掩码引导优化提高了框架的稳定性和有效性。在遮挡和整体行人重识别基准上的实验表明,DPM++在整体和遮挡场景中均持续优于先前的最先进方法。

英文摘要

Although person re-identification has made impressive progress, occlusion caused by obstacles remains an unsettled issue in real applications. The difficulty lies in the mismatch between incomplete occluded samples and holistic identity representations. Severe occlusion removes discriminative body cues and introduces interference from background clutter and occluders, making global metric learning unreliable. Existing methods mainly rely on extra pre-trained models to estimate visible parts for alignment or construct occluded samples via data augmentation, but still lack a unified framework that learns robust visibility-consistent matching under realistic occlusion patterns. In this paper, we propose DPM++, a Dynamic Masked Metric Learning framework for occluded person re-identification. DPM++ learns an input-adaptive masked metric that dynamically selects reliable identity subspaces for each occluded instance, enabling matching to emphasize visibility-consistent evidence while suppressing unreliable components. Built upon the classifier-prototype space, DPM++ introduces a CLIP-based two-stage supervision scheme, where ID-level semantic priors are learned from the text branch and transferred into the classifier-prototype space for dynamic masked matching. To strengthen the masked metric, we introduce a saliency-guided patch transfer strategy to synthesize controllable and photo-realistic occluded samples during training. Exploiting real scene priors, this strategy exposes the model to realistic partial observations and provides richer supervision than random erasing. In addition, occlusion-aware sample pairing and mask-guided optimization improve the stability and effectiveness of the framework. Experiments on occluded and holistic person re-identification benchmarks show that DPM++ consistently outperforms previous state-of-the-art methods in both holistic and occlusion scenarios.

2605.00416 2026-06-04 cs.RO

Learning While Deploying: Fleet-Scale Reinforcement Learning for Generalist Robot Policies

在部署中学习:面向通用机器人策略的机群规模强化学习

Yi Wang, Xinchen Li, Pengwei Xie, Pu Yang, Buqing Nie, Yunuo Cai, Qinglin Zhang, Chendi Qu, Jeffrey Wu, Jianheng Song, Xinlin Ren, Jingshun Huang, Mingjie Pan, Siyuan Feng, Zhi Chen, Jianlan Luo

发表机构 * Shanghai Innovation Institute(上海创新研究院) AGIBOT Finch Columbia University(哥伦比亚大学)

AI总结 提出LWD框架,通过机群规模的离线到在线强化学习,结合分布式隐式价值学习与伴随匹配Q学习,持续后训练通用视觉-语言-动作策略,在16台双臂机器人上实现95%平均成功率。


AI中文摘要

通用机器人策略日益受益于大规模预训练,但仅靠离线数据不足以实现稳健的真实世界部署。已部署的机器人会遇到分布偏移、长尾故障、任务变化以及人类纠正机会,这些是固定演示数据集无法完全捕获的。我们提出了“在部署中学习”(LWD),一个机群规模的离线到在线强化学习框架,用于通用视觉-语言-动作(VLA)策略的持续后训练。从预训练的VLA策略开始,LWD通过使用在机器人机群中收集的自主 rollout 和人类干预,在部署、共享物理经验、策略改进和重新部署之间形成闭环。为了稳定地从异构、稀疏奖励的机群数据中学习,LWD结合了分布式隐式价值学习(DIVL)进行鲁棒的价值估计,以及通过伴随匹配的Q学习(QAM)在基于流的VLA动作生成器中进行策略提取。我们在一个由16台双臂机器人组成的机群上,在八个真实世界操作任务(包括语义杂货补货和3-5分钟的长时域任务)上验证了LWD。单个通用策略随着机群经验的积累而改进,平均成功率达到95%,在长时域任务上提升最大。

英文摘要

Generalist robot policies increasingly benefit from large-scale pretraining, but offline data alone is insufficient for robust real-world deployment. Deployed robots encounter distribution shifts, long-tail failures, task variations, and human correction opportunities that fixed demonstration datasets cannot fully capture. We present Learning While Deploying (LWD), a fleet-scale offline-to-online reinforcement learning framework for continual post-training of generalist Vision-Language-Action (VLA) policies. Starting from a pretrained VLA policy, LWD closes the loop between deployment, shared physical experience, policy improvement, and redeployment by using autonomous rollouts and human interventions collected across a robot fleet. To stabilize learning from heterogeneous, sparse-reward fleet data, LWD combines Distributional Implicit Value Learning (DIVL) for robust value estimation with Q-learning via Adjoint Matching (QAM) for policy extraction in flow-based VLA action generators. We validate LWD on a fleet of 16 dual-arm robots across eight real-world manipulation tasks, including semantic grocery restocking and 3-5 minute long-horizon tasks. A single generalist policy improves as fleet experience accumulates, reaching an average success rate of 95%, with the largest gains on long-horizon tasks.

2605.00242 2026-06-04 cs.CV cs.AI

MAEPose: Self-Supervised Spatiotemporal Learning for Human Pose Estimation on mmWave Video

MAEPose: 基于毫米波视频的人体姿态估计的自监督时空学习

Xijia Wei, Yuan Fang, Kevin Chetty, Youngjun Cho, Nadia Bianchi-Berthouze

发表机构 * University College London(伦敦大学学院)

AI总结 提出MAEPose,一种直接处理毫米波频谱视频的掩码自编码方法,通过自监督时空学习实现鲁棒的人体姿态估计,在三个数据集上优于现有方法。

AI中文摘要

毫米波雷达为基于RGB的人体姿态估计提供了一种更具隐私保护性的替代方案。然而,现有方法通常依赖预提取的中间表示,如稀疏点云或频谱图图像,这些方法丢弃了雷达视频流中自然存在的丰富时空信息用于模型学习,同时此类信号处理增加了系统复杂性。此外,现有解决方案主要采用端到端监督方式,未利用未标记的原始视频流来学习通用表示。在本研究中,我们提出MAEPose,一种基于掩码自编码的人体姿态估计方法,直接处理毫米波频谱视频。MAEPose从未标记的雷达视频中学习时空运动感知的通用表示,并利用其热图解码器进行多帧姿态估计预测。我们基于留一法交叉验证和严格的统计检验,在三个数据集上对其进行评估。MAEPose在MPJPE指标上始终优于最先进的基线方法,最高提升22.1%(p<0.05),并且在零样本旁观者干扰下保持鲁棒精度,误差仅增加6.5%。消融研究证实,预训练和热图解码器均有显著贡献,而模态分析表明,使用距离-多普勒视频作为输入比距离-方位角或其融合能实现更好的姿态估计性能,且计算成本更低。

英文摘要

Millimetre-wave (mmWave) radar offers a more privacy-preserving alternative to RGB-based human pose estimation. However, existing methods typically rely on pre-extracted intermediate representations such as sparse point clouds or spectrogram images, which discard the rich spatiotemporal information naturally present in radar video streams, while the extra signal processing adds system complexity. In addition, existing solutions are mainly trained in an end-to-end supervised manner without leveraging unlabelled raw video streams to learn generalized representations. In this study, we present MAEPose, a masked autoencoding-based human pose estimation approach that operates directly on mmWave spectrogram videos. MAEPose learns spatiotemporal motion-aware generalized representations from unlabelled radar video, and leverages its heatmap decoder for multi-frame pose estimation predictions. We evaluate it across three datasets based on leave-one-person-out cross-validation with rigorous statistical testing. MAEPose consistently outperforms state-of-the-art baselines by up to 22.1% in MPJPE (p<0.05), and maintains robust accuracy under zero-shot bystander interference with only a 6.5% error increase. Ablation studies confirm that both the pre-training and the heatmap decoder contribute substantially, while modality analysis indicates that leveraging Range-Doppler video as input achieves better pose estimation performance than Range-Azimuth or their fusion, with lower computational cost.

2604.28173 2026-06-04 cs.CV

Action Motifs: Self-Supervised Hierarchical Representation of Human Body Movements

动作基元:人体运动的自监督层次化表示

Genki Kinoshita, Shu Nakamura, Ryo Kawahara, Shohei Nobuhara, Yasutomo Kawanishi, Ko Nishino

发表机构 * Kyoto University(京都大学) Kyoto Institute of Technology(京都工艺纤维大学) RIKEN(理化学研究所)

AI总结 提出一种层次化表示方法,通过自监督学习从人体姿态数据中提取动作原子和动作基元,用于动作识别、运动预测和运动插值等任务。

Comments Accepted as Highlight at CVPR2026. Project page: https://vision.ist.i.kyoto-u.ac.jp/research/action-motifs/

AI中文摘要

有效的人类行为建模需要一种能够利用其组合性的人体运动表示。我们提出了一种层次化表示,包括捕获原子关节运动的动作原子和由它们的时间组合形成的动作基元,这些基元编码了在不同整体人类动作中发现的相似身体运动。我们推导出A4Mer,一种嵌套的潜在Transformer,以完全自监督的方式从人体姿态数据中学习这种层次化表示。A4Mer将3D姿态序列分割成可变长度的片段,并将每个片段表示为单个潜在令牌(动作原子)。通过自底向上的表示学习,由这些动作原子组成的时间模式自然出现(动作基元),这些模式捕获了可重复的、语义化的身体运动片段的有意义时间跨度。A4Mer通过在其各自的潜在空间中进行掩码令牌预测的统一预训练任务来实现这一点。我们还引入了动作基元数据集(AMD),这是一个大规模的多视角人类行为视频数据集,具有完整的SMPL注释。我们引入了一种新颖的相机使用方式,将其安装在脚上,以在频繁且严重的身体遮挡情况下实现逐帧注释。实验结果证明了A4Mer在提取有意义的动作基元方面的有效性,这些基元显著有益于人类行为建模任务,包括动作识别、运动预测和运动插值。

英文摘要

Effective human behavior modeling requires a representation of the human body movement that capitalizes on its compositionality. We propose a hierarchical representation consisting of Action Atoms that capture the atomic joint movements and Action Motifs that are formed by their temporal compositions and encode similar body movements found across different overall human actions. We derive A4Mer, a nested latent Transformer to learn this hierarchical representation from human pose data in a fully self-supervised manner. A4Mer splits a 3D pose sequence into variable-length segments and represents each segment as a single latent token (Action Atoms). Through bottom-up representation learning, temporal patterns composed of these Action Atoms, which capture meaningful temporal spans of reusable, semantic segments of body movements, naturally emerge (Action Motifs). A4Mer achieves this with a unified pretext task of masked token prediction in their respective latent spaces. We also introduce Action Motif Dataset (AMD), a large-scale dataset of multi-view human behavior videos with full SMPL annotations. We introduce a novel use of cameras by mounting them on the feet to achieve their frame-wise annotations despite frequent and heavy body occlusions. Experimental results demonstrate the effectiveness of A4Mer for extracting meaningful Action Motifs, which significantly benefit human behavior modeling tasks including action recognition, motion prediction, and motion interpolation.

2604.27007 2026-06-04 cs.AI

Binary Spiking Neural Networks as Causal Models

二元脉冲神经网络作为因果模型

Aditya Kar, Emiliano Lorini, Timothée Masquelier

发表机构 * Institut de Recherche en Informatique de Toulouse (IRIT)(图卢兹信息研究所(IRIT)) Centre de Recherche Cerveau et Cognition (CerCo)(脑与认知研究中心(CerCo)) CNRS(法国国家科学研究中心)

AI总结 将二元脉冲神经网络(BSNN)表示为二元因果模型,利用SAT和SMT求解器计算溯因解释,并保证解释中不包含无关特征。

Journal ref Logics for New-Generation AI 2025 Fifth International Workshop, Beishui Liao; Antonino Rotolo; Leendert van der Torre; Liuwen Yu, Dec 2025, Luxembourg City, Luxembourg. pp.51-68

AI中文摘要

我们对二元脉冲神经网络(BSNN)进行因果分析以解释其行为。我们正式定义了BSNN,并将其脉冲活动表示为二元因果模型。借助这种因果表示,我们能够利用基于逻辑的方法解释网络的输出。特别地,我们展示了可以成功使用SAT和SMT求解器从该二元因果模型中计算溯因解释。为了说明我们的方法,我们在标准MNIST数据集上训练了BSNN,并应用基于SAT和SMT的方法,基于像素级特征找到网络分类的溯因解释。我们还将找到的解释与可解释AI领域流行的方法SHAP进行了比较。我们表明,与SHAP不同,我们的方法保证找到的解释不包含完全无关的特征。

英文摘要

We provide a causal analysis of Binary Spiking Neural Networks (BSNNs) to explain their behavior. We formally define a BSNN and represent its spiking activity as a binary causal model. Thanks to this causal representation, we are able to explain the output of the network by leveraging logic-based methods. In particular, we show that we can successfully use a SAT solver as well as an SMT solver to compute abductive explanations from this binary causal model. To illustrate our approach, we trained the BSNN on the standard MNIST dataset and applied our SAT-based and SMT-based methods to find abductive explanations of the network's classifications based on pixel-level features. We also compared the found explanations against SHAP, a popular method used in the area of explainable AI. We show that, unlike SHAP, our approach guarantees that a found explanation does not contain completely irrelevant features.

2604.25860 2026-06-04 cs.CL cs.AI cs.CY

Luminol-AIDetect: Fast Zero-shot Machine-Generated Text Detection based on Perplexity under Text Shuffling

Luminol-AIDetect: 基于文本打乱下困惑度的快速零样本机器生成文本检测

Lucio La Cava, Andrea Tagarelli

发表机构 * DIMES Dept., University of Calabria(卡拉布里亚大学DIMES系)

AI总结 提出Luminol-AIDetect,一种通过随机打乱文本并利用困惑度变化来区分机器生成文本与人类写作的零样本统计方法,在多个领域和攻击下达到SOTA性能。

Comments Under Review

AI中文摘要

机器生成文本检测需要识别跨生成模型的结构不变信号,而非依赖模型特定指纹。为此,我们假设尽管大语言模型擅长局部语义一致性,但其自回归特性导致与人类写作相比存在特定结构脆弱性。我们提出Luminol-AIDetect,一种新颖的零样本统计方法,通过连贯性破坏暴露这种脆弱性。通过应用简单的随机文本打乱程序,我们证明困惑度的变化可作为原则性的、模型无关的判别依据,因为机器生成文本在打乱下的困惑度表现出特征性分散,与人类写作更稳定的结构变异性显著不同。Luminol-AIDetect利用这一区别指导决策过程,从输入文本及其打乱版本中提取少量基于困惑度的标量特征,然后通过密度估计和集成预测进行检测。在8个内容领域、11种对抗攻击类型和18种语言上的评估表明,Luminol-AIDetect实现了最先进的性能,FPR降低高达17倍,同时成本低于先前方法。

英文摘要

Machine-generated text (MGT) detection requires identifying structurally invariant signals across generation models, rather than relying on model-specific fingerprints. In this respect, we hypothesize that while large language models excel at local semantic consistency, their autoregressive nature results in a specific kind of structural fragility compared to human writing. We propose Luminol-AIDetect, a novel, zero-shot statistical approach that exposes this fragility through coherence disruption. By applying a simple randomized text-shuffling procedure, we demonstrate that the resulting shift in perplexity serves as a principled, model-agnostic discriminant, as MGT displays a characteristic dispersion in perplexity-under-shuffling that differs markedly from the more stable structural variability of human-written text. Luminol-AIDetect leverages this distinction to inform its decision process: a handful of perplexity-based scalar features are extracted from an input text and its shuffled version, and detection is then performed via density estimation and ensemble-based prediction. Evaluated across 8 content domains, 11 adversarial attack types, and 18 languages, Luminol-AIDetect demonstrates state-of-the-art performance, with up to 17x lower FPR while being cheaper than prior methods.
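
The perplexity-under-shuffling idea can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a toy add-one-smoothed bigram model for the language model, and the function names (`train_bigram`, `perplexity`, `shuffle_features`), the number of shuffles, and the returned feature names are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def train_bigram(corpus_tokens):
    """Collect unigram and bigram counts for a toy add-one-smoothed bigram model."""
    unigrams = Counter(corpus_tokens)
    bigrams = Counter(zip(corpus_tokens, corpus_tokens[1:]))
    vocab_size = len(unigrams) + 1  # one extra slot for unseen tokens
    return unigrams, bigrams, vocab_size


def perplexity(tokens, model):
    """Perplexity of a token sequence under the toy bigram model."""
    unigrams, bigrams, vocab_size = model
    log_prob = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
        log_prob += math.log(p)
    n = max(len(tokens) - 1, 1)
    return math.exp(-log_prob / n)


def shuffle_features(tokens, model, n_shuffles=20, seed=0):
    """Scalar features (hypothetical names): original perplexity, mean shift
    after random token shuffles, and dispersion across the shuffles."""
    rng = random.Random(seed)
    base = perplexity(tokens, model)
    shuffled = []
    for _ in range(n_shuffles):
        perm = tokens[:]
        rng.shuffle(perm)
        shuffled.append(perplexity(perm, model))
    mean = sum(shuffled) / len(shuffled)
    var = sum((x - mean) ** 2 for x in shuffled) / len(shuffled)
    return {"base": base, "mean_shift": mean - base, "dispersion": math.sqrt(var)}
```

In a real detector the toy model would be replaced by a pretrained language model, and the resulting features would feed the density-estimation and ensemble stage that the abstract describes; that stage is not sketched here.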

2604.25649 2026-06-04 cs.LG

Towards interpretable AI with quantum annealing feature selection

迈向可解释的人工智能:基于量子退火的特征选择

Francesco Aldo Venturelli, Emanuele Costa, Sikha O K, Bruno Juliá-Díaz, Miguel A. González Ballester, Alba Cervera-Lierta

发表机构 * BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain(BCN医疗科技,庞培法布拉大学,巴塞罗那,西班牙) Barcelona Supercomputing Center (BSC)(巴塞罗那超级计算中心(BSC)) Departament de Física Quàntica i Astrofísica, Facultat de Física, Universitat de Barcelona(巴塞罗那大学物理学院量子物理与天体物理系) Institut de Ciències del Cosmos, Universitat de Barcelona, ICCUB(巴塞罗那大学宇宙科学研究所,ICCUB) ICREA, Barcelona, Spain(ICREA,巴塞罗那,西班牙)

AI总结 提出一种利用量子退火选择最具代表性特征图的方法,以解释卷积神经网络的图像分类预测,相比GradCAM和GradCAM++提升了类别解缠和解释质量。

Comments Text improvement and extra tests in v2. 15 pages, 10 figures, 1 table, including appendices

AI中文摘要

深度学习模型被用于关键应用中,其中错误可能导致严重后果。因此,理解模型如何以及为何生成预测至关重要。这种理解提供了有用信息,用于检查模型是否学习到正确的模式、检测数据中的偏差、改进模型设计以及构建可信赖的系统。本文提出了一种新方法,用于解释图像分类任务中的卷积神经网络。该方法通过选择对每个预测贡献最大的最具代表性特征图来工作。为了解决这个组合问题,我们将其编码为量子约束优化问题,并提出使用量子退火求解。我们针对最先进的可解释AI技术(特别是GradCAM和GradCAM++)评估了我们的方法,并观察到类别解缠的改进,即模型的决策边界变得更加清晰,其推理更加透明。这表明我们的方法提高了解释质量,使得更容易理解模型依赖哪些特征进行特定预测。此外,我们研究了量子退火算法的计算行为。具体来说,我们分析了计算过程中系统的最小能隙以及算法找到正确解的概率。这些分析为该方法在实践中有效工作的原因提供了理论见解。

英文摘要

Deep learning models are used in critical applications, in which mistakes can have serious consequences. Therefore, it is crucial to understand how and why models generate predictions. This understanding provides useful information to check whether the model is learning the right patterns, detect biases in the data, improve model design, and build systems that can be trusted. This work proposes a new method for interpreting Convolutional Neural Networks in image classification tasks. The approach works by selecting the most representative feature maps that contribute to each prediction. To solve this combinatorial problem, we encode it into a quantum constrained optimization problem and propose to solve it using quantum annealing. We evaluate our method against the state-of-the-art explainable AI techniques, specifically GradCAM and GradCAM++, and observe an improved class disentanglement, i.e. the model's decision boundaries become more distinct and its reasoning more transparent. This demonstrates that our approach enhances the quality of explanations, making it easier to understand which features the model relies on for specific predictions. In addition, we study the computational behavior of the quantum annealing algorithm. Specifically, we analyze the minimum energy gap of the system during computation and the probability that the algorithm finds the correct solution. These analyses provide theoretical insight into why the method works effectively in practice.

2604.00860 2026-06-04 cs.LG

Policy Improvement Reinforcement Learning

策略改进强化学习

Huaiyang Wang, Xiaojie Li, Deqing Wang, Haoyi Zhou, Zixuan Huang, Yaodong Yang, Jianxin Li, Yikun Ban

发表机构 * Beihang University(北京航空航天大学) Peking University(北京大学)

AI总结 提出策略改进强化学习(PIRL)框架,通过最大化跨迭代的累积策略改进来替代替代奖励最大化,并基于此设计策略改进策略优化(PIPO)算法,实现闭环优化,在数学推理基准上提升稳定性和性能。

Comments Update author list

AI中文摘要

具有可验证奖励的强化学习(RLVR)已成为改进大型语言模型推理能力的核心后训练范式。然而,现有方法存在一个共同的盲点:它们基于瞬时组级或批次级统计量优化策略,而从未验证所得更新是否实际改进了模型。这种开环设计——在每一步孤立地更新,仅由组内(批次)奖励信号引导——意味着优化可能漂移或崩溃,且没有机制来检测和纠正这些失败。我们认为缺失的要素是策略改进反馈:直接测量和优化跨迭代进展的能力。为此,我们引入策略改进强化学习(PIRL),这是一个用最大化跨迭代累积策略改进的显式目标替代替代奖励最大化的框架,并证明该时间目标与最大化最终任务性能完美对齐。基于PIRL,我们提出策略改进策略优化(PIPO),通过回顾性验证实现闭环优化。在每次迭代中,PIPO评估先前更新是否相对于滑动窗口历史基线产生了真正改进,然后主动强化有益更新并抑制有害更新——将开环过程转变为自纠正过程。我们提供理论分析表明PIPO在期望上对PIRL目标进行上升,并且在数学推理基准上的实验表明,与GRPO及其变体相比,PIPO提高了稳定性和性能。

英文摘要

Reinforcement Learning with Verifiable Rewards (RLVR) has become a central post-training paradigm for improving the reasoning capabilities of large language models. Yet existing methods share a common blind spot: they optimize policies based on instantaneous group-level or batch-level statistics without ever verifying whether the resulting update actually improved the model. This open-loop design -- updating in isolation at each step, guided only by within-group (batch) reward signals -- means optimization can drift or collapse with no mechanism to detect and correct these failures. We argue that the missing ingredient is policy improvement feedback: the ability to measure and optimize inter-iteration progress directly. To this end, we introduce Policy Improvement Reinforcement Learning (PIRL), a framework that replaces surrogate reward maximization with the explicit objective of maximizing cumulative policy improvement across iterations, and prove this temporal objective is perfectly aligned with maximizing final task performance. Building on PIRL, we propose Policy Improvement Policy Optimization (PIPO), which implements closed-loop optimization through retrospective verification. At each iteration, PIPO evaluates whether the previous update yielded genuine improvement against a sliding-window historical baseline, then actively reinforces beneficial updates and suppresses the harmful ones -- transforming an open-loop process into a self-correcting one. We provide theoretical analysis showing that PIPO performs ascent on the PIRL objective in expectation, and experiments on mathematical reasoning benchmarks demonstrate improved stability and performance over GRPO and its variants.
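
The retrospective check against a sliding-window historical baseline can be sketched as follows. This is a minimal illustration under stated assumptions, not PIPO itself: the class `SlidingWindowVerifier`, the helper `update_direction`, the window size, and the use of a plain mean as the baseline are all hypothetical, and the abstract does not specify how the resulting signal enters the policy update.

```python
from collections import deque


class SlidingWindowVerifier:
    """Tracks recent evaluation scores and reports improvement over their mean."""

    def __init__(self, window=4):
        # Only the most recent `window` scores are kept as the baseline history.
        self.history = deque(maxlen=window)

    def improvement(self, score):
        """Return score minus the window mean (0.0 on the first call), then record it."""
        baseline = sum(self.history) / len(self.history) if self.history else score
        self.history.append(score)
        return score - baseline


def update_direction(improvement):
    """+1 to reinforce the previous update, -1 to suppress it, 0 if neutral."""
    if improvement > 0:
        return 1
    if improvement < 0:
        return -1
    return 0
```

A training loop could call `improvement` on each post-update evaluation score and scale or flip the next update according to `update_direction`, which is what turns an open-loop procedure into one with feedback.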

2604.25050 2026-06-04 cs.RO

DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors

DiscreteRTC:离散扩散策略是自然的异步执行器

Pengcheng Wang, Kaiwen Hong, Chensheng Peng, Katherine Driggs-Campbell, Masayoshi Tomizuka, Chenfeng Xu, Chen Tang

发表机构 * UC Berkeley(加州大学伯克利分校) UIUC(伊利诺伊大学香槟分校) UT Austin(德克萨斯大学奥斯汀分校) UCLA(加州大学洛杉矶分校)

AI总结 针对同步执行器在动态任务中的致命停顿问题,提出DiscreteRTC方法,利用离散扩散策略的原生修复能力实现异步执行,在动态模拟和真实操作任务中取得更高成功率。

AI中文摘要

与聊天机器人不同,物理AI必须在世界不断变化的同时行动。因此,无论推理速度有多快,同步执行器的块间停顿对于动态任务都是致命的。异步执行——边行动边思考——因此是一个结构性要求,而实时分块(RTC)通过将块转换重新定义为修复(冻结已承诺的动作并一致地生成剩余部分)使其可行。然而,基于流匹配策略的RTC在结构上并非最优:其修复来自推理时的修正而非基础策略,导致几乎没有预训练收益、需要特定微调、启发式指导以及增加延迟的额外计算。在这项工作中,我们观察到离散扩散策略通过迭代去掩码生成动作,是自然的异步执行器,一次性解决了所有限制:由于修复是其原生操作,因此无需微调,而提前停止进一步提供了自适应指导并降低了推理成本。我们提出了DiscreteRTC,用原生去掩码替代外部修正,并在动态模拟基准和真实世界动态操作任务上展示了其比连续RTC和其他基线更高的成功率。总之,DiscreteRTC实现更简单,无需额外代码即可启用异步修复;推理更快,仅需从头生成动作约0.7倍的计算量;执行更好,在真实世界曲棍球防守任务中,成功率比流匹配RTC高65%,比训练时流匹配RTC高30%。更多可视化见https://outsider86.github.io/DiscreteRTCSite/。

英文摘要

Unlike chatbots, physical AI must act while the world keeps evolving. Therefore, the inter-chunk pause of synchronous executors is fatal for dynamic tasks regardless of how fast the inference is. Asynchronous execution -- thinking while acting -- is therefore a structural requirement, and real-time chunking (RTC) makes it viable by recasting chunk transitions as inpainting: freezing committed actions and consistently generating the remainder. However, RTC with a flow-matching policy is structurally suboptimal: its inpainting comes from inference-time corrections rather than the base policy, yielding little pre-training benefit and requiring specific fine-tuning, heuristic guidance, and extra computation that inflates latency. In this work, we observe that discrete diffusion policies, which generate actions by iteratively unmasking, are natural asynchronous executors that resolve all of these limitations at once: they are fine-tuning free since inpainting is their native operation, while early stopping further provides adaptive guidance and reduces inference cost. We propose DiscreteRTC, which replaces external corrections with native unmasking, and show on dynamic simulated benchmarks and real-world dynamic manipulation tasks that it achieves higher success rates than continuous RTC and other baselines. In summary, DiscreteRTC is simpler to implement, with 0 lines of additional code to enable async inpainting; faster at inference, with only ~0.7x the computation of generating actions from scratch; and better at execution, with a 65% higher success rate on a real-world hockey defense task compared with flow-matching RTC, and 30% higher compared with training-time flow-matching RTC. More visualizations are available at https://outsider86.github.io/DiscreteRTCSite/.
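
Inpainting by iterative unmasking can be sketched in a few lines. This is a toy illustration, not the DiscreteRTC implementation: `inpaint_chunk`, `toy_predict`, the confidence-ranked unmasking schedule, and the integer action tokens are hypothetical, and a real policy would replace `toy_predict` with a trained discrete diffusion model.

```python
MASK = None  # placeholder for a not-yet-generated action token


def inpaint_chunk(committed, horizon, predict, steps=4):
    """Freeze the committed actions and fill the remaining slots by unmasking.

    `predict(chunk, masked)` must return {index: (token, confidence)} for every
    masked index. The committed prefix is never modified.
    """
    chunk = list(committed) + [MASK] * (horizon - len(committed))
    for step in range(steps):
        masked = [i for i, tok in enumerate(chunk) if tok is MASK]
        if not masked:
            break  # nothing left to generate
        proposals = predict(chunk, masked)
        # Unmask an equal share of the remaining slots per step (ceiling division).
        k = -(-len(masked) // (steps - step))
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            chunk[i] = proposals[i][0]
    return chunk


def toy_predict(chunk, masked):
    """Toy stand-in for a policy: continue counting from the nearest known
    token on the left, with confidence decaying with distance.
    Assumes at least one committed action precedes every masked slot."""
    out = {}
    for i in masked:
        j = max(k for k in range(i) if chunk[k] is not MASK)
        out[i] = (chunk[j] + (i - j), 1.0 / (i - j))
    return out
```

Because the committed actions are simply left unmasked, the same generation loop serves both from-scratch generation (an empty or short prefix) and chunk transitions, which is the sense in which inpainting is native to this family of policies.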

2603.01421 2026-06-04 cs.AI cs.CL

SciDER: Scientific Data-centric End-to-end Researcher

SciDER: 以科学数据为中心的端到端研究者

Ke Lin, Owais Aijaz, Yilin Lu, Yiyang Luo, Xuehang Guo, Preslav Nakov

发表机构 * GitHub

AI总结 提出SciDER多智能体系统,通过数据驱动方法和动态多模态技能系统,自动化科学研究的全生命周期,并在六个基准测试中取得领先结果。

Comments 10 pages, 8 figures, 7 tables

AI中文摘要

虽然大型语言模型加速了科学发现,但现有智能体在适应性、领域泛化和多模态可扩展性方面面临严重限制,通常难以自主处理原始的、特定领域的实验数据。为了克服这些障碍,我们引入了SciDER,一个旨在灵活自动化整个研究生命周期的多智能体系统。该框架采用新颖的数据中心方法,并在四个专门的子智能体之间集成动态多模态技能系统。具体来说,一个构思智能体通过进化思想搜索生成新颖假设,一个数据分析智能体系统化地结构化原始数据,一个实验智能体基于数据集特征合成可执行代码,一个批评智能体驱动迭代自我改进。为了民主化开源科学发现,我们发布了OpenSciDER-SFT-8K,一个高质量的执行轨迹数据集,以及OpenSciDER-27B微调模型。在六个基准测试中,SciDER和OpenSciDER取得了具有竞争力或领先的结果,在数据中心分析、端到端研究执行和多模态科学可视化方面尤其强劲。通过将数据分析与实验执行相结合,SciDER弥合了抽象科学推理与可重复实验合成之间的差距。

英文摘要

While large language models accelerate scientific discovery, existing agents face severe limitations in adaptability, domain generalization, and multimodal scalability, often struggling to autonomously process raw, domain-specific experimental data. To overcome these barriers, we introduce SciDER, a multi-agent system designed to flexibly automate the entire research lifecycle. This framework employs a novel data-centric approach and integrates a dynamic multimodal skill system across four specialized sub-agents. Specifically, an ideation agent generates novel hypotheses via Evolutionary Idea Search, a data analysis agent systematically structures raw data, an experimentation agent synthesizes executable code grounded in dataset characteristics, and a critic agent drives iterative self-refinement. To democratize open-source scientific discovery, we release OpenSciDER-SFT-8K, a high-quality execution trajectory dataset, alongside the OpenSciDER-27B fine-tuned model. Across six benchmarks, SciDER and OpenSciDER obtain competitive or leading results, with especially strong gains on data-centric analysis, end-to-end research execution, and multimodal scientific visualization. By integrating data analysis with experimental execution, SciDER bridges the gap between abstract scientific reasoning and reproducible experimentation synthesis.

2510.11194 2026-06-04 cs.AI

Aligning Deep Implicit Preferences by Learning to Reason Defensively

通过防御性推理对齐深度隐式偏好

Peiming Li, Zhiyuan Hu, Yang Tang, Shiyu Li, Xi Chen

发表机构 * Basic Algorithm Center, PCG, Tencent(腾讯PCG基础算法中心) School of Electronic and Computer Engineering, Peking University(北京大学电子与计算机工程学院)

AI总结 提出基于批判驱动推理对齐(CDRA)的方法,通过DeepPref基准和个性化生成过程奖励模型(Pers-GenPRM),将偏好对齐转化为结构化推理过程,以推断用户深层隐式偏好并实现防御性推理。

Journal ref ICLR 2026 Conference

AI中文摘要

个性化对齐对于使大型语言模型(LLMs)有效参与以用户为中心的交互至关重要。然而,当前方法面临双重挑战:它们无法推断用户的深度隐式偏好(包括未言明的目标、语义上下文和风险容忍度),并且缺乏在现实世界模糊性中进行防御性推理所需的能力。这种认知差距导致响应肤浅、脆弱且短视。为了解决这个问题,我们提出了批判驱动推理对齐(CDRA),它将对齐从标量奖励匹配任务重新构建为结构化推理过程。首先,为了弥合偏好推断差距,我们引入了DeepPref基准。该数据集包含20个主题的3000个偏好-查询对,通过模拟多面认知委员会生成带有批判注释的推理链,以解构查询语义并揭示潜在风险。其次,为了灌输防御性推理,我们引入了个性化生成过程奖励模型(Pers-GenPRM),它将奖励建模构建为个性化推理任务。它在输出基于此推理的最终分数之前,生成批判链以评估响应与用户偏好的一致性。最终,这种可解释的结构化奖励信号通过批判驱动策略对齐(一种结合数值和自然语言反馈的过程级在线强化学习算法)指导策略模型。实验表明,CDRA在执行稳健推理的同时,擅长发现并与用户的真实偏好对齐。我们的代码和数据集可在https://github.com/Zephyrian-Hugh/Deep-pref获取。

英文摘要

Personalized alignment is crucial for enabling Large Language Models (LLMs) to engage effectively in user-centric interactions. However, current methods face a dual challenge: they fail to infer users' deep implicit preferences (including unstated goals, semantic context and risk tolerances), and they lack the defensive reasoning required to navigate real-world ambiguity. This cognitive gap leads to responses that are superficial, brittle and short-sighted. To address this, we propose Critique-Driven Reasoning Alignment (CDRA), which reframes alignment from a scalar reward-matching task into a structured reasoning process. First, to bridge the preference inference gap, we introduce the DeepPref benchmark. This dataset, comprising 3000 preference-query pairs across 20 topics, is curated by simulating a multi-faceted cognitive council that produces critique-annotated reasoning chains to deconstruct query semantics and reveal latent risks. Second, to instill defensive reasoning, we introduce the Personalized Generative Process Reward Model (Pers-GenPRM), which frames reward modeling as a personalized reasoning task. It generates a critique chain to evaluate a response's alignment with user preferences before outputting a final score based on this rationale. Ultimately, this interpretable, structured reward signal guides the policy model through Critique-Driven Policy Alignment, a process-level online reinforcement learning algorithm integrating both numerical and natural language feedback. Experiments demonstrate that CDRA excels at discovering and aligning with users' true preferences while executing robust reasoning. Our code and dataset are available at https://github.com/Zephyrian-Hugh/Deep-pref.

2601.09853 2026-06-04 cs.CL cs.AI

MedRedFlag: Investigating how LLMs Redirect Misconceptions in Real-World Health Communication

MedRedFlag:探究LLMs如何在真实健康沟通中纠正误解

Sraavya Sambara, Yuan Pu, Ayman Ali, Vishala Mishra, Lionel Wong, Monica Agrawal

发表机构 * Independent Researcher(独立研究者) Duke University(杜克大学) Stanford University(斯坦福大学)

AI总结 本研究通过构建MedRedFlag数据集(1100+个来自Reddit的需纠正问题),系统比较了先进LLMs与临床医生的回应,发现LLMs常未能纠正问题中的错误前提,可能导致次优医疗决策,揭示了患者面向医疗AI系统的关键安全漏洞。

AI中文摘要

来自患者的真实健康问题往往无意中嵌入了错误的假设或前提。在这种情况下,安全的医疗沟通通常涉及纠正:先指出隐含的误解,然后回应用户的潜在背景,而非原始问题。尽管大型语言模型(LLMs)越来越多地被普通用户用于医疗建议,但它们尚未针对这一关键能力进行测试。因此,在本工作中,我们研究了LLMs如何应对真实健康问题中嵌入的错误前提。我们开发了一个半自动化流程来整理MedRedFlag,这是一个包含1100多个来自Reddit的、需要纠正的问题的数据集。然后,我们系统地比较了最先进的LLMs与临床医生的回应。我们的分析显示,LLMs往往未能纠正有问题的提问,即使检测到了有问题的前提,并且提供的答案可能导致次优的医疗决策。我们的基准测试和结果揭示了LLMs在真实健康沟通条件下表现的新且重大的差距,突显了面向患者的医疗AI系统的关键安全问题。代码和数据集可在https://github.com/srsambara-1/MedRedFlag获取。

英文摘要

Real-world health questions from patients often unintentionally embed false assumptions or premises. In such cases, safe medical communication typically involves redirection: addressing the implicit misconception and then responding to the underlying patient context, rather than the original question. While large language models (LLMs) are increasingly being used by lay users for medical advice, they have not yet been tested for this crucial competency. Therefore, in this work, we investigate how LLMs react to false premises embedded within real-world health questions. We develop a semi-automated pipeline to curate MedRedFlag, a dataset of 1100+ questions sourced from Reddit that require redirection. We then systematically compare responses from state-of-the-art LLMs to those from clinicians. Our analysis reveals that LLMs often fail to redirect problematic questions, even when the problematic premise is detected, and provide answers that could lead to suboptimal medical decision making. Our benchmark and results reveal a novel and substantial gap in how LLMs perform under the conditions of real-world health communication, highlighting critical safety concerns for patient-facing medical AI systems. Code and dataset are available at https://github.com/srsambara-1/MedRedFlag.

2511.20233 2026-06-04 cs.CL

REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control

REFLEX: 通过裁决锚定风格控制实现自我精炼的可解释事实核查

Chuyi Kong, Wei Gao, Jing Ma, Hongzhan Lin, Yuxi Sun

发表机构 * Hong Kong Baptist University(香港浸会大学) Singapore Management University(新加坡管理大学)

AI总结 提出REFLEX方法,利用骨干模型与其微调变体之间的自我分歧真实性信号构建引导向量,以裁决锚定的风格控制实现自我精炼的可解释事实核查,在LLaMA系列模型上仅用465个自我精炼样本即达到最先进性能。

Journal ref ACL 2026 Main Conference

AI中文摘要

社交媒体上假新闻的盛行要求自动化事实核查系统提供准确的裁决和忠实的解释。然而,现有基于大语言模型(LLM)的方法忽略了LLM生成解释中的欺骗性误导风格,导致不忠实的理由可能误导人类判断。它们严重依赖外部知识源,引入幻觉甚至高延迟,削弱了实时使用中至关重要的可靠性和响应性。为解决这些挑战,我们提出REason-guided Fact-checking with Latent EXplanations (REFLEX),一种自我精炼范式,显式控制以裁决锚定的推理风格。REFLEX利用骨干模型及其微调变体之间的自我分歧真实性信号构建引导向量,自然地将事实与风格分离。在真实世界数据集上的实验表明,REFLEX在LLaMA系列模型下仅用465个自我精炼样本即达到最先进性能。此外,由于其可迁移性,REFLEX在野外数据上获得了高达7.54%的提升。我们的结果进一步证明,该方法有效缓解了忠实幻觉,从而引导模型在可解释事实核查中比先前工作获得更准确的裁决。

英文摘要

The prevalence of fake news on social media demands automated fact-checking systems to provide accurate verdicts with faithful explanations. However, existing large language model (LLM)-based approaches ignore deceptive misinformation styles in LLM-generated explanations, resulting in unfaithful rationales that can mislead human judgments. They rely heavily on external knowledge sources, introducing hallucinations and even high latency that undermine reliability and responsiveness, which is crucial for real-time use. To address these challenges, we propose REason-guided Fact-checking with Latent EXplanations (REFLEX), a self-refining paradigm that explicitly controls reasoning style anchored on verdict. REFLEX utilizes self-disagreement veracity signals between the backbone model and its fine-tuned variant to construct steering vectors, naturally disentangling fact from style. Experiments on the real-world dataset show REFLEX achieves state-of-the-art performance under LLaMA-series models with only 465 self-refined samples. Moreover, owing to its transferability, REFLEX yields up to a 7.54% gain on in-the-wild data. Our results further demonstrate that our method effectively mitigates faithful hallucination, thereby guiding the model toward more accurate verdicts than previous works in explainable fact-checking.

2604.17709 2026-06-04 cs.CL cs.DC

DeInfer: Efficient Parallel Inferencing for Decomposed Large Language Models

DeInfer:分解式大语言模型的高效并行推理

You-Liang Huang, Xinhao Huang, Chengxi Liao, Zeyi Wen

发表机构 * Boston University(波士顿大学)

AI总结 针对分解式大语言模型并行推理性能差的问题,提出DeInfer系统,通过多项优化实现高性能并行推理,实验证明其优越性。

Comments Accepted by DAC'26; latest version fixes a minor mistake

AI中文摘要

现有关于大语言模型(LLM)分解的工作主要关注提升下游任务性能,但在尝试扩展模型规模时忽略了并行推理性能差的问题。为缓解这一重要性能问题,本文介绍了DeInfer,一个专用于分解式LLM并行推理的高性能推理系统。它包含多项优化以最大化性能,并与最先进的优化技术兼容。通过大量实验评估DeInfer的性能,结果证明了其优越性,表明它能极大地促进分解式LLM的并行推理。

英文摘要

Existing works on large language model (LLM) decomposition mainly focus on improving performance on downstream tasks, but they ignore the poor parallel inference performance when trying to scale up the model size. To mitigate this important performance issue, this paper introduces DeInfer, a high-performance inference system dedicated to parallel inference of decomposed LLMs. It consists of multiple optimizations to maximize performance and be compatible with state-of-the-art optimization techniques. Extensive experiments are carried out to evaluate DeInfer's performance, where the results demonstrate its superiority, suggesting it can greatly facilitate the parallel inference of decomposed LLMs.