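arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.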

2605.11463 2026-05-13 cs.CV

Encore: Conditioning Trajectory Forecasting via Biased Ego Rehearsals

Conghao Wong, Ziqian Zou, Xinge You

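AI summary: This paper studies how to learn and represent agents' subjectivity in trajectory forecasting, a challenging but crucial problem. The authors propose Encore, which introduces a biased ego-rehearsal mechanism: from short-term observations, the model generates biased rehearsal trajectories for all participants in the scene and uses them as conditions to guide the final forecast, thereby modeling the subjective behavior of different agents more accurately. Experiments show performance gains on multiple datasets, and the method offers a clear interpretation of subjectivity in trajectories.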

2605.11462 2026-05-13 cs.CV cs.AI

SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images

Zishan Liu, Ruoxi Zang, Yanglin Zhang, Wei Liu, Yin Zhang, Jian Yao, Jiayin Zheng, Zhengzhe Liu

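AI summary: This work proposes SpatialForge, a scalable data synthesis method that generates supervision for 3D spatial reasoning from open-world 2D images, addressing the weak spatial reasoning of current large vision-language models. By decomposing spatial reasoning into perception and relation components and constructing structured supervision covering depth, layout, and viewpoint-dependent reasoning, the method automatically generates high-quality spatial question-answer data. On this basis the authors build SpatialForge-10M, a large dataset of 10 million spatial QA pairs, and validate it on multiple spatial reasoning benchmarks, where it significantly improves the spatial reasoning of vision-language models.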

2605.11460 2026-05-13 cs.LG cs.SY eess.SY

Beyond Prediction: Interval Neural Networks for Uncertainty-Aware System Identification

Mehmet Ali Ferah, Tufan Kumbasar

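AI summary: This paper proposes an interval neural network (INN) framework for uncertainty-aware system identification, addressing the inability of conventional methods to capture uncertainty when modeling nonlinear dynamical systems. By extending conventional neural networks to interval form, the authors develop interval LSTM and NODE models that propagate uncertainty, and propose two training strategies, cascaded INN (C-INN) and joint INN (J-INN), which optimize prediction accuracy and interval accuracy at different stages. Experiments show strong results on multiple system identification datasets, and the paper introduces the concept of channel elasticity to analyze how uncertainty is distributed across model parameters.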

Comments Under review
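
As a rough illustration of what extending a network to interval form can mean, the sketch below propagates an input interval through one affine layer followed by a ReLU using standard interval arithmetic. It is a generic interval-bound computation, not the authors' INN, interval LSTM, or NODE models; the function names and the toy weights are hypothetical.

```python
def interval_affine(lo, hi, weights, bias):
    """Propagate elementwise bounds [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        low = high = b
        for w, l, h in zip(row, lo, hi):
            if w >= 0:
                # A non-negative weight keeps the ordering of the endpoints.
                low += w * l
                high += w * h
            else:
                # A negative weight swaps which endpoint gives the minimum.
                low += w * h
                high += w * l
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def interval_relu(lo, hi):
    """ReLU is monotone, so it can be applied to each endpoint separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


# Toy layer (hypothetical values): two inputs, two outputs.
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, -1.0]

# Input uncertainty: x0 in [0, 1], x1 in [1, 2].
lo, hi = interval_affine([0.0, 1.0], [1.0, 2.0], W, b)
lo, hi = interval_relu(lo, hi)
print(lo, hi)
```

Every point inside the input box maps to an output inside the printed bounds, which is the basic property an interval model relies on when it reports an uncertainty band.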

2605.11448 2026-05-13 cs.LG cs.AI

Deep Minds and Shallow Probes

Su Hyeong Lee, Risi Kondor

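AI summary: This paper studies the symmetry of hidden coordinates in neural representations across different realizations, and argues that structure in representations should be revealed with symmetry-stable shallow probes rather than by relying on a particular basis. By analyzing an exact model of the final output layer, the authors identify a unique hierarchy of shallow probes, in which linear probes are the first-level members. The study also shows that cross-model probe transfer should be based on the probe-visible quotient space of the representation rather than the full hidden state, and experiments validate the approach on synthetic and real tasks.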

2605.11439 2026-05-13 cs.CV cs.LG

Instruct-ICL: Instruction-Guided In-Context Learning for Post-Disaster Damage Assessment

Armin Zarbaft, Ehsan Karimi, Nhut Le, Maryam Rahnemoonfar

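AI summary: This paper studies how structured reasoning strategies can improve the reliability of pretrained multimodal large language models (MLLMs) on post-disaster visual question answering. It proposes Instruct-ICL, in which one MLLM generates task-specific instructions that serve as chain-of-thought (CoT) guidance for a second MLLM that produces the answer, combined with varying degrees of in-context learning (ICL). Experiments show that the method significantly improves answer accuracy on the FloodNet dataset, offering a more reliable approach to rapid post-disaster assessment.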

Comments Accepted by the 2026 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2026)

2605.11438 2026-05-13 cs.CV

Beyond Masks: The Case for Medical Image Parsing

Siddharth Gupta, Alan L. Yuille, Zongwei Zhou

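AI summary: This paper proposes medical image parsing as the core output of medical imaging research, arguing that the field should move beyond conventional pixel-level segmentation masks to structured representations containing entities, attributes, and relations that describe image content more completely. The authors note that current systems perform reasonably well at entity recognition but remain severely lacking in attribute description, inter-entity relations, and semantic closure. They argue for improving output formats and training signals so that models shift from measurement to interpretation, closer to actual clinical needs.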

2605.11436 2026-05-13 cs.CL cs.AI

Agent-BRACE: Decoupling Beliefs from Actions in Long-Horizon Tasks via Verbalized State Uncertainty

Joykirat Singh, Zaid Khan, Archiki Prasad, Justin Chih-Yao Chen, Akshay Nambi, Hyunji Lee, Elias Stengel-Eskin, Mohit Bansal

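AI summary: This paper proposes Agent-BRACE, which addresses uncertainty management and context bloat when large language models act in long-horizon, partially observable environments. The method decouples the belief state from the policy and builds a structured belief representation from confidence labels expressed in natural language, helping the model handle uncertainty more effectively when making decisions. Experiments show that Agent-BRACE significantly improves performance on multiple long-horizon tasks while remaining robust to context length.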

Comments Code: https://github.com/joykirat18/Agent-BRACE

2605.11435 2026-05-13 cs.CV

ZeroIDIR: Zero-Reference Illumination Degradation Image Restoration with Perturbed Consistency Diffusion Models

Hai Jiang, Zhen Liu, Yinjie Lei, Songchen Han, Bing Zeng, Shuaicheng Liu

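AI summary: This paper proposes ZeroIDIR, a diffusion-based zero-reference restoration framework for images degraded by illumination. The method trains only on low-quality degraded images. It decouples illumination correction from diffusion reconstruction and introduces an adaptive gamma correction module and a histogram-guided illumination correction loss, improving illumination consistency and providing a reliable input to the subsequent diffusion process. It also proposes a perturbed consistency diffusion loss to improve detail recovery and stability. Experiments show that the method outperforms existing unsupervised methods on multiple public datasets and generalizes well across scenes.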

Comments Accepted by CVPR 2026

2605.11430 2026-05-13 cs.CV cs.AI cs.LG

Diabetic Retinopathy Classification using Downscaling Algorithms and Deep Learning

Nishi Doshi, Urvi Oza, Pankaj Kumar

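AI summary: This study addresses inconsistent image sizes in diabetic retinopathy (DR) classification by preprocessing retinal images with several downscaling algorithms before they are fed to a deep network. It combines the Kaggle and Indian Diabetic Retinopathy Image datasets and runs classification experiments with a modified multi-channel Inception V3 architecture. The reported accuracy, specificity, and sensitivity exceed those of existing methods, offering a more effective approach to automatic DR grading.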

2605.11428 2026-05-13 cs.LG

FastUMAP: Scalable Dimensionality Reduction via Bipartite Landmark Sampling

Hongmin Li

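AI summary: This paper proposes FastUMAP, a scalable dimensionality reduction method that targets the low computational efficiency of nonlinear dimensionality reduction in repeated-use settings. The method is based on bipartite landmark sampling: it builds a sparse point-to-landmark fuzzy graph, performs spectral initialization with the Nyström method, and then optimizes a UMAP-style objective on the bipartite graph, substantially improving speed while retaining a reasonable level of accuracy. Experiments show faster runtimes than conventional methods on multiple datasets, making FastUMAP suited to scenarios that require frequent exploratory dimensionality reduction.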

Comments 17 pages, 5 figures

2605.11427 2026-05-13 cs.CV

PD-4DGS: Progressive Decomposition of 4D Gaussian Splatting for Bandwidth-Adaptive Dynamic Scene Streaming

Jiachen Li, Guangzhi Han, Jin Wan, Delong Han, Yuan Gao, Min Li, Mingle Zhou, Gang Li

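AI summary: PD-4DGS is a progressive 4D Gaussian Splatting compression framework for dynamic scene streaming, addressing the high rendering latency of existing 4DGS models on bandwidth-constrained devices and their inability to support adaptive-bitrate delivery. Through hierarchical deformation decomposition (HDD), it decomposes the motion structure of 4DGS into three independently transmittable layers so that any streamed prefix is already renderable, enabling scalable streaming. Experiments show that PD-4DGS significantly reduces transmission bandwidth and first-frame latency while preserving rendering quality, offering a feasible path to real-time 4DGS streaming on mobile devices.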

2605.11426 2026-05-13 cs.AI

A Mechanistic Investigation of Supervised Fine Tuning

Ruhaan Chopra

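AI summary: This study examines how supervised fine-tuning (SFT) affects the activations of large language models. It finds that although the cosine similarity of hidden-layer activations before and after fine-tuning is high, the sparse latent representations obtained by projecting through a pretrained sparse autoencoder (SAE) differ significantly. The study proposes an SAE-based analysis method that reveals task- and layer-specific changes in semantic features during fine-tuning and identifies a layer-wise update pattern related to safety alignment. The method provides a high-resolution diagnostic tool for understanding the mechanisms of SFT.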

2605.11424 2026-05-13 cs.CV

VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors

Jimin Tang, Wenyuan Zhang, Junsheng Zhou, Zian Huang, Kanle Shi, Shenkun Xu, Yu-Shen Liu, Zhizhong Han

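AI summary: VidSplat is a generative reconstruction framework based on Gaussian Splatting that addresses missing regions and occlusions in multi-view surface reconstruction from sparse views. It uses video diffusion priors to iteratively generate novel views that fill in regions insufficiently covered by the input, enabling reconstruction of the complete 3D scene. At its core are a training-free staged denoising strategy and an iterative optimization mechanism, which improve the geometric consistency and completeness of the reconstruction.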

Comments Accepted by SIGGRAPH Conference 2026. Project Page: https://tangjm24.github.io/VidSplat

2605.11418 2026-05-13 cs.AI cs.CR

Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry

Shoumik Saha, Kazem Faghih, Soheil Feizi

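AI summary: This paper studies natural-language semantic supply-chain attacks on AI agent skill registries, showing how SKILL.md files can be maliciously exploited during skill discovery, selection, and governance. Experiments demonstrate that attackers can use carefully crafted textual triggers to raise the visibility of malicious skills, steer agents toward functionally similar adversarial variants, and effectively evade security review. The study argues that SKILL.md is not just documentation but operative text that shapes agent behavior, exposing a major security risk in current mechanisms for extending agent capabilities.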

Comments 31 pages, 21 figures, 10 tables

2605.11414 2026-05-13 cs.LG cs.AI

Generative Diffusion Prior Distillation for Long-Context Knowledge Transfer

Nilushika Udayangani, Kishor Nandakishor, Marimuthu Palaniswami

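AI summary: This paper studies how to transfer knowledge from a full-sequence classifier to a classifier that sees only partial sequences in time-series classification. To counter the loss of generalization caused by the absence of discriminative features in partial data, the authors propose Generative Diffusion Prior Distillation (GDPD), a knowledge distillation framework that treats short-context student features as degraded observations of full-context teacher features. It uses the iterative restoration capability of diffusion models to learn a generative prior over teacher features and guides the student features toward long-context knowledge, improving partial-sequence classification. Experiments show that GDPD transfers knowledge from full to partial sequences effectively across datasets and architectures.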

Comments Published as a conference paper at ICLR 2026 (Brazil, Rio de Janeiro)

Journal ref: The Fourteenth International Conference on Learning Representations 2026

Abstract

While traditional time-series classifiers assume full sequences at inference, practical constraints (latency and cost) often limit inputs to partial prefixes. The absence of class-discriminative patterns in partial data can significantly hinder a classifier's ability to generalize. This work uses knowledge distillation (KD) to equip partial time series classifiers with the generalization ability of their full-sequence counterparts. In KD, high-capacity teacher transfers supervision to aid student learning on the target task. Matching with teacher features has shown promise in closing the generalization gap due to limited parameter capacity. However, when the generalization gap arises from training-data differences (full versus partial), the teacher's full-context features can be an overwhelming target signal for the student's short-context features. To provide progressive, diverse, and collective teacher supervision, we propose Generative Diffusion Prior Distillation (GDPD), a novel KD framework that treats short-context student features as degraded observations of the target full-context features. Inspired by the iterative restoration capability of diffusion models, we learn a diffusion-based generative prior over teacher features. Leveraging this prior, we posterior-sample target teacher representations that could best explain the missing long-range information in the student features and optimize the student features to be minimally degraded relative to these targets. GDPD provides each student feature with a distribution of task-relevant long-context knowledge, which benefits learning on the partial classification task. Extensive experiments across earliness settings, datasets, and architectures demonstrate GDPD's effectiveness for full-to-partial distillation.


2605.11408 2026-05-13 cs.LG cs.AI cs.CL

MaskTab: Scalable Masked Tabular Pretraining with Scaling Laws and Distillation for Industrial Classification

Bo Zheng, Yudong Chen, Zihua Xiong, Shuai Fang, Peidong He, Yang Yang, Sheng Guo

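AI summary: MaskTab is a unified pretraining framework designed for industrial-scale tabular data, targeting its high dimensionality, frequent missing values, and scarce labels. The method introduces a learnable missing-value token and a hybrid supervised pretraining strategy, combined with a multi-expert-enhanced loss function, improving performance on large-scale industrial data. Experiments show that MaskTab significantly outperforms existing methods on multiple industrial benchmarks and can be distilled efficiently into lightweight models that retain strong performance under strict latency and interpretability constraints.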

2605.11406 2026-05-13 cs.LG

A Boundary-Aware Non-parametric Granular-Ball Classifier Based on Minimum Description Length

Zeqiang Xian, Caihui Liu, Yong Zhang, Wenjing Qiu, Duoqian Miao, Witold Pedrycz

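AI summary: This paper proposes MDL-GBC, a boundary-aware non-parametric granular-ball classifier based on the minimum description length principle, addressing the reliance of existing granular-ball classifiers on hand-designed quality measures and heuristic rules. The method casts class-conditional granular-ball construction as a local model selection problem: by comparing the description lengths of a single-ball model, a two-ball model, and a core-boundary model, it decides whether to keep, split, or refine each ball, so that boundary-sensitive regions are modeled explicitly and consistently with the classification mechanism. Experiments show that MDL-GBC achieves competitive classification performance on multiple benchmark datasets with good interpretability.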

Comments 13 pages, 2 figures

2605.11404 2026-05-13 cs.AI

Attributing Emergence in Million-Agent Systems

Ling Tang, Jilin Mei, Qian Chen, Qihan Ren, Linfeng Zhang, Quanshi Zhang, Jing Shao, Xia Hu, Dongrui Liu

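AI summary: This study examines how to attribute macro-level emergent phenomena to individual agents in million-agent systems. Existing methods are limited by computational complexity to small systems, whereas real social phenomena often occur at the scale of millions of agents. The study extends Aumann-Shapley path-integral attribution to the million-agent scale, achieving efficient attribution that satisfies all four axioms. Its empirical analysis reveals structural differences between attributions computed on small samples and on the full population, establishing that full-population attribution is theoretically necessary for nonlinear macro indicators.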
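
For readers unfamiliar with path-integral attribution, the sketch below approximates Aumann-Shapley values for a tiny three-agent toy metric by integrating numerical gradients along the straight line from a baseline to the observed state. It is a textbook-style illustration only: the function names, the toy `macro_metric`, and the zero baseline are assumptions, and this brute-force loop is not the paper's method for reaching million-agent scale.

```python
def aumann_shapley(f, x, baseline, steps=100, eps=1e-5):
    """Approximate path-integral attributions along the line baseline -> x."""
    n = len(x)
    attributions = [0.0] * n
    for k in range(steps):
        t = (k + 0.5) / steps  # midpoint rule on [0, 1]
        point = [b + t * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            # Central finite difference for the partial derivative w.r.t. agent i.
            up = point[:]
            up[i] += eps
            down = point[:]
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            attributions[i] += grad_i * (x[i] - baseline[i]) / steps
    return attributions


def macro_metric(actions):
    """Toy nonlinear macro indicator (hypothetical): squared total activity."""
    return sum(actions) ** 2


x = [1.0, 2.0, 3.0]
phi = aumann_shapley(macro_metric, x, [0.0, 0.0, 0.0])
print(phi, sum(phi))
```

For this toy metric the attributions sum to the change in the metric between baseline and observed state, which is the efficiency (completeness) property usually listed among the attribution axioms.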

2605.11403 2026-05-13 cs.LG cs.AI cs.CL

fg-expo: Frontier-guided exploration-prioritized policy optimization via adaptive kl and gaussian curriculum

Mingxiong Lin, Zhangquan Gong, Maowen Tang, Qian Li, Chuangchuang Wang, Jian Ma, Sutian Huang, Kai Tang, Haonan Lu

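AI summary: This study targets two efficiency problems in Group Relative Policy Optimization (GRPO), the mainstream algorithm for reinforcement learning with verifiable rewards (RLVR), and proposes FG-ExPO. The method introduces two lightweight components, accuracy-conditioned KL scaling (AKL) and Gaussian curriculum sampling (GCS), which respectively adjust the strength of the constraint on policy exploration dynamically and reshape the sampling distribution over training problems, improving training efficiency on mathematical reasoning tasks. Experiments show that FG-ExPO significantly outperforms vanilla GRPO on multiple mainstream benchmarks, with particularly strong gains on tasks such as AIME 2025.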

2605.11402 2026-05-13 cs.LG cs.CR cs.NI

More Than Meets the Eye: A Semantics-Aware Traffic Augmentation Framework for Generalizable Website Fingerprinting

Youquan Xian, Xueying Zeng, Lingjia Meng, Lei Cui, Runhan Song, Wei Wang, Zhengquan Ding, Peng Liu, Zhiyu Hao

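AI summary: This paper proposes SATA, a semantics-aware traffic augmentation framework that addresses the poor generalization of deep-learning-based website fingerprinting in real-world environments. The method performs application-layer semantic augmentation using protocol rules, expanding the resource composition patterns and frame sequence patterns in traffic, and introduces a cross-layer feature alignment mechanism that aligns the augmented semantic information with observable traffic features. Experiments show that SATA can generate traffic patterns that are absent from the training set but do occur in the test set, significantly improving mainstream models across a variety of complex scenarios. In the open-world setting in particular, accuracy and AUROC improve by 90.81% and 48.37%, respectively.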

Comments 18 pages, 19 figures, Submitted to NDSS 2027

2605.11398 2026-05-13 cs.AI cs.CL

AcuityBench: Evaluating Clinical Acuity Identification and Uncertainty Alignment

Robin Linzmayer, Georgianna Lin, Di Coneybeare, Jason Chu, Trudi Cloyd, Manish Garg, Miles Gordon, Elizabeth Hartofilis, Benjamin Hong, Ashraf Hussain, Eugene Y. Kim, Oluchi Iheagwara King, Ross McCormack, Erica Olsen, John K. Riggins, Mustafa N. Rasheed, Dana L. Sacco, Vinay Saggar, Osman R. Sayan, Amit Shembekar, Janice Shin-Kim, Wendy W. Sun, Bernard P. Chang, David Kessler, Noémie Elhadad

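AI summary: This paper introduces AcuityBench, a benchmark for evaluating whether language models correctly identify the urgency of care from users' medical descriptions. It integrates five public datasets covering user conversations, forum posts, clinical vignettes, and patient portal messages, all evaluated under a unified four-level acuity framework. The study finds that models differ markedly in their performance on clear and ambiguous cases, and that the choice of task format affects the type of misjudgment, underscoring clinical acuity identification as a key safety capability.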

Comments 41 pages, 5 figures. Preprint under review for the Track on Evaluations and Datasets at NeurIPS 2026

Abstract

We introduce AcuityBench, a benchmark for evaluating whether language models identify the appropriate urgency of care from user medical presentations. Existing health benchmarks emphasize medical question answering, broad health interactions, or narrow workflow-specific triage tasks, but they do not offer a unified evaluation of acuity identification across these settings. AcuityBench addresses this gap by harmonizing five public datasets spanning user conversations, online forum posts, clinical vignettes, and patient portal messages under a shared four-level acuity framework ranging from home monitoring to immediate emergency care. The benchmark contains 914 cases, including 697 consensus cases for standard accuracy evaluation and 217 physician-confirmed ambiguous cases for uncertainty-aware evaluation. It supports two complementary task formats: explicit four-way classification in a QA setting, and free-form conversational responses evaluated with a rubric-based judge anchored to the same framework. Across 12 frontier proprietary and open-weight models, we find substantial variation in clear-case acuity accuracy and error direction. Comparing task formats reveals a systematic tradeoff: conversational responses reduce over-triage but increase under-triage relative to QA, especially in higher-acuity cases. In ambiguous cases, no model closely matches the distribution of physician judgments, and model predictions are more concentrated than expert clinical uncertainty. We also compare expert and model adjudication on a subset of maximally ambiguous cases, using those cases to examine the role of clinical uncertainty in label disagreement. Together, these results position acuity identification as a distinct safety-critical capability and show that AcuityBench enables systematic comparison and stress-testing of how well models guide users to the right level of care in real-world health use.
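
To make the over-triage and under-triage terminology concrete, the sketch below counts both error directions on an ordinal scale. The integer coding (0 for home monitoring up to 3 for immediate emergency care), the function name, and the sample labels are illustrative assumptions; they are not AcuityBench data or its scoring code.

```python
def triage_error_rates(predicted, reference):
    """Return (over_triage_rate, under_triage_rate) for ordinal acuity levels.

    Over-triage: the predicted level is more urgent than the reference.
    Under-triage: the predicted level is less urgent than the reference.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("predicted and reference must be non-empty and equal length")
    over = sum(p > r for p, r in zip(predicted, reference))
    under = sum(p < r for p, r in zip(predicted, reference))
    n = len(reference)
    return over / n, under / n


# Hypothetical labels on a four-level scale (0 = least urgent, 3 = most urgent).
reference = [0, 1, 2, 3, 3, 1, 0, 2]
predicted = [1, 1, 2, 2, 3, 3, 0, 2]

over_rate, under_rate = triage_error_rates(predicted, reference)
print(over_rate, under_rate)
```

Reporting the two rates separately matters because the errors are not symmetric: under-triage on a high-acuity case delays care, while over-triage mainly adds cost.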


2605.11396 2026-05-13 cs.LG

MuonQ: Enhancing Low-Bit Muon Quantization via Directional Fidelity Optimization

Yupeng Su, Ruijie Zhang, Ziyue Liu, Yequan Zhao, Zheng Zhang

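AI summary: This paper proposes MuonQ, a low-bit training framework for the Muon optimizer based on directional fidelity optimization, addressing Muon's sensitivity to error under quantized training. Through pre-quantization normalization, structural decomposition, and μ-law companding quantization, MuonQ suppresses the accumulation of quantization error and directional deviation, enabling stable and efficient 4-bit quantized training. Experiments show that MuonQ keeps training loss and downstream accuracy close to full-precision Muon while reducing optimizer-state memory by 7.3x.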

Comments MuonQ enables stable 4-bit quantization of Muon's optimizer states by preserving directional fidelity through pre-quantization normalization, structural decomposition, and companding quantization
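
The general idea of companding quantization can be sketched with the classic μ-law curve: values are compressed so that more of the 16 available 4-bit codes fall near zero, then expanded again on decode. This is a generic illustration, not MuonQ's actual scheme; the max-abs scaling, `mu=255`, the uniform rounding, and the function names are all assumptions.

```python
import math


def mu_law_quantize(values, bits=4, mu=255.0):
    """Scale to [-1, 1], apply mu-law compression, round to 2**bits codes."""
    scale = max(abs(v) for v in values) or 1.0  # assumed max-abs normalization
    levels = 2 ** bits - 1
    codes = []
    for v in values:
        x = v / scale
        y = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
        codes.append(round((y + 1.0) / 2.0 * levels))
    return codes, scale


def mu_law_dequantize(codes, scale, bits=4, mu=255.0):
    """Invert the code mapping and the mu-law curve, then undo the scaling."""
    levels = 2 ** bits - 1
    restored = []
    for c in codes:
        y = c / levels * 2.0 - 1.0
        x = math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)
        restored.append(x * scale)
    return restored


# Hypothetical optimizer-state values spanning several orders of magnitude.
values = [-0.8, -0.01, 0.0, 0.002, 0.05, 0.8]
codes, scale = mu_law_quantize(values)
restored = mu_law_dequantize(codes, scale)
print(codes)
print(restored)
```

With an even number of codes, zero is not exactly representable in this toy version, so small inputs decode to small nonzero values; a production scheme would need to decide how to handle that.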

2605.11392 2026-05-13 cs.AI

Transformer Interpretability from Perspective of Attention and Gradient

Yongjin Cui, Xiaohui Fan, Huajun Chen

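AI summary: This paper studies Transformer interpretability from the perspective of attention and gradients, proposing a method that guides the gradient direction (that is, the attention direction) to explain feature regions more comprehensively and at finer granularity. The method helps clarify how Transformers work, reveals differences between Vision Transformer (ViT) and human image perception, and demonstrates nearly imperceptible tampering with image classes, which may pose security risks in certain scenarios.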

2605.11388 2026-05-13 cs.CL cs.AI

Deep Reasoning in General Purpose Agents via Structured Meta-Cognition

Dean Light, Michael Theologitis, Kshitish Ghate, Shuyue Stella Li, Benjamin Newman, Chirag Shah, Aylin Caliskan, Pang Wei Koh, Dan Suciu, Yulia Tsvetkov

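AI summary: This study proposes Deep Reasoning, an approach that aims to make general-purpose agents more flexible and adaptive in reasoning tasks. Through structured meta-reasoning, it dynamically constructs task-specific reasoning scaffolds at inference time, handling complex problems more effectively. Experiments show that DOLORES, a general-purpose agent built on this approach, significantly outperforms existing methods on several hard benchmarks, demonstrating advantages in structured reasoning and task adaptivity.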

Comments Preprint under review

Abstract

Humans intuitively solve complex problems by flexibly shifting among reasoning modes: they plan, execute, revise intermediate goals, resolve ambiguity through associative judgment, and apply formal procedures to well-specified subproblems. Current LLM agents lack this flexibility, as their scaffolds hard-code such reasoning decisions in advance. These scaffolds are effective when their prescribed structure matches the task, but brittle when solving the task requires adapting the structure of reasoning itself. We introduce Deep Reasoning -- an inference-time approach for constructing task-specific scaffolds through structured meta-reasoning. Deep Reasoning uses a formal language that represents meta-reasoning as executable decompositions over associative inference, formal computation, and recursive subproblem solving, enabling decomposition principles to be encoded as in-context examples that guide test-time scaffold construction. We instantiate this approach in a general-purpose agent (DOLORES) that distributes complex tasks across more controlled reasoning threads. We evaluate it against state-of-the-art scaffolding methods across four hard benchmarks: multi-hop reasoning, long-chain question answering, long-context aggregation, and deep research-style information seeking. DOLORES outperforms all evaluated scaffolds across three model sizes and two model families, improving over the strongest evaluated scaffold baseline by 24.8% on average. DOLORES distributes cognition across structured, lower-load reasoning threads, thereby reducing premature termination and hallucinations. This advantage can even bridge the scaling gap, with an 8B version surpassing all evaluated 32B baselines from the same family in more than half the settings. These results point toward future agentic systems that treat scaffolding as adaptive reasoning, constructing the structure each task requires just-in-time.


2605.11387 2026-05-13 cs.LG cs.RO

Behavioral Mode Discovery for Fine-tuning Multimodal Generative Policies

Alberta Longhini, David Emukpere, Jean-Michel Renders, Seungsu Kim

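AI summary: This paper studies how to fine-tune pretrained generative policies with reinforcement learning while preserving the multimodality of their action distributions. Because existing methods tend to collapse behavior to a single mode as task performance improves, the authors propose an unsupervised behavioral mode discovery framework that uncovers latent behavioral modes in the policy and uses mutual information as an intrinsic reward, preserving behavioral diversity while raising task success. Experiments show that the method outperforms conventional fine-tuning on robotic manipulation tasks, achieving higher success rates and retaining a richer multimodal action distribution.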

2605.11386 2026-05-13 cs.AI

Revisiting Privacy Preservation in Brain-Computer Interfaces: Conceptual Boundaries, Risk Pathways, and a Protection-Strength Grading Framework

Lei Sun, Xiuqing Mao, Shuai Zhang, Qingyu Zeng, Min Zhao, Jiyuan Li, Wenle Dong

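AI summary: As brain-computer interface (BCI) technology moves from the laboratory into clinical and real-world use, privacy protection is becoming increasingly pressing. This paper systematically reviews the pathways through which privacy can leak in BCI systems and proposes a three-dimensional taxonomy covering protected objects, lifecycle stages, and protection-strength levels, grouping existing work into four protection-strength levels. It stresses that BCI privacy protection must not only hide data but also separate out task-irrelevant sensitive information while preserving the system's functional utility, and notes that mental privacy and neuroethical risks remain open problems.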

2605.11385 2026-05-13 cs.CV cs.RO

JACoP: Joint Alignment for Compliant Multi-Agent Prediction

Qingze Liu, Alen Mrdovic, Danrui Li, Mathew Schwartz, Sejong Yoon, Mubbasir Kapadia

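AI summary: This paper proposes JACoP, a multi-stage framework for collective compliance in multi-agent trajectory prediction. Its core method combines anchor-based filtering of individual trajectories with joint trajectory alignment based on a Markov random field, reducing social collisions between trajectories and environmental violations. JACoP significantly improves scene-level plausibility while maintaining prediction accuracy, offering safer and more reliable predictions for practical use.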

Comments Accepted by CVPRF 2026

2605.11383 2026-05-13 cs.CV

HamBR: Active Decision Boundary Restoration Based on Hamiltonian Dynamics for Learning with Noisy Labels

Ningkang Peng, Jingyang Mao, Qianfeng Yu, Xiaoqian Peng, Peirong Ma, Yanhui Gu

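AI summary: In large-scale visual recognition and data mining tasks, noisy labels severely undermine the generalization of deep neural networks. This paper proposes HamBR, which the authors describe as the first active decision boundary restoration method based on Hamiltonian dynamics. It uses a Spherical Hamiltonian Monte Carlo mechanism to actively probe inter-class ambiguous regions of the feature space and synthesize high-quality virtual outliers, then uses energy-based modeling to build robust barriers at the decision boundaries, restoring their discriminative sharpness. Experiments show that HamBR achieves state-of-the-art performance on multiple benchmark datasets and significantly improves out-of-distribution detection.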

Abstract

In large-scale visual recognition and data mining tasks, the presence of noisy labels severely undermines the generalization capability of deep neural networks (DNNs). Prevalent sample selection methods rely primarily on training loss or prediction confidence for passive screening. However, within a feature space degraded by noise, decision boundaries undergo systematic boundary collapse. This phenomenon hinders the ability of the model to distinguish between hard clean samples and noisy samples at the decision margins, thereby creating a significant performance bottleneck. This study is the first to emphasize the pivotal importance of active boundary restoration for noise-robust learning. We propose HamBR, a novel paradigm based on Hamiltonian dynamics. The core approach leverages the Spherical Hamiltonian Monte Carlo (Spherical HMC) mechanism to actively probe inter-class ambiguous regions within the representation space and synthesize high-quality virtual outliers. By imposing explicit repulsion constraints via energy-based modeling, these synthesized samples establish robust energy barriers at the decision boundaries. This mechanism forces real samples to move from dispersed overlapping regions toward their respective class centers, thereby restoring the discriminative sharpness of the decision boundaries. HamBR demonstrates exceptional versatility and can be integrated as a plug-and-play defense module into existing semi-supervised noisy label learning frameworks. Empirical evaluations show that the proposed paradigm significantly enhances the discriminative accuracy of hard boundary samples, achieving state-of-the-art (SOTA) performance on CIFAR-10/100 and real-world noise benchmarks. Furthermore, it exhibits superior convergence efficiency and reliable robustness, while improving significantly the capability of the model for Out-of-Distribution (OOD) detection.


2605.11381 2026-05-13 cs.RO cs.DC

Kairos: A Scalable Serving System for Physical AI

Yinwei Dai, Ganesh Ananthanarayanan, Landon Cox, Xenofon Foukas, Bozidar Radunovic, Ravi Netravali

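AI summary: As physical AI becomes more capable in general-purpose environments, its inference characteristics differ markedly from those of digital AI, and existing digital AI serving systems struggle to meet its needs. This paper presents Kairos, described as the first physical AI serving system designed for multiple robots, which treats the generate-execute loop as its core mechanism and significantly improves task execution efficiency. Experiments show that across a range of physical AI models and robot platforms, Kairos reduces average end-to-end task latency by 31.8% to 66.5% relative to existing digital AI serving approaches, with gains growing as the number of robots increases.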

2605.11380 2026-05-13 cs.LG cs.AI

TRACE: Temporal Routing with Autoregressive Cross-channel Experts for EEG Representation Learning

Fan Ma, Qier An, Peng Chen, Lingfei Qian, Xiang Lan, Mingyang Jiang, Zhiling Gu, Xenophon Papademetris, Hua Xu

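AI summary: This paper proposes TRACE, an autoregressive EEG pretraining framework that addresses the difficulty of learning transferable representations from multi-channel, non-stationary EEG signals. TRACE predicts future EEG segments from causal context and performs temporally adaptive computation that is consistent across channels at each time step, flexibly modeling different temporal phases and inter-channel relationships. The method supports heterogeneous pretraining across channel configurations and recording domains, and experiments show strong results on multiple downstream tasks, with competitive performance on motor imagery and clinical event classification in particular.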