arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.12741 2026-05-14 cs.LG

Learning with Rare Success but Rich Feedback via Reflection-Enhanced Self-Distillation

Yuwei Zhang, Sha Li, Changlong Yu, Qin Lu, Shuowei Jin, Chengyu Dong, Haoran Liu, Ilgee Hong, Xintong Li, Zhenyu Shi, Bing Yin, Jingbo Shang

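Affiliations UC San Diego; Amazon; Georgia Institute of Technology

AI summary This paper studies how to enable large language models to keep improving through interaction with their environment, particularly when successes are rare. It proposes Reflection-Enhanced Self-Distillation (RESD), a framework that turns failure feedback into an active corrective signal, generates retrospective reflections to diagnose local errors, and builds a global playbook to preserve reusable knowledge. Experiments show that RESD substantially outperforms standard self-distillation baselines on continual learning tasks and is more interaction-efficient in the early stage of training.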

Comments Work in progress

Abstract

Enabling Large Language Models (LLMs) to continuously improve from environmental interactions is a central challenge in post-training. While on-policy self-distillation offers a promising paradigm, existing methods predominantly treat environmental feedback as a passive conditioning signal. Consequently, they heavily rely on successful demonstrations and struggle to learn in rare-success regimes. To bridge this gap, we introduce Reflection-Enhanced Self-Distillation (RESD), a framework that transforms raw failure feedback into an active source of corrective supervision. Instead of passively appending feedback, RESD interprets failed trajectories by generating retrospective reflections to diagnose local errors, and curates a persistent global playbook to preserve reusable lessons across training steps. The enriched context enables the self-teacher to provide actionable token-level supervision even in the absence of successful rollouts. Empirical evaluations on multiple continual learning tasks demonstrate that RESD substantially outperforms standard self-distillation baselines. Furthermore, RESD achieves significantly faster early-stage improvement than GRPO with $8\times$ samples using only a single rollout per prompt, highlighting its superior interaction efficiency.

2605.12736 2026-05-14 cs.LG

ConRetroBert: EMA Stabilized Dual Encoders for Template-Based Single-Step Retrosynthesis

Mohammad Jahid Ibna Basher, Ali Khodabandeh Yalabadi, Ivan Garibay, Ozlem Ozmen Garibay

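Affiliations Department of Industrial Engineering

AI summary ConRetroBert is a template-based single-step retrosynthesis method that uses a dual-encoder framework to recast template selection as dense template retrieval followed by candidate-set ranking. It uses contrastive pretraining to learn a shared embedding space for products and reaction templates, adds a multi-positive listwise ranking objective to refine template ranking, and stabilizes template-encoder updates with an exponential moving average (EMA). Experiments show that ConRetroBert substantially improves reaction prediction accuracy on USPTO-50k and performs well on rare templates.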

Comments Submitted to NeurIPS 2026 Main Conference

Abstract

Template-based single-step retrosynthesis predicts reactants by selecting and applying an explicit reaction template, making each prediction traceable to a chemical transformation rule. This is useful for synthesis planning, but template-based methods are often viewed as less competitive than template-free models because template prediction is commonly formulated as global classification over a long-tailed rule library. We argue that this weakness is not inherent to templates, but to the learning formulation. We present ConRetroBert, a dual-encoder framework that reframes template-based retrosynthesis as dense product-template retrieval followed by candidate-set listwise ranking. Stage 1 uses contrastive pretraining to learn a shared embedding space between products and reaction templates. Stage 2 refines template ranking over mined hard-negative candidate sets with a multi-positive listwise objective. To enable template-side adaptation without destabilizing hard-negative mining, ConRetroBert uses a slow-moving exponential moving average (EMA) template encoder for retrieval bank construction while updating the live template encoder through the ranking loss. On the local USPTO-50k benchmark, Stage 2 candidate-set ranking improves top-1 reaction accuracy from 50.5% to 61.3%, while EMA-stabilized template adaptation further improves it to 62.4%. Fine-tuning from a leakage-controlled USPTO-Full checkpoint reaches 75.4% top-1 accuracy on USPTO-50k. We also show that retrieval-based template prediction is strong in the long tail of rare templates, and that many correct reactant predictions arise from alternative explicit templates rather than only the recorded positive label. Code and data are available at https://github.com/JahidBasher/ConRetroBert.
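The slow-moving template encoder described above can be illustrated with a generic exponential moving average update. This is a minimal sketch, not the authors' code: the function name `ema_update`, the dictionary-of-arrays parameter layout, and the momentum value are all assumptions made for illustration.

```python
import numpy as np

def ema_update(ema_params, live_params, momentum=0.99):
    """Return new EMA parameters: momentum * ema + (1 - momentum) * live.

    Both arguments are dicts mapping a parameter name to a NumPy array
    (a hypothetical stand-in for real encoder weights).
    """
    return {
        name: momentum * ema_params[name] + (1.0 - momentum) * live_params[name]
        for name in ema_params
    }

# Toy weights: the live encoder moves quickly, the EMA copy trails behind it.
live = {"w": np.array([1.0, 2.0])}
ema = {"w": np.array([0.0, 0.0])}
ema = ema_update(ema, live, momentum=0.9)
```

A momentum close to 1 keeps the retrieval bank nearly fixed between refreshes, which is the stabilizing effect the abstract attributes to the EMA encoder; how often the bank is rebuilt is not stated in the abstract.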

2605.12735 2026-05-14 cs.RO

The Unified Autonomy Stack: Toward a Blueprint for Generalizable Robot Autonomy

Mihir Dharmadhikari, Nikhil Khedekar, Mihir Kulkarni, Morten Nissov, Martin Jacquet, Angelos Zacharia, Marvin Harms, Albert Gassol Puigjaner, Philipp Weiss, Kostas Alexis

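Affiliations Autonomous Robots Lab

AI summary This paper introduces and open-sources the Unified Autonomy Stack, a system-level solution for aerial and ground robot morphologies that aims at resilient, general autonomy. The stack comprises three cooperating modules (multi-modal perception, multi-behavior planning, and multi-layered safe navigation) and fuses LiDAR, radar, vision, and inertial data to provide mapping, semantic understanding, path planning, and safe navigation, enabling safe autonomous navigation and exploration in GNSS-denied, complex, and highly cluttered environments. The stack has been field-tested on aerial and ground robots, validating its performance in demanding environments.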

Comments 35 pages, 22 figures, 8 tables

Abstract

We introduce and open-source the Unified Autonomy Stack, a system-level solution that enables resilient autonomy across diverse aerial and ground robot morphologies. The architecture centers on three synergistic modules -- multi-modal perception, multi-behavior planning, and multi-layered safe navigation -- that together deliver comprehensive mission autonomy. The stack fuses data from LiDAR, radar, vision, and inertial sensing, enabling (a) robust localization and mapping through factor graph-based fusion, (b) semantic scene understanding, (c) motion and informative path planning through sampling-based techniques adaptive across spatial scales, as well as (d) multi-layered safe navigation both through planning on the online reconstructed map and deep learning-driven exteroceptive policies alongside last-resort safety filters using control barrier functions. The resulting behaviors include safe GNSS-denied navigation into unknown and perceptually-degraded regions, exploration of complex environments, object discovery, and efficient inspection planning. The stack has been field-tested and validated on both aerial (rotorcraft) and ground (legged) robots operating in a host of demanding environments, including self-similar and smoke-filled settings, with complex geometries and high obstacle clutter. These tests demonstrate resilient performance in challenging conditions. To facilitate ease of adoption, we open-source the implementation alongside supporting documentation, validation, and evaluation datasets at https://github.com/ntnu-arl/unified_autonomy_stack. A video giving an overview of the paper and the field experiments is available at https://youtu.be/l8Su8OXsM-E.

2605.12733 2026-05-14 cs.LG cs.AI stat.ML

From Generalist to Specialist Representation

Yujia Zheng, Fan Feng, Yuke Li, Shaoan Xie, Kevin Murphy, Kun Zhang

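Affiliations CMU; UIUC; UCSD; MBZUAI; UMD; UBC

AI summary This paper studies how to learn a task-relevant specialist representation from a generalist model, centering on proofs, in a nonparametric setting, that task structure and task-relevant latent representations are identifiable. Without interventions, parametric forms, or structural constraints, it proves that task structure can be identified in a fully unsupervised manner even when sequences lack strict temporal dependence or contain disconnections, and that within each time step a simple sparsity regularization separates the task-relevant part from the irrelevant part. These results lay a theoretical foundation for provably moving from generalist to specialist models.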

Comments ICML 2026

Abstract

Given a generalist model, learning a task-relevant specialist representation is fundamental for downstream applications. Identifiability, the asymptotic guarantee of recovering the ground-truth representation, is critical because it sets the ultimate limit of any model, even with infinite data and computation. We study this problem in a completely nonparametric setting, without relying on interventions, parametric forms, or structural constraints. We first prove that the structure between time steps and tasks is identifiable in a fully unsupervised manner, even when sequences lack strict temporal dependence and may exhibit disconnections, and task assignments can follow arbitrarily complex and interleaving structures. We then prove that, within each time step, the task-relevant latent representation can be disentangled from the irrelevant part under a simple sparsity regularization, without any additional information or parametric constraints. Together, these results establish a hierarchical foundation: task structure is identifiable across time steps, and task-relevant latent representations are identifiable within each step. To our knowledge, each result provides a first general nonparametric identifiability guarantee, and together they mark a step toward provably moving from generalist to specialist models.

2605.12730 2026-05-14 cs.AI cs.GR cs.MA physics.soc-ph

BEHAVE: A Hybrid AI Framework for Real-Time Modeling of Collective Human Dynamics

Helene Malyutina

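Affiliations Independent Researcher, Collective Dynamics Lab

AI summary This paper proposes BEHAVE, a hybrid AI framework for real-time modeling of collective human dynamics. Conventional AI systems mostly focus on individual behavior or on detecting events after the fact, and struggle to capture collective dynamics such as whether a group stabilizes, escalates, or breaks down. BEHAVE treats a group as a complex dynamical system with emergence, nonlinearity, feedback loops, and sensitivity near critical points, builds an interaction space from observable physical signals, and models it as continuous behavioral fields, giving a distributed representation and forecast of group state. The framework combines formal results (one theorem and two structural propositions) with neural models; a working pipeline is demonstrated on a 7-agent negotiation snapshot, and the author states that the same fields, recalibrated, apply to other settings such as crowd safety and crisis-team dynamics.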

Comments 19 pages

Abstract

Existing AI systems for modeling human behavior operate at the level of individuals or detect events after they occur. As a result, they systematically fail to capture the collective dynamics that determine whether a group remains stable or transitions into escalation or breakdown. We propose a different foundation: a group of interacting humans constitutes a complex dynamical system in the precise mathematical sense, exhibiting emergence, nonlinearity, feedback loops, sensitivity near critical points, and phase transitions between qualitatively distinct regimes. The state of such a system is not located within any single participant; it is distributed across mutual influence loops and observable through the micro-dynamics of the body. We introduce BEHAVE (Behavioral Engine for Human Activity Vector Estimation), a formal framework that models collective dynamics as continuous behavioral fields defined over an interaction space derived from observable physical signals. Kinematic micro-signals (position, velocity, body orientation, gestural activity) are structured into a directed interaction graph and aggregated into a basis of behavioral fields capturing distinct, non-redundant axes of collective state. The framework rests on one theorem and two structural propositions characterizing the tension field, the field basis, and the criticality index. Perception and forecasting layers are implemented using neural models, enabling data-driven learning and approximation of system dynamics. BEHAVE is formulated as a computational system for learning, representing, and forecasting collective dynamics from data. A working pipeline is demonstrated on a 7-agent negotiation snapshot. The same fields, recalibrated, apply to crowd safety, crisis-team dynamics, education, and clinical contexts.

2605.12726 2026-05-14 cs.LG

Before the Last Token: Diagnosing Final-Token Safety Probe Failures

Shravan Doda

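Affiliations SafeSwitch HarmBench SorryBench

AI summary This study examines failures of final-token safety probes in detecting harmful content, noting that unsafe evidence in some jailbreak prompts may be distributed across earlier tokens and missed by the final-token readout. Analyzing hidden states in three instruction-tuned LLMs, it finds that the probes achieve high recall on clean harmful prompts but miss many jailbreaks and can produce false positives. It further proposes a PCA-HMM trajectory model that recovers many of the cases missed by final-token probes, offering a new diagnostic angle for safety detection.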

Comments 8 pages, 2 figures, 7 tables

Abstract

Final-token safety probes monitor a single hidden state after prompt prefill, but jailbreak prompts can contain probe-visible unsafe evidence distributed across earlier user-token representations that is missed by this readout. We study this prefill-time failure mode using SafeSwitch-style probes trained only on clean harmful and benign prompts across three instruction-tuned LLMs. The probes achieve high recall on clean harmful prompts, but miss many jailbreaks and can produce false positives on safety-adjacent benign prompts. Subspace analyses suggest that missed jailbreaks differ from clean benign prompts along directions that are poorly captured by the probe's representational subspace, and increasing probe bottleneck width does not reliably resolve this mismatch. Token-level prefill analyses reveal that probe-visible unsafe evidence often appears earlier in the sequence but is not exposed at the final-token readout, while naive max-pooling over token positions overfires on safe prompts. A simple PCA-HMM trajectory model, trained only on the same clean split, recovers many final-token misses from user-content prefill trajectories without the catastrophic false-positive behavior of naive token pooling, motivating trajectory-aware hidden-state analyses as diagnostic complements to final-token probes.
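The contrast between the final-token readout and naive max-pooling can be made concrete with a toy example. This is only an illustrative sketch: the per-token scores below are invented, the 0.5 threshold is arbitrary, and the two helper functions are hypothetical rather than taken from the paper or from SafeSwitch.

```python
import numpy as np

def final_token_flag(token_scores, threshold=0.5):
    """Flag a prompt using only the probe score at the last token position."""
    return bool(token_scores[-1] > threshold)

def max_pool_flag(token_scores, threshold=0.5):
    """Flag a prompt if the probe score at any token position exceeds the threshold."""
    return bool(np.max(token_scores) > threshold)

# Invented per-token probe scores (higher means more unsafe evidence).
jailbreak = np.array([0.1, 0.9, 0.8, 0.2, 0.1])        # evidence appears early, fades by the end
benign_adjacent = np.array([0.1, 0.6, 0.2, 0.1, 0.1])  # one safety-adjacent token, prompt is safe

missed_by_final = not final_token_flag(jailbreak)       # final-token readout misses the jailbreak
caught_by_pool = max_pool_flag(jailbreak)               # max-pooling catches it
false_positive = max_pool_flag(benign_adjacent)         # but max-pooling also fires on the safe prompt
```

With these made-up numbers all three booleans are true, which mirrors the trade-off the abstract describes and motivates a model of the whole score trajectory rather than a single position or a maximum.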

2605.12725 2026-05-14 cs.CV

Is Video Anomaly Detection Misframed? Evidence from LLM-Based and Multi-Scene Models

Furkan Mumcu, Michael J. Jones, Anoop Cherian, Yasin Yilmaz

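Affiliations University of South Florida; Mitsubishi Electric Research Laboratories

AI summary Video anomaly detection (VAD) research has recently shifted toward general models of normality intended to work across scenes, but this trend overlooks the scene-specific and context-dependent nature of normal behavior. Existing methods often rely on pretrained representations from multimodal large language models and on video-level weak supervision, which leads models to respond to semantic anomaly categories rather than to deviations from normal behavior in a particular environment. Through visual analyses and empirical evaluation, the paper argues that this practice weakens spatial localization, introduces semantic bias, and reduces anomaly detection to action recognition, and that VAD should refocus on spatially aware, explainable modeling of normality within a single scene.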

Abstract

Recent video anomaly detection research has expanded rapidly with an emphasis on general models of normality intended to work across many different scenes. While this focus has led to improvements in scalability and multi-scene generalization, it has also shifted the field away from modeling the scene-specific and context-dependent nature of normal behavior. Contemporary approaches frequently rely on video-level weak supervision and opaque pretrained representations from multi-modal large language models (MLLMs), which encourage models to respond to familiar semantic anomaly categories rather than to deviations from the normal patterns of a particular environment. This trend suppresses spatial localization, introduces semantic bias, and reduces anomaly detection to a form of action recognition. In this paper, we examine whether these prevailing formulations align with the core requirements of real-world VAD, which is typically performed within a single scene where normality is determined by local geometry, semantics, and activity patterns. Through targeted visual analyses and empirical evaluations, we demonstrate the practical consequences of these limitations and show that meaningful progress in VAD requires renewed focus on single-scene, spatially-aware, and explainable formulations that capture the nuanced structure of normality within individual environments.

2605.12724 2026-05-14 cs.CV cs.AI

Inline Critic Steers Image Editing

Weitai Kang, Xiaohang Zhan, Yizhou Wang, Mang Tik Chiu, Jason Kuen, Kangning Liu, Yan Yan

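Affiliations University of Illinois Chicago; Adobe

AI summary This paper addresses the varying difficulty of different image regions in instruction-based image editing and proposes correcting the model's output during generation. The core idea is a learnable "Inline Critic" token that critiques the model's predictions at intermediate layers and steers the rest of the forward pass. A three-stage training recipe stabilizes learning, and the method reports strong results on several benchmarks, including state of the art on GEdit-Bench.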

Comments 9 pages

Abstract

Instruction-based image editing exhibits heterogeneous difficulty not only across cases but also across regions of an image, motivating refinement approaches that allocate correction to where the model struggles. Existing refinement signals arrive late, after a fully generated image or a completed denoising step. We ask whether such a signal can act within an ongoing forward pass. To investigate this, we probe a frozen image-editing model and find that although generation capability emerges only in the last few layers, the error pattern is already set in early layers (rank correlation $\rho = 0.83$ with the final-layer error map). Based on this, we introduce Inline Critic, a learnable token that critiques a frozen model's predictions at its intermediate layers and steers its hidden states to refine generation during the forward pass. A three-stage recipe is proposed to stabilize the training from learning how to critique to steering generation. As a result, we achieve state of the art on GEdit-Bench (7.89), a +9.4 gain on RISEBench over the same backbone, and the strongest open-source result on KRIS-Bench (81.92, surpassing GPT-4o). We further provide analyses showing that the critic genuinely shapes the model's attention and prediction updates at subsequent layers.
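The rank correlation between an early-layer error map and the final-layer error map can be computed as a Spearman coefficient over flattened maps. The sketch below assumes that reading; the tiny 2x2 maps are invented, and the helper `spearman_rho` is a hypothetical name that handles only tie-free inputs.

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman rank correlation of two 1-D arrays, assuming no tied values."""
    rank_a = np.argsort(np.argsort(a))  # rank of each element in a
    rank_b = np.argsort(np.argsort(b))  # rank of each element in b
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

# Invented per-region error maps at an early layer and at the final layer.
early_error = np.array([[0.1, 0.4], [0.3, 0.9]])
final_error = np.array([[0.2, 0.5], [0.6, 1.0]])

rho = spearman_rho(early_error.ravel(), final_error.ravel())
```

For this toy pair the coefficient is 0.8: the two maps agree on the easiest and hardest regions and swap the middle two. A high value on real maps would indicate, as the abstract argues, that the regions where the model will eventually fail are already identifiable early in the forward pass.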

2605.12719 2026-05-14 cs.RO cs.LG

A Five-Layer MLOps Architecture for Connected Automated Driving

Bastian Lampe, Lutz Eckstein

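Affiliations Institute for Automotive Engineering (ika), RWTH Aachen University

AI summary Automated driving systems (ADSs) operate in complex, dynamic, open-world environments, which makes continual assurance of their safety and performance a major challenge. This paper proposes a five-layer architecture based on MLOps principles that supports continual improvement of ADSs through collective learning across fleets. The architecture gives fleet operators and other stakeholders a conceptual blueprint for designing and implementing MLOps processes, and its multi-level self-assessments can help detect and reduce edge cases, including black swan events.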

Comments 8 pages, 6 figures

Abstract

The continual assurance of safety and performance of automated driving systems (ADSs) poses significant challenges. ADSs operate in complex, dynamic, open-world environments allowing a wide range of scenarios, including ones that are rare or not foreseen during initial development. While the incorporation of artificial intelligence (AI) and machine learning (ML) technology allows ADSs to learn from data gathered during operation and thus enables them to adapt over time, these approaches come with their own challenges. A key advantage of ADSs compared to human drivers is their greater ability to gather data collectively across a fleet of vehicles, or even across multiple fleets operated by different entities, and to learn from this data collectively. Vehicles can share and combine their data to identify additional learning opportunities otherwise missed by individual vehicles. This creates new opportunities to tackle the challenges of continual assurance of safety and performance, but requires the implementation of architectures that leverage the collective learning potential. Based on established MLOps principles and existing work in the field of connected automated driving, this paper presents a five-layer architecture for collective learning-enabled MLOps processes for ADSs. The goal of this architecture is to provide a conceptual blueprint for the design and implementation of MLOps processes by fleet operators and other relevant stakeholders. The paper describes the main responsibilities of each layer, their interactions, and how multi-level self-assessments enabled by the architecture can support the detection and reduction of edge cases including black swan events.

2605.12714 2026-05-14 cs.LG cs.CL

Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs

Jingzhou Jiang, Yi Yang, Kar Yan Tam

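Affiliations The Hong Kong University of Science and Technology

AI summary This study proposes Layer-wise Representation Dynamics (LRD), a framework for analyzing how representations change across the layers of modern language models. It comprises three measurement families: Frenet for global subspace motion, Neighborhood Retention Score (NRS) for local nearest-neighbor retention, and Graph Filtration Mutual Information (GFMI) for alignment with the final layer. Experiments on 31 models and 30 MTEB tasks reveal architectural and task-level differences in layer-wise representations and demonstrate LRD's use for label-free model selection and inference-time layer pruning, indicating that layer-wise structure carries signal for both interpretation and deployment decisions.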

Abstract

Hidden states change substantially across the layers of modern language models, but most layer-wise analyses focus on one aspect of that change. We propose Layer-wise Representation Dynamics (LRD), a framework with three layer-wise measurement families: Frenet (Grassmann speed and curvature) for global subspace motion, Neighborhood Retention Score (NRS) for local nearest-neighbor retention, and Graph Filtration Mutual Information (GFMI) for alignment with the final layer. Applying LRD to 31 models (encoder-based and decoder-based embedders, plus base LLMs) on 30 MTEB tasks reveals architectural and task-level differences that are not apparent from final-layer representations alone. We then use LRD for two applications: label-free model selection and inference-time layer pruning. For selection, all three model-level scores correlate positively with downstream MTEB performance, with end-to-end subspace displacement ($d_{0,L}$) the strongest, and the same direction holds on a smaller base-LLM MMLU panel. For pruning, GFMI is the only measurement-guided rule that beats Random at the 15% and 20% budgets and has the best median change at every budget. Frenet is effective only at the lightest budget, while NRS does not transfer from model selection to pruning. These results show that layer-wise structure provides signal for both interpretation and deployment decisions.
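The abstract does not define NRS, so the following is only one plausible reading of "local nearest-neighbor retention": the average fraction of each point's k nearest neighbors at one layer that are still among its k nearest neighbors at another layer. The function names, the Euclidean metric, and the choice of k are assumptions.

```python
import numpy as np

def knn_indices(x, k):
    """Indices of the k nearest neighbors (Euclidean) of each row of x, excluding itself."""
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return np.argsort(dists, axis=1)[:, :k]

def neighborhood_retention(layer_a, layer_b, k=3):
    """Mean fraction of k-nearest neighbors shared between two layers' representations."""
    nn_a, nn_b = knn_indices(layer_a, k), knn_indices(layer_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

rng = np.random.default_rng(0)
hidden = rng.normal(size=(10, 4))    # 10 inputs, 4-dim hidden states (toy data)
unrelated = rng.normal(size=(10, 4)) # an unrelated layer for comparison

same = neighborhood_retention(hidden, hidden)
scaled = neighborhood_retention(hidden, 3.0 * hidden)
other = neighborhood_retention(hidden, unrelated)
```

Under this reading, a uniform rescaling of the representation leaves the score at 1.0 because neighbor order is unchanged, while an unrelated layer scores lower; the paper's actual definition may differ in metric, normalization, or aggregation.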

2605.12710 2026-05-14 cs.RO

Belief-Space Residual Risk for Automated Driving under Localization Uncertainty

Nijinshan Karunainayagam, Nils Gehrke, Frank Diermeyer

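Affiliations Institute of Automotive Technology at the Technical University of Munich

AI summary This paper studies residual risk assessment for automated driving systems under localization uncertainty. To reflect uncertainty in the vehicle's own position, the authors extend the residual risk metric to the belief space, model ego pose uncertainty as a Gaussian distribution, and redefine residual risk as the expected degradation-induced risk over that distribution. Within a particle-based risk estimation framework, localization uncertainty enters the collision probability computation through covariance fusion of ego and object uncertainties, with the aim of making risk assessment more robust.

Comments 7 pages. This work has been accepted for publication in IEEE Intelligent Transportation Systems (ITSC) 2026. The final published version will be available via IEEE Xplore.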

Abstract

Residual risk metrics have recently been introduced to assess the safety implications of automated driving systems. Existing approaches typically assume a deterministic ego pose and concentrate mainly on perception errors related to surrounding objects and latency effects. In practice, however, automated vehicles operate under considerable localization uncertainty, especially in complex urban settings and in adverse weather conditions. This work extends the spatial residual risk formulation to the belief space by explicitly modeling ego pose uncertainty as a Gaussian distribution. Residual risk is reformulated as the expected degradation-induced risk over the ego pose belief distribution. Within a particle-based risk estimation framework, localization uncertainty is incorporated into the computation of collision probabilities through covariance fusion of ego and object uncertainties.
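One simple way to picture covariance fusion is to treat the ego and object positions as independent 2-D Gaussians, so that their relative position is Gaussian with the summed covariance, and then estimate collision probability by sampling. The sketch below makes exactly those assumptions; the function name, the circular collision region of radius `radius`, and all numbers are illustrative and not taken from the paper.

```python
import numpy as np

def collision_probability(mu_ego, cov_ego, mu_obj, cov_obj, radius,
                          n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(distance between ego and object < radius).

    Assumes independent Gaussian position beliefs, so the relative position
    has mean mu_obj - mu_ego and fused covariance cov_ego + cov_obj.
    """
    rng = np.random.default_rng(seed)
    mu_rel = np.asarray(mu_obj, dtype=float) - np.asarray(mu_ego, dtype=float)
    cov_rel = np.asarray(cov_ego, dtype=float) + np.asarray(cov_obj, dtype=float)
    samples = rng.multivariate_normal(mu_rel, cov_rel, size=n_samples)
    return float(np.mean(np.linalg.norm(samples, axis=1) < radius))

# Object 3 m ahead with small perception noise; compare exact vs uncertain ego pose.
p_precise = collision_probability([0, 0], np.zeros((2, 2)), [3, 0], 0.1 * np.eye(2), radius=1.0)
p_uncertain = collision_probability([0, 0], 1.0 * np.eye(2), [3, 0], 0.1 * np.eye(2), radius=1.0)
```

In this toy setup the estimate rises once ego uncertainty is included, because probability mass spreads toward the collision region. That direction is specific to an object outside the region; for an object already inside it, added uncertainty would lower the estimate.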

2605.12709 2026-05-14 cs.LG

Spectral Energy Centroid: a Metric for Improving Performance and Analyzing Spectral Bias in Implicit Neural Representations

Tomasz Dądela, Adam Kania, Maciej Rut, Przemysław Spurek

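Affiliations Jagiellonian University; IDEAS

AI summary This paper presents the Spectral Energy Centroid (SEC), a metric for analyzing and improving implicit neural representations (INRs). SEC quantifies the frequency content of target images and the spectral bias of INR models, shedding light on the relationship between frequency and INR performance. The study demonstrates SEC on three tasks: hyperparameter selection, assessing signal complexity, and aligning spectral biases across architectures.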

Abstract

Implicit Neural Representations (INRs) model continuous signals using multilayer perceptrons (MLPs), enabling compact, differentiable, and high-fidelity representations of data across diverse domains. However, due to the low-frequency bias of MLPs that prevents effective learning of small details, the model's frequency must be carefully tuned through the embedding layer. Prior work established that this tuning can be performed before training based on the target signal, but it did not account for the significant effect of model depth, indicating that our understanding of the relationship between frequency and INR performance remains limited. To gain insights into this relationship, we utilize the Spectral Energy Centroid (SEC) metric that quantifies the frequency of target images and the spectral bias of INR models. We show that SEC is a versatile tool for INR analysis, demonstrating its utility across three tasks: (1) a data-driven strategy (SEC-Conf) for hyperparameter selection that outperforms existing heuristics and is robust to model depth, (2) a reliable proxy for signal complexity, and (3) effective alignment of spectral biases across diverse INR architectures.
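The abstract does not give a formula for SEC. A natural guess, used here purely for illustration, is the energy-weighted mean radial frequency of an image's 2-D Fourier spectrum; the function name, the removal of the mean (DC) component, and the use of cycles per pixel as the unit are all assumptions.

```python
import numpy as np

def spectral_energy_centroid(image):
    """Energy-weighted mean radial frequency (cycles per pixel) of a 2-D array."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image - image.mean())  # drop the DC component
    energy = np.abs(spectrum) ** 2
    fy = np.fft.fftfreq(image.shape[0])
    fx = np.fft.fftfreq(image.shape[1])
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    total = energy.sum()
    return float((radius * energy).sum() / total) if total > 0 else 0.0

# Two synthetic 64x64 images: 2 cycles vs 16 cycles across the width.
x = np.arange(64)
low = np.tile(np.sin(2 * np.pi * 2 * x / 64), (64, 1))
high = np.tile(np.sin(2 * np.pi * 16 * x / 64), (64, 1))

sec_low = spectral_energy_centroid(low)
sec_high = spectral_energy_centroid(high)
```

For these pure sinusoids the centroid equals the sinusoid's frequency (2/64 and 16/64 cycles per pixel), so a detail-rich target yields a larger value. Under this reading, a scalar like this could guide how high the embedding frequency of an INR needs to be, which is the kind of use the abstract describes for SEC-Conf.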

2605.12706 2026-05-14 cs.LG q-bio.GN

A Resampling-Based Framework for Network Structure Learning in High-Dimensional Data

Ziwei Huang, Zeyuan Song, Paola Sebastiani, Stefano Monti

Affiliations Department of Physics, Boston University; Institute for Clinical Research and Health Policy Studies, Tufts Medical Center; Department of Medicine, School of Medicine, Tufts University; Data Intensive Study Center, Tufts University; Division of Computational Biomedicine, Boston University Chobanian & Avedisian School of Medicine; Department of Biostatistics, Boston University School of Public Health; Bioinformatics Program, Faculty of Computing and Data Science, Boston University

AI summary RSNet is an open-source R package that provides a resampling-based framework for robust and interpretable network structure learning in high-dimensional data, aimed at the challenges of small sample sizes. It supports estimation of partial correlation networks and of conditional Gaussian Bayesian networks for mixed continuous and discrete data, and incorporates multiple resampling strategies to handle independent or correlated observations. RSNet adds graphlet-based topology analysis for interpretability and is described as the first R package to efficiently construct signed graphlet degree vector matrices for sparse networks, supporting scalable analysis of higher-order network structure.

Comments 7 pages, 1 figure

Abstract

RSNet is an open-source R package that provides a resampling-based framework for robust and interpretable network inference, designed to address the limited-sample-size challenges common in high-dimensional data. It supports both the estimation of partial correlation networks modeled as Gaussian networks and conditional Gaussian Bayesian networks for mixed data types that combine continuous and discrete variables. The framework incorporates multiple resampling strategies, including bootstrap, subsampling, and cluster-based approaches, to accommodate both independent and correlated observations. To enhance interpretability, RSNet integrates graphlet-based topology analysis that captures higher-order connectivity and edge sign information, enabling single-node and subnetwork-level insights. Notably, RSNet is the first R package to efficiently construct signed graphlet degree vector matrices (GDVMs) in near-constant time for sparse networks, providing scalable analysis of higher-order network structure. Collectively, RSNet offers a versatile tool for statistically reliable and interpretable network inference in high-dimensional data.
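The general idea of resampling-based edge selection can be sketched independently of RSNet. The example below is written in Python rather than R and does not use or imitate RSNet's API: it bootstraps the rows of a data matrix, estimates partial correlations from the inverse covariance each time, and records how often each edge exceeds a threshold. The function names, the threshold, and the simulated data are assumptions.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse of the sample covariance."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def bootstrap_edge_frequency(data, n_boot=200, threshold=0.3, seed=0):
    """Fraction of bootstrap resamples in which |partial correlation| > threshold."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    counts = np.zeros((p, p))
    for _ in range(n_boot):
        sample = data[rng.integers(0, n, size=n)]  # resample rows with replacement
        counts += np.abs(partial_correlations(sample)) > threshold
    return counts / n_boot

# Simulated data: x1 depends on x0, while x2 is independent of both.
rng = np.random.default_rng(1)
x0 = rng.normal(size=400)
x1 = x0 + 0.5 * rng.normal(size=400)
x2 = rng.normal(size=400)
freq = bootstrap_edge_frequency(np.column_stack([x0, x1, x2]))
```

Edges that appear in nearly every resample, such as the x0-x1 edge here, are the statistically stable ones; keeping only edges above a chosen frequency is the usual way such a framework trades sensitivity for reliability. The diagonal of `freq` is trivially 1 and can be ignored.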

2605.12705 2026-05-14 cs.LG

Early Data Exposure Improves Robustness to Subsequent Fine-Tuning

Lawrence Feng, Gaurav R. Ghosal, Jacob Mitchell Springer, Ziqian Zhong, Aditi Raghunathan

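Affiliations Department of Computer Science; Cranberry-Lemon University; Department of Computational Neuroscience; University of the Witwatersrand

AI summary This paper studies how to train models whose acquired capabilities survive subsequent fine-tuning. In controlled experiments, the authors find that early exposure (mixing post-training data into pretraining) consistently improves the trade-off between retained upstream performance and downstream performance, and that replay and dropout applied during post-training provide complementary gains. In compute-matched experiments, the best allocation of target data between pretraining and post-training lies at neither extreme, suggesting that robustness to later fine-tuning should be addressed preventatively during upstream training.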

Abstract

How can we train models whose post-trained capabilities survive subsequent fine-tuning? Rather than focusing on downstream interventions to mitigate forgetting of upstream capabilities, we study how upstream training choices - that is, the manner in which a capability is acquired - shape how robustly that capability is retained. We investigate this question in a controlled three-stage language-model pipeline: pretraining, post-training to acquire a target capability, and downstream fine-tuning on a new objective. Across 135M and 1B models, two post-training domains, and two downstream fine-tuning tasks, we find that immediate post-training performance does not reliably predict retention after subsequent fine-tuning: training recipes that look equivalent immediately after post-training can retain the target capability very differently after subsequent fine-tuning. In particular, early exposure - mixing post-training data into pretraining - consistently improves the frontier between retained upstream performance and downstream performance. In compute-matched experiments, where the target data must be allocated between pretraining and post-training, we find that the optimum lies at neither extreme. Together with our other empirical and theoretical findings, this supports the view that post-training drives immediate specialization while early exposure improves robustness to later forgetting. Replay and dropout, typically used to mitigate forgetting as it occurs during fine-tuning, provide complementary gains to early exposure when applied during post-training. Our findings suggest that robustness to subsequent fine-tuning should be treated as a first-class objective of upstream training, addressed preventatively through choices like early exposure rather than reactively during fine-tuning itself.

2605.12703 2026-05-14 cs.CV cs.AI

MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence

Yifan Chen, Fei Yin, Qingyan Bai, Zicheng Lin, Yujiu Yang

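Affiliations University of Cambridge; HKUST; Tsinghua University

AI summary This paper introduces MMCL-Bench, a benchmark for multimodal context learning: learning task-local rules, procedures, and empirical patterns from visual or mixed-modality teaching context and applying them to new visual instances. The benchmark contains 102 tasks in three categories: rule system application, procedural task execution, and empirical discovery and induction. Under strict rubric-based scoring, current frontier multimodal models remain far from robust multimodal context learning, which the authors identify as an important capability bottleneck.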

Abstract

We introduce MMCL-Bench, a benchmark for multimodal context learning: learning task-local rules, procedures, and empirical patterns from visual or mixed-modality teaching context and applying them to new visual instances. Unlike text-only context learning or standard multimodal question answering, this setting requires models to recover and localize relevant evidence from images, screenshots, manuals, videos, and frame sequences before they can reason over the learned context. MMCL-Bench contains 102 tasks spanning three categories: rule system application, procedural task execution, and empirical discovery and induction. We evaluate frontier multimodal models with strict rubric-based scoring and find that current systems remain far from robust multimodal context learning, with even the strongest model solving fewer than one-third of tasks under strict evaluation. Diagnostic ablations and error analysis show that failures arise throughout the context-to-answer pipeline, including context anchoring, visual evidence extraction, context reasoning, and response construction. MMCL-Bench thus highlights multimodal context learning as an important unsolved capability bottleneck for current multimodal models.

2605.12702 2026-05-14 cs.AI cs.HC

DisaBench: A Participatory Evaluation Framework for Disability Harms in Language Models

Eugenia Kim, Ioana Tanase, Christina Mallon

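Affiliations Microsoft

AI summary This paper presents DisaBench, a participatory evaluation framework for disability-related harms in language models. It combines a taxonomy of twelve disability harm categories co-created with people with disabilities and red teaming experts, paired benign and adversarial prompts across seven life domains, and a dataset of 175 prompts with human-annotated labels on 525 prompt-response pairs. The study finds that harm rates vary sharply by disability type and are expected to compound in non-text modalities, that terminology-driven harm is culturally and temporally bound, and that standard safety evaluation misses subtle harms. The framework emphasizes that disability harm is personal, intersectional, and community-defined, and that general-purpose safety benchmarks fail to capture it fully.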

Abstract

General-purpose safety benchmarks for large language models do not adequately evaluate disability-related harms. We introduce DisaBench: a taxonomy of twelve disability harm categories co-created with people with disabilities and red teaming experts, a taxonomy-driven evaluation methodology that pairs benign and adversarial prompts across seven life domains, and a dataset of 175 prompts with human-annotated labels on 525 prompt-response pairs. Annotation by four evaluators with lived disability experience reveals three findings: harm rates vary sharply by disability type and will compound in non-text modalities, terminology-driven harm is culturally and temporally bound rather than universally assessable, and standard safety evaluation catches overt failures while missing the subtle harms that only domain expertise can recognize. Disability harm is simultaneously personal, intersectional, and community-defined: it cannot be isolated from the full context of who a person is, and general-purpose benchmarks systematically miss it. We will release the dataset, taxonomy, and methodology via Hugging Face and an open-source red teaming framework for direct integration into existing safety pipelines with no additional infrastructure.

2605.12700 2026-05-14 cs.LG cs.NA math.NA

UFO: A Domain-Unification-Free Operator Framework for Generalized Operator Learning

Hanli Qiao, George Em Karniadakis, Muhammad Muniruzzaman

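Affiliations Division of Applied Mathematics, Brown University; Institute of Geosciences, University of Bonn

AI summary This paper proposes UFO, a cross-domain neural operator framework that realizes operators through adaptive, jointly conditioned interactions among representations on distinct domains, without unifying them into a single domain representation. The framework decouples input and output discretization: inputs can be observed at resolutions or locations not used during training, and outputs can be queried at arbitrary resolutions, which improves flexibility and generalization. Experiments on benchmarks with discontinuous inputs, spectral mismatch, nonlinear dynamics, and stochastic high-frequency fields show accurate, robust, and physically coherent predictions.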

Abstract

Neural operators have become an effective framework for learning mappings between function spaces, yet most existing architectures realize operators within a single representational domain, such as physical, spectral, or latent space. In this work, we introduce UFO (Domain-Unification-Free Operator), a cross-domain neural operator framework that realizes operators through adaptive, jointly conditioned interactions among representations defined on distinct domains. UFO enables discretization decoupling: the input function can be observed at resolutions or locations different from those used during training, while the solution can be queried at arbitrary output resolutions. Across four complementary benchmarks covering discontinuous inputs, irregular sampling with spectral mismatch, nonlinear dynamics, and stochastic high-frequency fields, UFO delivers accurate, robust, and physically coherent predictions under distribution shifts. These results establish cross-domain, phase-modulated realization as a powerful framework for discretization-decoupled neural operator learning.

2605.12699 2026-05-14 cs.LG cs.AI

Modeling Heterophily in Multiplex Graphs: An Adaptive Approach for Node Classification

Kamel Abdous, Nairouz Mrabah, Mohamed Bouguessa

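Affiliations * Department of Computer Science, University of Quebec at Montreal

AI summary This paper studies how to model heterophily in multiplex graphs, where connected nodes may belong to different classes and have dissimilar attributes. Most existing methods assume homophily and struggle with multiplex graphs in which homophilic and heterophilic interactions coexist. The authors propose a new method, \methodname, which introduces dimension-specific compatibility matrices and trainable low-pass and high-pass filters to adapt to the heterophilic characteristics of each dimension, enabling more effective node classification. Experiments show that the method achieves better classification performance than existing methods on synthetic and real-world datasets.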

Comments 38 pages, 7 figures, 4 tables, 1 algorithm. Published in Expert Systems with Applications

Journal ref Expert Systems with Applications, Volume 323, 2026, Article 132374

Abstract

Existing multiplex graph models often assume homophily, where connected nodes tend to belong to the same class or share similar attributes. Consequently, these models may struggle with graphs exhibiting heterophily, where connected nodes typically belong to different classes and have dissimilar attributes. While recent methods have been developed to learn reliable node representations from unidimensional graphs with heterophily, they do not fully address the complexities of multiplex graphs. In a multiplex graph, nodes are linked through multiple types of edges (referred to as dimensions), which can simultaneously exhibit homophilic and heterophilic interactions. To address this gap, we propose \methodname, a novel method for node classification in multiplex graphs that adapts to both homophilic and heterophilic dimensions. \methodname introduces dimension-specific compatibility matrices to model varying degrees of homophily and heterophily across dimensions. A key innovation is its use of a product of trainable low-pass and high-pass filters, approximated via Chebyshev polynomials, to capture both smooth and abrupt changes in the graph signal. By composing these filters and optimizing label predictions using a proximal-gradient method, \methodname dynamically adjusts to the heterophilic characteristics of each dimension. Extensive experiments on synthetic and real-world datasets provide evidence that \methodname captures the complex interplay of homophilic and heterophilic interactions in multiplex graphs, and tends to yield improved node classification performance compared to state-of-the-art methods.
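The Chebyshev approximation of low-pass and high-pass graph filters mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the two-node graph, and the fixed coefficients are assumptions chosen for illustration, and the abstract's trainable coefficients, compatibility matrices, and proximal-gradient step are left out. The filter is evaluated as $\sum_k c_k T_k(\tilde{L})x$ with the recurrence $T_k = 2\tilde{L}T_{k-1} - T_{k-2}$, where $\tilde{L}$ is the graph Laplacian rescaled so its eigenvalues lie in $[-1, 1]$.

```python
def matvec(matrix, vector):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def chebyshev_filter(l_tilde, x, coeffs):
    """Return sum_k coeffs[k] * T_k(l_tilde) x via the Chebyshev recurrence."""
    t_prev = list(x)                          # T_0(L~) x = x
    out = [coeffs[0] * v for v in t_prev]
    if len(coeffs) == 1:
        return out
    t_curr = matvec(l_tilde, x)               # T_1(L~) x = L~ x
    out = [o + coeffs[1] * v for o, v in zip(out, t_curr)]
    for c in coeffs[2:]:
        lt = matvec(l_tilde, t_curr)
        t_next = [2.0 * a - b for a, b in zip(lt, t_prev)]  # T_k = 2 L~ T_{k-1} - T_{k-2}
        out = [o + c * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Hypothetical toy graph: two nodes joined by one edge.
# Normalized Laplacian L = I - A, rescaled L~ = L - I = -A (assuming lambda_max = 2).
L_TILDE = [[0.0, -1.0], [-1.0, 0.0]]
LOW_PASS = [0.5, -0.5]    # h(l) = (1 - l) / 2 keeps smooth signals
HIGH_PASS = [0.5, 0.5]    # h(l) = (1 + l) / 2 keeps abrupt changes

smooth = [1.0, 1.0]       # neighbors agree (homophilic pattern)
rough = [1.0, -1.0]       # neighbors disagree (heterophilic pattern)

print(chebyshev_filter(L_TILDE, smooth, LOW_PASS))
print(chebyshev_filter(L_TILDE, rough, HIGH_PASS))
```

In this toy case the low-pass filter passes the smooth signal and removes the rough one, and the high-pass filter does the opposite. A product of the two filters corresponds to applying one after the other.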

2605.12693 2026-05-14 cs.LG

IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback

Benjamin Amoh, Geoffrey G. Parker, Wesley Marrero

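Affiliations * Thayer School of Engineering, Dartmouth College

AI summary This study addresses the challenges of decision-focused learning under delayed feedback and proposes a new algorithm, IGT-OMD, to tackle gradient staleness in bilevel optimization. Using implicit gradient transport within online mirror descent, the method re-evaluates stale gradients with stored inner solutions, reducing the transport error from a quadratic to a linear dependence on delay and achieving the first sublinear regret bound for delayed bilevel optimization with adaptive step sizes. Experiments show that the method significantly reduces decision loss on several tasks, supporting the theoretical analysis.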

Comments 9 pages, 4 figures, NeurIPS 2026 conference

Abstract

Decision-focused learning trains predictive models end-to-end against downstream decision loss, but online settings suffer from delayed feedback: outcomes may not arrive for many environment interactions. We identify staleness amplification, a failure mode unique to bilevel optimization under delay, in which gradient staleness couples with inner-solver sensitivity to inflate regret beyond single-level delay theory. We prove that any black-box delayed optimizer incurs an irreducible regret cost from inner-solver approximation error, and that gradient staleness contributes a quadratically growing transport error without bilevel-aware correction. Our algorithm, IGT-OMD, applies Implicit Gradient Transport to hypergradients within Online Mirror Descent, re-evaluating stale gradients at the current parameters using stored inner solutions. This method reduces transport error from a quadratic to a linear dependence on delay and achieves the first sublinear regret bound for delayed bilevel optimization with queue-length-adaptive step sizes. Controlled experiments provide a mechanistic fingerprint: transport benefit is exactly $0.0\%$ ($p=1.00$) at unit delay and grows monotonically to $9.5\%$ at fifty rounds ($p<0.001$), isolating the correction's effect. On Linear Quadratic Regulator, Warcraft shortest-path, and Sinkhorn optimal transport, IGT-OMD reduces decision loss by $17$--$55\%$ relative to single-level baselines, with phase transitions matching the theory.

2605.12691 2026-05-14 cs.AI

On the Size Complexity and Decidability of First-Order Progression

Jens Classen, Daxin Liu

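Affiliations * Department of People and Technology, Roskilde University, Denmark; State Key Laboratory for Novel Software Technology, Nanjing University, China

AI summary This paper studies the size complexity and decidability of action progression in first-order logic. Working in the Situation Calculus, the authors analyze the size of progression for local-effect, normal, and acyclic actions and prove that, under reasonable assumptions, it grows only polynomially. Moreover, when the knowledge base belongs to a decidable fragment (such as two-variable first-order logic or universal theories with constants), the progression stays within the same fragment, ensuring decidability and practical applicability.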

Comments This is an extended version of an identically-titled paper accepted for publication at IJCAI 2026. This version contains an appendix with further proofs

Abstract

Progression, the task of updating a knowledge base to reflect action effects, generally requires second-order logic. Identifying first-order special cases, by restricting either the knowledge base or action effects, has long been a central topic in reasoning about actions. It is known that local-effect, normal, and acyclic actions, three increasingly expressive classes, admit first-order progression. However, a systematic analysis of the size of such progressions, crucial for practical applications, has been missing. In this paper, using the framework of Situation Calculus, we show that under reasonable assumptions, first-order progression for these action classes grows only polynomially. Moreover, we show that when the KB belongs to decidable fragments such as two-variable first-order logic or universal theories with constants, the progression remains within the same fragment, ensuring decidability and practical applicability.

2605.12685 2026-05-14 cs.LG cs.AI

A Unified Perspective for Learning Graph Representations Across Multi-Level Abstractions

Mohamed Mahmoud Amar, Nairouz Mrabah, Mohamed Bouguessa, Abdoulaye Baniré Diallo

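Affiliations * Department of Computer Science, University of Quebec at Montreal

AI summary This paper proposes a unified contrastive learning framework for learning representations of graph-structured data at multiple abstraction levels: node, proximity, cluster, and graph. Because most existing methods focus on a single abstraction level, the method integrates multi-level information through a linear combination of similarity and dissimilarity scores and introduces a parameter-free, fine-grained self-weighting mechanism that increases optimization flexibility and improves convergence. Experiments show that the method outperforms state-of-the-art approaches on several downstream tasks in both single-level and multi-level scenarios.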

Comments Accepted for publication in IEEE Transactions on Knowledge and Data Engineering (TKDE). 18 pages, 8 figures

Abstract

Graph Self-Supervised Learning (GSSL) has emerged as a powerful paradigm for generating high-quality representations for graph-structured data. While multi-scale graph contrastive learning has received increasing attention, many existing methods still predominantly focus on a single graph abstraction level. To address this limitation, we propose a unified contrastive framework that can target node-level, proximity-level, cluster-level, and graph-level information and integrate them through a linear combination of similarity scores on positive pairs and dissimilarity scores (i.e., similarity scores on negative pairs). Furthermore, current approaches typically assign uniform penalty strengths to all examples, which reduces optimization flexibility and leads to ambiguous convergence status. To overcome this, we introduce a novel parameter-free fine-grained self-weighting mechanism that adaptively assigns weights to individual similarity and dissimilarity scores. The proposed mechanism emphasizes the scores that deviate significantly from their target values. Our approach not only enhances optimization flexibility but also eliminates the computational overhead of hyperparameter tuning in conventional multi-task GSSL methods. Comprehensive experiments on real-world datasets show that our methods consistently outperform state-of-the-art approaches across downstream tasks, including classification, clustering, and link prediction, in both single-level and multi-level scenarios.

2605.12684 2026-05-14 cs.CV cs.AI cs.HC

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Yichen Feng, Yuetai Li, Chunjiang Liu, Yuanyuan Chen, Fengqing Jiang, Yue Huang, Hang Hua, Zhengqing Yuan, Kaiyuan Zheng, Luyao Niu, Bhaskar Ramasubramanian, Basel Alomair, Xiangliang Zhang, Misha Sra, Zichen Chen, Radha Poovendran, Zhangchen Xu

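Affiliations * Bake AI; University of Washington; University of California, Santa Barbara; Stanford University; University of Notre Dame; Carnegie Mellon University; MIT-IBM Watson AI Lab; Western Washington University; King Abdulaziz City for Science and Technology

AI summary This study examines how well frontier multimodal large language models judge visual aesthetics and finds that current models fall markedly short. It introduces the Visual Aesthetic Benchmark (VAB), which evaluates models on expert-annotated comparative tasks, and finds that even the best model is far worse than human experts at identifying the best and worst images. The study also shows that fine-tuning a model on a small number of expert examples substantially improves its performance, underscoring the value of VAB for advancing aesthetic-judgment models.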

Comments Project page: https://vab.bakelab.ai. Code: https://github.com/BakeLab/Visual-Aesthetic-Benchmark. Dataset: https://huggingface.co/datasets/BakeLab/Visual-Aesthetic-Benchmark

Abstract

Multimodal large language models (MLLMs) are now routinely deployed for visual understanding, generation, and curation. A substantial fraction of these applications require an explicit aesthetic judgment. Most existing solutions reduce this judgment to predicting a scalar score for a single image. We first ask whether such scores faithfully capture comparative preference: in a controlled study with eight expert annotators, score-derived rankings align poorly with the same annotators' direct comparisons, while direct ranking yields substantially higher inter-annotator agreement on best- and worst-image labels. Motivated by this finding, we introduce the Visual Aesthetic Benchmark (VAB), which casts aesthetic evaluation as comparative selection over candidate sets with matched subject matter. VAB contains 400 tasks and 1,195 images across fine art, photography, and illustration, with labels derived from the consensus of 10 independent expert judges per task. Evaluating 20 frontier MLLMs and six dedicated visual-quality reward models, we find that the strongest system identifies both the best and the worst image correctly across three random permutations of the candidate order in only 26.5% of tasks, far below the 68.9% achieved by human experts. Fine-tuning a 35B-parameter model on 2,000 expert examples brings its accuracy close to that of a 397B-parameter open-weight model, suggesting that the comparative signal in VAB is transferable. Together, these results expose a clear and measurable gap between current multimodal models and expert aesthetic judgment, and VAB provides the first set-based, expert-grounded testbed on which that gap can be tracked and closed.
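The strict scoring rule described above, where a task counts only if both the best and the worst image are identified under every permutation of the candidate order, can be sketched as follows. The data layout, field names, and example tasks are hypothetical and are not taken from the VAB release.

```python
def task_is_correct(gold_best, gold_worst, picks):
    """True only if every permutation's (best, worst) pick matches the expert labels."""
    return all(best == gold_best and worst == gold_worst for best, worst in picks)


def strict_accuracy(tasks):
    """Fraction of tasks solved under all candidate-order permutations."""
    solved = sum(task_is_correct(t["best"], t["worst"], t["picks"]) for t in tasks)
    return solved / len(tasks)


# Hypothetical model outputs: one (best, worst) pick per permutation, three per task.
example_tasks = [
    {"best": "A", "worst": "C", "picks": [("A", "C"), ("A", "C"), ("A", "C")]},
    {"best": "B", "worst": "D", "picks": [("B", "D"), ("B", "A"), ("B", "D")]},
]

print(strict_accuracy(example_tasks))  # 0.5: the second task misses the worst image once
```

Requiring agreement across permutations penalizes answers that depend on candidate position rather than image content, which is one reason such a score can sit well below per-permutation accuracy.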

2605.12683 2026-05-14 cs.LG cs.AI cs.DC physics.comp-ph

Parallel-in-Time Training of Recurrent Neural Networks for Dynamical Systems Reconstruction

Florian Hess, Florian Götz, Daniel Durstewitz

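Affiliations * Dept. of Theoretical Neuroscience, Central Institute of Mental Health, Mannheim, Germany; Faculty of Physics and Astronomy, Heidelberg University, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Germany; Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Germany

AI summary This paper studies how parallel-in-time methods can make training recurrent neural networks for dynamical systems reconstruction more efficient. The authors examine two classes of algorithms based on parallel associative scans, one for models with linear non-autonomous dynamics and one for general nonlinear models, and find that the former is limited during training and struggles to learn nonlinear dynamics accurately. They therefore bring Generalized Teacher Forcing (GTF) into the DEER framework, which improves learning on long sequences. Experiments show that long trajectories significantly improve the reconstruction of dynamical systems with long time scales.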

Comments 29 pages, 6 figures, preprint

Abstract

Reconstructing nonlinear dynamical systems (DS) from data (DSR) is a fundamental challenge in science and engineering, but it inherently relies on sequential models. Recent breakthroughs for sequential models have produced algorithms that parallelize computation along sequence length $T$, achieving logarithmic time complexity, $\mathcal{O}(\log T)$. Since sequence lengths have been practically limited due to the linear runtime complexity $\mathcal{O}(T)$ of classical backpropagation through time, this opens new avenues for DSR. This paper studies two prominent classes of parallel-in-time algorithms for this task, both of which leverage parallel associative scans as their core computational primitive. The first class comprises models with linear yet non-autonomous dynamics and a nonlinear readout, such as modern State Space Models (SSMs), while the second consists of general nonlinear models which can be parallelized using the DEER framework. We find that the linear training-time recurrence of the first class of models imposes limitations that often hinder learning of accurate nonlinear dynamics. To address this, we augment DEER with Generalized Teacher Forcing (GTF), a novel variant within the more general nonlinear framework that ensures stable and effective learning of nonlinear dynamics across arbitrary sequence lengths. Using GTF-DEER, we investigate the benefits of training on extremely long sequences ($T>10^4$) for DSR. Our results show that access to such long trajectories significantly improves DSR if the data features long time scales. This work establishes GTF-DEER as a robust tool for data-driven discovery and underscores the largely untapped potential of long-sequence learning in modeling complex DS.
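The parallel associative scan that both algorithm classes rely on can be illustrated on a scalar linear recurrence $h_t = a_t h_{t-1} + b_t$. The sketch below is an assumption-laden toy, not GTF-DEER or any SSM implementation: it uses plain Python lists and a recursive-doubling (Hillis-Steele) schedule, and all function names are made up for illustration. The key point is that composing affine maps is associative, so prefixes can be combined in about $\log_2 T$ rounds instead of $T$ sequential steps.

```python
def combine(first, second):
    """Compose affine maps h -> a*h + b, where `first` is applied before `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a1 * a2, a2 * b1 + b2)


def scan_sequential(a, b, h0=0.0):
    """Reference O(T) loop for h_t = a_t * h_{t-1} + b_t."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def scan_doubling(a, b, h0=0.0):
    """Inclusive prefix scan in ceil(log2(T)) rounds (Hillis-Steele scheme)."""
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        # Each position reads only values from the previous round, so all
        # updates within a round are independent and could run in parallel.
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] now holds the affine map taking h0 directly to h_t.
    return [a_t * h0 + b_t for a_t, b_t in elems]


a = [2.0, 3.0, 1.0, 2.0, 1.0]
b = [1.0, 0.0, 4.0, 1.0, 2.0]
print(scan_sequential(a, b, h0=1.0))
print(scan_doubling(a, b, h0=1.0))
```

Both calls produce the same trajectory. On accelerators the same operator can be handed to a library primitive such as `jax.lax.associative_scan` rather than a hand-written schedule.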

2605.12682 2026-05-14 cs.AI

Learning Transferable Latent User Preferences for Human-Aligned Decision Making

Alina Hyk, Sandhya Saisubramanian

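Affiliations * Oregon State University

AI summary This study addresses a challenge large language models face in producing human-aligned decisions: learning transferable latent user preferences from limited interaction. The authors propose CLIPR, a framework that learns actionable natural-language rules representing a user's latent preferences from a small amount of conversational input and refines these rules through adaptive feedback. Experiments show that CLIPR improves decision alignment and reduces inference cost across multiple tasks and environments.

Abstract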

Large language models (LLMs) are increasingly used as reasoning modules in many applications. While they are efficient in certain tasks, LLMs often struggle to produce human-aligned solutions. Human-aligned decision making requires accounting for both explicitly stated goals and latent user preferences that shape how ambiguous situations should be resolved. Existing approaches to incorporating such preferences either rely on extensive and repeated user interactions or fail to generalize latent preferences across tasks and contexts, limiting their practical applicability. We consider a setting in which an LLM is used for high-level reasoning and is responsible for inferring latent user preferences from limited interactions, which guides downstream decision making. We introduce CLIPR (Conversational Learning for Inferring Preferences and Reasoning), a framework that learns actionable, transferable natural language rules that represent latent user preferences from minimal conversational input. These rules are iteratively refined through adaptive feedback and applied to both in-distribution and out-of-distribution ambiguous tasks across multiple environments. Evaluations on three datasets and a user study show that CLIPR consistently outperforms existing methods in improving alignment and reducing inference costs.

2605.12674 2026-05-14 cs.AI cs.LG cs.RO

Revealing Interpretable Failure Modes of VLMs

Isha Chaudhary, Vedaant V Jain, Kavya Sachdeva, Sayan Ranu, Gagandeep Singh

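Affiliations * UIUC; Kumo AI; IIT Delhi

AI summary This paper proposes REVELIO, a framework for systematically uncovering interpretable failure modes in vision-language models (VLMs). It combines a diversity-aware beam search with a Gaussian-process Thompson Sampling strategy to efficiently explore the combinatorial space of conditions under which a VLM fails. Experiments in autonomous driving and indoor robotics reveal latent vulnerabilities in existing VLMs and offer structured, interpretable directions for improving model safety.

Abstract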

Vision-Language Models (VLMs) are increasingly used in safety-critical applications because of their broad reasoning capabilities and ability to generalize with minimal task-specific engineering. Despite these advantages, they can exhibit catastrophic failures in specific real-world situations, constituting failure modes. We introduce REVELIO, a framework for systematically uncovering interpretable failure modes in VLMs. We define a failure mode as a composition of interpretable, domain-relevant concepts (such as pedestrian proximity or adverse weather conditions) under which a target VLM consistently behaves incorrectly. Identifying such failures requires searching over an exponentially large discrete combinatorial space. To address this challenge, REVELIO combines two search procedures: a diversity-aware beam search that efficiently maps the failure landscape, and a Gaussian-process Thompson Sampling strategy that enables broader exploration of complex failure modes. We apply REVELIO to autonomous driving and indoor robotics domains, uncovering previously unreported vulnerabilities in state-of-the-art VLMs. In driving environments, the models often demonstrate weak spatial grounding and fail to account for major obstructions, leading to recommendations that would result in simulated crashes. In indoor robotics tasks, VLMs either miss safety hazards or behave excessively conservatively, producing false alarms and reducing operational efficiency. By identifying structured and interpretable failure modes, REVELIO offers actionable insights that can support targeted VLM safety improvements.

2605.12673 2026-05-14 cs.AI cs.CR

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Hao Wang, Hanchen Li, Qiuyang Mang, Alvin Cheung, Koushik Sen, Dawn Song

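Affiliations * UC Berkeley

AI summary This paper studies reward hacking in AI agent benchmarks, where agents maximize the score through unintended means rather than completing the task. The authors propose BenchJack, a system that audits benchmarks through automated red teaming to identify potential reward-hacking exploits. They also build an iterative generative-adversarial pipeline that repeatedly discovers and patches new flaws, markedly improving benchmark robustness. Experiments show that BenchJack finds many flaws in several popular benchmarks and substantially lowers the fraction of hackable tasks.

Abstract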

Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward hacking, where agents maximize a score without performing the intended task, emerges spontaneously in frontier models without overfitting. We argue that benchmarks must be secure by design. From past incidents of reward hacks, we derive a taxonomy of eight recurring flaw patterns and compile them into the Agent-Eval Checklist for benchmark designers. We condense the insights into BenchJack, an automated red-teaming system that drives coding agents to audit benchmarks and identify possible reward-hacking exploits in a clairvoyant manner. Moreover, we extend BenchJack to an iterative generative-adversarial pipeline that discovers new flaws and patches them iteratively to improve benchmark robustness. We apply BenchJack to 10 popular agent benchmarks spanning software engineering, web navigation, desktop computing, and terminal operations. BenchJack synthesizes reward-hacking exploits that achieve near-perfect scores on most of the benchmarks without solving a single task, surfacing 219 distinct flaws across the eight classes. Moreover, BenchJack's extended pipeline reduces the hackable-task ratio from near 100% to under 10% on four benchmarks without fatal design flaws, fully patching WebArena and OSWorld within three iterations. Our results show that evaluation pipelines have not internalized an adversarial mindset, and that proactive auditing could help close the security gap for the fast-paced benchmarking space.

2605.12671 2026-05-14 cs.CL

All Circuits Lead to Rome: Rethinking Functional Anisotropy in Circuit and Sheaf Discovery for LLMs

Xi Chen, Mingyu Jin, Jingcheng Niu, Yutong Yin, Jinman Zhao, Bangwei Guo, Dimitris N. Metaxas, Zhaoran Wang, Yutao Yue, Gerald Penn

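Affiliations * Rutgers University; Northwestern University; The Hong Kong University of Science and Technology (Guangzhou); University of Toronto

AI summary This paper challenges a core assumption in circuit and sheaf discovery (CSD) for large language models (LLMs), the Functional Anisotropy Hypothesis, which holds that a model function is implemented by a unique or near-unique internal mechanism. Empirical and theoretical analysis shows that a single task can be supported by multiple structurally distinct circuits or sheaves that are simultaneously faithful, sparse, and complete. The authors propose Overlap-Aware Sheaf Repulsion, which uncovers alternative mechanisms with strong performance but markedly different structure, and a Distributive Dense Circuit Hypothesis that explains why non-unique, low-overlap circuit explanations arise naturally under high-dimensional superposition.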

Comments ICML 2026

Abstract

In this paper, we present empirical and theoretical evidence against a central but largely implicit assumption in circuit and sheaf discovery (CSD), which we term the Functional Anisotropy Hypothesis: the idea that functions in large language models (LLMs) are localised to a unique or near-unique internal mechanism. We show that a single LLM task can instead be supported by multiple, structurally distinct circuits or sheaves that are simultaneously faithful, sparse, and complete. To systematically uncover such competing mechanisms, we introduce Overlap-Aware Sheaf Repulsion, a method that augments the CSD objective with an explicit penalty on structural overlap across multiple discovery runs, enabling the discovery of circuits or sheaves with strong task performance but minimal shared structure across a plethora of common CSD benchmarks. We find that this phenomenon becomes increasingly pronounced as the number of discovered sheaves grows and persists robustly across major CSD methods. We further identify an ultra-sparse three-edge sheaf and show that none of its edges is individually indispensable, undermining even weakened notions of canonical or essential components. To explain these findings, we propose a Distributive Dense Circuit Hypothesis and provide a theoretical analysis demonstrating that non-unique, low-overlap circuit explanations arise naturally from high-dimensional superposition under mild assumptions. Together, our results suggest that mechanistic explanations in LLMs are inherently non-canonical and call for a rethinking of how CSD results should be interpreted and evaluated.

2605.12662 2026-05-14 cs.LG q-bio.GN

scShapeBench: Discovering geometry from high dimensional scRNAseq data

Andrew J Steindl, João Felipe Rocha, Brian Tshilengi Di Bassinga, Zachary Warren, Matthew Scicluna, César Miguel Valdez Córdova, Shabarni Gupta, Leire Torices, Daniel Neumann, Timothy J. Mann, Ihuan Gunawan, Dhananjay Bhaskar, John G Lock, Christine L Chaffer, Guy Wolf, Smita Krishnaswamy

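Affiliations * Yale University; Mila / Université de Montréal; Garvan Institute of Medical Research; School of Biomedical Sciences, University of New South Wales; University of Wisconsin–Madison

AI summary scShapeBench is a benchmark dataset for shape detection in single-cell transcriptomic data. It aims to automatically identify geometric structure in the data, such as clusters, trajectories, and archetypes, to help select a suitable downstream analysis pipeline. The study introduces scReebTower, a method that extracts Reeb graphs using diffusion geometry to connect visualization with pipeline selection, and provides topology-aware evaluation metrics. Experiments indicate that scReebTower outperforms existing methods on synthetic and real data, offering a tool for automated analysis of single-cell data.

Abstract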

High-dimensional point cloud data arise across many scientific domains, especially single-cell biology. The shapes or topologies of these datasets determine the types of information that can be extracted. For example, clustered data supports cell-type identification, trajectory structures support transition analysis, and archetypal structures capture continua of cellular behaviors. Existing analysis pipelines often assume a specific shape. The standard Seurat pipeline combines UMAP visualization with Louvain clustering and therefore assumes clustered data, while tools such as Monocle and SPADE assume tree-like structures, and flow-based models such as MIOFlow and Conditional Flow Matching target trajectories. Choosing which pipeline to apply is therefore often left to bioinformaticians who visually inspect datasets before selecting an analysis strategy. With the rise of agentic AI scientists, automating shape detection is increasingly important for selecting downstream analysis pipelines. To address this problem, we introduce scShapeBench, a benchmark dataset for shape detection containing both synthetic and expert-annotated single-cell datasets. Synthetic datasets are sampled from ground-truth skeleton graphs with controlled variance. Real single-cell datasets are curated from diverse sources and annotated by experts into four categories: clusters, single trajectory, multi-branching, and archetypal. We additionally introduce scReebTower, a baseline method that uses diffusion geometry to extract Reeb graphs and connect visualization with pipeline selection. We provide topology-aware evaluation metrics and compare scReebTower against PAGA and Mapper on synthetic and real data. Our results indicate that scReebTower outperforms existing baselines. Overall, our contributions span benchmarks, evaluation metrics, and a baseline for automated shape detection in single-cell data.

2605.12654 2026-05-14 cs.RO

COSMIC: Concurrent Optimization of Structure, Material, and Integrated Control for robotic systems

Qinsong Guo, Liwei Wang

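Affiliations * Dept. of Mechanical Engineering

AI summary This paper proposes COSMIC, a gradient-based co-design framework that simultaneously optimizes a robot's structure, materials, and control policy to outperform conventional separated design. The framework embeds mixed-type topological and material variables into a continuous design space and integrates a neural network controller within a differentiable simulator, which captures the interactions among structure, material, and control and enables efficient gradient computation. The study demonstrates the method's effectiveness in discovering diverse locomotion strategies and accommodating different functional requirements, and reveals the individual and collective effects of the design entities on robot performance.

Abstract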

Replicating and surpassing the autonomy of natural organisms remains a long-standing goal in robotics. Yet most robotic systems have their structure, materials, and control designed separately, in sharp contrast to the co-evolution in nature. This separation often leads to suboptimal designs, and we still have a limited understanding of the individual and collective contributions of these design entities. In this work, we propose a gradient-based co-design framework that simultaneously optimizes the topology, material distribution, and control policy of a truss-lattice robot. The framework embeds mixed-type topological and material variables into a continuous design space and integrates a neural network controller within a differentiable simulator, capturing their interactions and enabling efficient gradient calculation via automatic differentiation. Furthermore, we develop a constrained optimization to navigate the highly non-convex design landscape and jointly optimize all design entities. Case studies demonstrate that the proposed framework consistently discovers diverse locomotion strategies that outperform baselines obtained through separated design. The framework is also flexible to accommodate different functional requirements and boundary conditions. Using this framework, we further extract design insights that reveal the individual and collective effects of different entities on robotic performance. The proposed framework provides a computational foundation for the autonomous co-design of robotic systems, capable of reconfiguration, locomotion, and other complex autonomous behaviors.

2605.12653 2026-05-14 cs.LG cs.AI stat.ML

Plan Before You Trade: Inference-Time Optimization for RL Trading Agents

Eun Go, Rohan Deb, Arindam Banerjee

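Affiliations * Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign

AI summary This paper proposes FPILOT, an inference-time optimization framework that improves reinforcement learning for portfolio management. Inspired by Model Predictive Control, the method uses price forecasts to optimize the trading policy at inference time rather than deploying the trained policy as a static one. Without retraining, FPILOT uses a forecasting model to produce a multi-step price trajectory and optimizes the allocation at each step accordingly, which consistently improves total return and return-based risk-adjusted metrics.

Abstract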

Reinforcement learning agents for portfolio management are typically trained and deployed as static policies, with no mechanism for using price forecasts at inference time. We propose FPILOT (**Fin**ancial **P**lugin **I**nference-time **L**earning for **O**ptimal **T**rading), a plugin inference-time optimization framework inspired by Model Predictive Control (MPC). Our key structural insight is that future prices mostly do not depend on one agent's portfolio allocation, so a suitable predictive model can produce a multi-step price trajectory without iterative action-conditioned rollouts as in typical reinforcement learning. At each decision step, we use the forecaster's predicted price trajectory to construct an allocation-based imagined return objective, and optimize the policy at inference time before executing one step of the trade. Our framework is compatible with any pre-trained agent and adapts the policy to the forecaster's predictions without any retraining. Evaluated across five policy learning algorithms on the TradeMaster DJ30 benchmark, FPILOT produces consistent improvements in total return and return-based risk-adjusted metrics (Sharpe, Sortino, Calmar), with stochastic policies benefiting more than deterministic ones. Further, using synthetic forecasts at calibrated quality levels, we show that gains consistently improve with forecaster quality, suggesting that performance will improve further with advances in financial forecasting.
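The idea of scoring an allocation against a forecast price trajectory, without action-conditioned rollouts, can be sketched as below. This is only a toy under stated assumptions: the function names and price paths are hypothetical, the allocation is held at fixed fractions over the horizon, and a search over a few candidate allocations stands in for the policy optimization described in the abstract.

```python
def imagined_return(price_paths, weights):
    """Cumulative return of holding fixed `weights` along forecast price paths.

    price_paths[i] is the forecast price sequence of asset i, starting at the
    current price. Weights are portfolio fractions that sum to 1 and are
    rebalanced to the same fractions at every step.
    """
    horizon = len(price_paths[0]) - 1
    value = 1.0
    for t in range(horizon):
        step_return = sum(
            w * (path[t + 1] / path[t] - 1.0)
            for w, path in zip(weights, price_paths)
        )
        value *= 1.0 + step_return
    return value - 1.0


def best_allocation(price_paths, candidates):
    """Pick the candidate with the highest imagined return (stand-in for optimization)."""
    return max(candidates, key=lambda w: imagined_return(price_paths, w))


# Hypothetical two-asset forecast over two steps: asset 0 doubles each step, asset 1 is flat.
forecast = [[100.0, 200.0, 400.0], [50.0, 50.0, 50.0]]
candidates = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

print(imagined_return(forecast, [0.5, 0.5]))
print(best_allocation(forecast, candidates))
```

Because the forecast prices do not depend on the chosen weights, the whole trajectory is computed once and reused for every candidate. In an MPC-style loop only the first step of the chosen allocation would be executed before re-forecasting.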