arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.09738 2026-06-09 cs.CV New submission

HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

Letian Li, Chao Shen, Shuzhao Xie, Chenghao Gu, ZhengXiao He, Yu Meng, Xin Yang, Wenyuan Jiang, Zhi Wang

Affiliations: SIGS, Tsinghua University; Nankai University; University of Arizona; Zhejiang University; ETH Zurich

AI summary: Proposes the HDSL language, which represents indoor scenes as trees and combines LLM-agent generation, multimodal retrieval, and force-directed layout optimization to achieve structured scene generation and localized editing, significantly improving object coverage and editing efficiency.

Abstract

Text-driven indoor scene generation and editing require an intermediate representation that language models can both produce and revise. Existing LLM-based systems often rely on scene graphs or global constraint lists, which are compact but underspecify local geometry and make instruction-based edits difficult to localize. We frame this problem as structured program generation and local program repair, and propose Hierarchical Descriptive Scene Language (HDSL), an XML/CSS-style domain-specific language for structured 3D indoor scenes. HDSL represents rooms, regions, objects, and support surfaces as a tree with local coordinates, making complex scenes easier to plan recursively and easier to retrieve for editing. Our pipeline uses LLM agents to generate HDSL subtrees with bounded verification, grounds non-virtual nodes through multimodal asset retrieval, and applies force-directed layout optimization to repair boundary and collision errors. For editing, Hierarchical Retrieval-Augmented Generation (HRAG) retrieves the relevant subtree, asks the LLM to rewrite only that local context, and merges the result back through a deterministic three-way merge. In our reproduced benchmark, HDSL improves average object coverage, text-scene alignment, and generation time over full text-to-scene baselines while remaining competitive with recent layout-only reproductions on geometry metrics; for editing, HRAG reduces token use by $5.22\times$ and runtime by $6.19\times$, produces valid DSL for all eight paired edits, and better preserves unrelated scene objects.
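
The tree-with-local-coordinates idea can be illustrated with a minimal sketch. The tag names (`room`, `region`, `object`), the `x`/`y` attributes, and the helper `world_positions` are hypothetical and are not taken from the HDSL specification; the sketch only shows how positions stored relative to a parent resolve to room coordinates, which is what lets an edit to one subtree leave its siblings untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-style scene: every x/y is relative to the parent node.
SCENE = """
<room id="living" x="0" y="0">
  <region id="reading_corner" x="3.0" y="1.0">
    <object id="armchair" x="0.5" y="0.2"/>
    <object id="lamp" x="1.2" y="0.0"/>
  </region>
  <region id="tv_area" x="0.5" y="4.0">
    <object id="sofa" x="1.0" y="0.5"/>
  </region>
</room>
"""

def world_positions(node, origin=(0.0, 0.0), out=None):
    """Accumulate parent offsets so each node gets absolute coordinates."""
    if out is None:
        out = {}
    x = origin[0] + float(node.get("x", 0.0))
    y = origin[1] + float(node.get("y", 0.0))
    out[node.get("id")] = (x, y)
    for child in node:
        world_positions(child, (x, y), out)
    return out

root = ET.fromstring(SCENE)
before = world_positions(root)

# A localized edit: move only the reading corner; tv_area is not rewritten.
root.find("region[@id='reading_corner']").set("x", "2.0")
after = world_positions(root)
```

In this toy layout, changing one attribute on `reading_corner` shifts the armchair and the lamp together, while `sofa` keeps its absolute position because nothing in its subtree was rewritten.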

2606.09735 2026-06-09 cs.CL New submission

The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

Wendy K. Tam

Affiliations: Vanderbilt University; University of Illinois at Urbana-Champaign; National Center for Supercomputing Applications

AI summary: Studies the effect of RLHF on the partisan leaning of Llama 3.1 8B and finds that RLHF only compresses the variance of the partisan signal to produce neutral output rather than removing the partisan structure, and that feature-level steering can bypass the alignment.

Abstract

The ambition behind alignment training is to make large language models safe and useful. The primary mechanism, reinforcement learning from human feedback (RLHF), shapes the behavior of deployed language models by aligning them with "human values." Yet the process is opaque. What values are being encoded? Whose values are they? And how does RLHF encode them? A growing body of evidence suggests that RLHF produces only functional compliance rather than deep alignment. We offer a mechanistic case study of this phenomenon for partisan political orientation with a comparison of the internal representations of Llama 3.1 8B before and after RLHF. We show that RLHF does not remove the structured partisan direction in the base model. Instead, it compresses the variance of the partisan signal to generate consistently balanced and non-partisan output. Sparse autoencoder decomposition reveals that policy-encoding features, which activate sporadically in the base model, are completely inactive in the Instruct model. Feature-level steering experiments confirm the causal disconnect. RLHF thus encodes a norm of political neutrality, not by erasing the model's knowledge of partisanship, but by severing the causal pathway from partisan geometry to output generation. Importantly, this neutrality is functional, not structural, so the underlying geometry that enables partisan steering remains intact. The mechanisms that bypass RLHF's guardrails, such as inferring and amplifying a user's partisan identity, reactivate partisan generation. If RLHF operates by disconnecting rather than removing value-laden structure, then the same pattern may hold for other value domains, and the aligned model's behavior may be more fragile than its outputs suggest.

2606.09731 2026-06-09 cs.LG New submission

Tight Sample Complexity of Transformers

Chenxiao Yang, Nathan Srebro, Zhiyuan Li

Affiliations: Toyota Technological Institute at Chicago

AI summary: Characterizes the VC dimension of Transformers with depth L and W total parameters and establishes upper and lower sample-complexity bounds for chain-of-thought learning, showing how the parameter count and sequence length affect the number of samples needed for learning.

Comments: In COLT 2026

Abstract

We tightly characterize the VC dimension of depth-$L$ Transformers with a total of $W$ parameters, mapping an input sequence of length $T$ to a single output, establishing an upper bound of $O(L W \log (T W))$ and a nearly matching lower bound of $\Omega(L W \log (T W / L))$. We further tightly characterize the sample complexity of chain-of-thought learning using such a Transformer, showing teacher forcing (i.e., selecting a predictor consistent with the entire chain-of-thought on training data) learns with sample complexity $O\left(L W \log \left(\left(T+T^{\prime}\right) W\right)\right)$ and that any learning rule that uses chain-of-thought data requires at least $\Omega\left(L W \log \left(\left(T+T^{\prime}\right) W / L\right)\right)$ examples, where $T$ is the input length and $T^{\prime}$ is the number of autoregressive steps.
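
To see why the two VC-dimension bounds are described as nearly matching, one can take their ratio directly. This is a worked comparison, not a result quoted from the paper; it ignores the constants hidden by $O$ and $\Omega$ and assumes $1 < L < TW$ so that both logarithms are positive.

```latex
\[
\frac{L W \log(TW)}{L W \log(TW/L)}
= \frac{\log(TW)}{\log(TW) - \log L}
= \frac{1}{1 - \dfrac{\log L}{\log(TW)}} .
\]
```

The gap is therefore a constant factor whenever $\log L \le c \log(TW)$ for some fixed $c < 1$. For example, if $L \le \sqrt{TW}$ then $c = 1/2$ and the ratio is at most 2. The same calculation applies to the chain-of-thought bounds with $T$ replaced by $T + T^{\prime}$.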

2606.09730 2026-06-09 cs.AI New submission

SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Pu Ning, Quan Chen, Kun Tao, Xinyu Tang, Tianshu Wang, Qianggang Cao, Xinyu Kong, Zujie Wen, Zhiqiang Zhang, Jun Zhou

Affiliations: Tsinghua University; Peking University; Ant Group; Gaoling School of Artificial Intelligence, Renmin University of China

AI summary: Proposes the SearchSwarm framework, which uses supervised fine-tuning to internalize task decomposition and delegation decisions into model weights, achieving the best performance among models of comparable scale on BrowseComp and BrowseComp-ZH.

Abstract

Large language models are increasingly expected to handle complex, long-horizon real-world tasks whose context demands can grow without bound, yet model context windows remain inherently finite. Recent work explores a paradigm where a main agent decomposes tasks and dispatches subtasks to subagents, which execute and return only summarized results, conserving the main agent's context budget. However, performing this well requires delegation intelligence: the ability to decompose complex tasks, determine when and what to delegate, and integrate returned results into the ongoing workflow. Training data for this capability is scarce in naturally occurring text, and to our knowledge, how to synthesize such data and train models to acquire this capability remains largely unexplored in the open-source community. To bridge this gap, we present a preliminary exploration targeting deep research, a representative long-horizon agent task. Specifically, we design a harness that guides the model toward high-quality task decomposition and delegation, while constraining subagents to return results properly to support the main agent's workflow. The harness-guided trajectories naturally encode correct delegation decisions, which we use as supervised fine-tuning data to internalize delegation intelligence into model weights. Our resulting model, SearchSwarm-30B-A3B, achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH, the best results among all models of comparable scale. We will release our harness, model weights, and training data to facilitate future research.

2606.09725 2026-06-09 cs.LG New submission

Disentanglement with Holographic Reduced Representations

Jhonny J. Velasquez Olivera, Christo K. Thomas, Walid Saad

Affiliations: Virginia Tech; Worcester Polytechnic Institute

AI summary: Proposes an unsupervised disentanglement algorithm based on holographic reduced representations (HRR), uses the HRR unbinding operation as an inductive bias for separating factors of variation in data, and proves through information-theoretic analysis that unbinding induces approximately independent symbol-value pairs.

Abstract

Disentanglement, the separation of factors of variation in data using neural networks, remains a long-standing challenge in machine learning. Prior work has addressed this problem with variational autoencoders and generative adversarial networks that incorporate ideas from variational inference and information-theoretic constraints. In contrast to methods that rely on continuous representations, we propose a design that treats disentangled representations as symbolic structures, motivated by the compositional relationships among the concepts that make up samples from a distribution. However, learning discrete symbolic structures with neural networks while maintaining differentiability is difficult and often requires complex architectures. To address this, we introduce an unsupervised learning algorithm that uses holographic reduced representations (HRR) for neural disentanglement. We show that the HRR unbinding operation provides an inductive bias for separating factors and yields competitive results against baselines, as measured by latent traversals and disentanglement metrics. We complement these empirical findings with an information-theoretic analysis of the HRR unbinding channel. We prove that unbinding induces approximately independent symbol-value pairs and derive a per-slot capacity bound that quantifies how many distinct symbolic concepts can be reliably encoded, giving a quantitative account of the inductive bias toward disentanglement. The resulting representations differ from standard autoencoder-based models, in that their latent units are vectors that are summed together, rather than scalar dimensions of a low-dimensional latent vector. We show that this HRR representation is more robust to noise than other disentangled representations and maintains reconstruction quality across a range of SNRs.
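
The binding and unbinding operations that the abstract relies on can be sketched in a few lines. This is the textbook HRR construction (circular convolution to bind, circular correlation to unbind), not the paper's training algorithm; the dimensionality, the number of slots, and the random vectors are illustrative assumptions.

```python
import numpy as np

def bind(a, b):
    """HRR binding: circular convolution, computed via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def unbind(trace, key):
    """HRR unbinding: circular correlation of the key with the trace."""
    return np.fft.irfft(np.conj(np.fft.rfft(key)) * np.fft.rfft(trace),
                        n=len(trace))

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
d = 1024  # dimensionality (illustrative choice)
# Random vectors with element variance 1/d, a common HRR initialisation.
keys = rng.normal(0.0, 1.0 / np.sqrt(d), size=(3, d))
values = rng.normal(0.0, 1.0 / np.sqrt(d), size=(3, d))

# One trace holds all three key-value pairs as a sum of bound vectors,
# matching the abstract's "vectors that are summed together".
trace = sum(bind(k, v) for k, v in zip(keys, values))

# Unbinding with key 0 gives a noisy copy of value 0.
decoded = unbind(trace, keys[0])
sims = [cosine(decoded, v) for v in values]
best = int(np.argmax(sims))
```

The decoded vector is only approximately equal to the stored value, since the other bound pairs act as noise. That noise grows with the number of summed slots, which is the kind of trade-off a per-slot capacity bound quantifies.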

2606.09724 2026-06-09 cs.AI New submission

Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

Hudson de Martim

Affiliations: Federal Senate of Brazil

AI summary: Argues that RAG failures in legal AI stem from an architectural mismatch between probabilistic retrieval and the hierarchical, temporal, and institutional structure of legal knowledge, identifies three pathologies (mereological blindness, diachronic blindness, and causal opacity), and derives four architectural commitments for deterministic-by-design retrieval.

Abstract

Retrieval-Augmented Generation (RAG) has become a standard architectural response to unreliability in legal AI, yet high-profile failures, including fabricated citations submitted to courts and anachronistic legal content presented as current, continue to appear across jurisdictions. We argue that these failures are not residual confabulations to be eliminated by scaling language models, but symptoms of an architectural mismatch between probabilistic retrieval and the hierarchical, temporal, and institutional structure of legal knowledge. We develop the argument in three moves. First, we articulate the ontological commitment of legal knowledge as a triad of properties derivable from classical legal theory: hierarchical and mereological structure, diachronic dynamism under operational closure, and causal traceability of institutional provenance grounded in the duty of justification. Second, we identify three corresponding pathologies of retrieval (mereological blindness, diachronic blindness, and causal opacity), each developed with an operational definition, a failure mechanism, a canonical example, and detection criteria for diagnostic use. Third, we review the state of the art through this lens, showing that existing approaches address these requirements unevenly and do not yet compose into a paradigm that treats them as co-constitutive. From this analysis we derive four architectural commitments that characterize the deterministic-by-design direction for legal retrieval: ontological primacy, event reification, bitemporal correctness, and deterministic interaction protocols. The framework concerns quaestio juris (which norms apply and in what state) rather than the downstream tasks that act on identified norms, and addresses legislative and constitutional retrieval primarily, with interpretive time as an explicit extension.
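
Bitemporal correctness, one of the four commitments, can be made concrete with a small sketch. The record layout, the sample data, and the `norm_text_as_of` helper are hypothetical illustrations rather than the paper's design. Each version of a provision carries a valid-time interval (when the wording was in force) and a transaction-time interval (when the system held that belief), and a query fixes both times instead of ranking passages by similarity.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class NormVersion:
    norm_id: str
    text: str
    valid_from: date              # date the wording entered into force
    valid_to: Optional[date]      # None = still in force
    recorded_from: date           # date the system recorded this belief
    recorded_to: Optional[date]   # None = current belief

def _covers(start: date, end: Optional[date], t: date) -> bool:
    """Half-open interval test: start <= t < end (open-ended if end is None)."""
    return start <= t and (end is None or t < end)

def norm_text_as_of(versions, norm_id, valid_at, known_at):
    """Return the single wording in force at valid_at, as recorded at known_at."""
    hits = [v for v in versions
            if v.norm_id == norm_id
            and _covers(v.valid_from, v.valid_to, valid_at)
            and _covers(v.recorded_from, v.recorded_to, known_at)]
    if len(hits) > 1:
        raise ValueError("overlapping versions: bitemporal invariant violated")
    return hits[0].text if hits else None

# Hypothetical history: an amendment recorded on 2020-06-15 takes effect 2020-07-01.
VERSIONS = [
    NormVersion("art-5", "original wording", date(2000, 1, 1), None,
                date(2000, 1, 1), date(2020, 6, 15)),
    NormVersion("art-5", "original wording", date(2000, 1, 1), date(2020, 7, 1),
                date(2020, 6, 15), None),
    NormVersion("art-5", "amended wording", date(2020, 7, 1), None,
                date(2020, 6, 15), None),
]
```

With this layout, asking today what was in force in 2019 returns the original wording even though a newer text exists, which is the deterministic answer that similarity-based retrieval can miss when it surfaces anachronistic content.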

2606.09719 2026-06-09 cs.RO New submission

Safe Polytope-in-Polytope Motion Planning and Control with Control Barrier Functions

Alejandro Gonzalez-Garcia, Dries Dirckx, Jan Swevers, Wilm Decré

Affiliations: KU Leuven

AI summary: Proposes a safe local motion planning and control method that uses discrete-time control barrier function constraints inside a model predictive controller to keep a polytopic robot footprint within a continuously updated convex free-space region; compared with a polytope-based obstacle-avoidance formulation, computation time is reduced by up to a factor of 91 as the number of obstacles increases.

Comments: This work has been submitted to the IEEE for possible publication

Abstract

Autonomous mobile robots operating in tight environments require motion planning frameworks that account for the physical footprint of the robot. Simplifying the geometry to a point or a circle is conservative and discards information needed to successfully and safely traverse narrow passages. This work proposes a safe local motion planning and control method that guarantees that a polytopic robot footprint stays inside a continuously updated convex free-space region. The containment condition is formulated as a set of discrete-time control barrier function constraints within a model predictive controller. The number of safety constraints depends on the complexity of the local free-space geometry and the robot shape, instead of the number of obstacles. The proposed free-space formulation does not need any obstacle detection or segmentation. A comparative analysis against a polytope-based obstacle avoidance formulation confirms favorable scaling, with up to a $91\times$ reduction in computation time as the number of obstacles increases. The approach is validated in simulation with an autonomous surface vehicle and on hardware with a non-holonomic mobile robot, using both occupancy grids and LiDAR sensing. The experiments demonstrate safe real-time motion planning and control at 10 Hz on an onboard embedded computer, including reactive avoidance of dynamic obstacles.
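
The containment condition can be sketched for a planar robot as follows. The barrier definition (one value per footprint vertex and free-space face) and the decay form $h(x_{k+1}) \ge (1-\gamma)\,h(x_k)$ are a common way to write discrete-time control barrier function constraints; whether the paper uses exactly this form is an assumption, and the corridor, footprint, and $\gamma$ below are illustrative values.

```python
import numpy as np

def barrier_values(pose, vertices, A, b):
    """h[i, j] = b_j - a_j . p_i: slack of robot vertex i w.r.t. face j.

    pose     : (x, y, theta) of the robot
    vertices : (V, 2) footprint vertices in the body frame
    A, b     : free-space polytope {p : A p <= b}, A is (F, 2), b is (F,)
    All h >= 0 means the whole convex footprint lies inside the region.
    """
    x, y, theta = pose
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    world = vertices @ R.T + np.array([x, y])
    return b[None, :] - world @ A.T

def dcbf_satisfied(pose_k, pose_next, vertices, A, b, gamma=0.3):
    """Check h(x_{k+1}) >= (1 - gamma) * h(x_k) for every vertex-face pair."""
    h_k = barrier_values(pose_k, vertices, A, b)
    h_next = barrier_values(pose_next, vertices, A, b)
    return bool(np.all(h_next >= (1.0 - gamma) * h_k - 1e-12))

# Illustrative data: a 4 m x 2 m free-space box and a 1.0 m x 0.6 m robot.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([4.0, 0.0, 2.0, 0.0])
footprint = np.array([[0.5, 0.3], [0.5, -0.3], [-0.5, -0.3], [-0.5, 0.3]])

start = (1.0, 1.0, 0.0)
small_step = (1.2, 1.0, 0.0)   # slow approach toward the far wall
large_step = (3.4, 1.0, 0.0)   # still inside, but closes the gap too fast
```

The constraint array has shape (V, F), here 4 by 4, so its size depends only on the footprint and the free-space polytope, not on how many obstacles produced that polytope. In a model predictive controller these inequalities would be imposed on the predicted states rather than checked after the fact.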

2606.09718 2026-06-09 cs.LG cs.CV New submission

Evaluating the Representation Space of Diffusion Models via Self-Supervised Principles

Xiao Li, Yixuan Jia, Zekai Zhang, Xiang Li, Lianghe Shi, Jinxin Zhou, Zhihui Zhu, Liyue Shen, Qing Qu

Affiliations: University of Science and Technology of China

AI summary: Inspired by self-supervised learning, proposes ICR, a Fisher-based metric that decomposes features into invariant and residual components to jointly evaluate the representation and generation capabilities of diffusion models; finds that invariance is strongest and classification performance best at intermediate noise levels, and that ICR sensitively detects memorization during training.

Comments: First two authors contributed equally. Accepted at ICML 2026

Abstract

Diffusion models have demonstrated remarkable generative capabilities and have also emerged as powerful self-supervised representation learners, yet the connection between these two abilities remains less explored. Drawing inspiration from self-supervised learning (SSL), we introduce a framework for jointly evaluating the representation and generation capabilities of diffusion models. Specifically, we decompose features into invariant and residual components and derive the Invariant Contamination Ratio (ICR), a Fisher-based metric that quantifies how residual variation contaminates invariant signal in feature space. We use this framework to analyze both discriminative and generative behavior of diffusion models. On the representation side, we find that invariance peaks at intermediate noise levels, which also yield the best downstream classification performance. On the generative side, we study how training transitions from genuine generalization to memorization in data-limited regimes, and show that ICR serves as a sensitive training-time indicator of early learning: increasing residual energy along Fisher directions marks the onset of memorization, detectable from training features alone without external evaluators or held-out test sets. Overall, our results show that diffusion models can be monitored from a self-supervised perspective through the geometry of their learned representations.

2606.09711 2026-06-09 cs.AI cs.LG New submission

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Mohammad Beigi, Ming Jin, Lifu Huang

Affiliations: UC Davis; Virginia Tech

AI summary: Introduces the PRIME concept and measures it with chain-of-thought monitoring, direct probes, and activation-level concept vectors; finds that PRIME emerges in stages before sustained reward hacking, that direct-probe scores predict later hacking onset, and that in-domain PRIME tracks out-of-domain misalignment across checkpoints.

Abstract

Reward hacking is usually studied after it becomes visible, once a model earns high proxy reward while failing the intended task. We instead study what proxy RL teaches before that failure appears. We introduce Proxy Reward Internalization and Mechanistic Exploitation (PRIME), a learned capability to assess task correctness, predict proxy acceptance, and reason about exploitable proxy-gold gaps. In coding RL environments with exploitable pytest rewards, we measure PRIME through chain-of-thought monitoring, direct probes, and activation-level concept vectors. We find that PRIME emerges in a staged sequence before sustained reward hacking, and that its current direct-probe score forecasts later hack onset and severity even when the visible hack rate is still low. PRIME also adapts when the evaluator changes, retargeting to whichever proxy-gold gap remains rewarded and persisting when gold reward suppresses overt hacking; ablating its activation directions reduces hacking. Across checkpoints, in-domain PRIME tracks out-of-domain misalignment. Together these results suggest that exploitable proxy RL amplifies a proxy-internalization capability upstream of visible hacking, making PRIME a candidate early-warning signal for broader alignment risk.

2606.09709 2026-06-09 cs.CL New submission

IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

Zechen Sun, Yuyang Sun, Zecheng Tang, Juntao Li, Wenpeng Hu, Wenliang Chen, Zhunchen Luo, Guotong Geng, Min Zhang

Affiliations: Institute of Computer Science and Technology, Soochow University; Information Research Center of Military Science, PLA Academy of Military Science

AI summary: Addresses length collapse in long-form generation by large language models, attributed to static hierarchical planning, by proposing the Interleaved Structural Chain-of-Thought (IS-CoT) framework, whose dynamic Plan-Write-Reflect cycle enables continuous strategy adaptation; the trained IS-Writer-8B model achieves state-of-the-art performance on long-form benchmarks.

Abstract

Generating coherent and controllable long-form content remains a persistent challenge for Large Language Models (LLMs). While reasoning-enhanced models have demonstrated success in logic-intensive domains, our evaluation reveals that they suffer from a severe length collapse in open-ended writing, where performance degrades sharply as target lengths exceed 2,000 words. We attribute this failure to the limitation of static hierarchical planning, which struggles to provide dynamic guidance over extended contexts. To bridge this gap, we introduce the Interleaved Structural Chain-of-Thought (IS-CoT) framework. Unlike external agentic workflows, IS-CoT embeds a dynamic Plan-Write-Reflect cycle into the generation process, enabling continuous strategy adaptation and global alignment without additional assistance. Based on this framework, we construct a high-quality dataset of interleaved reasoning traces via a multi-teacher pipeline and train IS-Writer-8B. Experiments demonstrate that IS-Writer-8B achieves state-of-the-art performance on challenging long-form benchmarks (e.g., +3.08 vs. DeepSeek-V3.2 on LongBench-Write), exhibiting robust length compliance and coherence competitive with significantly larger proprietary models.

2606.09707 2026-06-09 cs.LG cs.CL New submission

BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

Gianluca Barmina, Annemette Broch Pirchert, Andrea Blasi Núñez, Lukas Galke Poech, Peter Schneider-Kamp

Affiliations: University of Southern Denmark

AI summary: Presents BrainSurgery, a tool that performs robust, reproducible tensor operations on neural network checkpoints through declarative YAML plans, supporting structural modifications, mathematical transformations, and tensor reshaping, with built-in assertions that guard against silent errors.

Abstract

As deep learning models scale, managing, inspecting, and modifying large checkpoints has become increasingly challenging. Researchers often need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows often rely on fragile ad-hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible "tensor surgery" on neural network checkpoints, and provide a system demonstration covering four examples and three case studies from model upcycling to LoRA extraction. By abstracting storage formats and memory management, BrainSurgery executes complex transformations through declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery will provide a strong foundation for future research through its reproducible and validated operations.

2606.09705 2026-06-09 cs.LG cond-mat.stat-mech New submission

When Do Local Score Models Extrapolate Across Size? A Diagnostic Theory and Benchmark

Wenjie Xi

Affiliations: The University of Hong Kong; Department of Physics and HK Institute of Quantum Science & Technology

AI summary: Proposes a diagnostic theory showing that whether a local model extrapolates stably depends on the quasi-locality of the Gaussian-smoothed score, and introduces the Finite-Depth Local Flow (FDLF) benchmark to validate it.

Abstract

Scientific generative modeling often requires size transfer, where models trained on small systems are evaluated on larger ones. While translation-invariant architectures enable this evaluation, we show that architectural locality alone does not guarantee stable size extrapolation. Instead, stable extrapolation is governed by the quasi-locality of the Gaussian-smoothed score. Through Tweedie's formula, far-away perturbations can influence local score components via posterior covariance, meaning a local model succeeds only if its receptive field covers the smoothed score's response range. We formalize this mechanism, proving a size-uniform comparison theorem for local marginals under reverse diffusion. We also introduce Finite-Depth Local Flow (FDLF), a white-box diagnostic benchmark with exact scores, densities, and controllable response ranges. Empirically, we validate the interplay between spatial mixing, smoothed-score quasi-locality, and model receptive fields. Under spatial mixing, the smoothed score remains quasi-local relative to the receptive field, enabling stable extrapolation. Conversely, when spatial mixing weakens, the score's locality rapidly degrades, causing size transfer to fail.
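
The role of the posterior covariance can be written out explicitly. The notation here is assumed for illustration: $x = x_0 + \sigma\varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, and $p_\sigma$ is the density of $x$. These are the standard first- and second-order Tweedie identities, not formulas quoted from the paper.

```latex
\[
\nabla_x \log p_\sigma(x) = \frac{\mathbb{E}[x_0 \mid x] - x}{\sigma^2},
\qquad
\frac{\partial}{\partial x_j}\bigl[\nabla_x \log p_\sigma(x)\bigr]_i
= \frac{\operatorname{Cov}(x_{0,i},\, x_{0,j} \mid x)}{\sigma^4}
  - \frac{\delta_{ij}}{\sigma^2} .
\]
```

For $i \neq j$, the sensitivity of score component $i$ to a perturbation at site $j$ equals the posterior covariance between the two sites divided by $\sigma^4$. The smoothed score is therefore quasi-local only insofar as that covariance decays with distance, and a local model needs a receptive field that covers the range over which it has not yet decayed.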

2606.09701 2026-06-09 cs.CL cs.AI cs.LG New submission

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

Blake Bullwinkel, Eugenia Kim, Amanda Minnich, Mark Russinovich

Affiliations: Microsoft AI Red Team; Microsoft Azure

AI summary: Proposes the AdvGRPO framework, which uses dense multi-channel rewards and decoupled advantage normalization to stabilize GRPO training for joint attacker-defender optimization, yielding effective, transferable attacks and defenders that outperform baselines.

Abstract

AI red teaming must continually adapt to evolving attackers and defenders. Reinforcement learning offers a promising approach to discovering novel attacks, and co-training methods can produce more robust defenders in tandem. Recent works have demonstrated the efficacy of attacker-defender co-training by applying PPO and DPO, but report that GRPO is unstable in this setting. We introduce AdvGRPO, a co-training framework that makes GRPO viable for joint attacker-defender optimization using dense multi-channel rewards and decoupled advantage normalization. Training progresses through a curriculum from single-turn to closed-loop multi-turn attacks before bootstrapping co-training, where attacker and defender models are updated in alternation. We show that our method can produce highly effective and transferable attacks and that co-trained defenders outperform baselines on safety benchmarks.
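
A small sketch may help picture what normalizing advantages per reward channel could look like. The first function is the usual GRPO-style group-relative advantage; the second is only one plausible reading of "decoupled advantage normalization" and is an assumption, not the definition used in AdvGRPO. The two reward channels and their values are hypothetical.

```python
import numpy as np

def group_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: z-score of each reward within its sampled group."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

def decoupled_advantages(channel_rewards, weights=None, eps=1e-8):
    """Normalise every reward channel separately, then combine.

    channel_rewards: (G, C) array, G sampled responses x C reward channels.
    This is an assumed reading of 'decoupled', not the paper's definition.
    """
    r = np.asarray(channel_rewards, dtype=float)
    z = (r - r.mean(axis=0)) / (r.std(axis=0) + eps)
    w = np.ones(r.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    return z @ w

# Four sampled attacks scored on two hypothetical channels with different scales:
# column 0 = a judge score in [0, 1], column 1 = a score in the hundreds.
rewards = np.array([[0.9, 120.0],
                    [0.1, 480.0],
                    [0.5, 300.0],
                    [0.5, 300.0]])

coupled = group_advantages(rewards.sum(axis=1))   # channels summed first
decoupled = decoupled_advantages(rewards)         # channels normalised first
```

When the channels are summed before normalization, the large-scale channel decides the ranking and the second sample receives the highest advantage. When each channel is normalized first, the two channels carry equal weight, and in this toy data they offset each other so that all advantages are close to zero.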

2606.09699 2026-06-09 cs.CV New submission

Cranio-Diff: Diffusion-based Cross-domain Craniofacial Reconstruction with 2D X-ray Skull Guidance and Structural Identity Constraints

Ravi Shankar Prasad, Naresh Gurjar, Shashank Baghel, Chirag, Dinesh Singh

Affiliations: Indian Institute of Technology Mandi; CSVTU Bhilai

AI summary: Proposes Cranio-Diff, a diffusion framework that reconstructs faces across modalities from 2D X-ray skull images using skull-conditioned structural guidance through ControlNet and biometric text conditioning, addressing structural identity alignment and outperforming existing methods on a craniofacial dataset of 120 subjects.

Comments: 14 pages, 7 figures, BMVC 2026 conference

Abstract

State-of-the-art generative models such as CycleGAN, Pix2Pix, and diffusion models have demonstrated remarkable performance on face generation. However, they fail to capture cross-modality semantic information effectively in craniofacial reconstruction when translating from the skull (X-ray) domain to the face (optical) domain, because structural identity is misaligned across modalities. To address this issue, we propose Cranio-Diff, a diffusion-based framework for cross-domain craniofacial reconstruction from 2D X-ray skull images. The approach integrates skull-conditioned structural guidance through ControlNet with biometric text conditioning to generate a face that is more semantically and structurally aligned with the given skull. Cranio-Diff is evaluated on a skull-face dataset obtained from X-ray scans of 120 subjects in lateral and frontal views. To enable controlled evaluation, each face image is synthesised across three age groups (25, 45, 65) and three BMI variations (-10%, baseline, and +10%), yielding 4320 paired samples. To the best of our knowledge, this is the only X-ray-face dataset of this magnitude. Extensive experiments show that the proposed method outperforms recent approaches in both generated image quality and the retrieval task. Image quality is evaluated using FID, IS, SSIM, LPIPS, PSNR, and ArcFace score, and retrieval performance using recall@k, mAP@k, and MRR@k. The experimental results indicate that the proposed method can serve as an auxiliary tool in forensic investigations.
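
The retrieval metrics named in the abstract can be sketched as follows. These are common textbook definitions; the paper's exact conventions (in particular how AP@k is normalized) are not stated in the abstract, so the normalization by min(k, number of relevant items) is an assumption, and the subject IDs are made up for illustration. Every query is assumed to have at least one relevant item.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k results."""
    top = set(ranked_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)

def reciprocal_rank_at_k(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant item within the top k, else 0."""
    relevant = set(relevant_ids)
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def average_precision_at_k(ranked_ids, relevant_ids, k):
    """Sum of precision@i at relevant ranks i <= k, over min(k, #relevant)."""
    relevant = set(relevant_ids)
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(k, len(relevant))
    return total / denom if denom else 0.0

def mean_over_queries(metric, queries, k):
    """Average a per-query metric over (ranking, relevant) pairs."""
    return sum(metric(r, rel, k) for r, rel in queries) / len(queries)

# Each query: gallery ranking for one generated face, and the true subject.
QUERIES = [
    (["s07", "s03", "s11"], ["s03"]),   # correct identity at rank 2
    (["s02", "s09", "s05"], ["s02"]),   # correct identity at rank 1
    (["s01", "s04", "s08"], ["s10"]),   # correct identity not retrieved
]

recall_1 = mean_over_queries(recall_at_k, QUERIES, 1)
mrr_3 = mean_over_queries(reciprocal_rank_at_k, QUERIES, 3)
map_3 = mean_over_queries(average_precision_at_k, QUERIES, 3)
```

With a single relevant identity per query, AP@k and the reciprocal rank coincide, so mAP@k and MRR@k only differ on datasets where several gallery images share the query's identity.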

2606.09697 2026-06-09 cs.CL New submission

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Gianluca Barmina, Federico Torrielli, Sven Harms, Jacob Nielsen, Felix Mächtle, Stine Lyngsø Beltoft, Peter Schneider-Kamp, Thomas Eisenbarth, Lukas Galke Poech, Anne Lauscher

发表机构 * University of Southern Denmark(南丹麦大学) University of Turin(都灵大学) University of Hamburg(汉堡大学) University of Lübeck(吕贝克大学)

AI总结 提出PsychoSafe框架,将LLM的拒绝行为重构为基于证据干预策略的结构化支持性沟通,通过构建5个心理风险领域的8019个提示-响应对,对Qwen 3.5 27B进行提示和参数高效微调,在拒绝质量上比通用基线提升28.1%,同时保持非拒绝任务性能。

English abstract

Large language models (LLMs) routinely face requests that should be refused, creating a trade-off between helpfulness and harm prevention. However, refusals themselves can be helpful. In high-risk interactions involving crisis, coercion, or escalating intent, blunt non-compliance may prevent direct harm while still failing to support the needs of the person behind the request. We present PsychoSafe, a psychologically-informed refusal framework that reframes refusal as structured supportive communication grounded in evidence-based intervention strategies. To develop PsychoSafe, we construct a corpus of 8019 prompt-response pairs spanning five psychologically salient risk domains and apply prompting and parameter-efficient fine-tuning to Qwen 3.5 27B. On a balanced validation set of 500 prompts, evaluated with an LLM judge and validated through human ratings, PsychoSafe prompting improves overall refusal quality by 28.1% over a generic baseline, with particularly strong gains in external resource referral (+46.8%) and psychological grounding (+34.8%), while preserving downstream performance on non-refusal tasks. Fine-tuning achieves near-perfect refusal and resource-referral rates but reduces response relevance. Additional evaluations on SORRY-Bench and XSTest show strong in-domain robustness but limited out-of-domain generalization, suggesting that future work should diversify fine-tuning data to help models apply interventions selectively rather than schematically.

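2606.09682 2026-06-09 cs.LG cs.DC cs.PF New submission

AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

Jaber Jaber, Osama Jaber

Affiliations: RightNow AI

AI summary: Proposes AutoMegaKernel, a system that compiles Llama-family models into a single persistent CUDA kernel, uses a static schedule validator to check deadlock-freedom and race-freedom, auto-generates correct megakernels for all 10 supported models, and with a W8A16 megakernel outperforms cuBLAS bf16 on NVIDIA inference-class GPUs.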

Comments 18 pages, 5 figures. Open-source code, data, and agent harness: https://github.com/RightNow-AI/AutoMegaKernel

English abstract

AutoMegaKernel (AMK) compiles a HuggingFace Llama-family model into a single persistent cooperative CUDA kernel that runs the whole forward pass in one launch, with no per-model hand-written CUDA. The contribution is the system, not raw speed. A frozen schedule-IR validator statically certifies deadlock-freedom and race-freedom via static graph checks (not a mechanized proof), so an unsafe agent-proposed schedule is rejected before launch: across 7,160 adversarial schedules (6,091 unsafe) it had zero false-accepts and accepted all 360 real lowerings. The same source retargets sm_80/sm_90/sm_120 from one codebase, auto-generates correct megakernels for 10 of 10 supported models, and on a real SmolLM2-135M checkpoint reproduces HuggingFace greedy decode token-for-token (perplexity match 2.5e-7). An unattended, agent-drivable autoresearch loop self-improves the megakernel over its own baseline (1.25-1.72x). A search-found int8 (W8A16) megakernel beats CUDA-graphed cuBLAS bf16 at batch-1 decode across NVIDIA's datacenter inference fleet: L4 up to 1.33x, the current-gen L40S 1.25-1.27x, A10G up to 1.08x at scale, and the consumer RTX 5090 1.19-1.23x. The ordering is not a clean function of bandwidth (the 864 GB/s L40S beats the 600 GB/s A10G); the divide is inference-class vs training-class. AMK trails cuBLAS on the high-bandwidth training-class A100/H100, where the harness localizes the cross-SM-sync bottleneck; we report the gap plainly. This is a precision-asymmetric (W8A16 vs bf16) comparison at decode position 0; the largest real checkpoint is TinyLlama-1.1B. Code and the harness: https://github.com/RightNow-AI/AutoMegaKernel
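
As a rough illustration of what a static graph check for deadlock-freedom can look like, the Python sketch below rejects a schedule whose wait-for graph contains a cycle. This is not AMK's validator, whose actual checks the abstract does not describe; the graph encoding, the function name, and the task names are all hypothetical.

```python
from collections import deque


def has_wait_cycle(waits_on: dict[str, list[str]]) -> bool:
    """True if the wait-for graph has a cycle (Kahn's algorithm leaves nodes unprocessed)."""
    nodes = set(waits_on) | {dep for deps in waits_on.values() for dep in deps}
    indegree = {node: 0 for node in nodes}
    dependents: dict[str, list[str]] = {node: [] for node in nodes}
    for task, deps in waits_on.items():
        for dep in deps:
            indegree[task] += 1
            dependents[dep].append(task)
    # Start from tasks that wait on nothing and peel the graph layer by layer.
    ready = deque(node for node in nodes if indegree[node] == 0)
    processed = 0
    while ready:
        node = ready.popleft()
        processed += 1
        for nxt in dependents[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return processed != len(nodes)


# Hypothetical schedules: each task lists the tasks it must wait for.
safe = {"attn": ["qkv"], "mlp": ["attn"], "qkv": []}
unsafe = {"attn": ["mlp"], "mlp": ["attn"]}
```

Here `has_wait_cycle(safe)` is false and `has_wait_cycle(unsafe)` is true, so a harness built this way would reject the second schedule before launch. A check of this kind covers only circular waits; race-freedom would need a separate analysis.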

2606.09679 2026-06-09 cs.CV New submission

SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

Parthsarthi Rawat

Affiliations: GameChanger by Dick’s Sporting Goods

AI summary: For the task of predicting player, action, and time across eight action classes in broadcast soccer, extends the FOOTPASS baselines with four additions (gradient checkpointing, GNN-DST fusion, square-root frequency class weighting, and a post-processing pipeline), reaching Macro F1 of 0.548 on the test set and 0.446 on the challenge set.

Comments CVPR 2026 SoccerNet Player Centric Ball Action Spotting Challenge, Rank 7

English abstract

We describe our system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge, which requires predicting who performs which action and when, across eight classes in broadcast soccer. Building on the three FOOTPASS baselines [1] (TAAD, TAAD+GNN, and TAAD+DST), we contribute four extensions: (1) gradient checkpointing to enable full-backbone fine-tuning on a single GPU; (2) fusion of GNN logits into the DST encoder, combining graph-based tactical context with per-player visual features; (3) square-root frequency class weighting to address the 213:1 pass-to-tackle imbalance in the training data; and (4) a post-processing pipeline comprising per-class logit gating, temporal frame refinement, jersey re-assignment, and a two-model ensemble. Our system achieves 0.548 Macro F1 on the test set and 0.446 on the challenge set (server evaluation).
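
Square-root frequency class weighting can be sketched in a few lines of Python. The abstract does not give the exact formula, so this assumes one common reading: each class weight is proportional to 1/sqrt(count), rescaled so the weights average to 1. The function name and the counts are hypothetical; only the 213:1 ratio comes from the abstract.

```python
import math


def sqrt_frequency_weights(class_counts: dict[str, int]) -> dict[str, float]:
    """Weight each class by 1/sqrt(count), then rescale so the weights average to 1."""
    raw = {name: 1.0 / math.sqrt(count) for name, count in class_counts.items()}
    scale = len(raw) / sum(raw.values())
    return {name: weight * scale for name, weight in raw.items()}


# Hypothetical counts chosen only to reproduce the 213:1 pass-to-tackle ratio.
counts = {"pass": 21300, "tackle": 100}
weights = sqrt_frequency_weights(counts)

# Plain inverse-frequency weighting would up-weight tackles 213x relative to passes;
# the square root softens this to sqrt(213), roughly 14.6x.
ratio = weights["tackle"] / weights["pass"]
```

The softer ratio is the usual motivation for the square root: rare classes still receive more weight, but a handful of tackle examples cannot dominate the loss.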

2606.09674 2026-06-09 cs.AI cs.LO math.CO New submission

(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs

Wesley Pegden

Affiliations: Department of Mathematical Sciences, Carnegie Mellon University

AI summary: Proposes Trellis, a system that performs Lean autoformalization by having LLM agents iteratively refine natural-language proofs within a deterministically constrained workflow, building on the idea that any part of a rigorous proof can be routinely elaborated.

Comments 15 pages, 7 figures, 5 tables

English abstract

We present Trellis: an autoformalization system that leverages LLM agents in a deterministically constrained workflow to enforce incremental progress in Lean autoformalization tasks through iterative refinement of natural language proofs. Our approach is motivated by the common mathematician's notion of what it means to have a rigorous proof in the first place: namely, that it would be routine to elaborate any part of the proof in further detail. The result is a system which aims to achieve reliable autoformalization on a modest budget and with generalist agents, with specialization to autoformalization coming not from any task-specific agent training but instead from a meaning-of-rigor inspired workflow enforced by process semantics. We link to an end-to-end Lean formalization of a recent Ramsey theory breakthrough produced by the process.

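2606.09672 2026-06-09 cs.AI cs.CL cs.LG cs.PF q-bio.QM New submission

Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery

Suraj Biswas, Saurabh Gupta, Pritam Mukherjee

Affiliations: Assessli Research; Dots-In Research

AI summary: Addresses the problem that pretrained biomedical language models assign high cosine similarity (0.76-0.92) to unrelated cross-domain pairs, which leads to false causal inferences. Proposes contrastive training (raising separation to 1.63x) and BODHI hard-negative mining (raising it to 2.30x), with OpenVINO optimization giving a 133x latency speedup.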

Comments 20 pages, 18 figures, 9 tables

English abstract

Ask a pretrained biomedical language model whether "cortisol 28 ug/dL" and "stock-market volatility" are related, and it returns a cosine similarity of 0.83 on a scale where 1.0 means identical. The two share no mechanism. This is not a corner case: every off-the-shelf biomedical encoder we tested (BioBERT, PubMedBERT, BioM-ELECTRA) scores unrelated cross-domain pairs between 0.76 and 0.92 when the answer should be near zero. Accuracy on cross-domain discrimination is 0%. Retrieval systems survive this, because a language model downstream filters the noise. A Large Behavioural Model (LBM), a foundation model whose subject is a person rather than a sentence, does not: it reasons over a graph of a user's life and treats embedding proximity as evidence that two events are causally linked. False proximity writes a false causal edge, and everything downstream inherits the error. Here, embedding geometry is not a tuning knob; it is correctness. We report the fix. A contrastive pass over 72,034 pairs raises PubMedBERT BIOSSES correlation from 0.633 to 0.828 and within-vs-across-domain separation from 1.05x to 1.63x. A second pass, BODHI, mines hard negatives from edges absent in a biomedical knowledge graph and lifts separation to 2.30x and the discrimination gap to +0.392, at a 4.5% BIOSSES cost. On an Intel Xeon 6737P with AMX, OpenVINO cuts single-query latency from 1367 ms to 10 ms (133x) and reaches 555 sentences/sec. One finding contradicts standard advice: FP16 beats INT8 on this silicon at every serving batch size, and we explain why. The same model on a no-AMX Ice Lake instance runs 13-27x slower. We release the benchmark suite, training corpora, the BODHI generator, and the OpenVINO scripts.
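
The two quantities this abstract leans on, cosine similarity and within-vs-across-domain separation, can be sketched in Python. The abstract does not define separation, so the ratio below (mean within-domain cosine divided by mean across-domain cosine) is an assumed definition, and the 3-d vectors are hypothetical toy embeddings rather than encoder outputs.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)


def separation(domain_a, domain_b):
    """Mean within-domain cosine divided by mean across-domain cosine (assumed definition)."""
    within = mean(
        cosine(u, v)
        for group in (domain_a, domain_b)
        for u, v in combinations(group, 2)
    )
    across = mean(cosine(u, v) for u, v in product(domain_a, domain_b))
    return within / across


# Hypothetical embeddings sharing a large common component, so every pair looks similar.
biomedical = [(10.0, 1.0, 0.0), (10.0, 0.8, 0.2)]
finance = [(10.0, 0.0, 1.0), (10.0, 0.2, 0.8)]

cross_pair = cosine(biomedical[0], finance[0])
ratio = separation(biomedical, finance)
```

In this toy setup the cross-domain pair scores above 0.9 and the separation ratio sits barely above 1, which mirrors the failure mode described: proximity alone says little when unrelated items are almost as close as related ones. The shared-component construction is only an illustration, not the paper's diagnosis of the cause.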

2606.09671 2026-06-09 cs.LG cs.AI New submission

Transition-Based Digital Twin Modelling for Alzheimer's Disease under Sparse Longitudinal Data

Yinyu Huang, Yilin Zhang, Sofia Michopoulou, Christopher Kipps, Rahman Attar

Affiliations: University of Southampton; University Hospital Southampton NHS Foundation Trust; Faculty of Medicine, University of Southampton

AI summary: To handle the heterogeneity of Alzheimer's disease progression and sparse data, proposes a digital twin framework that combines local transition modelling with sequence modelling, using multimodal longitudinal data to predict cognitive status and quantify uncertainty, with strong results on ADNI data.

Comments 13 pages, 5 figures, 3 tables. Accepted as a full-length paper at the International Conference on AI in Healthcare (AIiH) 2026

English abstract

Alzheimer's disease (AD) progression is highly heterogeneous and is typically observed through sparse and irregular longitudinal data, posing challenges for prediction and personalised monitoring. Existing machine learning approaches have improved AD prediction using multimodal data, yet often focus on static classification or cohort-level risk estimation, providing limited support for subject-specific modelling and uncertainty-aware reasoning. To address these limitations, we present a personalised digital twin framework for AD prediction and scenario-based analysis using multimodal longitudinal data. The proposed approach integrates complementary modelling strategies to capture clinical transitions and temporal dependencies across visits. Using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), including cognitive assessments, clinical variables, and MRI-derived phenotypes, the framework predicts cognitive status and diagnostic categories while quantifying predictive uncertainty and enabling patient-specific what-if trajectory analysis. Evaluation on leak-free subject-level splits demonstrates strong performance in score forecasting and diagnosis classification. In this sparse and irregular ADNI setting, transition-based modelling of adjacent visits achieved higher predictive accuracy than the sequence-based branch, suggesting that local transition modelling may be more data-efficient. While sequence models remain valuable for uncertainty-aware trajectory forecasting, local transition modelling offers a more data-efficient and robust predictive strategy. These findings highlight the importance of aligning temporal modelling strategies with clinical data structure and suggest that transition-based digital twin formulations may provide a practical and interpretable approach for personalised disease forecasting in neurodegenerative disorders.

2606.09670 2026-06-09 cs.CV cs.AI New submission

Visual Prompting Meets Feature Reconstruction-Based Anomaly Detection with Dual-Teacher Supervision

Mateo Diaz-Bone, Daniel Caraballo, Florian Scheidegger, Thomas Frick, Mattia Rigotti, Andrea Bartezzaghi, Roy Assaf, Niccolo Avogaro, Yagmur G. Cinar, Brown Ebouky, Filip M. Janicki, Piotr S. Kluska, Cezary Skura, Cristiano Malossi

Affiliations: IBM Research Europe Zurich

AI summary: To address anomaly detection failing in real-world scenes under variations in object scale, viewpoint, and similar factors, proposes a visual prompting pipeline, an unfrozen teacher model, and diffusion-generated data augmentation, improving by 3.5 percentage points on the AeBAD dataset.

English abstract

Recent anomaly detection methods achieve perfect detection and segmentation scores on well-established datasets such as MVTec. However, many of these methods struggle when foundational assumptions (consistent object scale, viewpoint, background, illumination, and centered placement) are violated. Such variations render anomaly detection methods unusable in many real-world scenarios. To address these limitations, we introduce three key contributions: (1) a visual prompting pipeline that isolates objects using foreground-background masking; (2) a mechanism for unfreezing the teacher in student-teacher models to improve domain adaptability; and (3) a data augmentation strategy leveraging diffusion-generated synthetic images to enhance anomaly detection performance. Using the Masked Multiscale Reconstruction (MMR) model as our backbone, we achieve a 3.5 percentage point improvement over the previous state of the art on the challenging AeBAD dataset.

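2606.09668 2026-06-09 cs.LG New submission

Algorithm for Contextual Queueing Bandits with Rate-Optimal Queue Length Regret

Seoungbin Bae, Dabeen Lee

Affiliations: KAIST; Seoul National University

AI summary: For contextual queueing bandits, proposes the three-phase algorithm CQB-$\eta$-2, which restricts random exploration to rounds before a cutoff, improving queue length regret from $\widetilde{\mathcal{O}}(T^{-1/4})$ to $\widetilde{\mathcal{O}}(T^{-1/2})$, and proves that this rate is minimax-optimal up to logarithmic factors.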

English abstract

Contextual queueing bandits provide a framework for learning to schedule heterogeneous jobs under unknown context-dependent service rates. Under stochastic contexts, existing algorithms achieve $\widetilde{\mathcal{O}}(T^{-1/4})$ queue length regret, defined as the expected difference between the learner's and oracle's queue lengths at horizon $T$. In this paper, we improve this rate to $\widetilde{\mathcal{O}}(T^{-1/2})$. The key observation is that random exploration is needed only up to a carefully chosen cutoff round, rather than throughout the entire horizon. We propose CQB-$\eta$-2, a three-phase algorithm: (i) pure random exploration to construct an initial estimator, (ii) $\eta$-random exploration combined with a UCB rule to continue learning while maintaining negative drift, and (iii) pure UCB after the exploration cutoff. Our proof decomposes the queue length regret at the cutoff round. Before the cutoff, negative drift suppresses queue length differences caused by suboptimal choices. After the cutoff, the first two phases provide sufficient random exploration samples, ensuring that UCB decisions incur small departure-rate gaps. Combining these two bounds yields queue length regret of order $\widetilde{\mathcal{O}}(T^{-1/2})$. We further prove a minimax lower bound of order $\Omega(T^{-1/2})$. The proof constructs two hard instances that are statistically indistinguishable up to the final service decision, and uses a queue-specific coupling argument to convert the resulting testing error into queue length regret. Together, our upper and lower bounds characterize the minimax dependence on the horizon $T$ up to logarithmic factors.
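
The three-phase structure described in this abstract can be sketched as a per-round action selector in Python. This is only a schematic under stated assumptions: the phase lengths, the value of eta, and the UCB index (passed in as a callable placeholder) are all hypothetical, and the real algorithm's index depends on contexts and an estimator that the abstract does not specify.

```python
import random
from typing import Callable, Sequence


def select_action(
    t: int,
    actions: Sequence[int],
    ucb_index: Callable[[int], float],
    warmup_rounds: int,
    cutoff_round: int,
    eta: float,
    rng: random.Random,
) -> tuple[str, int]:
    """Return (phase label, chosen action) for round t (1-indexed)."""
    if t <= warmup_rounds:
        # Phase (i): pure random exploration.
        return "random", rng.choice(list(actions))
    if t <= cutoff_round:
        # Phase (ii): explore uniformly with probability eta, otherwise follow UCB.
        if rng.random() < eta:
            return "eta-random", rng.choice(list(actions))
        return "eta-ucb", max(actions, key=ucb_index)
    # Phase (iii): pure UCB after the exploration cutoff.
    return "ucb", max(actions, key=ucb_index)


rng = random.Random(0)
phase, action = select_action(
    t=100,
    actions=[0, 1, 2],
    ucb_index=lambda a: float(a),  # placeholder index, not a real confidence bound
    warmup_rounds=10,
    cutoff_round=50,
    eta=0.1,
    rng=rng,
)
```

The point the sketch makes is structural: random exploration stops entirely once `t` exceeds `cutoff_round`, which is the design choice the abstract credits for the improved rate.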

2606.09666 2026-06-09 cs.AI New submission

Frequency-based Constrained Sampling for Interval Patterns

Djawad Bekkoucha, Abdelkader Ouali, Bruno Crémilleux

Affiliations: Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Université Paris-Saclay, CNRS; Université Caen Normandie, ENSICAEN, CNRS, Normandie Univ, GREYC UMR6072

AI summary: Proposes CFips, which integrates user-defined syntactic constraints directly into a multi-step sampling framework by decomposing them into elementary predicates on interval bounds, preserving exact sampling guarantees so that patterns are drawn proportionally to their frequency within the constrained pattern space; experiments show it completes mining tasks that otherwise time out.

Comments 16 pages

English abstract

Output space pattern sampling is a powerful alternative to exhaustive pattern mining for exploring large pattern spaces, as it enables users to focus on representative patterns drawn according to a chosen interestingness measure. In this paper, we address the problem of sampling interval patterns under user-defined syntactic constraints. We introduce CFips, a sampling approach that incorporates constraints directly into the sampling procedure. The approach relies on a multi-step sampling framework and supports several syntactic constraints by decomposing them into elementary predicates on interval bounds while preserving exact sampling guarantees. We formally prove that CFips samples interval patterns proportionally to their frequency within the constrained pattern space. The experimental results show that integrating constraints into the sampling procedure makes it possible to complete mining tasks that would otherwise fail within a given timeout.

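2606.09664 2026-06-09 cs.LG stat.ML New submission

In-Context Learning for Latent Space Bayesian Optimization

Tuan A. Vu, Harri Lähdesmäki, Julien Martinelli

Affiliations: Aalto University

AI summary: To address the mismatch between in-context learning models and optimization tasks in latent-space Bayesian optimization, proposes continued pretraining on synthetic optimization tasks defined on the latent space of a molecular VAE, with a regularizer that preserves the original prior, yielding strong molecular optimization performance.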

English abstract

Bayesian optimization (BO) is a central tool for sample-efficient design, and latent-space Bayesian optimization (LSBO) extends it to structured objects such as molecules and proteins. In parallel, tabular foundation models such as TabPFN and TabICL now achieve state-of-the-art regression performance and are increasingly used as BO surrogates. Because their Bayesian behavior is induced by large synthetic pretraining collections, the composition of this pretraining distribution is crucial. LSBO creates a distinctive mismatch: the induced map from latent code to objective value differs markedly from the regression tasks used to train current in-context models. We address this mismatch by complementing the pretraining stage of tabular foundation model surrogates with synthetic optimization tasks defined on the latent space of a molecular VAE. The continued-pretraining objective features a regularizer that anchors the model to the original checkpoint, preserving its broad regression prior while avoiding overspecialization to the adaptation tasks. On held-out molecular optimization benchmarks, the resulting model achieves strong performance, supporting the relevance of LSBO-specific adaptation for in-context surrogates.

2606.09663 2026-06-09 cs.AI New submission

From 0-to-1 to 1-to-N: Reproducible Engineering Evidence for MetaAI Recursive Self-Design

Dun Li, Jiatao Li, Hongzhi Li

Affiliations: The Hong Kong Polytechnic University; Shanghai Maritime University; Chizhou University

AI summary: Proposes an operational evidence framework that assesses existing systems against four criteria; among them, Darwin Goedel Machine is reported to improve from 20% to 50% on SWE-bench Verified. Also provides MetaAI-Mini, a reproducible protocol.

Comments 6 pages, 2 figures, 7 tables. Supplementary code: https://github.com/DunLi-Tsinghua/MetaAI-Mini

English abstract

Recursive self-design refers to AI-assisted modification of the mechanisms by which an AI system is built, evaluated, and improved. This paper treats MetaAI not as a mature paradigm, but as a working term for a human-seeded, AI-expanded development pattern in which the design space itself becomes a target of modification. We propose an operational evidence framework with four criteria: inspectable target system, meta-level modifier, feedback-directed selection, and recursive continuation. We then map public systems, including Darwin Goedel Machine (DGM), STOP, Goedel Agent, and ShinkaEvolve, against these criteria. DGM provides the most direct currently reported evidence: its published results show improvement from 20% to 50% on SWE-bench Verified and from 14.2% to 30.7% on full Polyglot after 80 iterations, with ablations suggesting that both open-ended exploration and self-improvement contribute. Finally, we provide MetaAI-Mini, a reproducible HumanEval-based protocol and codebase. Because no completed model run is included in this build, MetaAI-Mini is reported as a protocol rather than as an experimental result.

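2606.09662 2026-06-09 cs.CL New submission

When Built-in Thinking Helps and Hurts: Constraint-Level Error Shifts in Instruction Following

Sai Adith Senthil Kumar

Affiliations: George Mason University

AI summary: Studies how the thinking mode of large reasoning models (LRMs) affects instruction following, finding that thinking changes the pattern of errors rather than uniformly degrading performance: Planning constraints improve while Precision constraints worsen. Thinking-trace analysis and activation patching are used to probe the mechanism.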

Comments 16 pages, 7 figures, 15 tables

English abstract

Large reasoning models (LRMs) often improve math and coding performance, but their effect on instruction following is unclear. We study IFEval with Qwen3 models (1.7B-32B), using same-weights Thinking ON/OFF controls; four Hunyuan models provide directional cross-family support. Aggregate pass-rate changes are small (-0.55 to -3.52 pp), yet 10-20% of prompts switch between pass and fail across modes, suggesting that thinking changes the pattern of errors--some prompts improve while others worsen--rather than uniformly degrading performance. Under a post-hoc Qwen3-derived grouping, constraint types separate into Planning (global counting, structure, coordination), which improves at the class level under thinking, and Precision (exact local form), which consistently worsens; the class-level Planning/Precision sign pattern holds directionally for all four Hunyuan models despite Hunyuan's opposite aggregate direction. Thinking also changes final-answer length; matched-length analyses substantially reduce the Precision drop, but a residual penalty remains. Analyzing thinking traces with a cross-encoder relevance metric reveals three patterns: Neutral shows a positive relevance-compliance link (r approximately 0.15); Planning shows near-zero predictive correlation (r approximately 0.02) despite measurable trace engagement, consistent with an execution gap between CE-measured trace relevance and final-answer compliance; Precision shows a small negative correlation (r approximately -0.05), with failing instances having higher mean relevance than passing ones. Activation patching across four model sizes (1.7B-14B) shows that Precision flip instances are more often restored than Planning flip instances (32-58% vs. 14-40% mean layer-restoration), with the largest gap at 14B (about 30 pp).
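
The contrast between a small aggregate change and a much larger share of flipped prompts is simple arithmetic, sketched here in Python. The outcome lists are hypothetical toy data, not the paper's results, and the function name is an assumption.

```python
def compare_modes(off_pass: list[bool], on_pass: list[bool]) -> dict[str, float]:
    """Net pass-rate change and flip rate between Thinking OFF and Thinking ON."""
    assert len(off_pass) == len(on_pass)
    n = len(off_pass)
    gained = sum(1 for off, on in zip(off_pass, on_pass) if not off and on)
    lost = sum(1 for off, on in zip(off_pass, on_pass) if off and not on)
    return {
        "net_change_pp": 100.0 * (gained - lost) / n,
        "flip_rate_pct": 100.0 * (gained + lost) / n,
    }


# Hypothetical outcomes for 20 prompts: 1 prompt starts passing, 2 start failing.
off = [True] * 15 + [False] * 5
on = [True] * 13 + [False] * 2 + [True] * 1 + [False] * 4
stats = compare_modes(off, on)
```

Gains and losses cancel in the net change but add up in the flip rate, which is why per-prompt comparison can reveal shifts that the aggregate pass rate hides.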

2606.09659 2026-06-09 cs.CL cs.AI cs.LG New submission

End-to-End Context Compression at Scale

Ang Li, Sean McLeish, Haozhe Chen, Nimit Kalra, Zaiqian Chen, Artem Gazizov, Venkata Anoop Suhas Kumar Morisetty, Bhavya Kailkhura, Harshitha Menon, Zhuang Liu, Brian R. Bartoldson, Tom Goldstein, Sanae Lotfi, Micah Goldblum, Pavel Izmailov

Affiliations: New York University; Modal Labs; University of Maryland; Princeton University; Columbia University; Harvard University; Lawrence Livermore National Laboratory; FAIR at Meta

AI summary: Through architecture search and continued pre-training, proposes Latent Context Language Models (LCLMs), end-to-end encoder-decoder compressors that improve the Pareto frontier across general-task performance, compression speed, and peak memory, and that can serve as efficient backbones for long-horizon agents.

English abstract

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Recent techniques to compress the KV cache fall short: they either degrade model quality substantially or require considerable time and compute to compress a single long prompt. Furthermore, many methods require the input to fit within the target model's context window, and are generally incompatible with modern production inference engines. Encoder-decoder compressors, which map a long token sequence to a shorter sequence of latent embeddings consumed by a decoder, are an appealing alternative in principle. However, existing approaches are not competitive with KV cache compression on the accuracy-efficiency frontier. In this work, we revisit encoder-decoder compression and close this gap. We first perform an architecture search, pre-training many variants from scratch to determine how best to design and train encoder-decoder compressors. Guided by our findings, we continually pre-train a family of 0.6B-encoder, 4B-decoder models on over 350B tokens each, at compression ratios of 1:4, 1:8, and 1:16. We introduce Latent Context Language Models (LCLMs), a family of compressors that improve the Pareto frontier across general-task performance, compression speed, and peak memory usage. We demonstrate that LCLMs serve as efficient backbones for long-horizon agents, letting the agent skim through a compressed long context and adaptively expand relevant segments on demand.

2606.09658 2026-06-09 cs.LG cs.AI New submission

Muon Learns More Robust and Transferable Features than Adam

Tianyu Ruan, Fengzhuo Zhang, Shuche Wang, Shihua Zhang

Affiliations: Yale University; National University of Singapore; University of Chinese Academy of Sciences; Academy of Mathematics and Systems Science, CAS

AI summary: Through the lens of robustness and transferability, shows that the Muon optimizer learns more robust and more transferable features than Adam and SGD, with a theoretical analysis supporting the empirical findings.

English abstract

Muon has recently emerged as a state-of-the-art optimizer for pretraining Large Language Models (LLMs) and vision classifiers. Despite its efficiency advantage over Adam and SGD, the feature-learning advantage of Muon remains unclear. This paper investigates Muon's feature-learning advantage through the lens of robustness and transferability. First, by evaluating pretrained models on corrupted images and texts, we show that features learned by Muon are consistently more robust than those learned by Adam and SGD across different architectures, including transformers and Convolutional Neural Networks (CNNs). Using trained layer-wise probes, we further show that this robustness advantage is reflected in larger logit margins across layers. Second, by training linear classifiers or fine-tuning full models from pretrained parameters on downstream tasks, we demonstrate that Muon-learned features transfer more effectively than those learned by Adam and SGD. This transferability advantage is further supported by the diversity of hidden states across layers, as measured by effective rank. Finally, in a representative classification problem with multi-component features, we prove that Muon attains larger margins and higher effective rank than Adam and SGD, providing theoretical support for our empirical findings.
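
Effective rank, used in this abstract as a diversity measure for hidden states, can be sketched in Python. The abstract does not say which definition is used, so this assumes the common entropy-based one: the exponential of the Shannon entropy of the singular values after normalising them to sum to 1. The function takes singular values directly (computing them from a hidden-state matrix would need an SVD routine, which is left out); the name and inputs are hypothetical.

```python
import math
from typing import Sequence


def effective_rank(singular_values: Sequence[float]) -> float:
    """exp(Shannon entropy) of the singular values normalised to sum to 1."""
    positive = [s for s in singular_values if s > 0]
    total = sum(positive)
    probs = [s / total for s in positive]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)


flat = effective_rank([1.0, 1.0, 1.0, 1.0])    # all directions used equally
peaked = effective_rank([1.0, 0.0, 0.0, 0.0])  # one dominant direction
skewed = effective_rank([10.0, 1.0, 1.0, 1.0])
```

Under this definition a flat spectrum over four directions gives an effective rank of 4, a single dominant direction gives 1, and a skewed spectrum falls in between, so a higher value indicates that the representation spreads its variance over more directions.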

2606.09655 2026-06-09 cs.CL New submission

Beyond Accuracy: Community Perspectives on Machine Translation

Yujun Wang, Ehud Reiter, Shimei Pan, Steffen Eger, Wei Zhao

Affiliations: University of Technology Nuremberg; University of Maryland, Baltimore County; University of Aberdeen

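AI summary: By analysing social media posts from four stakeholder communities (AI developers, professional translators, language learners, and language service providers), this paper reveals disagreements and conflicts between communities over machine translation technology and stresses the importance of listening to the needs of user communities.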

English abstract

Despite remarkable progress in machine translation (MT), non-AI communities have raised growing concerns about MT systems, suggesting a noticeable gap between technical advancement and the needs of real-world users. For instance, while NLP researchers focus on benchmark performance, end users care about ethical concerns, trust, reliability, costs, and more. We argue that listening to various user communities is essential so that research efforts are directed towards the problems these communities care about. To this end, we present the first large-scale analysis of what four stakeholder communities (AI developers, professional translators, language learners, and language service providers) post about MT technology on social media. We construct a dataset of 79,286 posts and comments from Reddit, Facebook, Bluesky, and Mastodon from 2019 to 2025, and analyse where, how, and why these communities disagree. Overall, we find that communities often disagree, and even show strong conflicts due to polarised sentiments on topics such as translation quality, efficiency, and reliability. This is because these communities approach these topics differently: the AI community frames them as technical and computational problems, while non-AI (user) communities care more about quality nuances, time savings, user trust, and broader social issues.

2606.09653 2026-06-09 cs.LG New submission

A Unifying Framework for Concept-Based Representational Similarity

Grégoire Dhimoïla, Victor Boutin, Agustin Martin Picard, Thomas Fel, Thomas Serre

Affiliations: Brown University; ENS Paris Saclay; CNRS; DEEL - IRT Saint Exupéry; Goodfire

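AI summary: Proposes a unifying framework that decomposes concept alignment along two axes (representations vs. concepts, instance-wise vs. distributional), defines four properties, and introduces the intervention-based benchmark InterVenchA and the Coupled Sparse Autoencoder (CoSAE), showing that alignment is a multi-objective problem.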

English abstract

Learned representations across models and modalities often exhibit striking structural similarities, suggesting shared underlying concept decompositions. However, concept alignment remains poorly defined: existing approaches optimize different objectives under the same terminology, obscuring what is actually aligned. We propose a unifying framework that decomposes alignment along two axes: what is aligned (representations vs. concepts) and at what level (instance-wise vs. distributional). This induces four corresponding properties (instance-wise and distributional variants of translation and concept consistency) and reveals precisely which of these guarantees existing methods provide. We further introduce InterVenchA, an intervention-based benchmark that separately measures extraction quality, translation quality, and concept consistency. Through theory and experiments, we show that commonly assumed equivalences between alignment objectives fail in practice: optimizing one property does not reliably recover the others, and purely unsupervised objectives fail to recover meaningful instance-level alignment. We then propose the Coupled Sparse Autoencoder (CoSAE), which jointly enforces complementary alignment objectives. Strong alignment emerges only in this regime. Surprisingly, as little as 0.1% paired data is sufficient to recover instance-level alignment when anchoring distributional objectives. Overall, our results show that concept alignment is fundamentally multi-objective: it must be defined, measured, and optimized as such.