arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.10874 2026-06-10 cs.CV math.QA quant-ph New submission

Schmidt Decomposition-Based Methods for Efficient Quantum Image Encoding

Ana-Maria Pangeva, Yassine Ferhi, Alexander Geng, Andreas Weinmann, Desislava Ivanova, Ali Moghiseh

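AI summary: To address the high circuit complexity of quantum image encoding on NISQ devices, the paper proposes a low-rank approximation based on Schmidt decomposition that substantially reduces circuit depth and gate count while preserving image quality; the FRQI model achieves a 97% depth reduction with an MSE of only about 0.27.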

Abstract

In quantum image processing, a fundamental step is encoding classical image data into quantum states. This can be achieved using methods such as Flexible Representation of Quantum Images (FRQI), Quantum Probability Image Encoding (QPIE), and Novel Enhanced Quantum Representation (NEQR). However, on real quantum hardware, these encodings can quickly lead to circuits with many gates, large circuit depth, and high qubit usage, which is a problem for Noisy Intermediate-Scale Quantum (NISQ) devices. In this work, we investigate whether low-rank state approximation, formulated via Schmidt decomposition, can help reduce this complexity. The method keeps only the most significant parts of a quantum state's entanglement structure, making state preparation more efficient while preserving most of the image information. We compare the three encoding techniques in their original form and with low-rank approximation, evaluating metrics such as circuit depth, CNOT count, MSE, and visual quality of reconstructed images. The results reveal meaningful trade-offs between accuracy and resource efficiency, with the FRQI model achieving a 97 percent reduction in circuit depth while maintaining a near-perfect reconstruction (MSE of about 0.27). This demonstrates the potential of low-rank techniques for advancing practical quantum image processing on near-term hardware.
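
A minimal sketch of the truncation idea, assuming an amplitude-style encoding of a 4x4 image into a 16-entry unit vector split across a 2+2 qubit bipartition. The helper names (`normalize`, `leading_schmidt_term`, `rank_one_mse`), the choice of bipartition, and the use of power iteration are illustrative assumptions, not the authors' pipeline; only the leading Schmidt term is kept.

```python
import math

def normalize(pixels):
    """Amplitude-encode pixel values as a unit-norm state vector."""
    norm = math.sqrt(sum(p * p for p in pixels))
    return [p / norm for p in pixels]

def leading_schmidt_term(matrix, iters=200):
    """Largest singular triple (sigma, u, v) by power iteration.

    Assumes a non-zero, non-negative matrix so that the uniform start
    vector overlaps the leading singular vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u_norm = math.sqrt(sum(x * x for x in u))
        u = [x / u_norm for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v

def rank_one_mse(pixels, side=4):
    """Keep one Schmidt term and report (leading coefficient, MSE)."""
    state = normalize(pixels)
    # Rows index the first subsystem, columns the second.
    matrix = [state[r * side:(r + 1) * side] for r in range(side)]
    sigma, u, v = leading_schmidt_term(matrix)
    approx = [sigma * u[i] * v[j] for i in range(side) for j in range(side)]
    mse = sum((a - b) ** 2 for a, b in zip(state, approx)) / len(state)
    return sigma, mse

# Hypothetical test images: a separable gradient and a diagonal pattern.
separable = [(i + 1) * (j + 1) for i in range(4) for j in range(4)]
diagonal = [1.0 if i == j else 0.0 for i in range(4) for j in range(4)]
```

For the separable image the state has a single Schmidt term, so the rank-1 error is numerically zero. For the diagonal image all four Schmidt coefficients are equal, and the summed squared error of the rank-1 truncation is 1 minus the squared leading coefficient.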

2606.09857 2026-06-10 cs.LG physics.comp-ph New submission

Uncertainty-aware Multi-fidelity Closure via Conditional Normalizing Flows

Jice Zeng, Shady E. Ahmed, David Barajas-Solano, Panos Stinis

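AI summary: Proposes an uncertainty-aware multi-fidelity framework based on conditional normalizing flows that learns a probabilistic mapping from low-fidelity to high-fidelity coefficients to address the closure problem in reduced-order models; on a vortex merging problem, residual learning outperforms direct learning.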

Abstract

Reduced-order models (ROMs) provide an efficient surrogate for complex multiscale systems, but their predictive accuracy is often compromised by truncation errors and the inadequate representation of interactions between resolved and unresolved scales. The missing effect of truncated (unresolved) scales on ROM (resolved) scales is often referred to as the closure problem. In this work, we formulate ROM closure modeling as a multi-fidelity (MF) learning problem and propose an uncertainty-aware MF framework based on conditional normalizing flows to enhance ROM predictive accuracy. The proposed approach learns a probabilistic mapping from low-fidelity (LF) ROM coefficients to high-fidelity (HF) coefficients, thereby improving predictive fidelity while quantifying the uncertainty associated with the learned closure. Two correction strategies are investigated: direct learning, in which HF coefficients are predicted directly from LF inputs, and residual learning, which learns the discrepancy between LF and HF coefficients and uses it to recover the corrected HF solution. The framework is demonstrated on a vortex merging problem governed by the two-dimensional Navier-Stokes equations. Results show that both correction strategies improve ROM accuracy over the uncorrected ROM, with residual learning achieving consistently better performance than direct learning. Moreover, the two proposed deep generative model-based strategies provide uncertainty quantification for the corrected ROM coefficients, which is critical for assessing prediction confidence and supporting the reliable use of ROMs in practical applications.

2606.09950 2026-06-10 cs.LG nucl-th physics.comp-ph physics.data-an New submission

Integrating Out, Twice: The Open-System Case That Neural-Network Ensemble Theory Is Missing

Jin Lei

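AI summary: Shows that averaging a neural network over its parameters is equivalent to Gaussian marginalization, and points out that ensemble theory covers only the closed-system case and lacks the open-system case. Borrowing from nuclear reaction theory, the note describes the open system with a non-Hermitian effective generator, tests it on attention maps and other applications, finds a mostly negative result, and explains the structural reason.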

Abstract

Averaging a neural network over its random parameters and marginalizing a Gaussian sector are the same operation, the Schur complement of the eliminated block, and when that block is closed it returns a covariance and its inverse. That is all a network ensemble produces, the closed case. The open case is missing, and nuclear reaction theory has it worked out. Projecting a scattering problem onto a chosen set of channels, with the rest carrying probability irreversibly to a continuum, leaves a non-Hermitian effective generator that conserves and itemizes exactly what it loses: the nuclear optical model and its generalized optical theorem. I set the two cases side by side using only the moments of a distribution, the algebra of Gaussians, and block inversion, no field theory, and give the closed-case dictionary in full: the neural tangent kernel is the Fisher sensitivity kernel, the infinite-width Gaussian limit is the Gaussian-process emulator, and the lazy-to-feature transition is the validity boundary of a reduced-basis emulator. I then test the open export on a truncated attention map, a token-level transfer operator, and a sparse expert router, and report a mostly negative result. The conserved flux ledger ports wherever openness is genuinely present, but its distinctive content is absent, an artifact of the chosen partition, or pinned near a floor by the training objective, and the operationally useful uncertainty turns out to be epistemic, living in the closed half of the correspondence, not the open one. The negative has a structural reason this note makes precise: the open case needs an eliminated sector with a continuous spectrum and wave-like, not relaxational, dynamics, which mainstream learning's finite or dissipative objects do not supply. This is a note, not a result; its main finding is that negative one, and its value is the map that locates it.
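
The closed case can be checked numerically in the smallest setting, a two-variable Gaussian with scalar blocks. The sketch below assumes a precision matrix [[a, b], [b, c]] with hypothetical values; eliminating the second variable gives the Schur complement a - b^2/c as the marginal precision, which should be the inverse of the (1,1) entry of the covariance. The function names are illustrative only.

```python
def schur_complement(a, b, c):
    """Schur complement of the eliminated block c in [[a, b], [b, c]]."""
    return a - b * b / c

def inverse_2x2(a, b, c):
    """Inverse of the symmetric matrix [[a, b], [b, c]]."""
    det = a * c - b * b
    return [[c / det, -b / det], [-b / det, a / det]]

# Hypothetical positive-definite precision matrix entries.
a, b, c = 2.0, 0.6, 1.5

# Marginal precision of the kept variable after integrating out the other.
marginal_precision = schur_complement(a, b, c)

# Marginal variance read directly from the full covariance matrix.
marginal_variance = inverse_2x2(a, b, c)[0][0]
```

The product `marginal_precision * marginal_variance` equals 1 up to rounding, which is the "covariance and its inverse" pairing the abstract describes for a closed eliminated block.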

2606.10324 2026-06-10 cs.LG cond-mat.stat-mech stat.ML New submission

Rank Collapse, Fixed Points, and the Renormalization Group Structure of MLP Residual Networks

Parviz Haggi-Mani, Irina Rish

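AI summary: Using a masked-prediction task for an MLP residual network on synthetic Markov chains, the paper reports what it describes as the first quantitative evidence of selective rank collapse along network depth, consistent with integrating out irrelevant degrees of freedom in the renormalization group, and finds that inter-layer kernel drift is concentrated at a few transitions.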

Comments 16 pages, 9 figures

Abstract

The analogy between deep neural network forward passes and renormalization group (RG) flows has been repeatedly noted in the literature, but existing treatments remain qualitative: depth is described as a coarse-graining scale, attention is likened to a partition function, and representations are said to flow toward fixed points. No existing work has defined a measurable RG order parameter, tested it under controlled variation of the input distribution, or made quantitative predictions that are empirically verified. We study the simplest architecture for which the analogy is tractable: a pure MLP residual stack trained on masked token prediction over synthetic Markov chain sequences with known spectral properties. We report three findings. (i) The effective rank of the residual stream decreases monotonically with depth after training, consistent with progressive integration of irrelevant degrees of freedom. (ii) This rank collapse is selective: it occurs for chains with short correlation length approximately 1 but is absent for chains with long correlation length approximately 7, measured at the position level to control for mean-pooling artifacts. The network preserves exactly the degrees of freedom relevant to the prediction task, the content of the RG relevance criterion. (iii) Inter-layer kernel drift is concentrated at one or two specific transitions, with the remainder of the network near a fixed point, consistent with a discrete fixed-point plateau. Together these findings constitute the first quantitative, position-level evidence that MLP residual networks implement a selective coarse-graining procedure governed by the spectral structure of the input distribution.
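
The abstract does not say how effective rank is computed. One common choice, used here purely as an assumption, is the exponential of the Shannon entropy of the normalized singular values. The sketch applies it to hypothetical per-layer spectra to show what a monotone decrease with depth would look like; the numbers are made up for illustration.

```python
import math

def effective_rank(singular_values):
    """Entropy-based effective rank: exp(H(p)) with p_i = s_i / sum(s)."""
    total = sum(singular_values)
    probs = [s / total for s in singular_values if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)

# Hypothetical singular-value spectra of the residual stream at three depths.
layer_spectra = [
    [1.0, 1.0, 1.0, 1.0],    # shallow layer: all directions equally used
    [1.0, 0.6, 0.2, 0.05],   # middle layer: spectrum starts to concentrate
    [1.0, 0.1, 0.01, 0.001], # deep layer: close to rank one
]
ranks_by_depth = [effective_rank(s) for s in layer_spectra]
```

A flat spectrum of four values gives an effective rank of 4, and a spectrum dominated by one value gives a number close to 1, so `ranks_by_depth` decreases from the first entry to the last.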

2606.10868 2026-06-10 cs.LG astro-ph.IM New submission

When Do Autoregressive Sequence Models Forecast Physical Wavefields? A Controlled Study on Synthetic Seismograms

Waleed Esmail, Stuart Russell, Jana Klinge, Alexander Kappes, Christine Thomas

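AI summary: Controlled ablations on synthetic three-component seismograms show that multi-token prediction is the main factor stabilizing autoregressive wavefield rollout, and point to a sharp context-ratio threshold and the need for phase-aware objectives.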

Comments 16 pages, 5 figures and 3 tables

Abstract

Long-horizon autoregressive forecasting of oscillatory physical signals, such as seismograms, gravitational-wave strain, and similar wavefields, is limited by error accumulation: as a causal model is fed its own outputs over hundreds of steps, small per-step errors compound into phase drift that pointwise metrics fail to detect. We ask when such rollout stays stable, using synthetic three-component seismograms as a physically structured testbed and the SeismoGPT autoregressive forecaster as the model under study. Through controlled, intra-architecture ablations evaluated on free-running rollout with paired significance tests, we isolate the contribution of each design choice. Multi-token prediction is the dominant stabilizer, accounting for almost the entire improvement over a single-token baseline (+0.040 median NCC); a horizon-embedding hybrid prediction head and a cross-horizon STFT-magnitude coherence loss each add a small but consistent further gain. Performance depends sharply on a context-ratio threshold near one, roughly the full P-S interval of observed signal, below which rollout generalization collapses. The dominant residual failure is a polarity inversion that a magnitude-based spectral loss cannot, by construction, penalize, identifying phase-aware objectives as the natural next step. We frame this as a controlled study of rollout stability on oscillatory wavefields, not a benchmark of forecasting architectures.
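
The polarity-inversion point can be seen on a toy signal. The sketch below is a simplification under stated assumptions: it uses a single DFT frame rather than a full STFT, and it takes NCC to mean zero-lag, mean-removed normalized cross-correlation, which may differ from the paper's exact metric. A sign-flipped forecast has the same magnitude spectrum as the target while its NCC is -1.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitude of each DFT bin (one frame, no windowing)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]

def ncc(a, b):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)) * \
          math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return num / den

# Toy target: three cycles of a sine over 32 samples; forecast is its negation.
target = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
flipped = [-x for x in target]

# Largest per-bin difference between the two magnitude spectra.
magnitude_gap = max(
    abs(p - q) for p, q in zip(dft_magnitudes(target), dft_magnitudes(flipped))
)
polarity_ncc = ncc(target, flipped)
```

Since negation only changes the phase of every bin by pi, any loss built from magnitudes alone sees `flipped` as a perfect match, and the same argument applies frame by frame in an STFT.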

2606.07998 2026-06-10 cs.LG cs.AI Updated version

Enhancing AI Interpretability and Safety through Localised Architectures

Ian Seet, Jonas Bozenhard, Simon Ostermann

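AI summary: To address the poor interpretability and high computational cost of large generative AI models, the paper argues for localised machine learning architectures with lower bandwidth and higher per-node expressivity to improve interpretability and efficiency, and evaluates the suitability of several hardware implementation options.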

Abstract

Recent advances in generative AI, especially powerful Large Language Models (LLMs) and Large Reasoning Models (LRMs), raise concerns over the interpretability, safety and sustainability of these large and opaque AI models. The power of such architectures is derived not only from the scalability of deep neural networks, but also massively parallel hardware such as GPU clusters. The diffuse nature of deep neural networks gives them great function-approximation capability when provided with sufficient training data but imposes a cost in interpretability and computational efficiency. Observing that localised machine learning (ML) models tend to be more interpretable and computationally efficient than deep neural networks on small datasets, we reason by analogy that similar advantages may apply to specific localised hardware ML architectures. We argue that localised architectures with lower bandwidth but higher expressivity per node have the potential to be fundamentally more interpretable than deep neural networks running on GPU clusters while remaining competitive for smaller datasets. We then evaluate the suitability of various hardware ML paradigms for implementing such localised architectures and evaluate their per-node expressivity, energy efficiency and practical maturity of the technology required.

2606.06624 2026-06-10 cs.LG Updated version

Principles and Practice of Deep Representation Learning: or a Mathematical Theory of Memory

Sam Buchanan, Druv Pai, Peng Wang, Yi Ma

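AI summary: Through the lens of representation learning, the book explains the design principles of modern neural network architectures using optimization and information theory, aiming to open the black box and improve interpretability, reliability, and controllability.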

Comments version 2; TeX source and supplementary material at https://ma-lab-berkeley.github.io/deep-representation-learning-book/

Abstract

In the current era of deep learning and especially generative models, there is significant investment in training very large deep neural networks. Thus far, such models have been "black boxes" that are difficult to understand in the sense that they have opaque internal mechanisms, leading to difficulties in interpretability, reliability, and control. Naturally, this lack of understanding has led to both hype and fear. This book is an attempt to "open the black box" and understand the mechanisms of large deep networks through the perspective of representation learning, which is a major factor, arguably the single most important one, in the empirical power of deep learning models. A brief outline of this book is as follows. Chapter 1 will summarize the threads that underlie the whole text. Chapters 2, 3, 4, 5, and 6 will explain the design principles of modern neural network architectures through optimization and information theory, reducing the process of architecture development (long described as a sort of "alchemy") to undergraduate-level linear algebra and calculus exercises once the underlying principles are introduced. Chapters 7 and 8 will discuss applications of these principles to solve problems in more paradigmatic ways, obtaining new methods and models which are efficient, interpretable, and controllable by design, and yet no less powerful, and sometimes even more powerful, than the black-box models they resemble. Chapter 9 will discuss potential future directions for deep learning, the role of representation learning, as well as some open problems.

2604.13717 2026-06-10 cs.CL Updated version

On Cost-Effective LLM-as-a-Judge Improvement Techniques

Ryan Lail, Luke Markham

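AI summary: Studies four techniques, including ensemble scoring and task-specific criteria injection, for improving LLM judge accuracy, reaching up to 85.8% accuracy on RewardBench 2 at low cost.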

Comments Accepted at the ICML 2026 workshops "Statistical Frameworks for Uncertainty in Agentic Systems" and "Combining Theory and Benchmarks: Towards a Virtuous Cycle to Understand and Guarantee Foundation Model Performance". 13 pages, 9 figures

Abstract

Using a language model to score or rank candidate responses has become a scalable alternative to human evaluation in reinforcement learning from human feedback (RLHF) pipelines, benchmarking, and application layer evaluations. However, output reliability depends heavily on prompting and aggregation strategy. We present an empirical investigation of four drop-in techniques -- ensemble scoring, task-specific criteria injection, calibration context, and adaptive model escalation -- for improving LLM judge accuracy on RewardBench 2, with a unifying lens of noise control on the stochastic judge: ensembling as Monte Carlo averaging over per-call noise, criteria injection as between-response discrimination sharpening, and per-response score variance as an uncertainty signal. Ensemble scoring and task-specific criteria injection (the latter virtually cost free) together reach up to 85.8% accuracy, +13.5pp over baseline. Calibration context and adaptive model escalation also improve over baseline but are dominated by criteria + ensembling on the cost-accuracy Pareto frontier. Small models benefit disproportionately from ensembling, making high-accuracy LLM judges accessible at low cost. We show that these techniques generalise across model providers, evaluating on both OpenAI GPT and Anthropic Claude families.
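
The "ensembling as Monte Carlo averaging" view can be illustrated with a simulated judge. Everything below is a stand-in: `noisy_judge` replaces a real LLM call with a true score plus Gaussian noise, and the score of 7.0, the noise level, and the trial count are arbitrary assumptions. The point is only that averaging k independent calls shrinks the spread of the final score by roughly a factor of sqrt(k).

```python
import random
import statistics

def noisy_judge(true_score, rng, noise_sd=1.0):
    """Hypothetical stand-in for one stochastic LLM judge call."""
    return true_score + rng.gauss(0.0, noise_sd)

def ensemble_score(true_score, rng, k):
    """Average k independent judge calls for the same response."""
    return statistics.fmean([noisy_judge(true_score, rng) for _ in range(k)])

def score_spread(k, trials=2000, seed=0):
    """Standard deviation of the ensemble score across repeated trials."""
    rng = random.Random(seed)
    scores = [ensemble_score(7.0, rng, k) for _ in range(trials)]
    return statistics.pstdev(scores)

spread_single = score_spread(1)    # one call per response
spread_ensemble = score_spread(8)  # eight calls averaged per response
```

In the same spirit, the variance of the k raw scores for a single response could serve as the per-response uncertainty signal the abstract mentions, though how the authors compute it is not stated here.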

2602.19393 2026-06-10 cs.LG Updated version

In Defense of Cosine Similarity: Normalization Eliminates the Gauge Freedom

Taha Bouhsine

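AI summary: Proves that when embeddings are constrained to the unit sphere, the ambiguity from the diagonal gauge matrix vanishes and cosine distance is monotonically equivalent to Euclidean distance, resolving the concern that cosine similarity is arbitrary.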

Comments This was a blog post companion draft, it needs to be updated to fit as a preprint, will do later

Abstract

Steck, Ekanadham, and Kallus [arXiv:2403.05440] demonstrate that cosine similarity of learned embeddings from matrix factorization models can be rendered arbitrary by a diagonal "gauge" matrix $D$. Their result is correct and important for practitioners who compute cosine similarity on embeddings trained with dot-product objectives. However, we argue that their conclusion, cautioning against cosine similarity in general, conflates the pathology of an incompatible training objective with the geometric validity of cosine distance on the unit sphere. We prove that when embeddings are constrained to the unit sphere $\mathbb{S}^{d-1}$ (either during or after training with an appropriate objective), the $D$-matrix ambiguity vanishes identically, and cosine distance reduces to exactly half the squared Euclidean distance. This monotonic equivalence implies that cosine-based and Euclidean-based neighbor rankings are identical on normalized embeddings. The "problem" with cosine similarity is not cosine similarity; it is the failure to normalize.
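
The half-squared-distance identity follows from expanding $\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\,u \cdot v = 2 - 2\cos\theta$ for unit vectors. A small numeric check, with hypothetical vectors and helper names chosen only for illustration:

```python
import math

def normalize(v):
    """Project a vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(u, v):
    """1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)

def squared_euclidean(u, v):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

# Hypothetical embeddings, normalized before comparison.
query = normalize([3.0, 4.0, 0.0])
candidates = [normalize(c) for c in ([1.0, 2.0, 2.0], [4.0, 3.0, 0.0], [0.0, 0.0, 1.0])]

# Neighbor rankings under the two distances.
rank_cosine = sorted(range(len(candidates)),
                     key=lambda i: cosine_distance(query, candidates[i]))
rank_euclid = sorted(range(len(candidates)),
                     key=lambda i: squared_euclidean(query, candidates[i]))
```

On normalized inputs each cosine distance equals half the corresponding squared Euclidean distance, so `rank_cosine` and `rank_euclid` list the candidates in the same order.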

2509.21925 2026-06-10 cs.LG cs.AI Updated version

Generation Properties of Stochastic Interpolation under Finite Training Set

Yunchen Li, Shaohui Lin, Zhou Yu

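AI summary: Studies the theoretical properties of stochastic interpolation generative models under a finite training set, derives closed-form expressions for the optimal velocity field and score function, characterizes the behavior of the deterministic and stochastic generative processes, and defines underfitting and overfitting.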

Comments We found proof errors affecting key theorems and wish to avoid misleading readers. We have submitted a substantially revised new paper, arXiv:2606.08554, retaining only two old theorems and adding five new ones

Abstract

This paper investigates the theoretical behavior of generative models under finite training populations. Within the stochastic interpolation generative framework, we derive closed-form expressions for the optimal velocity field and score function when only a finite number of training samples are available. We demonstrate that, under some regularity conditions, the deterministic generative process exactly recovers the training samples, while the stochastic generative process manifests as training samples with added Gaussian noise. Beyond the idealized setting, we consider model estimation errors and introduce formal definitions of underfitting and overfitting specific to generative models. Our theoretical analysis reveals that, in the presence of estimation errors, the stochastic generation process effectively produces convex combinations of training samples corrupted by a mixture of uniform and Gaussian noise. Experiments on generation tasks and downstream tasks such as classification support our theory.

2602.17547 2026-06-10 cs.AI cs.CL

KLong: Training LLM Agent for Extremely Long-horizon Tasks

Yue Liu

AI summary: KLong tackles extremely long-horizon tasks with trajectory-splitting SFT and progressive RL training; the 106B model surpasses Kimi K2 Thinking by 11.28% on PaperBench.

Comments We request standard withdrawal of this submission because significant errors were discovered in the data after submission, which affect the validity of the results. We may submit a corrected version later

Abstract

This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via trajectory-splitting SFT, then scale it via progressive RL training. Specifically, we first activate basic agentic abilities of a base model with a comprehensive SFT recipe. Then, we introduce Research-Factory, an automated pipeline that generates high-quality training data by collecting research papers and constructing evaluation rubrics. Using this pipeline, we build thousands of long-horizon trajectories distilled from Claude 4.5 Sonnet (Thinking). To train with these extremely long trajectories, we propose a new trajectory-splitting SFT, which preserves early context, progressively truncates later context, and maintains overlap between sub-trajectories. In addition, to further improve long-horizon task-solving capability, we propose a novel progressive RL, which schedules training into multiple stages with progressively extended timeouts. Experiments demonstrate the superiority and generalization of KLong, as shown in Figure 1. Notably, our proposed KLong (106B) surpasses Kimi K2 Thinking (1T) by 11.28% on PaperBench, and the performance improvement generalizes to other coding benchmarks like SWE-bench Verified and MLE-bench.

2606.10280 2026-06-10 eess.IV cs.CV New submission

Overlapped Wavelet Diffusion for Low-Light Image Enhancement

Fen Peng, Taizo Suzuki, Seisuke Kyochi

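AI summary: Proposes OWDiff, an overlapped wavelet diffusion framework that removes blocking artifacts with an overlapped wavelet transform and recovers detail with a low-frequency-guided high-frequency enhancement module, outperforming existing methods on the LOLv1 and LOLv2-real datasets.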

Comments Advance published in IEICE Transactions on Information and Systems. DOI: 10.1587/transinf.2026PCP0006. Code: https://github.com/FinnPeg/Overlapped-Wavelet-Diffusion

Journal ref IEICE Transactions on Information and Systems, Advance online publication, 2026

Abstract

In this study, we propose an overlapped wavelet diffusion framework for Low-Light Image Enhancement (LLIE), which incorporates two complementary components to achieve blocking artifact-free and detail-preserving enhancement. Although recent diffusion-based LLIE methods have demonstrated remarkable performance compared with traditional approaches, DiffLL still suffers from blocking artifacts caused by the Haar Wavelet Transform (WT) and blurred edges or over-smoothed textures due to the limitations of its High-Frequency Restoration Module (HFRM). To overcome these issues, we introduce an Overlapped WT (OWT) that incorporates correlations across neighboring regions, thereby structurally preventing blocking artifacts. Furthermore, we integrate a low-frequency-guided High-Frequency Enhance Block (HFEBlock) to strengthen detail recovery, yielding sharper edges and more reliable textures. Extensive experiments on the LOLv1 and LOLv2-real datasets demonstrate that our framework, termed OWDiff, consistently outperforms existing LLIE methods both qualitatively and quantitatively, achieving superior visual quality while maintaining computational efficiency. OWDiff effectively addresses the structural limitations of the Haar WT and the HFRM, achieving an average PSNR gain of 0.58 dB, along with a 1.64% relative improvement in SSIM and a 5.9% relative reduction in LPIPS, compared to DiffLL across both the LOLv1 and LOLv2-real datasets.

2606.09942 2026-06-10 cs.SE cs.AI New submission

Anomaly Detection and Root Cause Analysis for Microservice Systems

Luan Pham

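AI summary: To address five limitations of anomaly detection and root cause analysis for microservice systems, the thesis proposes the end-to-end methods BARO and EventADL and the multimodal framework TORAI, builds the RCAEval benchmark, and validates effectiveness and robustness experimentally.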

Comments This is the pre-print of my PhD thesis, submitted to RMIT University

Abstract

Microservice systems are widely used to build cloud applications, yet their complexity makes failures inevitable, degrading user experience and causing economic loss. Automated anomaly detection and root cause analysis (RCA) are now active research areas, but existing techniques share five limitations. First, most treat anomaly detection and RCA separately, assuming anomalies are detected correctly, and falter when detection is imprecise due to noise or delay. Second, they focus on metrics, logs, and traces, leaving event data such as API calls and configuration changes underexplored. Third, many require a given service call graph and cannot diagnose without one. Fourth, the field lacks standardised datasets and evaluation frameworks, so methods are hard to compare fairly. Fifth, although causal inference-based RCA has become dominant, its effectiveness, efficiency, and robustness remain unclear. This thesis addresses these limitations through two groups of contributions. The first introduces methods that exploit observability data both independently and collectively. BARO is an end-to-end anomaly detection and RCA approach for metric data. EventADL is an end-to-end framework for event data. TORAI is a multimodal RCA framework that requires no service call graph. Extensive experiments on real microservice systems demonstrate their effectiveness and robustness. The second group delivers benchmarking datasets, an evaluation framework, and systematic evaluation efforts. RCAEval is a comprehensive benchmark providing ready-to-use datasets and reproducible baselines for future research. A systematic evaluation of existing RCA methods, especially causal inference-based approaches, offers insights that guide future directions. This thesis thereby advances automated anomaly detection and RCA for microservice failures, enabling future research on incident mitigation and remediation.

2606.09930 2026-06-10 cs.PL cs.LG cs.SC New submission

Compile Once, Differentiate Everywhere: A Differentiable Meta-Circular Interpreter

Lucas Sheneman

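AI summary: Presents a compiler that translates a subset of Scheme into differentiable computation graphs, enabling differentiable meta-circular interpretation (DMCI) with reverse-mode automatic differentiation for programs that use closures, recursion, and data structures, without recompilation.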

Abstract

The boundary between program execution and gradient-based optimization has long limited the use of code itself as a learnable scientific model. We present a compiler that translates a self-hosting subset of Scheme into differentiable computation graphs for autograd backends. Because the subset can compile its own evaluator, this yields differentiable meta-circular interpretation (DMCI): a compiled Scheme interpreter executes programs supplied as data, while reverse-mode autodiff propagates gradients to continuous constants embedded in those programs. The interpreter is compiled once, so new programs inherit differentiability without recompilation or custom gradient machinery, while retaining closures, recursion, and data structures. We prove that gradients through the compiled interpreter are correct almost everywhere and show that they match direct compilation to numerical precision across 171 recursive and higher-order program-seed pairs. We then use DMCI for program-and-parameter co-search, where a large language model proposes Scheme programs and exact gradients calibrate their continuous parameters through a single frozen interpreter. This enables OpenEvolve-style program search in which an outer loop proposes discrete program structures and DMCI supplies exact gradient-based calibration of each candidate's continuous parameters. On battery capacity-fade data, the search recovers a knee-like degradation structure and improves held-out extrapolation over hand-crafted baselines on the harder early-extrapolation split, matching them on the later split. On a high-dimensional El Nino inverse problem, DMCI optimizes an interpreted Kalman-filter likelihood where gradient-free search fails. These results extend symbolic regression and neurosymbolic search from closed-form expressions to executable, stateful programs, making model-generated code directly optimizable against data.

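2606.09858 2026-06-10 cs.IT cs.AI math.IT new submission

Support sufficiency as action-sufficient compression: a single-cycle rate-regret formulation

Mark Walsh

AI summary This paper formalizes support sufficiency as action-sufficient compression: exact sufficiency is defined by the quotient of support space under policy equivalence, and approximate sufficiency by bounded expected policy regret. In the finite single-cycle setting this yields a rate-regret problem, and action adequacy is distinguished from reconstruction fidelity, information-bottleneck prediction, and rational inattention.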

Comments 22 pages. Submitted to Journal of Mathematical Psychology. Formal single-cycle model of action-sufficient support compression and rate-regret sufficiency

English abstract

Robust decision-making requires compression. A system that forms a rich support state cannot usually preserve its full structure at the point of action. It must retain only those distinctions needed to act, verify, abstain, or defer under the current consequence geometry. This paper formalizes support sufficiency as action-sufficient compression. Let $H$ denote a full support state, $\mathcal{A}$ a finite action set, and $Z$ a consequence geometry specifying payoff structure. For fixed $Z$, the coarsest exactly action-sufficient compression is the quotient of support space by policy equivalence. Two support states may be merged exactly when they require the same optimal action. This clarifies why content-only and scalar-confidence-only arbitration fail whenever their induced partitions cross action boundaries. Approximate sufficiency is then defined by bounded expected policy regret. In the finite single-cycle setting, this yields a rate-regret problem with source $H$, reproduction alphabet $\mathcal{A}$, and distortion given by consequence-sensitive regret. The optimal stochastic action channel inherits the standard rate-distortion Gibbs form, applied here to support states with regret distortion. The contribution is interpretive: action adequacy is distinguished from reconstruction fidelity, information-bottleneck prediction, and rational inattention. Robust single-cycle arbitration does not require preserving all support, but it does require preserving the distinctions that consequence geometry makes action-relevant.
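
The two constructions in the abstract, the quotient by policy equivalence and the Gibbs-form action channel, can be sketched numerically. The code below is an illustration under stated assumptions, not the paper's formulation: the utility table, the uniform source distribution, and the function names are invented, ties in the optimal action are assumed away, and the channel is computed with the standard Blahut-Arimoto iteration from rate-distortion theory with regret used as the distortion.

```python
import math

def regret_matrix(utility):
    """regret[h][a]: utility lost in support state h by taking action a."""
    return [[max(row) - u for u in row] for row in utility]

def policy_quotient(utility):
    """Merge support states that share the same optimal action.

    Assumes each state has a unique optimal action (no ties).
    """
    classes = {}
    for h, row in enumerate(utility):
        classes.setdefault(row.index(max(row)), []).append(h)
    return classes

def gibbs_channel(p_h, regret, beta, iters=200):
    """Blahut-Arimoto: q(a|h) proportional to m(a) * exp(-beta * regret[h][a])."""
    n_h, n_a = len(regret), len(regret[0])
    m = [1.0 / n_a] * n_a
    for _ in range(iters):
        q = []
        for row in regret:
            w = [m[a] * math.exp(-beta * row[a]) for a in range(n_a)]
            z = sum(w)
            q.append([x / z for x in w])
        # m is the action marginal induced by the channel just computed
        m = [sum(p_h[h] * q[h][a] for h in range(n_h)) for a in range(n_a)]
    return q, m

def rate_and_regret(p_h, q, m, regret):
    """Mutual information I(H;A) in bits and expected regret of channel q."""
    rate = exp_regret = 0.0
    for h, row in enumerate(q):
        for a, prob in enumerate(row):
            if prob > 0.0:
                rate += p_h[h] * prob * math.log2(prob / m[a])
                exp_regret += p_h[h] * prob * regret[h][a]
    return rate, exp_regret

# Hypothetical example: four support states, two actions, uniform source.
utility = [[1.0, 0.0], [0.9, 0.0], [0.0, 1.0], [0.2, 1.0]]
p_h = [0.25] * 4
regret = regret_matrix(utility)
classes = policy_quotient(utility)  # states grouped by optimal action
low = rate_and_regret(p_h, *gibbs_channel(p_h, regret, beta=0.0), regret)
high = rate_and_regret(p_h, *gibbs_channel(p_h, regret, beta=50.0), regret)
```

In this toy case the four support states collapse to two classes, one per optimal action. Raising `beta` moves along the trade-off from zero rate with positive expected regret toward one bit with negligible regret, which is all the exact quotient needs here.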

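2604.05013 2026-06-10 cs.SE cs.AI

Scaling Coding Agents via Atomic Skills

Yue Liu

AI summary This paper proposes scaling coding agents through atomic skills: joint reinforcement learning over five basic skills is reported to improve generalization to complex software engineering tasks.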

Comments We request standard withdrawal of this submission because significant errors were discovered in the data after submission, which affect the validity of the results. We may submit a corrected version later

English abstract

Current LLM coding agents are predominantly trained on composite benchmarks (e.g., bug fixing), which often leads to task-specific overfitting and limited generalization. To address this, we propose a novel scaling paradigm that shifts the focus from task-level optimization to atomic skill mastery. We first formalize five fundamental atomic skills, code localization, code editing, unit-test generation, issue reproduction, and code review, that serve as the basis vectors for complex software engineering tasks. Compared with composite coding tasks, these atomic skills are more generalizable and composable. Then, we scale coding agents by performing joint RL over atomic skills. In this manner, atomic skills are consistently improved without negative interference or trade-offs between them. Notably, we observe that improvements in these atomic skills generalize well to other unseen composite coding tasks, such as bug-fixing, code refactoring, machine learning engineering, and code security. The observation motivates a new scaling paradigm for coding agents by training with atomic skills. Extensive experiments demonstrate the effectiveness of our proposed paradigm. Notably, our joint RL improves average performance by 18.7% on 5 atomic skills and 5 composite tasks.

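2606.11053 2026-06-10 econ.TH new submission

Revealing information -- or not -- in a social network of traders

Patrick Allmis, Paolo Pin, Fernando Vega Redondo

AI summary Building on the micro-founded asset-trading model of Kyle (1985), the paper studies why an informed trader may choose to share her information. In the unique equilibrium she reveals it with positive probability, so the price need not fully reveal the asset's return, which affects the distribution of social surplus.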

English abstract

We build upon a simple micro-founded model of asset trading proposed by Kyle (1985) to study under what conditions a trader who is privately informed of the future return of the asset may want to share her information with other traders. Despite what conventional wisdom suggests, we show that in the unique equilibrium of the game the informed trader reveals her information with positive probability. A consequence of it is that, in contrast with the corresponding no-communication benchmark, the equilibrium price need not be fully revealing of the asset's return, even if traders are risk neutral. This, in turn, has significant implications on the distribution of the social surplus. While our model initially assumes that inter-agent communication is restricted by an arbitrarily given social network, we also study which such networks arise when links are endogenously formed through traders' prior connection decisions.

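2606.11047 2026-06-10 econ.EM new submission

Panel Data Estimation of Individual Demand in Markets with Many Consumers

Sarah Moon, Whitney K. Newey

AI summary Studies how panel data can be used to estimate individual demand when prices are determined in markets. The bias of familiar panel methods such as differencing vanishes as the number of consumers per market grows, and macroeconomic effects can be accommodated.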

English abstract

The purpose of this paper is to consider whether and how panel data can be used to estimate individual demand, as opposed to market-level demand, while accounting for simultaneity resulting from prices being determined in markets. We consider linear demand models and random coefficient demand models, together with linear supply models. We find that the bias of individual demand estimates obtained using familiar panel data methods, like differencing, disappears as the number of consumers in each market grows, as long as the time-varying, i.e. idiosyncratic, component of preferences is orthogonal to the unobserved, time-varying component of supply. This approximate control is assumed in many panel discrete choice models and is plausible in other models where idiosyncratic preferences represent random variation in preferences over time. Macroeconomic effects can be allowed for by including regressors characterizing time effects, such as trends and time period dummies, or fixed time effects.

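2606.10998 2026-06-10 econ.TH new submission

Consistent Probabilistic Social Choice Revisited

Florian Brandl, Felix Brandt

AI summary Transfers the characterization of maximal lotteries by Brandt et al. (2016), originally stated for fractional preference profiles, to the standard model with finitely many voters, using a weaker continuity condition and allowing real-valued probabilities.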

Comments 18 pages

English abstract

Brandt et al. (2016) characterized a probabilistic social choice function known as maximal lotteries within a framework based on fractional preference profiles, which abstracts away from individual voters. While this modeling assumption enables a more elegant and transparent proof, it complicates comparison with other results in the literature. The purpose of this note is to transfer their results to the standard model of social choice, where each preference profile is defined for a finite number of voters. Along the way, we prove a slightly stronger version of their main theorem that uses a weaker continuity condition and allows for real-valued (rather than only rational-valued) probabilities.

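2606.10845 2026-06-10 econ.TH new submission

Iterative Elimination of Borda Losers: Axiomatizations of the Baldwin and Nanson Rules

Leo Goto, Satoshi Nakada

AI summary Provides unified axiomatic characterizations of the Baldwin and Nanson voting rules, which recursively eliminate the alternatives with the lowest Borda score (Baldwin) or with Borda scores not exceeding the mean (Nanson), and compares the axioms with Young's (1974) characterization of the Borda rule.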

English abstract

The Baldwin and Nanson rules are two voting rules proposed to identify the Condorcet winner whenever one exists. Both rules operate as recursive Borda elimination procedures: the Baldwin rule successively eliminates the alternatives with the lowest Borda score, whereas the Nanson rule eliminates all alternatives whose Borda scores do not exceed the mean. This paper investigates the axiomatic properties of the Baldwin and Nanson rules and provides unified axiomatic characterizations. In particular, our axioms are closely comparable to Young's (1974) characterization of the Borda rule.
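
The two elimination procedures described above can be sketched in a few lines. The code assumes strict rankings listed best first and recomputes Borda scores among the surviving alternatives in each round. The function names and the handling of a complete tie (stop and return everything still alive) are choices made for this illustration, not details taken from the paper.

```python
def borda_scores(profile, alive):
    """Borda score of each surviving alternative: alternatives ranked below it, summed over voters."""
    scores = {x: 0 for x in alive}
    for ranking in profile:
        kept = [x for x in ranking if x in alive]
        for pos, x in enumerate(kept):
            scores[x] += len(kept) - 1 - pos
    return scores

def eliminate(profile, drop_rule):
    """Repeatedly remove the alternatives selected by drop_rule until one class remains."""
    alive = set(profile[0])
    while len(alive) > 1:
        scores = borda_scores(profile, alive)
        losers = drop_rule(scores)
        if len(losers) == len(alive):  # all survivors tied: nothing more to separate
            break
        alive -= losers
    return alive

def baldwin(profile):
    """Drop the alternatives with the lowest Borda score each round."""
    return eliminate(profile, lambda s: {x for x in s if s[x] == min(s.values())})

def nanson(profile):
    """Drop every alternative whose Borda score does not exceed the mean each round."""
    def at_or_below_mean(s):
        mean = sum(s.values()) / len(s)
        return {x for x in s if s[x] <= mean}
    return eliminate(profile, at_or_below_mean)

# Hypothetical profile: "a" is the Condorcet winner, but "b" has the top Borda score.
profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2
first_round = borda_scores(profile, {"a", "b", "c"})
```

In this made-up profile the plain Borda count favors `b`, while both elimination rules end with `a`, the alternative that beats every other in pairwise majority comparisons.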

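2606.10681 2026-06-10 econ.TH new submission

Limited belief propagation and contingent thinking

Andrew Ellis, Ran Spiegler

AI summary Represents belief updating after an observation as limited propagation of inference steps through a directed acyclic graph, which accounts for correlation neglect and violations of iterated expectations, with applications to public-good provision and social learning games.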

English abstract

An agent updates her beliefs over a set of variables after observing some of them. We provide a representation of updated beliefs that captures limited propagation of her observation's implications through the directed acyclic graph that represents the relations between all variables. Failure of contingent thinking occurs when she performs fewer inference steps from unobserved variables than observed ones, leading to correlation neglect and violations of iterated expectations. Our framework offers a new perspective on existing experiments about contingent thinking and suggests new directions. We characterize the model's relationship with familiar Bayesian and non-Bayesian benchmarks, and illustrate it with applications to public-good provision and social learning games.

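2606.10438 2026-06-10 econ.TH new submission

Sequential Search with Planning

Ruhi Sonal, Saptarshi Mukherjee, Abhinaba Lahiri, Aniruddha Ghosh

AI summary Models sequential search in product development or resource exploration as an ordered Pandora's box problem with planning costs, proves existence and uniqueness of reservation values that depend on the paid scope, and analyzes how the guarantee, paid-scope, and remaining-horizon effects shape the optimal strategy.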

English abstract

Sequential development of a new product or technology, or natural resource exploration, often progresses through ordered stages with uncertain rewards and requires costly (ex ante) planning to make future stages accessible. We model this process as an ordered Pandora's box problem where a decision-maker first chooses an initial scope, paying a cost that rises with the number of stages made accessible, and may later expand the scope at a marginal adjustment cost. Since the paid planning costs are sunk, the continuation values depend on the state variable ``paid scope''. We prove existence and uniqueness of scope-dependent reservation values, characterize the optimal search strategy as a threshold rule indexed by paid scope, and derive comparative statics. Interactions among three economic forces shape the optimal behavior -- a guarantee effect (a higher current best offer reduces the expected improvement from the next stage and induces earlier stopping), a paid-scope effect (a larger prepaid scope lowers the marginal cost of future access, raises the continuation value, and supports continuation at higher guarantees), and a remaining-horizon effect (fewer stages remaining shrink the option value of continuing). Two examples illustrate how these forces generate distinct planning and search patterns under normal and fat-tailed rewards.
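
For orientation, the reservation value in the classical Pandora's box problem (Weitzman, 1979) is the threshold r at which the inspection cost equals the expected improvement, c = E[max(X - r, 0)]. The sketch below computes that classical index for a discrete reward distribution by bisection. It is a baseline illustration only: the scope-dependent reservation values of the paper are not reproduced here, and the function names and the two-point reward distribution are assumptions.

```python
def expected_gain(values, probs, best_so_far):
    """Expected improvement over the current best offer from opening one more box."""
    return sum(p * max(x - best_so_far, 0.0) for x, p in zip(values, probs))

def reservation_value(values, probs, cost, tol=1e-10):
    """Solve cost = E[max(X - r, 0)] for r by bisection (requires cost > 0)."""
    lo, hi = min(values) - cost, max(values)  # gain(lo) >= cost and gain(hi) = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_gain(values, probs, mid) > cost:
            lo = mid  # still worth opening at this guarantee, so r is higher
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical stage: reward is 0 or 10 with equal probability.
values, probs = [0.0, 10.0], [0.5, 0.5]
r_cheap = reservation_value(values, probs, cost=1.0)
r_costly = reservation_value(values, probs, cost=2.0)
```

The function `expected_gain` is decreasing in the current best offer, which is the mechanism behind the guarantee effect named in the abstract: a higher guarantee leaves less to gain from the next stage and favors stopping earlier.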

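2606.10127 2026-06-10 econ.TH new submission

Data-Driven Automation

Maryam Farboodi, Andrew Koh, Anchi Xia

AI summary Builds a dynamic model of data-driven automation with heterogeneous, endogenously accumulated data and cross-task spillovers. Under full automation the share of tasks performed by labor decays as a power law in the long run, and the economy is generically inefficient.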

English abstract

We build a dynamic model of data-driven automation in which data (i) is heterogeneous and task-specific; (ii) accumulates endogenously as a byproduct of economic activity; and (iii) exhibits spillovers such that data generated by one task can augment the productivity of another. Along the transition path of automation, data plays a dual role in simultaneously augmenting the productivity of already-automated tasks and expanding the automation frontier. We derive tight conditions for the economy to be partially versus fully automated in the long-run. In the latter case, automation exhibits rich short-run dynamics that depend on the pattern of data spillovers but is always slow in the long-run: the share of tasks produced by labor decays asymptotically as a power law in time. We show that the economy is generically inefficient and analyze how a planner optimally tilts the direction of data accumulation. With endogenous capital accumulation, data-driven automation generates explosive growth but stagnant long-run wages.

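2606.11135 2026-06-10 eess.SP new submission

Pre-Fault Voltage Discrimination and Time-Domain Protection for Distribution Networks with Inverter-Based Resources

Junyuan Zhao, François Bouffard, Géza Joós

AI summary Because inverter-based resources make traditional overcurrent protection inadequate, the paper proposes a pre-fault voltage discrimination strategy combined with time-domain protection principles for fast, dependable fault detection.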

English abstract

The increasing proliferation of inverter-based resources (IBRs) in distribution networks is presenting a major challenge for phasor-based overcurrent protection. This challenge stems from IBRs' lack of short-circuit current sourcing capacity. As a result, traditional overcurrent protection functions (e.g., ANSI 51) are inadequate in such scenarios, and warrant alternative approaches. Time-domain protection, for example, shows promise in overcoming this challenge. In this paper we propose a pre-fault voltage discrimination (PVD) strategy whose role is to detect faults and discriminate normal switching and transformer inrush disturbances from actual faults. The use of PVD allows for the design of a simple, yet effective fault detection algorithm by using time-domain protection principles for distribution networks containing IBRs. The introduction of PVD provides for faster fault detection without reducing security and dependability. Offline simulation experiments and controller hardware-in-the-loop real-time simulation validate the effectiveness of the proposed algorithm against various fault and normal switching events.

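2606.10900 2026-06-10 eess.SP new submission

Personalized Deep Learning for Short-Term Forecasting of Impending Atrial Fibrillation from Continuous Wearable ECG Signals

Jangwon Suh, Soonil Kwon, Jungmin Ko, Yun Kwan Kim, Hee Seok Song, Eue-Keun Choi, Wonjong Rhee

AI summary To handle inter-patient variability in forecasting atrial fibrillation from wearable ECG, the paper personalizes a global model by fine-tuning on each patient's data, compares it with the global model across three cohorts, and identifies clinically relevant precursors such as elevated heart rate and RMSSD.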

Comments Code is available at https://github.com/SNU-DRL/Personalized-AF-Forecasting

English abstract

Background and Objective: Continuous wearable electrocardiogram (ECG) monitoring is increasingly used for ambulatory arrhythmia surveillance, yet forecasting impending atrial fibrillation (AF) is challenged by inter-patient ECG variability. This study investigated whether personalizing a global model via fine-tuning on an individual's ECG signals improves short-term forecasting of impending AF. Methods: A global model trained on the ICENTIA11K dataset was compared against personalized models fine-tuned across three cohorts: ICENTIA11K, IRIDIA-AF, and MobiCARE. Following preprocessing, models processed 60-second ECG segments for a five-minute forecast horizon. We evaluated the impact of adaptation data volume and analyzed ECG features, such as heart rate and RMSSD. Results: Personalized models significantly outperformed the global model, achieving AUROCs of 0.711 vs. 0.614 in ICENTIA11K and 0.686 vs. 0.585 in MobiCARE. Personalization benefits increased with the amount of patient-specific fine-tuning data. While the global model's accuracy rose as AF onset approached, personalized models in the two external cohorts exhibited distinct temporal dynamics, which may indicate the capture of patient-specific characteristics less dependent on proximity to the AF event. Pre-AF episodes showed elevated heart rates and RMSSD. Feature attributions highlighted clinically relevant precursors, including frequent premature atrial complexes (PACs) and short supraventricular tachycardias (SVTs). Conclusions: Adapting deep learning models with patient-specific wearable ECG data significantly enhances short-term forecasting of impending AF. This personalized framework supports timely preventive interventions and improved AF management in ambulatory monitoring environments.
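
The two ECG features named in the abstract have standard definitions that are easy to state in code: RMSSD is the root mean square of successive differences between consecutive RR intervals, and mean heart rate follows from the mean RR interval. The sketch below assumes RR intervals in milliseconds have already been extracted from the ECG. The function names and sample values are illustrative and do not come from the authors' repository.

```python
import math

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute from RR intervals in milliseconds."""
    return 60000.0 / (sum(rr_ms) / len(rr_ms))

# Made-up RR intervals for a short segment.
rr = [800.0, 810.0, 790.0, 800.0]
segment_rmssd = rmssd(rr)          # successive differences: 10, -20, 10
segment_hr = mean_heart_rate(rr)   # mean RR of 800 ms
```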

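2606.10869 2026-06-10 eess.SP new submission

Information Bottleneck Meets Quantization: Finite Rate Analysis and Optimal Designs

Francesco Binucci, Paolo Banelli

AI summary Analyzes theoretically how scalar and vector quantization of the Gaussian Information Bottleneck (GIB) latent representation affects its informativeness about the target, proposes task-oriented quantizer designs under a finite-rate constraint, validates them on MMSE regression problems, and extends the approach to non-Gaussian settings.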

Comments 16 pages, 9 figures

English abstract

The Information Bottleneck (IB) is a well established framework that looks for a latent compact representation of a data source, by trading rate and data-size representation, for information accuracy with respect to another target data. The Gaussian IB (GIB) is its simple closed form solution, when the target is jointly Gaussian with the source. Actually, in many practical problems the latent representation has to be stored or represented by a finite number of bits, while the optimal (G)IB solution has not. First, this manuscript theoretically analyzes the effect of scalar and vector quantization of the GIB latent representation, and its impact on the (dis)informativeness with respect to the target data. Then, task-oriented quantization designs are proposed by (jointly) reformulating the GIB optimization problem under a finite-rate constraint on the latent representation. Simulation results on MMSE regression problems confirm the effectiveness of the proposed quantization designs, which show significant gains with respect to more heuristic, or separate, quantization designs of the standard GIB latent representation. Finally, the paper extends the task-oriented philosophy to non-Gaussian settings, by properly modifying the cost function used in variational auto-encoders (VAEs) of IB-inspired vector quantizers.

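2606.10864 2026-06-10 eess.AS new submission

Phoneme-First Prediction for LLM-Based Speech Recognition

Jakob Poncelet, Hugo Van hamme

AI summary Integrates a phoneme prediction step into the LLM, which predicts phonemes before generating the transcript, improving recognition accuracy and explainability, particularly in low-resource settings.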

Comments Accepted at EUSIPCO 2026

English abstract

Recent research has explored integrating Large Language Models (LLMs) with speech encoders to create speech-augmented LLMs capable of contextualized speech recognition. The main challenge lies in aligning the semantic embeddings of LLMs with the acoustic representations of speech encoders. We propose a novel approach that teaches the LLM to first predict phonemes from the speech features before generating the final transcript. By integrating a phoneme prediction step directly into the LLM, the model develops a fine-grained knowledge of pronunciation, reducing acoustic confusion and improving transcription accuracy and explainability. Our method is cheap and simple, as phoneme targets can be automatically derived from existing transcripts. Through comprehensive experiments, we show that intermediate phoneme prediction can improve speech recognition, particularly in low-resource settings, and yields outputs that are acoustically more faithful to the speech.

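2606.10853 2026-06-10 eess.AS new submission

Speech Encoder Fusion for LLM-based Automatic Speech Recognition

Jakob Poncelet, Hugo Van hamme

AI summary Investigates fusing multiple pre-trained speech encoders to improve LLM-based ASR, exploring several fusion strategies and evaluating them in monolingual, multilingual, and diarized recognition settings.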

Comments Accepted at Interspeech 2026

English abstract

Speech-aware large language models (LLMs) can incorporate speech through pre-trained acoustic encoders that project speech features into the LLM embedding space. While the choice of the speech encoder critically influences performance, different encoders often exhibit complementary strengths, motivating their combination. In this work, we investigate whether fusing multiple pre-trained speech encoders can enhance speech-aware LLMs for automatic speech recognition (ASR). We explore several fusion strategies beyond simple feature concatenation, including learned combinations and Transformer-based fusion architectures, and evaluate them across mono- and multilingual ASR settings as well as diarized speech recognition. Our results indicate that carefully fusing multiple parallel speech encoders improves downstream performance in all scenarios with limited computational overhead.
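
The difference between simple feature concatenation and a learned combination can be shown with plain lists. The sketch below is a simplification under stated assumptions: the encoders are taken to be time-aligned with the same number of frames, the weighted variant additionally assumes equal feature dimension (in practice a projection layer would handle this), and the softmax-weighted sum stands in for whatever learned combination the paper uses. All names and numbers are hypothetical.

```python
import math

def concat_fusion(streams):
    """Concatenate the per-frame feature vectors of all encoders."""
    return [sum((list(frame) for frame in frames), []) for frames in zip(*streams)]

def weighted_fusion(streams, logits):
    """Softmax-weighted sum of encoder outputs; logits would be trained parameters."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    weights = [e / sum(exps) for e in exps]
    fused = []
    for frames in zip(*streams):  # one tuple of vectors per time step
        dim = len(frames[0])
        fused.append([sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)])
    return fused

# Two hypothetical encoders, two frames each, two features per frame.
enc_a = [[1.0, 2.0], [3.0, 4.0]]
enc_b = [[5.0, 6.0], [7.0, 8.0]]
concatenated = concat_fusion([enc_a, enc_b])
averaged = weighted_fusion([enc_a, enc_b], logits=[0.0, 0.0])
```

Concatenation grows the feature dimension with every added encoder, whereas the weighted sum keeps it fixed and adds one parameter per encoder. This is one way a fusion scheme can keep computational overhead limited.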

2606.10838 2026-06-10 eess.AS new submission

Towards Deep Contextual Reasoning from Broad Descriptions for ASR with Speech-LLM via Metadata-Driven Reasoning Chains

Jakob Poncelet, Hugo Van hamme

AI summary Proposes a training method in which a speech-LLM uses broad descriptions as weak semantic priors and applies chain-of-thought reasoning to correct its transcript, reducing errors on rare words and named entities.

Comments Accepted at Interspeech 2026

English abstract

Speech recognition often fails on rare, domain-specific terms and context-related named entities. Existing contextualization techniques typically bias decoding with keywords or phrase lists, which does not scale well or exploit deeper knowledge. We propose a training method that teaches a speech-LLM to use broad descriptions (e.g. from videos) as weak semantic priors to perform contextual reasoning grounded in the audio. We build 400 hours of reasoning-augmented speech data by pairing erroneous hypotheses with video metadata and LLM-generated reasoning explanations that justify context-driven corrections. We finetune the speech-LLM to perform chain-of-thought reasoning: generate an initial transcript, then reason over the context, and finally return a corrected transcript. On held-out YouTube-derived test sets, our approach reduces errors, with specific improvements on rare words and named entities, and lays groundwork for deeper contextual reasoning in speech recognition.

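2606.10758 2026-06-10 eess.AS new submission

Anchoring the Unknown: Open-Set Model Attribution via Proxy-Anchor Learning

Cristian-Teodor Neamtu, Serban Mihalache, Stefan Smeu, Dan Oneata, Horia Cucu, Dragos Burileanu

AI summary Proposes a metric learning framework based on the Proxy-Anchor loss over Wav2Vec2-BERT embeddings for TTS source attribution and detection of unseen systems. On MLAAD v9 (140 TTS systems) it reaches 99.76% accuracy on 110 in-distribution classes and an FPR@95 as low as 2.04% for OOD detection.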

Comments Accepted to the 34th European Signal Processing Conference (EUSIPCO 2026)

English abstract

The proliferation of text-to-speech (TTS) systems capable of generating realistic synthetic speech poses growing challenges for audio forensics. While binary deepfake detection has received considerable attention, source tracing (i.e., identifying which TTS system produced a given audio sample) remains underexplored, particularly in open-set scenarios where unknown systems may be encountered. We propose a metric learning framework based on the Proxy-Anchor loss function that operates on Wav2Vec2-BERT embeddings to learn a discriminative embedding space for TTS source attribution and out-of-distribution (OOD) detection of unseen systems. We evaluate it on the MLAAD v9 dataset spanning 140 TTS systems across 51 languages, and introduce an architecture merging strategy that groups TTS system versions into unified classes, reducing inter-class confusion. Our system achieves 99.76% accuracy on 110 in-distribution classes and a False Positive Rate (FPR@95) as low as 2.04% for OOD detection. Also, for a fair comparison against the current state of the art, we further evaluate it on the MLAAD v5 official dataset splits, improving the OOD accuracy by almost doubling it. These results demonstrate that Proxy-Anchor metric learning, combined with architecture-aware class design and post-hoc OOD scoring, provides an effective framework for forensic TTS source tracing in both closed-set and open-set settings.
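
The Proxy-Anchor loss that the framework builds on has a compact closed form: each class has a learnable proxy vector, embeddings of that class are pulled toward the proxy, and all other embeddings are pushed away. The sketch below follows the published form of the loss (Kim et al., 2020) in plain Python on a single batch. The values `alpha=32` and `delta=0.1` are the defaults from that original loss paper, not settings confirmed by this abstract, and the tiny two-dimensional vectors stand in for Wav2Vec2-BERT embeddings. All vectors are assumed nonzero.

```python
import math

def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def proxy_anchor_loss(embeddings, labels, proxies, alpha=32.0, delta=0.1):
    """Proxy-Anchor loss for one batch; proxies[c] is the proxy of class c."""
    pos_term = neg_term = 0.0
    proxies_with_positives = 0
    for c, proxy in enumerate(proxies):
        sims = [cosine(x, proxy) for x in embeddings]
        pos = [s for s, y in zip(sims, labels) if y == c]
        neg = [s for s, y in zip(sims, labels) if y != c]
        if pos:  # only proxies with at least one positive sample contribute here
            proxies_with_positives += 1
            pos_term += math.log1p(sum(math.exp(-alpha * (s - delta)) for s in pos))
        neg_term += math.log1p(sum(math.exp(alpha * (s + delta)) for s in neg))
    return pos_term / max(proxies_with_positives, 1) + neg_term / len(proxies)

# Two hypothetical TTS classes with one embedding each.
proxies = [[1.0, 0.0], [0.0, 1.0]]
batch = [[1.0, 0.1], [0.1, 1.0]]
loss_matched = proxy_anchor_loss(batch, [0, 1], proxies)
loss_swapped = proxy_anchor_loss(batch, [1, 0], proxies)
```

Embeddings that sit close to their own class proxy give a much smaller loss than the same embeddings with swapped labels, which is the property training exploits. In an open-set setting, similarity to the nearest proxy is one natural post-hoc score, although the abstract does not state which score the authors use.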