2605.18655 2026-05-19 stat.ME astro-ph.IM

Self-Supervised Conformal Prediction with Equivariant Bootstrapping for Image Uncertainty Quantification

Henry J. Aldridge, Tobías I. Liaudat, Marcelo Pereyra, Jason D. McEwen

AI summary: This paper proposes a self-supervised conformal prediction method based on equivariant bootstrapping for image uncertainty quantification. It exploits data symmetries to generate heuristic coverages, refines them with a conformal prediction calibration step, avoids reliance on ground-truth data, and demonstrates its effectiveness on weak lensing mass-mapping.

Comments 9 pages, 2 figures; submitted conference proceedings for MaxEnt 2025

Abstract

Inverse problems are ubiquitous in modern scientific studies and involve recovering an underlying signal from noisy observations often transformed by a measurement operator. These problems are frequently ill-posed, particularly in imaging, leading to multiple plausible solutions and considerable uncertainty in reconstructed images. In fields like the physical and biological sciences, accurate uncertainty quantification (UQ) is critical for trustworthy scientific analyses and confident diagnoses. Current UQ methods for imaging often fall short; they can be inaccurate, or require unavailable or difficult-to-acquire ground truth data for calibration, which can introduce hidden biases due to distribution shifts between calibration and observed data. We introduce a UQ approach that leverages equivariant bootstrapping to generate heuristic coverages by exploiting data symmetries. We then refine these coverages through a conformal prediction calibration step, while crucially employing a self-supervised approach to avoid the need for ground truth calibration data. We demonstrate this method with weak lensing mass-mapping, where we aim to reconstruct the convergence field from shear measurements of distant galaxies weakly-lensed by gravitational fields. Mass-mapping in particular benefits from the self-supervised approach, as simulating calibration data is expensive and relies on specific cosmological models that could introduce biases in downstream cosmological inference tasks.
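
The conformal calibration step mentioned in the abstract can be illustrated with generic split conformal prediction. The sketch below is not the authors' self-supervised procedure: it assumes calibration scores (here, absolute residuals against a reference) are already available, which is exactly the requirement the paper aims to remove. The function name `conformal_quantile` and all numbers are hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """Return the split-conformal threshold for miscoverage level alpha.

    Uses the standard finite-sample rank ceil((n + 1) * (1 - alpha)).
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[rank - 1]

# Hypothetical calibration scores, e.g. |reconstruction - reference| for 7 pixels.
calibration_scores = [0.12, 0.40, 0.05, 0.31, 0.22, 0.18, 0.27]
q_hat = conformal_quantile(calibration_scores, alpha=0.25)

# Symmetric prediction interval around a new (hypothetical) point estimate.
estimate = 1.50
interval = (estimate - q_hat, estimate + q_hat)
```

With exchangeable calibration and test scores, an interval built this way covers the truth with probability at least 1 - alpha; how tight it is depends on the quality of the heuristic scores.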

2605.18633 2026-05-19 stat.ME stat.ML

Stable Causal Discovery via Directed Acyclic Graph Aggregation

Yunan Wu, Yue Wang, Chunlin Li, Chenglong Ye

AI summary: This paper proposes DAGgr, a model averaging framework that aggregates multiple candidate DAGs into a stable representation. Candidate graphs are weighted by out-of-sample predictive likelihood, a thresholding rule on edge-importance scores guarantees that the aggregated graph is acyclic, and the approach is validated through theoretical analysis and experiments.

Abstract

Directed Acyclic Graphs (DAGs) are central to uncovering causal structure in complex systems, yet learning a single DAG from data is often challenging: model uncertainty, finite samples, and a combinatorially large search space frequently yield unstable estimates. We propose DAGgr, a model averaging framework that aggregates multiple candidate DAGs into a single stable representation. Candidate graphs are weighted by their out-of-sample predictive likelihood across repeated data splits, and a thresholding rule on the resulting edge-importance scores guarantees that the aggregated graph is itself acyclic. We establish a finite-sample risk bound, prove that the procedure preserves acyclicity, and show that edge selection is consistent under mild conditions on the weights. Simulations across random, hub, and chain structures, together with an analysis of the Sachs et al. (2005) protein-signaling network, show that DAGgr matches or exceeds the best individual candidate while consistently outperforming bootstrap-aggregation baselines across structural recovery metrics.
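
To make the idea of thresholded edge-importance scores concrete, here is a minimal sketch. It is an assumption-laden illustration, not the DAGgr rule: the weights are made-up numbers rather than out-of-sample predictive likelihoods, and the threshold (p - 1) / p for p nodes is a conservative choice made here. That choice does guarantee acyclicity when the weights sum to one: a cycle of length k would need all k of its edges to score above (k - 1) / k, yet each acyclic candidate contains at most k - 1 of them, so the scores along the cycle sum to at most k - 1.

```python
def edge_scores(dags, weights):
    """Edge-importance score: total weight of the candidate DAGs containing each edge."""
    scores = {}
    for edges, weight in zip(dags, weights):
        for edge in edges:
            scores[edge] = scores.get(edge, 0.0) + weight
    return scores

def aggregate(dags, weights, num_nodes):
    """Keep edges whose score strictly exceeds (p - 1) / p for p nodes (assumed rule)."""
    threshold = (num_nodes - 1) / num_nodes
    return {e for e, s in edge_scores(dags, weights).items() if s > threshold}

def is_acyclic(edges, num_nodes):
    """Kahn's algorithm: a graph is acyclic iff every node can be peeled off."""
    indegree = [0] * num_nodes
    children = [[] for _ in range(num_nodes)]
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1
    stack = [v for v in range(num_nodes) if indegree[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                stack.append(v)
    return removed == num_nodes

# Three hypothetical candidate DAGs on nodes 0, 1, 2; their union contains a cycle.
candidates = [
    {(0, 1), (2, 0)},
    {(0, 1), (1, 2)},
    {(1, 2), (2, 0)},
]
weights = [0.5, 0.3, 0.2]  # hypothetical weights, normalised to sum to one
aggregated = aggregate(candidates, weights, num_nodes=3)
```

Here a majority threshold of 0.5 would not be enough in general: with equal weights of 1/3, every edge of the 3-cycle would score 2/3 and the aggregate would be cyclic.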

2605.18619 2026-05-19 stat.ME stat.CO

Random spanning tree Markov random field priors for Bayesian inverse problems in imaging

Jasper Marijn Everink

AI summary: This paper proposes a Markov random field prior based on random spanning trees for Bayesian inverse problems in imaging. The prior couples continuous and discrete random variables and is studied on image restoration tasks such as denoising, deblurring, and inpainting.

Abstract

Markov random fields are common prior distributions used in Bayesian inverse imaging problems. In particular, difference priors assign probability distributions to differences between neighbouring pixels, such as Gaussian, Laplace, or Cauchy distributions. Depending on the chosen difference distribution, these priors have smoothing or edge-preserving properties. In this work, we propose a hyperprior on the connectivity graph of the pixel grid in the form of a random spanning tree, i.e., a random connected graph with the minimal number of edges, thereby coupling continuous and discrete random variables in the prior. By using random spanning trees, only a sparse random subset of edges is regularized, which helps preserve edges in the image with reduced contrast loss compared to standard difference-based Markov random fields. We discuss how fractal-like interfaces arise in high-resolution prior samples due to the random-tree connectivity. Finally, we propose a Gibbs sampler that alternates between the discrete tree updates and continuous pixel updates to efficiently explore the posterior distribution. We apply the method to various standard test image restoration problems, including denoising, deblurring, and inpainting, to study the impact of the proposed prior in comparison with existing Markov random fields.
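
A random spanning tree of the pixel grid can be drawn with the classical Aldous-Broder random walk, which samples uniformly over spanning trees. The sketch below assumes a 4-connected grid and a uniform tree distribution; the abstract does not say which tree distribution or sampler the paper uses, so both are assumptions, and the function names are hypothetical.

```python
import random

def grid_neighbours(node, height, width):
    """4-connected neighbours of pixel (row, col) on a height x width grid."""
    r, c = node
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < height and 0 <= j < width]

def random_spanning_tree(height, width, rng):
    """Aldous-Broder: walk at random, keep the edge used to first enter each pixel."""
    current = (rng.randrange(height), rng.randrange(width))
    visited = {current}
    tree = []
    while len(visited) < height * width:
        nxt = rng.choice(grid_neighbours(current, height, width))
        if nxt not in visited:
            visited.add(nxt)
            tree.append((current, nxt))
        current = nxt
    return tree

# A tree on a 4 x 5 grid has exactly 4 * 5 - 1 = 19 edges.
tree_edges = random_spanning_tree(4, 5, random.Random(0))
```

In a difference prior restricted to such a tree, only the pixel pairs in `tree_edges` would be penalised, which is the sparse regularisation the abstract describes.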

2605.18598 2026-05-19 cs.LG cond-mat.stat-mech math.FA math.PR math.ST stat.TH

Abstract

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we build an explicit Riesz basis for the decomposition, which makes the decomposition easy to compute. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple but effective algorithm to estimate the decomposition from a data sample in a model-agnostic setting, and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.
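
For the classical independent case that the abstract calls explicit, the decomposition can be written down directly. The sketch below treats two independent inputs that are uniform on small discrete grids, an assumption made only to keep the expectations exact; the dependent, continuous case addressed by the paper is not covered. The toy model and function names are hypothetical.

```python
from itertools import product

def anova_decomposition(f, levels_1, levels_2):
    """Hoeffding decomposition of f(x1, x2) for independent, uniform discrete inputs."""
    grid = list(product(levels_1, levels_2))
    # Constant term: the overall mean of f.
    f0 = sum(f(a, b) for a, b in grid) / len(grid)
    # Main effects: conditional means minus the constant term.
    f1 = {a: sum(f(a, b) for b in levels_2) / len(levels_2) - f0 for a in levels_1}
    f2 = {b: sum(f(a, b) for a in levels_1) / len(levels_1) - f0 for b in levels_2}
    # Interaction: whatever the constant and main effects do not explain.
    f12 = {(a, b): f(a, b) - f0 - f1[a] - f2[b] for a, b in grid}
    return f0, f1, f2, f12

def model(x1, x2):
    """Hypothetical model with two main effects and one interaction."""
    return 2.0 * x1 + x2 + x1 * x2

f0, f1, f2, f12 = anova_decomposition(model, [-1.0, 0.0, 1.0], [-1.0, 1.0])
```

Because both inputs are centred here, the components come out as f1(x1) = 2 x1, f2(x2) = x2, and f12(x1, x2) = x1 x2, and each component averages to zero over its own inputs.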

2605.18406 2026-05-19 math.NA cs.NA stat.ML

Computational aspects of the Volterra Signature

Paul P. Hager, Fabian N. Harang, Luca Pelizzari, Samy Tindel

AI summary: This paper studies the computation of the Volterra signature. It decomposes the Chen-type convolution relation and introduces several efficient algorithms, including an approximation scheme with quadratic complexity O(J²), an FFT-based acceleration, and an exact recursion, to address the algorithmic challenges posed by the kernel.

Abstract

The Volterra signature extends the classical path signature by incorporating a general matrix-valued kernel into its iterated integral structure, yielding a flexible notion of memory for time series. Its components can be viewed as successive Picard iterates of linear controlled Volterra equations, making their exact computation of additional mathematical interest. However, the kernel introduces substantial algorithmic challenges. We provide a resolution by first decomposing the Chen-type convolution relation established in [arXiv:2603.04525] into analytic and arithmetic parts, and then introducing several efficient algorithms: a general approximation scheme with quadratic complexity $O(J^2)$ in the number of time steps $J$, an FFT-based acceleration with complexity $O(J\log J)$ for convolution kernels on uniform grids, and an exact recursion with complexity $O(JR^2)$ for kernels admitting a state-space representation of dimension $R$. All three retain the standard signature complexity in the path dimension and truncation level $N$. We further show that the number of factors in matrix-valued kernels of the form $K(t,s)=\sum_p k_p(t-s)A_p$ does not increase the asymptotic complexity in $J$ and $N$. Finally, we derive a finite-difference predictor--corrector scheme for the associated Volterra signature kernel. All algorithms are implemented in the publicly available JAX-based package "tensordev".
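
The gap between the $O(J^2)$ and $O(J\log J)$ costs quoted above comes from a generic fact about convolution kernels on uniform grids, which the sketch below illustrates for a scalar causal convolution. It is not the Volterra signature algorithm and does not use the package named in the abstract; the kernel values, increments, and function names are all hypothetical.

```python
import cmath

def direct_convolution(kernel, increments):
    """y[j] = sum over i <= j of kernel[j - i] * increments[i], costing O(J^2)."""
    J = len(increments)
    return [sum(kernel[j - i] * increments[i] for i in range(j + 1)) for j in range(J)]

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def fft_convolution(kernel, increments):
    """Same causal convolution via zero-padded FFTs, costing O(J log J)."""
    J = len(increments)
    size = 1
    while size < 2 * J:  # pad so the circular convolution does not wrap around
        size *= 2
    a = fft(list(kernel) + [0.0] * (size - J))
    b = fft(list(increments) + [0.0] * (size - J))
    spectrum = [x * y for x, y in zip(a, b)]
    return [(value / size).real for value in fft(spectrum, inverse=True)[:J]]

# Hypothetical decaying kernel k(t_j) and path increments on a uniform grid of J = 6 steps.
kernel = [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
increments = [0.3, -0.1, 0.4, 0.2, -0.5, 0.1]
slow = direct_convolution(kernel, increments)
fast = fft_convolution(kernel, increments)
```

Both routines return the same values up to floating-point error; only the operation count differs.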

2605.18358 2026-05-19 math.ST stat.TH

Multi-state model with temporal-consistent survival analysis for homogeneous Markov chains

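Mikael Escobar-Bach, Alexandre Popier, Malo Sahin

AI summary: This paper proposes a new method based on temporal-consistent survival analysis for estimating the distribution of the first hitting time of a specified terminal state in homogeneous Markov chains. It also discusses the problem of cured individuals, proposes a cure-rate estimator, and gives non-asymptotic theoretical guarantees.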

2605.18339 2026-05-19 stat.ME math.ST stat.TH

Compositional Periodic Spline Approximation for Circular Density Data in Bayes Spaces

Jitka Machalová, Jana Heckenbergerová, Karel Hron

AI summary: This paper proposes a new framework for approximating and analysing circular density data with compositional periodic splines, using the Hilbert space structure of Bayes spaces. The centered log-ratio transformation represents densities in a subspace of the standard L² space, so functional data analysis tools can be applied while preserving the relative nature and periodic structure of the distributions.

Abstract

This paper proposes a novel framework for the approximation and analysis of circular density data using compositional periodic splines within Bayes spaces with the Hilbert space structure. By applying the centered log-ratio transformation, densities are represented in a subspace of the standard $L^2$ space of real-valued functions, which enables the use of functional data analysis tools while preserving the relative nature of distributions and their periodic structure. A coefficient-based construction of periodic splines with a zero-integral constraint is developed, together with matrix formulations for both smoothing splines and penalized splines, allowing efficient estimation and implementation. The methodology is applied to long-term wind direction data, where it provides smooth and interpretable density estimates and supports further statistical analysis, including functional regression. The results demonstrate the practical relevance of the proposed approach and its potential for extensions to more complex density-valued data.
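
The centered log-ratio transformation and its zero-integral property can be checked on a discretised circular density. The sketch below works on a uniform grid of angles, where the zero-integral constraint becomes a zero-sum constraint; the grid size and the von Mises-like toy density are assumptions, and no splines are fitted.

```python
import math

def clr(density_values):
    """Discrete centered log-ratio: log-density minus its mean over the grid."""
    logs = [math.log(v) for v in density_values]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def clr_inverse(clr_values, cell_width):
    """Map back to a density: exponentiate, then renormalise to integrate to one."""
    raw = [math.exp(v) for v in clr_values]
    total = sum(raw) * cell_width
    return [v / total for v in raw]

# Hypothetical strictly positive density on 8 equally spaced directions.
num_cells = 8
cell_width = 2 * math.pi / num_cells
angles = [k * cell_width for k in range(num_cells)]
raw = [math.exp(math.cos(a)) for a in angles]
density = [v / (sum(raw) * cell_width) for v in raw]

transformed = clr(density)                       # sums to zero
recovered = clr_inverse(transformed, cell_width)  # equals `density` again
```

The transformed values live in an ordinary vector space, so standard smoothing or regression can be applied to them before mapping back with `clr_inverse`.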

2605.18338 2026-05-19 stat.AP cs.LG

Robust Player-Conditional Champion Ranking for League of Legends: Style Similarity, Mastery Priors, and Archetype-Constrained Discovery

Min Heo, Pranav Kadiyam, Prasun Panthi

AI summary: This paper proposes a robust player-conditional champion ranking method that combines style similarity, mastery priors, and archetype constraints to address champion recommendation in League of Legends.

Comments 11 pages, 3 figures

Abstract

Champion recommendation in multiplayer online battle arena games is usually framed informally as a problem of metagame strength, personal comfort, or global win rate. We formalize champion recommendation in League of Legends as an interpretable, player-conditional ranking problem under sparse, noisy, and non-stationary behavioral data. The proposed framework combines four information sources: a population-strength proxy, player-style similarity, direct and indirect mastery priors, and archetype-level guardrails. The method uses robust median/MAD normalization, logarithmic transforms for skewed event counts, recency-weighted player style vectors, mastery-weighted champion-pool vectors, weighted cosine similarity, rank-scaled score components, and k-means++ clustering for coarse archetype support. The implemented prototype uses a Python/Pandas modeling layer, Supabase-backed storage, and a web-facing recommendation interface. Unlike black-box supervised win-prediction systems, the proposed method returns decomposed recommendation scores that can be inspected as expected-performance proxy, fit, mastery, and archetype compatibility. A single-player case study on a 100-game history for the player identifier DIVINERAINRACCON is included as an end-to-end sanity check. The manuscript is therefore a methods and systems contribution: it specifies a reproducible, modular, and auditable champion recommender and gives a validation protocol for future large-scale evaluation through temporal train-test splits, next-champion recovery, calibration analysis, and ablation studies.
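
Two of the building blocks listed in the abstract, median/MAD normalisation after a log transform and weighted cosine similarity, can be sketched with the standard library alone. This is an illustration under assumptions, not the authors' pipeline: the feature values, the feature weights, and the choice to leave the MAD unscaled are all hypothetical.

```python
import math
import statistics

def robust_standardise(values):
    """Centre by the median and scale by the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0] * len(values)  # no spread to scale by
    return [(v - med) / mad for v in values]

def weighted_cosine(u, v, weights):
    """Cosine similarity under a diagonal, non-negative feature weighting."""
    dot = sum(w * a * b for w, a, b in zip(weights, u, v))
    norm_u = math.sqrt(sum(w * a * a for w, a in zip(weights, u)))
    norm_v = math.sqrt(sum(w * b * b for w, b in zip(weights, v)))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical skewed event counts: log1p tames the outlier before standardising.
kills_per_game = [2, 3, 3, 4, 30]
z_scores = robust_standardise([math.log1p(k) for k in kills_per_game])

# Hypothetical player-style and champion-profile vectors with feature weights.
player_style = [0.8, 0.1, 0.5]
champion_profile = [0.7, 0.2, 0.4]
feature_weights = [2.0, 1.0, 1.0]
fit_score = weighted_cosine(player_style, champion_profile, feature_weights)
```

Unlike mean and standard deviation, the median and MAD barely move when a single 30-kill game enters the history, which is the robustness the abstract appeals to for sparse and noisy data.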

2605.18315 2026-05-19 math.OC stat.ML

Attention-based PCA

Rodrigo Maulen-Soto, Claire Boyer

AI summary: This paper studies attention mechanisms on the unsupervised problem of PCA. It proves that, when trained on Gaussian data, softmax and linear attention layers learn parameters aligned with the principal eigenvectors of the covariance matrix, establishing a direct connection with PCA, and extends the analysis to an in-context setting.

Abstract

We study attention mechanisms through the lens of a canonical unsupervised problem: principal component analysis (PCA). We show that, when trained on Gaussian data, both softmax and linear attention layers learn parameters that align with the principal eigenvectors of the covariance matrix, thereby establishing a direct and explicit connection with PCA. Our analysis covers both finite and infinite prompt regimes. In the infinite-prompt limit, we prove convergence to globally optimal solutions aligned with the leading spectral direction, while in the finite-prompt setting we show that the same behavior emerges up to sampling effects. We further extend the analysis to an in-context setting with spiked Wishart covariances, where attention successfully recovers the underlying signal direction. These results demonstrate that attention inherently performs PCA-like computations under unsupervised objectives, providing a theoretical foundation for its representation-learning capabilities.
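
The object that the attention parameters are said to align with, the leading eigenvector of the covariance matrix, can be computed by power iteration. The sketch below is classical PCA on synthetic spiked Gaussian data, not an attention layer; the signal direction, variances, and sample size are assumptions.

```python
import math
import random

def covariance(samples):
    """Unbiased sample covariance matrix of a list of equal-length rows."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]

def leading_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly apply the matrix and renormalise."""
    d = len(matrix)
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        new = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec

# Spiked Gaussian data: strong signal along `direction` plus isotropic noise.
rng = random.Random(0)
direction = (0.6, 0.8)  # hypothetical unit-norm signal direction
samples = []
for _ in range(2000):
    signal = rng.gauss(0.0, 3.0)
    samples.append([signal * d + rng.gauss(0.0, 1.0) for d in direction])

principal = leading_eigenvector(covariance(samples))
alignment = abs(sum(p * d for p, d in zip(principal, direction)))
```

An `alignment` close to one means the estimated principal direction matches the planted signal up to sign, the same notion of recovery the abstract uses for the spiked setting.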

2605.18276 2026-05-19 stat.ML cs.LG

Geometric Dictionary Learning of Dynamical Systems with Optimal Transport

Thibaut Germain, Sami Chemlal, Rémi Flamary, Vladimir R. Kostic, Karim Lounici

AI summary: This paper proposes DOODL, a framework that uses geometric dictionary learning to learn a low-dimensional manifold in spectral operator space, enabling efficient representation of complex dynamical systems and interpretable operator estimation.

Abstract

Learning dynamical systems through operator-theoretic representations provides a powerful framework for analyzing complex dynamics, as spectral quantities such as eigenvalues and invariant structures encode characteristic time scales and long-term behavior. However, dynamical operators are typically estimated independently for each system, preventing the discovery of shared structure across related dynamics. To address this limitation, we posit that related dynamical systems lie near a low-dimensional manifold in spectral operator space. Based on this hypothesis, we introduce DOODL (Dynamical OperatOr Dictionary Learning), a framework that learns a dictionary of characteristic spectral dynamics whose combinations approximate this manifold and yield compact, interpretable embeddings of individual systems. Beyond representation learning, DOODL enables fast and interpretable operator estimation from short and partially observed trajectories by constraining the estimation to the learned operator manifold. Experiments on metastable Langevin dynamics and turbulent plasma simulations demonstrate that DOODL scales to highly complex multiscale regimes while capturing characteristic spectral structure governing the dynamics rather than merely fitting trajectories, achieving errors one to two orders of magnitude lower than independent operator estimation methods in challenging low-data regimes.

2605.18206 2026-05-19 stat.ME

A tool to determine the degrees of freedom in tree-structured varying coefficient models

Nikolai Spuck, Moritz Berger

AI summary: This paper proposes a formula for determining the degrees of freedom of tree-structured varying coefficient models, uses it for model selection with the Bayesian information criterion, and shows in a simulation study that it gives more accurate selection and better predictive ability than the naive approach.

Abstract

The tree-structured varying coefficient (TSVC) model is a flexible approach for generalized regression, where the linear effects of the covariates are allowed to vary with the values of effect modifiers. Relevant effect modifiers and interactions are identified using recursive partitioning. In TSVC models, analogously to other semi- and nonparametric regression approaches, one needs to account for the cost of data-driven model building when deriving the model degrees of freedom (DoF). To address this issue, we develop an easy-to-apply formula to approximate the DoF of a TSVC model. This formula is employed for model selection based on the Bayesian information criterion (BIC) and compared to the naive solution, setting the DoF to the number of free model parameters, in a simulation study. To illustrate the proposed DoF method, TSVC models using BIC-based selection were fitted to data from the Survey of Health, Ageing, and Retirement in Europe. Results indicated that calculation of the DoF using the proposed formula resulted in more accurate selection results with improved predictive ability.
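
Why the degrees of freedom matter for BIC-based selection can be seen from the criterion itself, BIC = -2 log L + DoF · log n. The sketch below compares a naive parameter count with a larger, adjusted count; the log-likelihoods and both sets of DoF values are invented for illustration and do not come from the paper's formula.

```python
import math

def bic(log_likelihood, degrees_of_freedom, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_likelihood + degrees_of_freedom * math.log(n_obs)

n_obs = 500
# Hypothetical fits: (log-likelihood, naive DoF, adjusted DoF).
# The adjusted DoF are larger because they also charge for the split search.
fits = {
    "small tree": (-520.0, 4, 6.0),
    "large tree": (-512.0, 6, 10.0),
}

naive = {name: bic(ll, dof, n_obs) for name, (ll, dof, _) in fits.items()}
adjusted = {name: bic(ll, dof, n_obs) for name, (ll, _, dof) in fits.items()}

best_naive = min(naive, key=naive.get)
best_adjusted = min(adjusted, key=adjusted.get)
```

With these numbers the naive count selects the large tree, while the adjusted count selects the small one: undercounting the cost of data-driven splitting favours overly complex models.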

2605.18204 2026-05-19 stat.ML cs.LG

Forward-Learned Discrete Diffusion: Learning how to noise to denoise faster

Grigory Bartosh, Teodora Pandeva, Sushrut Karmalkar, Javier Zazo

AI summary: This paper proposes Forward-Learned Discrete Diffusion (FLDD), which introduces a learnable forward (noising) process to narrow the gap between the target and model distributions and enable few-step generation. The method adopts a non-Markovian formulation with learnable marginal and posterior distributions, so the generative process stays factorized while matching the target defined by the noising process. Experiments show that, for the same number of sampling steps, FLDD produces higher-quality samples than conventional discrete diffusion models.

Abstract

Discrete diffusion models are a powerful class of generative models with strong performance across many domains. For efficiency, however, discrete diffusion typically parameterizes the generative (reverse) process with factorized distributions, which makes it difficult for the model to learn the target process in a small number of steps and necessitates a long, computationally expensive sampling procedure. To reduce the gap between the target and model distributions and enable few-step generation, we propose Forward-Learned Discrete Diffusion (FLDD), which introduces discrete diffusion with a learnable forward (noising) process. Rather than fixing a Markovian forward chain, we adopt a non-Markovian formulation with learnable marginal and posterior distributions. This allows the generative process to remain factorized while matching the target defined by the noising process. We train all parameters end-to-end under the standard variational objective. Experiments on various benchmarks show that, for a given number of sampling steps, our approach produces higher-quality samples than conventional discrete diffusion models using the same reverse parameterization.

2605.18180 2026-05-19 stat.ML cs.LG

Canonical Regularisation of Wide Feature-Learning Neural Networks

George Whittle, Pranav Vaidhyanathan, Juliusz Ziomek, Natalia Ares, Maike A. Osborne

AI summary: This paper studies the regularisation implied by gradient flow training in wide feature-learning neural networks. It shows that ridge regularisation, well studied in the kernel regime, distorts the inductive bias in the feature-learning regime, and proposes arc ridge as a scalable surrogate that extends ridge regularisation to the feature-learning regime.

Abstract

Wide neural networks in the feature-learning regime drive modern deep learning, and yet they remain far less studied than their kernel-regime counterparts. We consider a critical yet under-explored difference between these two regimes: the regulariser and prior implied by gradient flow training. This canonical regularisation property is well-studied in kernel regime networks -- of all the infinite global minima, gradient flow selects exactly the vanishing ridge solution -- and underpins the celebrated NN-GP correspondence, precisely allowing the modelling of noise during training. However, we prove ridge regularisation biases gradient flow in feature-learning regime networks, even in the infinitesimal limit of vanishing regularisation. Over training, ridge distorts the inductive bias of the network, with a particular damage done to pretrained networks where the implicit prior is informative. We resolve this by axiomatising the canonical regulariser as a regime-agnostic function-space energy and lift, which uniquely identifies ridge in the kernel regime, and crucially generalises to the feature-learning regime. By studying the Riemannian geometry of feature-learning networks, we derive geodesic ridge from our framework, generalising ridge to the feature-learning regime. Correspondingly, we prove the canonical function-space prior is a Riemannian Gibbs Process, generalising the more familiar Gaussian Process. As a practical contribution, we propose arc ridge as a minimax-robust, scalable surrogate to geodesic ridge, revealing a deep relationship between early stopping and canonical regularisation across learning regimes. Finally, we demonstrate the consequences of our theory empirically on both image processing and NLP transfer-learning problems.

2605.18174 2026-05-19 cs.LG cs.DC math.OC stat.ML

Ringmaster LMO: Asynchronous Linear Minimization Oracle Momentum Method

Abdurakhmon Sadiev, Artavazd Maranjyan, Ivan Ilin, Peter Richtárik

AI summary: This paper proposes Ringmaster LMO, an asynchronous Linear Minimization Oracle momentum method for unconstrained stochastic nonconvex optimization. It uses a delay-thresholding mechanism to improve on conventional synchronous methods in heterogeneous distributed systems, and experiments show that its advantage grows with system heterogeneity.

Abstract

Muon has recently emerged as a strong alternative to AdamW for training neural networks, with encouraging large-scale pretraining results and growing evidence that matrix-structured updates can be faster in practice. Yet Muon, and more generally Linear Minimization Oracle (LMO) based methods, are typically used synchronously. This is problematic in heterogeneous distributed systems, where workers complete gradient computations at different speeds and synchronous training must repeatedly wait for slower workers. In this work, we introduce Ringmaster LMO, an asynchronous LMO-based momentum method for unconstrained stochastic nonconvex optimization. Our method builds on the delay-thresholding idea of Ringmaster ASGD. For SGD-type methods, Ringmaster ASGD achieves optimal time complexity by discarding overly stale gradients. Ringmaster LMO extends this mechanism to general LMO-based updates. We establish convergence guarantees under generalized $(L_0, L_1)$-smoothness and further develop a parameter-agnostic variant with decreasing stepsizes and adaptive delay thresholds. Finally, we translate our iteration guarantees into time complexity bounds under heterogeneous worker computation times. In the classical Euclidean smooth setting, these bounds recover the optimal time complexity of Ringmaster ASGD. Experiments on stochastic quadratic problems and NanoChat language-model pretraining show that the advantages of Ringmaster LMO grow with system heterogeneity and that the method outperforms strong synchronous and asynchronous baselines.
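
The two ingredients named in the abstract, an LMO-based step and the discarding of overly stale gradients, can be combined in a toy server update. This is a rough sketch under heavy assumptions and not the Ringmaster LMO algorithm: the LMO is taken over the l-infinity ball (where it reduces to a sign step), the momentum form is a plain exponential average, and the names `apply_update` and `max_delay` are hypothetical.

```python
def lmo_linf(gradient, radius):
    """LMO over the l-infinity ball: argmin of <gradient, s> is -radius * sign(gradient)."""
    return [-radius * ((g > 0) - (g < 0)) for g in gradient]

def apply_update(params, momentum, gradient, delay, max_delay, beta, step):
    """One server step for a gradient that was computed `delay` iterations ago.

    Gradients older than `max_delay` are discarded (delay thresholding);
    otherwise the momentum is refreshed and an LMO step is taken.
    Returns (params, momentum, applied).
    """
    if delay > max_delay:
        return params, momentum, False
    momentum = [beta * m + (1 - beta) * g for m, g in zip(momentum, gradient)]
    direction = lmo_linf(momentum, 1.0)
    params = [p + step * d for p, d in zip(params, direction)]
    return params, momentum, True

# A stale gradient from a slow worker is ignored ...
stale = apply_update([0.0, 0.0], [0.0, 0.0], [1.0, -2.0],
                     delay=5, max_delay=3, beta=0.9, step=0.1)
# ... while a fresh one moves each coordinate against the momentum's sign.
fresh = apply_update([0.0, 0.0], [0.0, 0.0], [1.0, -2.0],
                     delay=1, max_delay=3, beta=0.9, step=0.1)
```

The server never waits for slow workers; it only decides, per arriving gradient, whether the information is recent enough to use.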

2605.18167 2026-05-19 stat.ME

1-truncated C-vine copula mixed models for network meta-analysis of multiple diagnostic tests

Aristidis K. Nikoloulopoulos

AI summary: This paper proposes flexible and powerful 1-truncated C-vine copula mixed models for network meta-analysis of multiple diagnostic tests, improving the analysis of comparative accuracy across multiple diagnostic tests.

Abstract

As meta-analysis of multiple diagnostic tests impacts clinical decision making and patient health, there is growing interest in statistical models that synthesize evidence from studies comparing multiple diagnostic tests. To compare the accuracy of multiple diagnostic tests in a single study, three designs are commonly used: (i) the multiple test comparison design; (ii) the randomized design; and (iii) the non-comparative design. Generalized linear mixed models (GLMMs) are currently the recommended approach for jointly meta-analyzing data from all three designs, enabling simultaneous inference. In this context, 1-truncated C-vine copula mixed models are proposed as a flexible and powerful alternative. These models generalize the GLMM framework by allowing for arbitrary univariate distributions of the random effects and capturing tail dependencies and asymmetries. We demonstrate the utility of our methods with an extensive simulation study and by re-analysing a case study on the network meta-analysis of diagnostic tests for deep vein thrombosis. Findings indicate that 1-truncated C-vine copula mixed models can offer improvements over GLMMs, supporting their adoption for network meta-analysis of multiple diagnostic tests.

2605.14565 2026-05-19 stat.ME math.ST stat.AP stat.TH

A Bayesian Longitudinal Spatial Normative Model for Individualized Brain Deviation Mapping

J. T. Korley

AI summary: This paper proposes a Bayesian longitudinal spatial normative model that jointly captures within-subject temporal dependence and spatially structured subject-specific deviations in a unified hierarchical framework. It reduces deviation-map reconstruction error across multiple simulation scenarios and substantially lowers RMSE in an application to OASIS-3 structural MRI data.

Abstract

Normative modeling enables individualized characterization of structural brain deviations by evaluating subjects against a reference population rather than a group average. Most existing implementations treat brain regions independently and remain cross-sectional, despite the availability of repeated neuroimaging measurements and the well-documented spatial organization of neuroanatomical variation. We propose a Bayesian longitudinal spatial normative model that jointly captures within-subject temporal dependence and spatially structured subject-specific deviations within a unified hierarchical framework. The individualized deviation map is treated as a latent spatial process with an explicit posterior distribution, yielding a principled Bayes estimator under squared error loss rather than an ad hoc residual summary. Across six simulation scenarios encompassing varying spatial dependence, nonlinear trajectories, irregular visit schedules, and missing follow-up, the proposed model consistently reduced deviation-map reconstruction error relative to independent cross-sectional and longitudinal non-spatial benchmarks while maintaining stable calibration. In an application to OASIS-3 structural MRI data, the model reduced RMSE by 54% relative to the independent cross-sectional model and by 45% relative to the longitudinal non-spatial model. Regional deviation burden was concentrated in the temporal pole, entorhinal cortex, inferior temporal cortex, posterior cingulate, and parahippocampal cortex, consistent with regions implicated in early Alzheimer-type neurodegeneration. Subject-level profiles revealed substantial heterogeneity in regional abnormality patterns, including marked multiregional deviation with preserved global cognitive scores.

2605.11617 2026-05-19 cs.LG math.ST stat.TH

MIST: Reliable Streaming Decision Trees for Online Class-Incremental Learning via McDiarmid Bound

Phu-Hoa Pham, Chi-Nguyen Tran, Nguyen Lam Phu Quy, Dao Sy Duy Minh, Huynh Trung Kiet, Long Tran-Thanh

AI summary: This paper proposes MIST, which addresses the reliability of streaming decision trees in online class-incremental learning through three integrated components: a McDiarmid confidence radius, a Bayesian inheritance protocol, and KLL quantile sketches, improving robustness on non-Gaussian geometry.

Comments 9 pages of main text, 5 figures

Abstract

Streaming decision trees are natural candidates for open-world continual learning, as they perform local updates, enjoy bounded memory, and have static decision boundaries. Despite these advantages, they still fail in online class-incremental learning due to two coupled miscalibrations: (i) their split criterion grows unreliable as the class count K expands, and (ii) knowledge is not transferred at split time. Both failures share a common root: the range of Information Gain intrinsically scales with log2 K. Consequently, any Hoeffding-style confidence radius derived from it must grow with the class count, which makes a K-independent split criterion structurally impossible and removes the potential benefits of applying streaming decision trees to continual learning. To fix this issue, we present MIST (McDiarmid Incremental Streaming Tree), which resolves both failures through three integrated components: (i) a tight, K-independent McDiarmid confidence radius for Gini splitting that acts as a structural regulariser; (ii) a Bayesian inheritance protocol that projects parent statistics to child nodes via truncated-Gaussian moments, with variance reduction guarantees strongest precisely when splitting is most conservative; and (iii) per-leaf KLL quantile sketches that support both continuous threshold evaluation and geometry-adaptive leaf prediction from a single data structure. On standard and stress-test tabular streams, MIST is competitive with global parametric methods on near-Gaussian benchmarks and uniquely robust on non-Gaussian geometry where SOTA benchmarks collapse.
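
To make the scaling argument concrete, the sketch below evaluates the textbook Hoeffding radius sqrt(R^2 ln(1/delta) / (2n)) with range R = log2 K, as used for Information Gain in classical Hoeffding trees. It is not MIST's McDiarmid radius, whose constants the abstract does not give; the function name `hoeffding_radius` and the parameter values are illustrative assumptions.

```python
import math


def hoeffding_radius(num_classes: int, n: int, delta: float) -> float:
    """Textbook Hoeffding radius with range R = log2(K) (illustrative only)."""
    value_range = math.log2(num_classes)  # range of Information Gain for K classes
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # The radius grows with the class count K at fixed sample size and confidence.
    for k in (2, 4, 16, 256):
        print(k, round(hoeffding_radius(k, n=1000, delta=1e-6), 4))
```

At a fixed sample size the radius is proportional to log2 K, so a split that is confidently separated with few classes can stall as classes accumulate.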

2605.11365 2026-05-19 cs.AI cs.LG stat.ML

Causal Bias Detection in Generative Artificial Intelligence

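Drago Plecko

AI summary: This paper studies causal fairness in generative AI, proposing new causal decomposition results that quantify the fairness impact of different causal pathways and of replacing real-world mechanisms with those of the generative model, and validates the method by analyzing race and gender bias in large language models.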

Abstract

Automated systems built on artificial intelligence (AI) are increasingly deployed across high-stakes domains, raising critical concerns about fairness and the perpetuation of demographic disparities that exist in the world. In this context, causal inference provides a principled framework for reasoning about fairness, as it links observed disparities to underlying mechanisms and aligns naturally with human intuition and legal notions of discrimination. Prior work on causal fairness primarily focuses on the standard machine learning setting, where a decision-maker constructs a single predictive mechanism $f_{\widehat Y}$ for an outcome variable $Y$, while inheriting the causal mechanisms of all other covariates from the real world. The generative AI setting, however, is markedly more complex: generative models can sample from arbitrary conditionals over any set of variables, implicitly constructing their own beliefs about all causal mechanisms rather than learning a single predictive function. This fundamental difference requires new developments in causal fairness methodology. We formalize the problem of causal fairness in generative AI and unify it with the standard ML setting under a common theoretical framework. We then derive new causal decomposition results that enable granular quantification of fairness impacts along both (a) different causal pathways and (b) the replacement of real-world mechanisms by the generative model's mechanisms. We establish identification conditions and introduce efficient estimators for causal quantities of interest, and demonstrate the value of our methodology by analyzing race and gender bias in large language models across different datasets.

2605.09782 2026-05-19 cs.DS stat.ME

Near-Linear Time Generalized Sinkhorn Algorithms for Bounded Genus Graphs

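Krzysztof Choromanski, Derek Long, Ananya Parashar, Dwaipayan Saha

AI summary: This paper proposes GenusSink, a near-linear time generalized Sinkhorn algorithm for bounded genus graphs (such as planar graphs). Using separator-based decomposition, computational geometry techniques, and fast matrix-vector multiplication, it avoids the quadratic time complexity of the brute-force approach and yields more accurate optimal transport computations on bounded genus graphs.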

Abstract

We present GenusSink, a new class of approximate generalized Sinkhorn algorithms with shortest-path-distance costs for bounded genus (e.g. planar) graphs, providing near-linear time (1) pre-processing, (2) iteration steps, and (3) final transport plan matrix querying, as well as near-linear memory. Graphs handled by GenusSink include in particular planar graphs and bounded-genus meshes approximating 3D objects. GenusSink addresses the total quadratic time complexity of its brute-force counterpart by leveraging separator-based decomposition of graphs, computational geometry techniques, and new results on fast matrix-vector multiplications with generalized distance matrices, using, in particular, Fourier analysis and low displacement rank theory. It is inspired by recent breakthroughs in graph theory on approximating bounded genus metrics with small treewidth metrics \citep{minor-free-paper}. The graph-centric approach enables us to target optimal transport problems with the corresponding distributions defined on manifolds approximated by weighted graphs and with cost functions given by geodesic distances. We conduct a rigorous theoretical analysis of GenusSink, provide practical implementations that leverage the separation graph field integrator (S-GFI) data structures newly introduced in this paper, and present empirical verification. GenusSink provides orders of magnitude more accurate computations than other efficient Sinkhorn algorithms, while still guaranteeing significant computational improvements over the baseline. As a by-product of the developed methods, we show that GenusSink is numerically equivalent to the brute-force geodesic Sinkhorn algorithm on $n$-vertex graphs with treewidth $O(\log \log (n))$ (e.g. on trees).
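
For reference, the brute-force baseline mentioned in the abstract can be sketched as the standard dense Sinkhorn iteration, where every step multiplies by a full n-by-n kernel matrix. This is a minimal illustration on a toy path graph, not GenusSink itself; the function name `sinkhorn`, the regularisation value, and the marginals are assumptions made for the example.

```python
import numpy as np


def sinkhorn(cost, a, b, reg=1.0, n_iters=200):
    """Dense Sinkhorn iterations; each step costs O(n^2) matrix-vector products."""
    kernel = np.exp(-cost / reg)        # Gibbs kernel built from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)            # rescale rows towards marginal a
        v = b / (kernel.T @ u)          # rescale columns towards marginal b
    return u[:, None] * kernel * v[None, :]   # entropic transport plan


# Shortest-path distances on a 4-vertex path graph 0-1-2-3 (hypothetical toy input).
idx = np.arange(4)
cost = np.abs(idx[:, None] - idx[None, :]).astype(float)
a = np.array([0.4, 0.3, 0.2, 0.1])
b = np.array([0.1, 0.2, 0.3, 0.4])
plan = sinkhorn(cost, a, b)
```

The two products `kernel @ v` and `kernel.T @ u` are the quadratic-cost steps that a fast matrix-vector multiplication scheme would replace.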

2605.07855 2026-05-19 stat.AP

Jagged AI in Scientific Peer Review: Evidence from POMP Data Analysis

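Jin Wook Lee, William Szegda, Zhisheng Song, Edward L. Ionides

AI summary: This study examines the performance of AI in scientific peer review and finds that AI excels in some areas and falls short in others. Analyzing reviews of POMP data analyses, it finds that AI outperforms humans at detecting technical errors but lags on interpretive errors and narrative coherence.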

Abstract

Despite their growing use in academic writing and statistical analysis, the performance of artificial intelligence (AI) tools in scientific peer review remains a largely unexplored area. A key challenge is jagged AI, a phenomenon where AI exhibits strong ability spikes in some domains while remaining deficient in others. To study this jaggedness in a practical data science context, we considered the task of reviewing partially observed Markov process (POMP) data analyses. POMP models, also known as state-space models or hidden Markov models, are used to fit mechanistic dynamic models to time series data in diverse applications including disease transmission, ecological dynamics, and financial risk assessment. High-quality peer review in this area entails assessment of scientific context, identification of errors in implementing complex algorithms, and decisions concerning methodological best practices. We studied 72 POMP projects from four semesters of a University of Michigan graduate time series course for which the project reports, the source code, and student peer reviews are anonymized and open-access. We compared the human reviews with four AI reviewing agents, using Claude Code with differing instructions implemented as skill files. We found that AI reviewers exhibited a jagged capability profile, proficiently catching human-overlooked technical errors and invalid inference methodology, while failing to match human standards in checking interpretive errors, narrative coherence, and domain-informed model critique. The jaggedness was found to be similar for all agents, consistent with it being primarily a property of the underlying AI model rather than the specific instructions. Skill file configuration shifted which weaknesses agents emphasized, without removing the jaggedness.

2605.07263 2026-05-19 eess.SP cs.AI cs.DC cs.LG stat.ML

Resource-Element Energy Difference for Noncoherent Over-the-Air Federated Learning

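Hao Chen, Zavareh Bozorgasl

AI summary: This paper proposes resource-element energy difference (REED), a noncoherent physical-layer primitive for continuous signed aggregation. REED maps the positive and negative parts of each real-valued update to transmit energies on paired orthogonal resource elements and estimates the signed sum by subtracting the corresponding received energies. It uses slow-timescale calibration of average channel powers but requires no instantaneous transmitter- or receiver-side CSI or channel inversion. For independent Rayleigh fading, exact first- and second-moment expressions are derived for single-shot REED and its chip-diverse extension.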

Comments Preprint; under review; code to replicate the results is available at: https://github.com/zavareh1/REED

Abstract

Over-the-air federated learning (OTA-FL) reduces uplink latency by aggregating client updates directly over the wireless multiple-access channel. Coherent analog aggregation realizes this idea by aligning the phases and amplitudes of simultaneously transmitted waveforms, which typically requires synchronization, instantaneous channel-state information (CSI), phase compensation, and power control. Noncoherent energy detection removes the need for phase-coherent combining, but a single energy measurement is nonnegative and, therefore, cannot represent signed model updates. This paper introduces resource-element energy difference (REED), a noncoherent physical-layer primitive for continuous signed aggregation. REED maps the positive and negative parts of each real-valued update to transmit energies on paired orthogonal resource elements and estimates the signed sum by subtracting the corresponding received energies. The construction uses slow-timescale calibration of average channel powers, but does not require instantaneous transmitter- or receiver-side CSI or channel inversion. For independent Rayleigh fading, we derive exact first- and second-moment expressions for single-shot REED and for a chip-diverse extension that spreads each coordinate over multiple independently faded paired chips. The resulting variance laws separate fading-induced self-noise, signal-noise interaction, and receiver-noise fluctuation, giving an explicit diversity-resource tradeoff. More->The rest of abstract is in the paper.
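
The energy-difference idea can be illustrated with a small Monte Carlo sketch. It is not the authors' implementation: it assumes unit average channel power after calibration, independent Rayleigh fading on every chip and on both elements of a pair, and equal noise variance on both elements. The name `reed_estimate` and all numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def reed_estimate(updates, noise_var=0.1, n_chips=200_000):
    """Average paired energy differences over independently faded chips."""
    pos = np.maximum(updates, 0.0)     # positive parts: energy on the "+" element
    neg = np.maximum(-updates, 0.0)    # negative parts: energy on the "-" element
    k = updates.size

    def cn(shape):                     # unit-variance circular complex Gaussian
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    # Superposed signals on the two resource elements: Rayleigh fading plus noise.
    y_pos = cn((n_chips, k)) @ np.sqrt(pos) + np.sqrt(noise_var) * cn(n_chips)
    y_neg = cn((n_chips, k)) @ np.sqrt(neg) + np.sqrt(noise_var) * cn(n_chips)
    # Energy difference; the equal noise floors cancel in expectation.
    return np.mean(np.abs(y_pos) ** 2 - np.abs(y_neg) ** 2)


updates = np.array([0.5, -0.2, 0.3, -0.1])   # hypothetical client updates
estimate = reed_estimate(updates)
```

Under these assumptions the expected received energy on each element is the sum of the transmitted energies plus the noise variance, so the difference is an unbiased estimate of the signed sum; averaging over more chips reduces its variance.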

2603.17577 2026-05-19 cs.LG cs.AI stat.ML

Identifying Latent Actions and Dynamics from Offline Data via Demonstrator Diversity

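Felix Schur

AI summary: This paper studies recovering latent actions and environment dynamics from offline trajectories without observed actions. Under a demonstrator-diversity assumption, it proves that the latent transitions and demonstrator policies are determined, up to permutation of the latent action labels, when certain conditions hold, providing a new route to learning latent actions and dynamics from offline reinforcement learning data.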

Abstract

Can latent actions and environment dynamics be recovered from offline trajectories when actions are never observed? We study this question in a setting where trajectories are action-free but tagged with demonstrator identity. We assume that each demonstrator follows a distinct policy, while the environment dynamics are shared across demonstrators and identity affects the next observation only through the chosen action. Under these assumptions, the conditional next-observation distribution $p(o_{t+1}\mid o_t,e)$ is a mixture of latent action-conditioned transition kernels with demonstrator-specific mixing weights. We show that this induces, for each state, a column-stochastic nonnegative matrix factorization of the observable conditional distribution. Using sufficiently scattered policy diversity and rank conditions, we prove that the latent transitions and demonstrator policies are identifiable up to permutation of the latent action labels. We extend the result to continuous observation spaces via a Gram-determinant minimum-volume criterion, and show that continuity of the transition map over a connected state space upgrades local permutation ambiguities to a single global permutation. A small amount of labeled action data then suffices to fix this final ambiguity. These results establish demonstrator diversity as a principled source of identifiability for learning latent actions and dynamics from offline RL data.
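
The mixture structure described above can be written out for one fixed state as a small matrix product. The sizes and numbers below are hypothetical toy values chosen only to show the column-stochastic factorization and the permutation ambiguity; they are not from the paper.

```python
import numpy as np

# Hypothetical toy sizes: 3 next observations, 2 latent actions, 3 demonstrators.
# Column a of T is the transition kernel p(o_{t+1} | o_t, a) at a fixed state o_t.
T = np.array([[0.7, 0.1],
              [0.2, 0.1],
              [0.1, 0.8]])
# Column e of Pi is demonstrator e's policy pi_e(a | o_t) over latent actions.
Pi = np.array([[1.0, 0.5, 0.0],
               [0.0, 0.5, 1.0]])

# Observable conditional p(o_{t+1} | o_t, e) = sum_a T[:, a] * Pi[a, e].
P = T @ Pi

# Relabelling the latent actions leaves P unchanged: the permutation ambiguity.
perm = [1, 0]
P_relabelled = T[:, perm] @ Pi[perm, :]
```

Only `P` is observable; identifiability asks when `T` and `Pi` can be recovered from it, which can hold at best up to the relabelling shown in the last two lines.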

2603.17041 2026-05-19 stat.ML cs.AI cs.LG stat.ME

When Marginals Match but Structure Fails: Covariance Fidelity in Generative Models

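Nazia Riasat

AI summary: This paper proposes a covariance-level dependence fidelity criterion to address the shortcomings of evaluation based on marginal distribution matching, and shows empirically that the criterion more reliably separates structure-preserving from structure-discarding generative models.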

Comments 44 pages, 25 figures. Extended version of paper accepted at MathAI 2026 (International Conference on Mathematics of Artificial Intelligence), March 30 - April 3, 2026

Abstract

Generative models are increasingly deployed as substitutes for real data in downstream scientific workflows, yet standard evaluation criteria remain focused on marginal distribution matching. We argue that this represents a fundamental gap: downstream inference is rarely a marginal operation, and a model that passes every univariate diagnostic can still produce structurally unreliable synthetic data. We introduce covariance-level dependence fidelity, measured by D_Sigma(P,Q) = ||Sigma_P - Sigma_Q||_F, as a principled, computable criterion for evaluating whether a generative model preserves the joint structure of data beyond its univariate marginals. Three results formalise this criterion. First, marginal fidelity provides no constraint on dependence structure: D_Sigma can be made arbitrarily large while all univariate marginals match exactly. Second, covariance divergence induces quantifiable downstream instability, including sign reversals in population regression coefficients. Third, bounding D_Sigma provides positive stability guarantees for dependence-sensitive procedures such as PCA via Davis-Kahan-type bounds. Empirical validation across three domains, image data (Fashion-MNIST VAE, n = 60,000), bulk RNA-seq (TCGA-BRCA, n = 1,111), and a small-sample stress test (Alzheimer's gene expression, n = 113), shows that D_Sigma/delta consistently distinguishes structure-discarding from structure-preserving generators in cases where standard marginal diagnostics show little separation, confirming that covariance-level fidelity provides information orthogonal to existing evaluation metrics across domains and sample sizes.
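
A two-dimensional example makes the first two results tangible. The sketch below compares two zero-mean Gaussian covariance matrices with identical unit variances, so their univariate marginals coincide, but with opposite correlation signs. The value of `rho` and the helper name `d_sigma` are illustrative choices.

```python
import numpy as np


def d_sigma(sigma_p, sigma_q):
    """Frobenius norm of the covariance difference, ||Sigma_P - Sigma_Q||_F."""
    return np.linalg.norm(sigma_p - sigma_q, ord="fro")


rho = 0.9
sigma_p = np.array([[1.0, rho], [rho, 1.0]])     # real data: positive dependence
sigma_q = np.array([[1.0, -rho], [-rho, 1.0]])   # generator: same variances, flipped sign

gap = d_sigma(sigma_p, sigma_q)                  # equals 2 * sqrt(2) * rho
# Population slope of regressing coordinate 2 on coordinate 1: Cov / Var.
slope_p = sigma_p[0, 1] / sigma_p[0, 0]
slope_q = sigma_q[0, 1] / sigma_q[0, 0]
```

Both distributions pass any check on single coordinates, yet `gap` is large and the population regression slope flips from `rho` to `-rho`.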

2603.06984 2026-05-19 stat.ML cs.AI cs.GT cs.LG cs.SI

Masking Causality and Conditional Dependence

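Zou Yang, Sophia Xiao, Bijan Mazaheri

AI summary: This paper studies enforcing conditional independence through an averaged constraint. It finds that, from the regulator's side, such a constraint fails to guarantee the stratum-wise requirement, while from the optimizer's side it effectively hides dependence. Regulating direct dependence through averaged statistics of observed decisions is therefore limited, and enforcement must operate at the level of the decision rule.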

Abstract

Many regulatory and analytic problems require that a prohibited variable influence a decision only through a designated allowable channel -- a conditional-independence requirement that arises in path-specific fairness, the handling of classified information, and the regulation of trading on non-public information, among other settings. Such requirements may be enforced either stratum-by-stratum or, more commonly (and more efficiently), through a single averaged constraint on the conditional effect. We study the resulting enforcement problem from two perspectives. From the regulator's side, we formulate causal masking as a linear program and show that averaged-constraint optimization almost surely produces policies that violate the stratum-wise requirement while satisfying the averaged one exactly. The gains from masking grow with confounding and outcome heterogeneity, and detection requires precisely the conditional-independence tests that average constraints aim to avoid. From the optimizer's side, the same construction shows that masked policies recover most of the reward of unconstrained exploitation while being far harder to detect, making them attractive in any setting where the basis of decisions is itself sensitive. Together, these results argue that regulating direct dependence through averaged statistics on observed decisions is structurally limited, and that meaningful enforcement must operate at the level of the decision rule itself.
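
A two-stratum toy policy shows how an averaged constraint can hold exactly while every stratum violates the requirement. The numbers are hypothetical and hand-picked; the paper obtains masked policies from a linear program, which this sketch does not reproduce.

```python
import numpy as np

weights = np.array([0.5, 0.5])            # P(Z = z) for two strata (hypothetical)
# decide[z, a] = P(D = 1 | Z = z, A = a) for prohibited variable A in {0, 1}.
decide = np.array([[0.4, 0.8],
                   [0.7, 0.3]])

stratum_effects = decide[:, 1] - decide[:, 0]     # conditional effect of A per stratum
averaged_effect = float(weights @ stratum_effects)
```

The per-stratum effects are roughly +0.4 and -0.4, so the decision depends strongly on `A` within each stratum, while the weighted average is zero up to floating-point rounding.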

2602.22307 2026-05-19 stat.ME astro-ph.CO astro-ph.GA astro-ph.IM

Global structure of the time delay likelihood

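Namu Kroupa, Will Handley

AI summary: This paper studies intrinsic pathologies of the likelihood function in time delay inference, shows that they challenge standard inference methods, and proposes practical remedies, such as increasing the number of live points, to ensure convergence.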

Comments 21 pages, 8 figures

2602.07618 2026-05-19 cs.LG stat.ML

Neural Networks With Dense Weights Are Not Universal Approximators

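Levi Rauchwerger, Stefanie Jegelka, Ron Levie

AI summary: This study examines the approximation power of dense neural networks and shows that, under bounded weight constraints, densely connected networks cannot approximate arbitrary continuous functions. This reveals an inherent limitation of dense layers and motivates sparse connectivity as necessary for true universality.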

2602.05172 2026-05-19 stat.ML cs.LG math.ST stat.TH

Finite-Particle Rates for Regularized Stein Variational Gradient Descent

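Ye He, Krishnakumar Balasubramanian, Sayan Banerjee, Promit Ghosal

AI summary: This paper studies finite-particle rates for regularized Stein variational gradient descent. It applies a resolvent-type preconditioner to correct the constant-order bias of SVGD, derives non-asymptotic bounds for the time-averaged empirical measure, and, when the target satisfies a W₁I condition, proves W₁ convergence for a large class of smooth kernels.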

2601.16022 2026-05-19 stat.ME

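Sparse Deep Additive Model with Interactions: Enhancing Interpretability and Predictability

Yi-Ting Hung, Li-Hsiang Lin, Vince D. Calhoun

AI summary: This paper proposes the Sparse Deep Additive Model with Interactions (SDAMI), which combines sparse feature selection with deep subnetworks and uses a three-stage strategy to improve interpretability and predictive performance in high-dimensional regression.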

Abstract

Recent advances in deep learning highlight the need for personalized models that can learn from small samples, handle high-dimensional features, and remain interpretable. To address this, we propose the Sparse Deep Additive Model with Interactions (SDAMI), a framework that combines sparsity-driven feature selection with deep subnetworks for flexible function approximation. Central to SDAMI is the Effect Footprint principle, which posits that higher-order interactions leave detectable marginal traces on constituent variables, enabling their discovery without exhaustive search. SDAMI executes this principle through a three-stage strategy: (1) screening for footprint variables, (2) disentangling main effects from interactions via group lasso, and (3) modeling components with dedicated deep subnetworks. Theoretical analysis confirms that footprints vanish only under measure-zero symmetry conditions that are rare in practice, ensuring consistent interaction recovery. Extensive simulations demonstrate that SDAMI successfully identifies pure interactions that heredity-based baselines fundamentally miss, recovering complex effect structures with near-zero false positive rates. Together, these results position SDAMI as a principled framework for interpretable high-dimensional regression.

2509.22459 2026-05-19 stat.ML cs.LG

Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)

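Nikita Kornilov, David Li, Tikhon Mavrin, Aleksei Leonov, Nikita Gushchin, Evgeny Burnaev, Iaroslav Koshelev, Alexander Korotin

AI summary: This paper proposes RealUID, a framework that seamlessly incorporates real data into inverse distillation without GANs, giving a unified distillation method for all matching models that covers flow matching and diffusion models and extends to their variants.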

Abstract

While achieving exceptional generative quality, modern diffusion, flow, and other matching models suffer from slow inference, as they require many steps of iterative generation. Recent distillation methods address this problem by training efficient one-step generators under the guidance of a pre-trained teacher model. However, these methods are often constrained to one specific framework, e.g., only diffusion or only flow models. Furthermore, these methods are originally data-free, and benefiting from real data requires additional complex adversarial training with an extra discriminator model. In this paper, we present RealUID, a universal distillation framework for all matching models that seamlessly incorporates real data into the distillation procedure without GANs. Our RealUID approach offers a simple theoretical foundation that covers previous distillation methods for Flow Matching and Diffusion models, and can also be extended to their modifications, such as Bridge Matching and Stochastic Interpolants. The code can be found at https://github.com/David-cripto/RealUID.

2508.08080 2026-05-19 cs.LG cs.NE stat.AP

Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles

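Cas Oude Hoekstra, Floris den Hengst

AI summary: This paper proposes Symbolic Quantile Regression (SQR) for predicting conditional quantiles and explaining how predictive variables affect the outcome. An airline fuel usage case study, comparing models that predict extreme and central outcomes, illustrates its use in high-stakes applications.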

Journal ref: Transactions on Machine Learning Research, May 2026, https://openreview.net/pdf?id=x9OYbyPJOG

Abstract

Symbolic Regression (SR) is a well-established framework for generating interpretable or white-box predictive models. Although SR has been successfully applied to create interpretable estimates of the average of the outcome, it is currently not well understood how it can be used to estimate the relationship between variables at other points in the distribution of the target variable. Such estimates of e.g. the median or an extreme value provide a fuller picture of how predictive variables affect the outcome and are necessary in high-stakes, safety-critical application domains. This study introduces Symbolic Quantile Regression (SQR), an approach to predict conditional quantiles with SR. In an extensive evaluation, we find that SQR outperforms transparent models and performs comparably to a strong black-box baseline without compromising transparency. We also show how SQR can be used to explain differences in the target distribution by comparing models that predict extreme and central outcomes in an airline fuel usage case study. We conclude that SQR is suitable for predicting conditional quantiles and understanding interesting feature influences at varying quantiles.
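
The abstract does not state which objective SQR optimises. As background, the pinball (quantile) loss is the standard criterion whose minimiser is a quantile rather than the mean, and the sketch below assumes a loss of this kind; the helper name `pinball_loss` and the toy data are illustrative.

```python
import numpy as np


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss at level q in (0, 1)."""
    diff = y_true - y_pred
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


y = np.arange(1.0, 100.0)              # hypothetical outcomes 1, 2, ..., 99
candidates = np.arange(1.0, 100.0)     # constant predictions to compare
losses = [pinball_loss(y, c, 0.9) for c in candidates]
best = candidates[int(np.argmin(losses))]
```

Among constant predictions, the loss at level 0.9 is minimised at the 90th smallest outcome, which is why swapping squared error for this loss turns a mean estimator into a quantile estimator.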

2508.03833 2026-05-19 math.ST math.PR stat.TH

Computable Bounds for Strong Approximations with Applications

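Haoyu Ye, Morgane Austern

AI summary: This paper presents a computable KMT inequality for partial sums of bounded i.i.d. random variables, together with an empirical version for when the standard deviation is unknown, and demonstrates applications to online change-point detection and first hitting time probabilities.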

2507.20982 2026-05-19 math.PR math.ST stat.TH

Bernstein-type dimension-free concentration for self-normalised martingales

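Arya Akhavan, Amitis Shidani, Alex Ayoub, David Janz

AI summary: This paper presents a dimension-free Bernstein-type tail inequality for self-normalised martingales, where the normalisation uses the predictable quadratic variation and the radius depends on the information gain of the observed covariance. Applications include ellipsoidal confidence sequences for logistic regression with adaptively chosen Hilbert-valued covariates and instance-adaptive regret bounds for logistic bandits with Hilbert-valued arms.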

2507.05482 2026-05-19 cs.LG stat.ML

Stein Diffusion Guidance: Training-Free Posterior Correction for Sampling Beyond High-Density Regions

Van Khoa Nguyen, Lionel Blondé, Alexandros Kalousis

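AI summary: This paper proposes Stein Diffusion Guidance, a training-free posterior correction method for sampling beyond high-density regions. It combines stochastic optimal control with Stein variational inference and, through a new theoretical bound and a running cost functional, enables effective guidance in low-density regions.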

Comments Revised version accepted to the ICML 2026 main track; prior version accepted to two ICLR 2026 workshops: ReALM-GEN and DeLTa

Abstract

Training-free diffusion guidance offers a flexible framework for leveraging off-the-shelf classifiers without additional training. Yet, current approaches hinge on posterior approximations via Tweedie's formula, which often yield unreliable guidance, particularly in low-density regions. Stochastic optimal control (SOC), in contrast, enables principled posterior sampling but remains computationally prohibitive for efficient inference. In this work, we reconcile the strengths of these paradigms by introducing Stein Diffusion Guidance (SDG), a novel training-free framework grounded in a surrogate SOC objective. We establish a new theoretical bound on the SOC value function, revealing the necessity of correcting approximate posteriors to reflect true diffusion dynamics. Building on Stein variational inference, SDG computes the steepest descent direction that minimizes the Kullback-Leibler divergence between approximate and true posteriors. By integrating a principled Stein correction mechanism along with a novel running cost functional, SDG enables effective guidance in low-density regions. Our experiments on diverse image-guidance tasks and on challenging small-ligand sampling for protein docking suggest that SDG consistently outperforms standard training-free guidance methods and highlight its potential for broader posterior sampling problems beyond high-density regimes.

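Distribution Transformers: Fast Approximate Bayesian Inference with On-the-Fly Prior Adaptation

George Whittle, Juliusz Ziomek, Jacob Rawling, Maike A. Osborne

AI summary: This paper proposes the Distribution Transformer, a novel architecture that can learn arbitrary distribution-to-distribution mappings. It performs fast approximate Bayesian inference with on-the-fly prior adaptation, substantially reducing computation time while achieving log-likelihood performance on par with or superior to existing methods.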

Comments Spotlight acceptance at ICML 2026

Abstract

While Bayesian inference provides a principled framework for reasoning under uncertainty, its widespread adoption is limited by the intractability of exact posterior computation, necessitating the use of approximate inference. However, existing methods are often computationally expensive, or demand costly retraining when priors change, limiting their utility, particularly in sequential inference problems such as real-time sensor fusion. To address these challenges, we introduce the Distribution Transformer -- a novel architecture that can learn arbitrary distribution-to-distribution mappings. Our method can be trained to map a prior to the corresponding posterior, conditioned on some dataset -- thus performing approximate Bayesian inference. Our novel architecture represents a prior distribution as a (universally-approximating) Gaussian Mixture Model (GMM), and transforms it into a GMM representation of the posterior. The components of the GMM attend to each other via self-attention, and to the datapoints via cross-attention. We demonstrate that Distribution Transformers both maintain flexibility to vary the prior and significantly reduce computation times, from minutes to milliseconds, while achieving log-likelihood performance on par with or superior to existing approximate inference methods across tasks such as sequential inference, quantum system parameter inference, and Gaussian Process predictive posterior inference with hyperpriors.

2501.14993 2026-05-19 math.OC stat.ML

Convergence Analysis of the Wasserstein Proximal Algorithm beyond Geodesic Convexity

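Shuailong Zhu, Xiaohui Chen

AI summary: This paper gives a simple, self-contained convergence analysis of the Wasserstein proximal algorithm that does not assume geodesic convexity of the objective. Under a natural Wasserstein analog of the Euclidean Polyak-Łojasiewicz inequality, it proves an unbiased, linear convergence rate, improving on existing rates under strong geodesic convexity, and extends the analysis to the inexact proximal algorithm for geodesically semiconvex objectives.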

Abstract

The proximal algorithm is a powerful tool to minimize nonlinear and nonsmooth functionals in a general metric space. Motivated by the recent progress in studying the training dynamics of the noisy gradient descent algorithm on two-layer neural networks in the mean-field regime, we provide in this paper a simple and self-contained analysis for the convergence of the general-purpose Wasserstein proximal algorithm without assuming geodesic convexity of the objective functional. Under a natural Wasserstein analog of the Euclidean Polyak-Łojasiewicz inequality, we establish that the proximal algorithm achieves an unbiased and linear convergence rate. Our convergence rate improves upon existing rates of the proximal algorithm for solving Wasserstein gradient flows under strong geodesic convexity. We also extend our analysis to the inexact proximal algorithm for geodesically semiconvex objectives. In our numerical experiments, proximal training demonstrates a faster convergence rate than the noisy gradient descent algorithm on mean-field neural networks.
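
For orientation, the Wasserstein proximal step is usually written as the minimizing-movement (JKO-type) update below, shown next to the Euclidean Polyak-Łojasiewicz inequality that the paper's condition generalises. The step size $\tau$, the constant $\lambda$, and the symbols $F$, $f$, and $\mu_k$ are notation assumed here, not taken from the paper, and the exact form of the Wasserstein analog is left to the paper itself.

```latex
% Wasserstein proximal step with step size \tau > 0
\mu_{k+1} \in \operatorname*{arg\,min}_{\mu} \Big\{ F(\mu) + \frac{1}{2\tau} W_2^2(\mu, \mu_k) \Big\}

% Euclidean Polyak-Lojasiewicz inequality with constant \lambda > 0
\tfrac{1}{2} \lVert \nabla f(x) \rVert^2 \ge \lambda \big( f(x) - \inf f \big)
```

The Euclidean inequality guarantees linear convergence of gradient-type methods without convexity, which is the role its Wasserstein counterpart plays in the analysis described above.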


2410.16307 2026-05-19 q-fin.ST stat.AP stat.ME

Functional Clustering of Discount Functions for Behavioral Investor Profiling

Annamaria Porreca, Viviana Ventre, Roberta Martino, Salvador Cruz Rambaud, Fabrizio Maturo

AI summary: Using functional data analysis, this paper studies heterogeneity in temporal discounting behaviour across temperament types, showing that investor profiles are diverse and offering support for financial advisors who tailor strategies to individuals.

DOI: 10.1002/asmb.70101
Journal ref: Applied Stochastic Models in Business and Industry 42(3), e70101 (2026)

Abstract

Classical finance models are based on the premise that investors act rationally and utilize all available information when making portfolio decisions. However, these models often fail to capture the anomalies observed in intertemporal choices and decision-making under uncertainty, particularly when accounting for individual differences in preferences and consumption patterns. Such limitations hinder traditional finance theory's ability to address key questions like: How do personal preferences shape investment choices? What drives investor behaviour? And how do individuals select their portfolios? One prominent contribution is Pompian's model of four Behavioral Investor Types (BITs), which links behavioural finance studies with Keirsey's temperament theory, highlighting the role of personality in financial decision-making. Yet, traditional parametric models struggle to capture how these distinct temperaments influence intertemporal decisions, such as how individuals evaluate trade-offs between present and future outcomes. To address this gap, the present study employs Functional Data Analysis (FDA) to specifically investigate temporal discounting behaviours revealing nuanced patterns in how different temperaments perceive and manage uncertainty over time. Our findings show heterogeneity within each temperament, suggesting that investor profiles are far more diverse than previously thought. This refined classification provides deeper insights into the role of temperament in shaping intertemporal financial decisions, offering practical implications for financial advisors to better tailor strategies to individual risk preferences and decision-making styles.


2406.16859 2026-05-19 stat.ME

On the extensions of the Chatterjee-Spearman test

Qingyang Zhang

AI summary: This paper studies rank-based combined tests that pair Chatterjee's and Spearman's correlations, broadening the applicability of the test and exploring extensions to the multivariate case.

Comments 46 pages, 8 figures

Abstract

Chatterjee (2021) introduced a novel independence test that is rank-based, asymptotically normal and consistent against all alternatives. One limitation of Chatterjee's test is its low statistical power for detecting monotonic relationships. To address this limitation, in our previous work (Zhang, 2024, Commun. Stat. - Theory Methods), we proposed to combine Chatterjee's and Spearman's correlations into a max-type test and established the asymptotic joint normality. This work examines three key extensions of the combined test. First, motivated by its original asymmetric form, we extend the Chatterjee-Spearman test to a symmetric version, and derive the asymptotic null distribution of the symmetrized statistic. Second, we investigate the relationships between Chatterjee's correlation and other popular rank correlations, including Kendall's tau and quadrant correlation. We demonstrate that, under independence, Chatterjee's correlation and any of these rank correlations are asymptotically joint normal and independent. Simulation studies demonstrate that the Chatterjee-Kendall test has better power than the Chatterjee-Spearman test. Finally, we explore two possible extensions to the multivariate case. These extensions expand the applicability of the rank-based combined tests to a broader range of scenarios.
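
The power gap mentioned in the abstract can be seen by computing both correlations directly. The following is a minimal sketch, assuming no ties in either variable; the function names `chatterjee_xi` and `spearman_rho` are illustrative and are not taken from the paper, and the max-type combined test itself is not implemented here.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(X, Y), assuming no ties."""
    n = len(x)
    # Sort the pairs by x, then replace each y by its rank (1 = smallest).
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    rank_of = {v: r for r, v in enumerate(sorted(y_sorted), start=1)}
    r = [rank_of[v] for v in y_sorted]
    # Sum of absolute rank jumps between x-neighbours.
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


def spearman_rho(x, y):
    """Spearman's rank correlation, assuming no ties."""
    n = len(x)
    rx = {v: r for r, v in enumerate(sorted(x), start=1)}
    ry = {v: r for r, v in enumerate(sorted(y), start=1)}
    d2 = sum((rx[a] - ry[b]) ** 2 for a, b in zip(x, y))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    xs = list(range(99))
    line = [2 * v + 1 for v in xs]           # monotone relationship
    bowl = [(v - 49.3) ** 2 for v in xs]     # non-monotone relationship
    print(chatterjee_xi(xs, line), spearman_rho(xs, line))
    print(chatterjee_xi(xs, bowl), spearman_rho(xs, bowl))
```

On the non-monotone example Spearman's correlation is close to zero while Chatterjee's is close to one, which is the kind of complementarity a combined test is meant to exploit.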


2307.08643 2026-05-19 cs.LG stat.ML

Corruptions of Supervised Learning Problems: Typology and Mitigations

Laura Iacovissi, Nan Lu, Robert C. Williamson

AI summary: This paper develops a general theory of corruption that analyzes changes to the underlying probability distributions via Markov kernels, unifies different corruption models, and examines mitigations for various corruption types.

Comments 73 pages. To be published in Journal of Machine Learning Research 27 (2026) 1-73

Abstract

Corruption is notoriously widespread in data collection. Despite extensive research, the existing literature predominantly focuses on specific settings and learning scenarios, lacking a unified view of corruption modelization and mitigation. In this work, we develop a general theory of corruption, which incorporates all modifications to a supervised learning problem, including changes in model class and loss. Focusing on changes to the underlying probability distributions via Markov kernels, our approach leads to three novel opportunities. First, it enables the construction of a novel, provably exhaustive corruption framework, distinguishing among different corruption types. This serves to unify existing models and establish a consistent nomenclature. Second, it facilitates a systematic analysis of corruption's consequences on learning tasks, by comparing Bayes risks in the clean and corrupted scenarios. Notably, while label corruptions affect only the loss function, attribute corruptions additionally influence the hypothesis class. Third, building upon these results, we investigate mitigations for various corruption types. We expand existing loss-correction methods for label corruption to handle dependent corruption types. Our findings highlight the necessity to generalize this classical corruption-corrected learning framework to a new paradigm with weaker requirements to encompass more corruption types. We provide such a paradigm as well as loss correction formulas in the attribute and joint corruption cases.


2305.18578 2026-05-19 stat.ME cs.LG stat.ML

Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

Alexandre Mösching, Housen Li, Axel Munk

AI summary: This paper proposes Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure whose complexity is polylogarithmic in the sequence length and cubic in the size of the state space, which suits large-scale hidden Markov models. It approximately maximizes local likelihood scores through an adaptive search and decodes faster than Viterbi and PMAP, with an accompanying precision analysis.

DOI: 10.1080/10618600.2025.2572328
Journal ref: Journal of Computational and Graphical Statistics, 35(2), 865-879, 2026

Abstract

Hidden Markov models (HMMs) are characterized by an unobservable Markov chain and an observable process -- a noisy version of the hidden chain. Decoding the original signal from the noisy observations is one of the main goals in nearly all HMM based data analyses. Existing decoding algorithms such as Viterbi and the pointwise maximum a posteriori (PMAP) algorithm have computational complexity at best linear in the length of the observed sequence, and sub-quadratic in the size of the state space of the hidden chain. We present Quick Adaptive Ternary Segmentation (QATS), a divide-and-conquer procedure with computational complexity polylogarithmic in the length of the sequence, and cubic in the size of the state space, hence particularly suited for large scale HMMs with relatively few states. It also suggests an effective way of data storage as specific cumulative sums. In essence, the estimated sequence of states sequentially maximizes local likelihood scores among all local paths with at most three segments, and is meanwhile admissible. The maximization is performed only approximately using an adaptive search procedure. Our simulations demonstrate the speedups offered by QATS in comparison to Viterbi and PMAP, along with a precision analysis. An implementation of QATS is in the R-package QATS on GitHub.
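
As a point of reference for the baseline named in the abstract, here is a minimal sketch of the classical Viterbi decoder for a discrete HMM, written in log space. It is not QATS; the function name and argument layout (`init`, `trans`, `emit` as nested lists of probabilities) are assumptions made for this illustration.

```python
import math


def viterbi(obs, init, trans, emit):
    """Most likely hidden state path for a discrete HMM.

    obs   : list of observation indices
    init  : init[i]     = P(first state = i)
    trans : trans[i][j] = P(next state = j | current state = i)
    emit  : emit[i][o]  = P(observation = o | state = i)
    """
    m = len(init)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Best log-score of any path ending in each state after the first step.
    score = [log(init[i]) + log(emit[i][obs[0]]) for i in range(m)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(m):
            best = max(range(m), key=lambda i: prev[i] + log(trans[i][j]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][j]) + log(emit[j][o]))
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    state = max(range(m), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


if __name__ == "__main__":
    init = [0.5, 0.5]
    trans = [[0.9, 0.1], [0.1, 0.9]]
    emit = [[0.8, 0.2], [0.2, 0.8]]
    print(viterbi([0, 0, 0, 1, 1, 1], init, trans, emit))
```

This version visits every time step once and every pair of states at each step, so its cost grows linearly with the sequence length; that linear dependence is what a polylogarithmic procedure aims to beat on long sequences.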


2211.09284 2026-05-19 eess.SP cs.NA math.NA stat.ME

Iterative execution of discrete and inverse discrete Fourier transforms with applications for signal denoising via sparsification

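H. Robert Frost

AI summary: This paper proposes a family of iterative algorithms that repeatedly execute the discrete and inverse discrete Fourier transforms, applying sparsification to the time-domain and frequency-domain data to denoise signals, in particular to recover periodic spike signals from Gaussian noise.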

2605.18134 2026-05-19 stat.CO stat.ME

Optimal Sampling for Kernel Quadrature on Unbounded Domains

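Edoardo Bandoni, Christian Robert, Julien Stoehr

AI summary: This paper studies randomized quadrature with the aim of robustness rather than optimality for a specific kernel. It proposes an explicit, n-dependent sampling distribution that attains the minimax error rate without knowledge of the kernel, extends to unbounded domains, and comes with theoretical guarantees, yielding a practical, robustly optimal randomized quadrature method.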

Yichen Shen, Mengxin Yu

Abstract

We study distribution-free predictive inference for data with group symmetries, aiming to establish near-conditional coverage guarantees beyond exchangeability for structured data. While many predictive inference methods achieve a target coverage level, most provide marginal coverage. In practice, conditional predictive inference is often preferred, as it quantifies uncertainty for black-box predictions given observed attributes, thereby accommodating heterogeneity. Although many efforts have pursued efficient conditional coverage, existing methods rely on the i.i.d. or exchangeable assumption, often violated in structured settings such as networks, clusters, and imaging data. Recently, SymmPI introduced a unified approach to predictive inference under group symmetries beyond exchangeability; nevertheless, its guarantees remain marginal and do not account for population heterogeneity. To bridge this gap, we introduce C-SymmPI, a framework that achieves near-conditional coverage under general data structures with group symmetries, extending beyond exchangeability to cover networks, cluster-level data, and related structures. Inspired by relaxed multi-accuracy, our approach reformulates conditional coverage as miscoverage error over a user-specified function class. We establish theoretical guarantees under distributional invariance and distribution shift, and derive convergence rates for linear and RKHS function classes, recovering state-of-the-art results in the exchangeable setting as special cases. For computational efficiency, we develop two variants: a projection-based algorithm for high-dimensional observations, and a sampling-based algorithm for large or infinite groups. We demonstrate effectiveness on hierarchical and network data. Empirical results show that C-SymmPI delivers more informative and stable conditional coverage with improved accuracy compared to existing methods.


2605.17920 2026-05-19 stat.ME stat.AP

Multivariate reconciliation for hierarchical time series

Ana Caroline Pinheiro, Rodrigo de Souza Bulhões, Rob J. Hyndman, Paulo Canas Rodrigues

AI summary: This paper proposes a multivariate reconciliation method that yields coherent forecasts for hierarchical time series while accounting for relationships among variables. In numerical simulations and on real data it is more accurate than the other methods considered.

Comments 22 pages, 7 figures, 8 tables

Abstract

Some time series can be hierarchically organized into levels based on certain characteristics, such as geography or other attributes of interest. These series are referred to as hierarchical time series. Typically, forecasts are generated at all levels to ensure coherence, meaning that the forecasts should satisfy the same aggregation constraints as the observed data. Various approaches have been proposed to guarantee this coherence by using a set of base forecasts. The process through which these forecasts are adjusted to become coherent is known as forecast reconciliation. Similar to the univariate case, multivariate time series can also be structured hierarchically. However, all existing approaches are limited to a single variable. As a result, ensuring coherent forecasts requires reconciling each variable separately. However, this process does not account for correlations among multiple variables. To address this limitation, this paper proposes a multivariate reconciliation methodology that ensures coherent forecasts and incorporates relationships among variables. The proposed methodology was tested through numerical simulations, considering distinct scenarios within the series hierarchy and across multiple variables. Additionally, some base forecasting models were evaluated. The methodology was also applied to real employment data of admissions and dismissals in Brazil. The results demonstrated that multivariate reconciliation yielded more accurate outcomes than the other methods considered, both in simulated data and in practical applications.
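
To make "coherence" and "reconciliation" concrete, the sketch below applies ordinary least squares (OLS) reconciliation to a single variable: base forecasts are projected onto the set of forecasts that satisfy the aggregation constraints. This is the familiar single-variable baseline, not the multivariate method the paper proposes; the summing matrix `S`, the helper `solve`, and the example numbers are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: bring the largest entry in column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_reconcile(S, base):
    """Return S (S'S)^{-1} S' base, the OLS-reconciled forecasts.

    S    : summing matrix, one row per series, one column per bottom series
    base : base forecasts, one per row of S (possibly incoherent)
    """
    k = len(S[0])
    StS = [[sum(row[i] * row[j] for row in S) for j in range(k)] for i in range(k)]
    Stb = [sum(row[i] * y for row, y in zip(S, base)) for i in range(k)]
    bottom = solve(StS, Stb)  # reconciled bottom-level forecasts
    return [sum(s * b for s, b in zip(row, bottom)) for row in S]


if __name__ == "__main__":
    # Hierarchy: total = A + B. Base forecasts 100, 40, 50 are incoherent.
    S = [[1, 1], [1, 0], [0, 1]]
    reconciled = ols_reconcile(S, [100.0, 40.0, 50.0])
    print(reconciled)
```

Running this once per variable is the "reconcile each variable separately" approach the abstract describes; it leaves cross-variable correlations unused, which is the gap the paper addresses.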


2605.17910 2026-05-19 stat.ME

Double/Debiased Machine Learning for Continuous Treatment Effects in Panel Data with Endogeneity

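Peikai Wu, Kuan Sun, Zhiguo Xiao

AI summary: This paper proposes a double/debiased machine learning framework for estimating average derivative effects in nonparametric panel models. It extends instrumental-variable methods to the panel setting, handles continuous treatments and several forms of endogeneity, and introduces a cross-fitting scheme for use after time fixed effects are eliminated. Automatic debiased machine learning is implemented through a penalized GMM debiasing term. The proposed estimators are consistent and asymptotically normal for contemporaneous, dynamic, and aggregate effects, with valid variance estimators. Simulations show reduced regularization bias and accurate confidence intervals. An application to ECLS-K data reveals rich dynamics in the effect of family socioeconomic status on children's BMI.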

2605.17864 2026-05-19 stat.ME

Wavelet Based Time Series Models with Time-Varying Thresholds

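Rhea Davis, N. Balakrishna

AI summary: This paper proposes wavelet-based time series models with time-varying thresholds. Representing the threshold through a wavelet series expansion captures irregular and abrupt changes as well as smooth variation in the threshold parameter, giving more flexibility than Fourier-based approaches. Performance is assessed through simulation experiments and applications to real data.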

2605.17850 2026-05-19 stat.ML cs.CV cs.LG cs.NA math.NA math.PR

Simple Approximation and Derivative Free Inference-Time Scaling for Diffusion Models via Sequential Monte Carlo on Path Measures

Chenyang Wang, Weizhong Wang, Yinuo Ren, Jose Blanchet, Yiping Lu

AI summary: This paper proposes URGE, a gradient-free inference-time scaling algorithm that improves diffusion-model sample quality through path-wise importance reweighting. It performs well on synthetic tests and diffusion-model benchmarks while being simple to implement.

Comments accepted by ICML 2026

Abstract

Diffusion-based generative models increasingly rely on inference-time guidance, adding a drift term or reweighting a mixture of experts, to improve sample quality on task-specific objectives. However, most existing techniques require repeated score or gradient evaluations, introducing bias, high computational overhead, or both. We introduce URGE, Unbiased Resampling via Girsanov Estimation, a derivative-free inference-time scaling algorithm that performs path-wise importance reweighting via a Girsanov change of measure. Instead of computing gradient-based particle weights as in previous work, URGE attaches a simple multiplicative weight to each simulated trajectory and periodically resamples. No score, no Hessian, and no PDE evaluation is required. We establish an equivalence between path-wise and particle-wise SMC: the Girsanov path weight admits a backward conditional expectation that recovers the previous particle-level weights, guaranteeing that both schemes produce the same unbiased terminal law. Empirically, URGE outperforms existing inference-time guidance baselines on synthetic tests and diffusion-model benchmarks, achieving better generation quality, while being significantly simpler to implement and fully gradient-free.


2605.17845 2026-05-19 stat.AP

Quantifying Officiating Impact in the NBA: A Referee Impact Metric Analysis Using ESPN Win-Probability Data

Nirek Duma, Leo Benaharon

AI summary: This paper proposes the Ref Impact Metric (RIM), which quantifies the effect of officiating on NBA game outcomes by aggregating win-probability changes, analyzes referee-level patterns, and examines how that impact varies across contexts.

Abstract

Over the past century, basketball analytics has moved from simple box-score rates toward complex context-aware measures that evaluate events by their expected effect on game outcomes. Officiating analysis has not made the same transition: existing work and public discussion still rely heavily on foul rates, foul differentials, reviewed late-game correctness labels, or team/player benefit from calls. This leaves an empirical gap because a low-leverage foul in a decided game should not be treated as equivalent to a whistle that materially shifts win probability in a close game. To address this gap, we introduce the Ref Impact Metric (RIM), a game-level statistic that aggregates the absolute win-probability movement attached to foul events, measuring the impact of each referee for each game. Using ESPN game-summary and win-probability data for NBA seasons 2021-2022 through 2024-2025, we show that RIM is empirically distinct from both foul volume and foul disparity, identify regular-season and postseason referee distributions, and examine home/away, team-side, and referee-team heterogeneity. We then use linear controls intentionally as stress tests: conditioning on home status, team, opponent, season, and postseason series state asks which descriptive outliers persist after basic contextual adjustment. The results show that several team-side and referee-team patterns remain visible after conditioning, but omitted-variable robustness diagnostics indicate that these patterns should be interpreted as observational screening signals rather than evidence of intent, misconduct, or whistle-level responsibility by any single official. Our contribution to the literature is foundational, and we emphasize that this framework should be tested with different win probability models and further causal inference.
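
The core aggregation described in the abstract, summing the absolute win-probability movement attached to foul events within a game, might be sketched as follows. The event fields `is_foul`, `wp_before`, and `wp_after` are hypothetical names chosen for this illustration; the paper's actual data schema, and any weighting or attribution to individual referees, are not specified in the abstract.

```python
def ref_impact(events):
    """Sum of absolute win-probability changes over foul events in one game.

    events : list of dicts with hypothetical keys
             'is_foul' (bool), 'wp_before' and 'wp_after' (floats in [0, 1])
    """
    return sum(
        abs(e["wp_after"] - e["wp_before"])
        for e in events
        if e["is_foul"]
    )


if __name__ == "__main__":
    game = [
        {"is_foul": True, "wp_before": 0.50, "wp_after": 0.55},
        {"is_foul": False, "wp_before": 0.55, "wp_after": 0.70},  # ignored
        {"is_foul": True, "wp_before": 0.70, "wp_after": 0.62},
    ]
    print(ref_impact(game))
```

Because the movements are taken in absolute value, a foul in a close game that swings win probability contributes far more than one in a decided game, which is the distinction from raw foul counts that the abstract emphasizes.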


2605.17808 2026-05-19 cs.LG stat.ML

A Unified Framework for Data-Free One-Step Sampling via Wasserstein Gradient Flows

Chenguang Wang, Tianshu Yu

AI summary: This paper develops a unified theoretical framework for data-free one-step sampling based on Wasserstein gradient flows. It shows that the velocity field induced by f-divergence objectives has a universal form, derives via a soft under-coverage functional a compression-elasticity identity linking divergence choice to the geometry of mass transport, extends the framework to the Log-Variance divergence, and enables one-step inference through a KDE-based implementation and a normalizing-flow route.

Abstract

We develop a unified theoretical framework for data-free one-step sampling from unnormalized target distributions based on Wasserstein gradient flows. For a broad class of standard f-divergence objectives, we show that the induced velocity field admits the universal form $\mathbf{V}(x)=w(r(x))\,\beta(x)$, where $\beta(x)=\nabla \log (p(x)/q(x))$ is shared across objectives and $w$ is determined solely by the choice of divergence. This decomposition shows that standard f-divergence drifts share the same asymptotic target distribution $p$ and differ primarily in how they redistribute transient repair effort across under-covered regions. To formalize this distinction, we derive a one-step regional-response theory for a soft under-coverage functional and obtain a compression-elasticity identity that links divergence choice to the geometry of mass transport into under-covered regions. We further extend the framework beyond the f-divergence family to the Log-Variance (LV) divergence, analyze how the reference distribution alters the resulting drift structure, and motivate a practical LV-inspired surrogate for data-free training. Based on this theory, we instantiate the framework with a KDE-based implementation and describe a complementary normalizing-flow route, enabling one-step inference after training. Experiments on multimodal Gaussian-mixture benchmarks are consistent with the theoretical predictions and demonstrate effective one-step sampling on these targets.


2605.17778 2026-05-19 math.ST cs.LG stat.ME stat.ML stat.TH

Self-Distillation is Optimal Among Spectral Shrinkage Estimators in Spiked Covariance Models

Radu Lecoiu, Debarghya Mukherjee, Pragya Sur

AI summary: This paper studies self-distillation in spiked covariance models, proves that s-step self-distillation is optimal among spectral shrinkage estimators, and shows its advantage over well-known estimators in statistics and machine learning.

Comments 103 pages, 8 figures

Abstract

Self-distillation has emerged as a promising technique for improving model performance in modern machine learning systems. We develop the statistical foundations of self-distillation in spiked covariance models, by introducing and analyzing a broad class of estimators, namely spectral shrinkage estimators. We establish that for spiked covariance matrices with $s$ spikes, $s$-step self-distillation achieves optimal performance among spectral shrinkage estimators, outperforming well-known estimators in statistics and machine learning. Moreover, we show that $s$ steps are necessary for optimality: any $(s-k)$-step distilled estimator is strictly suboptimal for $1 \leq k \leq s$. For the special subclass of isotropic covariances, we show that optimally tuned Ridge regression performs best among spectral shrinkage estimators. We also study a federated approach where multiple data centers share spectral shrinkage estimators and a common server seeks to aggregate them to achieve optimal performance. In this case, we find that the best local rule again takes the form of self-distillation, though it differs from the optimal rule when data are hosted centrally on a single server. Together, our results elucidate why self-distillation improves predictive performance and provide a broader statistical framework connecting it with classical shrinkage-based methods.


2605.17771 2026-05-19 stat.AP

Multi-Class Neurological Disorder Prediction with Tensor Network Feature Engineering

Keshav Balakrishna, Aaryan Chityala, Vivan Kanna, Ishan Pathak, Harshit Ravula, Aaron Lee, Alessandro Hammond, Moemal Al-Wishah, Leo Anthony Celi

AI summary: This paper combines tensor decomposition with an ensemble classifier for multi-class neurological disorder prediction. The model is robust to tensor network expressivity and performs competitively with recent classical approaches on a clinical dataset.

Abstract

Accurate diagnosis of neurological disorders is contingent upon advanced imaging modalities such as Magnetic Resonance Imaging (MRI), which commonly utilize sparse imaging techniques to reconstruct images from limited data, thus reducing storage and acquisition time. However, challenges remain in managing noise and preserving critical diagnostic features for effective analysis. In this study, an ensemble classifier is enriched with PARAFAC CP tensor decompositions, drawing mathematical inspiration from quantum neural network architectures but implemented entirely classically. The model was evaluated on a large, balanced clinical dataset comprising 55,160 images across 8 diagnostic categories, employing both higher and lower PARAFAC rank configurations. Evaluated through 5-fold nested stratified cross-validation, both configurations achieved strong validation performance, demonstrating robustness to tensor network expressivity. Additionally, the proposed model achieved competitive performance relative to recent classical approaches, further underscoring the potential of quantum-inspired classical frameworks to enhance medical image analysis and support reliable clinical diagnosis. Future work will explore the integration of advanced encoding schemes, deployment on real quantum hardware, and the use of more diverse neurological datasets.


2605.17764 2026-05-19 stat.ME

Stationary birth-death processes generating inflation-deflation distributions: Avoiding the issue of dominance

Wanrudee Skulpakdee, Mongkol Hunkrajok

AI summary: This paper studies how modifying the birth and death rates of a birth-death process generates inflation-deflation distributions, and introduces two new distributions of this kind to avoid the dominance issue that can arise with existing approaches.

Abstract

A mixture of two or more count distributions has become deeply embedded in the analysis of excess counts, often relative to the stationary (equilibrium) distributions of birth-death processes such as the geometric, Poisson, Poisson-Lindley (PL), negative binomial (NB), hyper-Poisson (HP), and Conway-Maxwell-Poisson (CMP) distributions. However, the mechanism by which excess counts arise--namely, through modifications of the birth and death rates in the base distributions--has not yet been directly examined in the research literature. All well-known inflation mixture distributions are, in fact, parameterizations of the stationary distributions of birth-death processes. Thus, although the resulting distributions share the same shapes, they arise from distinct mechanisms and are not equivalent in regression analyses. This paper focuses on inflation-deflation stationary distributions arising from modified birth-death processes that form an exponential family and introduces two types of such distributions.


2605.17763 2026-05-19 stat.ME stat.ML

Comparing Two Categorical Gini Correlations with Applications to Classification Problems

Sameera Hewage, Yongli Sang

AI summary: This paper proposes an inferential framework for comparing predictor importance in classification problems. Building on the categorical Gini correlation (CGC) of Dang et al. (2020), it assesses importance by testing differences in CGCs across predictor groups, and validates the approach through simulation studies and real applications.

Abstract

This article proposes an inferential framework for comparing predictor importance in classification problems with categorical response variables. The approach is based on the categorical Gini correlation (CGC) proposed by Dang et al. (2020), a measure of dependence between numerical predictors and categorical outcomes. Predictor importance is evaluated by testing differences in CGCs across competing predictor groups. The proposed methodology accommodates predictors of arbitrary and unequal dimensions and allows for dependence between predictor groups. Asymptotic normality of the test statistic is established under both the null and alternative hypotheses, and the resulting test is shown to be consistent. In addition to deriving the asymptotic distribution, a nonparametric bootstrap procedure is developed as an alternative approach to inference. Simulation studies, along with applications to breast cancer and human activity recognition datasets, demonstrate the effectiveness of the proposed framework.


2605.17749 2026-05-19 cs.LG stat.ML

Testable and Actionable Calibration for Full Swap Regret

Konstantina Bairaktari, Lunjia Hu, Huy L. Nguyen, Jonathan Ullman

AI summary: This paper proposes a new calibration measure, SCDL, that is both actionable and testable without weakening either requirement, satisfies desirable properties such as continuity and consistency, and performs better in practice according to the experiments.

Abstract

AI generated predictions increasingly inform decision making in critical tasks, and therefore must be trustworthy. One widely used measure of trustworthiness is calibration, which requires that the predictions match the true frequencies and can be treated like real probabilities of a given outcome. However, defining calibration is subtle, and designing good measures of calibration error has been an active topic of recent research. The first goal is to find calibration measures that are actionable, meaning they can inform decision makers about their utility loss when predictions are treated as true probabilities, which is known as swap regret. The second goal is to find calibration measures that are testable, meaning that calibration error can be measured from a small sample of predictions and outcomes. Although these are very basic requirements, there is no existing calibration measure that fully satisfies both properties, and all existing measures relax actionability by bounding a weaker notion of swap regret, or relax testability by having suboptimal estimation error. We introduce a new calibration measure, Soft-Binned Calibration Decision Loss (SCDL), which we prove is fully actionable without weakening either requirement, and testable with nearly optimal error rate. In addition, SCDL satisfies other desired properties such as continuity and consistency. We also provide a set of experiments confirming that the theoretical advantages of SCDL compared to other measures lead to better performance in practice.

2605.17745 2026-05-19 stat.ML cs.LG

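Do Stationarity Transformations Really Improve Time Series Forecasting? A Controlled Experimental Evaluation

Bhanu Suraj Malla, Yuqing Hu

AI summary: Using synthetic datasets constructed with known properties (trend, seasonality, heteroscedasticity, and their combinations), this paper applies 14 transformation configurations across seven models and three forecast horizons (3,528 experiments in total) to assess how stationarity transformations affect forecast accuracy across types of nonstationarity and model families. Only 18% of the transformations improve forecasts; variance-stabilizing methods perform better on heteroscedastic data, while differencing series with a linear trend actually degrades accuracy. The experiments support choosing transformations by empirical out-of-sample evaluation rather than by theoretical stationarity assumptions.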

Module-Lattice Security (Part III): Structured CVP Distance on the Log-Unit Lattice

Ming-Xing Luo

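AI summary: This paper studies structured CVP distances on the log-unit lattice. It proves that the L² distance from a random short ring element to the log-unit lattice converges to a specific value and shows where that element lies within the Voronoi cell. It also gives an approximation factor in the L∞ norm, a coarse-lattice theorem, and a trigonometric-function theorem for module determinant ideals, ultimately reducing the CDPR factor for ML-KEM from exponential to sub-polynomial.

Comments 26 pages (simplified version). Most important part in this series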

2605.14692 2026-05-19 math.ST stat.ME stat.TH

Asymptotic Anytime-Valid Inference for U-statistics

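Leheng Cai, Qirui Hu, Weijia Li

AI summary: This paper studies asymptotic anytime-valid confidence sequences for degree-two U-statistics under continuous monitoring. In the nondegenerate case, Hoeffding's projection reduces the problem to a time-uniform central limit theory; in the degenerate case, the SAGE boundary is proposed to handle the quadratic Gaussian-chaos approximation. The resulting widths attain the optimal rates in both regimes.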

English abstract

We study asymptotic anytime-valid confidence sequences for degree-two U-statistics under continuous monitoring. In the nondegenerate case, Hoeffding's projection reduces the problem to a time-uniform central limit theory for the partial sums of the first-order projection, while the canonical remainder is shown to be negligible under mild moment assumptions. A leave-one-out jackknife estimator then yields a fully data-driven procedure, leading to confidence sequences with asymptotic coverage guarantee for the parameter of interest. In the degenerate case, we show that the U-statistic is approximated by a centered quadratic Gaussian-chaos rather than by a simple Gaussian, which poses significant challenges for sequential inference. To address this issue, we novelly develop the Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary, and then provide plug-in implementations based on truncated spectrum estimation with consistency guarantees. The resulting widths can attain the expected time-uniform optimal rates: $\sqrt{\log\log n/n}$ in the nondegenerate regime and $\log\log n/n$ in the degenerate regime. Several widely used U-statistics are discussed within the proposed framework, and numerical experiments further support the validity of the derived theory.
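For readers who want the projection argument spelled out, the standard Hoeffding decomposition of a degree-two U-statistic is sketched below. The notation (symmetric kernel $h$, parameter $\theta$, projections $h_1$, $h_2$) is assumed here for illustration and is not taken from the paper.

```latex
% Degree-two U-statistic with symmetric kernel h and \theta = E[h(X_1, X_2)]
U_n = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j)

% First-order projection and canonical (degenerate) kernel
h_1(x) = \mathbb{E}[h(x, X_2)] - \theta, \qquad
h_2(x, y) = h(x, y) - \theta - h_1(x) - h_1(y)

% Hoeffding decomposition: linear term plus canonical remainder
U_n - \theta = \frac{2}{n} \sum_{i=1}^{n} h_1(X_i)
  + \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h_2(X_i, X_j)
```

When $\operatorname{Var}(h_1(X_1)) > 0$ the linear term dominates, which is the nondegenerate case; when $h_1 \equiv 0$ only the canonical remainder survives, which is the degenerate case.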

2604.12288 2026-05-19 stat.ML cs.LG stat.ME

SMART Fine-tuning Factor Augmented Neural Lasso

Jinhang Chai, Jianqing Fan, Cheng Gao, Qishuo Yin

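AI summary: This paper proposes a residual tuning framework (SMART) that incorporates a pre-trained source model as an augmented feature, targeting variable selection in high-dimensional nonparametric regression. A low-rank factor structure and a residual tuning decomposition let it handle covariate and posterior shifts jointly, and minimax-optimal excess risk bounds are derived.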

Comments Authors are listed in alphabetical order

English abstract

Fine-tuning is a widely used strategy for adapting pre-trained models to new tasks, yet its methodology and theoretical properties in high-dimensional nonparametric settings with variable selection have not yet been developed. We propose a source-model-augmented residual tuning (SMART) framework, which incorporates the pre-trained source model as an augmented feature into the target learner and estimates only the residual target-specific component. The approach is widely applicable, from parametric and sparse models to neural networks and blackbox machine learning models. We focus on the development of fine-tuning factor-augmented neural Lasso, resulting in SMART-FAN-Lasso. This transfer-learning framework for high-dimensional nonparametric regression with variable selection simultaneously handles covariate and posterior shifts. We use a low-rank factor structure to manage high-dimensional dependent covariates and a residual tuning decomposition in which the target function is expressed as a function of source model and other target-specific variables, thereby reducing the effective complexity of the target task. We derive minimax-optimal excess risk bounds, characterizing the precise conditions, in terms of relative sample sizes and function complexities, under which fine-tuning yields statistical acceleration over single-task learning. Extensive numerical experiments across diverse covariate- and posterior-shift scenarios demonstrate that SMART-FAN-Lasso consistently outperforms standard baselines and achieves near-oracle performance even under severe target sample size constraints, empirically validating the derived rates.

2604.07630 2026-05-19 physics.geo-ph stat.AP

Diffusional earthquakes and their slip-distance scaling

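Dye SK Sato, Keisuke Yoshida

AI summary: By analyzing the slip-distance scaling of diffusional earthquakes, the study reveals the diffusive migration of their active seismicity areas and establishes a unified scaling relation.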

Comments 33 pages, 10 figures

English abstract

The final size of an earthquake typically cannot be predicted from its ongoing seismic radiation. Expanding observations reveal distinct exceptions, such as slow earthquakes, injection-induced seismicity, and earthquake swarms, in which fault slip has an upper bound. A common thread among these anomalies is the diffusive migration of their active areas. Here, we report a unified scaling relation for these diffusional earthquakes. By tracking prolonged earthquake swarms in Northeast Japan, we constrained the time evolution of their active seismicity areas and cumulative seismic moments. Their moment-duration trajectories coincide with the final states documented for global swarms and induced seismicity across various scales. When plotted as seismic moment versus seismicity area, their trajectories collapse onto those of slow earthquakes, uniformly explained by a diffusional constant-slip model. This constant-slip scaling carves out a unique class of diffusional earthquakes, where the final available seismic energy is predetermined by slip distance.

2604.01160 2026-05-19 stat.ME

Machine learning methods for finite population parameter estimation in survey sampling

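Mehdi Dagdoug, David Haziza

AI summary: This paper examines the use of machine learning methods in finite-population inference for survey sampling, with an emphasis on design-based validity and statistical inference. Flexible prediction tools can substantially improve estimation accuracy but raise important challenges, mainly because the fitted predictors depend on the sample. The focus is on settings where predictions enter survey estimation through model-assisted estimation, item nonresponse imputation, and unit nonresponse adjustment. For model-assisted estimation and item nonresponse, cross-fitting and Neyman-orthogonal estimating equations adapt ideas from double/debiased machine learning, allowing high-dimensional or nonparametric learners while preserving root-n consistency and asymptotic normality under suitable conditions. For unit nonresponse, standard inverse-probability weighting is outcome-agnostic and operationally attractive, but that same feature makes doubly robust and orthogonal constructions harder to deploy in official statistics. Related developments in small area estimation and probability/nonprobability data integration are briefly discussed.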

English abstract

This pedagogical review examines the use of machine learning methods in finite-population inference for survey sampling, with an emphasis on design-based validity and statistical inference. While flexible prediction tools offer substantial gains in estimation accuracy, they also introduce important challenges, primarily due to the dependence between the fitted predictors and the sample. We focus on settings in which such predictions enter survey estimation through model-assisted estimation, item nonresponse imputation, and unit nonresponse adjustment. For model-assisted estimation and item nonresponse, we show how cross-fitting and Neyman-orthogonal estimating equations can adapt ideas from double/debiased machine learning to survey data, allowing the use of high-dimensional or nonparametric learners while preserving root-n consistency and asymptotic normality under suitable conditions. In contrast, for unit nonresponse, standard inverse-probability weighting remains outcome-agnostic and operationally attractive, but this same feature makes doubly robust and orthogonal constructions harder to deploy in official statistics. We also briefly discuss related developments in small area estimation and probability/nonprobability data integration. Overall, the paper highlights both the promise of machine learning and the fundamental inferential challenges it raises for survey practice.

2603.11276 2026-05-19 stat.ML cs.LG

RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits

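Tong Li, Thiago de Queiroz Casanova, Eric M. Schwartz, Victor Kostyuk, Dehan Kong, Joseph J. Williams

AI summary: This paper proposes a regularization-induced exploration strategy (RIE-Greedy) that uses the randomness in model fitting as an intrinsic source of exploration. It is shown to be theoretically equivalent to Thompson Sampling in the two-armed bandit case and outperforms benchmarks such as epsilon-greedy in large-scale business environments.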

English abstract

Real-world contextual bandit problems with complex reward models are often tackled with iteratively trained models, such as boosting trees. However, it is difficult to directly apply simple and effective exploration strategies--such as Thompson Sampling or UCB--on top of those black-box estimators. Existing approaches rely on sophisticated assumptions or intractable procedures that are hard to verify and implement in practice. In this work, we explore the use of an exploration-free (pure-greedy) action selection strategy, that exploits the randomness inherent in model fitting process as an intrinsic source of exploration. More specifically, we note that the stochasticity in cross-validation based regularization process can naturally induce Thompson Sampling-like exploration. We show that this regularization-induced exploration is theoretically equivalent to Thompson Sampling in the two-armed bandit case and empirically leads to reliable exploration in large-scale business environments compared to benchmark methods such as epsilon-greedy and other state-of-the-art approaches. Overall, our work reveals how regularized estimator training itself can induce effective exploration, offering both theoretical insight and practical guidance for contextual bandit design.

2603.09089 2026-05-19 stat.CO math.PR q-bio.NC

Sampling on Discrete Spaces with Temporal Point Processes

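Cameron A. Stewart, Maneesh Sahani

AI summary: This paper proposes a method for sampling on discrete spaces with temporal point processes: it constructs a multivariate temporal point process whose event-count vector in a fixed-length sliding window converges to the target distribution. Introducing auxiliary randomness reduces the sampler to a birth-death process, which is thus a degenerate case, and simulations on multiple target distributions show the sampler's advantage.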

Comments 20 pages, 1 figure. Minor revisions to wording, notation, and formatting. No substantive changes

English abstract

Temporal point processes offer a powerful framework for sampling from discrete distributions, yet they remain underutilized in existing literature. We show how to construct, for any target multivariate count distribution with downward-closed support, a multivariate temporal point process whose event-count vector in a fixed-length sliding window converges in distribution to the target as time tends to infinity. Structured as a system of potentially coupled infinite-server queues with deterministic service times, the sampler exhibits a discrete form of momentum that suppresses random-walk behaviour. The admissible families of processes permit both reversible and non-reversible dynamics. As an application, we derive a recurrent stochastic neural network whose dynamics implement sampling-based computation and exhibit some biologically plausible features, including relative refractory periods and oscillations. The introduction of auxiliary randomness reduces the sampler to a birth-death process, establishing the latter as a degenerate case with the same limiting distribution. In simulations on 63 target distributions, our sampler always outperforms these birth-death processes and frequently outperforms Zanella processes in multivariate effective sample size, with further gains when normalized by CPU time.

2602.21426 2026-05-19 cs.LG stat.CO

Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators

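Youguang Chen, George Biros

AI summary: This paper proposes an improved independence Metropolis-Hastings algorithm that removes the bias of an approximate posterior through an auxiliary optimization problem, improving stability and sampling efficiency while remaining faithful to the exact model.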

English abstract

We consider the problem of sampling from a posterior distribution arising in Bayesian inverse problems in science, engineering, and imaging. Our method belongs to the family of independence Metropolis-Hastings (IMH) sampling algorithms, which are common in Bayesian inference. Relying on the existence of an approximate posterior distribution that is cheaper to sample from but may have significant bias, we introduce Proximal-IMH, a scheme that removes this bias by correcting samples from the approximate posterior through an auxiliary optimization problem. This yields a local adjustment that trades off adherence to the exact model against stability around the approximate reference point. For idealized settings, we prove that the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing. The method applies to both linear and nonlinear input-output operators and is particularly suitable for inverse problems where exact posterior sampling is too expensive. We present numerical experiments including multimodal and data-driven priors with nonlinear input-output operators. The results show that Proximal-IMH reliably outperforms existing IMH variants.
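As background for the IMH family mentioned above, the following is a minimal sketch of a plain independence Metropolis-Hastings sampler, without the proximal correction, which the abstract does not specify in enough detail to reproduce. The toy target N(1, 1), the proposal N(0, 2²), and all function names are assumptions chosen for illustration.

```python
import math
import random


def independence_mh(log_target, log_proposal, sample_proposal, x0, n_steps, rng):
    """Plain independence Metropolis-Hastings; returns (chain, acceptance_rate)."""
    x = x0
    # Log importance weight of the current state: log pi(x) - log q(x).
    log_w = log_target(x) - log_proposal(x)
    chain, accepted = [], 0
    for _ in range(n_steps):
        y = sample_proposal(rng)  # the proposal ignores the current state
        log_w_y = log_target(y) - log_proposal(y)
        # Accept with probability min(1, w(y) / w(x)); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < log_w_y - log_w:
            x, log_w = y, log_w_y
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps


# Toy example (assumed): unnormalized log densities of N(1, 1) and N(0, 2^2).
def log_target(x):
    return -0.5 * (x - 1.0) ** 2


def log_proposal(x):
    return -0.5 * (x / 2.0) ** 2


def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


chain, acc_rate = independence_mh(
    log_target, log_proposal, sample_proposal, 0.0, 20000, random.Random(0)
)
posterior_mean = sum(chain) / len(chain)
```

The acceptance rate depends on how closely the proposal matches the target, which is why a better-matched approximate posterior improves acceptance and mixing.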

2602.05742 2026-05-19 stat.ML cs.LG math.ST stat.TH

Fast Rates for Nonstationary Weighted Risk Minimization

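Tobias Brock, Thomas Nagler

AI summary: This paper studies the out-of-sample prediction error of weighted empirical risk minimization under nonstationarity. It proposes a general decomposition of the excess risk into a learning term and a term related to distribution drift, and proves oracle inequalities for the learning error under mixing conditions that account for the effective sample size of the weight vector, the complexity of the weights and the hypothesis class, and data dependence.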

2602.04872 2026-05-19 stat.ML cs.AI cs.LG

Multi-layer Cross-attention is Provably Optimal for Multi-modal In-context Learning

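Nicholas Barnfield, Subhabrata Sen, Pragya Sur

AI summary: This paper studies the theoretical optimality of multi-layer cross-attention for multi-modal in-context learning. It proves that, on multi-modal data, cross-attention optimized with gradient flow is Bayes optimal, and that single-layer linear self-attention fails to recover the Bayes-optimal predictor uniformly over the task distribution.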

English abstract

Recent progress has rapidly advanced our understanding of the mechanisms underlying in-context learning in modern attention-based neural networks. However, existing results focus exclusively on unimodal data; in contrast, the theoretical underpinnings of in-context learning for multi-modal data remain poorly understood. We introduce a mathematically tractable framework for studying multi-modal learning and explore when transformer-like architectures can recover Bayes-optimal performance in-context. To model multi-modal problems, we assume the observed data arises from a latent factor model. Our first result comprises a negative take on expressibility: we prove that single-layer, linear self-attention fails to recover the Bayes-optimal predictor uniformly over the task distribution. To address this limitation, we introduce a novel, linearized cross-attention mechanism, which we study in the regime where both the number of cross-attention layers and the context length are large. We show that this cross-attention mechanism is provably Bayes optimal when optimized using gradient flow. Our results underscore the benefits of depth for in-context learning and establish the provable utility of cross-attention for multi-modal distributions.

2602.02830 2026-05-19 cs.LG stat.ME

SC3D: Dynamic and Differentiable Causal Discovery for Temporal and Instantaneous Graphs

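Sourajit Das, Dibyajyoti Chakraborty, Romit Maulik

AI summary: This paper proposes SC3D, a dynamic and differentiable causal discovery method for temporal and instantaneous graphs. A two-stage differentiable framework jointly learns lag-specific adjacency matrices and an instantaneous directed acyclic graph, improving the stability and accuracy of the recovered causal structure.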

Comments 12 pages

English abstract

Discovering causal structures from multivariate time series is a key problem because interactions span across multiple lags and possibly involve instantaneous dependencies. Additionally, the search space of the dynamic graphs is combinatorial in nature. In this study, we propose Stable Causal Dynamic Differentiable Discovery (SC3D), a two-stage differentiable framework that jointly learns lag-specific adjacency matrices and, if present, an instantaneous directed acyclic graph (DAG). In Stage 1, SC3D performs edge preselection through node-wise prediction to obtain masks for lagged and instantaneous edges, whereas Stage 2 refines these masks by optimizing a likelihood with sparsity along with enforcing acyclicity on the instantaneous block. Numerical results across synthetic SVAR systems, nonlinear and chaotic benchmarks, nonstationary dynamics and real-world datasets demonstrate that SC3D achieves improved stability and more accurate recovery of both lagged and instantaneous causal structures compared to existing baselines.

2602.01733 2026-05-19 stat.ML cs.LG

ST-BCP: Tightening Coverage Bound for Backward Conformal Prediction via Non-Conformity Score Transformation

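Junxian Liu, Hao Zeng, Hongxin Wei

AI summary: This paper proposes ST-BCP, which introduces a data-dependent transformation of the non-conformity score to narrow the coverage-bound gap in backward conformal prediction; experiments show that the method effectively reduces the coverage gap.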

2601.20888 2026-05-19 stat.ML cs.LG math.ST stat.CO stat.TH

Latent-IMH: Efficient Bayesian Inference for Inverse Problems with Approximate Operators

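Youguang Chen, George Biros

AI summary: This paper studies how to sample efficiently from the posterior in Bayesian linear inverse problems whose parameter-to-observation operator A is expensive to evaluate. Assuming A can be decomposed so that a cheap approximate operator Ã can be constructed, it proposes Latent-IMH, a method based on the independence Metropolis-Hastings sampler that generates intermediate latent variables with the approximate operator and refines them with the exact operator, shifting computational cost to an offline phase. Theoretical analysis shows favorable KL divergence and mixing time, and experiments show better computational efficiency than existing methods such as NUTS.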

Truthful Calibration Errors in Multi-Class Prediction

Yuxuan Lu, Yifan Wu, Jason Hartline, Lunjia Hu

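AI summary: This paper studies the practical role of truthful calibration errors in multi-class prediction. It proposes perfectly truthful calibration errors for multi-dimensional linear properties of the label distribution and analyzes their decision-theoretic implications, which explain and mitigate the ranking-robustness problem of binned calibration errors.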

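Abstract (translated from the AI-generated Chinese abstract)

Calibrated predictions are useful because their values can be interpreted as probabilities. Calibration errors are therefore widely used to evaluate, compare, and tune probabilistic predictors. Recently, Haghtalab et al. (2024) introduced an additional requirement: truthfulness. A calibration measure is truthful if a predictor minimizes its expected measured error by reporting the true conditional label distribution. Many standard empirical calibration errors are non-truthful: a predictor can appear better calibrated by distorting its probabilities rather than reporting the true values. We study the practical role of truthfulness in calibration measurement for multi-class prediction. First, we introduce perfectly truthful calibration errors for multi-dimensional linear properties of the label distribution, generalizing the truthful calibration error for binary prediction of Hartline et al. (2025). This framework includes full multi-class calibration and class-wise calibration. We also identify a truthful modification of confidence calibration. Second, we analyze the decision-theoretic implications of these truthful errors. For calibrated predictors, the truthful calibration errors preserve Blackwell dominance: a more informative calibrated predictor does not incur a larger expected error. Third, we show that this decision-theoretic interpretation explains and mitigates the observed ranking-robustness problem of binned calibration errors. Empirically, non-truthful confidence calibration errors can reverse model rankings as the number of bins varies, while our truthful errors give more stable rankings across binning choices.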

Improving Random Forests by Smoothing

Ziyi Liu, Phuc Luong, Mario Boley, Daniel F. Schmidt

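AI summary: This paper proposes a kernel-based smoothing mechanism that injects local regularity into random forests while preserving their adaptive partitioning, improving predictive performance particularly when data are scarce.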

Comments v2: Accepted manuscript. 30 pages (18 main + 12 appendix), 6 figures

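Abstract (translated from the AI-generated Chinese abstract)

Random forest regression is a powerful nonparametric method that adapts to local data characteristics through data-driven partitioning and performs well across a wide range of application domains. However, the piecewise-constant nature of random forest predictions means that each partition cell is predicted independently, ignoring any smoothness of the underlying function. Particularly in small-data regimes, this lack of information sharing across the input space can lead to poor performance. In this paper, we propose a kernel-based smoothing mechanism that enhances random forests by introducing local regularity while preserving their adaptive partitioning. Our method applies kernel smoothing to the piecewise-constant output of a random forest, effectively combining the adaptivity of tree-based methods with the smoothness assumptions of kernel methods. We show that this smoothing procedure can be interpreted as capturing the variability/uncertainty of the tree split points under resampling of the training inputs. Experiments confirm that the proposed smoothed random forest consistently improves predictive performance across a variety of test cases, particularly when data are scarce. Code, datasets, and experimental results are publicly available at https://github.com/Neal-Liu-Ziyi/SmoothedRandomForest.git.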

Reliability Analysis for Non-Deterministic Limit States Using Stochastic Emulators

Anderson V. Pires, Maliki Moustapha, Stefano Marelli, Bruno Sudret

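AI summary: This paper presents a reliability analysis method for stochastic simulators that uses suitable surrogate models to reduce computational cost, and validates generalized lambda models and stochastic polynomial chaos expansions, including on a wind turbine reliability case study.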

DOI: 10.1016/j.strusafe.2025.102621
Journal ref: Structural Safety, 117, 102621, pp. 1-14, 2025

English abstract

Reliability analysis is a sub-field of uncertainty quantification that assesses the probability of a system performing as intended under various uncertainties. Traditionally, this analysis relies on deterministic models, where experiments are repeatable, i.e., they produce consistent outputs for a given set of inputs. However, real-world systems often exhibit stochastic behavior, leading to non-repeatable outcomes. These so-called stochastic simulators produce different outputs each time the model is run, even with fixed inputs. This paper formally introduces reliability analysis for stochastic models and addresses it by using suitable surrogate models to lower its typically high computational cost. Specifically, we focus on the recently introduced generalized lambda models and stochastic polynomial chaos expansions. These emulators are designed to learn the inherent randomness of the simulator's response and enable efficient uncertainty quantification at a much lower cost than traditional Monte Carlo simulation. We validate our methodology through three case studies. First, using an analytical function with a closed-form solution, we demonstrate that the emulators converge to the correct solution. Second, we present results obtained from the surrogates using a toy example of a simply supported beam. Finally, we apply the emulators to perform reliability analysis on a realistic wind turbine case study, where only a dataset of simulation results is available.

2411.03936 2026-05-19 cs.LG stat.ML

GUIDE-VAE: Advancing Data Generation with User Information and Pattern Dictionaries

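Kutay Bölat, Simon Tindemans

AI summary: This paper proposes GUIDE-VAE, a generative model built on user embeddings and pattern dictionaries that integrates user information and captures complex feature dependencies, improving generation performance and sample realism on multi-user datasets.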

English abstract

Generative modelling of multi-user datasets has become prominent in science and engineering. Generating a data point for a given user requires employing user information, and conventional generative models, including variational autoencoders (VAEs), often ignore this. This paper introduces GUIDE-VAE, a novel conditional generative model that leverages user embeddings to generate user-guided data. By leveraging shared patterns across users, GUIDE-VAE improves performance in multi-user settings, even under significant data imbalance. In addition to integrating user information, GUIDE-VAE incorporates a pattern dictionary-based covariance composition (PDCC) to improve the realism of generated samples by capturing complex feature dependencies. While user embeddings drive performance gains, PDCC addresses common issues such as noise and over-smoothing typically seen in VAEs. The proposed GUIDE-VAE was evaluated on a multi-user smart meter dataset characterised by substantial data imbalance across users. Quantitative results show that GUIDE-VAE performs effectively on both synthetic data generation and missing-record imputation tasks, while qualitative evaluations indicate that it produces more plausible and less noisy data. These results establish GUIDE-VAE as a promising tool for controlled, realistic data generation in multi-user datasets, with potential applications across domains that require user-informed modelling.

2410.01223 2026-05-19 stat.CO cs.LG

Statistical Taylor Expansion: A New and Path-Independent Method for Uncertainty Analysis

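Chengpu Wang

AI summary: This paper presents a new, path-independent method for uncertainty analysis. It replaces precise input variables with random variables of known distributions and sample counts to compute the mean, deviation, and reliable factor of each result, tracking how input uncertainties propagate so that the final result is path independent, unlike conventional numerical approaches.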

Comments 47 pages, 40 figures

English abstract

As a rigorous statistical approach, statistical Taylor expansion extends the conventional Taylor expansion by replacing precise input variables with random variables of known distributions and sample counts to compute the mean, the deviation, and the reliable factor of each result. It tracks the propagation of the input uncertainties through intermediate steps, so that the final analytic result becomes path independent. Therefore, it differs fundamentally from common approaches in applied mathematics that optimize computational path for each calculation. Statistical Taylor expansion may standardize numerical computations for analytic expressions. This study also introduces the implementation of statistical Taylor expansion termed variance arithmetic and presents corresponding test results across a wide range of mathematical applications. Another important conclusion of this study is that numerical errors in library functions can significantly affect results. It is desirable that each value from library functions be accomplished by an uncertainty deviation. The possible link between statistical Taylor expansion and quantum physics is discussed as well.
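For orientation only, the textbook second-order propagation of uncertainty (the delta method) for a smooth function $f$ of a single input $X$ with mean $\mu$ and variance $\sigma^2$ is sketched below. This is the familiar low-order special case, not the paper's variance arithmetic, whose exact formulas are not stated in the abstract; the symbols are assumptions.

```latex
% Taylor expansion of f around the input mean \mu
f(X) \approx f(\mu) + f'(\mu)(X - \mu) + \tfrac{1}{2} f''(\mu)(X - \mu)^2

% Resulting approximations for the mean and variance of the result
\mathbb{E}[f(X)] \approx f(\mu) + \tfrac{1}{2} f''(\mu)\,\sigma^2, \qquad
\operatorname{Var}[f(X)] \approx f'(\mu)^2\,\sigma^2
```

Higher-order terms require higher moments of the input distribution, which is presumably why the method described above assumes the input distributions are known.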

2406.19152 2026-05-19 stat.ME stat.AP

Composite Likelihood Inference for the Poisson Log-Normal Model

Julien Stoehr, Stephane S. Robin

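AI summary: This paper proposes a novel inference procedure for estimating the parameters of the Poisson log-normal model that combines the EM framework with composite likelihood and importance sampling estimates. It circumvents the high-dimensional integration bottleneck and remains computationally feasible while preserving the asymptotic properties of maximum likelihood estimators.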

DOI: 10.1007/s11222-025-10797-2

English abstract

The Poisson log-normal model is a latent variable model that provides a generic framework for the analysis of multivariate count data. Inferring its parameters can be a daunting task since the conditional distribution of the latent variables given the observed ones is intractable. For this model, variational approaches are the golden standard solution as they prove to be computationally efficient but lack theoretical guarantees on the estimates. Sampling-based solutions are quite the opposite. We first define a Monte Carlo EM algorithm that can achieve maximum likelihood estimators, but that is computationally efficient only for low-dimensional latent spaces. We then propose a novel inference procedure combining the EM framework with composite likelihood and importance sampling estimates. The algorithm preserves the desirable asymptotic properties of maximum likelihood estimators while circumventing the high-dimensional integration bottleneck, thus maintaining computational feasibility for moderately large datasets. This approach enables grounded parameter estimation, confidence intervals, and hypothesis testing. Application to the Barents Sea fish dataset demonstrates the algorithm capacity to identify significant environmental effects and residual interspecies correlations.

2201.04982 2026-05-19 stat.OT

An empirical exploration of the diversified R ecosystem

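Tian-Yuan Huang, Zhilan Lou

AI summary: By analyzing CRAN metadata and literature citation data, this paper explores what drives the development of the R ecosystem and its cross-disciplinary impact, revealing developer collaboration patterns in the R community and their potential influence.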

1810.06433 2026-05-19 stat.CO stat.ME

Calibration procedures for approximate Bayesian credible sets

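Jeong Eun Lee, Geoff K. Nicholls, Robin J. Ryder

AI summary: This paper develops and applies two calibration procedures for checking the coverage of credible intervals estimated with Monte Carlo methods. The core idea is to estimate posterior coverage, via semi-parametric logistic regression and via importance sampling, in order to assess the performance of approximate credible sets.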

Comments 28 pages, 6 Figures, 1 Table, 4 Algorithm boxes. Revision improves clarity of presentation and adds relevant citations

DOI: 10.1214/19-BA1175
Journal ref: Bayesian Analysis 14(4): 1245-1269 (2019)

English abstract

We develop and apply two calibration procedures for checking the coverage of approximate Bayesian credible sets including intervals estimated using Monte Carlo methods. The user has an ideal prior and likelihood, but generates a credible set for an approximate posterior which is not proportional to the product of ideal likelihood and prior. We estimate the realised posterior coverage achieved by the approximate credible set. This is the coverage of the unknown ``true'' parameter if the data are a realisation of the user's ideal observation model conditioned on the parameter, and the parameter is a draw from the user's ideal prior. In one approach we estimate the posterior coverage at the data by making a semi-parametric logistic regression of binary coverage outcomes on simulated data against summary statistics evaluated on simulated data. In another we use Importance Sampling from the approximate posterior, windowing simulated data to fall close to the observed data. We illustrate our methods on four examples.
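The coverage notion described above can be illustrated with a simple simulation: draw the parameter from the prior, draw data from the observation model, build the approximate credible interval, and record whether it contains the parameter. The sketch below estimates only the coverage averaged over data, not the coverage at the observed data that the paper's two procedures target, and the conjugate normal model, the `sd_scale` distortion, and the function name are assumptions made for illustration.

```python
import math
import random


def estimate_coverage(sd_scale, n_sims, rng, z=1.96):
    """Average coverage of a nominal 95% interval from an approximate posterior.

    Assumed toy model: theta ~ N(0, 1), y | theta ~ N(theta, 1), so the exact
    posterior is N(y / 2, 1 / 2). The approximate posterior keeps the exact mean
    but multiplies the standard deviation by sd_scale.
    """
    hits = 0
    for _ in range(n_sims):
        theta = rng.gauss(0.0, 1.0)   # draw the parameter from the prior
        y = rng.gauss(theta, 1.0)     # draw data from the observation model
        post_mean = y / 2.0
        post_sd = math.sqrt(0.5) * sd_scale
        # Record whether the approximate credible interval contains theta.
        if abs(theta - post_mean) <= z * post_sd:
            hits += 1
    return hits / n_sims


exact_coverage = estimate_coverage(1.0, 20000, random.Random(1))
overconfident_coverage = estimate_coverage(0.5, 20000, random.Random(1))
```

With the exact posterior the estimate should sit near the nominal 95%, while halving the posterior standard deviation produces intervals that cover far less often.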

2605.17269 2026-05-19 cs.LG stat.ML

Calibeating for general proper losses: A Bregman divergence approach

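Maximilian Fichtl, Cristóbal Guzmán, Nishant A. Mehta

AI summary: This paper introduces a general framework for calibeating based on regret minimization, covering a large family of proper losses that includes α-Tsallis losses (α ∈ [1, 2]) and Lipschitz losses, and establishes a new regret equality for Be The Regularized Leader.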

Comments 31 pages

English abstract

This work introduces a general framework for calibeating based on regret minimization. As compared to Foster and Hart's seminal calibeating work which had specialized treatments of Brier score (squared loss) and log loss, we consider a large family of proper losses that includes $α$-Tsallis losses (for $α\in [1, 2]$) and Lipschitz losses. Our results for Tsallis losses also hold for an unscaled version of Tsallis loss that recovers log loss. Our analysis is oriented around the Bregman divergence view of a proper loss. Technically, our results for the family of Tsallis losses that we consider are U-calibration results, simultaneously obtaining logarithmic regret for all losses in this family while having a weaker dependence on the dimension compared to previous results. Of potential independent interest, we also show a new regret equality for the regret of Be The Regularized Leader. This regret equality holds for general proper losses and itself is based on two results related to online updating formulas for the generalized variance, the latter being a previously introduced generalization of variance based on Bregman divergences.

2605.17240 2026-05-19 stat.ME

The FORSS Framework for Sample Size and Power Calculations With Win Statistics for Hierarchical Endpoints

Baoshan Zhang, Huiman X. Barnhart, Yuan Wu, Roland A. Matsouaka

AI summary: This paper proposes the FORSS framework, a formula-based approach to sample size and power calculations for clinical trials with hierarchical endpoints; it lets investigators specify marginal treatment effects using familiar metrics together with a flexible joint working distribution, overcoming limitations of existing methods.

Abstract

Win statistics have gained increasing popularity as primary analysis methods for clinical trials with hierarchical endpoints (HEs) as primary endpoints. However, existing sample size and power calculation approaches in trial design still face several limitations and challenges: simulation-based approaches are computationally intensive, while existing formula-based methods often rely on simplifying assumptions such as independence among HEs, or require specification of overall win statistics and tie probability that are difficult to elicit a priori in practice. To address these challenges, we propose the FORSS framework, a FORmula-based Super-Sample approach that allows investigators to specify marginal treatment effects using familiar metrics (e.g., hazard ratios, mean differences, and risk differences) together with a flexible joint working distribution for the HEs. Rather than repeatedly simulating full trials at each candidate sample size, FORSS uses super-samples to estimate the population-level plug-in quantities required by analytical formulas for both power and sample size calculation. We evaluated the performance of the proposed FORSS through extensive simulation studies. The results show that the formula-based FORSS closely matches empirical power across a wide range of scenarios while maintaining Type~I error rates near the nominal 5\% level. An illustration based on the HEART-FID trial further shows that endpoint-dependence specifications can materially affect projected power and required sample size when planning trials with HEs.
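
For readers unfamiliar with win statistics, the pairwise comparison they are built on can be sketched as follows. This is not the FORSS method; it is a minimal illustration that assumes fully observed numeric endpoints where larger is better, listed in priority order, with no censoring. The data and function names are hypothetical. It uses the usual definitions: win ratio = wins / losses, net benefit = (wins - losses) / pairs, and win odds = (wins + ties/2) / (losses + ties/2).

```python
def compare(a, b):
    """Compare two patients endpoint by endpoint in priority order.

    Returns 1 if `a` wins, -1 if `b` wins, 0 if every endpoint is tied.
    """
    for x, y in zip(a, b):
        if x > y:
            return 1
        if x < y:
            return -1
    return 0


def win_statistics(treatment, control):
    """Count wins, losses and ties over all treatment-control pairs."""
    wins = losses = ties = 0
    for t in treatment:
        for c in control:
            result = compare(t, c)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    pairs = wins + losses + ties
    return {
        "wins": wins,
        "losses": losses,
        "ties": ties,
        "win_ratio": wins / losses if losses else float("inf"),
        "net_benefit": (wins - losses) / pairs,
        "win_odds": (wins + 0.5 * ties) / (losses + 0.5 * ties),
        "tie_probability": ties / pairs,
    }


# Hypothetical data: (first-priority endpoint, second-priority endpoint).
treatment = [(5, 2), (3, 1), (5, 0)]
control = [(5, 1), (3, 1), (2, 4)]
stats = win_statistics(treatment, control)
```

The second endpoint is consulted only when the first is tied, which is why dependence between endpoints and the tie probability both affect the resulting statistics.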

2605.17238 2026-05-19 cs.LG stat.ML

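Differentiable Optimization Layers for Guaranteed Fairness in Deep Learning

David Troxell, Noah Roemer, Guido Montúfar

AI summary: This paper proposes a differentiable optimization layer, called a "fairness layer", that guarantees a chosen notion of output parity is satisfied when integrated into a neural network, and introduces an online dual inference algorithm that provides provable fairness guarantees for streaming predictions, even with arbitrarily small batch sizes.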

Comments To be published in International Conference on Machine Learning (ICML), 2026

2605.17107 2026-05-19 stat.ML cs.LG math.OC math.PR

Diffusion-Based Stochastic Operator Networks for Uncertainty Quantification in Stochastic Partial Differential Equations

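Phuoc-Toan Huynh, Richard Archibald, Feng Bao

AI summary: This paper proposes a new framework for uncertainty quantification of solution operators associated with stochastic partial differential equations (SPDEs). Although SPDEs play a central role in modeling complex physical systems under uncertainty, their practical use typically requires specifying the magnitude and structure of model uncertainties, which are often unknown and difficult to infer from noisy measurements. To this end, the paper develops a stochastic operator-learning framework that learns directly from noisy data and outputs both a mean solution field and a quantification of uncertainty. The proposed method, the Stochastic Operator Network (SON), combines the structure of the Deep Operator Network (DeepONet) with Stochastic Neural Networks (SNNs) to model stochasticity and enable probabilistic prediction. Training minimizes a Hamiltonian-type loss and optimizes the resulting objective using the Stochastic Maximum Principle. Numerical experiments on benchmark SPDEs under multiple uncertainty sources demonstrate the accuracy and robustness of the method in capturing solution structure and quantifying predictive uncertainty.

Abstract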

We introduce a novel framework for uncertainty quantification of solution operators associated with stochastic partial differential equations (SPDEs). Although SPDEs play a central role in modeling complex physical systems under uncertainty, their practical use typically requires specifying the magnitude and structure of model uncertainties that are often unknown and difficult to infer from noisy measurements. To address this challenge, we develop a stochastic operator-learning framework that learns directly from noisy data and outputs both a mean solution field and a quantification of uncertainty. The proposed method, namely the Stochastic Operator Network (SON), is constructed by combining the structure of the Deep Operator Network (DeepONet) with Stochastic Neural Networks (SNNs) to model stochasticity and enable probabilistic prediction. The training procedure is carried out by minimizing a Hamiltonian-type loss and optimizing the resulting objective using the Stochastic Maximum Principle. Numerical experiments on benchmark SPDEs under multiple uncertainty sources demonstrate the accuracy and robustness of the proposed method in capturing solution structure and quantifying predictive uncertainty.

2605.17086 2026-05-19 econ.GN cs.AI cs.CY q-fin.EC stat.AP

Global Automation Atlas

Prashant Garg, Tommaso Crosta, Jasmin Baier

AI summary: This paper proposes a task-based and country-specific approach to classifying automation exposure across the world, disentangling labour-substituting from labour-augmenting automation, the relevant technology channel, and the material role of AI. The study covers 124 countries and generates 2.33 million task-country labels for economies covering 99% of world population and GDP.

Comments 65 pages, 6 figures. Data and code: https://automationatlas.org/

Abstract

Automation affects the labour content of work differently across different contexts. Yet, most existing exposure measures assign fixed scores to tasks or occupations, limiting comparisons of automation exposure across countries. We develop a task-based and country-specific approach to classify automation exposure across the world to disentangle labor-substituting from labor-augmenting automation, the relevant technology channel, and the material role of AI. Our measure spans 124 countries, generating an atlas of 2.33 million task-country labels for economies covering 99% of world population and GDP. We present five descriptive results. First, exposure is highly uneven, ranging from 3.3% of tasks in South Sudan to 61.6% in China, and rises strongly with income, although substantial variation remains within income groups. Second, across countries, exposed tasks are skewed towards substitution rather than augmentation, but low-income countries are disproportionately exposed to substitution, whereas middle-income countries are more heterogeneous. Third, less technologically advanced forms of automation account for more than half of exposed tasks in low-income countries but about one quarter in high-income countries; while other more complex channels generally rise with income levels. Fourth, AI tends to be less prevalent in simpler channels of automation, but also more prevalent in labour-substituting margins in lower income settings and to augment labour in higher income settings. Fifth, we find that females seem to be disproportionately more exposed to labour-substituting automation than males. Our methodology provides a basis for comparing automation exposure across development stages, linking it with cross-country data and allowing us to treat exposure levels, labour margins, technological channels and AI involvement as separate dimensions.

2605.17050 2026-05-19 stat.ME

Single World Intervention Graphs as Distributions: A Framework for Causal Identification

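Christian Bartels

AI summary: This paper proposes a framework that treats single world intervention graphs as distributions for causal identification; by systematically deriving identification expressions for estimands defined by interventions, it extends the backdoor derivations in the existing literature and proposes a front-door derivation applicable to complex scenarios.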

2605.16970 2026-05-19 math.ST stat.TH

Quantifying Dependence Between Random Vectors: A New Index with Applications

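Chuancun Yin

AI summary: This paper proposes a new index that quantifies the degree of dependence between random vectors. The index takes values in [0, 1] and equals zero if and only if the random vectors are sub-independent. Sub-independence is a stronger condition than mere uncorrelatedness, yet still strictly weaker than full independence. The index is constructed from characteristic functions and admits a simplified representation in terms of moments. The paper establishes its theoretical properties, derives a computationally efficient formula for the corresponding empirical measure, studies the asymptotic behaviour of the estimator, and illustrates its practical use through applications in machine learning, actuarial science, and regeneration theory (再生成理论).

Comments 31 pages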

2605.16919 2026-05-19 stat.ML cs.LG

CAST: Causal Anchored Simplex Transport for Distribution-Valued Time Series

Jiecheng Lu, Jieqi Di, Runhua Wu, Yuwei Zhou

AI summary: This study proposes CAST, which uses causal anchored simplex transport for causal forecasting of distribution-valued time series, addresses a structural failure mode in distribution transport, and performs well across multiple benchmarks.

Abstract

Many decision-facing stochastic systems are observed through aggregate distributions rather than scalar trajectories: queue occupancies, mobility shares, public-health mixtures, generation-source shares, ecological compositions, and air-quality severity profiles all live on the probability simplex and evolve over time. We study causal (online) forecasting for these distribution-valued time series and argue that the transition operator itself should be structured around the simplex. We introduce CAST (Causal Anchored Simplex Transport), a successor-local operator that (i) retrieves empirical successors from causal context, (ii) stabilizes them with a persistence anchor, and (iii) applies a bounded local stochastic transport on ordered supports; every stage preserves the simplex by construction. We identify a structural failure mode, latent transition-kernel aliasing, where similar observed distributions evolve differently under different contextual regimes, and prove that any forecaster depending only on an aliased summary incurs an irreducible weighted Jensen-Shannon excess-risk lower bound, while the CAST hypothesis class contains the regime-aware Bayes successor; for ordered supports an additional Pinsker separation holds whenever the transported successor lies outside the no-transport anchor hull. On eleven public and simulated benchmarks spanning ecology, energy, diet, mortality, employment, air quality, severe weather, mobility, and G/G/1, G_t/G/1 queue occupancy, CAST attains the best average rank on both one-step KL (1.27) and autoregressive rollout JSD (1.91), winning 8/11 sections on each metric against a broad statistical, compositional, recurrent, convolutional, and Transformer baseline set, and top-2 on all 11 sections for offline KL. Component ablations and a controlled synthetic aliasing experiment corroborate the theory.

2605.16913 2026-05-19 stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG math.PR

A Fourier perspective on the learning dynamics of neural networks: from sample complexities to mechanistic insights

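Fabiola Ricci, Claudia Merger, Sebastian Goldt

AI summary: This paper studies the learning dynamics of neural networks from a Fourier perspective, highlighting the approximate translation invariance and power-spectrum properties of natural images. It shows that simple neural networks on image classification tasks first rely on amplitude information and only later exploit phase information, proves that classification based on phase information alone is hard for high-dimensional inputs, and shows how the power spectrum accelerates the learning of phase information.

Journal ref: ICML 2026

Propagation of Chaos in Contextual Flow Maps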

Shi Chen, Zhengjiang Lin, Kaizhao Liu, Philippe Rigollet

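AI summary: This paper develops a quantitative statistical theory of transformers in the large-context regime by adopting the abstraction of contextual flow maps (CFMs): dynamical systems that evolve a distinguished token in the presence of a contextual measure across a stack of attention blocks. In this framework the finite-context model approximates an idealized infinite-context system in which the contextual measure is replaced by its underlying population, so the context length n becomes a statistical resource. Using the McKean-Vlasov structure of the dynamics and the classical machinery of propagation of chaos, the paper establishes a forward bound on the deviation between finite- and infinite-context CFMs along depth, and a backward bound on the deviation between the corresponding training trajectories across iterations of online gradient descent. Both bounds achieve the optimal Wasserstein rate n^{-1/d} for general CFMs and the parametric rate n^{-1/2} for a restricted class of CFMs that includes transformers. The analysis rests on a new Eulerian adjoint formulation and stability estimates for the resulting forward-adjoint system, both of which may be of independent interest.

Comments 31 pages, 1 figure

Abstract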

We develop a quantitative statistical theory of transformers in the large-context regime by adopting the abstraction of contextual flow maps (CFMs): dynamical systems that evolve a distinguished token in the presence of a contextual measure across a stack of attention blocks. Within this framework, the finite-context model approximates an idealized infinite-context system in which the contextual measure is replaced by its underlying population, so that the context length $n$ becomes a statistical resource. Exploiting the McKean--Vlasov structure of the dynamics and the classical machinery of propagation of chaos, we establish a forward bound controlling the deviation between the finite- and infinite-context CFMs uniformly along depth, and a backward bound controlling the deviation between the corresponding training trajectories uniformly across iterations of online gradient descent. Both bounds achieve the optimal Wasserstein rate $n^{-1/d}$ for general CFMs and parametric rate $n^{-1/2}$ for a restricted class of CFMs that includes transformers as a special case. The analysis rests on a new Eulerian adjoint formulation of the loss gradient and stability estimates for the resulting forward--adjoint system, both of which may be of independent interest.

2605.16742 2026-05-19 cs.CV stat.ME

Diffeomorphic Cortical Alignment via Direct Warping of Streamline Endpoints

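Yang Xiang, Martin Cole, Zhengwu Zhang

AI summary: This paper proposes a connectivity-based cortical alignment method that aligns cortical surfaces by operating directly on white-matter streamline endpoints, improving tract-level correspondence and achieving higher connectivity overlap coefficients and greater robustness on major fiber tracts.

Piecewise Linear Isotonic Regression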

Timo Kuosmanen, Juan F. Monge, José L. Ruiz, Xun Zhou

AI summary: This paper proposes a piecewise linear smoothing framework that addresses the inability of conventional isotonic regression to provide marginal properties, using a bilevel optimization approach to improve estimation accuracy in both convex and non-convex settings.

Abstract

Isotonic regression provides a flexible, tuning-free approach to estimating monotonic functions without imposing global curvature constraints, yet the estimated regression function is inherently a step function. This paper addresses a key limitation of such estimators: their inability to provide meaningful marginal properties, such as shadow prices or elasticities. We propose a novel piece-wise linear smoothing framework that recovers meaningful marginal estimates even in non-convex settings. Building on the concept of conditional convexity originally developed in deterministic frontier analysis, we formulate the smoothing process as a bilevel optimization problem that fits a continuous, monotonic, piece-wise linear function to the initial isotonic regression predictions. Monte Carlo simulations demonstrate that the proposed approach can significantly improve estimation accuracy in both convex and non-convex settings for univariate and multivariate data. We apply this approach to analyze agglomeration economies in Finnish municipalities, illustrating its practical value.
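
The step-function behaviour that motivates the smoothing can be seen in a small example. The sketch below is not the paper's method; it is a plain pool-adjacent-violators fit with unit weights, written under the assumption of a one-dimensional, already-ordered design. The function name and the data are illustrative.

```python
def isotonic_fit(y):
    """Least-squares non-decreasing fit via pool-adjacent-violators (unit weights)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)  # constant value on each block
    return fit


# Hypothetical responses observed at increasing values of a single regressor.
y = [1.0, 3.0, 2.0, 4.0, 3.5, 5.0]
fit = isotonic_fit(y)
```

For this input the fitted values are 1.0, 2.5, 2.5, 3.75, 3.75, 5.0. The fit is flat within each pooled block, so its slope there is zero, which is why marginal quantities such as elasticities cannot be read off it directly.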

2605.12547 2026-05-19 econ.EM cs.LG q-fin.ST stat.AP

The Payment Heterogeneity Index: An Integrated Unsupervised Framework for High-Volume Procurement Oversight and Decision Support

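Kyriakos Christodoulides

AI summary: This paper proposes the Payment Heterogeneity Index (PHI), which integrates Gaussian mixture model parameters with non-parametric statistics for high-volume procurement oversight and decision support, revealing payment structure and latent regimes.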

Comments Request category change from econ.EM -> stat.ML. Paper is methodological, introducing a new unsupervised ML/stat framework (SHI/PHI index) for distributional structure. Methodology is general; procurement is the application. stat.ML is more appropriate primary; econ.EM as cross-list

Abstract

Public procurement is vulnerable to error, fraud, and corruption, particularly as high transaction volumes overwhelm oversight. While research often focuses on tender-stage anomalies, post-award payment monitoring remains underexplored. Since labelled datasets are rare and methods like Benford's Law face restrictive assumptions, there is a need for interpretable, unsupervised frameworks for high-volume procurement oversight and decision support. This paper introduces the Structural Heterogeneity Index (SHI), a composite statistic for one-dimensional samples, and its payment-specific instantiation, the Payment Heterogeneity Index (PHI), characterising payment structure and latent regimes. It incorporates Gaussian Mixture Model (GMM) parameters alongside non-parametric statistics, integrating four interpretable components: modality, asymmetry, tail behaviour, and structural dispersion. Uniquely, the tail-behaviour component captures both distributional heaviness and extreme-value concentration, while structural-dispersion combines the variability, prevalence, and separation of latent payment regimes. Applied to UK municipal procurement data, PHI identifies a financially significant cohort (0.6\% of suppliers; 10.1\% of high-volume vendors) with structurally distinct payment patterns. Statistical testing further supports these differences, and targeted human verification confirms the plausibility of prioritised cases. Comparative analysis shows PHI reveals regime separation obscured by the Coefficient of Variation ($ρ= 0.310$). PHI provides a transparent, decomposable, and computationally lightweight framework for procurement integrity oversight and targeted audit prioritisation.

2605.10088 2026-05-19 stat.ME

Threshold Breakdown Point

Tianjun Ke, Marco Avella Medina

AI summary: This paper proposes a new approach to finite-sample robustness, defines the threshold breakdown point and the finite-sample m-sensitivity, extends the decision breakdown point of Zhang (1996), and shows how these notions correspond to finite-sample counterparts for hypothesis testing.

Abstract

We introduce a novel approach to finite sample robustness that avoids the pessimism of traditional breakdown analyses. We define the threshold breakdown point, the smallest contamination fraction needed to induce a prescribed deviation, and the finite sample m-sensitivity, the worst-case deviation that an estimator can incur after m observations are contaminated. We derive these measures for commonly used M-estimators, their standard errors and related test statistics. This allows us to extend the decision breakdown point of Zhang (1996) to obtain general breakdown characterizations for hypothesis testing, and show how these notions correspond to finite sample counterparts of the power and level breakdown functions of He, Simpson and Portnoy (1990). We complement our work with an inferential framework for the threshold breakdown and m-sensitivity that yields consistency and asymptotic normality results, as well as a valid multiplier bootstrap for uncertainty quantification. We illustrate the practical utility of our methods in various numerical examples and an application to a two sample testing problem for a blood pressure dataset.

2605.00155 2026-05-19 cs.LG cs.CL math.OC stat.ML

Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback

Yikai Wang, Shang Liu, Jose Blanchet

AI summary: This paper proposes Wasserstein distributionally robust regret optimization (DRRO) for reinforcement learning from human feedback. It studies the promptwise problem through a simplex allocation model and shows that, under an ℓ1-ground-cost Wasserstein ambiguity set, the inner worst-case regret admits an exact solution and the optimal policy has a water-filling structure, which leads to an efficient policy-gradient algorithm.

Abstract

Reinforcement learning from human feedback (RLHF) has become a core post-training step for aligning large language models, yet the reward signal used in RLHF is only a learned proxy for true human utility. From an operations research perspective, this creates a decision problem under objective misspecification: the policy is optimized against an estimated reward, while deployment performance is determined by an unobserved objective. The resulting gap leads to reward over-optimization, or Goodharting, where proxy reward continues to improve even after true quality deteriorates. Existing mitigations address this problem through uncertainty penalties, pessimistic rewards, or conservative constraints, but they can be computationally burdensome and overly pessimistic. We propose Wasserstein distributionally robust regret optimization (DRRO) for RLHF. Instead of pessimizing worst-case value as in standard DRO, DRRO pessimizes worst-case regret relative to the best policy under the same plausible reward perturbation. We study the promptwise problem through a simplex allocation model and show that, under an $\ell_1$-ground-cost Wasserstein ambiguity set, the inner worst-case regret admits an exact solution and the optimal policy has a water-filling structure. These results lead to a practical policy-gradient algorithm with a simple sampled-bonus interpretation and only minor changes to GRPO-style RLHF training. The framework also clarifies theoretically why DRRO is less pessimistic than DRO, and our experiments show that DRRO mitigates over-optimization more effectively than existing baselines while standard DRO is systematically over-pessimistic.

2604.20031 2026-05-19 math.OC cs.LG stat.ML

Decision-Focused Federated Learning Under Heterogeneous Objectives and Constraints

Konstantinos Ziliaskopoulos, Alexander Vinel

AI summary: This paper studies decision-focused federated learning under heterogeneous objectives and constraints, derives heterogeneity bounds based on the SPO+ surrogate loss, shows that federated learning is more robust under strongly convex feasible sets, and validates this experimentally.

Abstract

We consider Decision-Focused Federated Learning (DFFL), a predict-then-optimize setting in which multiple clients collaboratively train predictive models for downstream linear optimization problems without exchanging raw data. Besides the data heterogeneity typical of standard federated learning, clients may also have different objective functions and feasible regions. Building on the SPO+ surrogate loss, we derive heterogeneity bounds that separate objective shift, measured through cost-vector distances, from feasible-set shift, measured through support-function and shape-distance terms. We show that, for general compact feasible sets, small objective perturbations can still induce nonvanishing decision-focused loss discrepancies, while strongly convex feasible regions yield sharper stability-based bounds. We then lift these pointwise bounds to a local-versus-federated excess-risk comparison, showing that federation is beneficial when the statistical advantage of pooling exceeds a client-specific heterogeneity penalty. Computational experiments on polyhedral and strongly convex problems confirm that federation is substantially more robust under strongly convex feasible regions. Finally, we evaluate a simple validation-based interpolation between local and federated DFFL models. This interpolation mitigates the theoretical tradeoff and reduces aggregate regret and worst-client harm in both synthetic experiments and a PJM energy-pricing case study.

2604.13276 2026-05-19 stat.ME math.ST stat.TH

Addressing Confounding by Indication Through (Un)Measured Centre Characteristics in Learn-As-you-GO (LAGO) Trials

Minh Thu Bui, Christopher T. Longenecker, Ante Bing, Donna Spiegelman, Allison R. Webel, Hayden B. Bosworth, Judith J. Lok

AI summary: This paper introduces fixed center effects to control for confounding by indication, unifies LAGO theory across continuous and binary outcome types, and provides statistical tests and optimization methods.

Abstract

The Learn-As-you-Go (LAGO) design is an adaptive clinical trial design that allows modifications to multicomponent intervention packages across stages. Centers participate in more than one stage, as is common in large-scale implementation trials. In LAGO trials, center characteristics may act as confounders, predicting both the intervention package and the outcomes. We extend the LAGO theory by introducing fixed center effects to control for confounding by indication through measured and unmeasured center characteristics. Conditioning on center characteristics by including fixed center effects ensures asymptotic results hold without requiring explicit characterization of unmeasured confounders. Our methods apply even with small numbers of centers. LAGO theory is established for continuous outcomes following a generalized linear model and binary outcomes following a logistic regression model, unifying theory across outcome types. Point- and interval estimators are derived, and consistency and asymptotic normality are established. Valid hypothesis tests for the overall intervention effect are provided, and the optimal intervention package minimizing cost subject to a target outcome mean is obtained via constrained optimization.

2603.25860 2026-05-19 stat.ML cs.LG

On the Expressive Power of Contextual Relations in Transformers

Demián Fraiman

AI summary: This paper proposes a measure-theoretic framework that models contextual relations as probabilistic objects, reveals the connection between softmax attention and entropy-regularized optimal transport, and proves that Transformers can approximate arbitrary contextual relation rules.

Abstract

Transformer architectures have achieved remarkable empirical success in modeling contextual relations, yet a clear understanding of their expressive power is still lacking. In this work, we introduce a measure-theoretic framework in which contextual relations are modeled as probabilistic objects, either as conditional distributions or as joint distributions (couplings). This perspective reveals a natural connection between standard softmax attention and entropy-regularized optimal transport, providing a unified view of attention as a normalization of an underlying affinity function. Within this framework, we establish a universal approximation theorem for contextual systems using standard Softmax Attention and alternately Sinkhorn normalization. These results show that Transformer architectures can approximate arbitrary contextual relations rules, and that the choice of normalization determines how these relations are represented. Moreover, they provide a principled explanation for why Transformers are effective at modeling contextual relations.
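
The contrast between the two normalizations named in the abstract can be sketched numerically. The following is an illustration, not the paper's construction: it assumes a small square affinity matrix `scores` with hypothetical values, applies row-wise softmax as in standard attention, and applies Sinkhorn iterations (alternating row and column normalization of exp(scores)). For a strictly positive square matrix these iterations converge to a doubly stochastic matrix, so both row and column sums equal one.

```python
import math


def softmax_rows(scores):
    """Row-wise softmax: each row becomes a conditional distribution."""
    out = []
    for row in scores:
        shift = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        out.append([v / total for v in exps])
    return out


def sinkhorn(scores, iters=200):
    """Alternate row and column normalization of exp(scores)."""
    k = [[math.exp(v) for v in row] for row in scores]
    n_rows, n_cols = len(k), len(k[0])
    for _ in range(iters):
        k = [[v / sum(row) for v in row] for row in k]  # rows sum to 1
        col = [sum(k[i][j] for i in range(n_rows)) for j in range(n_cols)]
        k = [[k[i][j] / col[j] for j in range(n_cols)] for i in range(n_rows)]  # columns sum to 1
    return k


# Hypothetical affinity scores between three queries and three keys.
scores = [
    [0.2, 1.0, -0.5],
    [0.0, 0.3, 0.8],
    [1.2, -0.1, 0.4],
]

attn = softmax_rows(scores)
coupling = sinkhorn(scores)

attn_row_sums = [sum(row) for row in attn]
coupling_row_sums = [sum(row) for row in coupling]
coupling_col_sums = [sum(coupling[i][j] for i in range(3)) for j in range(3)]
```

Softmax constrains only the row sums, which matches the conditional-distribution reading; Sinkhorn constrains both marginals, which matches the coupling reading.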

2603.20904 2026-05-19 stat.ME math-ph math.DS math.MP nlin.CD physics.data-an stat.ML

Weak-Form Recovery of Stochastic Generators and Dynamical Invariants

Eshwar R A, Gajanan V. Honnavar

AI summary: Using weak projection, this paper jointly identifies the drift and diffusion terms of a stochastic process from a sparse regression, yielding an explicit symbolic generator amenable to spectral analysis, and validates its accuracy on benchmark systems.

Comments 21 pages, 5 figures

Abstract

Spectral gaps, Kramers escape rates, and position-dependent relaxation timescales are dynamical invariants encoded in the infinitesimal generator $\Lop$ of a stochastic flow. We show that weak projection of the governing Itô SDE onto temporal test functions produces an endogeneity bias of order $O(T\,\dt^{3/2})$ that grows with the observation window and cannot be eliminated by additional data. Projecting instead onto spatial Gaussian kernels removes the bias exactly: $\mathcal{F}_{t_n}$-measurability and the tower property guarantee unbiased regression rows at every step. The resulting framework jointly identifies the drift $b(x)$ and diffusion $a(x)$ from a single sparse regression, producing an explicit symbolic generator amenable to spectral analysis. Validation on three benchmark systems yields coefficient errors below 5%, stationary-density total-variation distances below 0.01, and autocorrelation functions that faithfully reproduce true relaxation timescales.

2603.14942 2026-05-19 eess.SY cs.SY stat.ME

A System-Theoretic Approach to Hawkes Process Identification with Guaranteed Positivity and Stability

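Xinhui Rong, Girish N. Nair

AI summary: This paper proposes a system-theoretic identification method for Hawkes processes that uses an orthogonal Lagrange basis to guarantee positivity and stability, and solves the parameter-constrained problem efficiently via semidefinite programming.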

Comments 6 pages, 2 figures

2603.01388 2026-05-19 cs.LG stat.ML

Invariant-Stratified Propagation for Expressive Graph Neural Networks

Asela Hevapathige, Ahad N. Zehmakan, Asiri Wijesinghe, Saman Halgamuge

AI summary: This paper proposes the Invariant-Stratified Propagation framework, which improves the expressive power of graph neural networks through a new WL variant and an efficient neural network implementation, addressing the problem of capturing structural heterogeneity.

Journal ref: Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

Abstract

Graph Neural Networks (GNNs) face fundamental limitations in expressivity and capturing structural heterogeneity. Standard message-passing architectures are constrained by the 1-dimensional Weisfeiler-Leman (1-WL) test, unable to distinguish graphs beyond degree sequences, and aggregate information uniformly from neighbors, failing to capture how nodes occupy different structural positions within higher-order patterns. While methods exist to achieve higher expressivity, they incur prohibitive computational costs and lack unified frameworks for flexibly encoding diverse structural properties. To address these limitations, we introduce Invariant-Stratified Propagation (ISP), a framework comprising both a novel WL variant (ISP-WL) and its efficient neural network implementation (ISPGNN). ISP stratifies nodes according to graph invariants, processing them in hierarchical strata that reveal structural distinctions invisible to 1-WL. Through hierarchical structural heterogeneity encoding, ISP quantifies differences in nodes' structural positions within higher-order patterns, distinguishing interactions where participants occupy different roles from those with uniform participation. We provide formal theoretical analysis establishing enhanced expressivity beyond 1-WL, convergence guarantees, and inherent resistance to oversmoothing. Extensive experiments across graph classification, node classification, and influence estimation demonstrate consistent improvements over standard architectures and state-of-the-art expressive baselines.
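
The 1-WL limitation referred to above can be reproduced with a few lines of colour refinement. This sketch is not ISP-WL; it is plain 1-WL with uniform initial colours, run on a classic pair of graphs that it cannot separate: a 6-cycle and two disjoint triangles, both 2-regular on six nodes. A 6-node path is included as a graph that 1-WL does separate. Graph encodings and function names are illustrative.

```python
from collections import Counter


def wl_histogram(adj, rounds=3):
    """Run 1-WL colour refinement and return the final colour histogram.

    Colours are kept as nested tuples (own colour, sorted neighbour colours),
    so histograms from different graphs are directly comparable.
    """
    colors = {v: 0 for v in adj}  # uniform initial colouring
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())


# 6-cycle: every node has two neighbours on a single ring.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

# Two disjoint triangles: also 2-regular, but disconnected.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}

# 6-node path: endpoints have degree 1, so refinement separates it.
path6 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}

same_as_triangles = wl_histogram(cycle6) == wl_histogram(two_triangles)
same_as_path = wl_histogram(cycle6) == wl_histogram(path6)
```

Here `same_as_triangles` is true and `same_as_path` is false: every node in the two regular graphs sees the same multiset of neighbour colours at every round, so refinement never splits them.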


2602.04353 2026-05-19 stat.OT

Anyone for chess? Analysing chess ratings above high thresholds

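Nils Lid Hjort

AI summary: This paper analyses the distribution of players whose chess ratings lie above high thresholds and proposes new models to explain the differences among top players.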

Comments 9 pages, 7 figures

2601.21170 2026-05-19 cs.LG stat.ML

The Powers of Precision: Structure-Informed Detection in Complex Systems -- From Customer Churn to Seizure Onset

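Augusto Santos, Teresa Santos, Catarina Rodrigues, José M. F. Moura

AI summary: This paper proposes a structure-informed machine learning method for the early detection of critical events in complex systems. By learning an optimal feature representation together with a classification module, it identifies and exploits hidden causal structure, and it shows its effectiveness on seizure detection and customer churn prediction.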

Abstract

Emergent phenomena -- onset of epileptic seizures, sudden customer churn, or pandemic outbreaks -- often arise from hidden causal interactions in complex systems. We propose a machine learning method for their early detection that addresses a core challenge: unveiling and harnessing a system's latent causal structure despite the data-generating process being unknown and partially observed. The method learns an optimal feature representation from a one-parameter family of estimators -- powers of the empirical covariance or precision matrix -- offering a principled way to tune in to the underlying structure driving the emergence of critical events. A supervised learning module then classifies the learned representation. We prove structural consistency of the family and demonstrate the empirical soundness of our approach on seizure detection and churn prediction, attaining competitive results in both. Beyond prediction, and toward explainability, we ascertain that the optimal covariance power exhibits evidence of good identifiability while capturing structural signatures, thus reconciling predictive performance with interpretable statistical structure.


2601.06009 2026-05-19 stat.ML cs.LG eess.SP math.PR stat.AP

Detecting Stochasticity in Discrete Signals via Nonparametric Excursion Theorem

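Sunia Tanweer, Firas A. Khasawneh

AI summary: This paper proposes a nonparametric method based on excursion and crossing theorems for continuous semimartingales. It distinguishes diffusion processes from deterministic signals by comparing empirical excursion counts with their theoretical expectation, without relying on a parametric model.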

DOI: 10.1063/5.0324348

Abstract

We develop a practical framework for distinguishing diffusive stochastic processes from deterministic signals using only a single discrete time series. Our approach is based on classical excursion and crossing theorems for continuous semimartingales, which relate the number $N_\varepsilon$ of excursions of magnitude at least $\varepsilon$ to the quadratic variation $[X]_T$ of the process. The scaling law holds universally for all continuous semimartingales with finite quadratic variation, including general Itô diffusions with nonlinear or state-dependent volatility, but fails sharply for deterministic systems -- thereby providing a theoretically certified method of distinguishing between these dynamics, as opposed to the subjective entropy- or recurrence-based state-of-the-art methods. We construct a robust data-driven diffusion test that compares the empirical excursion counts against the theoretical expectation. The resulting ratio $K(\varepsilon)=N_{\varepsilon}^{\mathrm{emp}}/N_{\varepsilon}^{\mathrm{theory}}$ is then summarized by a log-log slope deviation measuring departure from the $\varepsilon^{-2}$ law, which yields a classification as diffusion-like or not. We demonstrate the method on canonical stochastic systems, some periodic and chaotic maps and systems with additive white noise, as well as the stochastic Duffing system. The approach is nonparametric, model-free, and relies only on the universal small-scale structure of continuous semimartingales.


2512.23978 2026-05-19 cs.LG math.OC stat.ML

Assured autonomy: How operations research powers and orchestrates generative AI systems

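Tinglong Dai, David Simchi-Levi, Michelle Xiao Wu, Yao Xie

AI summary: This paper examines how operations research methods can improve the feasibility, robustness, and risk control of generative AI systems as they shift toward autonomous decision-making.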

Comments Authors are listed alphabetically; Production and Operations Management (POM), 2026

Abstract

Generative artificial intelligence (GenAI) is shifting from conversational assistants toward agentic systems -- autonomous decision-making systems that sense, decide, and act within operational workflows. This shift creates an autonomy paradox: as GenAI systems are granted greater operational autonomy, they should, by design, embody more formal structure, more explicit constraints, and stronger tail-risk discipline. We argue that stochastic generative models can be fragile in operational domains unless paired with mechanisms that provide verifiable feasibility, robustness to distribution shift, and stress testing under high-consequence scenarios. To address this challenge, we develop a conceptual framework for assured autonomy grounded in operations research (OR), built on two complementary approaches. First, flow-based generative models frame generation as deterministic transport characterized by an ordinary differential equation, enabling auditability, constraint-aware generation, and connections to optimal transport, robust optimization, and sequential decision control. Second, operational safety is formulated through an adversarial robustness lens: decision rules are evaluated against worst-case perturbations within uncertainty or ambiguity sets, making unmodeled risks part of the design. This framework clarifies how increasing autonomy shifts OR's role from solver to guardrail to system architect, with responsibility for control logic, incentive protocols, monitoring regimes, and safety boundaries. These elements define a research agenda for assured autonomy in safety-critical, reliability-sensitive operational domains.


2512.22473 2026-05-19 stat.ML cs.AI cs.LG

Gradient Dynamics of Attention: How Cross-Entropy Sculpts Bayesian Manifolds

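Naman Agarwal, Siddhartha R. Dalal, Vishal Misra

AI summary: By analysing how cross-entropy training reshapes attention scores and value vectors in a transformer, this study derives an advantage-based routing law for attention scores and a responsibility-weighted update for values, and shows how gradient dynamics sculpt the Bayesian manifolds that support probabilistic reasoning.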

Comments v2: Add dual-entropy connection - advantage signal drives \rho down; fix duplicate bibliography entries (synced from Paper I)

Abstract

Transformers empirically perform precise probabilistic reasoning in carefully constructed ``Bayesian wind tunnels'' and in large-scale language models, yet the mechanisms by which gradient-based learning creates the required internal geometry remain opaque. We provide a complete first-order analysis of how cross-entropy training reshapes attention scores and value vectors in a transformer attention head. Our core result is an \emph{advantage-based routing law} for attention scores, \[ \frac{\partial L}{\partial s_{ij}} = \alpha_{ij}\bigl(b_{ij}-\mathbb{E}_{\alpha_i}[b]\bigr), \qquad b_{ij} := u_i^\top v_j, \] coupled with a \emph{responsibility-weighted update} for values, \[ \Delta v_j = -\eta\sum_i \alpha_{ij} u_i, \] where $u_i$ is the upstream gradient at position $i$ and $\alpha_{ij}$ are attention weights. These equations induce a positive feedback loop in which routing and content specialize together: queries route more strongly to values that are above-average for their error signal, and those values are pulled toward the queries that use them. We show that this coupled specialization behaves like a two-timescale EM procedure: attention weights implement an E-step (soft responsibilities), while values implement an M-step (responsibility-weighted prototype updates), with queries and keys adjusting the hypothesis frame. Through controlled simulations, including a sticky Markov-chain task where we compare a closed-form EM-style update to standard SGD, we demonstrate that the same gradient dynamics that minimize cross-entropy also sculpt the low-dimensional manifolds identified in our companion work as implementing Bayesian inference. This yields a unified picture in which optimization (gradient flow) gives rise to geometry (Bayesian manifolds), which in turn supports function (in-context probabilistic reasoning).
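The routing law quoted in this abstract has the form of the softmax Jacobian applied to the upstream gradient, so it can be checked numerically. The sketch below is only an illustration and not the authors' code. It assumes a single query, a linear loss $L = u^\top o$ with $o = \sum_j \alpha_j v_j$ (so that the upstream gradient is exactly $u$), and arbitrary toy numbers; every function name is hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over one query's attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss(scores, values, u):
    # Assumed toy loss: L = u . o, where o = sum_j alpha_j v_j.
    alpha = softmax(scores)
    return sum(a * dot(u, v) for a, v in zip(alpha, values))

def routing_gradient(scores, values, u):
    # Closed form: dL/ds_j = alpha_j * (b_j - E_alpha[b]), with b_j = u . v_j.
    alpha = softmax(scores)
    b = [dot(u, v) for v in values]
    mean_b = sum(a * bj for a, bj in zip(alpha, b))
    return [a * (bj - mean_b) for a, bj in zip(alpha, b)]

def numeric_gradient(scores, values, u, eps=1e-6):
    # Central finite differences, used only as an independent check.
    grads = []
    for j in range(len(scores)):
        up = list(scores)
        dn = list(scores)
        up[j] += eps
        dn[j] -= eps
        grads.append((loss(up, values, u) - loss(dn, values, u)) / (2 * eps))
    return grads

scores = [0.2, -1.0, 0.7]                        # toy attention scores s_j
values = [[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]   # toy value vectors v_j
u = [0.5, -0.3]                                  # toy upstream gradient

analytic = routing_gradient(scores, values, u)
numeric = numeric_gradient(scores, values, u)
```

In this toy setting the closed-form and finite-difference gradients agree, and the score gradients for one query sum to zero because the advantages are centred under the attention weights.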


2512.22471 2026-05-19 cs.LG cs.AI stat.ML

The Bayesian Geometry of Transformer Attention

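Naman Agarwal, Siddhartha R. Dalal, Vishal Misra

AI summary: By constructing Bayesian wind tunnels, this paper verifies that transformers perform Bayesian inference in context, finds that they implement posterior updating and routing through geometric mechanisms, and shows the necessity of attention and the shortcomings of flat architectures.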

Comments v2: Add dual-entropy measurement framework (H_I, H_P, \rho = H_P/H_I); incorporate Overleaf revisions; fix duplicate bibliography entries (akyurek mashup; openai title; legacy aliases removed)


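Graph Neural Networks for Residential Location Choice: Connections to Classical Logit Models

Zhanhong Cheng, Lingqian Hu, Yuheng Bu, Yuqi Zhou, Shenhao Wang

AI summary: This paper proposes a graph-neural-network-based model of residential location choice that captures relationships among spatial alternatives, outperforms traditional models, and shows the potential of combining deep learning with discrete choice models.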

DOI: 10.1016/j.trb.2026.103464

Abstract

Researchers have adopted deep learning for classical discrete choice analysis because it can capture complex feature relationships and achieve higher predictive performance. However, existing deep learning approaches cannot explicitly capture the relationships among choice alternatives, which have been a long-standing focus of classical discrete choice models. To address this gap, this paper introduces graph neural networks (GNNs) as a novel framework for analyzing residential location choice. The GNN-based discrete choice models (GNN-DCMs) offer a structured approach for neural networks to capture dependence among spatial alternatives, while maintaining clear connections to classical random utility theory. Theoretically, we demonstrate that the GNN-DCMs incorporate the nested logit (NL) model and the spatially correlated logit (SCL) model as two specific cases, yielding a novel algorithmic interpretation through message passing among alternatives' utilities. Empirically, the GNN-DCMs outperform benchmark MNL, SCL, and feedforward neural networks in predicting residential location choices among Chicago's 77 community areas. Regarding model interpretation, the GNN-DCMs can capture individual heterogeneity and exhibit spatially aware substitution patterns. Overall, these results highlight the potential of GNN-DCMs as a unified and expressive framework for synergizing discrete choice modeling and deep learning in complex spatial choice contexts.


2507.09148 2026-05-19 stat.ML cs.LG math.OC

A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

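Alberto Del Pia, Dekun Zhou

AI summary: This paper proposes a randomized approximation algorithm for sparse PCA based on the basic SDP relaxation. It constructs deterministic and randomized solutions and outputs the best of them, achieving with high probability an approximation ratio of at most the sparsity constant, and under certain conditions an approximation ratio with a logarithmic bound.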

Comments 29 pages, 2 figures

Abstract

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm takes an (approximate) SDP solution, constructs one deterministic sparse solution and several randomized solutions, and outputs the best among them. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a technical assumption, which is consistently satisfied in our numerical tests, the average approximation ratio is also bounded by $\mathcal{O}(\log{d})$, where $d$ is the number of features. We show that this technical assumption is satisfied if the SDP solution is low-rank, or has exponentially decaying eigenvalues. We then present two classes of instances for which this technical assumption holds. We also demonstrate that in a covariance model, which generalizes the spiked Wishart model, the deterministic solution in our algorithm achieves a near-optimal approximation ratio. We demonstrate the efficacy of our algorithm through numerical tests on real-world datasets.


2506.11229 2026-05-19 stat.ME physics.ed-ph

Advancing clustering methods in physics education research: A case for mixture models

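Minghui Wang, Meagan Sundstrom, Karen Nylund-Gibson, Marsha Ing

AI summary: This paper discusses the use of mixture models in physics education research, contrasts k-modes clustering and latent class analysis in theory, and uses parallel analyses to show their similarities and differences when addressing the same research questions.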

DOI: 10.1103/1fn4-nqvj

Abstract

Clustering methods are often used in physics education research (PER) to identify subgroups of individuals within a population who share similar response patterns or characteristics. K-means (or k-modes, for categorical data) is one of the most commonly used clustering methods in PER. This algorithm, however, is not model-based: it relies on algorithmic partitioning and assigns individuals to subgroups with definite membership. Researchers must also conduct post-hoc analyses to relate subgroup membership to other variables. Mixture models offer a model-based alternative that accounts for classification errors and allows researchers to directly integrate subgroup membership into a broader latent variable framework. In this paper, we outline the theoretical similarities and differences between k-modes clustering and latent class analysis (one type of mixture model for categorical data). We also present parallel analyses using each method to address the same research questions in order to demonstrate these similarities and differences. We provide the data and R code to replicate the worked example presented in the paper for researchers interested in using mixture models.


2506.10959 2026-05-19 cs.LG cs.AI math.ST stat.TH

Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods

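Zhaiming Shen, Alexander Hsu, Rongjie Lai, Wenjing Liao

AI summary: This paper studies the theory of in-context learning on structured geometric data. By connecting the attention mechanism to kernel methods, it reveals how transformers perform kernel-based prediction on manifolds and derives generalization error bounds.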

Abstract

While in-context learning (ICL) has achieved remarkable success in natural language and vision domains, its theoretical understanding, particularly in the context of structured geometric data, remains unexplored. This paper initiates a theoretical study of ICL for regression of Hölder functions on manifolds. We establish a novel connection between the attention mechanism and classical kernel methods, demonstrating that transformers effectively perform kernel-based prediction at a new query through its interaction with the prompt. This connection is validated by numerical experiments, revealing that the learned query-prompt scores for Hölder functions are highly correlated with the Gaussian kernel. Building on this insight, we derive generalization error bounds in terms of the prompt length and the number of training tasks. When a sufficient number of training tasks are observed, transformers give rise to the minimax regression rate of Hölder functions on manifolds, which scales exponentially with respect to the prompt length with the exponent depending on the intrinsic dimension of the manifold, rather than the ambient space dimension. Our result also characterizes how the generalization error scales with the number of training tasks, shedding light on the complexity of transformers as in-context kernel algorithm learners. Our findings provide foundational insights into the role of geometry in ICL and novel tools to study ICL of nonlinear models.


2505.12181 2026-05-19 stat.ME

Reliable fairness auditing with semi-supervised inference

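Jianhui Gao, Jessica Gronsbell

AI summary: This paper proposes the Infairness framework, which uses semi-supervised inference to audit fairness with limited labeled data. It imputes missing outcomes via regression with nonlinear basis functions, improving the robustness and efficiency of estimation, and experiments show substantially reduced variance on medical data.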

Abstract

Machine learning (ML) models often exhibit bias that can exacerbate inequities in biomedical applications. Fairness auditing, the process of evaluating a model's performance across subpopulations, is critical for identifying and mitigating these biases. However, audits typically rely on large volumes of labeled data, which are costly and labor-intensive to obtain. To address this challenge, we introduce $\textit{Infairness}$, a unified framework for auditing a wide range of fairness criteria using semi-supervised inference. Our approach combines a small labeled dataset with a large unlabeled dataset by imputing missing outcomes via regression with carefully selected nonlinear basis functions. Through extensive theoretical and empirical analyses, we show that our proposed estimator is (i) robust to specification of the ML or imputation model and (ii) substantially more efficient than supervised estimation based solely on the labeled data. In two real-world fairness audits using electronic health record and medical imaging data, Infairness reduces variance by approximately 50% compared to supervised estimation, underscoring its value for reliable fairness auditing with limited labeled data.


2505.03205 2026-05-19 cs.LG cs.NA math.NA math.ST stat.TH

Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights

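Zhaiming Shen, Alex Havrilla, Rongjie Lai, Alexander Cloninger, Wenjing Liao

AI summary: This paper studies the learning performance of transformers on noisy and task-level manifolds and proves that, in the presence of low-dimensional structure, their generalization ability is closely tied to the intrinsic dimension of the task-level manifold.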

Abstract

Transformers serve as the foundational architecture for large language and video generation models, such as GPT, BERT, SORA and their successors. Empirical studies have demonstrated that real-world data and learning tasks exhibit low-dimensional structures, along with some noise or measurement error. The performance of transformers tends to depend on the intrinsic dimension of the data/tasks, though a theoretical understanding of this dependence remains largely undeveloped for transformers. This work establishes a theoretical foundation by analyzing the performance of transformers for regression tasks involving noisy input data near a manifold. Specifically, the input data are in a tubular neighborhood of a manifold, while the ground truth function depends on the projection of the noisy data onto this manifold, referred to as the task-level manifold. We prove approximation and generalization error bounds that crucially depend on the intrinsic dimension of the task-level manifold. Our results demonstrate that transformers can leverage low-complexity structures in learning tasks even when the input data are perturbed by high-dimensional noise. Our novel proof technique constructs representations of basic arithmetic operations by transformers, which may hold independent interest.


2501.09015 2026-05-19 stat.ME

Family-wise Error Rate Control with E-values

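Will Hartog, Lihua Lei

AI summary: This paper proposes an e-value-based closed testing framework for controlling the family-wise error rate, improves on traditional methods in static and dynamic settings, and develops efficient algorithms.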

Comments 32 pages, 12 figures, 4 algorithms

Abstract

The closure principle is a standard tool for achieving strong family-wise error rate (FWER) control in multiple testing problems. We develop an e-value-based closed testing framework that inherits nice properties of e-values, which are common in settings of sequential hypothesis testing or universal inference for irregular parametric models. We prove that e-value-based closed testing strongly controls the post-hoc FWER in the static setting, and has stronger anytime-valid and always-valid FWER-controlling properties in the sequential setting. Furthermore, we extend the celebrated graphical approach for FWER control (Bretz et al. 2009), using the weighted average of e-values for the local test, a strictly more powerful approach than weighted Bonferroni local tests with inverse e-values as p-values. In general, the computational cost for closed testing can be exponential in the number of hypotheses. Although the computational shortcuts for the p-value-based graphical approach are not applicable, we develop an efficient polynomial-time algorithm using dynamic programming for e-value-based graphical approaches with any directed acyclic graph, and tailored algorithms for the e-Holm procedure previously studied by Vovk and Wang (2021) and the e-Fallback procedure.
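The comparison between the two local tests mentioned in this abstract can be illustrated on a tiny example. The sketch below is not taken from the paper: it assumes fixed weights that sum to one, made-up e-values, and the standard conversion $p = \min(1, 1/e)$, and both function names are hypothetical. It shows one case in which the weighted average of e-values rejects at level $\alpha$ while weighted Bonferroni on the inverse e-values does not.

```python
def weighted_average_rejects(e_values, weights, alpha):
    # A convex combination of e-values (fixed weights summing to 1) is itself
    # an e-value, so the local test rejects when it reaches 1 / alpha.
    combined = sum(w * e for w, e in zip(weights, e_values))
    return combined >= 1.0 / alpha

def weighted_bonferroni_rejects(e_values, weights, alpha):
    # Convert each e-value to a p-value via p = min(1, 1/e), then apply
    # weighted Bonferroni: reject if some p_i <= w_i * alpha.
    p_values = [1.0 if e <= 1.0 else 1.0 / e for e in e_values]
    return any(p <= w * alpha for p, w in zip(p_values, weights))

# Illustrative inputs only (not from the paper).
e_values = [30.0, 30.0]
weights = [0.5, 0.5]
alpha = 0.05

average_rejects = weighted_average_rejects(e_values, weights, alpha)
bonferroni_rejects = weighted_bonferroni_rejects(e_values, weights, alpha)
```

Here the weighted average is 30, which exceeds the threshold 1/0.05 = 20, whereas each inverse e-value is about 0.033 and misses the Bonferroni cutoff of 0.025. In the other direction, whenever weighted Bonferroni rejects, some $w_i e_i$ is already at least $1/\alpha$, so the weighted average rejects as well.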


2501.02475 2026-05-19 stat.CO stat.ME

Tactics for Improving Least Squares Estimation

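Qiang Heng, Hua Zhou, Kenneth Lange

AI summary: This paper discusses tactics for accelerating computation in high-dimensional least squares regression, including the MM principle, smoothing by Moreau envelopes, and the proximal distance principle for constrained estimation, and it improves computational efficiency through methods such as iteratively reweighted least squares.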

Abstract

This paper deals with tactics for fast computation in least squares regression in high dimensions. These tactics include: (a) the majorization-minimization (MM) principle, (b) smoothing by Moreau envelopes, and (c) the proximal distance principle for constrained estimation. In iteratively reweighted least squares, the MM principle can create a surrogate function that trades case weights for adjusted responses. Reduction to ordinary least squares then permits the reuse of the Gram matrix and its Cholesky decomposition across iterations. This tactic is pertinent to estimation in L2E regression and generalized linear models. For problems such as quantile regression, non-smooth terms of an objective function can be replaced by their Moreau envelope approximations and majorized by spherical quadratics. Finally, penalized regression with distance-to-set penalties also benefits from this perspective. Our numerical experiments validate the speed and utility of deweighting and Moreau envelope approximations. Julia software implementing these experiments is available on our web page.


2412.19983 2026-05-19 q-fin.RM stat.AP

A Dynamic Spillover Effect Investigation on Cryptocurrency Market Before and After Pandemic

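Wenjie Lan

AI summary: Using an asymmetric breakpoint approach, this paper distinguishes risk resonance from risk diversification relationships in the cryptocurrency market, analyses the risk propagation mechanism among cryptocurrencies under extreme events, and explores the dynamic evolution of cryptocurrency risk linkages before and after the pandemic.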

Comments This paper has been withdrawn because the current version contains errors in the framing and results that may mislead readers. The authors are preparing a corrected manuscript

Abstract

This paper distinguishes between risk resonance and risk diversification relationships in the cryptocurrency market based on the newly developed asymmetric breakpoint approach, and analyzes the risk propagation mechanism among cryptocurrencies under extreme events. Through the lens of node association and network structure, it also explores the dynamic evolution of cryptocurrency risk association before and after the pandemic. The driving mechanism of cryptocurrency risk movements is further analyzed in depth using pandemic indicators. The findings show that risk propagation among cryptocurrencies became more significant under the influence of the COVID-19 outbreak. At the same time, the increase in the number of confirmed cases exacerbated the risk spillover effect among cryptocurrencies, while the risk resonance effect between the crude oil market and the cryptocurrency market amplified the outbreak's impact on cryptocurrencies. However, other financial markets are relatively independent of the cryptocurrency market. This study proposes a strategy for dealing with the spread of cryptocurrency risks from the perspective of a public health crisis, providing a useful reference for improving the regulatory mechanism of cryptocurrencies.


2411.18234 2026-05-19 cs.LG cs.AI cs.PF stat.CO

Time-Efficient Hybrid Hyperparameter Tuning Approach for Cardiovascular Disease Classification

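Abhay Kumar Pathak, Mrityunjay Chaubey, Manjari Gupta

AI summary: This paper proposes a hybrid hyperparameter tuning method that combines random search and grid search to improve the accuracy and efficiency of cardiovascular disease classification models. Experiments show that it outperforms traditional methods in both performance and computation time.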

Abstract

Cardiovascular diseases (CVDs) are serious illnesses of the heart that require accurate diagnosis to prevent fatal consequences. Hyperparameter tuning plays a critical role in optimizing machine learning (ML) model performance by selecting the most suitable parameter configurations for improved accuracy, generalization, and reliability. Grid search systematically evaluates predefined hyperparameter combinations, whereas random search samples configurations randomly from the search space, enabling broader exploration at reduced computational cost. An efficient tuning strategy is therefore essential when developing classification models in which time plays a crucial role alongside predictive capability. In this work, we propose a new hyperparameter tuning approach for ML models for CVD classification. The proposed randomized grid search combines the power of random search to explore the global space with the focused and exhaustive search of grid search in the most promising areas. This hybrid approach finds an optimal balance between exploration and exploitation and yields a robust and time-efficient ML model for classification settings. Experimental results on state-of-the-art models demonstrate that randomized grid search performs better than traditional hyperparameter tuning methods. In addition to the observed improvement in model performance, the computational time required for training was substantially reduced across most of the models. The results emphasize the reduction in training time and the computational efficiency of the proposed randomized grid search method. The proposed technique has significant potential to advance ML applications in healthcare by providing timely and accurate CVD diagnosis.
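The abstract does not spell out the procedure, so the following is only one plausible reading of a random-then-grid search, written as a minimal sketch. The toy objective, the two hyperparameters (`lr`, `depth`), their ranges, the neighbourhood step sizes, and the function names are all assumptions made for illustration and are not taken from the paper.

```python
import itertools
import random

def objective(lr, depth):
    # Toy stand-in for a validation score, peaked at lr=0.1, depth=6.
    return -100.0 * (lr - 0.1) ** 2 - 0.01 * (depth - 6) ** 2

def random_then_grid(n_random=30, seed=0):
    rng = random.Random(seed)
    # Stage 1: random search samples the broad space cheaply.
    candidates = [(rng.uniform(0.001, 1.0), rng.randint(1, 20))
                  for _ in range(n_random)]
    coarse = max(candidates, key=lambda c: objective(*c))
    # Stage 2: exhaustive grid search in a small neighbourhood of the best
    # random point; the zero step keeps the stage-1 point in the grid.
    lr_grid = [max(0.001, coarse[0] + step)
               for step in (-0.1, -0.05, 0.0, 0.05, 0.1)]
    depth_grid = [max(1, coarse[1] + step) for step in (-2, -1, 0, 1, 2)]
    grid = itertools.product(lr_grid, depth_grid)
    refined = max(grid, key=lambda c: objective(*c))
    return coarse, refined

coarse, refined = random_then_grid()
```

Because the refinement grid contains the best random point, the second stage can never return a worse configuration than the first. In practice `objective` would be replaced by a cross-validated model score, which is where the evaluation budget is actually spent.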


2410.20319 2026-05-19 stat.ME

High-dimensional partial linear model with trend filtering

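Sang Kyu Lee, Erikka Loftfield, Hyokyoung G. Hong, Haolei Weng

AI summary: This paper proposes a high-dimensional partial linear regression model that combines the interpretability of linear models with the adaptability of nonparametric methods, uses trend filtering to handle local smoothness variations, achieves minimax optimal rates, and is applied to biomarker identification in complex biological datasets.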

Comments 52 pages, 8 figures

DOI: 10.1214/26-EJS2522
Journal ref: Lee, S. K., Loftfield, E., Hong, H. G., and Weng, H. (2026) High-dimensional partial linear model with trend filtering, Electronic Journal of Statistics, 20(1), 1800-1850

Abstract

Understanding the links between diet, metabolic changes, and health outcomes is a key focus in nutritional science and broader biological research. Analyzing relationships, such as those between ultra-processed food (UPF) intake and metabolites, offers insights into potential biomarkers for diet-related diseases and public health applications. However, these analyses are challenging due to high-dimensional data structures and complex, often nonlinear associations between covariates and health outcomes. Traditional linear models and conventional nonparametric methods often lack the flexibility to accurately capture such complexities in biological data. To address these challenges, we propose a high-dimensional partial linear regression model that captures both linear and nonlinear effects, combining the interpretability of linear models with the adaptability of nonparametric approaches. Our model leverages trend filtering to handle local smoothness variations effectively and achieves minimax optimal rates, making it suitable for complex biological datasets. We apply this model to data from the Interactive Diet and Activity Tracking in AARP (IDATA) Study, demonstrating its utility in identifying biomarkers associated with UPF intake and illustrating its potential for broader applications in dietary, metabolic, and health-related research.


2410.07191 2026-05-19 cs.RO cs.LG stat.ME

Curb Your Attention: Causal Attention Gating for Robust Trajectory Prediction in Autonomous Driving

抑制注意力：因果注意力门控用于自动驾驶中的鲁棒轨迹预测

Ehsan Ahmadi, Ray Mercurius, Soheil Alizadeh, Kasra Rezaee, Amir Rasouli

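AI summary: This paper proposes CRiTIC, a model that uses a Causal Discovery Network to identify inter-agent causal relations and a Causal Attention Gating mechanism to improve the robustness and generalization of trajectory prediction. Experiments show that robustness against non-causal perturbations improves by up to 54%.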

Comments Accepted ICRA 2025

详情

DOI: 10.1109/ICRA55743.2025.11128367

AI中文摘要

自动驾驶中的轨迹预测模型易受非因果代理的扰动影响，此类扰动可能导致其他代理轨迹预测错误，进而影响自动驾驶决策的安全性和效率。本文提出CRiTIC模型，利用因果发现网络识别过去时间窗口内代理间的因果关系，并引入因果注意力门控机制，以选择性过滤Transformer架构中的信息。在两个自动驾驶基准数据集上进行了大量实验，评估了模型在对抗非因果扰动和泛化能力方面的鲁棒性。实验结果表明，预测鲁棒性可提升54%而对预测准确性影响不大。此外，本文展示了所提模型在跨域性能上的优越泛化能力，达到29%的改进。进一步细节请参见项目页面：https://ehsan-ami.github.io/critic。

英文摘要

Trajectory prediction models in autonomous driving are vulnerable to perturbations from non-causal agents whose actions should not affect the ego-agent's behavior. Such perturbations can lead to incorrect predictions of other agents' trajectories, potentially compromising the safety and efficiency of the ego-vehicle's decision-making process. Motivated by this challenge, we propose $\textit{Causal tRajecTory predICtion}$ $\textbf{(CRiTIC)}$, a novel model that utilizes a $\textit{Causal Discovery Network}$ to identify inter-agent causal relations over a window of past time steps. To incorporate discovered causal relationships, we propose a novel $\textit{Causal Attention Gating}$ mechanism to selectively filter information in the proposed Transformer-based architecture. We conduct extensive experiments on two autonomous driving benchmark datasets to evaluate the robustness of our model against non-causal perturbations and its generalization capacity. Our results indicate that the robustness of predictions can be improved by up to $\textbf{54\%}$ without a significant detriment to prediction accuracy. Lastly, we demonstrate the superior domain generalizability of the proposed model, which achieves up to $\textbf{29\%}$ improvement in cross-domain performance. These results underscore the potential of our model to enhance both robustness and generalization capacity for trajectory prediction in diverse autonomous driving domains. Further details can be found on our project page: https://ehsan-ami.github.io/critic.


2409.07014 2026-05-19 stat.ML cs.DB cs.LG

A Practical Theory of Generalization in Selectivity Learning

选择性学习中泛化理论的实用性研究

Peizhi Wu, Haoshu Xu, Ryan Marcus, Zachary G. Ives

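AI summary: This paper examines generalization in selectivity learning from both theoretical and practical perspectives. It shows that selectivity predictors induced by signed measures are learnable and proposes strategies that improve out-of-distribution (OOD) generalization.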

Comments 15 pages. Technical Report (Extended Version)

详情

DOI: 10.14778/3725688.3725708

AI中文摘要

查询驱动的机器学习模型已作为一种有前途的查询选择性估计技术出现。然而，从理论角度看，这些技术的有效性仍知之甚少，因为实际解决方案与基于Probably Approximately Correct (PAC) 学习框架的最先进理论之间存在显著差距。本文旨在弥合理论与实践之间的差距。首先，我们证明由符号测度诱导的选择性预测器是可学习的，这放松了PAC理论对概率测度的依赖。更重要的是，在此基础上，我们建立了在温和假设下，此类选择性预测器在分布外（OOD）泛化误差界上的有利表现。这些理论进步为我们提供了对查询驱动选择性学习的分布内和分布外泛化能力的更好理解，并促进了两种改进分布外泛化的通用策略的设计。我们实证验证了我们的技术在预测准确性和查询延迟性能方面显著帮助查询驱动选择性模型泛化到分布外查询，同时保持其优越的分布内泛化性能。

英文摘要

Query-driven machine learning models have emerged as a promising estimation technique for query selectivities. Yet, surprisingly little is known about the efficacy of these techniques from a theoretical perspective, as there exist substantial gaps between practical solutions and state-of-the-art (SOTA) theory based on the Probably Approximately Correct (PAC) learning framework. In this paper, we aim to bridge the gaps between theory and practice. First, we demonstrate that selectivity predictors induced by signed measures are learnable, which relaxes the reliance on probability measures in SOTA theory. More importantly, beyond the PAC learning framework (which only allows us to characterize how the model behaves when both training and test workloads are drawn from the same distribution), we establish, under mild assumptions, that selectivity predictors from this class exhibit favorable out-of-distribution (OOD) generalization error bounds. These theoretical advances provide us with a better understanding of both the in-distribution and OOD generalization capabilities of query-driven selectivity learning, and facilitate the design of two general strategies to improve OOD generalization for existing query-driven selectivity models. We empirically verify that our techniques help query-driven selectivity models generalize significantly better to OOD queries both in terms of prediction accuracy and query latency performance, while maintaining their superior in-distribution generalization performance.


2407.07316 2026-05-19 cs.GT math.OC stat.AP

Fast Revenue Maximization

快速收益最大化

Achraf Bahamou, Omar Besbes, Omar Mouchtaki

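AI summary: This paper studies a data-driven pricing problem in which the price of a single item is set from demand observed at a limited number of historical prices, quantifying the value of that information and guiding efficient price experimentation. The core method reduces an infinite-dimensional problem to a one-dimensional optimization problem, yields pricing policies with explicit guarantees, and shows how to reduce the number of experiments needed in dynamic pricing.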

详情


英文摘要

Problem definition: We study a data-driven pricing problem in which a seller sets a price for a single item based on demand observed at a limited number of historical prices. Our goal is to quantify the value of such information and to guide efficient price experimentation under practical constraints. Methodology/results: Our main methodological contribution is an exact reduction that characterizes the maximin revenue ratio, defined as the worst-case revenue achievable using only past data relative to the optimal revenue under full information. This reduction transforms an infinite-dimensional problem into a tractable one-dimensional optimization problem, allowing us to compute near-optimal pricing policies with explicit guarantees and to precisely quantify the value of historical data. Managerial implications: Motivated by practical constraints that limit price changes, we first evaluate the value of local information and show that the sign of the revenue gradient at a single price can provide significant guidance. We then use our framework to design efficient price experiments: we develop a method to select the next price to test so as to maximize future robust performance, and show how to substantially reduce the number of experiments needed to achieve target revenue guarantees in dynamic pricing. Finally, we show that our approach remains effective with noisy demand data, achieving near-optimal performance with as few as 25 to 100 samples per price.


2405.14657 2026-05-19 cs.LG stat.ML

Anchor-Based Heteroscedastic Noise for Preferential Bayesian Optimization

基于锚点的异方差噪声用于偏好贝叶斯优化

Marshal Arijona Sinaga, Julien Martinelli, Samuel Kaski

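AI summary: This paper proposes a heteroscedastic noise model for preferential Bayesian optimization. A small set of reliable user-provided examples (anchors) and a kernel density estimator produce a map of user uncertainty, from which risk-averse acquisition functions are derived, improving risk-adjusted performance.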

Comments Camera-ready version (ProbML 2026)

详情

AI中文摘要

偏好贝叶斯优化（PBO）通过成对比较学习潜在效用，但现有方法假设比较噪声同方差，这在人机交互场景中不足，因为用户可能对某些设计可靠而对其他设计犹豫。本文提出PBO的异方差噪声模型：在优化前，用户提供少量可靠示例（锚点），核密度估计（KDE）将这些锚点转化为输入依赖的用户不确定性图。该图被整合到偏好高斯过程（GP）代理中，并推导出风险规避的获取函数，平衡效用和比较的便利性。进一步证明，风险调整的流行预期效用（EUBO）变体在一步贝叶斯最优性保证上至多加一个常数，且在理想化的独立同分布锚点模型下，KDE估计器具有标准一致性和集中率。在合成问题和人类偏好数据集上的实验显示，改进了风险调整性能，并澄清了锚点放置对方法的影响。

英文摘要

Preferential Bayesian optimization (PBO) learns latent utilities from pairwise comparisons, but most existing methods assume homoscedastic comparison noise. This is inadequate in human-in-the-loop settings, where a user may compare some designs reliably and others only hesitantly. We propose a heteroscedastic noise model for PBO: before optimization, the user provides a small set of reliable examples, called anchors, and a kernel density estimator (KDE) turns these anchors into an input-dependent map of user uncertainty. We incorporate this map into preferential GP surrogates and derive risk-averse acquisition functions that trade off utility and ease of comparison. We further show that a risk-adjusted variant of the popular expected utility of the best option (EUBO) preserves the one-step Bayes-optimality guarantee up to an additive constant, and that under an idealized i.i.d. anchor model the KDE estimator enjoys standard consistency and concentration rates. Experiments on synthetic problems and human-preference datasets show improved risk-adjusted performance and clarify how anchor placement affects the method.


2403.11782 2026-05-19 cs.LG stat.ML

A tutorial on learning from preferences and choices with Gaussian Processes

基于高斯过程的学习偏好与选择教程

Alessio Benavoli, Dario Azzimonti

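AI summary: This tutorial presents a framework for preference learning with Gaussian processes that draws on principles from economics and decision theory, and proposes new models to fill gaps in the existing literature.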

2310.07983 2026-05-19 cs.LG math.OC stat.ML

Achieving Linear Speedup with ProxSkip in Distributed Stochastic Optimization

通过ProxSkip在分布式随机优化中实现线性加速

Luyao Guo, Sulaiman A. Alghunaim, Kun Yuan, Laurent Condat, Jinde Cao

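AI summary: This paper studies the convergence of ProxSkip in the non-convex setting, proves that it achieves linear speedup in the number of nodes, and shows that local updates improve communication efficiency.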

详情

AI中文摘要

ProxSkip算法在分布式优化中因其减少通信的效果而受到越来越多的关注。然而，现有分析仅限于强凸设置，无法实现节点数量的线性加速。本文重新审视去中心化ProxSkip，回答了其在非凸设置下的行为及线性加速的可实现性问题。我们为随机非凸、凸和强凸问题提供了统一的收敛分析，揭示了梯度噪声、局部更新、网络连通性和数据异质性如何共同决定收敛行为。到目前为止，这是首次证明去中心化ProxSkip在随机梯度下实现节点数量线性加速的分析。此外，我们的结果表明，局部更新可以有效减少通信频率并提高通信效率。

英文摘要

The ProxSkip algorithm for distributed optimization is gaining increasing attention due to its effectiveness in reducing communication. However, existing analyses of ProxSkip are limited to the strongly convex setting and fail to achieve linear speedup with respect to the number of nodes. Key questions regarding its behavior in the non-convex setting and the achievability of linear speedup remain open. In this paper, we revisit decentralized ProxSkip and answer these questions affirmatively. We provide a unified convergence analysis for stochastic non-convex, convex, and strongly convex problems, revealing how gradient noise, local updates, network connectivity, and data heterogeneity jointly determine the convergence behavior. To the best of our knowledge, this is the first analysis showing that decentralized ProxSkip achieves linear speedup in the number of nodes under stochastic gradients. Moreover, our results demonstrate that local updates can effectively reduce communication frequency and improve communication efficiency.


2308.05534 2026-05-19 stat.ME

Collective Outlier Detection and Enumeration with Conformalized Closed Testing

集体异常检测与枚举的符合化封闭检验

Chiara G. Magnani, Matteo Sesia, Aldo Solari

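AI summary: This paper proposes a distribution-free method for collective outlier detection and enumeration. It combines ideas from conformal inference and multiple testing, and by automatically selecting classifiers and testing procedures it can detect sparse, weak, or hidden outlier signals.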

2605.16622 2026-05-19 cs.LG math.OC stat.ML

Does Weight Decay Enhance Training Stability?

权重衰减是否增强训练稳定性？

Marius Saether, Amir Kolic, Tomaso Poggio, Pierfrancesco Beneventano

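AI summary: This paper studies the mechanisms by which weight decay affects the stability of training dynamics. It finds that weight decay influences stability through changes in parameter-space dynamics and loss sharpness, and reveals an architecture-dependent phase transition.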

Comments 24 pages, 16 figures

2605.16593 2026-05-19 stat.AP econ.EM stat.ML

Policy Learning with Observational Data: The Case of Hepatitis C Treatment for HIV/HCV Co-Infected Patients

基于观测数据的政策学习：HIV/HCV共感染患者抗病毒治疗的案例

Raphaël Langevin

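AI summary: This paper proposes a method for deriving multi-action policy rules from observational data under weak assumptions and applies it to Hepatitis C treatment for HIV/HCV co-infected patients. It identifies a subgroup of patients with a high probability of spontaneous HCV clearance without treatment, and finds that reallocating treatments could lower costs while increasing health benefits.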

Comments 74 pages, 10 figures

详情

AI中文摘要

决策者常需在有限选项中选择单一行动，如医生选治疗方案。本文展示如何在弱假设下从观测数据中推导多行动政策规则。通过加权K均值算法估计条件平均处理效应（CATEs），假设每个同质子群内的结果模型正确指定。通过标准决策树实施可行政策规则，允许完美或 imperfect 的治疗依从性。方法应用于HIV/HCV共感染患者抗病毒治疗，该领域缺乏统一指南。结果发现约80%的患者无需治疗即可自愈，重新分配治疗可降低总成本360万至490万加元，同时提升整体健康效益。这些发现表明，所提出方法可生成数据驱动的优化治疗指南。

英文摘要

Decision-makers frequently must choose a single action from a finite set of alternatives -- for example, physicians selecting a treatment, investors choosing a portfolio risk level, or judges determining sentences. To improve outcomes, policymakers often issue policy rules or guidelines to inform such choices. In this paper, I show how to generally derive policy rules from observational data in a multi-action framework under relatively weak assumptions about the underlying structure of the heterogeneous sampled population. Conditional average treatment effects (CATEs) are consistently estimated via a weighted K-means algorithm, assuming the outcome model is correctly specified within each homogeneous subgroup. Feasible policy rules are then implemented via a standard decision tree, allowing for both perfect and imperfect adherence to treatment. The methodology is applied to treatment options for Hepatitis C (HCV) among patients co-infected with human immunodeficiency virus (HIV), a setting in which no uniform guideline exists for modern pharmaceutical therapies. The results identify a subgroup of patients with approximately an 80% probability of spontaneous HCV clearance without treatment. Estimation results also show that reallocating treatments among treated individuals could have reduced total treatment costs by CAN$3.6-4.9 million while still increasing aggregate health benefits relative to the status quo. These findings demonstrate that the proposed approach can generate improved, data-driven treatment guidelines for the management of HIV/HCV co-infected patients.


2605.16571 2026-05-19 stat.ML cs.AI cs.LG

Isotonic Survival Regression: Calibrated Survival Distributions from Deep Cox Models

非递减生存回归：从深度Cox模型中校准生存分布

Anchit Jain, Kevin Zhang, Stephen Bates

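AI summary: This paper proposes an isotonic regression method for calibrating survival probabilities from deep Cox models, supported by theoretical guarantees and experimental validation.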

2605.16570 2026-05-19 stat.CO stat.ML

A Cubing Strategy for Identifying Stable Hyperparameter Regions for Uncertainty Quantification in Spatial Deep Learning

一种用于空间深度学习中不确定性量化稳定超参数区域识别的立方策略

Isaac Amouzou, Ben Seiyon Lee

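AI summary: This paper proposes a cubing-based diagnostic framework that recursively partitions the hyperparameter space to identify stable regions where MC dropout yields well-calibrated predictive intervals, improving uncertainty quantification for spatial deep learning models.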

详情

AI中文摘要

空间参考数据集在许多领域中变得越来越普遍，主要得益于数据收集方法的进步，如卫星遥感。在许多应用中，未观测位置的预测伴随着可靠的不确定性估计。尽管深度学习方法为空间预测提供了可扩展且准确的模型，但在空间深度学习中仍缺乏明确的共识来解决不确定性量化问题。蒙特卡洛（MC）丢弃已成为不确定性量化的流行方法，但现有实现通常专注于调整丢弃率，而固定其他关键超参数，如权重衰减和预测标准差乘数，通常通过随意或手动调整。我们提出了一种基于立方体的诊断框架，通过递归划分超参数空间，以识别MC丢弃产生良好校准预测区间的稳定区域。该方法通过评分规则相对统计基线模型评估超参数区域，该基线模型作为校准锚点。通过涵盖多个空间依赖性制度的模拟研究以及一个大规模的遥感地表温度数据集，我们证明了我们的方法在预测区间上与基线模型相比具有竞争力或更优的表现。我们的方法为从业者提供了一种系统化的方法，将不确定性量化纳入空间深度学习模型中。

英文摘要

Spatially referenced datasets have become increasingly prevalent across many fields, largely driven by advances in data collection methods such as satellite remote sensing. In many applications, predictions at unobserved locations are accompanied by reliable uncertainty estimates. While deep learning methods provide both scalable and accurate models for spatial predictions, there remains no clear consensus for addressing uncertainty quantification in spatial deep learning. Monte Carlo (MC) dropout has become a popular approach for uncertainty quantification, yet existing implementations typically focus on tuning the dropout rate while fixing other influential hyperparameters, such as weight decay and the predictive standard deviation multiplier, often through ad-hoc or manual tuning. We propose a cubing-based diagnostic framework that recursively partitions the hyperparameter space to identify stable regions where MC dropout yields well-calibrated predictive intervals. The approach evaluates hyperparameter regions using scoring rules relative to a statistical baseline model, which serves as a calibration anchor. Through a simulation study spanning multiple spatial dependence regimes as well as a large remotely-sensed land surface temperature dataset, we demonstrate that our approach produces competitive or superior predictive intervals compared to the baseline model. Our methodology provides practitioners with a systematic procedure for incorporating uncertainty quantification into spatial deep learning models.


2605.16486 2026-05-19 stat.ML astro-ph.IM cs.LG

StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow

StAD：基于Stein算子的 amortized 散度用于具有扩散和流的快速似然

Gurjeet Jagwani, Stephen Thorp, Sinan Deger, Hiranya Peiris

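AI summary: This paper proposes StAD, a method that uses the Langevin-Stein operator to predict and learn the divergence of the PF-ODE without computing the Jacobian, improving the speed and variance of likelihood predictions.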

Comments 24 pages, 10 figures

详情

AI中文摘要

扩散和流基模型广泛用于生成建模和密度估计。它们允许确定性概率流常微分方程（PF-ODE），类似于连续归一化流（CNFs），描述了概率质量的传输。从这些模型中获得似然对于许多工作流程至关重要，尤其是贝叶斯分析，这需要求解雅可比矩阵的迹来计算学习PF-ODE的发散性，这要么是$\mathcal{O}(D^2)$精确计算，要么是$\mathcal{O}(D)$的噪声估计。我们引入StAD，一种新的蒸馏方法，利用兰格vin-斯坦算子预测和学习PF-ODE的发散性，而无需计算雅可比矩阵。我们证明我们的方法在CIFAR-10、ImageNet和其他密度估计任务上与Hutchinson和Hutch++竞争，一致提高了似然预测的方差和速度，优于Hutchinson。我们还证明我们的方法可以推广到各种生成模型，且在某些正则性条件下，这些学习的向量场可以满足斯坦类。

英文摘要

Diffusion and flow-based models are ubiquitously used for generative modelling and density estimation. They admit a deterministic probability flow ordinary differential equation (PF-ODE), analogous to continuous normalizing flows (CNFs), which describes the transport of the probability mass. Obtaining the likelihood from these models is of interest to many workflows, especially Bayesian analysis, and requires computing the trace of the Jacobian to obtain the divergence of the learned PF-ODE, which costs either $\mathcal{O}(D^2)$ to compute exactly or $\mathcal{O}(D)$ with a noisy estimate. We introduce StAD, a new distillation method to predict and learn the divergence of the PF-ODE using the Langevin-Stein operator without ever computing the Jacobian. We show that our method is competitive with the Hutchinson and Hutch++ estimators on CIFAR-10, ImageNet and other density estimation tasks, consistently improving the variance and speed of the likelihood predictions compared to the Hutchinson estimator. We additionally show our method will generalize to a varied class of generative models, and show that under some regularity conditions these learned vector fields can be made to satisfy the Stein class.
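
The Hutchinson trace estimator that this abstract uses as a baseline can be illustrated on a small explicit matrix. The following is a minimal sketch of that baseline only, not of StAD itself: the matrix `J` merely stands in for a Jacobian, and the names `matvec`, `hutchinson_trace`, `num_probes` and `seed` are illustrative assumptions rather than anything taken from the paper.

```python
import random

def matvec(J, v):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in J]

def hutchinson_trace(J, num_probes=2000, seed=0):
    """Estimate tr(J) as the average of v^T J v over Rademacher probes v.

    Each probe needs only one product J v. With automatic differentiation
    that product can be obtained without forming J (not shown here).
    """
    rng = random.Random(seed)
    dim = len(J)
    total = 0.0
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        Jv = matvec(J, v)
        total += sum(vi * ji for vi, ji in zip(v, Jv))
    return total / num_probes

# Hypothetical 3x3 stand-in for a Jacobian; its exact trace is 1.5.
J = [[2.0, 0.3, -0.1],
     [0.4, -1.0, 0.2],
     [-0.5, 0.1, 0.5]]
exact = sum(J[i][i] for i in range(3))
estimate = hutchinson_trace(J)
print(exact, estimate)
```

The estimate is unbiased but noisy: the off-diagonal entries of `J` contribute variance that shrinks only as more probes are averaged, which is the speed versus variance trade-off the abstract refers to.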


2605.16473 2026-05-19 stat.ML cs.LG cs.NA math.NA math.PR

Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures

预处理退火 Langevin 动力学在多模高斯混合中的维度均匀离散化分析

Lorenzo Baldassari, Josselin Garnier, Knut Solna, Maarten V. de Hoop

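AI summary: This paper studies the stability of preconditioned annealed Langevin dynamics for Gaussian mixtures. It analyzes the Euler-Maruyama discretization and an exponential-integrator scheme, and proves a dimension-uniform KL bound for the latter under explicit spectral conditions.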

详情

AI中文摘要

在高维和无穷维设置中，获得稳定的扩散基采样器具有挑战性，因为高频率坐标上的误差累积会使动力学在有限维近似细化时变得不稳定。离散化是此类误差的典型来源，而使用合适的谱衰减预处理是控制其累积的一种方法。本文研究了预处理退火 Langevin 动力学（ALD）应用于高斯混合时的问题。我们首先证明 Euler-Maruyama（EM）离散化通过将退火分数的刚性线性部分用前向 Euler 步处理，施加了将预处理器与退火协方差尺度耦合的稳定性约束。结合确保退火动力学维度均匀控制的条件，该约束迫使初始平滑分布在不同维度上保持均匀接近目标。然后我们考虑了对退火分数的刚性线性部分进行精确积分的指数积分方案。在满足耦合平滑协方差、组件协方差谱和预处理器的显式谱可求和条件时，我们证明了该方案的 KL 散度具有维度均匀的上界。此上界可通过允许足够时间进行退火并相应细化时间网格来使其任意小。重要的是，这些条件允许 KL 散度在不同维度上发散的区域，表明 EM 限制是方案依赖的，而非 ALD 的固有属性。

英文摘要

Obtaining stable diffusion-based samplers in high- and infinite-dimensional settings is challenging because errors can accumulate across high-frequency coordinates and make the dynamics unstable under refinement of the finite-dimensional approximation of the underlying function-space problem. Discretization is a typical source of such errors, and preconditioning with a suitable spectral decay is one way to control their accumulation. In this paper, we study this problem for preconditioned annealed Langevin dynamics (ALD) applied to Gaussian mixtures. We first show that Euler-Maruyama (EM) discretization, by treating the stiff linear part of the annealed score with a forward Euler step, imposes a stability constraint coupling the preconditioner with the annealed covariance scale. Together with the conditions ensuring dimension-uniform control of the annealed dynamics, this constraint forces the initial smoothed law to remain uniformly close to the target across dimensions. We then consider an exponential-integrator scheme that integrates the stiff linear part of the annealed score exactly. Under explicit spectral summability conditions coupling the smoothing covariance, the component covariance spectra, and the preconditioner, we prove a dimension-uniform Kullback-Leibler (KL) bound for this scheme. This bound can be made arbitrarily small, uniformly in dimension, by allowing enough time for annealing and then refining the time mesh accordingly. Importantly, these conditions allow regimes in which the KL divergence between the target and the initial smoothed law diverges with dimension, showing that the restrictions imposed by EM are scheme-dependent rather than intrinsic to ALD.
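
The contrast between the two discretizations can be seen on a scalar toy problem. The sketch below is only an illustration under stated assumptions, not the paper's preconditioned ALD scheme: it takes the Ornstein-Uhlenbeck SDE dX = -lam X dt + sqrt(2) dW, whose stationary variance is 1/lam, and tracks the variance of the iterates in closed form. The function names and the stiffness values are hypothetical.

```python
import math

def em_variance(lam, h, steps, v0=0.0):
    """Variance of Euler-Maruyama iterates for dX = -lam*X dt + sqrt(2) dW.

    One step is X <- (1 - lam*h) * X + sqrt(2*h) * xi with xi ~ N(0, 1).
    """
    v = v0
    for _ in range(steps):
        v = (1.0 - lam * h) ** 2 * v + 2.0 * h
    return v

def expint_variance(lam, h, steps, v0=0.0):
    """Variance of exponential-integrator iterates for the same SDE.

    The linear drift is integrated exactly, so one step is the true
    Ornstein-Uhlenbeck transition over a time interval of length h.
    """
    decay = math.exp(-2.0 * lam * h)
    v = v0
    for _ in range(steps):
        v = decay * v + (1.0 - decay) / lam
    return v

h = 0.1
for lam in (1.0, 10.0, 30.0):  # hypothetical per-coordinate stiffness values
    print(lam, 1.0 / lam, em_variance(lam, h, 200), expint_variance(lam, h, 200))
```

With a fixed step `h`, the Euler-Maruyama variance diverges once `lam * h` exceeds 2, while the exponential integrator settles at the correct value 1/lam for every stiffness. Large `lam` plays the role of a high-frequency coordinate here.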


2605.16390 2026-05-19 cs.CV cs.LG stat.ML

Inducing Spatial Locality in Vision Transformers through the Training Protocol

通过训练协议在视觉变换器中诱导空间局部性

Eduardo Santiago Toledo, Asael Fabian Martínez

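AI summary: By comparing training protocols, this study finds that CutMix makes attention in the early layers of Vision Transformers more local and lowers the Mean Attention Distance (MAD), suggesting that CutMix promotes the emergence of local attention.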

详情

AI中文摘要

我们研究了是否可以通过训练协议在从头训练的视觉变换器（ViT）的早期层中诱导空间局部性，而无需大规模预训练。在CIFAR-10、CIFAR-100和Tiny-ImageNet上，我们比较了基线协议与现代协议（AutoAugment/ColorJitter、CutMix和Label Smoothing），通过均值注意力距离（MAD）和归一化熵来表征每个注意力头。在所有三个数据集中，现代协议在早期层产生更局部和更集中的注意力；在CIFAR-100上，最小MAD从0.316（基线）降至0.008（现代）。为了确定这种效果的来源，我们在CIFAR-100上进行了消融研究，分别添加或移除每个组件。结果表明CutMix是实验中的决定性组件：所有包含CutMix的条件均显示MAD为0.024，而所有不包含CutMix的条件仍保持在MAD 0.210。AutoAugment和Label Smoothing对局部性无独立影响。总体而言，这些发现表明，由CutMix诱导的从部分图像区域进行分类的压力，可以促进视觉变换器中局部注意力的出现。

英文摘要

We investigate whether the training protocol can induce spatial locality in the early layers of a Vision Transformer (ViT) trained from scratch, without large-scale pretraining. Keeping the architecture and optimization procedure fixed, we compare a Baseline protocol with a Modern protocol (AutoAugment/ColorJitter, CutMix, and Label Smoothing) on CIFAR-10, CIFAR-100, and Tiny-ImageNet, characterizing each attention head via Mean Attention Distance (MAD) and normalized entropy. Across all three datasets, the Modern protocol produces more local and more concentrated attention in early layers; on CIFAR-100, the minimum MAD drops from 0.316 (Baseline) to 0.008 (Modern). To identify the source of this effect, we conduct an ablation study on CIFAR-100 by adding or removing each component individually. The results identify CutMix as the determining component within our experiments: all conditions with CutMix exhibit MAD 0.024, while all conditions without CutMix remain at MAD 0.210. AutoAugment and Label Smoothing show no independent effect on locality. Taken together, these findings suggest that the pressure to classify from partial image regions, induced by CutMix, can promote the emergence of local attention in Vision Transformers.
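
A Mean Attention Distance of the kind used above can be sketched for a single head as follows. This is a minimal illustration under assumptions: the abstract does not state how distances are normalized, so dividing by the grid diagonal is a choice made here, and the function name, the 4x4 grid and the two toy heads are hypothetical.

```python
import math

def mean_attention_distance(attn, grid):
    """Attention-weighted mean distance between query and key patches.

    attn: N x N row-stochastic matrix (list of rows), with N = grid * grid.
    Distances are Euclidean on the patch grid and divided by the grid
    diagonal (an assumed normalization), so the result lies in [0, 1].
    Requires grid >= 2.
    """
    n = grid * grid
    diag = math.hypot(grid - 1, grid - 1)
    total = 0.0
    for q in range(n):
        qr, qc = divmod(q, grid)
        for k in range(n):
            kr, kc = divmod(k, grid)
            total += attn[q][k] * math.hypot(qr - kr, qc - kc)
    return total / (n * diag)

grid = 4
n = grid * grid
# A perfectly local head: every patch attends only to itself.
local_head = [[1.0 if q == k else 0.0 for k in range(n)] for q in range(n)]
# A perfectly global head: every patch attends uniformly to all patches.
global_head = [[1.0 / n] * n for _ in range(n)]
mad_local = mean_attention_distance(local_head, grid)
mad_global = mean_attention_distance(global_head, grid)
print(mad_local, mad_global)
```

A head that attends only to its own patch scores zero, and a uniform head scores strictly higher, so lower values indicate more local attention.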


2605.16383 2026-05-19 cs.CV cs.AI stat.ML

A neurosymbolic Approach with Epistemic Deep Learning for Hierarchical Image Classification

一种结合知识符号学习与认知深度学习的分层图像分类方法

Ezel Kilicdere, Shireen Kudukkil Manchingal, Fabio Cuzzolin

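AI summary: This paper proposes a unified neurosymbolic and epistemic modelling framework that augments Swin Transformers with focal set reasoning and differentiable fuzzy logic, improving the calibration and logical consistency of hierarchical image classification while keeping accuracy on par with transformer baselines.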

Comments 36 pages

详情

AI中文摘要

深度神经网络在图像分类任务中实现高精度，但往往产生过于自信的预测，无法表达认知不确定性，并违反数据中存在的逻辑或结构约束。这些局限性在分层分类中尤为明显，因为细粒度和粗粒度的预测必须保持一致。本文首次提出一种统一的神经符号和认知建模框架，通过融合Swin Transformer、焦点集推理和可微模糊逻辑，将标签视为孤立类别，而是在学习的嵌入空间中诱导数据驱动的焦点集，帮助捕捉多个可能细粒度类别的认知不确定性。这些焦点集构成了一个基于信念理论的层，利用模糊隶属函数和t-范数合取来鼓励细粒度和粗粒度预测之间的一致性。可学习的损失进一步平衡校准、质量正则化和逻辑一致性，使模型能够自适应地权衡符号结构与数据驱动的证据。在分层图像分类实验中，本文框架在与Transformer基线相当的准确性的同时，提供更校准和可解释的预测，减少过度自信并强制在分层输出中保持高逻辑一致性。实验结果表明，结合焦点集推理与模糊逻辑为深度学习模型提供了实际步骤，使其既准确又具有认知意识。

英文摘要

Deep neural networks achieve high accuracy on image classification tasks. Yet they often produce overconfident predictions that fail to express epistemic uncertainty, and they frequently violate logical or structural constraints present in the data. These limitations are particularly pronounced in hierarchical classification, where predictions across fine and coarse levels must remain coherent. We propose, for the first time, a unified neurosymbolic and epistemic modelling framework that augments Swin Transformers with focal set reasoning and differentiable fuzzy logic. Rather than treating labels as isolated categories, our method induces data-driven focal sets within the learnt embedding space, which helps capture epistemic uncertainty over multiple plausible fine-grained classes. These focal sets form the basis of a belief-theoretic layer that uses fuzzy membership functions and t-norm conjunctions to encourage consistency between fine- and coarse-grained predictions. A learnable loss further balances calibration, mass regularisation, and logical consistency, allowing the model to adaptively trade off symbolic structure with data-driven evidence. In experiments on hierarchical image classification, our framework maintains accuracy on par with transformer baselines while providing more calibrated and interpretable predictions, reducing overconfidence and enforcing high logical consistency across hierarchical outputs. Our experimental results show that combining focal set reasoning with fuzzy logic provides a practical step toward deep learning models that are both accurate and epistemically aware.


2605.16361 2026-05-19 cs.LG cs.AI stat.ML

TailedTS: Benchmark Dataset for Heavy-Tailed Time Series Prediction and Periodicity Quantification

TailedTS：用于重尾时间序列预测和周期性量化的大规模基准数据集

Xinyu Chen, HanQin Cai, Lijun Ding, Jinhua Zhao

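AI summary: The TailedTS dataset is designed to test the robustness of time series forecasting models under heavy-tailed, zero-inflated, and non-Gaussian conditions. A sparse autoregression framework reveals that frequently viewed pages have weaker periodicity, and the paper provides standardized prediction benchmarks under non-Gaussian loss functions.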

详情

AI中文摘要

我们介绍了TailedTS，一个基于2024年维基百科每小时页面浏览观测数据的大规模基准数据集，专门用于测试时间序列预测模型在重尾、零膨胀和非高斯条件下的性能。该数据集包含约2469亿个数据点，覆盖约300万个唯一维基百科页面，存储在高效的Apache Parquet格式中。维基百科流量遵循幂律分布，其中约5%的页面贡献了70%的总浏览量，为模型在极端波动下的鲁棒性提供了一个自然且严谨的测试环境。TailedTS支持多个研究任务：首先，我们引入了一个基于稀疏自回归的周期性量化框架，揭示高频页面的周期性结构显著弱于低频页面，这对大型数字平台的服务器分配和流量预测有直接意义。其次，我们提供了在一系列非高斯损失函数下的标准化预测基准，包括ℓ1范数、Huber、分位数和ℓp范数损失，表明基于高斯的估计器在高流量页面类别中性能显著下降，而鲁棒替代方案在所有流量规模上均提供一致的提升。TailedTS可在https://doi.org/10.5281/zenodo.17070469公开获取。

英文摘要

We present TailedTS, a large-scale benchmark dataset derived from Wikipedia hourly page view observations throughout 2024, specifically designed to test time series forecasting models under heavy-tailed, zero-inflated, and non-Gaussian conditions. The dataset comprises approximately 24.69 billion data points spanning roughly 3 million unique Wikipedia pages per month, stored in high-efficiency Apache Parquet format. Wikipedia traffic follows a pronounced power-law distribution where roughly 5% of pages account for over 70% of total page views, creating a natural and rigorous testbed for model robustness against extreme volatility, a condition that is absent from or underrepresented in existing benchmarks such as the M4, M5, and UCI electricity datasets. TailedTS enables several research tasks. First, we introduce a periodicity quantification framework based on sparse autoregression with sparsity and non-negativity constraints, revealing that frequently viewed pages exhibit significantly weaker periodic structure than their less-viewed counterparts, with direct implications for server allocation and traffic forecasting on large digital platforms. Second, we provide standardized prediction benchmarks evaluated under a suite of non-Gaussian loss functions, including $\ell_1$-norm, Huber, quantile, and $\ell_p$-norm losses, demonstrating that standard Gaussian-based estimators degrade substantially on high-volume page categories, while robust alternatives provide consistent gains across all traffic scales. TailedTS is publicly available at https://doi.org/10.5281/zenodo.17070469.
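
Two of the robust losses named above, Huber and quantile (pinball), can be written in a few lines and compared with the squared loss on a series containing one extreme spike. This is a minimal sketch for intuition only: the residual values are made up, and the function names and the `delta` and `tau` defaults are assumptions, not the benchmark's configuration.

```python
def squared_loss(residual):
    """Half squared error, the loss implied by a Gaussian noise model."""
    return 0.5 * residual * residual

def huber_loss(residual, delta=1.0):
    """Quadratic for |residual| <= delta, linear beyond it."""
    r = abs(residual)
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)

def quantile_loss(y_true, y_pred, tau=0.5):
    """Pinball loss: under-prediction costs tau, over-prediction 1 - tau."""
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff

# Hypothetical forecast residuals; the last one mimics a traffic spike.
residuals = [1.0, -2.0, 0.5, 500.0]
squared_total = sum(squared_loss(r) for r in residuals)
huber_total = sum(huber_loss(r, delta=1.0) for r in residuals)
print(squared_total, huber_total)
```

The single spike dominates the squared loss by several orders of magnitude, whereas the Huber loss grows only linearly in it, which is the usual motivation for robust losses on heavy-tailed data.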


2605.16354 2026-05-19 cs.LG cs.AI cs.CL cs.HC stat.ML

Augmenting Human Evaluation with LLM Judges: How Many Human Reviews Do You Need?

通过LLM裁判增强人类评估：你真的需要多少人类评审？

Jane Paik Kim

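AI summary: This paper proposes using LLMs as auxiliary judges that augment human evaluation, and uses a two-stage sampling design to determine the sample sizes of human and LLM ratings needed to reach a target level of statistical power.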

Comments 10 pages, 5 figures

详情

AI中文摘要

大型语言模型（LLMs）越来越多地被用作AI系统的自动评估者，包括在高风险应用中。在这一角色中，LLMs用于生成关于模型输出质量、适当性甚至安全性的判断。这种做法受到实际限制的驱动。专家人类评分成本高且难以扩展，而LLM评分可以快速低成本地生成。然而，当前部署LLM评估者的方法是随意的，通常仅限于报告人类和LLM裁判之间的一致性度量作为替代人类评分的正当性，且缺乏正式的研究设计基础。本文（1）将LLM裁判的角色从替代性转为辅助性，并（2）将LLM作为裁判范式制定为通过两阶段抽样设计增强人类评估的一种方法，其中在第一阶段对所有观察进行LLM评估，在第二阶段对子样本进行部分人类评分。我们提出使用来自缺失数据文献的双重鲁棒估计器，利用预测模型的鲁棒性属性，因为缺失性模型是设计已知的。使用该估计器的渐近方差，我们提出如何确定人类和LLM评分的样本量以达到目标统计功效。我们还展示通过分配更多人类评分给LLM评分预测性不高的评估类型，可以高效地设计研究。据我们所知，关于在验证基准时应保留多少人类监督的指导非常有限。

英文摘要

Large language models (LLMs) are increasingly used as automated evaluators of AI systems, including in high-stakes applications. In this role, LLMs are used to generate judgments about the quality, appropriateness, or even safety of model outputs. This approach is motivated by practical constraints. Expert human ratings are costly and difficult to scale, whereas LLM ratings can be produced quickly at low cost. However, current approaches to deploying LLM evaluators are ad hoc, typically limited to reporting agreement metrics between human and LLM judges as a justification for substitution of human ratings, and lack a formal basis for study design. This paper (1) shifts the role of the LLM judge from substitutive to auxiliary, and (2) formulates the LLM-as-a-judge paradigm as one of augmenting human evaluation through a two-stage sampling design, where LLM evaluations are measured for all observations at the first stage and human ratings are partially observed for a subsample at the second stage. We propose to use a doubly robust estimator from the missing data literature, which takes advantage of the robustness property against the prediction model, since the missingness model is known by design. Using the asymptotic variance of this estimator, we propose how sample sizes of human and LLM ratings can be determined to achieve a targeted level of power. We also show that a study can be efficiently designed by allocating more human ratings for types of evaluations where the predictability of LLM ratings is not high. To the best of our knowledge, there is very little guidance on how much human oversight should be retained when validating benchmarks.
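
The two-stage design described above can be sketched with a standard augmented inverse-probability-weighted (doubly robust) estimator of the mean human rating. This is a minimal sketch under assumptions and not the paper's exact estimator: the simulated data, the constant sampling probability, and the choice of using the raw LLM score as the prediction of the human rating are all hypothetical.

```python
import random

def doubly_robust_mean(llm_scores, human_ratings, sampled, probs,
                       predict=lambda s: s):
    """AIPW estimate of the mean human rating.

    llm_scores    : LLM rating for every item (stage 1).
    human_ratings : human rating per item; ignored where sampled[i] is False.
    sampled       : True if item i was sent to a human rater (stage 2).
    probs         : known design probability that item i is sampled.
    predict       : maps an LLM score to a predicted human rating.
    """
    total = 0.0
    for s, h, r, p in zip(llm_scores, human_ratings, sampled, probs):
        m = predict(s)
        total += m                 # prediction for every item
        if r:
            total += (h - m) / p   # weighted correction on the subsample
    return total / len(llm_scores)

# Simulated study: the LLM is systematically biased relative to humans.
rng = random.Random(0)
n_items, sample_prob = 20000, 0.1
llm, human, sampled, all_human = [], [], [], []
for _ in range(n_items):
    score = rng.uniform(1.0, 5.0)
    rating = 1.0 + 0.6 * score + rng.gauss(0.0, 0.5)
    picked = rng.random() < sample_prob
    llm.append(score)
    all_human.append(rating)
    sampled.append(picked)
    human.append(rating if picked else None)

probs = [sample_prob] * n_items
dr_estimate = doubly_robust_mean(llm, human, sampled, probs)
naive_llm_mean = sum(llm) / n_items
full_human_mean = sum(all_human) / n_items
print(naive_llm_mean, dr_estimate, full_human_mean)
```

Because the sampling probabilities are fixed by the design, the correction term removes the bias of the prediction on average, so the estimate stays close to the full human mean even though the LLM scores alone are biased. A better `predict` reduces the variance of the estimate rather than its bias.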


2605.16335 2026-05-19 stat.ME math.ST stat.TH

Tests for constancy of model parameters over time

对时间变化模型参数一致性的检验

Nils Lid Hjort, Alex J. Koning

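AI summary: This paper studies tests for whether model parameters change over time. It constructs monitoring processes and goodness-of-fit statistics based on them, discusses how to pinpoint where and what type of change has occurred, and applies to a broad range of parametric models.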

Comments 23 pages, 3 figures. This is a Statistical Research Report, Department of Mathematics, University of Oslo, from 2001, containing some more material than for the published version, in Journal of Nonparametric Statistics, 2002, vol. 14, pages 113-132. NLH honours Alex Koning (1959-2022) by making these Hjort-Koning methods more visible, via arXiv and other channels

详情

Journal ref: Journal of Nonparametric Statistics, 2002, vol. 14, pages 113-132

AI中文摘要

假定一系列数据点服从某种参数形式的分布，但其中一些底层参数可能随时间变化。本文在该框架下探讨了各种自然问题。我们构建了监控过程，在无变化假设下，这些过程收敛于独立布朗桥。利用这些过程构建了适合检验的统计量。研究了加权版本，并推导了最优权重函数以获得最大局部效力。还讨论了如何利用结果确定变化的位置和类型，当初步筛查测试表明存在变化时。我们的统一大样本方法具有广泛适用性，适用于所有常规参数模型，包括回归、马尔可夫链和时间序列情况。

英文摘要

Suppose that a sequence of data points follows a distribution of a certain parametric form, but that one or more of the underlying parameters may change over time. This paper addresses various natural questions in such a framework. We construct canonical monitoring processes which under the hypothesis of no change converge in distribution to independent Brownian bridges, and use these to construct natural goodness-of-fit statistics. Weighted versions of these are also studied, and optimal weight functions are derived to give maximum local power against alternatives of interest. We also discuss how our results can be used to pinpoint where and what type of changes have occurred, in the event that initial screening tests indicate that such exist. Our unified large-sample methodology is quite general and applies to all regular parametric models, including regression, Markov chains, and time series situations.


2605.16332 2026-05-19 stat.AP

Auditing the Auditors: Does Community-Based Moderation Get It Right?

Yeganeh Alimohammadi, Karissa Huang, Christian Borgs, Jennifer Chayes

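AI summary: This paper studies consensus-based auditing in X's Community Notes, a community-based moderation system. It finds that minority contributors drift toward the majority and participate less on controversial topics, and proposes a two-stage algorithm that weights contributors by the stability of their past residuals to improve predictive performance.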

详情

AI中文摘要

在线社交平台越来越多地依赖众包系统来大规模标记误导性内容，但这些系统必须聚合用户的评估并决定信任哪些用户的评估。为解决后者，许多平台通过奖励与最终聚合结果一致来审计用户，我们称之为基于共识的审计。我们分析了这种设计在 X 的 Community Notes 中的后果，该平台在 2022 年 9 月采用了将用户参与资格与最终平台结果一致性的审计机制。我们发现证据表明存在策略性 conformity：少数贡献者的评估倾向多数意见，且在争议话题中其参与比例下降，其中独立信号最为重要。我们通过一个行为模型正式化了这一机制，其中贡献者在私人信念与预期分歧惩罚之间权衡。受这些发现启发，我们提出了一种两阶段审计和聚合算法，该算法根据贡献者过去残差的稳定性而非多数同意来加权贡献者。该方法首先考虑内容和贡献者之间的差异，然后衡量每个贡献者评估相对于潜在因子模型的可预测性。那些评估始终具有信息性的贡献者在聚合中获得更大的影响力，即使他们与主流共识相左。在 Community Notes 数据中，这种方法提高了离样预测性能，同时避免了对分歧的惩罚。

英文摘要

Online social platforms increasingly rely on crowd-sourced systems to label misleading content at scale, but these systems must both aggregate users' evaluations and decide whose evaluations to trust. To address the latter, many platforms audit users by rewarding agreement with the final aggregate outcome, a design we term consensus-based auditing. We analyze the consequences of this design in X's Community Notes, which in September 2022 adopted consensus-based auditing that ties users' eligibility for participation to agreement with the eventual platform outcome. We find evidence of strategic conformity: minority contributors' evaluations drift toward the majority and their participation share falls on controversial topics, where independent signals matter most. We formalize this mechanism in a behavioral model in which contributors trade off private beliefs against anticipated penalties for disagreement. Motivated by these findings, we propose a two-stage auditing and aggregation algorithm that weights contributors by the stability of their past residuals rather than by agreement with the majority. The method first accounts for differences across content and contributors, and then measures how predictable each contributor's evaluations are relative to the latent-factor model. Contributors whose evaluations are consistently informative receive greater influence in aggregation, even when they disagree with the prevailing consensus. In the Community Notes data, this approach improves out-of-sample predictive performance while avoiding penalization of disagreement.

2512.19929 2026-05-19 math.ST stat.TH

Deconvolution in unlinked linear models

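Fadoua Balabdaoui, Antonio Di Noia, Cécile Durot

AI summary: This paper studies nonparametric deconvolution in the unlinked linear regression framework and proposes a nonparametric estimator that attains the parametric convergence rate in Wasserstein distance, with a rate that is unaffected by the smoothness of the noise.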

Abstract

Unlinked regression, in which covariates and responses are observed separately without known correspondence, has recently gained increasing attention. Deconvolution, on the other hand, is a fundamental and challenging problem in nonparametric statistics with the aim of estimating the distribution of a latent random variable $Z$ based on observations contaminated by some additive noise. The complexity of this task is heavily influenced by the smoothness of the noise distribution and often leads to slow estimation rates. In this paper, we combine the recent unlinked linear regression problem with the classical deconvolution framework. Specifically, we study nonparametric deconvolution under the assumption that $Z$ is a linear function of an observable multidimensional covariate. This structural constraint allows us to introduce a nonparametric estimator of the distribution of $Z$ which achieves the parametric rate of convergence in the Wasserstein distance of order 1, where the smoothness of the noise does not affect the rate. Furthermore, we introduce nonparametric estimators for the unconditional density of $Z$ and the conditional density of $Z$ given an observed response. This allows us to study the problem of estimating the value of the latent linear predictor, whose link to the observed response is not accessible. Through several simulations, we illustrate the fast convergence rate of our deconvolution estimator and the performance of the proposed conditional estimators of the latent predictor in different simulation scenarios.

2512.12572 2026-05-19 cs.LG stat.ML

On the Accuracy of Newton Step and Influence Function Data Attributions

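Ittai Rubinstein, Samuel B. Hopkins

AI summary: This paper studies the accuracy of the Newton Step and Influence Function data attribution methods, derives scaling laws for their errors, and explains why NS is more accurate under certain conditions.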

Abstract

Data attribution aims to explain model predictions by estimating how they would change if certain training points were removed, and is used in a wide range of applications, from interpretability and credit assignment to unlearning and privacy. Even in the relatively simple case of logistic regressions, existing mathematical analyses of leading data attribution methods such as Influence Functions (IF) and single Newton Step (NS) remain limited in two key ways. First, they rely on global strong convexity assumptions which are often not satisfied in practice. Second, the resulting bounds scale very poorly with the number of parameters ($d$) and the number of samples removed ($k$). As a result, these analyses are not tight enough to answer fundamental questions such as "what is the asymptotic scaling of the errors of each method?" or "which of these methods is more accurate for a given dataset?" In this paper, we introduce a new analysis of the NS and IF data attribution methods for convex learning problems. To the best of our knowledge, this is the first analysis of these questions that does not assume global strong convexity and also the first explanation of [KATL19] and [RH25a]'s observation that NS data attribution is often more accurate than IF. We prove that for sufficiently well-behaved logistic regressions, our bounds are asymptotically tight up to poly-logarithmic factors, yielding scaling laws for the errors in the average-case sample removals. \[ \mathbb{E}_{T \subseteq [n],\, |T| = k} \bigl[ \|\hat{\theta}_T - \hat{\theta}_T^{\mathrm{NS}}\|_2 \bigr] = \widetilde{\Theta}\!\left(\frac{k d}{n^2}\right), \qquad \mathbb{E}_{T \subseteq [n],\, |T| = k} \bigl[ \|\hat{\theta}_T^{\mathrm{NS}} - \hat{\theta}_T^{\mathrm{IF}}\|_2 \bigr] = \widetilde{\Theta}\!\left( \frac{(k + d)\sqrt{k d}}{n^2} \right). \]
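
As a rough reading of the two rates stated in this abstract (a sketch only, which ignores the constants and poly-logarithmic factors hidden by the tilde notation), dividing the second rate by the first gives \[ \frac{(k + d)\sqrt{k d}/n^2}{k d/n^2} = \frac{k + d}{\sqrt{k d}} = \sqrt{\frac{k}{d}} + \sqrt{\frac{d}{k}} \ge 2. \] Under these rates, the gap between the NS and IF estimates is therefore at least of the same order as the NS error itself, and the ratio grows as $k$ and $d$ become unbalanced. For instance, $k = 1$ and $d = 100$ give a ratio of $101/10 = 10.1$.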

2512.06238 2026-05-19 cs.IT math.IT math.ST stat.TH

Non-Asymptotic Error Bounds for Causally Conditioned Directed Information Rates of Gaussian Sequences

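Yuping Zheng, Andrew Lamperski

AI summary: This paper studies causally conditioned directed information rates of Gaussian sequences, gives an explicit formula based on optimal prediction, and provides an estimator with an error bound of O(N^{-1/2}log(N)).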

Comments 9 pages, 1 figure; accepted by IFAC World Congress 2026

2510.21523 2026-05-19 cs.LG stat.ML

Interpretable epistemic uncertainty decomposition in sequential generative models via polynomial chaos surrogates

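Ramón Nartallo-Kaluarachchi, Shashanka Ubaru, Małgorzata J Zimoń, Dongsung Huh, Robert Manson-Sawko, Lior Horesh, Yoshua Bengio

AI summary: This paper proposes using polynomial chaos expansions to analyze the sources of epistemic uncertainty in sequential generative models, revealing how reward components affect generative decisions; it reports advantages over deep ensembles, Bayesian neural networks, and similar methods, and demonstrates efficiency and robustness on several real-world tasks.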

Comments 37 pages, 15 figures

Abstract

Sequential generative models conditioned on uncertain rewards are central to AI-driven scientific discovery, yet the epistemic uncertainty they inherit from imperfect reward estimates remains unquantified. We propagate this uncertainty through generative flow networks (GFlowNets) by fitting polynomial chaos expansions (PCEs) to small ensembles of trained models. The PCE coefficients yield analytical Sobol sensitivity indices, providing the first interpretable decomposition of which reward components drive which generative decisions, a capability unavailable from deep ensembles, Bayesian neural networks, or Monte Carlo dropout. Convergence guarantees are established theoretically and four of five are formally verified in the Lean 4 proof assistant. Across three real-world tasks the framework reveals actionable structure invisible to ensembles alone. On the Doyle-Dreher Buchwald-Hartwig dataset, catalyst selection is robust ($D_{\mathrm{catalyst}}\approx 71$) while additive selection is fragile ($D_{\mathrm{additive}}\approx 179$, $2.5\times$ higher). In fragment-based molecular design the linker position is the most sensitive ($D_{\mathrm{linker}}\approx 28$) while decoration positions are the most robust ($D\approx 14$-$18$), reversing the conventional scaffold-robust / decoration-fragile assumption. On the Sachs protein signalling network, MAPK-cascade edges and PKA/PKC hub edges separate into distinct sensitivity regimes, providing a targeted map for perturbation experiments. Calibration coverage at the 95% level reaches 0.97-1.00 across the dominant steps, and the surrogate evaluates 10,000 policy samples in milliseconds, $10^{3}$-$10^{4}\times$ faster than exhaustive retraining.

2504.15879 2026-05-19 stat.ME

Multivariate Poisson intensity estimation via low-rank tensor decomposition

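Haotian Xu, Carlos Misael Madrid Padilla, Oscar Hernan Madrid Padilla, Daren Wang

AI summary: This paper proposes matrix- and tensor-based methods for estimating multivariate intensity functions of inhomogeneous point processes; by treating the intensity as an infinite-dimensional matrix or tensor in a function space, the methods attain the optimal bias-variance trade-off, improving estimation accuracy while reducing computational cost.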

Abstract

In this work, we propose new matrix- and tensor-based methodologies for estimating multivariate intensity functions of inhomogeneous point processes. By viewing multivariate intensity functions as infinite-dimensional matrices or tensors within function spaces, our algorithms attain the optimal bias-variance trade-off, yielding rate-optimal estimation error, with model complexity governed by matrix or tensor ranks. They substantially improve estimation accuracy, while simultaneously reducing computational cost. To illustrate the adaptivity of the proposed framework, we show that many fundamental classes of multivariate functions, including additive and mean-field models, admit finite-rank tensor representations. We apply our method to a four-dimensional U.S. Geological Survey earthquake dataset, comprising features such as latitude, longitude, depth, and magnitude. Our tensor estimator recovers localized seismicity patterns (California, Oklahoma, Pacific Northwest, north-central U.S.), whereas the kernel baseline oversmooths them.

2502.17007 2026-05-19 cs.LG cs.AI stat.ML

Uncertainty Quantification as a Principled Foundation for Explainable Artificial Intelligence: A Case Study of Counterfactual Explanations

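Kacper Sokol, Santo M. A. R. Thies, Eyke Hüllermeier

AI summary: Using uncertainty quantification in counterfactual explainability as a case study, this paper demonstrates its potential as a unifying framework, proposes two explainer variants, and shows that they outperform existing methods.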

2411.18510 2026-05-19 stat.ME

A subgroup-aware scoring approach to the study of effect modification in observational studies

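Yijun Fan, Dylan S. Small

AI summary: This paper proposes a new group M-statistic that scores the matched pairs within each subgroup, addressing the way outliers can be confused with effect modification when subgroups are pooled; it validates the approach in extensive experiments and applies it to a study of a malaria prevention treatment in West Africa.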

Abstract

Effect modification means the size of a treatment effect varies with an observed covariate. Generally speaking, a larger treatment effect with more stable error terms is less sensitive to bias. Thus, we might be able to conclude that a study is less sensitive to unmeasured bias by using these subgroups experiencing larger treatment effects. Lee et al. (2018) proposed the submax method that leverages the joint distribution of test statistics from subgroups to draw a firmer conclusion if effect modification occurs. However, one version of the submax method uses M-statistics as the test statistics and is implemented in the R package submax (Rosenbaum, 2017). The scaling factor in the M-statistics is computed using all observations combined across subgroups. We show that this combining can confuse effect modification with outliers. We propose a novel group M-statistic that scores the matched pairs in each subgroup to tackle the issue. We examine our novel scoring strategy in extensive settings to show the superior performance. The proposed method is applied to an observational study of the effect of a malaria prevention treatment in West Africa.

2010.15538 2026-05-19 stat.ML cs.LG

Matérn Gaussian Processes on Graphs

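Viacheslav Borovitskiy, Iskander Azangulov, Alexander Terenin, Peter Mostowsky, Marc Peter Deisenroth, Nicolas Durrande

AI summary: This paper studies Matérn Gaussian processes on graphs via their stochastic partial differential equation characterization, shows that they inherit properties of their Euclidean and Riemannian counterparts, and provides techniques for training with standard methods, making them applicable in mini-batch and non-conjugate settings.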

Journal ref: Artificial Intelligence and Statistics, 2021

Abstract

Gaussian processes are a versatile framework for learning unknown functions in a manner that permits one to utilize prior information about their properties. Although many different Gaussian process models are readily available when the input space is Euclidean, the choice is much more limited for Gaussian processes whose input space is an undirected graph. In this work, we leverage the stochastic partial differential equation characterization of Matérn Gaussian processes (a widely used model class in the Euclidean setting) to study their analog for undirected graphs. We show that the resulting Gaussian processes inherit various attractive properties of their Euclidean and Riemannian analogs and provide techniques that allow them to be trained using standard methods, such as inducing points. This enables graph Matérn Gaussian processes to be employed in mini-batch and non-conjugate settings, thereby making them more accessible to practitioners and easier to deploy within larger learning frameworks.
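
To make the idea in this abstract concrete, here is a minimal sketch of a Matérn-type covariance on a small undirected graph. It assumes the covariance takes the form $(2\nu/\kappa^2 \cdot I + \Delta)^{-\nu}$, with $\Delta$ the graph Laplacian; this form is not stated in the abstract and is my reading of the construction, with normalization constants omitted. The function name `graph_matern_kernel` and the default parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def graph_matern_kernel(adjacency, nu=1.5, kappa=1.0):
    """Assumed form: (2*nu/kappa**2 * I + L)**(-nu), L the graph Laplacian."""
    A = np.asarray(adjacency, dtype=float)
    # Combinatorial Laplacian L = D - A of an undirected (symmetric) graph.
    laplacian = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so eigh returns real eigenvalues and orthonormal vectors.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Apply the Matern-type spectral filter to each Laplacian eigenvalue.
    spectrum = (2.0 * nu / kappa**2 + eigvals) ** (-nu)
    # Reassemble U diag(spectrum) U^T.
    return (eigvecs * spectrum) @ eigvecs.T

# Path graph on four nodes: 0 - 1 - 2 - 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
K = graph_matern_kernel(A, nu=1.5, kappa=1.0)
```

The resulting `K` is a symmetric positive-definite matrix indexed by graph nodes, so it can be used wherever a finite-dimensional Gaussian process covariance is expected. The dense eigendecomposition is only practical for small graphs.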

1908.05387 2026-05-19 cs.LG stat.ML

HONEM: Learning Embedding for Higher Order Networks

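Mandana Saebi, Giovanni Luca Ciampaglia, Lance M Kaplan, Nitesh V Chawla

AI summary: This paper proposes HONEM, an embedding method designed for higher-order network structures that captures non-Markovian higher-order dependencies and improves node classification, network reconstruction, link prediction, and visualization.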

DOI: 10.1089/big.2019.0169
Journal ref: Big Data 8, no. 4 (2020): 255-269

Abstract

Representation learning on networks offers a powerful alternative to the oft painstaking process of manual feature engineering, and as a result, has enjoyed considerable success in recent years. However, all the existing representation learning methods are based on the first-order network (FON), that is, the network that only captures the pairwise interactions between the nodes. As a result, these methods may fail to incorporate non-Markovian higher-order dependencies in the network. Thus, the embeddings that are generated may not accurately represent the underlying phenomena in a network, resulting in inferior performance in different inductive or transductive learning tasks. To address this challenge, this paper presents HONEM, a higher-order network embedding method that captures the non-Markovian higher-order dependencies in a network. HONEM is specifically designed for the higher-order network structure (HON) and outperforms other state-of-the-art methods in node classification, network reconstruction, link prediction, and visualization for networks that contain non-Markovian higher-order dependencies.
