arXivDaily: arXiv Daily Academic Digest, updated Monday to Friday
1. Statistical Theory and Methods (10 papers)

2606.13593 2026-06-12 stat.ME New submission

Smoothed Rank-Based Regression Estimation Using Wilcoxon Score Functions

Feridun Tasdan

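AI summary: Proposes a Wilcoxon rank regression estimator that replaces integer ranks with smoothed ranks, approximating the indicator function by a kernel distribution function. The estimator retains robustness while improving efficiency under heavy-tailed errors and handling tied data; a Wald test is derived and asymptotic normality is proved.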

Comments 17 pages

Abstract

This article proposes an improved rank-based regression estimator obtained by replacing the ordinary integer ranks in the Wilcoxon rank-score regression procedure with smoothed ranks derived from a smoothed empirical cumulative distribution function. The smoothed ranks are computed via a continuous, nondecreasing kernel distribution function H that provides a differentiable approximation to the classical indicator function used in standard rank regression. Substituting these smoothed ranks into the Wilcoxon score function yields a new estimator for the slope parameter(s) of the simple and multiple linear regression model. We show that the proposed estimator inherits the robustness properties of classical rank regression while providing improved efficiency under heavy-tailed error distributions and better handling of tied observations. A Wald-type hypothesis test for the regression coefficients is derived and its asymptotic normality is established. A Monte Carlo simulation study compares the new estimator with the ordinary least-squares (OLS) estimator, the classical Wilcoxon rank regression estimator, and the Theil and Sen estimator under several error distributions including the normal, Laplace, Cauchy, and contaminated normal. The proposed estimator achieves relative efficiencies at or above those of classical rank regression uniformly across all scenarios considered, with notable gains in the presence of outliers and heavy-tailed errors.
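The idea of replacing indicator-based ranks with kernel-smoothed ranks can be sketched as follows. This is only an illustration, not the paper's implementation: the logistic CDF used for H, the bandwidth value, the function names, and the +1/2 shift in the score argument are all assumptions made here for the example.

```python
import numpy as np

def smoothed_ranks(residuals, bandwidth):
    """Smoothed ranks R_i = sum_j H((e_i - e_j) / bandwidth).

    H is the logistic CDF here (an illustrative choice), written with tanh
    so that large arguments do not overflow.
    """
    e = np.asarray(residuals, dtype=float)
    scaled = (e[:, None] - e[None, :]) / bandwidth
    H = 0.5 * (1.0 + np.tanh(0.5 * scaled))
    return H.sum(axis=1)

def wilcoxon_scores(ranks):
    """Wilcoxon scores phi(u) = sqrt(12) * (u - 1/2) at u = (R + 1/2) / (n + 1).

    The +1/2 shift is an assumed convention: each smoothed rank includes
    a self-term H(0) = 1/2, so the shift makes the scores sum to zero.
    """
    n = len(ranks)
    u = (np.asarray(ranks, dtype=float) + 0.5) / (n + 1)
    return np.sqrt(12.0) * (u - 0.5)

residuals = np.array([3.0, 1.0, 2.0, 2.0])  # note the tie at 2.0
ranks = smoothed_ranks(residuals, bandwidth=0.5)
scores = wilcoxon_scores(ranks)
```

Tied residuals receive identical smoothed ranks without any tie-breaking rule, and as the bandwidth shrinks the smoothed ranks approach the ordinary integer ranks minus one half.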

2606.13433 2026-06-12 stat.ME New submission

Smoothed-KL Reweighting: A Principled Account and Matching Rule for SNR-Based Diffusion Training

Lei Li

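AI summary: Proposes smoothed-KL reweighting, derives a closed-form weight from the spread divergence, and establishes a matching rule with the Min-SNR family. Validated on CIFAR-10 and CelebA-64, where final FID is comparable but iteration efficiency varies by dataset.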

Abstract

We give a principled derivation of the Soft-Min-SNR weight of Crowson et al. (2024). The spread divergence of Zhang et al. (2018) convolves both compared distributions with a Gaussian kernel before taking the Kullback-Leibler (KL) divergence; applied to the per-sample local matched-Gaussian surrogate at each timestep, it yields the closed-form weight w(t,lambda) = sigma^2 / (sigma^2 + lambda). Three consequences follow. First, for variance-preserving schedules, w(t,lambda) equals a constant multiple of Soft-Min-SNR with gamma' = (1+lambda)/lambda, deriving a validated heuristic rather than introducing a new weight. Second, the same weight matches Min-SNR-gamma at leading order under gamma approximately 1/lambda, giving a cross-walk between the soft and hard reweighting families. Third, a local-geometry analysis scales an SGD-difficulty proxy by w^3 at high-SNR timesteps. Complementary to the objective-level account of Kingma & Gao (2023), who unified monotonic-in-log-SNR weightings as ELBOs of noise-augmented data, ours smooths both compared distributions rather than only the data side. Empirically, the matching rule holds on CIFAR-10 (linear and cosine) and CelebA-64 (cosine), with trajectory-wide confirmation on the cross-dataset cut: |Ours - Min-SNR| averages 0.45 FID across seven intermediate checkpoints on the seed-42 CelebA-64 trajectory, roughly 3x tighter than either reweighter's gap to DDPM. The local-geometry prediction is partially borne out: Ours converges about 21% earlier than DDPM at mid-training FID thresholds on CIFAR-10's linear schedule, where high-SNR damping headroom is largest, but this iteration-efficiency advantage does not transfer to cosine or CelebA-64, where all three methods reach similar final FIDs. Overall: final-FID parity with dataset-dependent iteration efficiency, plus a principled matching rule across the Min-SNR family.
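The closed-form weight quoted in the abstract is easy to evaluate numerically. The sketch below assumes, for illustration only, that sigma^2 is the variance-preserving noise variance 1 - alpha_bar_t and that SNR = alpha_bar_t / (1 - alpha_bar_t); neither convention is stated in the abstract, and the function name is hypothetical. Under those assumptions, plain algebra rewrites the weight as a constant multiple of 1 / (SNR + gamma') with gamma' = (1 + lambda) / lambda, which is the value of gamma' the abstract reports.

```python
import numpy as np

def smoothed_kl_weight(sigma_sq, lam):
    """Closed-form weight w = sigma^2 / (sigma^2 + lambda) from the abstract."""
    return sigma_sq / (sigma_sq + lam)

lam = 0.2                                # illustrative smoothing level
alpha_bar = np.linspace(0.01, 0.99, 50)  # assumed VP signal level per timestep
sigma_sq = 1.0 - alpha_bar               # assumed VP noise variance
snr = alpha_bar / sigma_sq

w = smoothed_kl_weight(sigma_sq, lam)

# Same weight rewritten in terms of SNR under the assumptions above:
# w = (1 / lambda) / (SNR + gamma'), with gamma' = (1 + lambda) / lambda.
gamma_prime = (1.0 + lam) / lam
soft_form = (1.0 / lam) / (snr + gamma_prime)
```

The weight stays strictly between 0 and 1 and decreases as SNR grows, so high-SNR timesteps are damped.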

2606.13242 2026-06-12 stat.ME stat.CO New submission

Least Absolute Deviations Estimation for Sinusoidal Models

Zehaan Naik, Debasis Kundu

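AI summary: Proposes a robust least-absolute-deviations parameter estimation method for sinusoidal regression models using a coordinate descent algorithm (weighted-median updates for amplitudes, periodogram-based grid search for frequencies), proves strong consistency and asymptotic normality of the estimator, and demonstrates robustness to non-Gaussian noise on synthetic data and real time series.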

Comments 34 pages, 5 figures

Abstract

We study robust parameter estimation in sinusoidal regression models within a least absolute deviations (LAD) framework. While classical approaches rely predominantly on least-squares formulations, they are known to be sensitive to heavy-tailed noise and outliers. We formulate the estimation problem as direct minimization of the LAD objective and propose a simple, modular coordinate descent algorithm that exploits the partial convexity of the objective: amplitude parameters are updated via weighted median computations, leading to substantial computational improvements over traditional simplex-based optimization methods, while frequency parameters are estimated via a periodogram-inspired grid search with local refinement. We establish strong consistency and asymptotic normality of the proposed estimator under mild regularity conditions. Empirically, we demonstrate the method's effectiveness on both synthetic datasets and real-world time series, including the Mauna Loa atmospheric CO2 data, air passenger data, and UK drivers' deaths data, where robustness to non-Gaussian noise is essential. The proposed approach provides a simple, interpretable, and robust alternative to least-squares-based methods for sinusoidal signal estimation.
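The weighted-median amplitude update can be illustrated for a single sinusoid y_t = A cos(ωt) + B sin(ωt) + noise. With ω and B held fixed, the LAD objective in A equals the sum of |cos(ωt)| · |r_t / cos(ωt) − A| with r_t = y_t − B sin(ωt), whose minimizer is a weighted median. The sketch below is a minimal version under that single-component assumption; the function names and the cutoff for near-zero cosines are illustrative and not taken from the paper.

```python
import numpy as np

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return values[order][idx]

def update_cos_amplitude(y, t, omega, B):
    """One coordinate step: minimize sum_t |y_t - A cos(wt) - B sin(wt)| over A."""
    c = np.cos(omega * t)
    r = y - B * np.sin(omega * t)
    keep = np.abs(c) > 1e-12  # terms with cos(wt) = 0 do not depend on A
    return weighted_median(r[keep] / c[keep], np.abs(c[keep]))

t = np.arange(50.0)
y = 2.0 * np.cos(0.7 * t) + 0.5 * np.sin(0.7 * t)
y[10] += 100.0  # a single gross outlier
A_hat = update_cos_amplitude(y, t, omega=0.7, B=0.5)
```

In this noiseless example with one gross outlier the update still returns an amplitude close to 2, which is the robustness a least-squares update would lack.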

2606.12884 2026-06-12 stat.ME eess.SP New submission

Volterra--Wiener--Kunchenko Orthogonalization: From Wiener--Hermite to Distribution-Matched Volterra Bases

Serhii Zabolotnii

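AI summary: To address the ill-conditioning of Volterra identification under non-Gaussian input, constructs a distribution-matched VWK basis by oriented Gram-Schmidt orthogonalization and proves that the risk of the self-normalized diagonal estimator in the variance-matched Gaussian basis is governed by the skew coefficient. Experiments show that the VWK basis is better conditioned than the power basis.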

Comments 20 pages, 1 figure; companion reproducibility archive with code, frozen results, and Lean 4 files

Abstract

The monomial parameterization of finite-memory Volterra identification is ill-conditioned under non-Gaussian input, and the Wiener--Hermite expansion removes this ill-conditioning only for Gaussian white-noise input. We construct the distribution-matched Volterra--Wiener--Kunchenko (VWK) basis by oriented Gram--Schmidt orthogonalization of monomials in $L^2(P)$ and use it as an arbitrary-polynomial-chaos coordinate system for finite-memory Volterra identification from data, following the generalized polynomial chaos of Xiu and Karniadakis (2002) and the data-driven arbitrary polynomial chaos of Oladyshkin and Nowak (2012). The basis itself is classical; the contribution is the Volterra-estimation reading. First, an order-2 misspecification-penalty theorem shows that a self-normalized diagonal estimator in the variance-matched Gaussian basis incurs an excess $L^2(P)$ risk governed by the skew coefficient $δ=μ_3/σ^2$, vanishing exactly for symmetric inputs. Second, conditioning experiments separate the constructional fact that the population matched Gram is the identity from the finite-sample design Gram: at $n=2000$, the centered-exponential empirical VWK Gram remains far better conditioned than the power Gram, although it degrades with degree. Third, a machine-checked Lean 4 proof establishes the Binomial$(N,p)$ Krawtchouk row for arbitrary $N$. Full least squares over a fixed span is basis-invariant, so VWK stabilizes diagonal cross-correlation and regularized coordinate fits rather than claiming universal prediction superiority. The analysis is moment-based, finite-memory, and restricted to product input laws.
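A rough in-sample analogue of the orthogonalization step, for a single scalar input, is sketched below. It orthogonalizes monomials with respect to the empirical measure of the sample using a QR factorization (Gram-Schmidt in matrix form), so the resulting Gram matrix is the identity by construction. It does not reproduce the paper's experiment, which evaluates a population-matched basis on finite samples; the seed, degree, and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(size=2000) - 1.0  # centered exponential input
n = len(x)
degree = 4

# Monomial (power) design: columns 1, x, x^2, ..., x^degree.
V = np.vander(x, degree + 1, increasing=True)
power_gram = V.T @ V / n

# Gram-Schmidt on the columns in the empirical L^2 inner product.
Q, R = np.linalg.qr(V)
signs = np.sign(np.diag(R))     # orient each polynomial: positive leading coefficient
basis = Q * signs * np.sqrt(n)  # orthonormal polynomials evaluated at the sample
matched_gram = basis.T @ basis / n

cond_power = np.linalg.cond(power_gram)
cond_matched = np.linalg.cond(matched_gram)
```

Comparing `cond_power` with `cond_matched` shows how badly conditioned the monomial Gram matrix is for a skewed input, even at low degree.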

2606.13240 2026-06-12 cs.LG cs.AI cs.CV stat.ME stat.ML New submission

Towards More General Control of Diffusion Models Using Jeffrey Guidance

Raphaël Razafindralambo, Rémy Sun, Frédéric Precioso, Jes Frellsen, Pierre-Alexandre Mattei

Affiliations: Inria, CNRS, I3S, Maasai Université Côte d’Azur; Technical University of Denmark; Inria, CNRS, LJAD, Maasai Université Côte d’Azur

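AI summary: Proposes the Jeffrey guidance framework, which updates marginal distributions via Jeffrey's rule of conditioning and extends diffusion-model control to applications that standard guidance cannot express. It substantially reduces FID on CIFAR-10 and FFHQ and enables fairness control on CelebA-HQ.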

Abstract

A key strength of diffusion models lies in their flexibility, since their outputs can be controlled at sampling time through guidance. However, beyond simple cases such as conditional sampling, the target distribution is often left implicit, defined only through a sampling rule or a heuristic energy function. To address this, we propose Jeffrey guidance, a principled framework that extends diffusion-model control to applications beyond what standard guidance can express. It leverages Jeffrey's rule of conditioning to update marginal distributions towards a prescribed target, preserving the conditional structure and minimally perturbing the joint distribution. We first demonstrate Jeffrey guidance by targeting a prescribed embedding distribution. With Inception embeddings as the target, this leads to substantial reductions in FID on both CIFAR-10 and FFHQ. We further apply Jeffrey guidance to fairness on CelebA-HQ, updating an unconditional diffusion model to enforce independence between attributes.

2606.13554 2026-06-12 math.ST stat.ME stat.TH New submission

Asymptotic regimes for maximum likelihood estimation in the Ewens--Pitman model: When the strength parameter matters

Filippo Ascolani, Mario Beraha, Stefano Favaro

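AI summary: Studies the large-sample asymptotics of maximum likelihood estimation of the discount and strength parameters in the Ewens-Pitman model, finds four distinct regimes in which θ may play a crucial role, and overcomes the restriction imposed by infinite exchangeability through a scaled model.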

Abstract

We study the large sample asymptotic behaviour of the Maximum Likelihood Estimator of the discount and strength parameters $(α,θ)$ in the Ewens--Pitman model for random partitions, under mild assumptions on the data-generating mechanism. We show that four distinct regimes arise, depending on the limiting behaviour of the frequency spectrum. In particular, in contrast with previous work, we find that $θ$ may play a crucial role asymptotically. We further show that the existing literature implicitly focuses on only two of these regimes, and we relate this restriction to the constraints imposed by infinite exchangeability. Under the latter, indeed, the number of distinct blocks and the frequency spectrum are necessarily tied by a rigid structural relation. We prove that this lack of flexibility can be overcome through what we call the scaled Ewens--Pitman model, in which $θ$ is allowed to grow with the sample size $n$. Finally, we provide empirical evidence from real-world data showing that such extensions are needed to capture frequency spectra that fall outside the classical Ewens--Pitman framework.

2606.12720 2026-06-12 math.PR math.ST stat.ML stat.TH New submission

On McDiarmid's Inequality under Dependence via Approximate Tensorization of Entropy

Valentin Roth

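AI summary: Derives McDiarmid's inequality for dependent data via approximate tensorization of entropy (ATE), applies it to non-isotropic Gaussian vectors and to strongly log-concave and log-smooth measures, and addresses the concentration of the sign function, Erdős-Rényi graphs under dependence, and a Dvoretzky-Kiefer-Wolfowitz-type inequality, improving the convergence rate to $1/\sqrt{n}$.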

Comments 27 pages

Abstract

We argue that dependent versions of McDiarmid's inequality are a useful but underutilized tool in mathematical statistics, learning theory and theoretical computer science. To make this point, we first highlight that approximate tensorization of entropy (ATE) implies McDiarmid's inequality via the entropy method. Second, we derive McDiarmid's inequality for non-isotropic Gaussian random vectors $X \sim \mathcal N(μ, Σ)$ through ATE with a constant of the order of the condition number of $Σ$. We obtain this ATE independently through a simple application of stochastic localization and also discuss how a more general ATE for the Gibbs sampler due to Ascolani et al. (2026) extends McDiarmid-type concentration to strongly log-concave and log-smooth probability measures. We then apply the resulting concentration inequalities to resolve a question on the concentration of $\operatorname{sign}(X)$ posed by Simone Bombari, investigate Erdős-Rényi graphs under dependence, and prove a Dvoretzky-Kiefer-Wolfowitz-type inequality for observations from a joint measure that fulfills ATE and has continuous marginal CDFs. For the class of strongly log-concave and log-smooth measures, this result improves upon a prior Dvoretzky-Kiefer-Wolfowitz-type inequality for non-i.i.d. observations due to Bobkov and Götze (2010) by establishing the expected $1/\sqrt{n}$ rate of convergence under weak dependence instead of $n^{-1/3}$.

2503.02178 2026-06-12 stat.ML cs.LG Updated version

Central Limit Theorems for Stochastic Gradient Descent Quantile Estimators

Ziyang Wei, Jiaqi Li, Likai Chen, Wei Biao Wu

Affiliations: Department of Statistics, University of Chicago; Department of Statistics and Data Science, Washington University in St. Louis

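AI summary: For quantile estimation by constant-learning-rate SGD, uses Markov chain theory to prove that the stationary distribution converges to a Gaussian as the learning rate tends to zero, giving the first CLT-type theoretical guarantee, and proposes a recursive algorithm for confidence intervals.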

Abstract

This paper develops asymptotic theory for quantile estimation via stochastic gradient descent (SGD) with a constant learning rate. The quantile loss function is neither smooth nor strongly convex. Beyond conventional perspectives and techniques, we view quantile SGD iteration as an irreducible, periodic, and positive recurrent Markov chain, which cyclically converges to its unique stationary distribution regardless of the arbitrarily fixed initialization. To derive the exact form of the stationary distribution, we analyze the structure of its characteristic function by exploiting the stationary equation. We also derive tight bounds for its moment generating function (MGF) and tail probabilities. Synthesizing the aforementioned approaches, we prove that the centered and standardized stationary distribution converges to a Gaussian distribution as the learning rate $η\rightarrow0$. This finding provides the first central limit theorem (CLT)-type theoretical guarantees for the quantile SGD estimator with constant learning rates. We further propose a recursive algorithm to construct confidence intervals of the estimators with statistical guarantees. Numerical studies demonstrate the effective finite-sample performance of the online estimator and inference procedure. The theoretical tools developed in this study are of independent interest for investigating general SGD algorithms formulated as Markov chains, particularly in non-strongly convex and non-smooth settings.
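The recursion under study can be sketched with the standard subgradient step for the quantile (pinball) loss, θ ← θ + η(τ − 1{x ≤ θ}). The code below only simulates this iteration with a constant learning rate; it does not implement the paper's confidence-interval algorithm, and the sample size, seed, learning rate, and the use of a tail average as a summary are illustrative assumptions.

```python
import numpy as np

def sgd_quantile(samples, tau, eta, theta0=0.0):
    """Constant-learning-rate SGD on the quantile loss; returns all iterates."""
    theta = theta0
    path = np.empty(len(samples))
    for k, x in enumerate(samples):
        # Subgradient step: move up by eta*tau if x > theta, down by eta*(1-tau) otherwise.
        theta += eta * (tau - float(x <= theta))
        path[k] = theta
    return path

rng = np.random.default_rng(1)
samples = rng.standard_normal(100_000)
path = sgd_quantile(samples, tau=0.5, eta=0.01)

# Iterates keep fluctuating around the median (0 for N(0, 1)) and never settle,
# which is why the stationary distribution is the object of interest.
tail_mean = path[50_000:].mean()
```

With τ = 0.5 every step has size η/2, so the iterates live on a lattice; lowering η tightens the fluctuation around the target quantile.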

2209.13686 2026-06-12 stat.ME Updated version

False Discovery Rate Adjustments for Average Significance Level Controlling Tests

Timothy B. Armstrong

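AI summary: Shows that under average significance level control the Benjamini-Hochberg procedure still controls FDR asymptotically, proves that certain adjustments for dependence remain valid in finite samples, and thereby provides FDR control methods for high-dimensional and nonparametric settings.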

Abstract

Multiple testing adjustments, such as the Benjamini & Hochberg (1995) step-up procedure for controlling the false discovery rate (FDR), are typically applied to families of tests that control significance level in the classical sense: for each individual test, the probability of false rejection is no greater than the nominal level. In this paper, we consider tests that satisfy only a weaker notion of significance level control, in which the probability of false rejection need only be controlled on average over the hypotheses. We find that the Benjamini & Hochberg (1995) step-up procedure still controls FDR in the asymptotic regime with many weakly dependent p-values and an increasing number of rejections, and that certain adjustments for dependent p-values such as the Benjamini & Yekutieli (2001) procedure continue to yield FDR control in finite samples. Our results open the door to FDR-controlling procedures in nonparametric and high-dimensional settings where weakening the notion of inference may allow for power improvements.
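For reference, the two step-up procedures named in the abstract can be written in a few lines. This is the textbook form of the Benjamini-Hochberg rule with an optional Benjamini-Yekutieli harmonic-sum correction, not code from the paper; the function name and the example p-values are made up for illustration.

```python
import numpy as np

def step_up_rejections(pvalues, q, dependence_adjusted=False):
    """Benjamini-Hochberg step-up at level q.

    With dependence_adjusted=True, q is divided by sum_{i<=m} 1/i
    (the Benjamini-Yekutieli correction).
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    level = q / np.sum(1.0 / np.arange(1, m + 1)) if dependence_adjusted else q
    order = np.argsort(p)
    thresholds = level * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if passed.size:
        # Reject every hypothesis up to the largest k with p_(k) <= k * level / m.
        reject[order[: passed[-1] + 1]] = True
    return reject

pvals = [0.205, 0.001, 0.074, 0.008, 0.039, 0.041, 0.042, 0.06, 0.212, 0.216]
bh = step_up_rejections(pvals, q=0.05)
by = step_up_rejections(pvals, q=0.05, dependence_adjusted=True)
```

On these ten p-values the plain procedure rejects the two smallest, while the dependence-adjusted version rejects only the smallest.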

2504.16279 2026-06-12 math.ST cs.IT math.IT stat.AP stat.TH Updated version

Sharp Detection Threshold for Correlation among Multiple Unlabeled Gaussian Networks

Taha Ameen, Bruce Hajek

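AI summary: Studies the hypothesis testing problem of whether m ≥ 2 complete weighted graphs with Gaussian edge weights are mutually correlated after unknown vertex relabeling, determines the information-theoretic detection threshold for fixed m, and shows that there is no detection-recovery gap.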

Abstract

This paper studies the hypothesis testing problem of deciding whether $m \geq 2$ complete weighted graphs with Gaussian edge weights are mutually correlated after unknown relabelings of their vertices. Under the null model all edge weights are independent standard Gaussians, whereas under the planted model the graphs share a latent vertex alignment and each pair of corresponding edge weights has correlation $ρ$. For fixed $m$, we identify the sharp information-theoretic threshold for detection. Above the threshold, a generalized likelihood-ratio test achieves strong detection, whereas even weak detection is impossible below the threshold. The result extends the two-graph detection threshold of Wu, Xu, and Yu to any fixed number of graphs, exhibits a side-information regime in which two graphs alone are insufficient but multiple graphs enable detection, and, together with the recovery threshold of Vassaux and Massoulié, shows that this Gaussian multi-graph model has no detection--recovery gap.

2. Bayesian Statistics and Probabilistic Modeling (6 papers)

2606.12701 2026-06-12 stat.ME New submission

Bayesian machine learning approach for recurrent events studies using Soft Bayesian Additive Regression Trees (SBART)

MengXing Chen, Debajyoti Sinha, Antonio Linero

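AI summary: Proposes a nonparametric method based on Soft Bayesian Additive Regression Trees (SBART), combining soft decision trees with Bayesian ensemble learning to model recurrent events. A two-layer data augmentation scheme enables efficient computation; the method outperforms existing approaches in simulations and is illustrated on real data.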

Abstract

Recurrent event data frequently arise in biomedical studies, where individuals may experience multiple recurrences of the same type of event, such as recurrent hospitalizations. This article introduces a nonparametric method for recurrent events under a Bayesian ensemble learning framework, called Soft Bayesian Additive Regression Trees (SBART), which combines multiple soft decision trees to achieve high predictive accuracy and a smooth estimator of the underlying intensity of the recurrent events. The proposed model represents the conditional intensity function of the non-homogeneous Poisson process as the product of a time-constant baseline, a subject-specific frailty random effect, and a nonparametric component capturing potentially nonlinear covariate effects and unknown interactions among covariates and time. A two-layer data augmentation scheme is employed to efficiently incorporate the SBART component within our computational algorithm. Simulation studies demonstrate that our method, called RecSBART for short, achieves superior accuracy in estimating cumulative intensity compared to existing approaches, even when our modeling assumptions do not hold. With a Bayesian analysis of a study of recurrent hospitalizations of colorectal cancer patients, we further demonstrate the RecSBART method's ability to reveal and interpret the underlying complex relationships among covariates in a recurrent events study.

2605.00432 2026-06-12 cs.LG stat.ML Updated version

Optimal Spatio-Temporal Decoupling for Bayesian Conformal Prediction

Yu-Hsueh Fang, Chia-Yen Lee

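AI summary: Proposes State-Adaptive Bayesian Conformal Prediction (SA-BCP), which balances long-term temporal inertia against local spatial evidence through a gated convex combination to achieve fast adaptation and stable coverage under distribution shift, and gives a closed-form MSE-optimal threshold and regret bounds for the online selection procedure.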

Abstract

Online conformal prediction must balance fast adaptation to distribution shift against stable coverage: feedback-driven methods react quickly but become volatile, while strongly discounted Bayesian methods lag and inflate intervals at tight coverage. We introduce State-Adaptive Bayesian Conformal Prediction (SA-BCP), which forms the predictive quantile as a gated convex combination of long-term temporal inertia and local spatial evidence from a kernel density estimate, controlled by a single interpretable evidence threshold $K$. We establish three results: (i) asymptotic marginal validity of the resulting intervals; (ii) a closed-form expression for the MSE-optimal threshold, $K^*_{\mathrm{MSE}}=α(1-α)/M^{\mathcal{T}}$, trading the coverage-indicator (Bernoulli) variance against the temporal structural bias $M^{\mathcal{T}}$; and (iii) a rolling-origin procedure for selecting $K$ online -- consistent under stationarity, with $O(\sqrt{T\log N})$ regret against the best fixed $K$ and, for a segmented variant, a sublinear dynamic-regret bound under bounded drift. Across four financial-volatility and weather datasets, three target coverage levels, and eight baselines (including the strongest recent conditional-quantile methods, SPCI and KOWCPI), SA-BCP attains at-or-above-nominal coverage in most settings while producing substantially sharper intervals -- up to roughly $3\times$ lower Winkler score than discounted Bayesian CP at the tightest coverage -- and a coverage-matched audit confirms these efficiency gains are not an artifact of under-coverage. We disclose one principal limitation: a volatility-specialized conformal-GARCH competitor remains more efficient on its home volatility-base series, though it does not transfer across domains.

2602.08913 2026-06-12 cs.LG stat.ML Updated version

GEMSS: A Variational Bayesian Method for Discovering Multiple Sparse Solutions in Classification and Regression Problems

Kateřina Henclová, Václav Šmídl

Affiliations: Faculty of Electrical Engineering, Czech Technical University

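AI summary: Proposes the GEMSS algorithm, which uses a structured spike-and-slab prior, a Gaussian-mixture approximate posterior, and a Jaccard penalty to discover multiple diverse sparse feature combinations simultaneously via variational inference. It outperforms the compared methods across 128 experiments and is demonstrated on 3 real-world datasets.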

Abstract

High-dimensional, underdetermined and highly correlated systems are common in data science practice, especially when analyzing physical measurements. In such settings, feature selection poses a fundamental challenge because multiple distinct sparse subsets may explain the response equally well. Their identification is crucial not only for predictive modeling but also for generating domain-specific insights into the underlying mechanisms. Yet, conventional methods typically isolate a single solution, obscuring the full spectrum of plausible explanations. This work introduces GEMSS (Gaussian Ensemble for Multiple Sparse Solutions), a variational algorithm designed to simultaneously discover multiple, diverse sparse feature combinations. The method employs a structured spike-and-slab prior for sparsity, a mixture of Gaussians to approximate the intractable multimodal posterior, and a Jaccard-based penalty to further control solution diversity. A single objective function is optimized via stochastic gradient descent. The method is tested in 128 comprehensive experiments using a novel benchmarking framework designed to generate artificial problems with multiple sparse solutions of equal predictive properties. This allows us to measure the retrieval of ground truth features rather than only evaluating predictive performance -- characteristics more fitting to our practical needs. A comparative analysis shows that GEMSS consistently outperforms five prominent feature selection methods adapted through the ALFESE framework. Finally, we demonstrate practical usability through 3 challenging real-world datasets from metabolomics and physical chemistry: GEMSS successfully isolates multiple distinct yet high-quality solutions. GEMSS is available as a PyPI package 'gemss'. The corresponding repository github.com/kat-er-ina/gemss/ includes the full codebase and a free, no-code application GEMSS Explorer.

2512.25056 2026-06-12 stat.ME Updated version

Sequential Bayesian parameter-state estimation in dynamical systems with noisy and incomplete observations via a variational framework

Liliang Wang, Alex Gorodetsky

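AI summary: Proposes an online variational inference framework that factorizes the joint posterior into a marginal distribution over parameters and a conditional state distribution, enabling joint estimation of the parameters and states of a dynamical system with a theoretical upper bound on the error. Numerical experiments confirm its robustness and scalability on chaotic and high-dimensional systems.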

Comments 31 pages, 8 figures

Abstract

Online joint estimation of a dynamical model's unknown parameters and states with uncertainty quantification is crucial in many applications. For example, digital twins dynamically update their knowledge of model parameters and states to support prediction and decision-making. Reliability and computational speed are vital for digital twins. Online parameter-state estimation ensures computational efficiency, while uncertainty quantification is essential for making reliable predictions and decisions. In parameter-state estimation, the joint distribution of the state and model parameters conditioned on the data, termed the joint posterior, provides accurate uncertainty quantification. Because the joint posterior is generally intractable to compute, this paper presents an online variational inference framework to compute its approximation at each time step. The approximation is factorized into a marginal distribution over the model parameters and a state distribution conditioned on the parameters. This factorization enables recursive updates through a two-stage procedure: first, the parameter posterior is approximated via variational inference; second, the state distribution conditioned on the parameters is computed using Gaussian filtering based on the approximate parameter posterior. The algorithmic design is supported by a theorem establishing upper bounds on the joint posterior approximation error. Numerical experiments demonstrate that the proposed method (i) accurately infers both unobserved states and unknown parameters of dynamical and observation models; (ii) remains robust under noisy, partial observations and model discrepancies in a chaotic Lorenz'96 system; and (iii) scales effectively to a high-dimensional state-space system arising from the spatial discretization of a convection-diffusion equation, outperforming the joint ensemble Kalman filter in this setting.

2408.17346 2026-06-12 stat.ME stat.CO Updated version

On Nonparanormal Likelihoods

Torsten Hothorn

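AI summary: Proposes a one-step estimation framework for nonparanormal models, addressing joint parameter estimation through four novel likelihood functions, and demonstrates it on transformation discriminant analysis.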

详情
AI中文摘要

非参数正态模型通过潜在高斯(即参数)copula 描述多元响应的联合分布,同时允许灵活的非参数边际。这些分布的某些方面(例如条件独立性)是参数化的。其他特征(如边际分布)可以是非参数或半参数化的。当多元正态性可疑但可解释性至关重要时,此类模型具有吸引力。大多数估计过程执行两步:首先估计非参数部分。然后处理 copula 参数,将边际估计视为已知。这对于某些应用是足够的。对于其他应用,例如当半参数边际包含感兴趣的参数或标准误差很重要时,所有参数的联合估计可能更有利。我们提出了非参数正态模型的合适参数化,可能包括半参数效应,并定义了四种新颖的非参数正态对数似然函数。通常,相应的单步优化问题被证明是非凸的。然而,在某些情况下,会出现双凸问题。讨论了几种凸近似。从底层计算角度来看,核心贡献是通过 Genz 过程计算的多元正态对数概率的得分函数。为了展示理论和计算框架的通用性,我们提出了一系列用于变换判别分析的非参数正态模型,其中一些生物标志物受到检测限问题的影响。在模拟研究中,针对半参数有效多分格相关分析(存在理论基准),展示了全最大似然估计相比两步方法可能带来的经验增益。

英文摘要

Nonparanormal models describe the joint distribution of multivariate responses via latent Gaussian, and thus parametric, copulae while allowing flexible nonparametric marginals. Some aspects of such distributions, for example conditional independence, are formulated parametrically. Other features, such as marginal distributions, can be formulated non- or semiparametrically. Such models are attractive when multivariate normality is questionable but interpretability is paramount. Most estimation procedures perform two steps, first estimating the nonparametric part. The copula parameters come second, treating the marginal estimates as known. This is sufficient for some applications. For other applications, e.g. when a semiparametric margin features parameters of interest or when standard errors are important, a simultaneous estimation of all parameters might be more advantageous. We present suitable parameterisations of nonparanormal models, possibly including semiparametric effects, and define four novel nonparanormal log-likelihood functions. In general, the corresponding one-step optimisation problems are shown to be non-convex. In some cases, however, biconvex problems emerge. Several convex approximations are discussed. From a low-level computational point of view, the core contribution is the score function for multivariate normal log-probabilities computed via the Genz procedure. As a demonstration of the versatility of the theoretical and computational framework, we present a series of nonparanormal models for transformation discriminant analysis when some biomarkers are subject to limit-of-detection problems. Possible empirical gains of full maximum likelihood estimation compared to two-step approaches are illustrated in a simulation study targeting semiparametric efficient polychoric correlation analysis where a theoretical benchmark is available.

2411.07651 2026-06-12 stat.ME stat.ML 版本更新

Quasi-Bayes empirical Bayes: a sequential approach to the Poisson compound decision problem

拟贝叶斯经验贝叶斯:泊松复合决策问题的序贯方法

Stefano Favaro, Sandra Fortini

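AI summary For the Poisson compound decision problem with streaming data, proposes a quasi-Bayesian sequential estimate based on Newton's algorithm with constant per-observation computational cost, and proves its consistency and asymptotic optimality.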

Comments 49 pages

详情
AI中文摘要

泊松复合决策问题是统计学中一个长期存在的问题,经验贝叶斯方法常用于在静态或批量设置中估计泊松均值。我们在流式或在线框架中考虑该问题。基于牛顿算法的拟贝叶斯方法,我们开发了一个易于评估、计算高效且随着数据积累具有常数每观测成本的序贯估计。我们为所提出的估计建立了频率学派保证,包括一致性和渐近最优性,其中最优性理解为渐近消失的超额贝叶斯风险或遗憾。通过模拟研究和与基准程序的比较评估了实证性能。

英文摘要

The Poisson compound decision problem is a long-standing problem in statistics, for which empirical Bayes methods are commonly used to estimate Poisson means in static or batch settings. We consider this problem in a streaming, or online, framework. Building on a quasi-Bayesian approach based on Newton's algorithm, we develop a sequential estimate that is easy to evaluate, computationally efficient, and has constant per-observation cost as the data accrue. We establish frequentist guarantees for the proposed estimate, including consistency and asymptotic optimality, with optimality understood as vanishing excess Bayes risk, or regret. Empirical performance is assessed through simulation studies and comparisons with benchmark procedures.
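
The abstract above builds on Newton's recursive algorithm. The sketch below is a generic grid-based version of that recursion for a Poisson mixture, not the authors' estimator: the grid of candidate means, the weight sequence `1/(n+1)`, the uniform starting weights, and the names `NewtonRecursion` and `posterior_mean` are all illustrative assumptions. Each update touches every grid point once, so the per-observation cost does not grow with the number of observations.

```python
import math


def poisson_pmf(x, theta):
    """Poisson probability of count x given mean theta > 0."""
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))


class NewtonRecursion:
    """Grid-based recursive estimate of a mixing distribution (illustrative)."""

    def __init__(self, grid):
        self.grid = list(grid)  # candidate Poisson means, all > 0 (assumption)
        self.weights = [1.0 / len(self.grid)] * len(self.grid)  # uniform start
        self.n = 0

    def update(self, x):
        """Fold in one observation; cost is O(len(grid)), independent of n."""
        self.n += 1
        alpha = 1.0 / (self.n + 1)  # assumed learning-rate sequence
        lik = [poisson_pmf(x, t) for t in self.grid]
        marginal = sum(l * w for l, w in zip(lik, self.weights))
        # Convex combination of the old weights and their one-step Bayes update.
        self.weights = [
            (1.0 - alpha) * w + alpha * l * w / marginal
            for l, w in zip(lik, self.weights)
        ]

    def posterior_mean(self, x):
        """Plug-in estimate of E[theta | x] under the current mixing weights."""
        lik = [poisson_pmf(x, t) for t in self.grid]
        num = sum(t * l * w for t, l, w in zip(self.grid, lik, self.weights))
        den = sum(l * w for l, w in zip(lik, self.weights))
        return num / den


# Hypothetical stream of counts, processed one observation at a time.
estimator = NewtonRecursion(grid=[0.5 * k for k in range(1, 41)])  # 0.5 .. 20
for count in [0, 1, 0, 2, 9, 11, 10, 1, 0, 12]:
    estimator.update(count)
```

After the loop, `estimator.posterior_mean(x)` gives an estimate of the Poisson mean for a new count `x` using only the stored grid weights; the raw data never need to be revisited.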

3. Causal Inference and Experimental Design (10 papers)

2606.13531 2026-06-12 stat.ME 新提交

When Representative Samples Produce Worse Outcomes: Scale-up Decisions and Testing in Small-Budget RCTs

当代表性样本产生更差结果:小预算随机对照试验中的规模决策与测试

Hannah Li, Hongseok Namkoong, Isaac Scheinfeld

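AI summary Studies how the composition of the pilot sample affects expected outcomes in small-budget randomized controlled trials when a statistical significance test decides whether to scale up an intervention, and finds that under small budgets the optimal design samples from only a single homogeneous subpopulation.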

详情
AI中文摘要

小型随机对照试验通常用于在开展更大规模后续研究之前筛选干预措施。这是实验的关键阶段,因为错过有效干预或扩大有害干预可能代价高昂。为减少这些错误,一个常见建议是招募对目标人群具有代表性的样本,但在资源有限的试点中这往往具有挑战性。我们挑战了代表性样本总是更优的观点,证明当统计显著性检验决定干预措施是否获得进一步研究时,最大化下游预期结果改善的试点试验组成关键取决于其预算规模。在大预算极限下,最优试点设计收敛于对目标人群具有代表性的样本。然而,在小预算区间,试点设计者通过仅从单一同质子群体抽样来最大化预期影响,子群体的选择取决于抽样成本以及设计者对异质性处理效应的先验信念。我们对小预算结果的证明更普遍地适用于当随机对照试验和显著性检验用于决定是否获得任何非自适应下游收益的情况,这一结果可能适用于其他实验预算受限的场景。

英文摘要

Small randomized controlled trials are often used to screen interventions before running larger follow-up studies. This is a critical phase of experimentation, as missing effective interventions or scaling up harmful ones can be very costly. A common proposal to mitigate these errors is to recruit samples that are representative of the target population, but this is often challenging in resource-constrained pilots. We challenge the narrative that representative samples are always superior by showing that when statistical significance testing determines whether interventions receive further study, the pilot trial composition that maximizes the downstream expected improvement in outcomes depends critically on its budget size. In the large-budget limit, the optimal pilot design converges to a sample that is representative of the target population. However, in the small-budget regime, the pilot designer maximizes expected impact by sampling only from a single homogeneous sub-population, chosen in a manner that depends on sampling costs and the designer's prior beliefs about heterogeneous treatment effects. Our proof of the small-budget result applies more generally when an RCT and significance test are used to decide whether to receive any non-adaptive downstream payoff, a result that may be applicable to other settings with constrained experimentation budgets.

2606.13305 2026-06-12 stat.ME stat.AP stat.CO 新提交

Semiparametric Bayesian inference for causal mediation in cluster randomized trials

整群随机试验中因果中介的半参数贝叶斯推断

Woojung Bae, Michael Daniels, Joseph Hogan, Rajesh Vedanthan, Stavroula Chrysanthopoulou

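AI summary Addresses causal mediation analysis in cluster randomized trials with few clusters and a cluster-level mediator by proposing a robust inference framework that combines parametric Bayesian models with a similarity-weighted Bayesian bootstrap to estimate natural direct and indirect effects accurately.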

详情
AI中文摘要

整群随机试验(CRTs)常用于评估干预措施,但在此类设置中进行因果中介分析仍然具有挑战性,特别是当中介变量在群组层面测量且群组数量较少时。标准推断方法通常依赖于渐近假设,这些假设在有限样本设置中失效,导致方差估计有偏和置信区间无效。在本文中,我们为CRT中的因果中介分析提出一个稳健的推断框架。我们利用结果和中介的参数贝叶斯模型以确保计算效率和可解释性。关键的是,为了量化不确定性,我们指定了一种新颖的相似性加权贝叶斯自助法(SWBB),其中包含群组之间的“距离”度量;这避免了对限制性参数假设的需求,并允许模型从“更近”的群组中借用更多信息。通过将观测数据模型与因果假设相结合,我们的方法即使在群组有限的情况下也能准确估计自然直接和间接效应。模拟研究表明,我们的方法在各种场景下实现了名义覆盖概率。我们通过评估肯尼亚一项CRT中的中介作用来展示我们方法的实际效用。

英文摘要

Cluster randomized trials (CRTs) are frequently used to evaluate interventions, yet conducting causal mediation analysis in these settings remains challenging, particularly when the mediator is measured at the cluster level and the number of clusters is small. Standard inference methods often rely on asymptotic assumptions that fail in finite-sample settings, leading to biased variance estimation and invalid confidence intervals. In this paper, we propose a robust inference framework for causal mediation analysis in CRTs. We utilize parametric Bayesian models for the outcome and mediator to ensure computational efficiency and interpretability. Crucially, to quantify uncertainty, we specify a novel similarity-weighted Bayesian bootstrap (SWBB) with a 'distance' metric between clusters; this avoids the need for restrictive parametric assumptions and allows the model to borrow more information from 'closer' clusters. By combining observed data models with causal assumptions, our approach accurately estimates natural direct and indirect effects even with limited clusters. Simulation studies demonstrate that our method achieves nominal coverage probability across diverse scenarios. We illustrate the practical utility of our approach by assessing mediation in a CRT in Kenya.

2606.13281 2026-06-12 stat.ME 新提交

Causal invariance in graphical models with latent variables

含潜变量图模型中的因果不变性

Marco Borriero, Monia Lupparelli, Giovanni M. Marchetti, Veronica Vinciotti

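AI summary Studies when the causal invariance principle applies in the presence of latent variables, characterizes the induced graph over the observed variables, and gives necessary and sufficient conditions for testing invariance for a multivariate Gaussian target.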

详情
AI中文摘要

因果发现旨在从观测或干预数据中识别变量间的因果关系,通常用有向无环图(DAG)表示。因果不变性原理通过利用因果效应在不同实验设置下的稳定性,能够识别目标变量的因果父节点。然而,当某些父节点未被观测到时,观测变量上的诱导图可能不再是DAG,且可能不唯一,这使因果推断复杂化。针对潜父节点的相关配置,我们刻画了诱导图,并形式化了因果不变性得以保持以识别观测父节点的条件。对于多变量高斯目标,正式建立了检验此类不变性的必要和充分条件。

英文摘要

Causal discovery aims to identify causal relationships among variables from observational or interventional data, typically represented by a directed acyclic graph (DAG). The causal invariance principle enables the identification of the causal parents of target variables by exploiting the stability of causal effects across different experimental settings. When some parents are unobserved, however, the induced graph over the observed variables may no longer be a DAG, and it may not be unique, complicating causal inference. For relevant configurations of latent parents, we characterize the induced graph and formalize the conditions under which causal invariance is preserved for the identification of the observed parents. Necessary and sufficient conditions for testing such invariance are formally established for a multivariate Gaussian target.

2606.12680 2026-06-12 cs.LG stat.ML 新提交

How Useful is Causal Invariance for Domain Adaptation in Finite-Sample Settings?

因果不变性在有限样本设置中对领域适应有多大用处?

Julia Kostin, Kasra Jalaldoust, Elias Bareinboim, Samory Kpotufe, Fanny Yang

Affiliations * Department of Computer Science, ETH Zurich; Causal Artificial Intelligence Lab, Columbia University; Department of Statistics, Columbia University

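AI summary Studies how causal invariance can improve supervised domain adaptation in linear regression, derives matching upper and lower bounds governed by the target-risk margins between candidate predictors and the finite-source estimation error, and shows that adaptive aggregation avoids negative transfer when the margins are sufficiently large.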

详情
AI中文摘要

机器学习模型在部署到与训练源分布不同的目标分布时,性能往往会下降。最近基于因果的领域泛化工作表明,领域间的共享因果结构可以诱导不变预测器,例如在结构化领域偏移下具有稳定风险的某些特征子集上的模型。然而,这种总体水平的因果不变性在有限样本设置中能带来多大收益仍未充分探索。特别是,在实践中我们通常只能获得少量带标签的目标样本,这种设置称为监督领域适应(sDA)。本文探讨何时(完全或部分)因果知识能够可证明地改进监督领域适应。作为第一步,我们研究线性回归,其中完全或部分因果知识指定了一组不变或可能不变的特征子集,每个子集产生一个源训练候选预测器。我们推导了匹配的上界和下界,表明有限样本收益由候选预测器之间的目标风险边界以及有限源估计误差共同决定。当这些边界相对于$n_Q$足够大时,自适应聚合过程可以匹配最佳候选预测器,同时避免相对于仅使用目标样本学习的负迁移。另一方面,当边界过小时,没有算法能够可靠地利用候选集合获得更快的有限样本速率。我们进一步将这些边界与线性SCM中的结构偏移幅度联系起来,并在真实世界的因果基准上验证了理论。

英文摘要

Machine learning models often degrade when they are deployed on a target distribution that differs from the source distributions they were trained on. Recent work in causality-based domain generalization has shown how shared causal structure between domains can induce invariant predictors, e.g., models on a subset of features which have stable risk across structured domain shifts. However, the extent to which such population-level causal invariances can lead to gains in finite-sample settings remains underexplored. In particular, in practice we often have access to a few labeled target samples, a setting called supervised domain adaptation (sDA). In this paper, we explore when (full or partial) causal knowledge can provably improve supervised domain adaptation. As a first step, we study linear regression, where full or partial causal knowledge specifies a collection of invariant or possibly invariant feature subsets, each yielding a source-trained candidate predictor. We derive matching upper and lower bounds showing that finite-sample gains are governed by the target-risk margins separating the candidates, together with the finite-source estimation error. When these margins are sufficiently large relative to $n_Q$, an adaptive aggregation procedure can match the best candidate predictor while avoiding negative transfer relative to target-only learning. On the other hand, when the margins are too small, no algorithm can reliably exploit the candidate collection to obtain faster finite-sample rates. We further connect these margins to structural shift magnitude in linear SCMs and validate the theory on real-world causal benchmarks.

2606.12892 2026-06-12 stat.ML cs.LG econ.EM math.ST stat.ME stat.TH 新提交

Prediction-Powered Causal Inference by Automatic Debiased Machine Learning and Semi-Supervised Riesz Regression

预测驱动的因果推断:自动去偏机器学习与半监督Riesz回归

Masahiro Kato

Affiliations * University of Tokyo

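AI summary Studies semiparametric efficient estimation of causal parameters in a semi-supervised setting; by combining debiased machine learning with semi-supervised Riesz regression, proposes the DML-PPCI methods (EE-DML-PPCI and TMLE-DML-PPCI), which attain a smaller asymptotic variance than estimators that use labeled data alone.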

详情
AI中文摘要

本研究探讨了在半监督设置下因果和结构参数的半参数有效估计。在我们的设置中,除了由结果和回归变量组成的标注观测数据外,还有未标记的辅助回归变量可用。我们的目标是构建因果和结构参数的估计量,其渐近方差小于仅使用标注数据构建的估计量。我们将此框架称为预测驱动的因果推断(PPCI)。我们首先推导了有效影响函数和效率界,这表明使用辅助回归变量可以获得比仅从标注观测数据可达到的效率界更小的渐近方差。然后,通过将有效影响函数与去偏机器学习(DML)框架相结合,我们提出了称为DML-PPCI的方法。如果我们构建一个估计方程估计量,我们称之为EE-DML-PPCI;如果我们构建一个目标学习估计量,我们称之为TMLE-DML-PPCI。两种估计量的渐近方差都与我们推导的效率界相匹配。在构建估计量时,有效影响函数的估计起着重要作用。在我们的研究中,有效影响函数也是一个Neyman正交分数,它依赖于Riesz表示子和回归函数。对于Riesz表示子估计,我们开发了具有收敛速度保证的半监督广义Riesz回归。

英文摘要

This study investigates semiparametric efficient estimation of causal and structural parameters in a semi-supervised setting. In our setting, unlabeled auxiliary regressors are available in addition to labeled observations consisting of outcomes and regressors. Our goal is to construct estimators of causal and structural parameters whose asymptotic variances are smaller than those of estimators constructed using only labeled data. We refer to this framework as prediction-powered causal inference (PPCI). We first derive the efficient influence function and the efficiency bound, which imply that the use of auxiliary regressors can attain a smaller asymptotic variance than the efficiency bound attainable from labeled observations alone. Then, by combining the efficient influence function with the debiased machine learning (DML) framework, we propose methods that we call DML-PPCI. If we construct an estimating-equation estimator, we refer to the method as EE-DML-PPCI; if we construct a targeted-learning estimator, we refer to the method as TMLE-DML-PPCI. The asymptotic variances of both estimators match our derived efficiency bound. In the construction of the estimators, estimation of the efficient influence function plays an important role. In our study, the efficient influence function is also a Neyman orthogonal score, which depends on the Riesz representer and the regression function. For Riesz representer estimation, we develop semi-supervised generalized Riesz regression with convergence rate guarantees.

2606.04009 2026-06-12 stat.ML cs.AI cs.LG 版本更新

Counterfactual Explanations for Deep Two-Sample Testing

深度双样本检验的反事实解释

Wei-Cheng Lai, Marco Simnacher, Christoph Lippert

Affiliations * Hasso-Plattner-Institute, University of Potsdam; Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai

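AI summary For deep two-sample testing, proposes a counterfactual explanation framework based on a diffusion autoencoder and MMD optimization that generates sample-level edits to reveal the features driving rejection of the null hypothesis.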

Comments 17 pages

详情
AI中文摘要

双样本检验是检测科学领域中分布差异的基本工具,但经典检验(包括基于核的检验)在高维结构化数据(如图像)上可能效果不佳。最近的深度双样本检验通过学习信息表示提高了这些场景下的灵敏度,但它们对哪些数据特征驱动拒绝原假设 $H_0$ 提供的洞察有限。为解决此问题,我们提出了一种用于深度双样本检验的反事实解释框架,该框架生成样本级编辑,将观测值从源组移向目标组,同时明确减少检验所测量的差异。我们的方法将扩散自编码器与预训练的深度双样本检验模型相结合,并在检验模型的表示空间中优化最大均值差异(MMD)目标,以生成合理的反事实。我们通过检验统计量和由此产生的双样本p值的变化来量化分布级效应。我们在合成2D形状数据集和两个MRI队列上评估了该方法。在这两种设置下,反事实变换相对于原始样本持续增加p值,表明编辑后的源集在检验下在统计上更接近目标分布。我们使用LPIPS测量最小性,以确保反事实保持接近原始样本。由此产生的编辑提供了与检测到的组差异相关的特征的可解释证据。在MRI上,局部变化与队列之间已知的解剖学差异一致。

英文摘要

Two-sample testing is a fundamental tool for detecting distributional differences across scientific domains, but classical tests (including kernel-based tests) can be ineffective on high-dimensional structured data such as images. Recent deep two-sample tests improve sensitivity in these settings by learning informative representations, yet they provide limited insight into which data features drive rejection of the null hypothesis $H_0$. To address this issue, we propose a counterfactual explanation framework for deep two-sample testing that generates sample-level edits moving observations from a source group toward a target group while explicitly reducing the discrepancy measured by the test. Our method combines a diffusion autoencoder with a pretrained deep two-sample test model and optimizes a maximum mean discrepancy (MMD) objective in the test model's representation space to produce plausible counterfactuals. We quantify distribution-level effects through changes in the test statistic and the resulting two-sample p-values. We evaluate the method on synthetic 2D shape datasets and two MRI cohorts. Across both settings, the counterfactual transformations consistently increase p-values relative to the original samples, indicating that the edited source set becomes statistically closer to the target distribution under the test. We measure minimality using LPIPS to ensure the counterfactuals remain close to the original samples. The resulting edits provide interpretable evidence of the features associated with the detected group differences. On MRI, the localized changes are consistent with known anatomical differences between cohorts.
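
The abstract above optimizes a maximum mean discrepancy (MMD) objective. As a rough illustration of the quantity being reduced, the sketch below computes the biased (V-statistic) estimate of squared MMD with a Gaussian kernel on raw coordinates. This is an assumption made for brevity: the paper works in a learned representation space with a diffusion autoencoder, neither of which is reproduced here, and the function names, bandwidth, and toy points are hypothetical.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples."""

    def mean_kernel(p, q):
        total = sum(gaussian_kernel(a, b, bandwidth) for a in p for b in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy data: a source group, a target group, and hand-made "edited" source
# points that were moved part of the way toward the target group.
source = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2)]
target = [(3.0, 3.0), (3.1, 2.9), (2.8, 3.0)]
edited = [(1.5, 1.5), (1.55, 1.5), (1.5, 1.6)]

before = mmd_squared(source, target)
after = mmd_squared(edited, target)
```

Here `after` is smaller than `before`, which mirrors the idea that edits moving source samples toward the target group reduce the measured discrepancy.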

2605.18724 2026-06-12 stat.ME 版本更新

Sensitivity analysis for causal mediation: bridge score, sharp sensitivity bounds, and calibration

因果中介的敏感性分析:桥分数、精确敏感性界限和校准

Yuki Ohnishi, Fan Li

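AI summary Proposes the bridge score as a balancing score for the mediator stage, derives sharp pointwise bounds in terms of two interpretable latent confounding parameters, and introduces two calibration approaches to make the sensitivity analysis operational.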

Comments 33 pages

详情
AI中文摘要

因果中介分析将总处理效应分解为通过假设中介变量起作用的部分和残余直接部分。自然直接和间接效应的识别通常依赖于顺序可忽略性的中介阶段,这无法通过经验验证,需要明确的敏感性分析。我们引入了桥分数,这是一种由两个处理特定的中介密度在共同中介值处形成的低维向量,并展示了它是顺序可忽略性中介阶段的平衡分数。在桥分数条件下,我们推导出一个精确的逐点包络(pointwise envelope),以两个可解释的潜在混淆参数来表达未识别的中介-结果混淆函数。为了使该界限适用于敏感性分析,我们进一步引入了两种校准方法。第一种是针对观测协变量的基准校准,包括一种基于排名的版本,其对基准的单调重新表达具有不变性;第二种是基于残余结果变异的残差预算校准。最后,我们展示如何通过标量函数减少和贝叶斯g-计算算法将逐点界限用于推断,将所有不确定性源传播到中介效应估计的后验抽样中。

英文摘要

Causal mediation analysis decomposes the total treatment effect into a portion operating through a hypothesized mediator and a residual direct portion. Identification of natural direct and indirect effects typically rests on the mediator stage of sequential ignorability, which cannot be empirically verified and requires explicit sensitivity analysis. We formulate the bridge score, a mediator-stage balancing score, as a low-dimensional vector formed from the two treatment-specific mediator densities at a common mediator value, and show that it balances baseline covariates for the mediator stage relevant to natural effect identification. Conditional on the bridge score, we derive a sharp pointwise variance envelope on the unidentified mediator-outcome confounding function in terms of latent outcome relevance and residual selection. To make the bound operational for sensitivity analysis, we further introduce a residual budget calibration approach based on local residual outcome variation and record a complementary range bound for support-based restrictions. Finally, we show how the pointwise bound can be operationalized for inference through a scalar functional reduction and a Bayesian g-computation algorithm that combines observed-data posterior uncertainty with user-specified sensitivity uncertainty, rather than treating the unidentified sensitivity corrections as learned from the likelihood.

2604.23534 2026-06-12 stat.ME stat.AP 版本更新

Multivariate incremental effects for continuous treatments: Studying the health effects of environmental mixtures

连续型处理变量的多元增量效应:研究环境混合物的健康影响

Zhuochao Huang, Kejin Dong, Tuo Lin, Joseph Antonelli

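AI summary To handle positivity violations with continuous multivariate exposures such as air pollution mixtures, proposes a causal inference framework based on exponential tilting, defines causal estimands that compare intervention directions fairly, develops efficient one-step estimation and a Riemannian BFGS algorithm among other methodological tools, and applies the framework to nationwide environmental health data to identify intervention strategies for a PM$_{2.5}$ chemical mixture.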

详情
AI中文摘要

评估多元连续暴露(如空气污染混合物)的因果健康效应是一项关键的公共卫生挑战。主要障碍是正性假设经常被违反,这使得标准确定性干预的效果无法识别或严重依赖于不可靠的模型外推。在本文中,我们开发了一个新的因果推断框架来应对这一挑战。我们将指数倾斜扩展到多元暴露,并解决了如何公平比较不同干预方向的关键问题。这建立了一个系统框架,用于定义和评估各种政策相关的因果估计量,使研究人员能够解决不同的科学问题。我们开发了许多方法论进展,包括高效的一步估计策略、用于求解约束流形优化问题的黎曼BFGS算法、因果估计量的半参数效率界、估计量的极小极大速率,并建立了渐近正态性。我们通过将框架应用于全国环境健康数据集来展示其实用性,以确定减少与PM$_{2.5}$化学混合物相关的不良健康结果的最优策略。

英文摘要

Evaluating the causal health effects of multivariate, continuous exposures, such as air pollution mixtures, is a critical public health challenge. A primary obstacle is the frequent violation of the positivity assumption, which renders the effects of standard deterministic interventions unidentified or heavily reliant on unreliable model extrapolation. In this paper, we develop a novel causal inference framework to address this challenge. We extend exponential tilting to multivariate exposures and address the critical question of how to compare different intervention directions fairly. This establishes a systematic framework for defining and evaluating various policy-relevant causal estimands, allowing researchers to address diverse scientific questions. We develop numerous methodological advancements, including efficient one-step estimation strategies, a Riemannian BFGS algorithm to solve a constrained manifold optimization problem, semiparametric efficiency bounds for causal estimands, minimax rates for estimators, and asymptotic normality results. We demonstrate our framework's utility by applying it to a nationwide environmental health dataset to identify the optimal strategy for reducing adverse health outcomes associated with a PM$_{2.5}$ chemical mixture.

2410.00903 2026-06-12 stat.AP cs.CL cs.LG 版本更新

Causal Inference with Generative Artificial Intelligence: Application to Texts as Treatments

基于生成式人工智能的因果推断:以文本作为处理变量

Kosuke Imai, Kentaro Nakamura

Affiliations * Harvard University; John F. Kennedy School of Government

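AI summary Proposes using generative AI such as large language models to generate treatments and using its internal representation for causal effect estimation, which removes the need to learn causal representations from the data and improves the accuracy and efficiency of the estimates.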

详情
AI中文摘要

在本文中,我们展示了如何利用生成式人工智能(GenAI)的力量,增强以文本等高维非结构化数据作为处理变量时的因果推断有效性。具体而言,我们提出使用深度生成模型(如大语言模型,LLMs)高效地生成处理变量,并利用其内部表示进行后续的因果效应估计。我们表明,了解这种真实内部表示有助于将感兴趣的处理特征(如特定情感和某些主题)与其他可能未知的混淆特征分离开来。与现有方法不同,所提出的GenAI驱动推断(GPI)方法无需从数据中学习因果表示,因此能产生更准确和高效的估计。我们正式建立了非参数识别平均处理效应所需的条件,提出了一种避免重叠假设违反的估计策略,并通过应用双重机器学习推导了所提出估计量的渐近性质。最后,利用工具变量方法,我们将所提出的GPI方法扩展到处理特征基于人类感知的场景。GPI也适用于文本复用,即使用LLM重新生成现有文本。我们进行了模拟和实证研究,使用开源LLM Llama 3生成的文本数据,展示了我们的估计器相对于最先进的因果表示学习算法的优势。

英文摘要

In this paper, we demonstrate how to enhance the validity of causal inference with unstructured high-dimensional treatments like texts, by leveraging the power of generative Artificial Intelligence (GenAI). Specifically, we propose to use a deep generative model such as a large language model (LLM) to efficiently generate treatments and use its internal representation for subsequent causal effect estimation. We show that the knowledge of this true internal representation helps disentangle the treatment features of interest, such as specific sentiments and certain topics, from other possibly unknown confounding features. Unlike existing methods, the proposed GenAI-Powered Inference (GPI) methodology eliminates the need to learn causal representation from the data, and hence produces more accurate and efficient estimates. We formally establish the conditions required for the nonparametric identification of the average treatment effect, propose an estimation strategy that avoids the violation of the overlap assumption, and derive the asymptotic properties of the proposed estimator through the application of double machine learning. Finally, using an instrumental variables approach, we extend the proposed GPI methodology to the settings in which the treatment feature is based on human perception. The GPI is also applicable to text reuse where an LLM is used to regenerate existing texts. We conduct simulation and empirical studies, using the generated text data from an open-source LLM, Llama 3, to illustrate the advantages of our estimator over state-of-the-art causal representation learning algorithms.

2111.08157 2026-06-12 econ.EM math.ST stat.ME stat.TH 版本更新

Fine Stratification of Survey Experiments

调查实验的精细分层

Max Cytrynbaum

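AI summary Studies a two-stage model of experimentation with fine stratification via matched $k$-tuples randomization, develops a fast matching algorithm, shows that the design reduces the variance of treatment effect estimation, and provides inference methods that fully exploit the design's efficiency gains.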

详情
AI中文摘要

本文研究了一个两阶段实验模型,其中研究者首先从符合条件的池中抽样具有代表性的实验参与者,然后使用匹配的$k$元组随机化将每个抽样单元分配到处理组或对照组。为了实现这种设计,我们开发了一种快速的新算法,用于将单元匹配成$k$元组,适用于任意$k \ge 2$和任意维度的协变量。通过调查200篇近期实验工作论文,我们估计该算法新近实现了多变量精细分层,并为经济学中约44%的实验提供了可证明的匹配质量保证。我们表明,精细分层抽样和分配都非参数地降低了处理效应估计的方差,其中分层抽样的收益随着合格池的大小以及协变量预测处理效应异质性的程度而增加。我们开发了新的推断方法,充分利用两个设计阶段的效率提升,允许研究者报告更小的标准误,如果他们设计了代表性实验。对九个已发表实验的应用量化了效率提升。

英文摘要

This paper studies a two-stage model of experimentation, where the researcher first samples representative experimental participants from an eligible pool, then assigns each sampled unit to treatment or control, using matched $k$-tuples randomization at both stages. To implement such designs, we develop a fast new algorithm for matching units into $k$-tuples for any $k \ge 2$ and any dimension of covariates. By surveying 200 recent experimental working papers, we estimate that our algorithm newly enables multivariate fine stratification with provable match quality guarantees for about 44% of experiments in economics. We show that finely stratified sampling and assignment both nonparametrically reduce the variance of treatment effect estimation, with the gains from stratified sampling increasing in the size of the eligible pool and how well covariates predict treatment effect heterogeneity. We develop new inference methods that fully exploit the efficiency gains from both design stages, allowing researchers to report smaller standard errors if they designed a representative experiment. An application to nine published experiments quantifies the efficiency gains.

4. Time Series and Spatial Statistics (2 papers)

2606.13615 2026-06-12 math.PR stat.ME 新提交

Data-driven subsampling rates for diffusion parameter estimation of SDEs

数据驱动的扩散参数估计子采样率选择

Felix Lindner, Andre Schmeißer, Felipe Trolldenier, Raimund Wegener

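AI summary Proposes an automated method for selecting subsampling rates based on monotone-run statistics, ensuring that the subsampled data are consistent with the SDE model on an infinitesimal scale without relying on an asymptotic framework of multiscale diffusions.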

Comments 30 pages, 11 figures

详情
AI中文摘要

我们研究随机微分方程(SDE)模型中扩散参数估计的问题,其中数据和模型仅在尚未确定的特定尺度上兼容。我们引入一种简单有效的方法,用于选择合适的速率对给定的时间序列数据进行子采样,以确保子采样数据的统计结构与SDE模型在无穷小尺度上的行为一致。我们的方法基于分析子采样数据序列中单调递增或递减段(称为单调游程)的长度统计。作为分析基础,我们证明对于一大类具有加性噪声的SDE,在无穷小尺度上单调游程的长度近似服从成功概率为$1/2$的几何分布。利用这一通用特征,我们推导出一种自动化方法,用于为给定的时间序列数据选择合适的子采样率,该方法可直接应用于实际场景,且不依赖于多尺度扩散的渐近框架。通过一个工业数学应用——非织造纺织品生产过程中纤维铺放曲线的替代模型——展示了该方法。

英文摘要

We study the problem of diffusion parameter estimation for stochastic differential equation (SDE) models in scenarios where data and model are compatible only on specific scales that have yet to be determined. We introduce a simple and efficient method for selecting suitable rates at which given time series data should be subsampled in order to ensure that the statistical structure of the subsampled data is consistent with the behavior of the SDE model on an infinitesimal scale. Our approach is based on analyzing the statistics of the lengths of monotonically increasing or decreasing segments in the subsampled data sequence, which we refer to as monotone runs. As an analytical foundation, we prove for a large class of SDEs with additive noise that the lengths of monotone runs at an infinitesimal scale are approximately geometrically distributed with success probability $1/2$. This universal characterization is employed to derive an automated method for selecting appropriate subsampling rates for given time series data that is directly applicable in real-world scenarios and does not rely on an asymptotic framework of multiscale diffusions. The approach is demonstrated using an application from industrial mathematics concerning surrogate models for fiber lay-down curves in production processes of nonwoven textiles.
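
The geometric run-length property stated in the abstract above can be checked numerically in the simplest case. The sketch below simulates a plain Brownian motion (an assumption: it is the most basic additive-noise SDE, chosen only for illustration), subsamples it, and tabulates the lengths of monotone runs. It illustrates the property only; the authors' rate-selection procedure is not given in the abstract and is not reproduced here. All function names, the step size, and the subsampling rate are hypothetical.

```python
import random


def brownian_path(n_steps, dt, sigma, seed):
    """Simulate a driftless Brownian path with independent Gaussian increments."""
    rng = random.Random(seed)
    x, path = 0.0, [0.0]
    for _ in range(n_steps):
        x += sigma * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def monotone_run_lengths(path):
    """Lengths of maximal runs of consecutive increments sharing one sign."""
    signs = [1 if b > a else -1 for a, b in zip(path, path[1:]) if b != a]
    runs, current = [], 1
    for prev, nxt in zip(signs, signs[1:]):
        if nxt == prev:
            current += 1
        else:
            runs.append(current)
            current = 1
    # The final run is dropped because the end of the data truncates it.
    return runs


def run_length_frequencies(runs, max_len):
    """Empirical frequencies of run lengths 1..max_len (zeros if no runs)."""
    if not runs:
        return [0.0] * max_len
    return [sum(1 for r in runs if r == k) / len(runs) for k in range(1, max_len + 1)]


path = brownian_path(n_steps=200_000, dt=1e-3, sigma=1.0, seed=0)
subsampled = path[::5]  # keep every 5th observation (illustrative rate)
runs = monotone_run_lengths(subsampled)
freqs = run_length_frequencies(runs, max_len=3)
mean_length = sum(runs) / len(runs)
```

For a geometric distribution with success probability $1/2$, the run lengths 1, 2, 3 have probabilities $1/2$, $1/4$, $1/8$ and the mean length is 2, so `freqs` and `mean_length` should land close to those values here. A marked departure from this pattern at a given subsampling rate would suggest that the data at that scale do not behave like the SDE model.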

2606.12836 2026-06-12 physics.data-an q-bio.QM stat.ME 新提交

Interpretable model-free inference of parametric variation across time-series data through large-scale feature extraction

通过大规模特征提取进行时间序列数据参数变化的可解释无模型推断

Ben D. Fulcher, Carl H. Lubba, Giorgio F. Gilestro, Simon R. Schultz, Nick S. Jones

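AI summary Proposes an unsupervised, data-driven method that uses a library of over 7000 time-series features to infer the dimensionality and nature of parametric variation in an unknown generative process from time-series data, without specifying or fitting a model.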

详情
AI中文摘要

这里我们解决了直接从时间序列数据中估计未知生成过程中参数变化的维度和性质的问题,无需指定或拟合模型。特别地,我们假设时间序列集合中的实例间变化是由生成模型中的参数变化引起的。我们假设,给定一个足够大的时间序列特征库,低维参数变化将表现为特征空间中的低维结构,从而可以构建潜在自由度的可解释估计量。我们使用一个包含超过7000种多样且可解释的时间序列统计量的特征库,以及13个具有已知参数变化的模拟系统(涵盖线性随机过程、非线性振荡器和混沌动力学)来测试我们的假设。我们的无监督数据驱动方法通常能在这广泛的模拟动力系统范围内重建潜在的参数变化,同时为每个潜在维度生成可解释的估计量。应用于1143只果蝇的运动动力学,我们使用该方法提取了对应于性别和昼夜节律的生物意义成分。我们的结果为急需的数据驱动方法铺平了道路,以弥合动力学的可解释理论理解与表征现代科学问题的大规模复杂数据集之间的差距。

英文摘要

Here we address the problem of estimating the dimensionality and nature of parametric variation in an unknown generative process directly from time-series data, without specifying or fitting a model. In particular we suppose that inter-instance variation in collections of time series is caused by parametric variation in the generating model. We hypothesize that, given a sufficiently large library of time-series features, low-dimensional parametric variation will manifest as low-dimensional structure in feature space, enabling interpretable estimators of the underlying degrees of freedom to be constructed. We test our hypothesis using a library of over 7000 diverse and interpretable time-series statistics and thirteen simulated systems with known parametric variation, spanning linear stochastic processes, nonlinear oscillators, and chaotic dynamics. Our unsupervised, data-driven approach often reconstructs the underlying parametric variation across this extensive range of simulated dynamical systems while also yielding interpretable estimators for each underlying dimension. Applied to the movement dynamics of 1143 fruit flies, we use this method to extract biologically meaningful components corresponding to sex and circadian rhythmicity. Our results pave the way for much-needed data-driven methods to bridge the gap between interpretable theoretical understanding of dynamics and the large and complex datasets that characterize modern scientific problems.

5. Computational Statistics and MCMC (13 papers)

2606.13213 2026-06-12 stat.ME stat.ML 新提交

Calibrating simplified vine copulas with a noise contrastive estimation approach

使用噪声对比估计方法校准简化藤蔓连接函数

Michael Denis Kraus, David Huk, Claudia Czado

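AI summary To address possible misspecification of simplified vine copulas when conditional dependence varies markedly, proposes a calibration strategy based on observation-specific correction factors that uses noise contrastive estimation (NCE) for local adjustment and improves model accuracy.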

Comments Preprint

详情
AI中文摘要

藤蔓连接函数提供了一个灵活的框架,仅使用二元构建块对复杂的多元依赖结构进行建模。它们的实际成功在很大程度上依赖于简化假设,该假设限制条件对连接函数独立于特定的条件值。虽然这一假设极大地促进了估计,但在条件依赖变化显著的应用中可能导致模型误设。我们提出了一种基于观测特定校正因子的简化藤蔓连接函数模型的新校准策略。这些因子使用噪声对比估计(NCE)推导,这是一种用于密度估计的监督学习技术,将问题重新定义为二元分类任务,并具有易于采样的噪声分布。将拟合的简化藤蔓连接函数视为噪声模型,NCE方法为单个观测提供校正的对数似然估计,从而局部地将简化藤蔓向底层数据生成依赖结构调整。模拟研究表明,所提出的校准提供了合理有效的调整,在简化假设被违反时提高了模型准确性,而在简化模型充分时保持中性。两个实际数据应用进一步说明了该方法的实际益处。结果凸显了基于NCE的校准作为增强简化藤蔓连接函数模型而不放弃其计算可处理性的有前途工具。

英文摘要

Vine copulas provide a flexible framework for modeling complex multivariate dependence structures using only bivariate building blocks. Their practical success relies heavily on the simplifying assumption, which restricts conditional pair copulas to be independent of the specific conditioning values. While this assumption greatly facilitates estimation, it may lead to model misspecification in applications with pronounced varying conditional dependence. We propose a novel calibration strategy for simplified vine copula models based on observation-specific correction factors. These factors are derived using noise contrastive estimation (NCE), a supervised learning technique for density estimation that reframes the problem as a binary classification task with an easily sampled noise distribution. Treating the fitted simplified vine copula as the noise model, the NCE approach yields corrected log-likelihood estimates for individual observations, thereby locally adjusting the simplified vine toward the underlying data-generating dependence structure. Simulation studies demonstrate that the proposed calibration provides sensible and effective adjustments, improving model accuracy when the simplifying assumption is violated while remaining neutral when the simplified model is adequate. Two real-data applications further illustrate the practical benefits of the method. The results highlight NCE-based calibration as a promising tool to enhance simplified vine copula models without abandoning their computational tractability.

2606.12857 2026-06-12 stat.ME stat.CO 新提交

Discrepancy Modeling with Intermediate Variables: A New Framework for Robust Gaussian Process Calibration

带中间变量的差异建模:鲁棒高斯过程校准的新框架

Henry Shaowu Yuchi, Michael Grosskopf, Aman Sharma, Nicolas Schunck, Jared O'Neal, Matt Menickelly, Stefan M. Wild

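AI summary Proposes a robust Gaussian process calibration framework that uses intermediate variables for discrepancy modeling; it combines structured variable selection, a discretized scaled Gaussian stochastic process (S-GaSP) constraint, and a space-filling design to model the emulator and discrepancy jointly, improving predictive performance and alleviating identifiability issues.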

详情
AI中文摘要

高斯过程广泛用于计算机实验中的代理建模,这些实验通常产生大量中间变量,但在标准校准框架中未明确使用。如果不利用这些变量,校准不完美模型可能具有挑战性,而分别拟合仿真器和差异模型也会带来可辨识性问题。在这项工作中,我们提出了一种鲁棒的高斯过程校准框架,利用中间变量进行差异建模。该框架集成了结构化的中间变量选择过程、离散化缩放高斯随机过程(S-GaSP)来约束差异项,以及用于选择约束点的空间填充设计策略。这使得仿真器和差异的联合建模成为可能,提高了预测性能,提供了原则性的不确定性量化,并减轻了可辨识性风险。我们在涉及结合能的核物理应用中证明了其有效性,其性能优于基线方法。

English abstract

Gaussian processes are widely used for surrogate modeling in computer experiments, which often produce numerous intermediate variables that are not explicitly used in standard calibration frameworks. Calibration of imperfect models can be challenging without leveraging these variables, while fitting the emulator and the discrepancy models separately also poses identifiability issues. In this work, we propose a robust Gaussian process calibration framework that leverages intermediate variables for discrepancy modeling. The framework integrates a structured intermediate variable selection process, a discretized scaled Gaussian stochastic process (S-GaSP) to constrain the discrepancy term, and a space-filling design strategy for selecting constraint points. This enables joint modeling of the emulator and discrepancy, improving predictive performance, providing principled uncertainty quantification, and alleviating identifiability risks. We demonstrate its efficacy on a nuclear physics application involving binding energies, where it outperforms baseline approaches.

2606.12596 2026-06-12 stat.ME New submission

Extending Prais-Winsten Regression to Panel Data with Higher-Order Autoregressive Errors: A Simulation Study

将Prais-Winsten回归扩展到具有高阶自回归误差的面板数据:一项模拟研究

Ariel Linden

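AI summary Extends the Prais-Winsten AR(k) GLS transformation to panel data, implements it in the Stata package xtpraisk, and validates its statistical properties by Monte Carlo simulation. xtpraisk achieves higher power than xtscc while maintaining near-nominal Type I error rates, and it is robust to misspecification of the autoregressive order.

AI Chinese abstract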

我们将Prais-Winsten AR(k)广义最小二乘(GLS)变换扩展到Beck-Katz面板校正标准误(PCSE)框架内的面板数据,并在社区贡献的Stata包xtpraisk中实现了该方法。作为Prais-Winsten的面板扩展,xtpraisk是xtscc(Newey-West的面板扩展和Driscoll-Kraay估计量的实现)的自然比较对象。我们进行了蒙特卡洛模拟以验证xtpraisk的统计性质,并将其有限样本性能与xtscc进行比较。模拟涵盖了自回归阶数1-3、三种自相关情景、三种面板规模、六种序列长度和五种效应大小,每种条件进行2000次重复。在所有条件下,xtpraisk在保持接近名义第一类错误率、置信区间覆盖率和标准误校准的同时,实现了比xtscc更高的功效。相比之下,xtscc在短序列长度下表现出系统性的标准误低估和第一类错误膨胀,且这两种缺陷随着自回归阶数的增加而恶化。两种估计量基本上无偏。自回归阶数的误设不会降低xtpraisk的推断性能,而面板间相关性和面板规模对任一估计量的相对性能影响可忽略。结果表明,当统计效率和有效推断均为优先考虑时,尤其是在持久的高阶自相关和短到中等序列长度下,xtpraisk更优。

English abstract

We extend the Prais-Winsten AR(k) generalized least squares (GLS) transformation to panel data within the Beck-Katz panel-corrected standard error (PCSE) framework and implement the method in the community-contributed Stata package xtpraisk. As the panel extension of Prais-Winsten, xtpraisk is the natural comparator to xtscc, the panel extension of Newey-West and implementation of the Driscoll-Kraay estimator. We conduct a Monte Carlo simulation to validate the statistical properties of xtpraisk and compare its finite-sample performance with xtscc. The simulation spans autoregressive orders 1-3, three autocorrelation scenarios, three panel sizes, six series lengths, and five effect sizes, with 2,000 replications per condition. Across all conditions, xtpraisk achieved higher power than xtscc while maintaining near-nominal Type I error rates, confidence interval coverage, and standard error calibration. In contrast, xtscc exhibited systematic standard error underestimation and inflated Type I error at short series lengths, with both deficiencies worsening as autoregressive order increased. Both estimators were essentially unbiased. Misspecification of the autoregressive order did not degrade xtpraisk's inferential performance, and cross-panel correlation and panel size had negligible effects on the relative performance of either estimator. The results indicate that xtpraisk is preferable when both statistical efficiency and valid inference are priorities, particularly under persistent higher-order autocorrelation and short to moderate series lengths.
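For orientation, the classical AR(1) case of the Prais-Winsten transformation can be sketched as below. This is only an illustration for a single series with a known autocorrelation `rho`; the function name `prais_winsten_ar1` is hypothetical, and the sketch is not the xtpraisk implementation, which according to the abstract handles AR(k) errors and panel data.

```python
import math


def prais_winsten_ar1(series, rho):
    """Apply the AR(1) Prais-Winsten transform to one series (illustrative).

    The first observation is rescaled by sqrt(1 - rho^2) rather than dropped
    (dropping it would give the Cochrane-Orcutt transform); every later
    observation is quasi-differenced against its predecessor.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if not series:
        return []
    out = [math.sqrt(1.0 - rho * rho) * series[0]]
    for prev, cur in zip(series, series[1:]):
        out.append(cur - rho * prev)
    return out
```

In a regression setting the same transform would be applied to the response and to every regressor column, including the intercept, before running ordinary least squares on the transformed data.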

2606.13234 2026-06-12 stat.CO cs.NA math.NA math.ST stat.TH New submission

Switching Hamiltonian Monte Carlo for sampling from mixture distributions

切换哈密顿蒙特卡洛方法用于混合分布采样

A. Sharma

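AI summary Proposes a switching Hamiltonian Monte Carlo method that combines symmetric numerical integrators with Poisson jumps to sample from finite mixtures of Boltzmann-Gibbs distributions, and proves geometric ergodicity and second-order bias.

AI Chinese abstract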

我们提出了一种切换哈密顿蒙特卡洛方法,用于从有限混合玻尔兹曼-吉布斯分布中采样。我们提出了对称数值积分器来近似与泊松跳跃交织的切换哈密顿动力学,其中状态切换链使用均匀化技术或随机模拟算法进行模拟。我们证明了所得马尔可夫链的几何遍历性。我们开发了一种基于与数值方案相关的离散泊松方程的方法,用于估计计算遍历平均值的误差。使用这种方法,我们证明了所提出的数值积分器具有二阶偏差。该方法简单且可推广到其他设置,例如动力学朗之万方程。最后,我们通过数值实验验证了收敛结果。

English abstract

We introduce a switching Hamiltonian Monte Carlo method for sampling from finite mixture Boltzmann-Gibbs distributions. We propose symmetric numerical integrators to approximate switching Hamiltonian dynamics interlaced with Poisson jumps, where the regime-switching chain is simulated using the uniformization technique or the stochastic simulation algorithm. We prove geometric ergodicity of the resulting Markov chain. We develop an approach based on the discrete Poisson equation associated with numerical schemes to estimate the error in computing ergodic averages. Using this approach, we prove that the proposed numerical integrators have second-order bias. This approach is simple and can be generalized to other settings, for example, kinetic Langevin equations. Finally, we verify the convergence result via numerical experiments.
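As background on what a symmetric integrator looks like, here is a minimal sketch of the standard leapfrog (Störmer-Verlet) scheme for a one-dimensional Hamiltonian H(q, p) = U(q) + p^2/2. It is not the switching integrator of the paper: there are no Poisson jumps and no regime switching, and the name `leapfrog` and its arguments are illustrative assumptions.

```python
def leapfrog(q, p, grad_u, step, n_steps):
    """Leapfrog integration of H(q, p) = U(q) + p^2 / 2 in one dimension.

    grad_u is the derivative of the potential U. The scheme is symmetric:
    integrating forward, flipping the momentum, and integrating again
    returns to the starting point (up to floating-point error).
    """
    p = p - 0.5 * step * grad_u(q)        # initial half step in momentum
    for i in range(n_steps):
        q = q + step * p                  # full step in position
        if i < n_steps - 1:
            p = p - step * grad_u(q)      # full step in momentum
    p = p - 0.5 * step * grad_u(q)        # final half step in momentum
    return q, p
```

For example, with the harmonic potential U(q) = q^2/2 one can pass `grad_u=lambda q: q` and check that the energy drifts only slightly over a trajectory.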

2606.12694 2026-06-12 cs.DS cs.LG math.PR stat.ML New submission

A unified complexity bound for logconcave sampling

对数凹采样的统一复杂度界

Yunbum Kook, Santosh S. Vempala

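Affiliations * University of Texas at Austin

AI summary Using the In-and-Out algorithm with exponential lifting, this paper gives a simple, unified, and nearly tight bound for sampling arbitrary logconcave distributions from a warm start. The main novelty is an improved bound on the Poincaré constant of the lifted distribution.

Comments 5 pages

AI Chinese abstract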

我们给出了一个简单、统一且近乎紧的界,用于从热启动使用In-and-Out算法结合指数提升采样任意对数凹分布。分析中的主要新成分是提升了提升分布的Poincaré常数界。因此,得到的收敛率对于约束设置(例如,限制在凸体上的高斯分布)和良条件设置(例如,强对数凹且光滑的密度)都是近乎紧的。

English abstract

We give a simple, unified, and nearly tight bound for sampling arbitrary logconcave distributions from a warm start using the In-and-Out algorithm along with exponential lifting. The main new ingredient in the analysis is an improved bound on the Poincaré constant of a lifted distribution. As a consequence, the resulting convergence rate is nearly tight for both constrained settings (e.g., Gaussian restricted to a convex body) and well-conditioned settings (e.g., strongly logconcave and smooth densities).

2606.13063 2026-06-12 math.NA cs.NA stat.ML New submission

A Quadratic Order Reduction -- Gaussian Process Ordinary Differential Equation framework for the inference of Large Continuous Dynamical Systems

二次降阶——高斯过程常微分方程框架用于大规模连续动力系统的推断

Guglielmo Padula, Michele Girfoglio, Gianluigi Rozza

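AI summary Proposes a framework that combines Gaussian processes with quadratic order reduction for accurate, stable forecasting of complex dynamical systems with uncertainty quantification.

Comments 49 pages, 11 figures

AI Chinese abstract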

预测复杂动力系统的演化仍然是一项根本性的挑战任务,主要由于显著的非线性相互作用、高维状态空间以及对严格可靠的不确定性量化的同时需求。当代降阶建模(ROM)框架通常在预测精度、数值稳定性和可解释性之间表现出固有的权衡,因此往往无法在这些相互竞争的目标之间达到最优平衡。为了解决这些限制,我们提出了一种基于高斯过程和二次模型降阶的核自洽常微分方程方法,用于预测复杂动力系统。我们的基础方法,高斯过程常微分方程模型,允许带有不确定性量化的精确短期预测,并且在光滑情况下可证明收敛到真实的自洽方程。我们将其与二次降阶建模和球面投影相结合,以高效学习潜在动力学并保持稳定性。数值实验表明,我们的完整模型在精度或计算成本方面优于扩展动态模式分解、Bagging优化动态模式分解以及线性和非线性去混叠优化等ROM预测方法。这些结果证明了该框架作为具有严格不确定性量化的复杂动力系统预测的稳健且稳定工具的潜力。

English abstract

Forecasting the evolution of complex dynamical systems remains a fundamentally challenging task, primarily due to pronounced nonlinear interactions, high-dimensional state spaces, and the concomitant requirement for rigorous and reliable uncertainty quantification. Contemporary reduced-order modelling (ROM) frameworks frequently exhibit inherent trade-offs among predictive accuracy, numerical stability, and interpretability, and thus often fail to achieve an optimal balance among these competing objectives. To address these limitations, we propose a framework for forecasting complex dynamical systems via a kernel autonomous ordinary differential equation approach based on Gaussian Processes and Quadratic Order Model Reduction. Our base method, the Gaussian Process Ordinary Differential Equations model, allows accurate short-term forecasting with uncertainty quantification, and it provably converges to the real autonomous equation in the smooth case. We integrate it with quadratic order reduced-order modelling and sphere projection for learning the latent dynamics efficiently while preserving stability. Numerical experiments demonstrate that our full model outperforms ROM forecasting methods such as Extended Dynamic Mode Decomposition, Bagging Optimised Dynamic Mode Decomposition and Linear and Nonlinear Disambiguation Optimisation in terms of accuracy or computational costs. These results demonstrate the potential of the framework as a robust and stable tool for forecasting complex dynamical systems with rigorous uncertainty quantification.

2606.13245 2026-06-12 physics.comp-ph stat.ML New submission

REMAL: Residual Equilibrium Manifold Active Learning for Surrogate-Based Multidisciplinary Design Analysis

REMAL: 基于残差平衡流形主动学习的替代模型多学科设计分析

Kail Yuan, Ashwin Renganathan

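AI summary Proposes the REMAL framework, which learns a surrogate of the joint residual manifold with multitask Gaussian processes, uses entropy-based active learning to sample near the zero contour, and recovers equilibrium states by solving a nonlinear least-squares problem, reducing the cost of analyzing coupled systems at many design points.

Comments 30 pages, 16 figures

AI Chinese abstract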

耦合工程系统的多学科设计分析需要计算平衡状态,其中所有学科耦合变量相互一致。传统的固定点迭代在每个设计点单独解决此一致性问题,当学科评估成本高昂且在外环任务(如多学科设计优化、不确定性量化或数字孪生更新)中需要大量分析时,这可能会变得昂贵。本文介绍了REMAL,一种用于耦合系统的残差流形替代建模框架。该方法不是独立近似每个学科或直接学习收敛的耦合变量,而是通过多任务高斯过程模型学习联合残差流形的替代模型。基于熵的主动学习策略在不确定的零等高线区域附近选择额外的残差评估,并通过仅使用训练好的替代模型求解非线性最小二乘优化问题来恢复新设计输入的平衡状态。该方法在四个工程耦合系统基准上进行了评估:卫星模型、气动结构模型、有限元燃气轮机传热与经济模型以及带有反馈耦合的改进涡轮模型。在这些案例中,当需要在设计空间内重复评估固定点时,REMAL始终表现出成本效益。理论上,我们证明在温和假设下,REMAL的预测固定点误差是有界的。

English abstract

Multidisciplinary design analysis of coupled engineering systems requires the computation of equilibrium states in which all disciplinary coupling variables are mutually consistent. Conventional fixed-point iteration resolves this consistency problem separately at each design point, which can become expensive when disciplinary evaluations are costly and many analyses are required in outer-loop tasks such as multidisciplinary design optimization, uncertainty quantification, or digital twin updating. This paper introduces REMAL, a residual manifold surrogate modeling framework for coupled systems. Instead of approximating each discipline independently or directly learning converged coupling variables, the proposed method learns a surrogate model of the joint residual manifold via multitask Gaussian process models. An entropy-based active learning strategy selects additional residual evaluations near uncertain zero-contour regions, and equilibrium states for new design inputs are recovered by solving a nonlinear least squares optimization problem using only the trained surrogate. The method is evaluated on four engineering coupled system benchmarks: a satellite model, an aerostructural model, a finite-element gas-turbine heat-transfer and economics model, and a modified turbine model with added feedback coupling. Across these cases, REMAL is consistently cost-effective when the fixed point must be evaluated repeatedly across the design space. Theoretically, we show that, under mild assumptions, REMAL's predictive fixed-point error is bounded.

2606.13453 2026-06-12 math-ph math.MP stat.ML New submission

Rapid mixing for Gibbs measures in Riemannian manifolds

黎曼流形上吉布斯测度的快速混合

Ángela Capel, Marco Castrillón-López, Sofyan Iblisdir, Angelo Lucia, Pablo Páez-Velasco, David Pérez-García

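AI summary Analyzes Langevin dynamics on Riemannian manifolds and identifies conditions that ensure a logarithmic Sobolev inequality (rapid mixing to the Gibbs measure). The conditions involve the curvature, the inverse temperature, and escaping directions from saddle points, and they exclude barren plateaus and spurious local minima, which makes mixing times polynomial in the dimension achievable.

Comments 88 + 80 pages, 1 figure

AI Chinese abstract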

分析了黎曼流形上的Langevin动力学。确定了确保存在合适的对数Sobolev不等式(快速混合到吉布斯测度)的条件。这些条件涉及流形的曲率、逆温度、从鞍点的逃逸方向,并排除了贫瘠高原和虚假局部极小值。我们表明,当这些条件满足时,可以实现流形维度多项式的混合时间。这一结果是通过定义域和黎曼淹没像中的Langevin过程之间的关系获得的。这种关系可能具有独立的意义。

English abstract

Langevin dynamics on Riemannian manifolds is analyzed. Conditions ensuring the existence of a suitable logarithmic Sobolev inequality (rapid mixing to the Gibbs measure) are identified. These conditions involve the curvature of the manifold, the inverse temperature, escaping directions from saddle points, and exclude barren plateaus and spurious local minima. We show that when these conditions are met, mixing times polynomial in the dimension of the manifold are achievable. This result is obtained through a relation between Langevin processes in the domain and in the image of a Riemannian submersion. Such a relation can be of independent interest.

2602.03165 2026-06-12 stat.ME stat.ML Updated version

Entropic Mirror Monte Carlo

熵镜像蒙特卡洛

Anas Cherradi, Yazid Janati, Alain Durmus, Sylvain Le Corff, Yohan Petetin, Julien Stoehr

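AI summary Proposes an adaptive importance sampling method that builds efficient proposal distributions by combining global sampling with a delayed weighting mechanism, enabling effective exploration of multimodal, high-dimensional targets. The algorithm is proved to converge geometrically under mild assumptions.

AI Chinese abstract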

重要性采样是一种蒙特卡洛方法,它利用来自提议分布的加权样本设计目标分布下期望的估计量。当目标分布复杂时,例如高维空间中的多模态分布,重要性采样的效率关键取决于提议分布的选择。在本文中,我们提出了一种新的自适应方案来构建高效的提议分布。我们的算法通过将全局采样机制与延迟加权过程相结合,促进了对目标分布的有效探索。所提出的加权机制通过在提议分布与目标适应不良的区域实现快速重采样,发挥了关键作用。我们的采样算法在温和假设下被证明是几何收敛的,并通过各种数值实验进行了说明。

English abstract

Importance sampling is a Monte Carlo method that constructs estimators of expectations under a target distribution using weighted samples from a proposal distribution. When the target distribution is complex, such as multimodal distributions in high-dimensional spaces, the efficiency of importance sampling critically depends on the choice of the proposal distribution. In this paper, we propose a novel adaptive scheme for the construction of efficient proposal distributions. Our algorithm promotes efficient exploration of the target distribution by combining global sampling mechanisms with a delayed weighting procedure. The proposed weighting mechanism plays a key role by enabling rapid resampling in regions where the proposal distribution is poorly adapted to the target. Our sampling algorithm is shown to be geometrically convergent under mild assumptions and is illustrated through various numerical experiments.
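The basic weighting step described in the first sentence can be sketched as a plain self-normalized importance sampling estimator. This is the textbook, non-adaptive version and not the adaptive scheme of the paper; the function `snis_mean` and its arguments are hypothetical names chosen for illustration, and both log densities only need to be known up to additive constants.

```python
import math
import random


def snis_mean(log_target, sample_proposal, log_proposal, n, rng):
    """Self-normalized importance sampling estimate of E_target[X].

    log_target and log_proposal may omit normalizing constants, because
    the weights are divided by their own sum.
    """
    xs = [sample_proposal(rng) for _ in range(n)]
    log_w = [log_target(x) - log_proposal(x) for x in xs]
    shift = max(log_w)                     # stabilize the exponentials
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return sum(wi * xi for wi, xi in zip(w, xs)) / total
```

A quick check is to target a unit-variance Gaussian centred at 1 with a wider zero-mean Gaussian proposal; the estimate should then land close to 1, and it degrades when the proposal has lighter tails than the target.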

2601.22003 2026-06-12 stat.ML cs.LG stat.CO Updated version

Efficient Stochastic Optimisation via Sequential Monte Carlo

通过序贯蒙特卡洛实现高效随机优化

James Cuin, Davide Carbone, Yanbo Tang, O. Deniz Akyildiz

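Affiliations * University of California, Berkeley

AI summary For optimisation problems with intractable gradients, proposes replacing expensive inner sampling loops with sequential Monte Carlo (SMC) samplers to achieve efficient stochastic optimisation, and validates the approach on reward tuning of energy-based models.

Comments Accepted to ICML 2026

AI Chinese abstract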

在机器学习和统计学中,从最大边际似然估计过程到生成模型的微调,经常出现优化具有难处理梯度函数的问题。针对这类问题的随机近似方法通常需要内部采样循环来获得(有偏的)随机梯度估计,这很快会变得计算昂贵。在这项工作中,我们开发了用于优化具有难处理梯度函数的序贯蒙特卡洛(SMC)采样器。我们的方法用高效的SMC近似替代昂贵的内部采样方法,这可以带来显著的计算收益。我们为我们的方法所定义的基本递归建立了收敛结果,这些递归由SMC采样器近似。我们在各种设置下对能量模型的奖励调优展示了我们方法的有效性。

English abstract

The problem of optimising functions with intractable gradients frequently arises in machine learning and statistics, ranging from maximum marginal likelihood estimation procedures to fine-tuning of generative models. Stochastic approximation methods for this class of problems typically require inner sampling loops to obtain (biased) stochastic gradient estimates, which rapidly becomes computationally expensive. In this work, we develop sequential Monte Carlo (SMC) samplers for optimisation of functions with intractable gradients. Our approach replaces expensive inner sampling methods with efficient SMC approximations, which can result in significant computational gains. We establish convergence results for the basic recursions defined by our methodology which SMC samplers approximate. We demonstrate the effectiveness of our approach on the reward-tuning of energy-based models within various settings.

2512.23566 2026-06-12 math.DS cond-mat.stat-mech cs.LG math.OC stat.ML Updated version

From geometry to dynamics: Learning overdamped Langevin dynamics from sparse observations with geometric constraints

从几何到动力学:基于几何约束从稀疏观测学习过阻尼朗之万动力学

Dimitra Maoutsa

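Affiliations * Dimitra Maoutsa

AI summary Proposes a stochastic control framework that uses the geometry of the system's invariant density for path augmentation, recovering overdamped Langevin dynamics from data sampled sparsely in time without assuming a parametric model.

Comments 10+54 pages, 14 figures; accepted at ICML 2026. An earlier account of this work previously appeared in arXiv:2301.08102 and arXiv:2304.00423; the main methodology remains the same, and this version includes additional numerical experiments and theory.

AI Chinese abstract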

当随机系统的轨迹在时间上稀疏采样时,我们如何学习其动力学背后的规律?现有方法要么需要时间分辨的高频观测,要么依赖于仅适用于保守系统的几何论证,限制了它们能恢复的动力学范围。在这里,我们提出一个新的框架,通过将推断重新表述为随机控制问题来调和这两种观点。我们的方法使用几何驱动的路径增强,以系统不变密度的几何结构为指导,重构可能的轨迹并推断底层动力学,而不假设特定的参数模型。应用于过阻尼朗之万系统,我们的方法即使在极度欠采样数据下也能准确恢复随机动力学,在合成基准测试中优于现有方法。这项工作证明了将几何归纳偏差纳入随机系统识别方法的有效性。

English abstract

How can we learn the laws underlying the dynamics of stochastic systems when their trajectories are sampled sparsely in time? Existing methods either require temporally resolved high-frequency observations, or rely on geometric arguments that apply only to conservative systems, limiting the range of dynamics they can recover. Here, we present a new framework that reconciles these two perspectives by reformulating inference as a stochastic control problem. Our method uses geometry-driven path augmentation, guided by the geometry of the system's invariant density, to reconstruct likely trajectories and infer the underlying dynamics without assuming specific parametric models. Applied to overdamped Langevin systems, our approach accurately recovers stochastic dynamics even from extremely undersampled data, outperforming existing methods in synthetic benchmarks. This work demonstrates the effectiveness of incorporating geometric inductive biases into stochastic system identification methods.

2402.01779 2026-06-12 eess.IV cs.CV cs.LG stat.ML Updated version

Plug-and-Play image restoration with Stochastic deNOising REgularization

即插即用图像恢复:随机去噪正则化

Marien Renaud, Jean Prost, Arthur Leclaire, Nicolas Papadakis

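Affiliations * GitHub

AI summary Proposes the SNORE framework, which applies the denoiser only to images at an appropriate noise level and combines stochastic regularization with gradient descent to solve inverse problems, performing competitively with state-of-the-art methods on deblurring and inpainting tasks.

AI Chinese abstract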

即插即用(PnP)算法是一类迭代算法,通过结合物理模型和深度神经网络进行正则化来解决图像逆问题。尽管它们能产生令人印象深刻的图像恢复结果,但这些算法依赖于在迭代过程中噪声逐渐减小的图像上非标准地使用去噪器,这与最近基于扩散模型(DM)的算法形成对比,后者仅在重新加噪的图像上应用去噪器。我们提出了一种新的PnP框架,称为随机去噪正则化(SNORE),该框架仅在具有适当噪声水平的图像上应用去噪器。它基于显式的随机正则化,从而产生一种随机梯度下降算法来解决不适定逆问题。提供了该算法及其退火扩展的收敛性分析。实验上,我们证明SNORE在去模糊和修复任务上与最先进方法相比具有竞争力,无论是在定量还是定性方面。

English abstract

Plug-and-Play (PnP) algorithms are a class of iterative algorithms that address image inverse problems by combining a physical model and a deep neural network for regularization. Although they produce impressive image restoration results, these algorithms rely on a non-standard use of a denoiser on images that are less and less noisy along the iterations, which contrasts with recent algorithms based on Diffusion Models (DM), where the denoiser is applied only on re-noised images. We propose a new PnP framework, called Stochastic deNOising REgularization (SNORE), which applies the denoiser only on images with an appropriate noise level. It is based on an explicit stochastic regularization, which leads to a stochastic gradient descent algorithm to solve ill-posed inverse problems. A convergence analysis of this algorithm and its annealing extension is provided. Experimentally, we show that SNORE is competitive with state-of-the-art methods on deblurring and inpainting tasks, both quantitatively and qualitatively.

2505.14343 2026-06-12 stat.CO stat.ME stat.ML Updated version

Mixing times of data-augmentation Gibbs samplers for high-dimensional probit regression

高维probit回归的数据增强Gibbs采样器的混合时间

Filippo Ascolani, Giacomo Zanella

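AI summary For data-augmentation Gibbs samplers in Bayesian probit regression, gives explicit non-asymptotic bounds on mixing times based on recent results on Gibbs samplers for log-concave targets, and analyzes their behaviour across different statistical regimes.

AI Chinese abstract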

我们研究了贝叶斯probit回归中流行的数据增强采样器的收敛性质。利用最近关于对数凹目标分布的Gibbs采样器的结果,我们提供了相关混合时间(在Kullback-Leibler散度下)的简单且显式的非渐近界。这些界明确依赖于设计矩阵和先验精度,并且对响应向量一致成立。我们将结果专门化到不同的统计感兴趣区域,当数据点数$n$和参数$p$都很大时:特别地,我们识别了混合时间在$n,p\to\infty$时保持有界的情况,以及混合时间发散的情况。结果表明(在响应最坏情况下)是紧的,并为选择能导致快速混合的先验分布提供了指导。基于耦合技术的实证分析表明,这些界能有效预测实际观察到的行为。

English abstract

We investigate the convergence properties of popular data-augmentation samplers for Bayesian probit regression. Leveraging recent results on Gibbs samplers for log-concave targets, we provide simple and explicit non-asymptotic bounds on the associated mixing times (in Kullback-Leibler divergence). The bounds depend explicitly on the design matrix and the prior precision, while they hold uniformly over the vector of responses. We specialize the results for different regimes of statistical interest, when both the number of data points $n$ and parameters $p$ are large: in particular we identify scenarios where the mixing times remain bounded as $n,p\to\infty$, and ones where they do not. The results are shown to be tight (in the worst case with respect to the responses) and provide guidance on choices of prior distributions that provably lead to fast mixing. An empirical analysis based on coupling techniques suggests that the bounds are effective in predicting practically observed behaviours.

6. Statistical Foundations of Machine Learning (22 papers)

2606.13277 2026-06-12 stat.ML cs.LG New submission

ProtoX-AD: Self-Explainable Time Series Anomaly Detection and Characterization

ProtoX-AD:自解释的时间序列异常检测与特征描述

Aitor Sánchez-Ferrera, Elisabeth Wetzer, Kristoffer Wickstrøm, Michael Kampffmeyer, Robert Jenssen

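AI summary Proposes the ProtoX-AD framework, which uses prototype learning to make self-supervised time series anomaly detection explainable, providing semantically consistent explanations of anomaly characteristics while maintaining detection performance.

Comments 26 pages, 8 figures

AI Chinese abstract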

时间序列异常检测(TSAD)的最新进展突显了自监督分类方法的有效性。这些方法对正常训练样本应用变换,训练分类器识别变换特定模式,从而通过增加分类误差来帮助识别异常。尽管性能强大,但一个重大挑战是缺乏可解释性,因为它们对标记异常的特征提供的洞察有限。为了解决这一局限,我们提出了ProtoX-AD,一种基于原型的自解释框架,用于自监督TSAD。ProtoX-AD学习变换感知的潜在表示以及可解释的原型,从而实现准确的异常检测和通过基于原型的解释识别不同的异常轮廓。此外,它允许系统分析变换设计如何影响检测性能和可解释性。在合成和真实世界数据集上的实验结果表明,ProtoX-AD实现了与其黑盒对应物相当的检测性能,同时比现有的可解释基线提供更一致和语义上有意义的解释。我们的代码在此 https URL 公开。

English abstract

Recent advances in time series anomaly detection (TSAD) have highlighted the effectiveness of self-supervised classification-based approaches. These methods apply transformations to normal training samples, training a classifier to recognize transformation-specific patterns that help identify anomalies through increased classification errors. Despite their strong performance, a significant challenge is their lack of explainability, as they provide limited insight into the characteristics of flagged anomalies. To address this limitation, we propose ProtoX-AD, a prototype-based self-explainable framework for self-supervised TSAD. ProtoX-AD learns transformation-aware latent representations alongside interpretable prototypes, enabling both accurate anomaly detection and the identification of distinct anomalous profiles through prototype-based explanations. Additionally, it allows for systematic analysis of how transformation design impacts detection performance and explainability. Experimental results on synthetic and real-world datasets demonstrate that ProtoX-AD achieves detection performance comparable to its black-box counterparts while offering more consistent and semantically meaningful explanations than existing explainable baselines. Our code is publicly available at https://github.com/Aitorzan3/ProtoX-AD.

2606.13146 2026-06-12 stat.ML cs.LG stat.ME New submission

Robust State-Conditional Feature-Weighted Jump Models for Temporal Clustering

鲁棒的状态条件特征加权跳跃模型用于时间聚类

Federico P. Cortese, Alessio Farcomeni

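AI summary Proposes a robust feature-weighted jump model that achieves robustness through Tukey's biweight loss and introduces state-specific feature weights. It outperforms competing methods in simulations and is illustrated in two empirical applications.

AI Chinese abstract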

我们提出了一种用于时间依赖聚类的鲁棒特征加权跳跃模型。使用惩罚项来鼓励随时间平滑过渡,同时通过Tukey双权损失函数实现鲁棒性。一个额外的参数控制特征权重在不同状态间的变异性,允许模型为每个特征分配状态特定的相关性。我们在模拟中展示了该方法如何准确恢复真实聚类序列并可靠识别相关特征,特别是在存在异常值的情况下优于竞争方法。最后,我们进行了两个实证应用,一个涉及1998-2000年科索沃冲突相关杀人事件的数量,另一个涉及1949-2024年十二个欧洲国家的宏观经济表现。

English abstract

We propose a robust feature-weighted jump model for time-dependent clustering. A penalty is used to encourage smoothness of transitions over time, while robustness is achieved through Tukey's biweight loss function. An additional parameter controls the variability of feature weights across states, allowing the model to assign state-specific relevance to each feature. We illustrate in simulation how the method accurately recovers the true cluster sequence and reliably identifies relevant features, outperforming competing approaches, particularly in the presence of outliers. We conclude with two empirical applications, one on the number of conflict-related homicides in Kosovo in the period 1998-2000, and another on macroeconomic performance of twelve European countries in the period 1949-2024.
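To make the robustness mechanism concrete, the standard form of Tukey's biweight (bisquare) loss is sketched below. How the paper embeds this loss in the jump model objective is not stated in the abstract, so the function name `tukey_biweight_loss` and the default tuning constant `c = 4.685` (the conventional choice in robust regression) are assumptions for illustration only.

```python
def tukey_biweight_loss(r, c=4.685):
    """Tukey's biweight loss for a scalar residual r.

    Near zero the loss behaves like r^2 / 2; for |r| > c it is constant
    at c^2 / 6, so gross outliers stop contributing additional penalty.
    """
    if abs(r) <= c:
        u = r / c
        return (c * c / 6.0) * (1.0 - (1.0 - u * u) ** 3)
    return c * c / 6.0
```

Because the loss is flat beyond `c`, an observation ten or a hundred units away from a cluster centre is penalized equally, which is what limits the influence of outliers.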

2606.12471 2026-06-12 stat.ML cs.CL cs.ET cs.LG New submission

Identifiability Without Gaussianity: Symbolic World Models and Near-Infinite Temporal Consistency

无高斯假设的可识别性:符号世界模型与近无限时间一致性

Seth Dobrin, Łukasz Chmiel

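AI summary Proposes the Physics-Grounded Symbolic Architecture (PGSA) and proves that it achieves exact linear identifiability and near-infinite temporal consistency in non-Gaussian dynamical systems, overcoming the Gaussian boundary that limits statistical world models.

Comments Pre-print

AI Chinese abstract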

Klindt、LeCun 和 Balestriero (arXiv:2605.26379) 证明了联合嵌入预测架构(JEPA)实现线性可识别性(即线性恢复世界的真实潜在变量)当且仅当世界的潜在动态遵循高斯平稳过程。这一高斯边界意味着时间一致性的基本限制:对于任何非高斯物理系统,统计世界模型的表示误差随时间单调增长。我们证明这一限制是统计对齐机制的产物,而非世界模型的一般性质。我们引入物理基础符号架构(PGSA),并证明三个结果:(1) PGSA 对所有物理机制实现精确线性可识别性,无论潜在分布如何;(2) PGSA 的每步误差仅受数值精度限制;(3) 直接推论是,PGSA 在无界数量的转换中保持时间一致性,我们称之为近无限时间一致性。我们进一步证明,对于任何非高斯系统,统计世界模型无法实现这一性质,无论模型容量或训练数据量如何。其中四个定理的代数核心已在 Lean 4 中使用 Mathlib4 v4.31.0 形式化(零个 sorry 占位符);Klindt 等人的逆命题作为外部前提。对比表明,在世界动态的因果生成器中进行符号基础化是充分条件,并且在非高斯体制下,是实现近无限时间一致性的唯一条件。

English abstract

Klindt, LeCun, and Balestriero (arXiv:2605.26379) proved that Joint-Embedding Predictive Architectures (JEPAs) achieve linear identifiability, the linear recovery of the world's true latent variables, if and only if the world's latent dynamics follow a Gaussian, stationary process. This Gaussian boundary implies a fundamental limit on temporal consistency: for any non-Gaussian physical system, the representation error of a statistical World Model grows monotonically with time. We prove that this limit is an artifact of the statistical alignment mechanism, not a property of World Models in general. We introduce the Physics-Grounded Symbolic Architecture (PGSA) and prove three results: (1) a PGSA achieves exact linear identifiability for all physical regimes, regardless of the latent distribution; (2) the per-step error of a PGSA is bounded by numerical precision alone; and (3) as a direct consequence, a PGSA maintains temporal consistency for an unbounded number of transitions, a property we term near-infinite temporal consistency. We further prove that statistical World Models cannot achieve this property for any non-Gaussian system, regardless of model capacity or the volume of training data. The algebraic cores of four of the theorems are formalized in Lean 4 with Mathlib4 v4.31.0 (zero sorry placeholders); the Klindt et al. converse is taken as an external premise. The contrast establishes that symbolic grounding in the causal generator of the world's dynamics is the sufficient condition and, in non-Gaussian regimes, the only condition for near-infinite temporal consistency.

2606.13576 2026-06-12 cs.LG cs.CC cs.DS stat.ML New submission

Learning with Simulators: No Regret in a Computationally Bounded World

与模拟器学习:计算受限世界中的无悔学习

Sasha Voitovych, Abhishek Shetty, Noah Golowich, Alexander Rakhlin

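Affiliations * MIT; Microsoft Research

AI summary Proposes the framework of simulatable processes, in which a simulator approximates a data distribution with arbitrarily complex dependence. The paper recovers VC-dimension error bounds and shows statistical and computational advantages of conditional sampling.

Comments To appear at COLT 2026

AI Chinese abstract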

理解泛化所需的最小假设是学习理论的基本问题。不幸的是,大多数结果严重依赖于数据生成过程的独立性(或其某种代理),而强依赖数据的结果则非常有限。为填补这一空白,我们引入了可模拟过程的框架,其中学习器可以访问一个近似数据生成分布(可能是任意复杂且依赖的过程)的模拟器。令人惊讶的是,我们表明,在访问这样的模拟器的情况下,我们可以恢复与经典独立数据设置相同的学习保证,即依赖于VC维的误差界。此外,我们利用这一框架研究条件采样的能力,并展示了在这种设置下严格的统计和计算优势。作为我们框架的一个亮点,我们展示了一个单一算法,该算法同时学习所有在有限多项式时间内可采样的过程下的任意给定VC类,其遗憾由过程的时间有界Kolmogorov复杂度控制。这为经典PAC模型提供了重要的概念扩展。

English abstract

Understanding the minimal assumptions necessary for generalization is the fundamental question in learning theory. Unfortunately, most results rely heavily on independence (or some proxy thereof) of the data-generating process, while results for strongly dependent data are far more limited. Towards addressing this gap, we introduce the framework of simulatable processes, where the learner has access to a simulator that approximates the distribution generating the data (which may be an arbitrarily complex and dependent process). Surprisingly, given access to such a simulator, we show that we can recover the same learning guarantees as in the classical setting with independent data, namely, error bounds that depend on the VC dimension. Further, we use this framework to study the power of conditional sampling and show strict statistical and computational advantages in this setting. As a highlight of our framework, we exhibit a single algorithm that simultaneously learns any given VC class under all processes samplable in bounded polynomial time, with regret controlled by the time-bounded Kolmogorov complexity of the process. This provides a significant conceptual broadening of the classical PAC model.

2606.13426 2026-06-12 cs.LG stat.ML New submission

Accelerating Speculative Diffusions via Block Verification

通过块验证加速推测性扩散

Alexander Soen, Hisham Husain, Valentin De Bortoli, Arnaud Doucet

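Affiliations * KTH; Google Research; Google DeepMind

AI summary Proposes a speculative sampling scheme for diffusion models that raises the draft acceptance rate through block verification. The training-free Free Drafter achieves up to a 6.3% speedup.

AI Chinese abstract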

推测性解码通过使用草稿模型生成令牌,并采用接受-拒绝方案确保输出与目标分布匹配,从而加速LLM推理。将其适应于连续扩散是困难的,因为推测性采样需要从残差分布中采样。虽然在离散空间中直接,但在连续空间中高效采样残差并非易事。因此,现有的扩散适应要么使用计算效率低下的采样技术,要么依赖替代方案。在这项工作中,我们引入了一种新颖的方案,高效地实现了扩散模型的原始推测性采样机制。我们的方法相比现有方法具有关键优势:它使我们能够将LLM的块验证适应到扩散——这被证明可以提高草稿的接受率。此外,我们形式化并分析了Free Drafter,一种无需训练的扩散启发式自推测草稿生成器。通过启用块验证,我们的Free Drafter在无需额外训练且开销可忽略的情况下,相比现有推测性方法实现了高达6.3%的加速。

English abstract

Speculative decoding speeds up LLM inference by using a draft model to generate tokens, with an acceptance-rejection scheme that ensures that the output matches the target distribution. Adapting this to continuous diffusions is difficult because speculative sampling requires drawing from a residual distribution. While straightforward in discrete spaces, efficiently sampling this residual in continuous space is non-trivial. Consequently, existing diffusion adaptations either use computationally inefficient sampling techniques or rely on an alternative scheme. In this work, we introduce a novel scheme that efficiently implements the original speculative sampling mechanism for diffusion models. Our approach offers a critical advantage over current methods: it enables us to adapt block verification from LLMs to diffusions -- which provably improves the acceptance rate of drafts. Furthermore, we formalize and analyze the Free Drafter, a heuristic self-speculative drafter for diffusions that requires no training. By enabling block verification, our Free Drafter yields up to a 6.3% speedup over existing speculative methods with no additional training and negligible overhead beyond the existing parallel verification pass.
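The discrete-space mechanism that the abstract calls straightforward can be sketched as follows: accept a draft token with probability min(1, p/q) and otherwise redraw from the residual distribution proportional to max(p - q, 0). This is the standard token-level rule for a finite vocabulary, not the continuous-space scheme or the block verification of the paper; the function names are illustrative assumptions.

```python
import random


def speculative_sample(p, q, rng):
    """Draw one token distributed as target p using a draft drawn from q.

    p and q are probability lists over the same finite vocabulary.
    """
    x = rng.choices(range(len(q)), weights=q)[0]     # draft token from q
    if rng.random() < min(1.0, p[x] / q[x]):
        return x                                     # draft accepted
    # Draft rejected: resample from the residual, proportional to max(p - q, 0).
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    return rng.choices(range(len(p)), weights=residual)[0]


def output_distribution(p, q):
    """Exact law of speculative_sample(p, q), computed in closed form."""
    accept = [min(pi, qi) for pi, qi in zip(p, q)]   # mass accepted per token
    reject_mass = 1.0 - sum(accept)
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    z = sum(residual)
    if z == 0.0:                                     # p equals q: never rejects
        return accept
    return [a + reject_mass * r / z for a, r in zip(accept, residual)]
```

Evaluating `output_distribution(p, q)` for any pair of distributions returns `p` up to rounding, which is the sense in which the output matches the target. The difficulty named in the abstract is that the residual has no such simple form in continuous space.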

2606.12997 2026-06-12 cs.LG stat.ML New submission

Reliability of Probabilistic Emulation of Physical Systems

物理系统概率仿真的可靠性

Sam F. Greenbury, Radka Jersakova, Paolo Conti, Marjan Famili, Christopher Iliffe Sprague, Edwin Brown, Jason D. McEwen

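Affiliations * The Alan Turing Institute; Autodesk Research; PhysicsX Orbital; University of Sheffield; University College London

AI summary Compares the reliability of generative models and CRPS-trained ensembles for probabilistic emulation of physical systems, finding that CRPS-trained ensembles offer better coverage and faster inference.

AI Chinese abstract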

目前,生成物理系统概率预测的两种主要方法已经出现:生成模型(如扩散或流匹配)以及注入随机性的确定性模型集成(使用连续排序概率评分(CRPS)损失训练)。虽然这两种方法都表现出强大的预测准确性,但其不确定性的可靠性尚未得到系统评估。我们通过开发一个框架来填补这一空白,该框架在匹配模型大小和计算预算的情况下,评估这两种方法在多种二维时空物理系统中的表现。我们通过检查预测区间的经验覆盖率来评估概率仿真的可靠性,同时考虑准确性和计算效率指标。CRPS训练的集成在单步预测和自回归展开中通常能实现更可靠的不确定性,显示出比在潜在空间中训练生成模型的标准替代方案更好的覆盖率。此外,CRPS方法提供了显著更快的推理速度。当生成模型在环境空间而非压缩潜在空间中训练时(这在高维问题中通常不可行),它们表现出与CRPS训练集成相当的覆盖率,但推理延迟显著更大。相比之下,当CRPS训练的集成在潜在空间中训练时,其覆盖率相对于环境空间没有明显下降。生成模型和CRPS训练的集成都表现出良好的预测准确性。为促进未来的研究和应用,我们发布了AutoCast,一个实现生成模型和CRPS训练集成的模块化框架,以及AutoSim,一个用于快速原型的灵活数据集生成包。

English abstract

Two dominant approaches have emerged for generating probabilistic forecasts of physical systems: generative models, such as diffusion or flow matching; and ensembles of deterministic models with stochasticity injected, trained using the continuous ranked probability score (CRPS) loss. While both approaches have demonstrated strong predictive accuracy, the reliability of their uncertainties has not been systematically assessed. We address this gap by developing a framework to evaluate both approaches across diverse 2D spatiotemporal physical systems, under matched model size and computational budget. We assess the reliability of probabilistic emulation by inspecting the empirical coverage of predictive intervals, while also considering accuracy and computational efficiency metrics. CRPS-trained ensembles typically achieve more reliable uncertainties on both single-step prediction and autoregressive rollouts, demonstrating better coverage than the standard alternative of training generative models in a latent space. Moreover, the CRPS approach offers significantly faster inference. When generative models are trained in ambient space rather than in a compressed latent space, which is often infeasible for high-dimensional problems, they exhibit comparable coverage to CRPS-trained ensembles, though with substantially larger inference latency. In contrast, when CRPS-trained ensembles are trained in latent space they do not show a marked degradation in coverage with respect to ambient space. Both generative models and CRPS-trained ensembles demonstrate good predictive accuracy. To facilitate future research and application, we release AutoCast, a modular framework implementing both generative models and CRPS-trained ensembles, alongside AutoSim, a flexible dataset generation package for rapid prototyping.
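The two quantities at the centre of this comparison can be sketched for scalar outputs as below. The CRPS estimator uses the common identity CRPS = E|X - y| - 0.5 E|X - X'| with expectations taken over ensemble members; whether the authors use this variant or a bias-corrected one is not stated in the abstract, and the function names are hypothetical and unrelated to the AutoCast API.

```python
def crps_ensemble(members, y):
    """Empirical CRPS of an ensemble forecast for a scalar observation y."""
    m = len(members)
    # Mean absolute error of the members against the observation.
    term1 = sum(abs(x - y) for x in members) / m
    # Mean absolute difference between all ordered pairs of members.
    term2 = sum(abs(a - b) for a in members for b in members) / (m * m)
    return term1 - 0.5 * term2


def empirical_coverage(intervals, observations):
    """Fraction of observations inside their (lower, upper) interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, observations))
    return hits / len(observations)
```

For a single-member ensemble the CRPS reduces to the absolute error, and a well-calibrated 90% predictive interval should give an empirical coverage close to 0.9 on held-out data.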

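2606.12658 2026-06-12 cs.LG q-bio.QM stat.ML New submission

Physics-Informed Neural Networks for Chemotherapy Pharmacokinetics: Benchmarking the Clinical Estimator and Exposing Parameter Identifiability

Riya Bisht, Dhruv Agarwal

Affiliations * University of California, Berkeley

AI summary This study applies physics-informed neural networks (PINNs) to chemotherapy pharmacokinetics: on the linear two-compartment model the PINN matches the standard clinical method, on the Michaelis-Menten extension it exposes parameter non-identifiability, and sparse tissue observations partially restore identifiability.

Details
AI-generated abstract

Physics-informed neural networks (PINNs) are an attractive tool for partially observed problems in biology, where the governing dynamics are known but some compartments cannot be measured. Chemotherapy pharmacokinetics (PK) is a clear instance: the drug concentration in plasma is measured routinely, but the concentration in tissue, which determines tumour kill and off-target toxicity, cannot be measured. On two PK problems, we benchmark a PINN against the standard clinical baseline (nonlinear least squares on the analytical biexponential plasma solution, hereafter NLS) and a physics-agnostic neural baseline (a data-only MLP). On the linear two-compartment problem, NLS is near-optimal; the PINN matches its performance to within a small constant factor while also producing the tissue curve in a single training pass, whereas the data-only MLP fails on tissue by roughly 10x. On the Michaelis-Menten extension (saturable elimination), the biexponential closed form no longer exists, so NLS is mis-specified and silently returns meaningless rate constants. The PINN instead reveals a deeper fact: the Michaelis-Menten two-compartment model is non-identifiable from plasma data alone, and the PINN reports this honestly by converging to a basin with k12 -> 0. Adding two sparse tissue observations largely resolves identifiability: across five random seeds, the PINN recovers k21 to within 1% of the true value and Vmax and Km to within one standard deviation, while k12 moves in the correct direction (0.02 -> 0.82) but remains about 2 standard deviations below the true value. This is a recovery that the closed-form NLS estimator cannot attempt at all, because its biexponential ansatz describes only plasma. Our claim is not that PINNs beat NLS. Rather, PINNs provide a uniform recipe that matches the textbook estimator on the textbook problem, exposes structural identifiability that the textbook estimator hides, and absorbs heterogeneous measurements in a single loss.

English abstract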

Physics-Informed Neural Networks (PINNs) are an attractive tool for partial-observation problems in biology, where the governing dynamics are known but some compartments cannot be measured. Chemotherapy pharmacokinetics (PK) is a clean instance: drug concentration in plasma is routinely measured, but concentration in tissue -- which determines tumour kill and off-target toxicity -- is not. We benchmark a PINN against the standard clinical baseline (nonlinear least-squares on the analytical biexponential plasma solution, hereafter NLS) and a physics-agnostic neural baseline (a data-only MLP) on two PK problems. On the linear two-compartment problem, NLS is near-optimal; the PINN matches it to within a small constant factor while also producing the tissue curve in a single training pass, whereas the data-only MLP fails on tissue by roughly 10x. On a Michaelis-Menten extension (saturable elimination), the biexponential closed form no longer exists, so NLS is mis-specified and silently returns meaningless rate constants. The PINN instead exposes a deeper fact: the Michaelis-Menten two-compartment model is non-identifiable from plasma alone, and the PINN reports this honestly by converging to a basin with k12 -> 0. Adding two sparse tissue observations largely resolves identifiability: across five seeds the PINN recovers k21 to within 1% of truth and Vmax, Km to within one standard-deviation bar, while k12 moves in the correct direction (0.02 -> 0.82) but remains ~2 sigma below truth -- a recovery the closed-form NLS estimator cannot attempt at all, because its biexponential ansatz describes only plasma. Our claim is not that PINNs beat NLS. It is that PINNs offer a uniform recipe that ties the textbook estimator on the textbook problem, exposes structural identifiability that the textbook estimator hides, and absorbs heterogeneous measurements within a single loss.

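2606.13614 2026-06-12 stat.ML cs.LG math.ST stat.TH New submission

Majority-of-Three is Optimal

Divit Rawal, Nikita Zhivotovskiy

Affiliations * Department of Statistics, University of California, Berkeley

AI summary Gives a short proof that, in the realizable PAC learning setting, the majority vote of three independent consistent classifiers is an optimal learner, simplifying the algorithmic structure and probabilistic analysis of voting learners.

Comments 9 pages

Details
AI-generated abstract

We give a short proof that, in the realizable PAC learning setting, the majority vote of three independent consistent classifiers is an optimal learner. This establishes optimality of the simplest voting scheme while simplifying the algorithmic structure and probabilistic analysis of earlier voting learners, including S. Hanneke's algorithm and K. Green Larsen's analysis of bagging.

English abstract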

We give a short proof that the majority vote of three independent consistent classifiers is an optimal learner in the realizable PAC setting. This proves optimality for the simplest voting scheme, while simplifying both the algorithmic structure and the probabilistic analysis of previous voting learners, including the algorithm of S. Hanneke and the analysis of bagging by K. Green Larsen.
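
As a rough illustration of the voting scheme this abstract describes, the sketch below trains three consistent classifiers on three independently drawn samples and takes their majority vote. The threshold hypothesis class, the rule "use the smallest positive example as the threshold", and all function names are assumptions chosen for the example; the paper's result concerns general hypothesis classes and this is not its construction or proof.

```python
import random


def consistent_threshold(sample):
    """Return a classifier x -> 1 if x >= t that is consistent with `sample`.

    `sample` is a list of (x, y) pairs with y in {0, 1}. Picking t as the
    smallest positive example never misclassifies a training point when the
    labels come from some threshold function (the realizable case).
    """
    positives = [x for x, y in sample if y == 1]
    t = min(positives) if positives else float("inf")
    return lambda x: 1 if x >= t else 0


def majority_of_three(h1, h2, h3):
    """Combine three binary classifiers by majority vote."""
    return lambda x: 1 if h1(x) + h2(x) + h3(x) >= 2 else 0


def draw_sample(n, target, rng):
    """Draw n points uniformly from [0, 1) and label them with `target`."""
    return [(x, target(x)) for x in (rng.random() for _ in range(n))]


rng = random.Random(0)
target = lambda x: 1 if x >= 0.3 else 0  # hypothetical ground truth

# Three independent samples, one consistent classifier per sample.
samples = [draw_sample(50, target, rng) for _ in range(3)]
classifiers = [consistent_threshold(s) for s in samples]
vote = majority_of_three(*classifiers)

print(vote(0.1), vote(0.9))
```

For thresholds, the vote is itself a threshold classifier located at the median of the three learned thresholds, which gives some intuition for why one unlucky sample cannot dominate the combined prediction.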

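2606.12879 2026-06-12 cs.DS math.ST stat.ML stat.TH New submission

Diffusion-Network Alignment: An Efficient Algorithm and Explicit Probability Bounds

Ziao Wang, Lei Ying

AI summary Introduces the diffusion-network alignment problem, designs an efficient algorithm based on tree correlation tests, proves that matches are correct with high probability on sparse graphs, and gives explicit lower bounds on the probability that a vertex is matched correctly.

Details
AI-generated abstract

This paper studies a variant of the classic network alignment problem, called diffusion-network alignment. The goal is to align the vertices of a rooted diffusion tree with the vertices of a network, where the diffusion tree may come from a communication trace or contact tracing, and the network may be an online or offline social network. Unlike classic network alignment, in which both networks are fully observed, this model captures the information asymmetry between the two networks. To solve the problem, the paper proposes an efficient algorithm based on tree correlation tests that extracts alignment information from local neighborhoods. We analyze the algorithm's performance in the sparse-graph regime and show that, with high probability, all matched pairs are correct. In addition, for each vertex of the diffusion tree, the paper establishes an explicit lower bound on the probability that the vertex is matched correctly. These lower bounds depend on depth and increase as the vertex gets closer to the root.

English abstract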

This paper studies a variation of the classic network alignment problem, named diffusion-network alignment. The goal is to align the vertices of a rooted diffusion tree to the vertices of a network, where the diffusion tree could be from a communication trace or contact tracing, and the network could be an online or offline social network. Different from the classic network alignment where both networks are fully observed, this model captures the information asymmetry of two networks. To solve this problem, this paper presents an efficient algorithm based on tree correlation tests to extract alignment information from local neighborhoods. We analyze the performance of the algorithm in the sparse graph regime and show that with high probability, all matched pairs are correct. Furthermore, for each vertex on the diffusion tree, this paper establishes an explicit lower bound on the probability that the vertex is correctly matched. These lower bounds are depth-dependent and increase as vertices get closer to the root.

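2606.12691 2026-06-12 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML New submission

Two-Layer Linear Auto-Regressive Models Estimate Latent States

Yahya Sattar, Sunmook Choi, Leo Maynard-Zhang, Yassir Jedra, Maryam Fazel, Sarah Dean

AI summary Proves that two-layer linear auto-regressive models trained by empirical risk minimization approximate Kalman filtering and recover latent state estimates, with finite-sample guarantees.

Comments ICML 2026

Details
AI-generated abstract

Auto-regressive models have become powerful tools for sequential data, from language to video. Understanding how and why these models learn latent representations remains an open theoretical question. In this work, we show that two-layer linear auto-regressive models, when trained by empirical risk minimization on data from partially observed linear dynamical systems, naturally learn to approximate Kalman filtering. In particular, we show that the learned hidden representation agrees with the state estimates produced by the optimal (Kalman) filter up to a similarity transformation, even though the model has no explicit knowledge of the underlying dynamics or state. The result rests on three main insights. First, we establish that the Kalman filter is well approximated by an auto-regressive model with bounded truncation error. Second, we show that, despite non-convexity, the two-layer optimization landscape is benign: every stationary point is either a strict saddle or a global minimum. Finally, as our main contribution, we provide finite-sample guarantees on prediction error, parameter estimation error, and latent state recovery. Numerical simulations support the theoretical results and show that the latent representations of auto-regressive models recover the state estimates.

English abstract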

Auto-regressive models have emerged as powerful tools for sequential data, from language to video. Understanding how and why these models learn latent representations remains an open theoretical question. In this work, we demonstrate that when trained by empirical risk minimization on data from partially observed linear dynamical systems, two-layer linear auto-regressive models naturally learn to approximate Kalman filtering. In particular, we show that the learned hidden representation coincides, up to a similarity transformation, with the state estimates produced by the optimal (Kalman) filter, even though the model has no explicit knowledge of the underlying dynamics or state. The result follows from three main insights. First, we establish that the Kalman filter is well approximated by an auto-regressive model with bounded truncation error. Second, we show that despite non-convexity, the two-layer optimization landscape is benign, i.e., all stationary points are either strict saddles or global minima. Finally, as our main contributions, we provide finite-sample guarantees on prediction error, parameter estimation error, and latent state recovery. Numerical simulations support the theoretical results and demonstrate that the latent representations of auto-regressive models recover state estimates.

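2606.12646 2026-06-12 stat.ML cs.IT cs.LG math.IT New submission

Epistemic Uncertainty Is Not the Reducible Kind

Robin Young

Affiliations * University of Cambridge

AI summary Proves that the standard definition of epistemic uncertainty, as the part removable with more data, is extensionally inconsistent with the mutual-information measure, and proposes a three-part decomposition into aleatoric, sample-reducible epistemic, and mechanism-reducible epistemic uncertainty.

Details
AI-generated abstract

The standard taxonomy of predictive uncertainty defines epistemic uncertainty as the part that can be removed by collecting more data, while the standard measure equates it with a mutual-information term. We prove that this definition and this measure are extensionally inconsistent. In an explicit construction, the measure assigns all of the uncertainty to the epistemic class, yet no amount of training data can reduce it. Reducibility is instead a property of the pair (uncertainty, acquisition class), and the dichotomy resolves into three parts: aleatoric uncertainty, sample-reducible epistemic uncertainty, and mechanism-reducible epistemic uncertainty. An exact identity for the value of an observation shows that in-distribution data never reduces mechanism-irreducible uncertainty and generically increases it. Ensemble disagreement, the epistemic estimate used in deployment, tracks the training procedure rather than the epistemic term. Under consistent training it drops to zero beneath a positive true value, and under interpolation it equals hyperparameter-scaled initialization noise. A finite-sample falsification test and seed-sweep experiments confirm the theory.

English abstract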

The standard taxonomy of predictive uncertainty defines epistemic uncertainty as the part removable by collecting more data, while the standard measure identifies it with a mutual-information term. We prove the definition and the measure are extensionally inconsistent. On an explicit construction, the measure assigns all uncertainty to the epistemic class, yet no quantity of training data reduces it. Reducibility is instead a property of the pair (uncertainty, acquisition class), and the dichotomy resolves into three parts: aleatoric, sample-reducible epistemic, and mechanism-reducible epistemic uncertainty. An exact identity for the value of an observation shows that in-distribution data never reduces mechanism-irreducible uncertainty and generically increases it. Ensemble disagreement, the deployed epistemic estimate, tracks the training procedure rather than the epistemic term. It collapses to zero beneath a positive truth under consistent training, and equals hyperparameter-scaled initialization noise under interpolation. A finite-sample falsification test and seed-swept experiments confirm the theory.

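2606.13548 2026-06-12 cond-mat.mtrl-sci physics.data-an stat.ML New submission

Symmetry-electronic fingerprints reveal competing magnetic phases in two-dimensional materials

Addis Fuhr, Zachary R. Fox, David Parker, Ayana Ghosh

AI summary Proposes the symmetry-electronic fingerprint (SEF), a representation combining crystal symmetry with electronic structure; with random-forest ensemble learning it accurately classifies magnetic ordering, regresses magnetic moments and anisotropy, and identifies regions where Stoner ferromagnetism and localized superexchange compete, with model uncertainty serving to diagnose near-degenerate ferromagnetic/antiferromagnetic phases.

Details
AI-generated abstract

Two-dimensional magnets offer compelling platforms for spintronics and quantum technologies, but predicting their magnetic ground states, moments, and anisotropy remains challenging. This limitation arises mainly because existing machine-learning representations encode chemical environments without capturing the symmetry or exchange physics that govern magnetism. In this work, we introduce the symmetry-electronic fingerprint (SEF), a physically interpretable representation that encodes crystallographic symmetry operations, Wyckoff-site geometry, and site-resolved electronic structure. Combined with random-forest ensemble learning, the SEF accurately classifies magnetic ordering while regressing moments and anisotropy energies, and at the same time distinguishes the regimes of itinerant Stoner ferromagnetism and localized superexchange. What sets SEF-trained models apart is that regions of high model uncertainty are not failures but a diagnostic that identifies materials in which these mechanisms compete. First-principles calculations on Co- and Ni-based halides and oxides confirm that these regions correspond to genuinely near-degenerate FM and AFM phases with magnetic frustration, suppressed anisotropy, and emergent non-collinear ordering. By encoding symmetry and exchange physics directly in the representation, unlike conventional descriptors, the SEF turns model uncertainty into a compass pointing toward two-dimensional materials in which small perturbations drive transitions between collinear, frustrated, and non-collinear magnetic phases.

English abstract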

Two-dimensional magnets offer compelling platforms for spintronics and quantum technologies, yet predicting their magnetic ground states, moments, and anisotropy remains challenging. This limitation primarily arises because existing machine-learning representations encode chemical environments without capturing the symmetry or exchange physics that govern magnetism. In this work, we introduce the symmetry-electronic fingerprint (SEF), a physically interpretable representation that encodes crystallographic symmetry operations and Wyckoff-site geometry together with site-resolved electronic structure. Combined with ensemble learning with random forests, the SEF accurately classifies magnetic ordering and regresses moments and anisotropy energies, while simultaneously distinguishing the regimes of itinerant Stoner ferromagnetism and localized superexchange. What sets the SEF-trained models apart is that regions of elevated model uncertainty are not a failure but a diagnostic, identifying materials where these mechanisms compete. First-principles calculations on Co- and Ni-based halides and oxides confirm that these regions correspond to genuine near-degenerate FM and AFM phases with magnetic frustration, suppressed anisotropy, and emergent non-collinear ordering. By encoding symmetry together with exchange physics directly into the representation, unlike conventional descriptors, the SEF transforms model uncertainty into a compass pointing toward two-dimensional materials where small perturbations drive transitions between collinear, frustrated, or non-collinear magnetic phases.

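2606.11104 2026-06-12 cs.LG math.CA stat.ML New submission

Limitations of Learning Tanh Neural Networks with Finite Precision

Philipp Grohs, Matěj Trödler

AI summary Under finite-precision computation and $L^p$ accuracy guarantees, constructs sharply localized bump functions to prove that adaptive randomized algorithms cannot converge in the $L^p$ norm faster than the Monte Carlo rate $O(m^{-1/p})$ unless the sampling budget grows exponentially with the network parameters and architecture.

Details
AI-generated abstract

We study the limitations of learning $\tanh$ neural networks from point evaluations under finite-precision computation and $L^p$ accuracy guarantees, building on the work of Berner, Grohs, and Voigtländer (2023). Our approach is based on a novel construction of sharply localized bump functions via iterated $\tanh$ activations. Using this mechanism, we prove that, in the finite-precision setting, no adaptive randomized algorithm based on $m$ samples can achieve a convergence rate in the $L^p$ norm higher than the Monte Carlo rate $O(m^{-1/p})$, unless the sampling budget grows exponentially with the size of the network parameters and architecture. The results reveal fundamental limits that finite precision imposes on the learnability of classes containing localized bump functions, extending earlier results for ReLU networks to the $\tanh$ setting.

English abstract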

We investigate limitations of learning $\tanh$ neural networks from point evaluations under finite-precision computations and $L^p$ accuracy guarantees, building on Berner, Grohs, and Voigtländer (2023). Our approach is based on a novel construction of sharply localized bump functions via iterated $\tanh$ activations. Using this mechanism, we show that, in a finite-precision setting, no adaptive randomized algorithm based on $m$ samples can achieve a convergence rate higher than the Monte Carlo rate $O(m^{-1/p})$ in the $L^p$ norm, unless the sampling budget grows exponentially with the size of the network parameters and architecture. The results reveal fundamental limitations imposed by finite precision on the learnability of classes containing localized bump functions, extending previous results for ReLU networks to the $\tanh$ setting.

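2606.07247 2026-06-12 cond-mat.dis-nn cond-mat.stat-mech stat.ML New submission

Theory of learning of high-dimensional controlled non-linear dynamical systems (I): models and methods

Pierfrancesco Urbani

AI summary Proposes a class of theoretical models, solves the training dynamics of neural ODEs under online stochastic gradient descent via dynamical mean field theory, and derives learning curves in the high-dimensional limit.

Comments 28 pages, 2 figures

Details
AI-generated abstract

Neural ordinary differential equations (neural ODEs) have rapidly become a powerful and unifying framework for conceptualizing artificial neural networks, elegantly linking the continuous-time modeling of dynamical systems with the discrete, data-driven paradigm of modern deep learning. Beyond their practical advantages, they offer new theoretical insights into the training and generalization properties of neural networks. The distinctive feature of this framework is its dual dynamical nature: inference dynamics, which govern the ODE evolution during forward computation, and training dynamics, which govern the optimization of model parameters. This makes neural ODEs a particularly suitable theoretical framework for studying a variety of settings, such as multi-layer neural networks (for example, ResNets), autoregressive models (with next-token generation dynamics), generative models, and recurrent neural networks in theoretical neuroscience. In this work, we introduce a theoretically grounded class of models for studying neural ODEs trained by online stochastic gradient descent. We solve the training dynamics of these models via dynamical mean field theory and derive learning curves in the high-dimensional limit.

English abstract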

Neural ordinary differential equations (neural ODEs) have rapidly gained prominence as a powerful and unifying framework for conceptualizing artificial neural networks, elegantly connecting the continuous-time modeling of dynamical systems with the discrete, data-driven paradigm of modern deep learning. Beyond their practical advantages they offer fresh theoretical insights into the training and generalization properties of neural networks. The distinctive feature of this framework is its dual dynamical nature: inference dynamics, which govern the ODE evolution during forward computation, and training dynamics, which control the optimization of model parameters. This makes neural ODEs a particularly well-suited theoretical framework for studying a large variety of settings such as multi-layer neural networks (ResNets for example), autoregressive models (with next-token generation dynamics), generative models, and recurrent neural networks in theoretical neuroscience. In this work, we introduce a theoretically grounded class of models for studying neural ODEs trained via online stochastic gradient descent. We solve the training dynamics of these models via dynamical mean field theory and derive learning curves in the high-dimensional limit.

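2605.18898 2026-06-12 cs.LG stat.ML Cross-list

A Two-Parameter Weibull Framework for Diagnosing Transformer Weight Distributions

Tiexin Ding

Affiliations * Independent Researcher

AI summary Proposes a two-parameter framework based on the Weibull distribution for analyzing element-wise weight magnitude distributions in Transformers, finds experimentally that the k values of different modules have characteristic distributions, and describes how the lambda parameter changes during training.

Comments 27 pages, 14 figures. Companion library npm-weibull-py and benchmark database available at https://github.com/tiexinding/NPM-Weibull-public

Details
AI-generated abstract

We apply the Weibull distribution, a two-parameter family from extreme-value theory, as a diagnostic framework for analyzing element-wise weight magnitude distributions in Transformers. At initialization, i.i.d. Gaussian weights give |w| ~ HalfNormal, which yields k ~ 1.20 under a middle-80% probability-plot fit (the protocol used in this work). This anchor makes k a principled, architecture-independent measuring tool for training dynamics; fitting each weight matrix independently at every layer and every checkpoint enables per-component, per-layer, and per-step diagnostics that aggregate statistics cannot resolve. Applying the framework to 12 models spanning 7 architectural families (Pythia, OLMo-1/2, LLaMA-3, Mistral, Qwen2.5/3) reveals three findings. First, FFN modules and the attention output projection W_o (the Transmission Class) fall in a narrow k band: across the 12 entries, the median terminal k lies in [1.186, 1.204] (cross-family CV = 0.51%), shared across SwiGLU/GeLU activations, Pre-LN/QK-Norm placements, and sizes from 70M to 14B. Second, the attention input projections W_q, W_k (the Selection Class) depart from the Weibull family, with a severity determined by storage shape: separately stored Q/K (OLMo-1, OLMo-2) yields k in [0.76, 0.99] (deep); GQA models yield k in [1.10, 1.16] (mild); and Pythia's merged W_qkv occupies a transitional zone that tracks the training budget T/tau monotonically. Third, lambda grows substantially during training and, within the Pythia family, scales with sqrt(eta/lambda_wd) (Pearson r = 0.94, three Transmission kinds), directionally consistent with Fan et al. (2025). The two parameters carry independent information: k labels the functional class, and lambda labels training progress. We release npm-weibull-py v0.4 (a Python library) and DATABASE_v9_1 at https://github.com/tiexinding/NPM-Weibull-public.

English abstract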

We apply the Weibull distribution -- a two-parameter family from extreme-value theory -- as a diagnostic framework for element-wise weight magnitude distributions in transformers. At initialization, i.i.d. Gaussian weights give |w| ~ HalfNormal, yielding k ~ 1.20 via middle-80% probability-plot fit (the protocol used throughout this work). This anchor makes k a principled, architecture-independent measuring stick for training dynamics; fitting each weight matrix independently at every layer at every checkpoint enables per-component, per-layer, and per-step diagnostics that aggregate statistics cannot resolve. Applying this framework to 12 model entries spanning 7 architectural families (Pythia, OLMo-1/2, LLaMA-3, Mistral, Qwen2.5/3) reveals three findings. First, FFN modules and the attention output projection W_o -- the Transmission Class -- fall in a narrow k band: median terminal k in [1.186, 1.204] across 12 entries (cross-family CV = 0.51%), shared across SwiGLU/GeLU activations, Pre-LN/QK-Norm placements, and 70M-14B sizes. Second, the attention input projections W_q, W_k -- the Selection Class -- depart from the Weibull family, with severity shaped by storage: separately-stored Q/K (OLMo-1, OLMo-2) yields k in [0.76, 0.99] (deep); GQA models yield k in [1.10, 1.16] (mild); Pythia's merged W_qkv occupies a transitional zone tracking training budget T/tau monotonically. Third, lambda grows substantially during training and scales with sqrt(eta/lambda_wd) within the Pythia family (Pearson r = 0.94, three Transmission kinds), directionally consistent with Fan et al. (2025). The two parameters carry independent information: k labels the functional class, lambda labels training progress. We release npm-weibull-py v0.4 (Python library) and DATABASE_v9_1 at https://github.com/tiexinding/NPM-Weibull-public .
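
The k ~ 1.20 anchor quoted in this abstract can be sanity-checked at the population level. The sketch below fits a straight line to the Weibull probability plot, ln(-ln(1 - p)) against ln(q_p), using exact half-normal quantiles over the middle 80% of probabilities. It does not use npm-weibull-py; the grid of 81 probabilities, the ordinary least-squares fit, and the function name are assumptions about what a "middle-80% probability-plot fit" might look like, not the author's protocol.

```python
import math
from statistics import NormalDist


def weibull_fit_halfnormal(lo=0.10, hi=0.90, steps=81):
    """Fit Weibull (k, lam) to half-normal quantiles on a probability plot.

    On a Weibull plot, ln(-ln(1 - F(x))) = k * ln(x) - k * ln(lam), so the
    slope of a least-squares line estimates the shape k.
    (Hypothetical helper; grid and fitting rule are illustrative choices.)
    """
    std_normal = NormalDist()
    xs, ys = [], []
    for i in range(steps):
        p = lo + (hi - lo) * i / (steps - 1)
        # Half-normal quantile: P(|w| <= q) = p  <=>  q = Phi^{-1}((1 + p) / 2).
        q = std_normal.inv_cdf((1.0 + p) / 2.0)
        xs.append(math.log(q))
        ys.append(math.log(-math.log(1.0 - p)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx
    # Intercept is -k * ln(lam), so lam = exp(mean_x - mean_y / k).
    lam = math.exp(mean_x - mean_y / k)
    return k, lam


k, lam = weibull_fit_halfnormal()
print(f"k = {k:.3f}, lambda = {lam:.3f}")
```

Under these assumptions the fitted slope should land close to 1.2, in line with the anchor the abstract reports; a different quantile grid or fitting rule would shift the value slightly.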

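2606.01172 2026-06-12 cs.LG stat.ME stat.ML Updated version

Revisiting Neural Processes via Fourier Transform and Volterra Series

Peiman Mohseni, Nick Duffield, Raymond K. W. Wong

Affiliations * University of Cambridge

AI summary Using the Volterra expansion and set Fourier convolutions, proposes two new conditional neural process models that address the limitations of existing translation-equivariant neural processes in interpretability and computational efficiency.

Details
AI-generated abstract

Modeling unknown latent functions from finite, irregularly sampled measurements is a recurring challenge in science and engineering. Neural processes (NPs), a class of probabilistic function models, are a promising solution, especially when endowed with domain-specific symmetries such as translation equivariance, which improve sample efficiency and generalization. However, existing translation-equivariant NPs face two limitations: (i) they stack generic components with non-linearities, which obscures the induced function class and limits interpretability; and (ii) convolutional designs rely on kernels with local receptive fields and require dense uniform input grids, while attention-based methods avoid these issues but scale quadratically with the number of observations. We address both problems with two contributions. First, using the Volterra expansion, we characterize continuous translation-equivariant operators as sums of higher-order convolutions, which gives analytical transparency while allowing efficient approximation by first-order convolutions. Second, we introduce set Fourier convolutions (SFConvs), a frequency-domain parameterization that operates directly on irregularly sampled points, achieves approximately global receptive fields, and scales linearly in the number of observations. Building on these ideas, we propose two conditional neural processes (CNPs): SFConvCNPs, which stack SFConv blocks with non-linearities, and SFVConvCNPs, which incorporate the Volterra formulation. Experiments on synthetic and real-world datasets demonstrate the effectiveness of our methods relative to state-of-the-art baselines.

English abstract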

Modeling unknown latent functions from finite, irregularly sampled measurements is a recurring challenge across science and engineering. Neural processes (NPs), a family of probabilistic functional models, are promising solutions -- especially when endowed with domain-specific symmetries like translation equivariance, which improve sample efficiency and generalization. Yet existing translation-equivariant NPs face two limitations: (i) they stack generic components with non-linearities, obscuring the induced function class and limiting interpretability; and (ii) convolutional designs rely on kernels with local receptive fields and require dense uniform input grids, while attention-based methods avoid these issues but scale quadratically with the number of observations. We address both with two contributions. First, using the Volterra expansion, we characterize continuous translation-equivariant operators as sums of higher-order convolutions, yielding analytical transparency while admitting efficient approximation by first-order convolutions. Second, we introduce set Fourier convolutions (SFConvs), a frequency-domain parameterization that operates directly on irregularly sampled points, achieves approximately global receptive fields, and scales linearly in the number of observations. Building on these ideas, we propose two conditional NPs (CNPs): SFConvCNPs, which stack SFConv blocks with non-linearities, and SFVConvCNPs, which integrate the Volterra formulation. Experiments on synthetic and real-world datasets demonstrate our methods' efficacy against state-of-the-art baselines.

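2605.28076 2026-06-12 stat.ML cs.NA math.NA nlin.CD physics.data-an Updated version

Diagnosing the conditional-mean barrier in scientific machine-learning surrogates

The Conditional-Mean Barrier: From Deterministic Regression to Conditional Distribution Learning

Junfeng Chen

AI summary Introduces the concept of the conditional-mean barrier, identifies it through two diagnostics (residual-feature orthogonality and the coefficient of determination), and proves that adding latent randomness forces a squared-loss predictor back to the conditional mean, so that crossing the barrier requires a loss that scores distributions.

Details
AI-generated abstract

Many problems in computational science and engineering become one-to-many mappings after coarse graining, partial observation, or inverse reconstruction: a resolved state may not determine a unique subgrid forcing, a structural descriptor may not determine a unique effective response, and a low-resolution observation may correspond to many plausible high-resolution fields. In such cases, a deterministic surrogate may learn a well-defined mathematical object and still miss application-relevant uncertainty. This tutorial develops a self-contained module centered on the conditional-mean barrier: the point at which a squared-loss predictor has reached the conditional mean and the remaining error is irreducible aleatoric variance. We give two diagnostics for locating the barrier, residual-feature orthogonality and the coefficient of determination (relative to its explained-variance ceiling), and prove that adding latent randomness to a squared-loss predictor collapses it back to the conditional mean. Crossing the barrier therefore requires a loss function that scores distributions rather than point predictions. We briefly organize common distributional objectives, including negative log-likelihood, moment and observable matching, variational objectives, adversarial divergences, and score matching, according to the feature of the conditional law that each targets. The emphasis is on the barrier itself and on a finite-data procedure for recognizing it, rather than on a survey of methods beyond it. CPU-based demonstrations on a two-branch law and a two-scale Lorenz-96 closure problem show how the diagnostics distinguish deterministic underfitting from residual distributional variability.

English abstract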

Many problems in computational science and engineering become one-to-many after coarse graining, partial observation, or inverse reconstruction: a resolved state may not determine a unique subgrid forcing, a structural descriptor may not determine a unique effective response, and a low-resolution observation may correspond to many plausible high-resolution fields. In such settings, deterministic surrogates may learn a well-defined mathematical object while still missing application-relevant uncertainty. This tutorial develops a self-contained module centered on the conditional-mean barrier: the point at which a squared-loss predictor has reached the conditional mean and the remaining error is irreducible aleatoric variance. We give two diagnostics for locating this barrier, residual-feature orthogonality and the coefficient of determination against its explained-variance ceiling, and prove that adding latent randomness to a squared-loss predictor collapses it back to the conditional mean. Crossing the barrier therefore requires a loss that scores distributions rather than point predictions. We briefly organize common distributional objectives, including negative log-likelihood, moment and observable matching, variational objectives, adversarial divergences, and score matching, by the feature of the conditional law each targets. The emphasis is the boundary itself and a finite-data procedure for recognizing it, rather than a survey of methods beyond it. CPU-based demonstrations on a two-branch law and a two-scale Lorenz-96 closure problem show how the diagnostics distinguish deterministic underfitting from residual distributional variability.
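
A toy version of the barrier described in this abstract can be reproduced in a few lines. The sketch below builds a balanced two-branch dataset in which each x is paired with both y = +x and y = -x, fits a least-squares line, and then checks that the residuals are orthogonal to the feature and that R^2 sits at its ceiling of zero. The dataset, the linear predictor, and the helper name `fit_line` are assumptions made for illustration; the paper's own diagnostics and demonstrations may be defined differently.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Two-branch law: for every x, y is +x or -x equally often, so E[y | x] = 0.
xs, ys = [], []
for i in range(1, 101):
    x = i / 100.0
    xs += [x, x]
    ys += [x, -x]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Diagnostic 1 (illustrative): residuals are uncorrelated with the feature.
orthogonality = sum(r * x for r, x in zip(residuals, xs)) / len(xs)

# Diagnostic 2 (illustrative): R^2 equals its ceiling Var(E[y|x]) / Var(y) = 0.
mean_y = sum(ys) / len(ys)
ss_res = sum(r * r for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1.0 - ss_res / ss_tot

print(slope, orthogonality, r_squared, ss_res / len(ys))
```

The fit is at the conditional mean, so no better point predictor exists, yet the residual variance is strictly positive. That remaining variance is branch uncertainty rather than underfitting, which is the case a distribution-scoring loss is meant to address.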

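2603.17527 2026-06-12 stat.ML cs.LG math.OC Updated version

Mirror Descent on Riemannian Manifolds

Jiaxin Jiang, Lei Shi, Jiyuan Tan

Affiliations * School of Mathematical Sciences, Fudan University, Shanghai 200433, China; Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai 200433, China

AI summary Generalizes mirror descent to Riemannian manifolds, proposing Riemannian Mirror Descent (RMD) and a stochastic variant via reparameterization, establishes non-asymptotic convergence guarantees, and shows that on the Stiefel manifold the framework reduces to Curvilinear Gradient Descent (CGD).

Details
AI-generated abstract

Mirror Descent (MD) is a scalable first-order method widely used in large-scale optimization, including image processing, policy optimization, and neural network training. This paper generalizes MD to optimization on Riemannian manifolds. Specifically, we develop a Riemannian Mirror Descent (RMD) framework via reparameterization and further propose a stochastic variant of RMD. We also establish non-asymptotic convergence guarantees for RMD and stochastic RMD. As an application on the Stiefel manifold, our RMD framework reduces to the Curvilinear Gradient Descent (CGD) method proposed in [26]. In addition, specializing the stochastic RMD framework to the Stiefel setting gives a stochastic extension of CGD, which effectively handles large-scale manifold optimization problems.

English abstract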

Mirror Descent (MD) is a scalable first-order method widely used in large-scale optimization, with applications in image processing, policy optimization, and neural network training. This paper generalizes MD to optimization on Riemannian manifolds. In particular, we develop a Riemannian Mirror Descent (RMD) framework via reparameterization and further propose a stochastic variant of RMD. We also establish non-asymptotic convergence guarantees for both RMD and stochastic RMD. As an application to the Stiefel manifold, our RMD framework reduces to the Curvilinear Gradient Descent (CGD) method proposed in [26]. Moreover, when specializing the stochastic RMD framework to the Stiefel setting, we obtain a stochastic extension of CGD, which effectively addresses large-scale manifold optimization problems.
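
For readers unfamiliar with the classical method that this paper generalizes, the sketch below shows ordinary mirror descent on the probability simplex with the negative-entropy mirror map, which yields the well-known exponentiated-gradient update. This is the standard Euclidean-setting algorithm, not the Riemannian variant (RMD) or CGD from the paper; the objective, step size, and function name are illustrative assumptions.

```python
import math


def entropic_mirror_descent(grad, x0, step, iters):
    """Mirror descent on the simplex with the negative-entropy mirror map.

    Each iteration applies a multiplicative update and renormalizes, which
    is the Bregman (KL) projection back onto the simplex.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Gradient step taken in the dual (log) coordinates.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return x


# Minimize the linear objective f(x) = <costs, x> over the simplex.
costs = [0.3, 0.1, 0.5]
x = entropic_mirror_descent(lambda _x: costs, [1.0 / 3.0] * 3, step=0.5, iters=200)
print([round(v, 6) for v in x])
```

The iterates stay strictly inside the simplex without any explicit constraint handling, and the mass concentrates on the cheapest coordinate. Adapting the geometry of the update to the constraint set is the idea that the Riemannian framework extends to manifolds.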

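2603.11242 2026-06-12 stat.ML cs.LG Updated version

A Unified Latent Space Disentanglement VAE Framework with Robust Disentanglement Effectiveness Evaluation

Xiaoan Lang, Md Mostafizer Rahman, Fang Liu

Affiliations * Department of Applied and Computational Mathematics and Statistics; Lucy Family Institute for Data & Society

AI summary Proposes bfVAE, a unified framework integrating several disentangled VAE methods, develops FVH-LT and DBSR-LS to evaluate disentanglement, and introduces the LSSI index to quantify latent structural separation without ground-truth generative factors.

Details
AI-generated abstract

Evaluating and interpreting latent representations, such as those of variational autoencoders (VAEs), remains a major challenge across diverse data types, especially when the ground-truth generative factors are unknown. To this end, we unify several state-of-the-art VAE methods for latent space disentanglement into one framework, bfVAE. To assess the effectiveness of a disentangled VAE model and improve latent space interpretability, we propose Feature Variance Heterogeneity via Latent Traversal (FVH-LT) and Dirty Block Sparse Regression in Latent Space (DBSR-LS). To ensure robust interpretability of the learned latent space, we develop a greedy alignment strategy (GAS) that mitigates label switching and aligns latent dimensions across runs, laying the foundation for aggregating results. We also introduce a convenient scalar latent space separation index (LSSI), based on the GAS-aligned outputs of FVH-LT and DBSR-LS, that summarizes the overall latent structural separation without knowledge of the ground-truth generative factors. We compare bfVAE with five VAE models and validate the effectiveness of FVH-LT, DBSR-LS, and LSSI on seven tabular and image datasets. Under the experimental settings we examined, bfVAE provides a more flexible disentanglement framework and achieves a more favorable overall trade-off between disentanglement and reconstruction than the benchmark VAE models; FVH-LT and DBSR-LS reliably uncover semantically meaningful, domain-relevant latent structures and generally yield consistent results; and LSSI gives an effective quantitative summary of latent structural separation.

English abstract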

Evaluating and interpreting latent representations, such as those learned by variational autoencoders (VAEs), remains a significant challenge for diverse data types, especially when ground-truth generative factors are unknown. To address this, we unify several state-of-the-art disentangled VAE approaches for latent space disentanglement into one framework -- bfVAE. To assess the effectiveness of a disentangled VAE model and enhance latent space interpretability, we propose Feature Variance Heterogeneity via Latent Traversal (FVH-LT) and Dirty Block Sparse Regression in Latent Space (DBSR-LS). To ensure robust interpretability of the learned latent space, we develop a greedy alignment strategy (GAS) that mitigates label switching and aligns latent dimensions across runs, laying the foundation for result aggregation. We also introduce a convenient scalar latent space separation index (LSSI) based on the GAS-aligned outputs of FVH-LT and DBSR-LS to summarize the overall latent structural separation without knowledge of the ground-truth generative factors. We compare bfVAE to five VAE models and validate the effectiveness of FVH-LT, DBSR-LS, and LSSI on seven tabular and image datasets. Under our examined experimental settings, bfVAE provides a more flexible disentanglement framework and achieves a more favorable overall trade-off between disentanglement and reconstruction than the benchmark VAE models; FVH-LT and DBSR-LS reliably uncover semantically meaningful and domain-relevant latent structures and generally yield consistent results; and LSSI makes an effective quantitative summary of latent structural separation.

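2304.13836 2026-06-12 cs.LG cs.AI cs.CV stat.ME Updated version

On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective

Junhwa Song, Keumgang Cha, Junghoon Seo

Affiliations * KAIST

AI summary From an information-theoretic perspective, exposes a flaw in the ROAR benchmark: data-agnostic post-processing can raise ROAR scores, leading to misjudgment of how informative attribution maps are, and identifies a bias toward blurriness.

Comments Accepted at the 2026 ICML Workshop on Mechanistic Interpretability

Details
AI-generated abstract

The RemOve-And-Retrain (ROAR) benchmark is widely used to evaluate feature attribution methods, but its validity has not been fully explored from an information-theoretic perspective. We show that model- and data-agnostic post-processing of attribution maps (transformations that, by the data processing inequality, \emph{cannot} add information about the decision function) can often improve ROAR scores. This means that an improved ROAR ranking does not, by itself, show that an attribution map carries more information about the model. We attribute this failure mode to a preference for spatially blurry masks. Experiments on CIFAR-10, SVHN, and CUB-200 show a consistent association between blurriness and ROAR performance, a pattern that also appears in the ROAD variant. We provide guidelines for more cautious removal-based benchmarking, which has implications for validating mechanistic understanding of neural network internals.

English abstract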

The RemOve-And-Retrain (ROAR) benchmark is widely used to evaluate feature attribution methods, yet its validity remains underexplored from an information-theoretic perspective. We show that model- and data-agnostic post-processing of attribution maps (transformations that, by the data processing inequality, \emph{cannot} add information about the decision function) can often improve ROAR scores. This means that an improved ROAR ranking is not, by itself, evidence that an attribution map carries more information about the model. We trace this failure mode to a bias toward spatially blurry masks. Experiments on CIFAR-10, SVHN, and CUB-200 show a consistent association between blurriness and ROAR performance, a pattern that also appears in the ROAD variant. We provide guidelines for more cautious removal-based benchmarking, with implications for validating mechanistic understanding of neural network internals.

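2508.21531 2026-06-12 stat.ML cs.LG stat.CO Updated version

Adaptive generative moment matching networks for improved learning of dependence structures

Marius Hofert, Gan Yao

Affiliations * Department of Statistics and Actuarial Science, The University of Hong Kong

AI summary Proposes adaptive bandwidth selection for the mixture kernel in the maximum mean discrepancy used to fit generative moment matching networks, improving training by increasing the number of kernels and applying early stopping, and outperforms conventional methods in copula random number generation, high-dimensional convergence rates, and dependence modeling of financial data.

Details
AI-generated abstract

An adaptive bandwidth selection procedure for the mixture kernel in the maximum mean discrepancy (MMD) is introduced for fitting generative moment matching networks (GMMNs), and improved learning of copula random number generators is demonstrated. The number of kernels is increased during training based on the relative error of the training loss; in addition, the relative error of the validation loss is used as an early stopping criterion. While training time stays similar, adaptively trained GMMNs (AGMMNs) significantly improve training performance, as shown by validation MMD trajectories, samples, and validation MMD values. AGMMNs are also shown to outperform GMMNs and parametric copula models in three applications. First, the convergence rates of estimators based on quasi-random versus pseudo-random samples from copulas are studied, for the first time, in dimensions as high as 100. Second, replicated validation MMDs, together with Monte Carlo and quasi-Monte Carlo applications, demonstrate the improved training of AGMMNs on the copula model implied by 50 constituents of the S&P 500 index after deGARCHing. Finally, the latter dataset and 50 constituents of the FTSE 100 index are used to show that the improved training of AGMMNs does translate into improved model predictions.

English abstract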

An adaptive bandwidth selection procedure for the mixture kernel in the maximum mean discrepancy (MMD) for fitting generative moment matching networks (GMMNs) is introduced, and improved learning of copula random number generators is demonstrated. Based on the relative error of the training loss, the number of kernels is increased during training; additionally, the relative error of the validation loss is used as an early stopping criterion. While training time remains similar, adaptively training GMMNs (AGMMNs) significantly increases training performance, which is shown based on validation MMD trajectories, samples and validation MMD values. Superiority of AGMMNs over GMMNs and parametric copula models is also demonstrated in terms of three applications. First, convergence rates of estimators based on quasi-random versus pseudo-random samples from copulas are investigated in dimensions as large as 100 for the first time. Second, replicated validation MMDs, as well as Monte Carlo and quasi-Monte Carlo applications demonstrate the improved training of AGMMNs for a copula model implied by the 50 constituents of the S&P 500 index after deGARCHing. Last, both the latter dataset and 50 constituents of the FTSE 100 are used to demonstrate that the improved training of AGMMNs indeed translates to an improved model prediction.
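
The loss at the centre of this abstract, the MMD with a mixture kernel, can be written down compactly. The sketch below computes the biased (V-statistic) estimate of the squared MMD between two samples using an equally weighted mixture of Gaussian kernels. Scalar samples, the specific bandwidth list, and the function names are simplifying assumptions for illustration; the paper works with multivariate copula samples and selects the bandwidths adaptively during training, which is not reproduced here.

```python
import math


def mixture_kernel(a, b, bandwidths):
    """Equally weighted mixture of Gaussian kernels evaluated at (a, b)."""
    d2 = (a - b) ** 2
    return sum(math.exp(-d2 / (2.0 * h * h)) for h in bandwidths) / len(bandwidths)


def mmd2(xs, ys, bandwidths):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    kxx = sum(mixture_kernel(a, b, bandwidths) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(mixture_kernel(a, b, bandwidths) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(mixture_kernel(a, b, bandwidths) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


bandwidths = [0.1, 1.0, 10.0]  # hypothetical fixed bandwidths
print(mmd2([0.0, 1.0], [0.0, 1.0], bandwidths))  # 0.0
print(mmd2([0.0, 1.0], [5.0, 6.0], bandwidths) > 0.0)  # True
```

Mixing several bandwidths lets the statistic respond to discrepancies at several length scales at once, which is presumably why the choice and number of bandwidths matters enough to adapt during training.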

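2502.18959 2026-06-12 cs.LG stat.ML Updated version

Fourier Multi-Component and Multi-Layer Neural Networks: Unlocking High-Frequency Potential

Shijun Zhang, Hongkai Zhao, Yimin Zhong, Haomin Zhou

Affiliations * Department of Applied Mathematics, Hong Kong Polytechnic University; Department of Mathematics, Duke University; Department of Mathematics and Statistics, Auburn University; School of Mathematics, Georgia Institute of Technology

AI summary Proposes the Fourier Multi-Component and Multi-Layer Neural Network (FMMNN), which combines sine-type activations with a multi-component, multi-layer structure; it achieves exponential function-approximation power with a low-rank architecture, has a more favorable optimization landscape than standard fully connected networks, comes with a scaled random initialization that accelerates training, and attains high accuracy and good convergence on high-frequency function approximation tasks.

Comments Our code and implementation details are available at https://github.com/ShijunZhangMath/FMMNN

Details
AI-generated abstract

The structure of a neural network and the choice of its activation function are crucial to its performance. It is equally important that these two elements be well matched, since their alignment is key to effective representation and learning. In this paper, we introduce the Fourier Multi-Component and Multi-Layer Neural Network (FMMNN), a model that combines sine-type activations with the multi-component, multi-layer structure of MMNNs. In an FMMNN, each component is represented as a trainable linear combination of fixed random sine-type basis functions, while multi-layer composition produces more complex and adaptive frequency features. We prove that FMMNNs retain exponential expressive power for function approximation even with a low-rank architecture. We also analyze the optimization landscape of FMMNNs and find it more favorable than that of standard fully connected neural networks, especially for high-frequency targets. In addition, we propose a scaled random initialization method for the first-layer weights of FMMNNs, which accelerates training and improves final performance when samples are sufficient. Extensive numerical experiments support our theoretical insights, showing that FMMNNs achieve high accuracy and good convergence behavior on oscillatory function-approximation benchmarks.

English abstract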

The architecture of a neural network and the choice of its activation function are both fundamental to its performance. Equally important is ensuring that these two elements are well matched, as their alignment is key to effective representation and learning. In this paper, we introduce the Fourier Multi-Component and Multi-Layer Neural Network (FMMNN), a model that combines sine-type activations with the multi-component and multi-layer structure of MMNNs. In an FMMNN, each component is represented as a trainable linear combination of fixed random sine-type basis functions, while multi-layer composition generates more complex and adaptive high-frequency features. We establish that FMMNNs retain exponential expressive power for function approximation even under a low-rank architectural structure. We also analyze the optimization landscape of FMMNNs and find it to be substantially more favorable than that of standard fully connected neural networks, especially for high-frequency targets. In addition, we propose a scaled random initialization method for the first-layer weights in FMMNNs, which accelerates training and improves final performance when sufficient samples are available. Extensive numerical experiments support our theoretical insights, showing that FMMNNs achieve strong accuracy and favorable convergence behavior on oscillatory function-approximation benchmarks.

7. 生物统计与医学统计 9 篇

2606.12677 2026-06-12 stat.ME 新提交

Restricted Multivariate Spatial Modeling

受限多变量空间建模

Jihyeon Kwon, Harrison Quick

AI总结 针对多变量条件自回归模型信息量过强的问题,提出一种通过重参数化控制信息量的受限MCAR模型,并在心脏病死亡数据中展示其优势。

Comments 30 pages

AI中文摘要

在对小区域健康事件建模时,Besag、York和Mollié(BYM)的条件自回归(CAR)框架被广泛使用。对于多结局,多变量CAR(MCAR)扩展除了空间依赖性外,还容纳了共享风险因素的疾病之间的依赖性,并且还可以联合建模单一疾病的人口统计子组,允许在相关人群之间借用信息。然而,最近的研究表明,BYM CAR模型可能信息量过强,导致估计过于精确。虽然MCAR模型由于跨子组共享额外信息而预期信息量更强,但其信息量水平此前尚未被量化。我们提出了一个框架来测量MCAR模型的信息量,作为先前工作的扩展,并引入了一种控制信息量的方法,确保模型对每个子组的贡献相当。我们通过在一个计算高效的框架内对MCAR模型进行重参数化来实现这一点。我们展示了MCAR模型在信息量和过度平滑方面与BYM CAR模型的比较,并利用按种族和性别分层的县级心脏病死亡数据突出了受限MCAR模型的优势。

英文摘要

When modeling health events in small areas, the conditional autoregressive (CAR) framework of Besag, York, and Mollié (BYM) is widely used. For multiple outcomes, the multivariate CAR (MCAR) extension accommodates dependence among diseases that share risk factors, in addition to spatial dependence, and can also jointly model demographic subgroups for a single disease, allowing information to be borrowed across related populations. However, recent studies have shown that the BYM CAR model can be overly informative, leading to excessively precise estimates. While the MCAR model is expected to be more informative due to additional information shared across subgroups, its level of informativeness has not been previously quantified. We propose a framework to measure MCAR model informativeness as an extension of prior work and introduce a method to control it, ensuring the model contributes comparably to each subgroup. We achieve this through a reparameterization of the MCAR model within a computationally efficient framework. We demonstrate how the MCAR model compares with the BYM CAR model in terms of informativeness and oversmoothing and highlight the advantages of the restricted MCAR model using county-level heart disease death data stratified by race and sex.

2606.12566 2026-06-12 stat.ME 新提交

Inferring resource selection and utilization distributions from irregular and error-prone animal tracking data

从不规则且带有误差的动物追踪数据推断资源选择和利用分布

Fanny Dupont, Brett T. McClintock, Jan-Ole Fischer, Marianne Marcoux, Nigel E. Hussey, Marie Auger-Méthé

AI总结 提出基于拉普拉斯近似的单阶段框架,通过TMB实现,同时处理测量误差和不规则采样,在模拟和独角鲸数据中优于两步法。

Comments 26 pages

AI中文摘要

栖息地选择和空间利用是理解动物分布的基础。从遥测数据量化栖息地偏好的传统方法假设规则采样和可忽略的测量误差。然而,这些假设在海洋系统中经常被违反。实践者通常在进行模型拟合之前对数据进行规则化和过滤,但这两步过程未能传播过滤阶段的不确定性,并可能导致有偏估计。栖息地驱动的Langevin扩散模型提供了一种优雅的替代方案,自然地适应不规则采样。然而,通过状态空间公式纳入测量误差具有挑战性,因为栖息地协变量依赖于潜在的真实位置。我们利用拉普拉斯近似同时积分真实位置并考虑潜在路径上的栖息地协变量,从而在模板模型构建器(TMB)中高效实现单阶段框架。通过这样做,我们提供了第一个能够处理依赖于潜在变量的协变量的TMB实现,允许通过快速高效的极大似然估计进行推断。模拟表明,我们的方法优于两步法,即使在显著的测量误差和缺失数据下也能恢复栖息地选择参数,并得到更准确的利用分布和轨迹重建。应用于独角鲸(Monodon monoceros)遥测数据时,两步法将栖息地选择系数大幅缩小至接近零,而我们的统一方法恢复了更强的信号。我们的框架为栖息地选择推断中长期存在的测量误差和时间不规则性挑战提供了计算高效的解决方案,适用于广泛的分类群和环境。

英文摘要

Habitat selection and space use are fundamental to understanding animal distribution. Traditional methods for quantifying habitat preferences from telemetry data assume regular sampling and negligible measurement error. However, these assumptions are routinely violated in marine systems. Practitioners typically regularize and filter the data before fitting models, but these two-step procedures do not propagate uncertainty from the filtering stage and can yield biased estimates. Habitat-driven Langevin diffusion models offer an elegant alternative, naturally accommodating irregular sampling. However, incorporating measurement error via a state-space formulation is challenging because habitat covariates depend on the latent true locations. We address this using the Laplace approximation to simultaneously integrate over true locations and account for habitat covariates along latent paths, yielding a single-stage framework efficiently implemented in Template Model Builder (TMB). By doing so, we provide the first TMB implementation capable of handling covariates that depend on latent variables, allowing inference via fast and efficient maximum likelihood estimation. Simulations show that our approach outperforms the two-step method, recovering habitat-selection parameters even under substantial measurement error and missing data, with more accurate utilization distributions and trajectory reconstructions. Applied to narwhal (Monodon monoceros) telemetry data, the two-step method substantially shrinks the habitat selection coefficient towards zero, while our unified approach recovers a much stronger signal. Our framework offers a computationally efficient solution to long-standing challenges of measurement error and temporal irregularity in habitat selection inference, applicable across a wide range of taxa and environments.
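To see why a Langevin diffusion accommodates irregular sampling, it helps to simulate one with an Euler step whose size follows the observation times. The sketch below assumes the commonly used form dX = (γ²/2) ∇log π(X) dt + γ dW with π(x) ∝ exp(β c(x)). It is not the authors' TMB implementation and includes no measurement error; the covariate, parameter values, and function names are hypothetical.

```python
import numpy as np

def simulate_langevin(times, beta, grad_covariate, x0, gamma=1.0, seed=0):
    """Euler approximation of a habitat-driven Langevin diffusion.

    Each step uses its own time increment, so irregular observation
    times need no prior regularization of the track.
    """
    rng = np.random.default_rng(seed)
    path = [np.asarray(x0, dtype=float)]
    for dt in np.diff(times):
        x = path[-1]
        # Drift pushes the animal up the gradient of log pi = beta * c(x).
        drift = 0.5 * gamma**2 * dt * beta * grad_covariate(x)
        noise = gamma * np.sqrt(dt) * rng.standard_normal(x.shape)
        path.append(x + drift + noise)
    return np.array(path)

# Hypothetical covariate c(x) = -|x|^2 / 2, whose gradient is -x.
times = np.array([0.0, 0.1, 0.15, 0.6, 0.7, 1.5])  # irregular sampling times
path = simulate_langevin(times, beta=2.0, grad_covariate=lambda x: -x, x0=[1.0, 1.0])
print(path.shape)
```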

2606.13236 2026-06-12 cs.LG cs.AI cs.SD stat.AP 新提交

Decoding Insect Song: A Multitask Semisupervised Orthoptera Bioacoustic Classifier

解码昆虫之歌:一种多任务半监督直翅目生物声学分类器

Olga Isupova, Danil Kuzin, Ella Browning, Tom Mills, Steven Reece

发表机构 * University of Oxford(牛津大学)

AI总结 提出PULSE半监督多任务框架,结合弱监督分类、自监督学习和知识蒸馏,在直翅目生物声学分类中优于通用模型,并通过主动学习进一步提升性能。

Comments ICML 2026 Workshop on Machine Learning for Audio

AI中文摘要

被动声学监测在生态推断方面具有巨大潜力,但现有的自动化工具通常训练范围狭窄且不可迁移。我们通过PULSE(一种用于直翅目生物声学的半监督多任务框架)解决了这些局限性,该框架结合了弱监督物种分类、未标记野外音频的自监督学习以及来自通用生物声学模型的知识蒸馏。我们的领域自适应专家模型在所有指标上均优于最先进的通用模型(宏F1:0.21 vs. 0.07;AUC:0.74 vs. 0.45;AP:0.32 vs. 0.19),主动学习进一步将F1提升至0.34,AUC提升至0.84。除了分类之外,学习到的嵌入编码了生态上有意义的结构,并通过交互式可视化工具暴露出来,用于生态发现。

英文摘要

Passive acoustic monitoring holds great promise for ecological inference, yet existing automated tools are typically narrowly trained and non-transferable. We address these limitations with PULSE, a semi-supervised, multi-task framework for Orthoptera bioacoustics, combining weakly-supervised species classification, self-supervised learning on unlabelled field audio, and knowledge distillation from a general-purpose bioacoustic model. Our domain-adapted specialist model outperforms a state-of-the-art general model across all metrics (macro F1: 0.21 vs. 0.07; AUC: 0.74 vs. 0.45; AP: 0.32 vs. 0.19), with active learning further raising F1 to 0.34 and AUC to 0.84. Beyond classification, the learned embeddings encode ecologically meaningful structure, exposed through an interactive visualisation tool for ecological discovery.

2602.17041 2026-06-12 stat.ME 版本更新

Reframing Population-Adjusted Indirect Comparisons as a Transportability Problem: An Estimand-Based Perspective and Implications for Health Technology Assessment

将人口调整间接比较重新定义为可迁移性问题:基于估计目标(estimand)的视角及其对卫生技术评估的影响

Conor Chandler, Jack Ishak

AI总结 本文从估计目标(estimand)角度形式化人口调整间接比较中的可迁移性,区分条件与边际处理效应,并揭示效应修饰、可压缩性与效应尺度如何影响迁移,为卫生技术评估中间接证据的使用提供指导。

Comments 26 pages (excluding supplement and references), 7 figures, 1 table

AI中文摘要

当随机对照试验招募不同患者群体且缺乏头对头比较时,人口调整间接比较(PAICs)被广泛用于综合证据。尽管PAICs调整了试验间观察到的人群差异,但仅调整并不能确保估计效应可迁移至卫生技术评估(HTA)中决策相关的人群。我们从基于估计目标(estimand)的角度审视并形式化PAICs中的可迁移性。我们区分条件与边际处理效应这两类估计目标,并展示可迁移性如何依赖于效应修饰、可压缩性以及效应修饰尺度与效应度量之间的一致性。通过示例说明,即使效应修饰因子在不同治疗间共享,对于常用的非可压缩性度量(包括风险比和比值比),边际效应通常依赖于人群。相反,可压缩的效应以及在线性预测变量尺度上定义的条件效应表现出更有利的可迁移性属性。我们进一步证明,成对PAIC方法通常识别的是在对照方(comparator)人群中定义的效应,将这些估计应用于其他人群需要额外的、通常是隐含的迁移步骤,这需要进一步的假设。这对HTA有直接影响,因为PAIC推导的效应通常应用于为不同目标人群定义的成本效果和决策模型中。我们的结果阐明了何时将PAIC推导的处理效应应用于期望目标人群是合理的,何时需要额外假设,以及何时应将结果解释为特定人群而非决策相关,从而支持在HTA及相关决策环境中更透明、更有原则地使用间接证据。

英文摘要

Population-adjusted indirect comparisons (PAICs) are widely used to synthesize evidence when randomized controlled trials enroll different patient populations and head-to-head comparisons are unavailable. Although PAICs adjust for observed population differences across trials, adjustment alone does not ensure transportability of estimated effects to decision-relevant populations for health technology assessment (HTA). We examine and formalize transportability in PAICs from an estimand-based perspective. We distinguish conditional and marginal treatment effect estimands and show how transportability depends on effect modification, collapsibility, and alignment between the scale of effect modification and the effect measure. Using illustrative examples, we demonstrate that even when effect modifiers are shared across treatments, marginal effects are generally population-dependent for commonly used non-collapsible measures, including hazard ratios and odds ratios. Conversely, collapsible and conditional effects defined on the linear predictor scale exhibit more favorable transportability properties. We further show that pairwise PAIC approaches typically identify effects defined in the comparator population and that applying these estimates to other populations entails an additional, often implicit, transport step requiring further assumptions. This has direct implications for HTA, where PAIC-derived effects are routinely applied within cost-effectiveness and decision models defined for different target populations. Our results clarify when applying PAIC-derived treatment effects to desired target populations is justified, when doing so requires additional assumptions, and when results should instead be interpreted as population-specific rather than decision-relevant, supporting more transparent and principled use of indirect evidence in HTA and related decision-making contexts.
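The claim that marginal effects are population-dependent for non-collapsible measures can be checked with a few lines of arithmetic. The sketch below uses a hypothetical logistic outcome model with one binary prognostic covariate and a conditional odds ratio that is identical in both strata, so there is no effect modification on the logit scale. All coefficient values and names are illustrative assumptions, not taken from the paper.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

def marginal_or(prev_x, beta0=-2.0, beta_x=3.0, log_or=1.0):
    """Marginal odds ratio in a population where P(X = 1) = prev_x."""
    def risk(treated):
        # Average the stratum-specific risks over the covariate distribution.
        return sum(
            weight * expit(beta0 + beta_x * x + log_or * treated)
            for x, weight in ((0, 1.0 - prev_x), (1, prev_x))
        )
    return odds(risk(1)) / odds(risk(0))

conditional_or = math.exp(1.0)   # the same in both covariate strata
or_pop_a = marginal_or(0.2)      # population with 20% X = 1
or_pop_b = marginal_or(0.5)      # population with 50% X = 1
print(conditional_or, or_pop_a, or_pop_b)
```

Both marginal odds ratios fall below the shared conditional odds ratio, and they differ from each other purely because the covariate prevalence differs between the two populations.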

2601.04192 2026-06-12 stat.ME 版本更新

Prediction Intervals for Future Event Counts at Interim Analyses of Time-to-Event Clinical Trials

时间-事件临床试验中期分析中未来事件计数的预测区间

Edoardo Ratti, Federico L. Perlino, Stefania Galimberti, Maria G. Valsecchi

AI总结 针对时间-事件临床试验中期分析,提出基于条件参数自助法的患者级框架,构建未来事件计数的预测区间,并通过模拟和实际案例验证其有效性。

Comments 36 pages, 19 figures

AI中文摘要

时间-事件终点是评估各疾病领域治疗效果的核心。在具有时间-事件终点的临床试验中,中期和最终分析可用的信息主要由观察到的事件数而非入组患者数决定。因此,中期监测需要评估在预定未来分析日期前预计将累积多少额外事件。量化这些计数的不确定性对于评估计划的信息水平是否可能达到、预测延迟或事件超限以及支持试验进行中的操作决策至关重要。这在儿科肿瘤学试验中尤其相关,因为事件累积通常具有不确定性。尽管预测终点成熟时间的方法已很成熟,但在固定日历时间对事件计数进行区间预测仍不完善。我们提出一个患者级框架,用于在时间-事件试验的中期分析中构建此类区间。以中期数据为条件,未来计数遵循具有患者特异性事件概率的泊松-二项分布;我们使用条件参数自助法估计该分布。在标准正则条件下,自助法是一致的,并产生渐近校准的预测区间。该框架适应了分阶段入组、患者级协变量、管理删失、随机失访以及入组日期与失访之间在条件于已实现中期数据之前的可能依赖关系。我们通过模拟研究其操作特征,并利用一项儿童急性淋巴细胞白血病的真实III期试验进行说明。

英文摘要

Time-to-event endpoints are central to evaluating treatment efficacy across disease areas. In clinical trials with time-to-event endpoints, the information available for interim and final analyses is largely determined by the number of observed events rather than by the number of enrolled patients. Interim monitoring therefore requires assessing how many additional events are expected to accrue by scheduled future analysis dates. Quantifying uncertainty around these counts is essential for assessing whether planned information levels are likely to be reached, anticipating delays or event overrunning, and supporting operational decisions while the trial is ongoing. This is especially relevant in pediatric oncology trials, where event accrual is often uncertain. Although methods for predicting time to endpoint maturation are well established, interval prediction for event counts at fixed calendar times remains less developed. We propose a patient-level framework for constructing such intervals at interim analyses of time-to-event trials. Conditionally on the interim data, the future count follows a Poisson-binomial law with patient-specific event probabilities; we estimate this law using a conditional parametric bootstrap. Under standard regularity conditions, the bootstrap is consistent and yields asymptotically calibrated prediction intervals. The framework accommodates staggered entry, patient-level covariates, administrative censoring, random loss to follow-up, and possible dependence between entry dates and loss to follow-up before conditioning on the realised interim data. We study its operating characteristics in simulation studies and illustrate it using a real-world phase III trial in childhood acute lymphoblastic leukaemia.
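A Poisson-binomial law is the distribution of a sum of independent Bernoulli variables with unequal success probabilities, and its probability mass function can be built by repeated convolution. The sketch below treats the patient-specific event probabilities as known, so it ignores the estimation uncertainty that the conditional parametric bootstrap in the paper is designed to capture. The probabilities and function names are hypothetical.

```python
import numpy as np

def poisson_binomial_pmf(probs):
    """PMF of the number of events among independent patients."""
    pmf = np.array([1.0])
    for p in probs:
        # Adding one patient: no event with prob 1 - p, one event with prob p.
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def prediction_interval(probs, level=0.90):
    """Equal-tailed interval [lower, upper] with coverage of at least `level`."""
    cdf = np.cumsum(poisson_binomial_pmf(probs))
    alpha = 1.0 - level
    lower = int(np.searchsorted(cdf, alpha / 2.0, side="left"))
    upper = int(np.searchsorted(cdf, 1.0 - alpha / 2.0, side="left"))
    return lower, upper

# Hypothetical probabilities that each at-risk patient has an event
# before the next analysis date.
probs = np.linspace(0.05, 0.6, 40)
lower, upper = prediction_interval(probs)
print(f"expected new events: {probs.sum():.1f}, 90% interval: [{lower}, {upper}]")
```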

2201.13095 2026-06-12 stat.ME 版本更新

Joint Count Transformation Models with Covariate-dependent Correlations

具有协变量依赖相关性的联合计数变换模型

Lukas Graz, Luisa Barbanti, Roland Brandl, Torsten Hothorn

AI总结 提出联合计数变换模型,结合无分布边际计数变换与协变量依赖的高斯Copula,通过联合最大似然估计高效建模多物种丰度及其相关性,在鸟类案例中捕获季节变化模式。

AI中文摘要

联合物种分布模型对于理解生态协变量如何塑造物种群落至关重要。然而,大多数现有方法受限于计数数据的刚性参数分布,且无法模拟种间关联如何随这些协变量变化。我们引入了联合计数变换模型,这是一个旨在克服这些限制的新框架。我们的方法将多个物种的无分布边际计数变换模型与协变量依赖的潜高斯Copula相结合,以建模种间相关性,该相关性可解释为观测计数尺度上的Spearman秩相关。所有模型参数通过联合最大似然估计高效估计,并在R包tram中实现。我们将此框架应用于模拟三种食鱼鸟类的联合丰度,以季节性作为主要协变量。我们的模型成功捕获了复杂的、物种特异性的季节性丰度模式,包括高零计数的时期和方差的季节性变化。此外,模型揭示了物种之间强烈的、随季节变化的相关性。这些发现与经验方法一致,并且与计算昂贵的参数化贝叶斯物种群落分层建模(HMSC)框架的结果相似。通过多达10个物种的模拟研究,证明了我们方法的一致性、准确性和可行性。

英文摘要

Joint Species Distribution Models are essential for understanding how ecological covariates shape species communities. However, most existing approaches are limited by rigid parametric distributions for count data and the inability to model how interspecific associations change with those covariates. We introduce joint count transformation models, a novel framework designed to overcome these limitations. Our approach combines distribution-free marginal count transformation models for multiple species with a covariate-dependent latent Gaussian copula to model interspecific correlations, interpretable as Spearman's rank correlation on the observed count scale. All model parameters are estimated efficiently via joint maximum likelihood estimation, implemented in the R package tram. We apply this framework to model the joint abundance of three fish-eating bird species, using seasonality as the primary covariate. Our model successfully captured the complex, species-specific seasonal abundance patterns, including periods of high zero-counts and seasonal shifts in variance. Furthermore, the model revealed strong, seasonally-varying correlations between the species. These findings are consistent with an empirical approach and similar to those from the computationally expensive parametric Bayesian Hierarchical Modelling of Species Communities (HMSC) framework. Consistency, accuracy and feasibility of our approach are demonstrated in a simulation study for up to 10 species.

2509.12473 2026-06-12 stat.ME 版本更新

Cox Regression on the Plane

平面上的Cox回归

Yael Travis-Lumer, Micha Mandel, Ido Didi Fabian, Rebecca A. Betensky, Malka Gorfine

AI总结 提出两种基于Lehmann型表示的Cox比例风险模型扩展,用于双变量生存数据,通过伪观测方法估计回归参数,并证明估计量的一致性和渐近正态性。

Comments 89 pages, including appendices, figures, and tables

AI中文摘要

Cox比例风险模型是单变量生存分析中最广泛使用的回归模型,但其对双变量生存数据的扩展仍然很少。我们基于生存函数的Lehmann型表示提出了两种新的扩展。第一种是简单Lehmann模型,是一种直接扩展,保留了简单的结构。第二种是广义Lehmann模型,通过引入三个不同的回归参数允许更大的灵活性,并将简单Lehmann模型作为特例。这些模型在生存概率方面具有直接解释,提供了一个透明、完全半参数的框架,用于评估协变量对边际生存概率及其依赖性的影响,而无需指定copula或脆弱分布。为了估计回归参数,我们基于双变量生存数据的伪观测方法,并通过两步程序将其扩展到广义模型。我们建立了所得估计量的一致性和渐近正态性。通过模拟研究和来自全球视网膜母细胞瘤研究的数据应用说明了所提出的方法。

英文摘要

The Cox proportional hazards model is the most widely used regression model in univariate survival analysis, yet extensions to bivariate survival data remain scarce. We propose two novel extensions based on a Lehmann-type representation of the survival function. The first, the simple Lehmann model, is a direct extension that retains a straightforward structure. The second, the generalized Lehmann model, allows greater flexibility by incorporating three distinct regression parameters and includes the simple Lehmann model as a special case. The models admit a direct interpretation in terms of survival probabilities, providing a transparent, fully semiparametric framework for assessing covariate effects on both marginal survival probabilities and their dependence, without requiring specification of a copula or frailty distribution. To estimate the regression parameters, we build on a pseudo-observation-based approach for bivariate survival data and extend it to the generalized model via a two-step procedure. We establish consistency and asymptotic normality of the resulting estimators. The proposed approach is illustrated through simulation studies and an application to data from the Global Retinoblastoma Outcome Study.

2508.20349 2026-06-12 stat.ME 版本更新

Covariate-adjusted win statistics in randomized clinical trials with ordinal outcomes

序数结局随机对照试验中协变量调整的胜率统计量

Zhiqiang Cao, Scott Zuo, Mary Ryan Baumann, Kendra Plourde, Patrick Heagerty, Guangyu Tong, Fan Li

AI总结 针对序数结局,提出基于倾向性评分加权和增广加权的胜率估计方法,实现协变量调整以提高效率,并证明模型稳健性。

AI中文摘要

序数结局在临床中常见,通常代表疾病进展的不同阶段或功能损伤的不同程度。本文关注通过胜率类估计目标(win estimands,如胜率比和胜率差)所刻画的内在成对结局比较来表示序数结局的平均处理效应。认识到基线协变量调整对提高精度的价值,我们首先开发了倾向性评分加权估计量,包括逆概率加权(IPW)和重叠加权(OW),专门用于估计胜率类估计目标。此外,我们开发了增广加权估计量,利用额外的序数结局回归以可能提高仅加权的效率。利用U统计量理论,我们建立了所有估计量的渐近理论,并推导了闭式方差估计量以支持统计推断。我们还证明了即使相关工作模型被错误指定,所有协变量调整估计量对目标估计目标仍保持一致性;因此这些协变量调整估计量具有模型稳健性。通过模拟,我们展示了加权估计量相对于未调整估计量的效率提升,而增广加权估计量在除极端情况外进一步提高了效率。最后,我们通过ORCHID试验说明了所提出的方法,并在R包winPSW中实现了我们的协变量调整方法。

英文摘要

Ordinal outcomes are common in clinical settings where they often represent increasing levels of disease progression or different levels of functional impairment. In this article, we focus on representing the average treatment effect for ordinal outcomes via intrinsic pairwise outcome comparisons captured through win estimands, such as the win ratio and win difference. Recognizing the value of baseline covariate adjustment toward enhanced precision, we first develop propensity score weighting estimators, including both inverse probability weighting (IPW) and overlap weighting (OW), tailored to estimating win estimands. Furthermore, we develop augmented weighting estimators that leverage an additional ordinal outcome regression to potentially improve efficiency over weighting alone. Leveraging the theory of U-statistics, we establish the asymptotic theory for all estimators, and derive closed-form variance estimators to support statistical inference. We also prove that all of the covariate-adjusted estimators do not compromise consistency for the target estimand even when the associated working models are incorrectly specified; hence these covariate-adjusted estimators are model-robust. Through simulations we demonstrate the enhanced efficiency of the weighted estimators over the unadjusted estimator, with the augmented weighting estimators showing a further improvement in efficiency except for extreme cases. Finally, we illustrate our proposed methods with the ORCHID trial, and implement our covariate adjustment methods in an R package winPSW.
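The win ratio and win difference are defined through all treated-versus-control pairs, which a short sketch makes concrete. The code below computes only the unadjusted versions; the propensity-score weighting and augmentation developed in the paper are not reproduced, and this is not the winPSW package interface. The outcome values and names are hypothetical.

```python
import numpy as np

def win_statistics(treated, control, higher_is_better=True):
    """Unadjusted win probability, win ratio and win difference."""
    t = np.asarray(treated)[:, None]   # column vector of treated outcomes
    c = np.asarray(control)[None, :]   # row vector of control outcomes
    if not higher_is_better:
        t, c = -t, -c
    wins = np.mean(t > c)              # share of pairs won by the treated patient
    losses = np.mean(t < c)            # share of pairs won by the control patient
    return {
        "win_prob": wins,
        "loss_prob": losses,
        "win_ratio": wins / losses,    # undefined when there are no losses
        "win_difference": wins - losses,
    }

# Hypothetical ordinal outcomes on a 1-3 scale, higher is better.
stats = win_statistics(treated=[3, 2, 2, 1], control=[2, 1, 1, 1])
print(stats)
```

Tied pairs count toward neither wins nor losses, which is why the win and loss probabilities need not sum to one for ordinal outcomes.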

2412.12967 2026-06-12 stat.ME 版本更新

Neural Posterior Estimation for Stochastic Epidemic Modeling

随机流行病建模的神经后验估计

Prayag Chatha, Fan Bu, Jeffrey Regier, Evan Snitkin, Jon Zelner

AI总结 提出使用神经后验估计(NPE)校准随机传染病模型,通过模拟训练神经网络近似后验分布,在样本效率上优于近似贝叶斯计算(ABC),并应用于医疗相关感染数据。

Comments 36 pages, 22 figures, preprint. To be published in the Annals of Applied Statistics

AI中文摘要

随机传染病模型捕捉了公共卫生结果的不确定性,并在流行病学实践中日益流行。然而,使用现有参数估计方法将这些模型校准到观测数据具有挑战性。随机流行病模型是非线性动力系统,具有潜在的大状态空间,导致似然密度在计算上难以处理。我们开发了一种使用神经后验估计(NPE)校准复杂流行病模型到高维数据的方法,这是一种用于基于模拟推断的新技术。在NPE中,在模拟数据上训练的神经条件密度估计器学习“反转”随机模拟器,返回后验分布的参数近似。我们引入了一个随机的、离散时间的易感-感染(SI)模型,具有异质性传播,用于医疗相关感染(HAIs)。HAIs是医疗系统的重大负担,它们表现出高比例的无症状携带,使得估计感染率变得困难。通过广泛的模拟实验,我们表明NPE能够以比近似贝叶斯计算(ABC)更高的样本效率产生准确的感染率后验估计。然后,我们使用NPE将我们的SI模型拟合到一家长期急性护理机构中耐碳青霉烯肺炎克雷伯菌的爆发,发现了患者间传播风险中基于位置的异质性证据。我们认为我们的方法可以有效地应用于广泛的机制传播模型和传染病流行病学问题。

英文摘要

Stochastic infectious disease models capture uncertainty in public health outcomes and have become increasingly popular in epidemiological practice. However, calibrating these models to observed data is challenging with existing methods for parameter estimation. Stochastic epidemic models are nonlinear dynamical systems with potentially large latent state spaces, resulting in computationally intractable likelihood densities. We develop an approach to calibrating complex epidemiological models to high-dimensional data using Neural Posterior Estimation (NPE), a novel technique for simulation-based inference. In NPE, a neural conditional density estimator trained on simulated data learns to "invert" a stochastic simulator, returning a parametric approximation to the posterior distribution. We introduce a stochastic, discrete-time Susceptible-Infected (SI) model with heterogeneous transmission for healthcare-associated infections (HAIs). HAIs are a major burden on healthcare systems. They exhibit high rates of asymptomatic carriage, making it difficult to estimate infection rates. Through extensive simulation experiments, we show that NPE produces accurate posterior estimates of infection rates with greater sample efficiency compared to Approximate Bayesian Computation (ABC). We then use NPE to fit our SI model to an outbreak of carbapenem-resistant Klebsiella pneumoniae in a long-term acute care facility, finding evidence of location-based heterogeneity in patient-to-patient transmission risk. We argue that our methodology can be fruitfully applied to a wide range of mechanistic transmission models and problems in the epidemiology of infectious disease.

8. 经济金融与社会科学统计 5 篇

2606.13401 2026-06-12 stat.AP 新提交

Scaling Demand-Side Flexibility Through Dynamic Tariffs

通过动态电价扩展需求侧灵活性

Lucas Brylle, Niels Andersen, Henrik Madsen

AI总结 本文论证动态电价激励的隐性需求侧灵活性是应对配电网挑战的最可扩展且经济有效的方法,可节省每座受限变电站1300-4800万丹麦克朗。

AI中文摘要

丹麦配电网中持续的电气化和可再生能源整合带来了重大运营挑战,包括储备容量不足、过载导致的组件退化、电压不稳定以及不断增加的基础设施投资需求。本文论证,通过动态电价激励的隐性需求侧灵活性(DSF)是应对现代配电网这些挑战的最可扩展且经济有效的方法。我们证明,虽然显式灵活性机制提供了运营确定性,但它们无法扩展到解决异构客户群中的系统范围拥堵。基于显示强烈价格响应行为的经验消费数据、因监管框架(如丹麦市场模型3.0和电价模型3.0)而变化的价格以及经济分析,我们展示了通过延迟或避免加固,每座受限变电站可节省1300-4800万丹麦克朗的电网成本。我们认为,隐性DSF机制代表了收入中性的可扩展灵活性解决方案的必要路径,可以在保持系统可靠性的同时延迟昂贵的电网加固。除了直接的电网节省外,额外的价值流包括避免峰值发电成本、减少连接延迟和降低停电风险,进一步增强了经济性。关键是,动态电价提供了将实时电网约束传达给消费者的机制,使价格信号能够准确反映配电网在任何给定时间和地点的实际容量状态。

英文摘要

The ongoing electrification and integration of renewable energy sources in Denmark's distribution grids pose significant operational challenges, including insufficient reserve capacity, component degradation due to overload, voltage instability, and increasing infrastructure investment requirements. This article argues that implicit demand-side flexibility (DSF) incentivized through dynamic tariffs offers the most scalable and cost-effective approach to address these challenges in a modern distribution network. We demonstrate that while explicit flexibility mechanisms provide operational certainty, they cannot scale to address system-wide congestion across heterogeneous customer bases. Drawing on empirical consumption data showing strong price-responsive behavior, varying prices due to, e.g., regulatory frameworks including the Danish Market Model 3.0 and Tariff Model 3.0, and economic analysis, we demonstrate potential grid savings of 13-48 million DKK per constrained substation through deferred or avoided reinforcement. We argue that implicit DSF mechanisms represent the necessary pathway for revenue-neutral scalable flexibility solutions that can defer costly grid reinforcements while maintaining system reliability. Beyond direct grid savings, additional value streams include avoided peak generation costs, reduced connection delays, and lower outage risk, further strengthening the economic case. Critically, dynamic tariffs offer the mechanism through which real-time grid constraints can be communicated to consumers, enabling price signals that accurately reflect the actual state of the capacity of the distribution network at any given point in time and space.

2606.13094 2026-06-12 stat.AP 新提交

Efficient Estimation of A-basis and B-Basis Value under Epistemic Uncertainty using Importance Sampling and Control Variates

基于重要性采样和控制变量的认知不确定性下A基准和B基准值的高效估计

Elton Donfack-Siewe, Jérôme Morio, Sylvain Dubreuil, Jean-Philippe Navarro, Christian Fagiano

AI总结 针对航空航天认证中的保守分位数估计问题,提出一种利用重要性采样和控制变量在混合不确定性下高效估计A基准和B基准的方法,确保无偏一致估计并量化认知不确定性来源。

AI中文摘要

在航空航天认证和其他安全关键领域,保守分位数估计(如A基准和B基准值)对于保证可靠性至关重要。虽然这些指标传统上来自实验活动,但本文关注使用经过验证的确定性数值模型进行估计。该问题在混合偶然-认知不确定性下提出,考虑了有限材料数据、有限采样效应和代理模型误差。我们提出了一种在混合不确定性下具有统计保证的保守设计分位数估计方法。所提出的方法利用重要性采样和控制变量,在固定计算预算内实现准确高效的估计。一个关键点是代理模型仅作为方差缩减工具,这保证了无偏且一致的分位数估计。通过明确整合所有不确定性来源,所提出的框架为估计A基准和B基准提供了一种数值替代方案。此外,Sobol敏感性指数无需额外成本即可获得,从而洞察主要的认知不确定性来源。结构模型上的数值实验证明了该方法的可靠性和计算效率。特别是,将其应用于大规模工业模拟证实了其适用于航空航天认证工作流程,并突显了其在实际工程环境中的相关性。

英文摘要

In aerospace certification and other safety-critical domains, estimating conservative quantiles such as A- and B-basis values is essential to guarantee reliability. While these metrics are traditionally derived from experimental campaigns, this work focuses on their estimation using a validated deterministic numerical model. The problem is formulated under mixed aleatory-epistemic uncertainty, accounting for limited material data, finite sampling effects, and surrogate modeling errors. We propose a methodology for estimating conservative design quantiles with statistical guarantees under mixed uncertainties. The proposed method leverages importance sampling and control variates to achieve accurate and efficient estimates within a fixed computational budget. One key point is that the surrogate model serves solely as a variance reduction device, which guarantees unbiased and consistent quantile estimation. By explicitly integrating all sources of uncertainty, the proposed framework provides a numerical alternative for estimating A- and B-basis values. Furthermore, Sobol-based sensitivity indices are obtained at no additional cost, offering insight into the dominant epistemic sources. Numerical experiments on structural models demonstrate the method's reliability and computational efficiency. In particular, the application to large-scale industrial simulations confirms its suitability for aerospace certification workflows and highlights its relevance for real-world engineering environments.

2606.13019 2026-06-12 stat.AP 新提交

Stochastic Modeling of Composite Interfaces: Sensitivity to Spatial Correlation and Bayesian Identification from Standard Fracture Tests

复合材料界面的随机建模:对空间相关性的敏感性及基于标准断裂试验的贝叶斯识别

Elton Donfack-Siewe, Sylvain Dubreuil, Christian Fagiano, Jérôme Morio, Jean-Philippe Navarro

AI总结 提出随机有限元框架,通过空间相关随机场表征界面变异性,从标准断裂试验中利用近似贝叶斯计算提取关键参数,提升航空复合材料可靠性评估。

AI中文摘要

为了在数值上处理复合材料结构中的不确定性,本文提出了一个随机有限元框架,旨在提高航空航天复合材料的可靠性评估,特别关注加强筋脱粘。通过用空间相关的随机场表示层压部件之间的界面变异性,该方法旨在考虑更高尺度的模拟和测试中的离散(scatter)效应。对标准化模式I和模式II断裂试验进行的参数研究表明,相关长度是观察到的变异性的主要驱动因素,而协方差核的正则性只有边际影响。为了保证工业相关性,我们证明可以使用近似贝叶斯计算方法从实验断裂数据中提取这一关键参数。因此,所提出的方法为高保真虚拟测试以及在耐损伤复合材料机身设计中不确定性的预测管理提供了一条稳健的途径。

英文摘要

To enable numerical handling of uncertainties in composite structures, this work presents a stochastic finite-element framework aimed at improving the reliability assessment of aerospace composites, with particular attention to stiffener debonding. By representing interface variability between laminate parts with spatially correlated random fields, the method aims to account for scatter effects at higher scales of simulation and testing. A parametric study carried out on standardized Mode I and Mode II fracture tests reveals that the correlation length is the primary driver of observed variability, while the regularity of the covariance kernel has only a marginal impact. To guarantee industrial relevance, we demonstrate that this key parameter can be extracted from experimental fracture data using an Approximate Bayesian Computation approach. The proposed methodology therefore offers a robust route to high-fidelity virtual testing and to the predictive management of uncertainties in the design of damage-tolerant composite airframes.

2606.12889 2026-06-12 stat.AP 新提交

The Persistent Non-Response Bias in a Sample-Matched Poll for the 2024 U.S. Presidential Election

2024年美国总统选举中样本匹配民调的持续无应答偏差

Jay Chooi

AI总结 针对2024年美国总统选举民调偏差,利用数据缺陷相关性框架分析6万受访者数据,发现特朗普选民的无应答偏差持续存在,并提出基于历史数据缺陷相关性和投票率的预选举偏差校正估计器。

Comments Submitted to Journal of Survey Statistics and Methodology

AI中文摘要

唐纳德·特朗普赢得了2024年美国总统选举,尽管民调预测民主党领先,这呼应了2016年的民调失误。利用数据缺陷相关性框架,我们重新审视了6万受访者的合作选举研究,发现即使在针对美国成年人口进行样本匹配后,特朗普选民的无应答偏差仍然存在,且量级相同(ρ=-0.0030,而2016年为-0.0045)。我们还发现,在调整投票率后,哈里斯选民存在正向应答偏差。与2016年的发现一致,民调误差随州人口规模扩大而增加,更大的样本导致与常规置信区间的更大偏离,最大州的样本有效规模减少超过99%。我们提出了一种基于历史数据缺陷相关性和投票率的预选举偏差校正估计器,仅使用先前选举数据即可将均方根误差从0.13降至0.05,与选举后加权(均方根误差0.09)相当。

英文摘要

Donald Trump won the 2024 US Presidential Election despite polls predicting a Democratic lead, echoing the polling miss in 2016. Using the data defect correlation framework, we revisit the 60,000-respondent Cooperative Election Study and find that non-response bias for Trump voters persists on the same order of magnitude ($\rho=-0.0030$ vs $-0.0045$ in 2016) even under sample-matching to the US adult population. We additionally find evidence of positive response bias for Harris voters after adjusting for turnout. Consistent with findings in 2016, polling errors scale with state population size, and larger samples produce greater departures from conventional confidence intervals, with reductions of effective sample size exceeding 99% in the largest states. We propose a pre-election bias correction estimator informed by historical data defect correlations and turnout rates that decreases RMSE from 0.13 to 0.05 using only prior election data, comparable to post-election weighting (RMSE 0.09).
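The data defect correlation enters through an exact algebraic identity: the error of a respondent mean equals the correlation between the response indicator and the outcome, times sqrt((1 - f) / f), times the population standard deviation, where f is the responding fraction. The sketch below checks that identity on a simulated population. The response rates, population size, and names are hypothetical and are not estimates from the Cooperative Election Study.

```python
import numpy as np

def data_defect_decomposition(y, responded):
    """Return (actual error, error implied by the identity)."""
    y = np.asarray(y, dtype=float)
    r = np.asarray(responded, dtype=float)
    f = r.mean()                         # responding fraction n / N
    rho = np.corrcoef(r, y)[0, 1]        # data defect correlation
    sigma = y.std()                      # population SD (ddof=0)
    error = y[r == 1].mean() - y.mean()  # respondent mean minus population mean
    implied = rho * np.sqrt((1.0 - f) / f) * sigma
    return error, implied, rho

rng = np.random.default_rng(0)
N = 200_000
y = rng.random(N) < 0.5                  # hypothetical binary vote indicator
# Supporters (y = 1) respond slightly less often: 1.8% versus 2.2%.
responded = rng.random(N) < np.where(y, 0.018, 0.022)
error, implied, rho = data_defect_decomposition(y, responded)
print(f"rho = {rho:.4f}, error = {error:.4f}, implied = {implied:.4f}")
```

Because sqrt((1 - f) / f) grows as the responding fraction shrinks relative to the population, even a very small correlation can translate into a sizeable error in large populations, consistent with the abstract's remark that polling errors scale with state population size.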

2502.07695 2026-06-12 stat.AP stat.ME 版本更新

A scalable Bayesian double machine learning framework, with application to racial disproportionality assessment

可扩展的贝叶斯双机器学习框架及其在种族不成比例评估中的应用

Yu Luo, Vanessa McNealis, Yijing Li

AI总结 提出一种结合贝叶斯经验似然与双机器学习的半参数部分线性结构回归方法,用于控制高维混杂并纳入先验假设,应用于伦敦拦截搜查数据发现种族不成比例受行政区种族构成影响。

AI中文摘要

拦截搜查实践中的种族不成比例引发了对其社会和行为影响的重大关切。在伦敦,黑人被拦截搜查的可能性大约是白人的四倍。利用2019年1月至2023年12月伦敦拦截搜查事件的数据,本文旨在调查涉及黑人的表达性犯罪拦截量与其他种族相比的不成比例性。我们采用半参数部分线性结构回归方法,并引入一种结合双机器学习技术的贝叶斯经验似然程序,以控制高维混杂并适应强先验假设。此外,我们证明了所提程序在覆盖方面产生有效的后验。将该方法应用于拦截搜查数据集,我们发现针对黑人社区的种族不成比例可能在关注表达性犯罪时受到行政区种族构成的影响。

英文摘要

Racial disproportionality in stop and search practices elicits substantial concerns about its societal and behavioral impacts. In London, Black individuals are about four times more likely to be stopped and searched than White individuals. Using data on stop and search events in London from January 2019 to December 2023, this paper aims to investigate disproportionality in the volume of stops for expressive crimes involving Black individuals compared to other ethnicities. We employ a semi-parametric partially linear structural regression method and introduce a Bayesian empirical likelihood procedure combined with double machine learning techniques to control for high-dimensional confounding and to accommodate strong prior assumptions. In addition, we show that the proposed procedure yields a valid posterior in terms of coverage. Applying this approach to the stop and search dataset, we find that racial disproportionality aimed at the Black community may be influenced by the borough racial composition when focusing on expressive crimes.

9. 数据隐私、稳健性与公平性 6 篇

2606.13327 2026-06-12 stat.ME stat.OT 新提交

Disclosure risk in a geo-spatial setting

地理空间环境中的披露风险

Peter-Paul de Wolf

AI总结 针对主题地图发布统计信息时披露风险与效用的平衡问题,提出一种不受可修改面积单元问题影响的新风险度量,该度量与目标人口局部密度相关并考虑多单元连接,通过企业位置示例数据集展示其行为。

详情
AI中文摘要

使用主题地图发布统计信息已成为一种流行的可视化方式。与所有统计出版物一样,主题地图也必须处理披露风险与效用之间的平衡。然而,大多数风险和效用度量并未考虑地图的空间特征。一些提出的空间风险度量存在可修改面积单元问题(MAUP):略微改变区域分类可能会影响风险。实际上,即使是网格的微小平移也可能影响该风险。我们提出了一种新的风险度量,它不受MAUP的影响。此外,我们的风险直接与(目标)人口的局部密度相关,并考虑到多个单元可能连接到单个位置的情况。我们使用一个虚构但真实的企业位置示例数据集展示了风险度量的行为。我们的风险度量可以进行调整,以考虑放大或缩小对(感知)风险的影响以及所用分辨率的影响。

英文摘要

Using thematic maps to publish statistical information has become a popular visualization. As is the case with all statistical publications, thematic maps also have to deal with the balance between disclosure risk and utility. However, most risk and utility measures do not take into account the spatial character of a map. Some of the proposed spatial risk measures suffer from the Modifiable Areal Unit Problem (MAUP): slightly changing regional classifications may influence the risk. Indeed, even a small translation of for example a grid may influence that risk. We propose a new risk measure that does not suffer from MAUP. Moreover, our risk is directly related to the local density of the (target) population and takes into account that often multiple units may be connected to a single location. We show the behavior of our risk measure using an example dataset of fake but realistic locations of enterprises. Our risk measure can be adapted to take into account the effect on the (perceived) risk of zooming in or out and the effect of the used resolution.

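2606.13025 2026-06-12 stat.ME New submission

Diagnostics-guided variance-inflated Fay-Herriot estimation from non-probability samples

Andrius Čiginas

AI summary For small area estimation from non-probability samples, proposes a diagnostics-guided variance-inflated Fay-Herriot estimator that adjusts the variance inflation using domain diagnostics, smooths more strongly in weakly covered domains, and substantially reduces estimation error.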

Comments 17 pages, 2 figures

Abstract

Non-probability data sources are increasingly considered in small area estimation, but inverse probability weighting (IPW) gives model-dependent domain estimators whose reliability may vary substantially across domains. Standard Fay-Herriot (FH) smoothing borrows strength across domains, yet it uses the supplied area-level variance estimates as if they fully described the uncertainty of the input estimators. This can be misleading when some domains have weak coverage, unstable weights, or poor auxiliary balance, since these features may indicate selection-bias risk not captured by the estimated variance alone. We propose a diagnostics-guided variance-inflated FH estimator for finite-population domain totals. The method starts from calibrated IPW domain estimators, summarizes their reliability through a small set of domain diagnostics, and introduces a mixture variance-inflation component in the FH observation equation. Domains whose diagnostics indicate weaker IPW information are thereby smoothed more strongly toward the area-level regression mean. A truth-known validation based on a pseudo-real population of Lithuanian business enterprises shows a substantial reduction in estimation error relative to calibrated IPW.
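
The shrinkage mechanism in this abstract can be illustrated with the standard area-level Fay-Herriot composite predictor, where the weight on a direct estimate is A / (A + D_i). The sketch below is only an illustration under stated assumptions, not the paper's estimator: the function name `fh_shrink`, the scalar `inflation` factor, and all numbers are hypothetical, and the model variance and regression mean are treated as known rather than estimated.

```python
def fh_shrink(direct, reg_mean, sampling_var, model_var, inflation=1.0):
    """Fay-Herriot-style composite estimate for one domain.

    direct       : direct (e.g. calibrated IPW) domain estimate
    reg_mean     : area-level regression prediction for the domain
    sampling_var : supplied variance D_i of the direct estimate
    model_var    : between-area model variance A (assumed known here)
    inflation    : hypothetical factor >= 1 that inflates D_i for
                   domains whose diagnostics look poor
    """
    inflated = sampling_var * inflation
    gamma = model_var / (model_var + inflated)  # weight on the direct estimate
    return gamma * direct + (1.0 - gamma) * reg_mean, gamma


# Same domain, with and without variance inflation (illustrative numbers).
plain, g_plain = fh_shrink(direct=120.0, reg_mean=100.0,
                           sampling_var=4.0, model_var=4.0)
weak, g_weak = fh_shrink(direct=120.0, reg_mean=100.0,
                         sampling_var=4.0, model_var=4.0, inflation=3.0)
```

With these inputs the uninflated weight is 0.5 and the inflated weight is 0.25, so the domain flagged as weak is pulled further toward the regression mean.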

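2606.13629 2026-06-12 stat.ME cs.AI cs.LG stat.ML New submission

Valid Inference with Synthetic Data via Task Exchangeability

Lezhi Tan, Tijana Zrnic

AI summary Proposes a task exchangeability condition that ensures valid statistical inference with synthetic data in scientific research, with applications to public opinion surveys and AI evaluation.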

Abstract

There is a proliferation of work arguing for the use of synthetic data in scientific research. For example, social scientists are arguing for the use of LLM-generated "silicon samples" in pilot studies; AI evaluations increasingly rely on "LLM-as-a-judge" outputs; and proteomics research is accelerated by generative models that produce synthetic protein structures. These developments raise an intriguing possibility: synthetic data may help researchers ask more questions, run more studies, and accelerate discovery. But they also raise a fundamental concern: synthetic data can be biased, noisy, and misspecified. In this work, we propose statistical principles for using synthetic data in scientific research with provable validity guarantees. The key insight is a new technical condition that we call task exchangeability. Informally, this is a requirement that the researcher can identify historical tasks, for which real data is available, such that their current task of interest is exchangeable with the historical tasks in an appropriate mathematical sense. We develop methods for valid inference under task exchangeability, together with extensions that provide guarantees even beyond exchangeability. We demonstrate the framework on public opinion surveys with silicon samples and AI evaluation with autoraters.

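2606.12654 2026-06-12 stat.ME cs.LG stat.ML New submission

Computationally tractable robust differentially private mean estimation

Kelly Ramsay

AI summary Proposes the balloon mean, a new differentially private mean estimator that uses iterative clipping over expanding Mahalanobis balls to achieve computational tractability, robustness, and zero-concentrated differential privacy, with theoretical guarantees on statistical performance and robustness under heavy-tailed and contaminated elliptical models.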

Comments 40 pages, 17 figures

Abstract

We develop a new, differentially private mean estimator called the balloon mean. The main features of the balloon mean are that it is computationally tractable and enjoys robustness to outlying observations. It is based on an iterative clipping procedure over expanding Mahalanobis balls, or "balloons." The method satisfies zero-concentrated differential privacy and depends on a small number of interpretable tuning parameters. We provide theoretical guarantees under heavy-tailed and contaminated elliptical models, characterizing its statistical performance and robustness to outliers. Extensive simulations demonstrate that the balloon mean is robust to heavy-tailed and contaminated data, and outperforms existing differentially private mean estimators in contaminated settings.
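
A single clip-and-noise step gives a feel for how clipping and zero-concentrated differential privacy (zCDP) fit together. The sketch below is not the balloon mean: it uses one fixed Euclidean ball instead of iteratively expanding Mahalanobis balls, and the function names, centre, radius, and privacy budget are all assumptions chosen for illustration. It relies on the standard fact that Gaussian noise with standard deviation sigma on a query of L2 sensitivity Delta satisfies (Delta^2 / (2 sigma^2))-zCDP.

```python
import math
import random


def clip_to_ball(x, center, radius):
    """Project point x onto the Euclidean ball with the given center and radius."""
    diff = [xi - ci for xi, ci in zip(x, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return list(x)
    scale = radius / norm
    return [ci + d * scale for ci, d in zip(center, diff)]


def private_clipped_mean(data, center, radius, rho, rng):
    """One clip-and-noise step satisfying rho-zCDP via the Gaussian mechanism.

    Replacing one record moves the clipped mean by at most 2*radius/n in
    L2 norm, so noise with sigma = sensitivity / sqrt(2*rho) suffices.
    """
    n, dim = len(data), len(center)
    clipped = [clip_to_ball(x, center, radius) for x in data]
    mean = [sum(p[j] for p in clipped) / n for j in range(dim)]
    sigma = (2.0 * radius / n) / math.sqrt(2.0 * rho)
    return [m + rng.gauss(0.0, sigma) for m in mean]


rng = random.Random(0)
data = [[rng.gauss(1.0, 1.0), rng.gauss(-1.0, 1.0)] for _ in range(2000)]
data.append([1e6, 1e6])  # a gross outlier, neutralised by clipping
estimate = private_clipped_mean(data, center=[0.0, 0.0], radius=5.0,
                                rho=0.5, rng=rng)
```

The outlier is projected onto the ball before averaging, so it can shift the estimate by at most radius / n, which is why clipping gives both bounded sensitivity and robustness in this simplified setting.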

2601.21324 2026-06-12 stat.ML cs.LG Updated version

Bulk-Calibrated Credal Ambiguity Sets: Fast, Tractable Decision Making under Out-of-Sample Contamination

Mengqi Chen, Thomas B. Berrett, Theodoros Damoulas, Michele Caprio

Affiliations * University of Bristol; University of Cambridge; University of California, Berkeley; University of Oxford

AI summary Proposes bulk-calibrated credal ambiguity sets that treat contamination inside the bulk separately from the tail contribution, yielding a closed-form, finite risk objective that reduces to linear or second-order cone programs for efficient robust optimisation.

Comments Accepted for publication (spotlight) at ICML 2026

Abstract

Distributionally robust optimisation (DRO) minimises the worst-case expected loss over an ambiguity set that can capture distributional shifts in out-of-sample environments. While Huber (linear-vacuous) contamination is a classical minimal-assumption model for an $\varepsilon$-fraction of arbitrary perturbations, including it in an ambiguity set can make the worst-case risk infinite and the DRO objective vacuous unless one imposes strong boundedness or support assumptions. We address these challenges by introducing bulk-calibrated credal ambiguity sets: we learn a high-mass bulk set from data while considering contamination inside the bulk and bounding the remaining tail contribution separately. This leads to a closed-form, finite $\mathrm{mean}+\sup$ robust objective and tractable linear or second-order cone programs for common losses and bulk geometries. Through this framework, we highlight and exploit the equivalence between the imprecise probability (IP) notion of upper expectation and the worst-case risk, demonstrating how IP credal sets translate into DRO objectives with interpretable tolerance levels. Experiments on heavy-tailed inventory control, geographically shifted house-price regression, and demographically shifted text classification show competitive robustness-accuracy trade-offs and efficient optimisation times, using Bayesian, frequentist, or empirical reference distributions.
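
The mean-plus-sup structure mentioned in this abstract follows from a standard fact about epsilon-contamination (linear-vacuous) neighbourhoods: when the contaminating distribution is restricted to a bounded set, the worst-case expected loss equals (1 - eps) times the reference expectation plus eps times the supremum of the loss over that set. The sketch below illustrates only this identity. The separate tail bound described in the abstract is omitted, a finite grid stands in for the bulk set, and every name and number is hypothetical.

```python
def contaminated_worst_case(losses, bulk_losses, eps):
    """Upper expectation over an eps-contamination neighbourhood.

    losses      : losses under the reference (here empirical) distribution
    bulk_losses : losses at candidate contamination points inside a
                  bounded bulk set (a finite grid stands in for the set)
    eps         : contamination fraction in [0, 1]
    """
    mean_loss = sum(losses) / len(losses)
    return (1.0 - eps) * mean_loss + eps * max(bulk_losses)


def squared_loss(theta, y):
    return (theta - y) ** 2


sample = [0.8, 1.1, 0.9, 1.2, 1.0]                # illustrative observations
bulk_grid = [-2.0 + 0.1 * k for k in range(61)]   # bulk set [-2, 4] as a grid


def robust_objective(theta, eps=0.1):
    return contaminated_worst_case(
        [squared_loss(theta, y) for y in sample],
        [squared_loss(theta, y) for y in bulk_grid],
        eps,
    )


# Crude grid search over the decision variable theta in [0, 2].
candidates = [0.01 * k for k in range(201)]
best_theta = min(candidates, key=robust_objective)
```

Because the supremum is taken over a bounded set, the objective stays finite; over an unbounded support the same expression would be infinite for the squared loss, which is the vacuity problem the abstract describes.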

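2506.23033 2026-06-12 cs.LG stat.ML Updated version

How Reliable are Fairness Audits with Unreliable Data?

Yash Vardhan Tomar

Affiliations * Purdue University

AI summary Studies how missing protected labels affect fairness-mitigation audits. Proposes a seed-calibrated stress test that separates missingness effects from random seed-to-seed variation; finds that missingness with some labels still available usually does not change the selected mitigation method, that the no-label endpoint behaves differently, and that threshold optimization can turn single-axis fairness gains into intersectional harm.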

Abstract

Fairness audits are a key component of responsible machine-learning deployment. Yet, audit-recommendation reliability under incomplete protected-label access is still poorly understood. In this work, we focused on protected-label missingness in fairness mitigation audits. We introduced a seed-calibrated stress test to separate missingness effects from seed-to-seed movement already present under complete labels. Across ACS/Folktables tasks, missingness settings that retain some protected labels usually do not move selected mitigation methods beyond a complete-label seed-to-seed baseline. At $0\%$ protected-label access, candidates collapse to an empirical-risk-minimization baseline and deterministic tie-breaking rather than revealing a broad missingness effect. We also found that threshold optimization can turn fairness gains on a single protected axis into intersectional harm above a seed baseline, and this threshold-optimizer finding persists under random-forest validation. Overall, our results highlight that protected-label missingness should be reported with seed-null calibration, candidate-set context, and intersectional consequences before it is treated as evidence of audit fragility.

10. Datasets, Software, and Applications (8 papers)

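2606.13523 2026-06-12 stat.CO New submission

HNPclassifier: An R Package for Hierarchical Neyman-Pearson Classification

Lujia Yang, Che Shen, Shunan Yao, Lijia Wang

AI summary Presents the R package HNPclassifier, which implements the hierarchical Neyman-Pearson framework and controls under-classification errors in ordered multi-class classification using built-in or user-supplied scoring functions.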

Abstract

In multi-class classification problems, classes often have a natural priority ordering (e.g., cancer stages, COVID-19 severity levels, or air-quality categories). In such settings, it is important to prioritize correct identification of more severe classes and to control under-classification errors, which occur when an observation from a higher-priority class is misclassified into a lower-priority one. The Hierarchical Neyman-Pearson (H-NP) framework of Wang et al. (2024) was developed for ordered multi-class settings to prioritize under-classification error control; its H-NP umbrella algorithm provides high-probability control of under-classification errors at user-specified levels. This paper introduces the R package HNPclassifier, which implements H-NP umbrella algorithms to construct H-NP classifiers using built-in learners such as logistic regression, random forests, and support vector machines, as well as user-supplied scoring functions, thereby enabling effective error control for ordered multi-class classification tasks.

2606.12642 2026-06-12 astro-ph.EP astro-ph.IM stat.AP New submission

Quantifying Surface Heterogeneity Across Asteroid (101955) Bennu using Candidate Site Remote Sensing Data

Emma-Catherine Belhadfa, Neil E. Bowles, Katherine A. Shirley, Amy A. Simon, Victoria E. Hamilton, Hannah H. Kaplan

AI summary Uses visible-near infrared and thermal infrared spectra from the OSIRIS-REx mission to quantify heterogeneity in the mineral composition and physical properties of Bennu's surface at 2-10 m scales, finding significant differences in hydration indicators and silicate bands between sampling sites.

Comments Currently under review at JGR: Planets

Abstract

The OSIRIS-REx mission acquired spatially resolved (2-10 m spot sizes) visible-near infrared (VNIR) and thermal infrared (TIR) spectra across four candidate sampling sites on asteroid (101955) Bennu: Nightingale, Osprey, Sandpiper, and Kingfisher. To quantify heterogeneity across a small body (about 500 m radius) like Bennu, we explore remotely observed spectral data to draw conclusions about the mineralogical composition and key physical processes that drive surface variability. We derive diagnostic band parameters from the OSIRIS-REx Visible and Infrared Spectrometer and the OSIRIS-REx Thermal Emission Spectrometer datasets to quantify compositional and physical variability across sites and assess their mineralogical context. The VNIR spectra exhibit similar overall reflectance shapes but systematic differences in spectral slopes and the 2.74 micron OH absorption. TIR emissivity spectra reveal modest but statistically significant shifts in the Christiansen Feature, silicate stretching, and bending band positions, indicating differences in silicate composition, hydration state, and Mg/Fe relative abundance. Principal component analysis separates each site into distinct clusters in multivariate band-parameter space, whereas K-means clustering identifies intra-site spectral sub-populations. Welch's Analysis of Variance and Hotelling's tests confirm that band-parameter variations between sites are significant. These results reveal that Bennu's surface preserves measurable spectral heterogeneity at 2-10 m scales, with site-to-site variations in hydration indicators and silicate band positions. The spectral properties of Nightingale encompass the full range observed across all four sites, establishing a remote sensing baseline for contextualizing laboratory analyses of the returned sample within Bennu's broader composition diversity and alteration history.

2604.12497 2026-06-12 cs.LG stat.ML Updated version

Allocating Human Oversight in AI-Enabled Analytics

Zikun Ye, Jiameng Lyu, Rui Tao

Affiliations * Michael G. Foster School of Business, University of Washington; Department of Management Science, School of Management, Fudan University; Guanghua School of Management, Peking University

AI summary For AI predictions whose reliability is heterogeneous and unknown, proposes an online learning policy based on upper confidence bounds that dynamically allocates a limited human-validation budget, with a terminal efficiency loss that vanishes as the budget grows.

Abstract

Organizations increasingly deploy AI as a low-cost prediction layer in customer-facing decision processes, including demand sensing, service-quality monitoring, product testing, and market research, but AI-generated signals are unevenly reliable across tasks, products, and customer segments. Firms therefore still need scarce human validation (labels, audits, survey responses, or follow-up measurements) to anchor AI outputs to ground truth. Because human ground truth is itself noisy, varying across labelers and even across repeated judgments, the firm must collect and average several human labels per task, which makes human validation costly. We study how to allocate a limited human-validation budget across many AI-assisted tasks when reliability is heterogeneous and unknown before deployment. We cast this within tuned prediction-powered inference. Each human label both sharpens the AI-assisted estimate and reveals the task's rectification difficulty, the variance that remains after the AI prediction is optimally used as a control variate. If difficulties were known, the optimal allocation would follow a Neyman square-root rule; because they are unknown, we propose a policy based on upper confidence bounds that learns them online and steers validation toward tasks where AI is least reliable. We prove that the policy's terminal efficiency loss relative to the oracle allocation vanishes as the budget grows. In synthetic experiments and a real digital-twin survey with 68 tasks and over 2000 respondents, it closes most of the gap to the oracle when reliability is heterogeneous, outperforming uniform and epsilon-greedy allocation; on the survey data it also outperforms explore-then-commit pilot designs and cuts the 10-12% gap of uniform allocation to 2-6%. The value of AI depends not only on model accuracy but also on the operational policy that targets human oversight where AI errors matter most.
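
The oracle benchmark named in this abstract, a Neyman square-root rule, can be sketched for the simplest objective. Assuming the goal is to minimise the summed variance sum_i v_i / n_i under a fixed label budget (an assumption, since the abstract does not state the exact objective), the optimal real-valued allocation is proportional to the square root of each task's variance. The function names and the variances below are hypothetical, and the online upper-confidence-bound policy itself is not reproduced.

```python
import math


def neyman_allocation(variances, budget):
    """Split `budget` labels across tasks in proportion to sqrt(variance).

    Minimises sum_i variances[i] / n_i subject to sum_i n_i = budget
    (real-valued allocation; rounding is left out of this sketch).
    """
    roots = [math.sqrt(v) for v in variances]
    total = sum(roots)
    return [budget * r / total for r in roots]


def total_variance(variances, allocation):
    """Summed variance of per-task averages under a given allocation."""
    return sum(v / n for v, n in zip(variances, allocation))


variances = [1.0, 4.0, 9.0]   # hypothetical rectification difficulties
budget = 600.0
oracle = neyman_allocation(variances, budget)         # 100, 200 and 300 labels
uniform = [budget / len(variances)] * len(variances)  # 200 labels per task
gap = total_variance(variances, uniform) / total_variance(variances, oracle) - 1.0
```

In this toy case uniform allocation has a summed variance of 0.07 against 0.06 for the oracle, so the harder tasks are where extra labels pay off.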

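2601.09693 2026-06-12 cs.LG stat.ML Updated version

Contrastive Geometric Learning Unlocks Unified Structure- and Ligand-Based Drug Design

Lisa Schneckenreiter, Sohvi Luukkonen, Lukas Friedrich, Daniel Kuhn, Günter Klambauer

Affiliations * DeepMind Ltd

AI summary Proposes ConGLUDe, a contrastive geometric model that unifies structure-based and ligand-based training and supports virtual screening, target fishing, and ligand-conditioned pocket prediction, with strong results across several benchmarks.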

Comments Forty-Third International Conference on Machine Learning

Abstract

Structure-based and ligand-based computational drug design have traditionally relied on disjoint data sources and modeling assumptions, limiting their joint use at scale. In this work, we introduce Contrastive Geometric Learning for Unified Computational Drug Design (ConGLUDe), a single contrastive geometric model that unifies structure- and ligand-based training. ConGLUDe couples a geometric protein encoder that produces whole-protein representations and implicit embeddings of predicted binding sites with a fast ligand encoder, removing the need for predefined pockets. By aligning ligands with both global protein representations and multiple candidate binding sites through contrastive learning, ConGLUDe supports ligand-conditioned pocket prediction in addition to virtual screening and target fishing, while being trained jointly on protein-ligand complexes and large-scale bioactivity data. Across diverse benchmarks, ConGLUDe achieves competitive zero-shot virtual screening performance, substantially outperforms existing methods on a challenging target fishing task, and demonstrates state-of-the-art ligand-conditioned pocket selection. These results highlight the advantages of unified structure-ligand training and position ConGLUDe as a step toward general-purpose foundation models for drug discovery.

2511.02430 2026-06-12 stat.CO cs.MS cs.SE stat.ML Updated version

Efficient Solvers for SLOPE in R, Python, Julia, and C++

Johan Larsson, Malgorzata Bogdan, Krystyna Grzesiak, Mathurin Massias, Jonas Wallin

AI summary Presents a suite of packages in R, Python, Julia, and C++ that efficiently solve the Sorted L-One Penalized Estimation (SLOPE) problem using a hybrid coordinate descent algorithm, support several loss functions and data structures, and outperform existing implementations.

Comments 30 pages, 8 figures

Abstract

We present a suite of packages in R, Python, Julia, and C++ that efficiently solve the Sorted L-One Penalized Estimation (SLOPE) problem. The packages feature a highly efficient hybrid coordinate descent algorithm that fits generalized linear models (GLMs) and supports a variety of loss functions, including Gaussian, binomial, Poisson, and multinomial logistic regression. Our implementation is designed to be fast, memory-efficient, and flexible. The packages support a variety of data structures (dense, sparse, and out-of-memory matrices) and are designed to efficiently fit the full SLOPE path as well as handle cross-validation of SLOPE models, including the relaxed SLOPE. We present usage examples and benchmarks on both real and simulated data, which show that our packages are faster than existing implementations of SLOPE.
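
For readers unfamiliar with SLOPE, the penalty it adds to the loss is the sorted L1 norm: the largest absolute coefficient is paired with the largest penalty weight, the second largest with the second weight, and so on. The sketch below only evaluates that penalty in plain Python. It does not use or imitate the API of the packages described here, and the function name and example values are assumptions for illustration.

```python
def sorted_l1_norm(beta, lam):
    """Sorted-L1 (SLOPE) penalty: sum_i lam[i] * |beta|_(i).

    |beta|_(1) >= |beta|_(2) >= ... are the absolute coefficients in
    decreasing order; lam must be non-negative and non-increasing.
    """
    if len(beta) != len(lam):
        raise ValueError("beta and lam must have the same length")
    if any(a < b for a, b in zip(lam, lam[1:])) or (lam and lam[-1] < 0):
        raise ValueError("lam must be non-increasing and non-negative")
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(l * m for l, m in zip(lam, magnitudes))


beta = [0.5, -2.0, 0.0, 1.0]
# 3*2 + 2*1 + 1*0.5 + 0.5*0 = 8.5
penalty = sorted_l1_norm(beta, lam=[3.0, 2.0, 1.0, 0.5])
# With constant weights the penalty reduces to the lasso (plain L1) norm.
lasso_case = sorted_l1_norm(beta, lam=[1.0, 1.0, 1.0, 1.0])
```

The constant-weight case shows why SLOPE is usually described as a generalisation of the lasso.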

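2508.14858 2026-06-12 stat.ME stat.ML Updated version

Data Fusion for High-Resolution Estimation

Amy Guan, Roshni Sahoo, Joshua Salomon, Stefan Wager, Marissa Reitsma

AI summary Proposes a method that fuses unbiased low-resolution data with biased high-resolution data by learning, in the sense of KL divergence, a distribution consistent with the administrative data, substantially reducing bias in high-resolution estimates.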

Abstract

High-resolution estimates of population health indicators are critical for precision public health. We propose a method for high-resolution estimation that fuses distinct data sources: an unbiased, low-resolution data source (e.g. aggregated administrative data) and a potentially biased, high-resolution data source (e.g. individual-level online survey responses). We assume that the potentially biased, high-resolution data source is generated from the population under a model of sampling bias where observables can have arbitrary impact on the probability of response but the difference in the log probabilities of response between units with the same observables is linear in the difference between sufficient statistics of their observables and outcomes. Our data fusion method learns a distribution that is closest (in the sense of KL divergence) to the online survey distribution and consistent with the aggregated administrative data and our model of sampling bias. We evaluate the approach on a testbed that includes repeated measurements of three indicators, measured by both the (online) Household Pulse Survey and ground-truth data sources at two geographic resolutions over the same time period. On this testbed, it significantly reduces bias in high-resolution estimates compared to baselines that rely on a single data source.

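2407.18572 2026-06-12 stat.AP math.ST stat.OT stat.TH Updated version

Bernoulli amputation

Marius Hofert, James Jackson, Niels Hagenbuch

AI summary Proposes a stochastic amputation method based on Bernoulli margins and copulas that specifies distributions of missingness indicators rather than manual patterns, flexibly generating a wide range of missingness patterns, including structured missingness.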

Abstract

A novel, stochastic approach to amputation, the process of introducing missing values to a complete dataset, is presented. It allows one to construct a wide variety of missingness patterns by only having to specify distributions of missingness indicators as opposed to specifying each missingness pattern manually. Missingness indicators are modeled in a principled way via copulas and Bernoulli margins, thus allowing one to incorporate dependence in missingness patterns. Besides more classical missingness mechanisms such as missing completely at random, missing at random, and missing not at random, the approach is able to model structured missingness such as block missingness and, via mixtures, monotone missingness, which are patterns of missing data frequently found in real-life datasets. Properties such as joint missingness probabilities or missingness correlation are derived mathematically. The flexibility of the approach in capturing different missingness patterns, while requiring only distributional assumptions on missingness indicators, is demonstrated with mathematical examples and empirical illustrations based on a well-known example dataset whose sample size is small enough that each missing data point can be identified visually. Finally, an example application to multivariate financial time series is provided.
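
The idea of combining Bernoulli margins with a copula can be sketched as follows: draw dependent latent variables, then flag a cell as missing when its latent value falls below the quantile matching the desired missingness probability. This is a minimal illustration under assumptions, not the authors' implementation. It picks an equicorrelated Gaussian copula with a single shared factor, ignores the observed data (so the mechanism is missing completely at random), and uses hypothetical function names and parameters.

```python
import random
from statistics import NormalDist


def gaussian_copula_indicators(n_rows, probs, rho, rng):
    """Draw missingness indicators with Bernoulli(probs[j]) margins.

    Dependence comes from an equicorrelated Gaussian copula: each row
    shares one latent factor, so columns tend to go missing together
    when rho is large. Returns rows of 0/1 flags (1 = missing).
    """
    std = NormalDist()
    thresholds = [std.inv_cdf(p) for p in probs]  # requires 0 < p < 1
    rows = []
    for _ in range(n_rows):
        common = rng.gauss(0.0, 1.0)
        row = []
        for t in thresholds:
            # Standard normal latent variable with pairwise correlation rho.
            z = (rho ** 0.5) * common + ((1.0 - rho) ** 0.5) * rng.gauss(0.0, 1.0)
            row.append(1 if z <= t else 0)
        rows.append(row)
    return rows


def amputate(data, indicators):
    """Replace flagged cells with None, leaving the input untouched."""
    return [[None if m else x for x, m in zip(row, flags)]
            for row, flags in zip(data, indicators)]


rng = random.Random(1)
complete = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(5000)]
flags = gaussian_copula_indicators(5000, probs=[0.1, 0.3, 0.5], rho=0.6, rng=rng)
incomplete = amputate(complete, flags)
missing_rates = [sum(col) / len(flags) for col in zip(*flags)]
```

The marginal missingness rates are fixed by `probs`, while `rho` controls only how strongly the columns go missing together, which mirrors the separation of margins and dependence that a copula provides.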

2501.04823 2026-06-12 cs.RO math.OC stat.AP Updated version

Learning Robot Safety from Sparse Human Feedback using Conformal Prediction

Aaron O. Feldman, Joseph A. Vincent, Maximilian Adang, JunEn Low, Mac Schwager

Affiliations * Department of Aeronautics and Astronautics, Stanford University

AI summary Uses binary human feedback on policy trajectories and conformal prediction to identify a region of states containing future policy errors, builds a warning system with a guaranteed miss rate, and uses it to improve the safety of a model predictive controller.

Abstract

Ensuring robot safety can be challenging; user-defined constraints can miss edge cases, policies can become unsafe even when trained from safe data, and safety can be subjective. Thus, we learn about robot safety by showing policy trajectories to a human who flags unsafe behavior. From this binary feedback, we use the statistical method of conformal prediction to identify a region of states, potentially in learned latent space, guaranteed to contain a user-specified fraction of future policy errors. Our method is sample-efficient, as it builds on nearest neighbor classification and avoids withholding data as is common with conformal prediction. By alerting if the robot reaches the suspected unsafe region, we obtain a warning system that mimics the human's safety preferences with guaranteed miss rate. From video labeling, our system can detect when a quadcopter visuomotor policy will fail to steer through a designated gate. We present an approach for policy improvement by avoiding the suspected unsafe region. With it we improve a model predictive controller's safety, as shown in experimental testing with 30 quadcopter flights across 6 navigation tasks. Code and videos are provided.

11. Other / General Statistics (12 papers)

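2606.13280 2026-06-12 math.ST stat.TH New submission

Generalization Bounds for Transformer-Based Next-Token Prediction in a Language Model

Insung Kong, Niklas Dexheimer, Johannes Schmidt-Hieber

AI summary Tailored to the characteristics of text data, proposes a data distribution based on an extension of the log-bilinear language model and derives generalization bounds for deep transformer architectures, showing their dependence on network architecture, vocabulary size, number of documents, and document length.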

Abstract

A refined statistical understanding of LLM pre-training requires the analysis of the transformer architecture for data distributions that encapsulate key characteristics of text data. To address this, we propose a text data distribution based on an extension of the log-bilinear language model from the natural language processing literature. For this data generating process, we derive generalization bounds for deep transformer architectures, highlighting the dependence on the network architecture, the vocabulary size, the number of documents and the document length.

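2606.13230 2026-06-12 math.ST stat.TH New submission

Consistency of variational approximations under bounded Kullback-Leibler divergence

Hien Duy Nguyen, Jacob Westerhout, Thomas Guilmeau, Julyan Arbel

AI summary Studies when variational approximations inherit posterior consistency in Bayesian inference, proving that on a general metric space, if the Kullback-Leibler divergence between the approximating and target measures is uniformly bounded and the target posteriors converge weakly to a Dirac mass at the true parameter, the variational sequence is also consistent.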

Abstract

Variational methods are widely used to approximate posterior distributions in Bayesian inference when exact computation is infeasible. We study when such approximations inherit posterior consistency. Our first result shows that, on a general metric space, a uniform bound on the Kullback-Leibler divergence from the approximating measures to a tight sequence of target measures forces the approximating sequence to be tight. It follows that if the target posteriors converge weakly to a Dirac mass at the true parameter, then any variational sequence with bounded Kullback-Leibler divergence to the targets is also consistent. We also give simple logarithmic-moment conditions that verify this boundedness condition, and illustrate them for smooth generalised posterior distributions.

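2606.13084 2026-06-12 math.ST math.PR stat.TH New submission

Characterizing metric-space-valued processes: separating classes and weak invariance principles for measure-theoretic inference

Anne van Delft

AI summary Studies stochastic processes with values in metric spaces that lack a topological vector space structure, uses the ball property to establish separating classes, proposes a measure-theoretic inference methodology, and derives weak invariance principles and $L^p$-based alternatives.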

Abstract

This article investigates stochastic processes taking values in metric spaces that lack a topological vector space structure, a regime characterized by intricate interplay between topological, geometric, and temporal dependence structures. It is formally established that spaces admitting an isometric Hilbertian embedding constitute a strict subclass within the much broader class of metric spaces possessing the ball property. While traditional kernel methods are susceptible to geometric distortion when the underlying space cannot be isometrically embedded into a Hilbert space, we bypass such limitations by exploiting a fundamental structural property inherent to this broader class; namely, that Borel probability measures are uniquely determined by their values on balls. These separating classes provide the foundation for the subsequently introduced measure-theoretic inference methodology. We derive uniform convergence of a family of time-dependent random measures, alongside weak invariance principles for the corresponding nonstationary random fields. This framework explicitly exposes how dependence and geometric complexity influence sample path regularity. Furthermore, because the rapid decay of small-ball probabilities can prohibit the existence of limiting distributions for supremum-based discrepancy measures, we develop $L^p$-based alternatives. By directly leveraging the introduced convergence results, this approach circumvents the need for higher-order $U$-process formulations. Finally, for spaces that do admit an isometric Hilbertian embedding, and where $U$-processes naturally arise, we establish limit theory for both degenerate and nondegenerate multi-parameter $U$-processes, and demonstrate that local discrepancy tests maintain asymptotic stability under dynamic parameter regimes.

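2606.12943 2026-06-12 math.ST stat.TH New submission

Phase transition of Schott's statistic for high-dimensional heavy-tailed data

Hantao Chen, Guangming Pan, Cheng Wang

AI summary Studies the asymptotic distribution of Schott's statistic for high-dimensional data from $\alpha$-regularly varying populations, identifies a phase transition between the light-tailed ($\alpha > 3$) and heavy-tailed ($\alpha < 3$) cases, and proposes a standardized test statistic that applies with unknown location parameters and for all $\alpha > 0$.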

Comments 42 pages

Abstract

Consider Schott's statistic (Schott, 2005) defined as the squared Frobenius norm of the sample correlation matrix for data from $\alpha$-regularly varying populations. We investigate its asymptotic distribution in a general framework characterized by data dimension $p$, sample size $n$, and regularly varying coefficients $\alpha$. In particular, we identify a phase transition phenomenon in the asymptotic behavior. For light-tailed populations ($\alpha > 3$), we revisit the $\alpha$-free asymptotic distribution but relax the constraint on the ratio $p/n$. For heavy-tailed populations ($\alpha < 3$), we derive a new asymptotic normal distribution whose variance explicitly depends on $\alpha$. We also propose a consistent estimator for the asymptotic variance such that the standardized Schott's test statistic remains applicable for unknown location parameters and all $\alpha > 0$.
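
The raw quantity described in this abstract, the squared Frobenius norm of the sample correlation matrix, is simple to compute. The sketch below does so in plain Python for a small hypothetical data set; the centering and scaling needed to turn it into a test statistic, and the variance estimator proposed in the paper, are not shown. Since a correlation matrix has a unit diagonal, the squared norm equals $p$ plus twice the sum of squared off-diagonal correlations, which the example also checks.

```python
import math


def correlation_matrix(rows):
    """Sample Pearson correlation matrix of an n x p data set (list of rows)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(p)]
    return [[sum(c[i] * c[j] for c in centered) / (norms[i] * norms[j])
             for j in range(p)] for i in range(p)]


def squared_frobenius_norm(matrix):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(v * v for row in matrix for v in row)


# Illustrative data: column 2 is twice column 1, column 3 alternates in sign.
rows = [[1.0, 2.0, 1.0], [2.0, 4.0, -1.0], [3.0, 6.0, 1.0], [4.0, 8.0, -1.0]]
R = correlation_matrix(rows)
stat = squared_frobenius_norm(R)
off_diag = sum(R[i][j] ** 2 for i in range(3) for j in range(i))
```

Here the three squared pairwise correlations are 1, 0.2, and 0.2, so the squared norm is 3 + 2 * 1.4 = 5.8.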

2606.12448 2026-06-12 physics.geo-ph stat.CO stat.ME new submission

A generalized framework for performance-based earthquake engineering: integrated assessment of structural reliability and resilience

基于性能的地震工程通用框架:结构可靠性与韧性的综合评估

C. Nardin, S. Marelli, B. Sudret, M. Broccardo

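AI summary: Proposes a generalized PBEE framework that embeds damage and recovery into the system dynamics through a continuous-time Markov chain, gives a unified description of structural reliability and resilience, and uses the spectral properties of the generator matrix to compute both metrics efficiently.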

AI Chinese abstract

评估地震灾害下的结构性能需要考虑损伤累积和震后恢复。在当前基于性能的地震工程(PBEE)中,恢复通常被视为后处理属性,而结构性能采用泊松超越假设建模,该假设隐含可更新性和无记忆性。这些假设阻碍了在重复地震荷载下对可靠性和韧性的统一处理。本研究提出了一个通用PBEE框架,其中损伤和恢复通过连续时间马尔可夫链直接嵌入系统动力学。单个生成矩阵控制状态依赖的转移,提供了结构可靠性和韧性的统一描述,同时与标准PBEE指标兼容。时间相关的失效概率和可靠性指标从瞬态系统动力学导出,而韧性通过倒塌前的预期运行时间比例量化。该框架利用生成矩阵的谱特性高效且透明地计算这两个指标。该方法通过一个三状态示例进行说明,并应用于两个结构原型:一个支撑框架和一个基础隔震系统。结果表明,即使传统可靠性指标表现出有限的敏感性,恢复动力学也能强烈影响长期韧性,强调了在生命周期地震性能评估中明确考虑恢复的必要性。

English abstract

Assessing structural performance under seismic hazard requires accounting for both damage accumulation and post-event recovery. In current performance-based earthquake engineering (PBEE), recovery is generally treated as a post-processing attribute, while structural performance is modeled using Poissonian exceedance assumptions that imply renewability and memorylessness. These assumptions hinder a unified treatment of reliability and resilience under repeated seismic loading. This study proposes a generalized PBEE framework in which damage and recovery are embedded directly into the system dynamics through a continuous-time Markov chain. A single generator matrix governs state-dependent transitions, providing a unified description of structural reliability and resilience while remaining compatible with standard PBEE metrics. Time-dependent failure probabilities and reliability indices are derived from the transient system dynamics, whereas resilience is quantified through the expected fraction of operational time before collapse. The framework exploits the spectral properties of the generator matrix to compute both metrics efficiently and transparently. The methodology is illustrated on a three-state example and applied to two structural archetypes: a braced frame and a base-isolated system. Results show that recovery dynamics can strongly affect long-term resilience even when conventional reliability measures exhibit limited sensitivity, emphasizing the need to explicitly account for recovery in life-cycle seismic performance assessment.
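
To make the continuous-time Markov chain idea concrete, here is a minimal Python sketch of a three-state chain (intact, damaged, collapsed) driven by a single generator matrix. Everything specific in it is an assumption for illustration: the rates in `Q`, the function names, and the resilience proxy, which is taken here as the ratio of expected time spent intact to expected total time before collapse. The abstract states that the paper exploits the spectral properties of the generator; this sketch instead uses uniformization, a different and simpler numerical route to the same transient probabilities.

```python
import math


def transient_distribution(Q, p0, t, tol=1e-12, max_terms=100000):
    """State probabilities at time t for a CTMC with generator Q, via uniformization."""
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    if rate == 0.0 or t == 0.0:
        return list(p0)
    # Embedded discrete-time transition matrix P = I + Q / rate.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)] for i in range(n)]
    weight = math.exp(-rate * t)  # Poisson(rate * t) probability of zero jumps
    total = weight
    v = list(p0)
    result = [weight * x for x in v]
    k = 0
    while 1.0 - total > tol and k < max_terms:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * t / k
        total += weight
        result = [r + weight * x for r, x in zip(result, v)]
    return result


def expected_times_before_collapse(Q):
    """Expected time in states 0 and 1 before absorption in state 2, starting in state 0."""
    a, b = -Q[0][0], -Q[0][1]
    c, d = -Q[1][0], -Q[1][1]
    det = a * d - b * c
    # First row of the inverse of the 2x2 matrix -Q_TT (the fundamental matrix).
    return d / det, -b / det


# Hypothetical rates per year: damage 0.10, repair 1.00, collapse from damaged 0.05.
Q = [
    [-0.10, 0.10, 0.00],  # intact
    [1.00, -1.05, 0.05],  # damaged
    [0.00, 0.00, 0.00],   # collapsed (absorbing)
]
p_fail_50 = transient_distribution(Q, [1.0, 0.0, 0.0], 50.0)[2]
t_intact, t_damaged = expected_times_before_collapse(Q)
intact_fraction = t_intact / (t_intact + t_damaged)
```

Raising the repair rate in `Q` lengthens the expected time spent intact, so both the failure probability and the resilience proxy respond to recovery dynamics in this toy model.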

2606.11110 2026-06-12 math.ST cs.IT math.IT stat.TH new submission

Fixed-Threshold One-Bit Toeplitz Covariance Estimation under Sparse-Ruler Sampling

稀疏尺采样下的固定阈值一比特Toeplitz协方差估计

Zhiyong Cheng, Shengyao Chen

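AI summary: Studies Toeplitz covariance estimation when fixed-threshold one-bit quantization is combined with deterministic sparse-ruler sampling, proposes a centered sparse-ruler Toeplitz estimator, proves a dimension-free Gaussian variance contraction theorem, and shows that the oracle estimator attains the minimax optimal rate under balanced coverage geometry.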

Comments v2: substantially revised; 21 pages main text + appendix, 59 pages total

AI Chinese abstract

我们研究当固定阈值一位量化与确定性稀疏尺采样结合时的Toeplitz协方差估计。每个观测比特可以进入多个滞后乘积。在非零阈值下,符号具有非零均值,这种确定性顶点重用使得原始符号乘积具有一致的单一顶点分量。该分量改变了方差几何。原始非零阈值乘积由加权度行和而非滞后覆盖或边Frobenius几何控制。中心化符号去除了顶点分量,留下退化的稀疏对统计量。然后我们证明了有界坐标变换的空心二次型的维度无关高斯方差收缩定理。该定理适用于硬阈值符号,并通过边权重的Frobenius范数控制任意确定性稀疏支撑,与维度、支撑大小或最大度无关。对于算子范数估计,我们构建了具有池化边际校准的中心化稀疏尺Toeplitz估计器。领先的oracle项为\[ \gamma_0 L_1\kappa_{\rm obs} \sqrt{\frac{\varphi(\Omega)\log d}{n}}, \qquad \varphi(\Omega)=\sum_{s=1}^{d-1}q_s^{-1}, \] 而插件项由边际比特预算\(n|\Omega|\)控制。在已知尺度恒等邻域子模型中的一个真实谱打包下界表明,在平衡覆盖几何下,\(\sqrt{\varphi(\Omega)\log d/n}\)依赖性是固有的。在非饱和区域中,当该覆盖项占主导时,oracle估计器在子模型上达到极小化最优率;关于条件数、曲率和插件校准常数的优化依赖性留待进一步研究。

English abstract

We study Toeplitz covariance estimation when fixed-threshold one-bit quantization is combined with deterministic sparse-ruler sampling, so that each observed bit is reused across many lag products. At a nonzero threshold the signs have nonzero mean, and this reuse gives raw sign products a coherent one-vertex variance component governed by weighted row sums; centering removes it and leaves a degenerate sparse-pair statistic. We prove a Gaussian variance contraction theorem for hollow quadratic forms of bounded coordinate transforms, including hard threshold signs: the variance is bounded by the squared correlation operator norm times the squared Frobenius norm of the edge weights, with constants independent of dimension, support size and maximum degree. For the oracle centered sparse-ruler estimator, the leading operator-norm term is \(γ_0L_1κ_{\rm obs}\sqrt{φ(Ω)\log d/n}\), where \(φ(Ω)=\sum_{s=1}^{d-1}q_s^{-1}\) is the coverage coefficient of the ruler; pooled marginal calibration from the \(n|Ω|\) observed bits adds a plug-in term. A spectral-packing lower bound in a known-scale identity-neighborhood submodel shows that this dependence is intrinsic under balanced coverage geometry; in the non-saturated regime where the coverage term dominates, the oracle estimator is minimax rate optimal over this submodel.

2603.26116 2026-06-12 stat.ME stat.AP updated version

Reconciling Latent Variables and Networks: Exploring and extending the Psychometric-Toolbox

整合潜在变量与网络:探索和扩展心理测量工具箱

Kevin Kistermann, Vivato V. Andriamiarana, Augustin Kelava

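AI summary: Reviews and synthesizes the connections between network psychometrics and classical psychometric methods, and proposes extending the psychometric toolbox with statistical methods from other disciplines, which may foster cross-field collaboration and make methodological development more systematic and goal-directed.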

AI Chinese abstract

自网络心理测量引入以来,已建立了与经典心理测量模型(如IRT、SEM、GLM)及其他领域方法的联系。本文回顾了这些发展,并通过探索性文献检索进一步扩展和以可视化形式呈现。这种视角为通过整合和学习其他领域开发的统计方法来扩展心理测量工具箱提供了机会,这些方法往往解决相似或相同的问题。强调这些方法论的共同点可能促进传统上独立的跨领域合作。此外,了解这些联系可能使方法论发展更加系统和目标明确,并可能使开发统计方法与通过软件工具进行实证研究之间实现有意义的分工。最后,这些方法论进展为实证研究提供了新机会,并可能有助于解决长期存在的心理测量构念及更广泛的心理现象概念问题。

English abstract

Since the introduction of network psychometrics, several connections to statistical models in "classical" psychometrics (i.e., IRT, SEM, GLM) as well as to approaches from other research fields have been established. In this paper, these developments have been reviewed and synthesized and, based on an exploratory literature search, further advanced and presented in an accessible visual format. This perspective opens up promising opportunities to extend the psychometric-toolbox by incorporating and learning from statistical methodologies developed in other research domains, which often address similar or even identical problems. Highlighting these methodological commonalities may also foster collaboration across research fields that have traditionally remained largely independent. Moreover, awareness of these connections may render methodological development more systematic and goal-directed and may enable a meaningful division of labor, for example between the development of statistical methodology and its practical implementation for empirical research through software tools. Finally, these methodological advances provide new opportunities for empirical research and may contribute to a reconciliation with longstanding conceptual issues concerning psychometric constructs and, more broadly, psychological phenomena.

2501.19126 2026-06-12 math.ST stat.TH updated version

Asymptotic optimality theory of confidence intervals of the mean

均值置信区间的渐近最优性理论

Vikas Deep, Achal Bassamboo, Sandeep Juneja

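AI summary: Studies the construction of confidence intervals for the mean from i.i.d. samples, identifies three learning regimes from the asymptotic relation between sample size and confidence level, and proves that confidence intervals based on KL divergence attain the asymptotically optimal width for single-parameter exponential families and bounded-support distributions.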

AI Chinese abstract

我们研究经典问题:给定\(N\)个i.i.d.样本,构造分布均值的置信区间(CI),使得CI以至少\(1 - \delta\)的概率包含真实均值,其中\(\delta \in (0,1)\)。我们根据当样本量\(N_{\delta} \to \infty\)且\(\delta \to 0\)时任何CI的最小可达极限宽度,刻画了三种不同的学习机制。在第一种机制中,\(N_{\delta}\)增长慢于\(\log(1/\delta)\),任何CI的极限宽度等于分布支撑的宽度,排除了有意义的推断。在第二种机制中,\(N_{\delta}\)与\(\log(1/\delta)\)同阶,我们精确刻画了依赖于缩放常数的最小极限宽度。在第三种机制中,\(N_{\delta}\)增长快于\(\log(1/\delta)\),可实现完全学习,CI的极限宽度收缩到零,收敛到真实均值。我们证明,由基于Kullback-Leibler(KL)散度的集中不等式导出的CI在充分学习和完全学习机制下,对于单参数指数族和有界支撑分布族,达到了渐近最优性能,即获得了最小极限宽度。此外,这些结果可推广到单侧CI,只需适当调整宽度概念。最后,我们将结果推广到具有随机每样本成本的情形,受随机模拟器和云服务选择等实际应用启发。我们考虑成本预算\(C_{\delta}\)而非固定样本量,识别类似的学习机制并刻画最优CI构造策略。

English abstract

We address the classical problem of constructing confidence intervals (CIs) for the mean of a distribution, given \(N\) i.i.d. samples, such that the CI contains the true mean with probability at least \(1 - δ\), where \(δ \in (0,1)\). We characterize three distinct learning regimes based on the minimum achievable limiting width of any CI as the sample size \(N_δ \to \infty\) and \(δ \to 0\). In the first regime, where \(N_δ\) grows slower than \(\log(1/δ)\), the limiting width of any CI equals the width of the distribution's support, precluding meaningful inference. In the second regime, where \(N_δ\) scales as \(\log(1/δ)\), we precisely characterize the minimum limiting width, which depends on the scaling constant. In the third regime, where \(N_δ\) grows faster than \(\log(1/δ)\), complete learning is achievable, and the limiting width of the CI collapses to zero, converging to the true mean. We demonstrate that CIs derived from concentration inequalities based on Kullback--Leibler (KL) divergences achieve asymptotically optimal performance, attaining the minimum limiting width in both sufficient and complete learning regimes for distributions in two families: single-parameter exponential and bounded support. Additionally, these results extend to one-sided CIs, with the width notion adjusted appropriately. Finally, we generalize our findings to settings with random per-sample costs, motivated by practical applications such as stochastic simulators and cloud service selection. Instead of a fixed sample size, we consider a cost budget \(C_δ\), identifying analogous learning regimes and characterizing the optimal CI construction policy.
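
The abstract does not spell out the interval construction, so the following Python sketch shows only the textbook Chernoff-type KL interval for a Bernoulli mean, as one plausible instance of a CI derived from a KL-based concentration inequality. The interval is the set of \(q\) with \(N \cdot \mathrm{KL}(\hat p \,\|\, q) \le \log(2/δ)\); by the Chernoff bound each one-sided deviation has probability at most \(δ/2\), so for Bernoulli samples coverage is at least \(1 - δ\). The function names, the bisection depth, and the example inputs are assumptions made for this illustration.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence KL(Ber(p) || Ber(q)), with q clamped away from 0 and 1."""
    eps = 1e-15
    q = min(max(q, eps), 1.0 - eps)
    total = 0.0
    if p > 0.0:
        total += p * math.log(p / q)
    if p < 1.0:
        total += (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


def kl_confidence_interval(p_hat, n, delta, iters=100):
    """Two-sided interval {q : n * KL(p_hat || q) <= log(2 / delta)} found by bisection."""
    threshold = math.log(2.0 / delta) / n
    # Upper end: KL(p_hat || q) is nondecreasing in q on [p_hat, 1].
    lo, hi = p_hat, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bernoulli_kl(p_hat, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    upper = lo
    # Lower end: KL(p_hat || q) is nonincreasing in q on [0, p_hat].
    lo, hi = 0.0, p_hat
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bernoulli_kl(p_hat, mid) <= threshold:
            hi = mid
        else:
            lo = mid
    lower = hi
    return lower, upper


# Hypothetical inputs: many samples at a moderate confidence level, and very few
# samples at an extreme confidence level (n much smaller than log(1 / delta)).
narrow = kl_confidence_interval(0.3, 1000, 0.05)
wide = kl_confidence_interval(0.5, 5, 1e-12)
```

The second call loosely mirrors the first regime described in the abstract: with only 5 samples and \(δ = 10^{-12}\), the interval covers almost all of \([0, 1]\), while the first call yields a much tighter interval around 0.3.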

2512.24701 2026-06-12 math.ST stat.TH updated version

Epistemic Confidence Statement via Extended Likelihood

通过扩展似然法的认知置信陈述

Youngjo Lee

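AI summary: Formalizes Fisher's epistemic confidence through extended likelihood, clarifies controversies about fiducial probability, links epistemic confidence for observed data directly to frequentist coverage probability for future data, extends epistemic confidence statements to multidimensional parameters, and applies higher-order asymptotic theory to refine first-order asymptotic results.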

AI Chinese abstract

Fisher的信任概率(fiducial probability)最近在认知置信的概念下重新引起了关注。通过扩展似然法可以形式化认知置信陈述,从而澄清了关于其信任概率性质的几个长期争议。它建立了Fisher对观测数据的认知置信概念与Neyman对未来数据的频率论随机覆盖概率之间的直接联系,从而使得认知置信陈述能够扩展到多维参数。我们展示了如何应用高阶渐近理论来改进观测区域的一阶渐近认知置信陈述,这是扩展似然性质的直接结果。

English abstract

Fisher's fiducial probability has recently attracted renewed attention under the notion of epistemic confidence. Epistemic confidence statements can be formulated through extended likelihoods, thereby clarifying several long-standing controversies regarding its fiducial probability properties. It establishes a direct connection between Fisher's epistemic notion of confidence for observed data and Neyman's frequentist aleatory coverage probability for future data, thereby enabling extension of epistemic confidence statements for multidimensional parameters. We demonstrate how higher-order asymptotic theory can be applied to refine the first-order asymptotic epistemic confidence statements of the observed region, as a direct consequence of extended likelihood property.

2511.21441 2026-06-12 math.ST stat.TH updated version

Hierarchical Besov-Laplace priors for spatially inhomogeneous binary classification

面向空间非齐次二元分类的层次化Besov-Laplace先验

Patric Dolmeta, Matteo Giordano

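AI summary: For spatially inhomogeneous binary classification, proposes a hierarchical Bayesian method based on Besov-Laplace priors with a carefully tuned hyper-prior on the regularity parameter, shows that the posterior concentrates at the optimal rate, and designs an efficient MCMC algorithm.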

Comments 28 pages, supplement included, 4 figures, 4 tables. To Appear in Advances in Data Analysis and Classification

AI Chinese abstract

我们研究了非参数贝叶斯二元分类问题,其中未知概率响应函数可能具有空间非齐次性,例如,在域上总体平坦但呈现局部尖锐变化。我们考虑基于逆问题和成像文献中的Besov-Laplace先验的层次化过程,并对正则性参数设置精心调节的超先验。我们证明了所得后验分布以最优速率向真实值集中,自动适应未知的正则性。为了在实践中实现后验推断,我们基于最近针对Besov-Laplace先验专门提出的维度稳健方法,设计了一种高效的马尔可夫链蒙特卡洛(MCMC)算法。然后,我们在广泛的数值模拟中测试了所考虑的方法,获得了对理论结果的坚实验证。

English abstract

We study nonparametric Bayesian binary classification, in the case where the unknown probability response function is possibly spatially inhomogeneous, for example, being generally flat across the domain but presenting localized sharp variations. We consider a hierarchical procedure based on the Besov-Laplace priors from the inverse problems and imaging literature, with a carefully tuned hyper-prior on the regularity parameter. We show that the resulting posterior distribution concentrates towards the ground truth at optimal rate, automatically adapting to the unknown regularity. To implement posterior inference in practice, we devise an efficient Markov chain Monte Carlo (MCMC) algorithm based on recent ad-hoc dimension-robust methods for Besov-Laplace priors. We then test the considered approach in extensive numerical simulations, where we obtain a solid corroboration of the theoretical results.

2411.00429 2026-06-12 stat.ME

Unbiased mixed variables distance

无偏混合变量距离

Michel van de Velden, Alfonso Iodice D'Enza, Angelos Markos, Carlo Cavicchia

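AI summary: Defines unbiased mixed variable distances, in which variable types and measurement scales do not bias each variable's contribution to the overall distance, and provides a general formulation for constructing such distances.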

Comments 40 pages, 9 figures

Journal ref
Journal of Computational and Graphical Statistics (2026)
AI Chinese abstract

在混合环境中定义距离需要量化不同类型的变量和不同量纲的变量的观测差异。尽管已有多种混合变量距离的提议,但这些距离往往偏向特定变量类型和测量单位。即,个体变量对总体距离的贡献受测量类型或量纲的影响。本文定义了无偏混合变量距离,使得个体变量对总体距离的贡献不受测量类型或量纲的影响。我们定义了量化此类偏倚的相关概念,并提供了一个通用公式,可用于构建无偏混合变量距离。

English abstract

Defining a distance in a mixed setting requires the quantification of observed differences of variables of different types and of variables that are measured on different scales. There exist several proposals for mixed variable distances, however, such distances tend to be biased towards specific variable types and measurement units. That is, the variable types and scales influence the contribution of individual variables to the overall distance. In this paper, we define unbiased mixed variable distances for which the contributions of individual variables to the overall distance are not influenced by measurement types or scales. We define the relevant concepts to quantify such biases and we provide a general formulation that can be used to construct unbiased mixed variable distances.

1710.03070 2026-06-12 cs.NE cs.LG q-bio.NC stat.ML

full-FORCE: A Target-Based Method for Training Recurrent Networks

full-FORCE:一种基于目标的训练循环网络方法

Brian DePasquale, Christopher J. Cueva, Kanaka Rajan, G. Sean Escola, L. F. Abbott

Affiliations: Department of Neuroscience; Zuckerman Institute; Columbia University; Department of Physiology and Cellular Biophysics; Columbia University College of Physicians and Surgeons; Princeton Neuroscience Institute; Lewis-Sigler Institute for Integrative Genomics

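AI summary: Proposes a target-based method for training recurrent networks in which a second network supplies target dynamics during training; the resulting networks perform tasks with fewer neurons and greater noise robustness than traditional least-squares (FORCE) approaches.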

Comments 20 pages, 8 figures

Journal ref
PLoS ONE (2018)
AI Chinese abstract

训练好的循环网络是建模动态神经计算的强大工具。我们提出了一种基于目标的方法,用于修改循环网络的全连接矩阵,以训练其执行涉及时间复杂输入/输出转换的任务。该方法在训练过程中引入第二个网络,提供合适的“目标”动态,有助于完成任务。由于利用了全循环连接,该方法产生的网络在执行任务时比传统的最小二乘(FORCE)方法使用更少的神经元,并具有更高的噪声鲁棒性。此外,我们展示了如何通过向目标生成网络引入额外的输入信号,这些信号作为任务提示,大大扩展了可学习的任务范围,并提供了对训练任务执行网络动态复杂性和性质的控制。

English abstract

Trained recurrent networks are powerful tools for modeling dynamic neural computations. We present a target-based method for modifying the full connectivity matrix of a recurrent network to train it to perform tasks involving temporally complex input/output transformations. The method introduces a second network during training to provide suitable "target" dynamics useful for performing the task. Because it exploits the full recurrent connectivity, the method produces networks that perform tasks with fewer neurons and greater noise robustness than traditional least-squares (FORCE) approaches. In addition, we show how introducing additional input signals into the target-generating network, which act as task hints, greatly extends the range of tasks that can be learned and provides control over the complexity and nature of the dynamics of the trained, task-performing network.