arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday

1. Statistical Theory and Methods (11 papers)

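2606.14615 2026-06-15 stat.ME stat.AP New submission

Testing Preferential Sampling

Isabel Natario, Andreia Monteiro

AI summary: Proposes a simple, easy-to-implement test for preferential sampling based on the dependence between the number of sampled points and the measured values, and validates it on simulated and real data.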

Comments 23 pages, 19 figures

Abstract

Geostatistics aims to infer a spatially continuous phenomenon from observations collected at a finite number of locations, frequently measured with error. Preferential sampling occurs whenever there is stochastic dependence between the spatial and sampling processes. Ignoring this problem leads to incorrect and biased estimates, so recognizing it is important, but doing so is not always simple to execute and understand. In this work, a test for assessing preferential sampling that is simple and easy to implement is presented, overcoming the previous concerns. It is based on the dependence between the number of sampled points and the values of the corresponding measurements. The performance of the proposed test is assessed through a large simulation study, which considers different levels of preferentiability, relation with a covariate, different sample sizes, and different test procedure conditions. The results are quite encouraging, with high levels of correct preferential sampling detections, further confirmed by applying the test to well-known real data sets of lead concentrations in moss samples and red and blue shrimp capture data.

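2606.14092 2026-06-15 stat.ME New submission

Cauchy Aggregation of Ridge-Regularized Hotelling Tests for High-Dimensional Change-Point Detection

Ping Zhao, Le Zhou, Long Feng

AI summary: Ridge-regularized Hotelling-type change-point tests depend on a ridge parameter whose best value is governed by unknown quantities. The paper computes p-values on a fixed grid and aggregates them with the Cauchy combination rule, avoiding the choice of a single ridge value. Joint weak convergence is proved and test validity is guaranteed; experiments show stable size and near-optimal power across covariance and signal configurations.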

Abstract

Ridge-regularized Hotelling-type (RHT) change-point tests depend on a ridge parameter $λ$, but the power-optimal value is determined by the unknown covariance structure and the unknown mean shift. We avoid selecting a single ridge value by computing fixed-ridge p-values on a finite deterministic grid and aggregating them with the Cauchy combination rule. Under the standard random-matrix conditions for fixed-ridge RHT statistics, we establish finite-grid joint weak convergence of the ridge processes. This leads to fixed-level validity under joint-limit calibration and small-tail validity for the analytic Cauchy p-value. Monte Carlo experiments show that deterministic-grid Cauchy aggregation has stable size behavior and achieves power close to the best stable fixed ridge choice across a range of covariance and signal configurations.
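
The aggregation step named in this abstract can be illustrated with a minimal sketch of the standard Cauchy combination rule. This is not the authors' implementation: the function name `cauchy_combine`, the equal default weights, and the example p-values are assumptions made for illustration only.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (equal weights by default)."""
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    # Map each p-value to a standard Cauchy quantile and take the weighted sum.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Convert the combined statistic back to a p-value via the Cauchy survival function.
    return 0.5 - math.atan(t) / math.pi


# Hypothetical fixed-ridge p-values, one per ridge value on a small grid.
grid_p_values = [0.04, 0.20, 0.01, 0.35]
combined = cauchy_combine(grid_p_values)
print(f"combined p-value: {combined:.4f}")
```

With a single p-value the rule returns that p-value unchanged, which is a quick sanity check on the transform pair.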

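2606.14085 2026-06-15 stat.ME New submission

Bias-corrected empirical likelihood-based inference for the tail index under heavy-tailed models

Haodi Liang, Natalia Nolde

AI summary: Combines bias correction with empirical likelihood to propose a new tail index estimator, establishes its asymptotic theory, and validates it through simulations and a data example.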

Abstract

The tail index parameter of heavy-tailed probability models plays a key role in characterizing the tail decay of the underlying distribution function and is often involved in extrapolation procedures for various extreme value analysis questions. In this paper we revisit the question of tail index estimation and combine the ideas of bias-correction and empirical likelihood estimation to propose an estimator that offers an attractive alternative to some of the existing estimators. We develop an asymptotic theory for the proposed estimator and conduct simulation studies to demonstrate its performance in finite sample situations. The method is also applied to a data example for illustration.

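2606.14019 2026-06-15 stat.ME math.PR math.ST stat.TH New submission

Real-order moments, tail representations, and logarithmic means

Roberto Vila, Eduardo Nakano

AI summary: Builds a unified framework for real-order moments of arbitrary random variables, giving integral representations via cumulative distribution functions and survival functions that cover continuous, discrete, and mixed distributions, with applications to the zeta and Skellam distributions. It also obtains a logarithmic-moment representation linking logarithmic means, Laplace transforms, and the Frullani identity.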

Comments 12 pages, 3 figures

Abstract

This paper develops a unified framework for the study of real-order moments of arbitrary random variables. General integral representations are established in terms of cumulative distribution functions and survival functions, covering continuous, discrete, and mixed distributions supported on the whole real line. These formulas extend the classical tail-integral identities for nonnegative random variables and provide a common treatment of positive, fractional, and negative moments. For discrete distributions, explicit series representations are derived in terms of cumulative probabilities, yielding simple criteria for the existence of moments. Applications are presented for the zeta and Skellam distributions, illustrating how tail behavior determines moment finiteness and how moments can be represented geometrically through cumulative distribution functions. In addition, a representation for logarithmic moments is obtained, linking logarithmic means, Laplace transforms, and the classical Frullani identity. The results provide a unified perspective on moment representations and establish useful connections between tail probabilities, distribution functions, Laplace transforms, and moment existence.
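
As a small illustration of the classical tail identity that this abstract says it extends, the sketch below checks the discrete tail-sum formula $E[X^p] = \sum_{k \ge 1} (k^p - (k-1)^p)\,P(X \ge k)$ for a nonnegative integer-valued variable and real order $p > 0$. The formula is the textbook summation-by-parts identity, not the paper's general result, and the probability mass function and function names are made up for the example.

```python
def moment_direct(pmf, p):
    """E[X^p] computed directly from a pmf given as {value: probability}."""
    return sum((k ** p) * prob for k, prob in pmf.items())


def moment_from_tails(pmf, p):
    """E[X^p] for p > 0 computed from the tail probabilities P(X >= k)."""
    total = 0.0
    for k in range(1, max(pmf) + 1):
        tail = sum(prob for j, prob in pmf.items() if j >= k)
        total += (k ** p - (k - 1) ** p) * tail
    return total


# Hypothetical distribution on {0, 1, 2, 3}.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}
for order in (0.5, 1.0, 2.5):
    print(order, moment_direct(pmf, order), moment_from_tails(pmf, order))
```

The two columns agree for fractional as well as integer orders, which is the sense in which tail probabilities alone determine the moment.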

2606.13973 2026-06-15 stat.ME stat.CO New submission

Scan Statistics for Nonhomogeneous Poisson Processes with Extreme-Value Calibration and Application to CNV Detection

Tung-Lung Wu, Asanka R. Duwage

AI summary: For nonhomogeneous Poisson processes, proposes a scan statistic calibrated by an extreme-value distribution for detecting copy number variation, validated on simulations and real sequencing data.

Abstract

We develop a scan statistic method for detecting local clusters in a two-sample nonhomogeneous Poisson process (NHPP) framework, motivated by copy number variation (CNV) analysis in next-generation sequencing data. The control sample is used to construct an empirical time transformation, under which the transformed case sample is approximately uniform on [0,1] under the null hypothesis. The scan statistic is defined as the maximum number of transformed points within a moving window. We show that the scan statistic converges to a generalized extreme value (GEV) distribution with an extremal index that captures the dependence induced by overlapping windows. The GEV parameters and extremal index are estimated using maximum likelihood and exceedance clustering methods, providing an asymptotic calibration of the test. A permutation procedure is also developed to provide a nonparametric alternative. Simulation studies show that the permutation calibration maintains empirical Type I error close to the nominal level across the considered settings, and the GEV calibration is accurate for smaller windows. Both proposed procedures show competitive power compared with the continuous testing method under heterogeneous baseline intensities. An application to sequencing data illustrates the effectiveness of the proposed approach for detecting CNV regions.
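
The two ingredients described in this abstract, an empirical time transformation built from the control sample and a moving-window maximum count, can be sketched as follows. This is a simplified reading of the abstract rather than the authors' code: the function names, the use of the control empirical CDF as the transformation, and the closed-window convention are all assumptions.

```python
from bisect import bisect_right


def time_transform(case_points, control_points):
    """Map case points to [0, 1] through the empirical CDF of the control points."""
    controls = sorted(control_points)
    m = len(controls)
    return [bisect_right(controls, t) / m for t in case_points]


def scan_statistic(points, window):
    """Maximum number of points contained in any closed window of the given width."""
    pts = sorted(points)
    best, left = 0, 0
    for right in range(len(pts)):
        # Shrink the window from the left until it spans at most `window`.
        while pts[right] - pts[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best


# Hypothetical transformed case points with a cluster near 0.1.
transformed = [0.10, 0.12, 0.15, 0.50, 0.90]
print(scan_statistic(transformed, window=0.1))
```

Calibrating the resulting maximum (by the GEV limit or by permutation, as the abstract describes) is a separate step not shown here.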

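2606.13864 2026-06-15 econ.EM math.ST stat.ME stat.TH New submission

The Generalized Fisher Transformation: Finite-Sample Properties and Inference

Ilya Archakov, Peter Reinhard Hansen

AI summary: Studies the finite-sample properties of the Generalized Fisher Transformation (GFT), finding that its coordinates are approximately Gaussian and uncorrelated, with covariance nearly independent of the correlation matrix, which gives better finite-sample inference than traditional approaches.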

Abstract

We study the finite-sample behavior of the Generalized Fisher Transformation (GFT), the parametrization of a correlation matrix $C$ by $γ(C)=\operatorname{vecl}\log C$. The GFT coordinates extend Fisher's transformation to dimension $n>2$: for elliptical data their finite-sample distributions are close to Gaussian. More strikingly, the coordinates are nearly uncorrelated and their covariance is largely invariant to $C$. This approximate orthogonality and invariance make GFT-based inference far better behaved in finite samples than inference based on sample correlations or element-wise Fisher transformed correlations, yielding estimation errors that are approximately Gaussian, weakly dependent, and nearly pivotal.
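
To see why $\operatorname{vecl}\log C$ is described as extending Fisher's transformation, the sketch below works out the $2\times 2$ case by hand: for $C = \begin{pmatrix}1 & r\\ r & 1\end{pmatrix}$ the single off-diagonal element of $\log C$ equals $\operatorname{artanh}(r)$, the classical Fisher z-transform. The function name `gft_2x2` is hypothetical, and the code covers only this two-dimensional special case.

```python
import math


def gft_2x2(r):
    """Off-diagonal element of log C for the 2x2 correlation matrix [[1, r], [r, 1]]."""
    # C has eigenvalues 1 + r and 1 - r with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    log_plus = math.log(1.0 + r)
    log_minus = math.log(1.0 - r)
    # log C = Q diag(log eigenvalues) Q^T, whose off-diagonal entry is half the difference.
    return 0.5 * (log_plus - log_minus)


for r in (-0.8, 0.0, 0.3, 0.6):
    print(r, gft_2x2(r), math.atanh(r))
```

For dimension $n > 2$ the matrix logarithm no longer has such a closed form and would be computed numerically.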

2606.13780 2026-06-15 hep-ph cs.LG hep-ex stat.ML New submission

Conformal calibration and look-elsewhere effect in anomaly detection for new-physics searches

Jack Y. Araz, Michael Spannowsky

Affiliations: Department of Physics and Astronomy, University College London; Department of Engineering, City St. George’s, University of London; Institute for Theoretical Physics, Campus Süd, Karlsruhe Institute of Technology (KIT); Institute for Quantum Materials and Technologies, Karlsruhe Institute of Technology

AI summary: Proposes a calibration layer based on conformal prediction that turns any anomaly score into a significance with distribution-free, finite-sample guarantees, while correcting for background mismodelling and the look-elsewhere effect.

Comments 22 pages, 15 figures, 3 tables. Comments welcome

Abstract

Machine-learned anomaly detection is reshaping searches for new physics, but it has outrun the statistics used to interpret it. A raw anomaly score has no calibrated meaning, a model that scans many regions inflates the look-elsewhere effect, and the asymptotic significances the field relies on are blind to the background mismodelling that anomaly detectors are especially prone to. We propose a calibration layer, built on conformal prediction, that turns any anomaly score into a defensible significance with distribution-free, finite-sample guarantees. Conformal prediction converts scores into valid local p-values, weighted and Mondrian variants repair the sideband-to-signal-region exchangeability failures that resonant searches suffer, and a Gross-Vitells step carries the result through to a look-elsewhere-aware global significance. The layer does two things at once. It exposes miscalibration that the standard pipeline cannot see, and it corrects it without retraining the detector. On public LHC Olympics data, a classifier develops a substructure-mass correlation that makes sideband-calibrated background p-values anti-conservative. Taken at face value, this manufactures a $\sim 46σ$ excess from background sculpting alone, which the label-free weighted correction removes, restoring an honest null. When run as a blind wide-mass bump hunt, the standard asymptotic and unweighted procedures fabricate $\gtrsim10σ$ excesses and $\approx5σ$ excesses even in signal-free windows, while the conformal layer raises no false alarms and its global false-positive rate is verified on background-only pseudoexperiments. The result is an auditable, detector-agnostic path from an uncalibrated score to a trials-factor-aware significance, ready to be folded into experimental anomaly searches.
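
The first step of the layer described here, converting an anomaly score into a valid local p-value, can be sketched with the standard unweighted conformal p-value. This is a generic textbook construction, not the paper's weighted or Mondrian variants; the function name and the example scores are assumptions.

```python
def conformal_p_value(calibration_scores, test_score):
    """Unweighted conformal p-value: rank of the test score among background calibration scores."""
    n = len(calibration_scores)
    # Count calibration scores at least as anomalous as the test score.
    at_least_as_extreme = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms include the test point itself, which yields finite-sample validity
    # when calibration and test scores are exchangeable.
    return (1 + at_least_as_extreme) / (n + 1)


# Hypothetical background-only calibration scores (higher means more anomalous).
background_scores = [0.11, 0.25, 0.31, 0.42, 0.58, 0.63, 0.70, 0.77, 0.85]
print(conformal_p_value(background_scores, 0.80))
```

The smallest attainable p-value is $1/(n+1)$, so the calibration sample size limits the significance that can be reported at this step.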

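2606.09391 2026-06-15 math.ST physics.ao-ph stat.ME stat.TH New submission

Kling-Gupta linear regression

Hristos Tyralis, Georgia Papacharalampous

AI summary: Formalizes the Kling-Gupta loss function, derives explicit formulas for parameter estimates in multiple linear regression, demonstrates how they differ from ordinary least squares, and establishes asymptotic properties.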

Comments 64 pages, 8 figures, 3 tables

Abstract

Kling-Gupta efficiency ($\mathrm{KGE}$) is a model performance evaluation metric widely used in hydrology, but its properties as a statistical estimator have remained unexplored. We formalize the Kling-Gupta loss $L_\mathrm{KG} = (1 - \mathrm{KGE})^2$ in an extremum estimation framework (maximizing $\mathrm{KGE}$) for multiple linear regression. We give explicit formulas showing that Kling-Gupta regression scales the ordinary least squares (OLS) coefficient vector by a variance-inflation factor depending on sample variances and covariances. Its predictions reproduce the training set response variance, unlike OLS's variance reduction, while both maintain the response mean and achieve the same sample correlation. We prove that no estimator simultaneously maximizes Nash-Sutcliffe efficiency ($\mathrm{NSE}$) and $\mathrm{KGE}$: OLS maximizes $\mathrm{NSE}$ but not $\mathrm{KGE}$, whereas Kling-Gupta regression maximizes $\mathrm{KGE}$ at the expense of $\mathrm{NSE}$. We establish almost-sure convergence of the Kling-Gupta estimator to well-defined population limits. The training and test set performance metrics for both estimators converge asymptotically to identical limits (different for OLS vs. Kling-Gupta). In a single-predictor model with fixed intercept, we identify conditions where a global minimum of $L_\mathrm{KG}$ does not exist because of discontinuity at zero slope. This work establishes a mathematical foundation for $\mathrm{KGE}$-based estimation and clarifies its effects on predictive performance in hydrologic modeling.
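
The properties stated in this abstract (same mean, same correlation, response variance reproduced, higher $\mathrm{KGE}$ than OLS) can be checked on a toy single-predictor example. The sketch below rescales the OLS slope so that the fitted values match the response variance; it is consistent with the abstract's description but is not the paper's explicit formula, and the data and function names are invented. It uses the usual definition $\mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\alpha-1)^2 + (\beta-1)^2}$ with $\alpha$ the ratio of standard deviations and $\beta$ the ratio of means.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_ols(x, y):
    """Intercept and slope of simple least-squares regression."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum_sq_dev(x)
    return my - slope * mx, slope


def fit_variance_matched(x, y):
    """OLS slope inflated so that predictions reproduce the response variance."""
    intercept, slope = fit_ols(x, y)
    preds = [intercept + slope * a for a in x]
    inflation = math.sqrt(sum_sq_dev(y) / sum_sq_dev(preds))
    new_slope = slope * inflation
    return mean(y) - new_slope * mean(x), new_slope


def kge(sim, obs):
    ms, mo = mean(sim), mean(obs)
    ss, so = math.sqrt(sum_sq_dev(sim)), math.sqrt(sum_sq_dev(obs))
    r = sum((s - ms) * (o - mo) for s, o in zip(sim, obs)) / (ss * so)
    return 1.0 - math.sqrt((r - 1) ** 2 + (ss / so - 1) ** 2 + (ms / mo - 1) ** 2)


# Hypothetical training data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9]
a0, b0 = fit_ols(x, y)
a1, b1 = fit_variance_matched(x, y)
ols_preds = [a0 + b0 * v for v in x]
kg_preds = [a1 + b1 * v for v in x]
print("KGE (OLS):", kge(ols_preds, y), "KGE (variance-matched):", kge(kg_preds, y))
```

In this toy setting the variance-matched fit attains $\mathrm{KGE} = r$, while OLS attains $1 - \sqrt{2}\,(1 - r)$, which is smaller whenever $r < 1$.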

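2603.29047 2026-06-15 cond-mat.stat-mech math.PR stat.AP Version update

Longest weakly increasing subsequences of discrete random walks on the integers with heavy tailed distribution of increments

José Ricardo G. Mendonça, Marcelo V. Freire

AI summary: Studies the length of the longest weakly increasing subsequence of random walks with heavy-tailed increments, finding that it scales as $\sqrt{n}\log n$ under finite variance and as $n^{\theta}$ ($\theta > 0.5$) under infinite variance, with an approximately lognormal distribution.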

Comments elsarticle style, 21 pages, 13 figures, 6 tables, 25 refs. Version v2 as published

Journal ref
Physica A 697, 131732 (2026)
Abstract

We investigate the behavior of the length of the longest weakly increasing subsequences (weak LIS) of $n$-step random walks with nonzero integer increments $k = \pm 1, \pm 2, \dots$ given by a symmetric heavy tailed mass distribution proportional to $|k|^{-1-α}$ for several values of the real parameter $α> 0$ together with that of the simple random walk ($k=\pm 1$), to which the $n$-step heavy tailed walks reduce when $α$ grows large enough that step jumps beyond $\pm 1$ become essentially absent on the scale of $n$. By means of exploratory fits, weighted nonlinear least squares, and nested-model comparisons, we found that the sample average length $\langle{L_{n}}\rangle$ scales like $\langle{L_{n}}\rangle \sim \sqrt{n}\log{n}$ when the distribution of increments has finite variance ($α> 2$) and $\langle{L_{n}}\rangle \sim n^θ$ with a varying exponent $θ> 0.5$ when the variance is infinite ($α\leq 2$). Distributional diagnostics indicate that the bulk of the $L_{n}$ distribution is very well-approximated by a lognormal model, though systematic deviations are observed in the tails. Our results corroborate and expand upon previous results for the LIS of other types of heavy-tailed random walks and raise a conjecture as to whether the distribution of $L_{n}$ is given, or can be effectively described, by a lognormal distribution.

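2310.19435 2026-06-15 math.AT stat.ME Version update

A novel characterization of structures in smooth regression curves: from a viewpoint of persistent homology

Satish Kumar, Subhra Sankar Dhar

AI summary: Uses persistent homology of the super-level sets of the first derivative of a regression curve to characterize structures such as monotonicity, convexity, and modality, and establishes estimation consistency and a measure of statistical significance.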

Comments This is the published version of the article which is published in Electronic Journal of Statistics, 2026

Journal ref
Electronic Journal of Statistics, 20 (2026)
Abstract

We characterize structures such as monotonicity, convexity, and modality in smooth regression curves using persistent homology. Persistent homology is a key tool in topological data analysis that detects higher-dimensional topological features such as connected components and holes (cycles or loops) in the data. In other words, persistent homology is a multiscale version of homology that characterizes sets based on the connected components and holes. We use super-level sets of functions to extract geometric features via persistent homology. In particular, we explore structures in regression curves via the persistent homology of super-level sets of a function, where the function of interest is the first derivative of the regression function. In the course of this study, we extend an existing procedure of estimating the persistent homology for the first derivative of a regression function and establish its consistency. Moreover, as an application of the proposed methodology, we demonstrate that the persistent homology of the derivative of a function can reveal hidden structures in the function that are not visible from the persistent homology of the function itself. In particular, we characterize structures such as monotonicity, convexity, and modality, and propose a measure of statistical significance to infer these structures in practice. Finally, we conduct an empirical study to implement the proposed methodology on simulated and real data sets and compare the derived results with an existing methodology.

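2509.12356 2026-06-15 math.ST stat.ML stat.TH Version update

Jackknife Variance Estimation for Hájek-Dominated Generalized U-Statistics

Jakob R. Juergens

AI summary: For a class of generalized U-statistics, proves ratio-consistency of the jackknife variance estimator and applies the result to the two-scale distributional nearest-neighbor regression estimator, obtaining consistent variance estimation under weaker conditions.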

Comments 60 pages

Abstract

Valid uncertainty quantification for subsampling-based and randomized estimators often depends on variance estimators whose behavior is much less understood than that of the underlying point estimator. We prove ratio-consistency of the jackknife variance estimator, and certain delete-$d$ variants, for a broad class of generalized U-statistics whose variance is asymptotically dominated by their Hájek projection and whose normalized first-projection squares satisfy a row-wise $L^r$ weak law, with the classical fixed-order case recovered as a special instance. This projection-dominance plus square-LLN structure unifies and generalizes several criteria from the existing literature, clarifies when the simple nonparametric jackknife is theoretically justified in the generalized setting, and yields consistent variance estimation for the two-scale distributional nearest-neighbor regression estimator under substantially weaker conditions than previously required.
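
For readers unfamiliar with the estimator under study, here is a minimal sketch of the classical delete-one jackknife variance estimator, applied to the sample mean (the simplest U-statistic, of order one). It does not cover the generalized U-statistics or delete-$d$ variants treated in the paper; the function name and data are illustrative assumptions.

```python
import statistics


def jackknife_variance(data, estimator):
    """Delete-one jackknife estimate of the variance of `estimator(data)`."""
    n = len(data)
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)


# Hypothetical sample.
sample = [2.0, 4.0, 4.5, 7.0, 9.5]
v_jack = jackknife_variance(sample, statistics.mean)
print(v_jack, statistics.variance(sample) / len(sample))
```

For the sample mean the jackknife reproduces the usual $s^2/n$ exactly, which makes it a convenient correctness check before applying the same routine to more complex estimators.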

2. Bayesian Statistics and Probabilistic Modeling (6 papers)

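2606.14544 2026-06-15 stat.ME math.ST stat.TH New submission

On the design distribution for predictive Bayesian regression

Wanyue Sun, Edwin Fong

AI summary: Studies how the design distribution affects inference in predictive Bayesian regression and proposes parametric martingale posteriors that satisfy a weak form of identifiability and design invariance and are suited to high-dimensional regression.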

Abstract

The predictive approach to Bayesian inference accesses the posterior distribution via a sequence of one-step-ahead predictives, enabling inference via predictive resampling without Markov chain Monte Carlo. In the random-design regression setting, an explicit specification of the predictive design distribution is required, yet the impact of this choice has received little formal attention. We study the role of this predictive design distribution in parametric martingale posteriors for regression, and identify predictive notions of identifiability and design invariance that are essential for valid inference, particularly in the high-dimensional regression setting. Building on these foundations, we introduce a novel class of parametric martingale posteriors for regression that satisfies a weak form of these desiderata, and naturally accommodates the high-dimensional setting through regularization. We then illustrate our method through a simulation.

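2606.14382 2026-06-15 stat.ME math.ST stat.TH New submission

Predictive Concordance for Parameter Optimisation and Mixture Synthesis

Tobias Adrian, Domenico Giannone, Matteo Luciani, Mike West

AI summary: Proposes probabilistic concordance measures based on the expected misclassification rate (EMR), optimises parameters by maximising EMR or a regularised variant, and applies the approach to mixture synthesis for scenario forecasting in macroeconomic policy.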

Comments 18 pages, 2 figures, 1 table

Abstract

We discuss probabilistic measures of concordance between two probability distributions based on the expected misclassification rate (EMR). The focus is on comparing a given reference distribution with other distributions in a parametrised class, and optimising concordance by identifying parameter values maximising EMR or a regularised variant. EMR is a practical and decision-theoretically meaningful measure, and its optimisation has direct interpretation as a Bayesian decision analysis with a bounded utility function. We explore theoretical properties of EMR, discuss relationships with other measures including Kullback-Leibler divergence, and recognise that its optimisation has a synthetic Bayesian emulation interpretation that aids understanding and specification of regularisation penalties. A main area of methodology is in mixture synthesis where the parametrised family is a discrete mixture of given distributions. A detailed example comes from scenario forecasting in macroeconomic policy settings, a key applied area motivating the new methodology. Theoretical developments underlie efficient numerical optimisation and analysis is easily implemented using direct Monte Carlo simulation.

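2510.27144 2026-06-15 stat.ME Version update

Calibrating Bayesian Inference

Yang Liu, Jonathan P. Williams, Jan Hannig

AI summary: Addresses unreliable Bayesian inference caused by a mismatch between the prior and the true parameter-generating process by calibrating Bayesian credible regions to achieve frequentist validity, and develops a stochastic approximation algorithm for the calibration.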

Abstract

Bayesian statistics has gained popularity in psychological research due to its intuitive uncertainty quantification and convenient information-updating rules. In many applications, however, prior distributions are introduced merely as instruments to facilitate computation, rather than as representations of genuine subjective belief. Consequently, relying on standard Bayesian justifications for inferential procedures becomes conceptually ungrounded. In this paper, we recommend evaluating finite-sample performance over repeated sampling of data and parameters as an alternative justification for "pragmatic Bayes." We demonstrate a key vulnerability in the usual posterior-based inference: when the analyst's chosen prior distribution does not match the true parameter-generating process, Bayesian inference can be misleading. Given that this true process is rarely known in practice, we propose a safer alternative: calibrating Bayesian credible regions to achieve frequentist validity. This latter criterion is stronger and guarantees validity of Bayesian inference regardless of the underlying parameter-generating mechanism. To solve the calibration problem in practice, we propose a novel stochastic approximation algorithm. A Monte Carlo experiment is conducted and reported, in which we observe that uncalibrated Bayesian inference can be liberal under certain parameter-generating scenarios, whereas our calibrated solution consistently maintains validity. We also illustrate the proposed calibration procedure using a real-data example involving location-scale regression.

2602.13421 2026-06-15 stat.ML cs.AI q-bio.NC Version update

Metabolic cost of information processing in Poisson variational autoencoders

Hadi Vafaii, Jacob L. Yates

Affiliations: Redwood Center for Theoretical Neuroscience; UC Berkeley

AI summary: Using Poisson variational autoencoders, shows that the KL divergence term is proportional to the prior firing rates, yielding a metabolic cost term that enables a trade-off between coding fidelity and energy expenditure.

Comments Published in CCN 2026 Proceedings: https://doi.org/10.32470/6ff31r0

Abstract

Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes high baseline activity. This structure couples an abstract information-theoretic quantity -- the *coding rate* -- to a concrete biophysical variable -- the *firing rate* -- which enables a trade-off between coding fidelity and energy expenditure. Such a coupling arises naturally in the Poisson variational autoencoder (P-VAE) -- a brain-inspired generative model that encodes inputs as discrete spike counts and recovers a spiking form of *sparse coding* as a special case -- but is absent from standard Gaussian VAEs. To demonstrate that this metabolic cost structure is unique to the Poisson formulation, we compare the P-VAE against Grelu-VAE, a Gaussian VAE with ReLU rectification applied to latent samples, which controls for the non-negativity constraint. Across a systematic sweep of the KL term weighting coefficient $β$ and latent dimensionality, we find that increasing $β$ monotonically increases sparsity and reduces average spiking activity in the P-VAE. In contrast, Grelu-VAE representations remain unchanged, confirming that the effect is specific to Poisson statistics rather than a byproduct of non-negative representations. These results establish Poisson variational inference as a promising foundation for a resource-constrained theory of computation.
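
The key observation in this abstract follows from the closed-form KL divergence between two Poisson distributions, $\mathrm{KL}(\mathrm{Pois}(a)\,\|\,\mathrm{Pois}(b)) = a\log(a/b) - a + b$. The sketch below assumes, for illustration, that the posterior rate is written as the prior rate $r$ times a multiplicative gain $\delta$; under that parameterization the KL term becomes $r\,(\delta\log\delta - \delta + 1)$, which is linear in $r$. The parameterization and function names are assumptions, not taken from the abstract.

```python
import math


def kl_poisson(rate_q, rate_p):
    """Closed-form KL( Poisson(rate_q) || Poisson(rate_p) ) in nats."""
    return rate_q * math.log(rate_q / rate_p) - rate_q + rate_p


def kl_prior_times_gain(prior_rate, gain):
    """Same KL when the posterior rate equals prior_rate * gain."""
    return prior_rate * (gain * math.log(gain) - gain + 1.0)


gain = 1.5  # hypothetical multiplicative modulation of the prior rate
for prior_rate in (1.0, 2.0, 4.0):
    print(prior_rate, kl_poisson(prior_rate * gain, prior_rate), kl_prior_times_gain(prior_rate, gain))
```

Doubling the prior rate doubles the KL penalty at a fixed gain, which is the sense in which the term acts like a cost on baseline activity.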

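2602.09161 2026-06-15 stat.ML cs.LG Version update

Minimum Distance Summaries for Robust Neural Posterior Estimation

Sherman Khoo, Dennis Prangle, Song Liu, Mark Beaumont

Affiliations: University of Cambridge

AI summary: Proposes minimum-distance summaries, which use the maximum mean discrepancy (MMD) to adapt summary statistics at test time and achieve robust inference without modifying the pretrained neural posterior estimator, with theoretical robustness guarantees and experimental validation.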

Abstract

Simulation-based inference (SBI) enables amortized Bayesian inference by first training a neural posterior estimator (NPE) on prior-simulator pairs, typically through low-dimensional summary statistics, which can then be cheaply reused for fast inference by querying it on new test observations. Because NPE is estimated under the training data distribution, it is susceptible to misspecification when observations deviate from the training distribution. Many robust SBI approaches address this by modifying NPE training or introducing error models, coupling robustness to the inference network and compromising amortization and modularity. We introduce minimum-distance summaries, a plug-in robust NPE method that adapts queried test-time summaries independently of the pretrained NPE. Leveraging the maximum mean discrepancy (MMD) as a distance between observed data and a summary-conditional predictive distribution, the adapted summary inherits strong robustness properties from the MMD. We demonstrate that the algorithm can be implemented efficiently with random Fourier feature approximations, yielding a lightweight, model-free test-time adaptation procedure. We provide theoretical guarantees for the robustness of our algorithm and empirically evaluate it on a range of synthetic and real-world tasks, demonstrating substantial robustness gains with minimal additional overhead.

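2404.07440 2026-06-15 stat.ME Version update

Bayesian Penalized Transformation Models: Structured Additive Location-Scale Regression for Arbitrary Conditional Distributions

Johannes Brachem, Paul F. V. Wiemann, Thomas Kneib

AI summary: Proposes Bayesian penalized transformation models, which estimate the response's conditional distribution directly through semiparametric location-scale regression, combining structured additive predictors with smoothness priors to provide uncertainty quantification, and validates them on simulated and real data.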

详情
AI中文摘要

惩罚变换模型(PTMs)是一个半参数位置-尺度回归族,直接从数据中估计响应的条件分布,并通过结构化加性预测器对位置和尺度进行建模。模型的核心是一个单调递增的变换函数,将响应分布与参考分布联系起来。变换函数配备了一个光滑先验,用于正则化估计分布与参考分布的偏离程度。PTMs可以看作是条件变换模型与位置、尺度和形状广义加性模型之间的桥梁。基于马尔可夫链蒙特卡洛的PTM推断为条件分布以及协变量效应提供了直接的不确定性量化。模拟研究证明了该方法的有效性,并与多种替代方法进行了比较。应用于第四次荷兰生长研究和弗雷明汉心脏研究,展示了其实用性和实际效用。一个功能完整的实现以Python库的形式提供。本文的补充材料可在线获取。

英文摘要

Penalized transformation models (PTMs) are a semiparametric location-scale regression family that estimates a response's conditional distribution directly from the data and models the location and scale through structured additive predictors. The core of the model is a monotonically increasing transformation function that relates the response distribution to a reference distribution. The transformation function is equipped with a smoothness prior that regularizes how much the estimated distribution diverges from the reference. PTMs can be seen as a bridge between conditional transformation models and generalized additive models for location, scale and shape. Markov chain Monte Carlo inference for PTMs offers straightforward uncertainty quantification for the conditional distribution as well as for the covariate effects. A simulation study demonstrates the effectiveness of the approach and includes comparisons to many alternative methods. Applications to the Fourth Dutch Growth Study and the Framingham Heart Study illustrate the usage and practical utility. A full-featured implementation is available as a Python library. Supplementary material for this article is available online.

3. Causal Inference and Experimental Design (4 papers)

2606.14132 2026-06-15 stat.ME math.ST stat.TH 新提交

HSCI: Neyman-Orthogonal Causal Inference under High-Dimensional Proportional Hazards

HSCI: 高维比例风险下的Neyman正交因果推断

Yingying Fan, Lan Gao, Daoji Li, Jinchi Lv

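AI summary For treatment effect inference in survival studies with high-dimensional confounding covariates, proposes a high-dimensional survival causal inference framework based on a Neyman near-orthogonal score, achieving root-n asymptotic normality and consistent variance estimation while substantially reducing bias.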

详情
AI中文摘要

在生存研究中,当处理分配和结果受到许多基线协变量混杂时,有效的处理效应推断是基础且具有挑战性的。为此,本文提出了一个高维生存因果推断(HSCI)框架,该框架在稀疏高维Cox比例风险结果模型和高维逻辑倾向得分工作模型下提供有效的推断。为了减轻干扰估计偏差,我们开发了处理效应的Neyman近正交得分,并通过交叉拟合实现。在双重稳健干扰率条件下,我们建立了根n渐近正态性和一致方差估计。我们还将该框架扩展到高维生存协变量效应的推断。模拟示例证实,与正则化Cox估计量相比,HSCI显著减少了偏差,并在不同维度、删失和错误指定倾向模型设置下保持了有效的置信区间覆盖。对弥漫性大B细胞淋巴瘤数据的应用进一步展示了其在高维生物医学生存研究中的价值。

英文摘要

Valid treatment effect inference in survival studies is fundamental yet challenging when the treatment assignments and outcomes are confounded by many baseline covariates. To this end, we propose a high-dimensional survival causal inference (HSCI) framework that delivers valid inference under a sparse high-dimensional Cox proportional hazards outcome model and a high-dimensional logistic propensity score working model. To mitigate the nuisance estimation bias, we develop a Neyman near-orthogonal score for the treatment effect and implement it with cross-fitting. Under doubly robust nuisance-rate conditions, we establish the root-n asymptotic normality and consistent variance estimation. We also extend the framework to inference on high-dimensional survival covariate effects. Simulation examples confirm that HSCI sharply reduces the bias relative to the regularized Cox estimators and maintains valid confidence interval coverage across different dimensionality, censoring, and misspecified propensity-model settings. An application to diffuse large-B-cell lymphoma data further showcases its value for high-dimensional biomedical survival studies.

2606.14131 2026-06-15 stat.ME 新提交

G-computation for causal effect estimation from observational hierarchical data with unmeasured cluster context

G-计算用于从具有未测量聚类背景的分层观察数据中估计因果效应

Shafayet Khan Shafee, Bishal Sarker, Md. Niamul Islam Sium

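AI summary To handle unmeasured cluster-level confounders and effect heterogeneity in hierarchical observational data, proposes within-group g-computation based on random-effects models, which groups clusters by treatment prevalence, estimates within groups, and aggregates the results to reduce bias.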

Comments 19 pages, 7 figures, 1 supplementary figure, 4 supplementary tables; supplementary material included as an appendix within the same file

详情
AI中文摘要

观察性研究经常涉及分层数据结构,其中个体嵌套在更高层级的单元中。在这种情况下,未测量的聚类层面因素可能会混淆处理-结果关系,并可能额外引起跨聚类的处理效应异质性,使因果效应估计复杂化。我们通过将随机效应模型(REM)作为结果模型,形式化了G-计算在分层观察数据中的应用,并提出了一种旨在减少未测量聚类背景引起的偏差的组内G-计算策略。该方法根据观察到的处理流行率对聚类进行分组,在组内执行G-计算,然后聚合组特定的估计值。通过广泛的蒙特卡洛模拟,我们使用线性模型和REM比较了标准和组内G-计算实现。结果表明,当未测量的聚类层面变量仅作为混杂因素时,基于REM的标准和组内实现均显著减少偏差;而当未测量的聚类层面因素同时作为混杂因素和处理效应异质性来源时,所提出的组内REM估计器实现了最低的RMSE。我们应用所提出的组内REM估计器,使用2019年孟加拉国MICS数据估计青少年怀孕对儿童身高-年龄Z分数的因果效应,得到估计效应为-0.12(95% bootstrap CI: [-0.18, -0.06])。所提出的组内G-计算框架为减少分层观察研究中未测量的聚类层面混杂和处理效应异质性带来的偏差提供了一种策略。

英文摘要

Observational studies frequently involve hierarchical data structures in which individuals are nested within higher-level units. In such settings, unmeasured cluster-level factors may confound the treatment-outcome relationship and may additionally induce treatment effect heterogeneity across clusters, complicating causal effect estimation. We formalize the use of g-computation for hierarchical observational data by incorporating random-effects models (REM) as outcome models and propose a within-group g-computation strategy designed to reduce bias arising from unmeasured cluster context. The approach groups clusters according to their observed treatment prevalence and performs g-computation within groups before aggregating group-specific estimates. Through extensive Monte Carlo simulations, we compare the standard and within-group implementations of g-computation using both linear models and REM. Results show that both standard and within-group REM-based implementations substantially reduce bias when the unmeasured cluster-level variable acts solely as a confounder, whereas the proposed within-group REM estimator achieves the lowest RMSE when the unmeasured cluster-level factor acts as both a confounder and a source of treatment effect heterogeneity. We apply the proposed within-group REM estimator to estimate the causal effect of adolescent pregnancy on the child height-for-age Z-score using 2019 Bangladesh MICS data, obtaining an estimated effect of -0.12 (95% bootstrap CI: [-0.18, -0.06]). The proposed within-group g-computation framework offers a strategy for reducing bias from unmeasured cluster-level confounding and treatment effect heterogeneity in hierarchical observational studies.

2606.13947 2026-06-15 stat.ME 新提交

Constraint-based difference graph discovery in a linear setting

线性设定下基于约束的差异图发现

Daria Bystrova, Emilie Devijver

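AI summary For inferring the causal difference graph between two environments, proposes a discovery method for linear structural causal models based on equality tests of regression coefficients, introducing the diff-separation criterion and the LDiffPC algorithm.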

详情
AI中文摘要

在许多科学领域中,比较不同群体间的因果关系至关重要。本文研究了推断两个环境间差异图的问题,并提出了一种基于回归系数相等性检验的线性结构因果模型因果发现方法。我们表明,回归系数的不变性受超越标准d-separation的图形条件支配。因此,我们引入了diff-separation,这是一种图形准则,用于刻画当条件集阻断所有能够引起跨环境回归系数差异的路径时的情形。基于这一准则,我们引入了相应的diff-faithfulness假设,将图形diff-separation陈述与回归系数的等式约束联系起来。最后,我们提出了LDiffPC,一种PC风格的算法,该算法利用回归系数的相等性检验从多环境数据中恢复差异。

英文摘要

Comparing causal relationships across populations is essential in many scientific domains. This paper studies the problem of inferring a difference graph between two environments and proposes a causal discovery method for linear structural causal models based on equality tests of regression coefficients. We show that invariance of regression coefficients is governed by graphical conditions that go beyond standard d-separation. Therefore, we introduce diff-separation, a graphical criterion that characterizes when a conditioning set blocks all paths capable of inducing differences in regression coefficients across environments. Building on this criterion, we introduce a corresponding diff-faithfulness assumption, linking graphical diff-separation statements to equality constraints on regression coefficients. Finally, we propose LDiffPC, a PC-style algorithm that uses equality tests of regression coefficients to recover the differences from multi-environment data.

2505.17961 2026-06-15 stat.ME cs.AI math.ST stat.AP stat.TH 版本更新

Federated Causal Inference from Multi-Site Observational Data via Propensity Score Aggregation

基于倾向得分聚合的多中心观测数据联邦因果推断

Rémi Khellaf, Aurélien Bellet, Julie Josse

Affiliation * University of Technology, CNRS, France

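AI summary Proposes aggregating site-level propensity scores through federated learning, using membership weights to estimate the average treatment effect when multi-site observational data cannot be centralized because of privacy constraints.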

详情
AI中文摘要

因果推断通常假设可以集中访问个体层面数据。然而,在实践中,数据往往分散在多个站点,由于隐私、后勤或法律限制,集中化不可行。我们通过联邦学习方法从分散的观测数据中估计平均处理效应来解决这个问题,允许通过交换聚合统计量而非个体层面数据进行推断。我们提出了一种新方法,使用成员权重(定义为给定协变量条件下站点成员的概率)通过联邦加权平均局部得分来估计倾向得分。成员权重可以使用标准联邦学习算法通过参数或非参数分类模型灵活估计。得到的倾向得分用于构建联邦逆概率加权和增强逆概率加权估计量。与元分析方法(当任何站点违反积极性时失败)相比,我们的方法利用跨站点处理分配的异质性来改善重叠。我们表明,在站点层面的样本量、处理机制和协变量分布异质性下,联邦逆概率加权和增强逆概率加权表现良好。理论分析以及在模拟和真实数据上的实验证明了相对于元分析及相关方法的明显优势。

英文摘要

Causal inference typically assumes centralized access to individual-level data. Yet, in practice, data are often decentralized across multiple sites, making centralization infeasible due to privacy, logistical, or legal constraints. We address this problem by estimating the Average Treatment Effect (ATE) from decentralized observational data via a Federated Learning (FL) approach, allowing inference through the exchange of aggregate statistics rather than individual-level data. We propose a novel method to estimate propensity scores via a federated weighted average of local scores using Membership Weights (MW), defined as probabilities of site membership conditional on covariates. MW can be flexibly estimated with parametric or non-parametric classification models using standard FL algorithms. The resulting propensity scores are used to construct Federated Inverse Propensity Weighting (Fed-IPW) and Augmented IPW (Fed-AIPW) estimators. In contrast to meta-analysis methods, which fail when any site violates positivity, our approach exploits heterogeneity in treatment assignment across sites to improve overlap. We show that Fed-IPW and Fed-AIPW perform well under site-level heterogeneity in sample sizes, treatment mechanisms, and covariate distributions. Theoretical analysis and experiments on simulated and real-world data demonstrate clear advantages over meta-analysis and related approaches.
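
The aggregation step described in this abstract, a weighted average of local propensity scores with membership weights, can be illustrated with a small sketch. It assumes that the membership weights and local scores have already been estimated; the function names `federated_propensity` and `ipw_ate` are hypothetical, and the estimator shown is a plain inverse propensity weighting average, not necessarily the exact Fed-IPW form used in the paper.

```python
def federated_propensity(membership_weights, local_scores):
    """Combine local scores e_k(x) using weights P(site = k | x) for one covariate value x."""
    if len(membership_weights) != len(local_scores):
        raise ValueError("one membership weight is needed per site")
    if abs(sum(membership_weights) - 1.0) > 1e-9:
        raise ValueError("membership weights must sum to one")
    # Law of total probability: P(W=1 | x) = sum_k P(site=k | x) * P(W=1 | x, site=k).
    return sum(w * e for w, e in zip(membership_weights, local_scores))


def ipw_ate(treatments, outcomes, propensities):
    """Plain inverse propensity weighting estimate of the average treatment effect."""
    total = 0.0
    for w, y, e in zip(treatments, outcomes, propensities):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity must lie strictly between 0 and 1")
        total += w * y / e - (1 - w) * y / (1.0 - e)
    return total / len(outcomes)
```

The weighted average also shows why overlap can improve: a site where nobody with covariates x is treated contributes a local score of 0, but the combined score stays positive as long as some other site with non-zero membership weight treats such individuals.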

4. High-Dimensional Statistics and Regularization (2 papers)

2606.14436 2026-06-15 stat.ME math.OC stat.ML 新提交

Joint Nuclear and $\ell_1$ Regularization for Logistic Matrix Regression with Applications to Brain Imaging

联合核范数和ℓ1正则化的逻辑矩阵回归及其在脑成像中的应用

Damian Brzyski, Aaron Cohen, Zijian Wang, Mario Dzemidzic, David A. Kareken, Jaroslaw Harezlak

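AI summary Proposes a convex optimization framework combining nuclear-norm and $\ell_1$ penalties for logistic scalar-on-matrix regression, yielding a coefficient matrix that is both low-rank and sparse, with an application to brain imaging data.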

详情
AI中文摘要

我们引入了一种新的凸优化框架用于逻辑标量-矩阵回归,该框架结合了核范数和ℓ1范数惩罚,以强制估计系数矩阵同时具有低秩和稀疏结构。所提出的方法能够在存在二元响应的情况下对高维矩阵值预测变量进行可解释建模。我们基于交替方向乘子法(ADMM)推导了一种定制算法,以高效求解由此产生的凸优化问题,并建立了所得解的理论性质。数值实验清楚地证明了我们的方法在恢复有意义的预测模式方面的有效性。最后,我们将该方法应用于脑成像数据,以识别功能性脑连接矩阵中具有酒精使用障碍(AUD)家族史受试者特征的结构。

英文摘要

We introduce a new convex optimization framework for logistic scalar-on-matrix regression that incorporates nuclear and $\ell_1$ norm penalties to simultaneously enforce low-rank and sparse structures in the estimated coefficient matrix. The proposed method enables interpretable modeling of high-dimensional matrix-valued predictors in the presence of binary responses. We derive a custom algorithm based on the Alternating Direction Method of Multipliers (ADMM) to efficiently solve the resulting convex optimization problem and establish the theoretical properties of the obtained solution. Numerical experiments clearly demonstrate the effectiveness of our method in recovering meaningful predictive patterns. Finally, we apply our method to brain imaging data to identify structures in functional brain connectivity matrices that are characteristic of subjects with a family history of alcohol use disorders (AUDs).
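
One plausible way to write the penalized objective described in this abstract is given below. The notation is an assumption, since the abstract introduces no symbols: $X_i$ is the matrix-valued predictor, $y_i \in \{0, 1\}$ the binary response, $B$ the coefficient matrix, $b_0$ an intercept, and $\lambda_1, \lambda_2 \ge 0$ tuning parameters.

```latex
\min_{b_0,\, B} \;
\sum_{i=1}^{n} \Bigl[ \log\bigl(1 + e^{\eta_i}\bigr) - y_i \eta_i \Bigr]
+ \lambda_1 \lVert B \rVert_{*}
+ \lambda_2 \lVert B \rVert_{1},
\qquad
\eta_i = b_0 + \langle X_i, B \rangle
```

Here $\lVert B \rVert_{*}$ is the sum of singular values, which encourages low rank, and $\lVert B \rVert_{1}$ is the entrywise absolute sum, which encourages sparsity. All three terms are convex in $(b_0, B)$, which is consistent with the convex framework the abstract describes.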

2405.03063 2026-06-15 math.ST cs.IT cs.LG math.IT stat.ME stat.ML stat.TH 版本更新

Stability of a Generalized Debiased Lasso with Applications to Resampling-Based Variable Selection

广义去偏Lasso的稳定性及其在基于重抽样的变量选择中的应用

Jingbo Liu

Affiliation * Department of Statistics, University of Illinois Urbana-Champaign; Department of Electrical and Computer Engineering, The Grainger College of Engineering

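AI summary Proposes a generalized debiased Lasso estimator based on a stability principle; a simple update formula under perturbation of a single column of the design matrix gives an asymptotically accurate approximation in the proportional growth regime and significantly reduces the computational cost of resampling-based variable selection.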

Comments to appear in Bernoulli

详情
AI中文摘要

我们提出了一种基于稳定性原理的广义去偏Lasso估计量。当设计矩阵的单列被扰动时,该估计量允许一个简单的更新公式,可以从原始解计算得出。在具有良好条件协方差的次高斯设计下,这种近似在比例增长机制下对于除消失比例坐标外的所有坐标是渐近精确的。证明依赖于集中和反集中论证来控制误差项和符号变化。相比之下,在类似假设下建立可比较的分布极限(例如高斯性)仍然是开放的。作为一个应用,我们表明该近似显著降低了基于重抽样的变量选择过程的计算成本,包括条件随机化测试和局部knockoff滤波器。

英文摘要

We propose a generalized debiased Lasso estimator based on a stability principle. When a single column of the design matrix is perturbed, the estimator admits a simple update formula that can be computed from the original solution. Under sub-Gaussian designs with well-conditioned covariance, this approximation is asymptotically accurate for all but a vanishing fraction of coordinates in the proportional growth regime. The proof relies on concentration and anti-concentration arguments to control error terms and sign changes. In contrast, establishing comparable distributional limits (e.g., Gaussianity) under similar assumptions remains open. As an application, we show that the approximation significantly reduces the computational cost of resampling-based variable selection procedures, including the conditional randomization test and a local knockoff filter.

5. Time Series and Spatial Statistics (10 papers)

2606.14532 2026-06-15 physics.soc-ph stat.ME 新提交

Modeling inhomogeneous spatial point configurations with applications to replicated patterns in waiting crowds

非均匀空间点构型建模及其在等待人群重复模式中的应用

Lars Sickert Karam, Rui M. Castro, Maarten Schoukens, Alessandro Corbetta

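AI summary Proposes a method for inferring semiparametric spatial point processes from replicated spatial patterns and applies it to modeling waiting crowds; by separating location attractiveness from repulsive interactions, the models are validated on simulated and real data.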

Comments 35 pages, 24 figures

详情
AI中文摘要

在本文中,我们通过两个相互关联的贡献将空间点过程的统计推断与等待行人人群的分析联系起来。首先,在方法论方面,我们开发了一种利用重复空间模式(即来自同一过程的多个近似独立的实现)进行半参数空间点过程模型的推断程序。其次,我们展示了空间点过程为等待行人提供了合适的建模框架,捕捉了两个关键方面:由位置吸引力驱动的空间非均匀性和行人之间的排斥交互作用。这两个组成部分本身是推断问题的核心,因为空间点过程建模依赖于从交互作用中分离背景强度。尽管重复空间模式在点过程文献中很少见,但通过一个独特的真实行人数据集,我们在此获得了这些模式,从而将方法论的发展与物理应用直接联系起来。我们使用所提出的方法在模拟研究和真实案例研究中拟合和评估行列式点过程和吉布斯点过程。尽管在解耦非均匀性和交互作用的影响方面仍然存在挑战,但这些模型能够再现等待行人的关键经验特征。

英文摘要

In this article, we connect statistical inference for spatial point processes with the analysis of waiting pedestrian crowds through two interconnected contributions. First, on the methodological side, we develop an inference procedure for semiparametric spatial point process models leveraging replicated spatial patterns, i.e., multiple approximately independent realizations from the same process. Second, we show that spatial point processes provide a suitable modeling framework for waiting pedestrians, capturing two key aspects: spatial inhomogeneity driven by location attractiveness and repulsive interactions between pedestrians. These two components are central to the inference problem itself, since spatial point process modeling hinges on disentangling background intensity from interaction. Although replicated spatial patterns are rare in the point process literature, they are available here through a unique real-life pedestrian dataset, thereby directly linking the methodological development to the physical application. We use the proposed methods to fit and evaluate determinantal and Gibbs point processes in a simulation study and a real-world case study. Despite persistent challenges in decoupling the influences of inhomogeneity from interaction, these models are able to reproduce key empirical features of waiting pedestrians.

2606.14417 2026-06-15 stat.AP 新提交

Stable Multivariate Functional Time Series Prediction for Major Geomagnetic Indices

主要地磁指数的稳定多元函数时间序列预测

Yian Yu, Shasha Zou, Tuija Pulkkinen, Yang Chen

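AI summary Proposes a robust multivariate functional time series forecasting framework that uses overlapping rolling windows to preserve temporal continuity and combines FPCA dimension reduction with a VARX model to capture cross-series dynamics; applied to five key geomagnetic indices, it outperforms existing methods at 6-24 hour horizons.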

详情
AI中文摘要

高分辨率科学数据,如地磁指数流,通常表现出复杂的时间依赖性,可以通过函数数据分析来建模。传统的函数时间序列方法通常将连续过程划分为不重叠的片段,这人为地破坏了时间连续性,并可能限制估计效率和稳定性。这在具有噪声、突然和大尺度变化的地磁时间序列预测中尤为明显。本研究提出了一种鲁棒的多元函数时间序列预测框架,用于具有序列间相关性和外生预测变量的多维时间序列。我们引入了一种重叠滚动窗口方案,以保持时间一致性并减少边界信息损失,从而丰富有效样本量,实现更高效和稳定的估计。我们集成了函数主成分分析进行降维,以及带有外生输入的向量自回归模型,以捕捉相关序列间的潜在动态。我们还构建了计算高效的一致性预测区间用于不确定性量化。该框架受同时预测五个关键地磁指数Kp、Dst、SYM-H、SME和SMR的启发,并应用于此,使用太阳风参数作为预测变量。实证结果表明,该方法优于最先进的机器学习基线,将预测范围扩展到6-24小时,并提供校准的不确定性界限。

英文摘要

High-resolution scientific data, such as geomagnetic index streams, often exhibit complex temporal dependencies that can be modeled through functional data analysis. Conventional functional time series (FTS) methods typically partition continuous processes into non-overlapping segments, which artificially fragments temporal continuity and can limit estimation efficiency and stability. This is particularly evident in geomagnetic time series prediction due to their noisy, sudden, and large-scale changes. This study presents a robust multivariate FTS forecasting framework for multi-dimensional time series with inter-series correlations and the existence of exogenous predictors. We introduce an overlapping rolling-window scheme that preserves temporal coherence and mitigates boundary information loss, thereby enriching the effective sample size for a more efficient and stable estimation. We integrate functional principal component analysis for dimension reduction with a vector autoregressive model with exogenous inputs to capture latent dynamics across correlated series. We also construct computationally efficient conformal prediction intervals for uncertainty quantification. The framework is motivated by and applied to the simultaneous forecasting of five critical geomagnetic indices, Kp, Dst, SYM-H, SME, and SMR, using solar wind parameters as predictors. Empirical results show that this approach outperforms state-of-the-art machine learning baselines, extends forecast horizons to 6-24 hours, and provides calibrated uncertainty bounds.
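
The contrast between non-overlapping segments and an overlapping rolling-window scheme can be made concrete with a short sketch. The function name `rolling_windows` and its `window` and `step` parameters are illustrative assumptions; the abstract does not specify how the authors parameterize their windows.

```python
def rolling_windows(series, window, step):
    """Cut a series into segments of length `window`, advancing by `step` each time."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(series) - window
    return [series[start:start + window] for start in range(0, last_start + 1, step)]
```

With `step == window` the segments do not overlap, which is the conventional partition. With `step < window` consecutive segments share observations, so the same series yields more functional observations and no boundary falls between segments that are never seen together.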

2606.14313 2026-06-15 stat.ML cs.LG 新提交

Nonlocal Bayesian Modeling of Continuous Spatio-Temporal Dynamics

连续时空动力学的非局部贝叶斯建模

Jaeyeong Lee, Heeyoung Kim

Affiliation * Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology (KAIST)

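AI summary Proposes the NLBST model, which combines a coordinate-based basis expansion and a continuous-time ODE with a nonlocal integro-differential equation to deliver continuous spatio-temporal forecasting and uncertainty quantification under irregular observations.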

Comments Accepted at UAI 2026

详情
AI中文摘要

现实世界的时空预测必须处理不规则时间点、空间稀疏观测以及不确定性量化的需求。这种设置通常因非局部相互作用(长程空间耦合)而进一步复杂化。对连续空间、连续时间的非局部动力学进行建模自然会导致无限维积分微分方程(IDE),使得原则性的贝叶斯推断变得棘手。我们提出了非局部贝叶斯时空模型(NLBST),这是一个用于连续时空场的分层贝叶斯框架,它在保留可处理推断的同时学习显式的非局部耦合。NLBST通过基于坐标的空间基展开表示潜在场,并用连续时间ODE对系数过程进行建模,其可学习的线性算子对应于非局部IDE的伽辽金约化;神经ODE残差捕获额外的非线性动力学。线性高斯观测模型使得在缺失和不规则观测下能够进行卡尔曼式顺序更新,而空间基表示则使得无需重新训练即可在未测量位置进行归纳预测。全局参数通过变分推断学习,不确定性通过贝叶斯层次结构处理。在合成和真实数据集上的实验表明,该模型具有强大的预测能力和空间泛化能力,且不确定性校准良好,在强非局部和部分观测场景下相比基线方法取得了显著提升。

英文摘要

Real-world spatio-temporal forecasting must handle irregular time points, spatially sparse observations, and the need for uncertainty quantification. This setting is often further compounded by nonlocal interactions (long-range spatial coupling). Modeling continuous-space, continuous-time nonlocal dynamics naturally leads to infinite-dimensional integro-differential equations (IDEs), making principled Bayesian inference intractable. We propose the NonLocal Bayesian Spatio-Temporal model (NLBST), a hierarchical Bayesian framework for continuous spatio-temporal fields that learns explicit nonlocal coupling while retaining tractable inference. NLBST represents the latent field via a coordinate-based spatial basis expansion and models the coefficient process with a continuous-time ODE whose learnable linear operator corresponds to a Galerkin reduction of a nonlocal IDE; a Neural ODE residual captures additional nonlinear dynamics. A linear-Gaussian observation model enables Kalman-style sequential updates under missing and irregular observations, while the spatial basis representation enables inductive prediction at unmeasured locations without retraining. Global parameters are learned via variational inference, and uncertainty is handled through a Bayesian hierarchy. Experiments on synthetic and real-world datasets demonstrate strong forecasting and spatial generalization with well-calibrated uncertainty, yielding substantial gains over baselines in strongly nonlocal and partially observed regimes.

2606.14116 2026-06-15 cs.LG stat.ME 新提交

DTVEM-RE: A Hierarchical Random-Effects Extension of the Differential Time-Varying Effect Model for Person-Specific Multi-Lag Estimation in Intensive Longitudinal Data

DTVEM-RE:差分时变效应模型的分层随机效应扩展,用于密集纵向数据中个体特异性多滞后估计

Amartya Bhattacharya

Affiliation * Geisel School of Medicine, Dartmouth College

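AI summary To address DTVEM's assumption that everyone shares the same lag structure, proposes the DTVEM-RE extension, which lets each person have their own lag coefficients via a hierarchical Bayesian VAR and a continuous-time OU model; simulations and empirical analysis show that it recovers between-person variation and improves predictive performance.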

详情
AI中文摘要

Jacobson等人(2019)提出的差分时变效应模型(DTVEM)是寻找密集纵向数据中最佳时间滞后的流行工具,但它假设所有人共享相同的滞后结构。原作者将此问题列为未来工作,这与现代临床研究的前提——个体存在差异——相冲突。我们提出DTVEM-RE,一种允许每个人拥有自己滞后系数的扩展,包含两种确认步骤版本:在Stan中实现的离散时间分层贝叶斯VAR,它在个体间进行信息汇集并提供校准的不确定性;以及在ctsem中实现的连续时间个体Ornstein-Uhlenbeck模型,它直接处理不均匀间隔的测量点。我们报告了四个结果。模拟显示,贝叶斯版本恢复个体间变异tau_a的偏差低于0.01,覆盖率为90%至93%。在Fisher等人(2017)的EMA数据集(N=40)上,个体特异性滞后1效应在三个情绪项目上相差一个数量级,贝叶斯和GAMM估计高度一致(r=0.87至0.92),且DTVEM-RE在四种离散时间方法中给出最佳的一步预测。多滞后版本显示所有九个tau_k值的可信区间均排除零,且个体差异最大的滞后在不同项目间变化,这是仅考虑滞后1的方法(如mlVAR)无法检测到的。最后,两个版本在个体特异性滞后1估计上几乎完全一致(r >= 0.995),差异仅如收缩所预测。据我们所知,DTVEM-RE是DTVEM风格滞后检测的第一个个体特异性实现,并且它包含标准DTVEM作为特例。

英文摘要

The Differential Time-Varying Effect Model (DTVEM) of Jacobson et al. (2019) is a popular tool for finding the best time lag in intensive longitudinal data, but it assumes everyone shares the same lag structure. The original authors named fixing this as future work, and it clashes with the premise of modern clinical research, which is that people differ. We present DTVEM-RE, an extension that lets each person have their own lag coefficients, with two versions of the confirmatory step: a discrete-time hierarchical Bayesian VAR in Stan, which pools across people and gives calibrated uncertainty, and a continuous-time per-person Ornstein-Uhlenbeck model in ctsem, which handles unevenly spaced beeps directly. We report four results. A simulation shows the Bayesian version recovers the between-person spread tau_a with bias below 0.01 and coverage of 90 to 93 percent. On the Fisher et al. (2017) EMA dataset (N=40), person-specific lag-1 effects vary by an order of magnitude across three mood items, the Bayesian and GAMM estimates agree closely (r=0.87 to 0.92), and DTVEM-RE gives the best one-step-ahead prediction among four discrete-time methods. A multi-lag version shows all nine tau_k values have credible intervals excluding zero, and the lag where people differ most changes across items, something lag-1-only methods like mlVAR cannot detect. Finally, the two versions agree almost exactly on person-specific lag-1 estimates (r >= 0.995), differing only as shrinkage predicts. DTVEM-RE is, to our knowledge, the first person-specific implementation of DTVEM-style lag detection, and it contains standard DTVEM as a special case.

2606.13823 2026-06-15 cs.LG eess.SP stat.ML 新提交

A Stationarity-and-Coupling Criterion for Training-Free Time-Lagged Spectral Embeddings of Multivariate Time Series

多变量时间序列无训练时滞谱嵌入的平稳性与耦合准则

Siddharth Pal, Viktoria Rojkova

Affiliation * University of California, Berkeley

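AI summary Proposes a fixed-length descriptor D(τ) built from a truncated time-lagged correlation matrix and derives its applicability conditions from a stationary Gaussian VAR(1) model: the signals are approximately stationary and the class information lies in cross-channel temporal coupling rather than marginal power.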

Comments 25 pages, 2 figures, 10 tables

详情
AI中文摘要

我们研究多变量时间序列的无训练固定长度描述符,不仅问这样的描述符是否表现良好,而且问何时可以预期它有效。我们的研究对象是$D(\tau)$,它由时滞相关矩阵在Marchenko-Pastur边缘截断构建,使得仅信号承载的特征值存活,并通过与类质心的余弦相似度分类,零学习参数。核心贡献不是描述符本身,而是一个可证伪的适用性准则。基于平稳高斯VAR(1)模型,我们论证当信号近似平稳且类别信息存在于它们的跨通道时间耦合而非边际每通道功率时,$D(\tau)$能分离两个类别。我们半正式地推导出三个结果:可区分性条件、为什么静态($\tau=0$)协方差退化为随机、以及为什么平稳但功率判别范式会击败描述符。该准则是可操作的:一个两部分预检测试——增强Dickey-Fuller平稳性检验和功率基线饱和检验——在任何训练前预测适用性。我们在混合数据集上验证了这两部分。在满足准则的四个范式(Sleep-EDF、BCI-IV-2a、MIT-BIH、ESC-50)上,描述符以极低成本与强基线竞争,在Sleep-EDF上20受试者留一法下达到$88.5\pm4.5\%$,单CPU线程。在违反准则的三个范式——非平稳ERP、以及功率判别的金融波动和可穿戴压力模式——上,它完全如预检预测的那样失败,而这些负面结果更具信息量。我们明确$D(\tau)$不是最准确的表示;其价值在于它是一个紧凑、无训练的嵌入,其有效域事先已知。

英文摘要

We study training-free fixed-length descriptors for multivariate time series and ask not merely whether such a descriptor performs well, but when it can be expected to work at all. Our object of study is $D(\tau)$, built from a time-lagged correlation matrix truncated at the Marchenko-Pastur edge so that only signal-bearing eigenvalues survive and classified by cosine similarity to class centroids with zero learned parameters. The central contribution is not the descriptor but a falsifiable applicability criterion for it. Working from a stationary Gaussian VAR(1) model, we argue that $D(\tau)$ separates two classes when the signals are approximately stationary and the class information lives in their cross-channel temporal coupling rather than in marginal per-channel power. We derive, semi-formally, three consequences: a distinguishability condition, why the static ($\tau=0$) covariance collapses to chance, and why a stationary but power-discriminated paradigm defeats the descriptor. The criterion is operational: a two-part pre-flight test -- an augmented Dickey-Fuller stationarity check and a power-baseline saturation check -- predicts applicability before any training. We validate both halves on a mixed assortment. On four paradigms that satisfy the criterion (Sleep-EDF, BCI-IV-2a, MIT-BIH, ESC-50) the descriptor is competitive with strong baselines at a fraction of their cost, reaching $88.5\pm4.5\%$ under 20-subject leave-one-subject-out on Sleep-EDF on a single CPU thread. On three that violate it -- non-stationary ERPs, and financial-volatility and wearable-stress regimes that are power-discriminated -- it fails exactly as the pre-flight predicts, and these negatives are the more informative half. We are explicit that $D(\tau)$ is not the most accurate representation; its value is a compact, training-free embedding whose domain of validity is known in advance.
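
Two ingredients named in this abstract, the Marchenko-Pastur edge and classification by cosine similarity to class centroids, can be sketched in a few lines. This is an illustration under assumptions, not the authors' pipeline: the edge formula is the classical one for a zero-lag sample correlation matrix of independent unit-variance channels, the abstract does not say how $D(\tau)$ is assembled from the truncated matrix, and the function names are hypothetical.

```python
import math


def marchenko_pastur_upper_edge(n_channels, n_samples):
    """Classical upper edge (1 + sqrt(N / T))^2 for a pure-noise sample correlation matrix."""
    ratio = n_channels / n_samples
    return (1.0 + math.sqrt(ratio)) ** 2


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def classify_by_centroid(descriptor, centroids):
    """Return the label whose centroid has the highest cosine similarity to the descriptor."""
    return max(centroids, key=lambda label: cosine_similarity(descriptor, centroids[label]))
```

Eigenvalues above the edge would be treated as signal-bearing and the rest discarded. The classifier itself has no fitted parameters beyond the class centroids, which matches the training-free framing of the abstract.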

2606.02231 2026-06-15 stat.ML cs.LG stat.ME 版本更新

Identifiable Markov Switching Models with Instantaneous Effects and Exponential Families

具有瞬时效应和指数族的可识别马尔可夫切换模型

Roel Hulsman, Carles Balsells-Rodas, Sara Magliacane

Affiliation * University of Amsterdam

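AI summary For non-stationary time series, develops identifiability theory for Markov switching models with instantaneous effects under exponential-family noise, and introduces the FlowMSM framework for detecting latent regimes and recovering causal structure.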

Comments International Conference on Machine Learning (ICML) 2026

详情
AI中文摘要

时间系统通常表现出非平稳行为,例如季节性气候变化或1型糖尿病患者的血糖波动。对非平稳性建模的一种方法是通过离散隐状态,即时间的平稳片段。此类系统诱导出马尔可夫切换模型(MSM),这是一类隐马尔可夫模型,其中隐状态和观测变量之间存在自回归依赖关系。在存在频繁状态切换以及非线性和非高斯动态的情况下,特别是在变量之间存在瞬时效应(例如由于测量速率较慢)时,识别隐状态具有挑战性。在这项工作中,我们建立了在时间状态依赖、非线性滞后和瞬时效应以及来自指数族的独立噪声下,隐状态和状态依赖因果结构的可识别性。我们的可识别性理论涵盖了因果模型的非时间混合。此外,我们引入了FlowMSM,这是一个状态检测框架,可与任何平稳因果发现方法配对,以恢复状态依赖的因果结构。在合成基准和金融经济学数据集上的实验证明了我们的方法在检测隐状态和从非平稳时间序列中发现因果结构方面的有效性。

英文摘要

Temporal systems often exhibit non-stationary behaviour, such as seasonal climate variation or glucose fluctuations in patients with type-1 diabetes. One way to model non-stationarity is through discrete latent regimes, i.e., stationary segments of time. Such systems induce a Markov Switching Model (MSM), a class of Hidden Markov Models with autoregressive dependencies among latent regimes and observed variables. Identifying latent regimes is challenging in the presence of frequent regime switches and nonlinear and non-Gaussian dynamics, particularly when there are instantaneous effects between the variables, e.g., due to slow rates of measurements. In this work, we establish the identifiability of both latent regimes and regime-dependent causal structures under temporal regime dependencies, nonlinear lagged and instantaneous effects, and independent noise from the exponential family. Our identifiability theory subsumes non-temporal mixtures of causal models. Furthermore, we introduce FlowMSM, a regime detection framework that can be paired with any stationary causal discovery method to recover regime-dependent causal structures. Experiments on synthetic benchmarks and a financial economics dataset demonstrate the effectiveness of our approach to detect latent regimes and discover causal structures from non-stationary time series.

2509.06697 2026-06-15 econ.EM cs.LG stat.AP stat.ML 版本更新

Neural ARFIMA model for forecasting BRIC exchange rates with long memory

具有长期记忆的神经ARFIMA模型用于预测BRIC汇率

Donia Besher, Madhurima Panja, Shovon Sengupta, Tanujit Chakraborty

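AI summary Proposes the neural ARFIMA model, which combines the long-memory structure of ARFIMA with the nonlinear capacity of neural networks to improve the accuracy of BRIC exchange rate forecasts.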

详情
AI中文摘要

准确预测汇率仍是一个持续挑战,特别是对于新兴经济体如巴西、俄罗斯、印度和中国(BRIC)。这些序列表现出长期记忆和非线性,传统时间序列模型难以捕捉。汇率动态还受全球经济政策不确定性、美国股市波动性、美国货币政策不确定性、油价增长率和短期利率等因素影响。本文提出神经自回归分数积分移动平均(NARFIMA)模型,结合ARFIMA的长期记忆结构和神经网络的非线性学习能力,并纳入外生变量。我们建立了NARFIMA的渐近平稳性,并利用符合预测区间量化预测不确定性。实证结果表明,NARFIMA在预测BRIC汇率方面始终优于基准方法。

英文摘要

Exchange rate forecasting remains a challenging problem, particularly for emerging economies, where the observed time series exhibit pronounced long-memory dependence, nonlinear dynamics, and sensitivity to macro-financial drivers. Classical models such as ARFIMA capture long-range persistence but fail to adequately represent nonlinear relationships, while modern machine learning approaches often neglect the underlying long-memory structure in macroeconomic series. To address this gap, we propose a Neural AutoRegressive Fractionally Integrated Moving Average (NARFIMA) model that integrates ARFIMA-based long-memory modeling with neural networks for nonlinear function approximation, while incorporating exogenous macroeconomic and uncertainty indicators. The framework provides a unified approach for capturing persistence, nonlinear dynamics, and external shocks. We establish asymptotic stationarity of the NARFIMA process and develop conformal prediction intervals for distribution-free uncertainty quantification. Empirical results for BRIC exchange rates show that NARFIMA consistently outperforms a broad range of forecasting benchmarks across multiple horizons, underscoring the importance of explicitly modeling long-memory dependence in exchange rate dynamics. The `narfima` R package provides an implementation of our approach.
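
The long-memory component of ARFIMA rests on the fractional differencing operator $(1 - B)^d$, whose expansion weights decay slowly for non-integer $d$. The sketch below computes those weights with the standard recursion and applies a truncated filter. It illustrates only this one ingredient; it is not the NARFIMA model and does not reflect the API of the `narfima` package, and the function names are assumptions.

```python
def frac_diff_weights(d, n_weights):
    """Coefficients pi_0, ..., pi_n of (1 - B)^d: pi_0 = 1, pi_k = pi_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_weights + 1):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d, n_weights):
    """Apply the truncated fractional difference filter to a series."""
    weights = frac_diff_weights(d, n_weights)
    result = []
    for t in range(len(series)):
        # Use only the lags that are available at time t.
        lags = min(t, n_weights)
        result.append(sum(weights[k] * series[t - k] for k in range(lags + 1)))
    return result
```

For `d = 1` the weights reduce to `1, -1, 0, 0, ...`, which is the ordinary first difference. For `0 < d < 0.5` every lag keeps a small non-zero weight, which is how the model encodes long-range persistence.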

2512.03777 2026-06-15 stat.ME stat.AP stat.ML 版本更新

A comparison between initialization strategies for the infinite hidden Markov model

无限隐马尔可夫模型的初始化策略比较

Federico P. Cortese, Luca Rossini

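AI summary For the infinite hidden Markov model, systematically evaluates initialization strategies commonly used for finite HMMs and finds that distance-based clustering initializations outperform model-based and uniform random ones.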

详情
AI中文摘要

无限隐马尔可夫模型为建模具有结构变化和复杂动态的时间序列提供了灵活的框架,无需预先指定潜在状态的数量。这种灵活性通过层次狄利克雷过程先验实现,而高效的贝叶斯推断则通过波束采样器实现,该采样器结合动态规划和切片采样自适应地截断无限状态空间。尽管方法论发展广泛,但初始化在该框架中的作用受到的关注有限。本文通过系统评估有限隐马尔可夫模型常用的初始化策略,并评估它们在无限设置中的适用性,填补了这一空白。模拟和真实数据集的结果表明,基于距离的聚类初始化始终优于基于模型和均匀随机初始化,后者是现有文献中最广泛采用的。

英文摘要

Infinite hidden Markov models provide a flexible framework for modeling time series with structural changes and complex dynamics, without requiring the number of latent states to be specified in advance. This flexibility is achieved through the hierarchical Dirichlet process prior, while efficient Bayesian inference is enabled by the beam sampler, which combines dynamic programming with slice sampling to truncate the infinite state space adaptively. Despite extensive methodological developments, the role of initialization in this framework has received limited attention. This gap is addressed by systematically evaluating initialization strategies commonly used for finite hidden Markov models and assessing their suitability in the infinite setting. Results from both simulated and real datasets show that distance-based clustering initializations consistently outperform model-based and uniform alternatives, the latter being the most widely adopted in the existing literature.

2512.01513 2026-06-15 stat.ME 版本更新

Dynamic functional brain connectivity results depend on modeling assumptions: comparing the sliding-window method and the Wishart process for dynamic hypothesis testing

动态功能脑连接结果依赖于建模假设:比较滑动窗口法和Wishart过程在动态假设检验中的应用

Hester Huijsdens, Linda Geerligs, Max Hinne

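AI summary Compares the sliding-window method with Bayesian hypothesis testing based on Wishart processes for detecting dynamic functional connectivity, finds that modeling assumptions markedly affect the results, and stresses the importance of choosing assumptions carefully.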

详情
AI中文摘要

理解功能脑连接的时间动态对于解决网络神经科学中的各种问题至关重要,例如连接如何影响认知以及如何随疾病变化。一个基本挑战是评估连接是否真正表现出动态性,还是仅仅是静态的。最常用的方法使用滑动窗口方法对随时间变化的功能连接进行建模,并且通常与频率论假设检验框架结合以评估动态性。然而,这需要定义适当的抽样分布和超参数(如窗口长度),这给动态性施加了特定的假设。在这里,我们探讨这些假设如何影响动态连接的检测,并引入一种基于Wishart过程的贝叶斯假设检验的替代方法。该框架估计连接估计中的不确定性,并利用它为动态和静态连接提供证据强度。它通过先验分布编码假设,允许将关于连接时间依赖结构的先验知识纳入模型。通过模拟,我们比较了两种方法,并展示了不同假设如何影响动态连接的检测。最后,通过将两种方法应用于fMRI工作记忆任务,我们发现组水平的结论对建模选择的鲁棒性增强。我们的工作强调了在评估动态连接时仔细考虑建模假设的重要性。

英文摘要

Understanding the temporal dynamics of functional brain connectivity is important for addressing various questions in network neuroscience, such as how connectivity affects cognition and changes with disease. A fundamental challenge is to evaluate whether connectivity truly exhibits dynamics or is simply static. The most common approach uses sliding-window methods to model functional connectivity over time, and this is often combined with frequentist hypothesis testing frameworks to evaluate dynamics. However, this requires defining appropriate sampling distributions and hyperparameters, such as window length, which imposes specific assumptions on the dynamics. Here, we explore how these assumptions influence the detection of dynamic connectivity, and introduce an alternative approach based on Bayesian hypothesis testing with Wishart processes. This framework estimates uncertainty in the connectivity estimates, and uses this to provide strength of evidence for both dynamic and static connectivity. It encodes assumptions through prior distributions, allowing prior knowledge on the time-dependent structure of connectivity to be incorporated into the model. Using simulations, we compare the two approaches and demonstrate how different assumptions affect the detection of dynamic connectivity. Finally, by applying both approaches to an fMRI working-memory task, we find that conclusions drawn at the group level are more robust to modeling choices. Our work highlights the importance of carefully considering modeling assumptions when evaluating dynamic connectivity.
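
The sliding-window method discussed in this abstract can be illustrated for a single pair of signals. The sketch below computes a Pearson correlation inside each window; the function names and the `window` and `step` parameters are assumptions, and the hypothesis-testing layer that the abstract compares is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def sliding_window_correlation(x, y, window, step=1):
    """Correlation between x and y inside each window of length `window`."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    if window < 2 or step <= 0:
        raise ValueError("window must be at least 2 and step positive")
    starts = range(0, len(x) - window + 1, step)
    return [pearson(x[s:s + window], y[s:s + window]) for s in starts]
```

The window length is the hyperparameter the abstract singles out: short windows give noisy estimates that can look dynamic even when connectivity is static, while long windows smooth away genuine changes.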

2508.00542 2026-06-15 physics.soc-ph cs.IT math.IT physics.data-an physics.med-ph stat.ME 版本更新

Assessing (im)balance in signed brain networks

评估带符号脑网络中的(不)平衡

Marzio Di Vece, Emanuele Agrimi, Samuele Tatullo, Tommaso Gili, Miguel Ibáñez-Berganza, Tiziano Squartini

AI总结 本文提出一种基于信息论和假设检验的方法,将多元时间序列投影为带符号图,并应用于脑网络,发现脑网络是受挫的(frustrated),且负子图主要来自皮层下结构。

Comments 44 pages, 19 figures, 1 table

详情
AI中文摘要

许多复杂系统——无论是金融、自然还是社会系统——都由单元(如股票、神经元或智能体)组成,其联合活动可以表示为多元时间序列。一个既具有实际重要性又具有理论重要性的问题涉及仅从动态状态推断任意两个单元之间是否存在静态关系。本文旨在传统假设检验框架内解决这一问题:简而言之,我们的建议是,如果两个单元的行为足够相似,则将它们连接起来。为了实现这一目标,我们通过以下步骤将多元时间序列投影到带符号图上:i) 将前者的经验性质与在适当基准下预期的性质进行比较,以及ii) 如果相应序列共享显著大量的一致(不一致)值,则用正(负)边连接任意两个单元。为了定义我们的基准,我们采用一种基于信息论的方法,该方法根植于香农熵的约束最大化,这一过程产生了一个多元时间序列的集成,该集成平均保留了某些经验性质,同时随机化了其他所有内容。我们通过解决神经科学领域中最及时的问题之一——即确定脑网络是否受挫,如果是,受挫程度如何——来展示我们方法的可能应用。正如我们的结果所示,情况确实如此,对潜在负子图的主要贡献来自皮层下结构(以及较小程度上来自边缘区域)。在介观层面,使用带符号随机块模型实例化的贝叶斯信息准则的最小化表明,大脑区域聚集成与松弛平衡理论的统计变体一致的模块。

英文摘要

Many complex systems, be they financial, natural, or social, are composed of units (such as stocks, neurons, or agents) whose joint activity can be represented as a multivariate time series. An issue of both practical and theoretical importance concerns the possibility of inferring the presence of a static relationship between any two units solely from their dynamic behaviour. The present contribution aims at tackling such an issue within the framework of traditional hypothesis testing: briefly speaking, our suggestion is that of linking any two units if they behave in a sufficiently similar way. To achieve such a goal, we project a multivariate time series onto a signed graph by i) comparing the empirical properties of the former with those expected under a suitable benchmark and ii) linking any two units with a positive (negative) edge if the corresponding series share a significantly large number of concordant (discordant) values. To define our benchmarks, we adopt an information-theoretic approach that is rooted in the constrained maximisation of Shannon entropy, a procedure inducing an ensemble of multivariate time series that preserves some of the empirical properties on average, while randomising everything else. We showcase the possible applications of our method by addressing one of the most timely issues in the domain of neurosciences, i.e. that of determining if brain networks are frustrated or not, and, if so, to what extent. As our results suggest, this is indeed the case, with the major contribution to the underlying negative subgraph coming from the subcortical regions (and, to a lesser extent, from the limbic ones). At the mesoscopic level, the minimisation of the Bayesian Information Criterion, instantiated with the Signed Stochastic Block Model, reveals that brain regions gather into modules aligning with the statistical variant of the Relaxed Balance Theory.

6. 计算统计与MCMC 5 篇

2606.14289 2026-06-15 math.OC cs.LG cs.NA cs.NE math.NA stat.ML 新提交

Operator Calculus for Population-Based Optimization: A Mean-Field Convergence Theory

基于群体的优化的算子演算:平均场收敛理论

Pekka Malo, Lauri Viitasaari, Patrik Nummi, Antti Suominen, Ankur Sinha, Olli Tahvonen

发表机构 * Aalto University(阿尔托大学)

AI总结 提出一种算子演算,将多种基于群体的优化方法统一为三个基本算子(变异、选择、重组)的复合,并建立模块化Lyapunov原理,证明在稳定性和正则性条件下指数收敛。

Comments 71 pages, 4 figures, 2 tables; ancillary files contain Python code reproducing the numerical experiments

详情
AI中文摘要

基于群体的和分布优化方法,从进化策略和基于共识的优化到协方差矩阵适应和视为分布动力学的随机梯度方法,被广泛用于非凸或黑箱问题,但它们的收敛分析仍然分散在特定算法的技术中。我们引入一种算子演算,其中一大类这样的方法,在选择适当的状态空间并在必要时通过记忆或策略变量增强状态后,被描述为作用于概率测度的三个基本算子(变异、选择、重组)的复合。在明确的稳定性和正则性条件下,复合算子允许一个预生成子,其连续时间极限是一个保持算子分裂的输运-反应-跳跃(TRJ)偏微分方程。在此基础之上,我们建立了一个模块化的Lyapunov原理。如果状态空间Lyapunov函数既在完整生成子下耗散,又控制相关的搜索空间度量,那么状态空间Lyapunov泛函和诱导的搜索误差指数衰减。加性生成子结构允许逐个算子地组装耗散估计,为验证复合平均场算法的收敛性提供了一个工具箱。

英文摘要

Population-based and distributional optimization methods, from evolution strategies and consensus-based optimization to covariance-matrix adaptation and stochastic gradient methods viewed as distributional dynamics, are widely used for nonconvex or black-box problems, yet their convergence analyses remain fragmented across algorithm-specific techniques. We introduce an operator calculus in which a broad class of such methods, after choosing an appropriate state space and, where necessary, augmenting the state by memory or strategy variables, is described as a composition of three elementary operators (mutation, selection, and recombination) acting on probability measures. Under explicit stability and regularity conditions, the composite operator admits a pre-generator whose continuous-time limit is a transport-reaction-jump (TRJ) PDE that preserves the operator splitting. On this foundation we establish a modular Lyapunov principle. If a state-space Lyapunov function both dissipates under the full generator and controls the relevant search-space gauges, then the state-space Lyapunov functional and the induced search errors decay exponentially. The additive generator structure allows dissipation estimates to be assembled operator by operator, providing a toolkit for certifying convergence of composite mean-field algorithms.
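
The composition of the three elementary operators can be illustrated with a finite-population sketch. This is only a particle-level toy under stated assumptions: the paper works with operators on probability measures, whereas here the Gaussian mutation, truncation selection, midpoint recombination, the sphere objective, and all function names are choices made for this example.

```python
import random

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

def mutate(pop, sigma, rng):
    """Mutation: perturb every individual with isotropic Gaussian noise."""
    return [[v + rng.gauss(0.0, sigma) for v in x] for x in pop]

def select(pop, f, keep):
    """Selection: keep the `keep` individuals with the lowest objective value."""
    return sorted(pop, key=f)[:keep]

def recombine(parents, size, rng):
    """Recombination: each child is the midpoint of two distinct random parents."""
    children = []
    while len(children) < size:
        a, b = rng.sample(parents, 2)
        children.append([(u + v) / 2 for u, v in zip(a, b)])
    return children

def step(pop, f, sigma, rng):
    """One generation as the composition recombine(select(mutate(pop)))."""
    size = len(pop)
    return recombine(select(mutate(pop, sigma, rng), f, size // 2), size, rng)

def run(steps=50, size=20, dim=3, sigma=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2.0, 2.0) for _ in range(dim)] for _ in range(size)]
    best = [min(sphere(x) for x in pop)]  # best-so-far value, non-increasing by construction
    for _ in range(steps):
        pop = step(pop, sphere, sigma, rng)
        best.append(min(best[-1], min(sphere(x) for x in pop)))
    return pop, best

pop, best = run()
```

Keeping each operator as a separate function mirrors the modular view in the abstract, where properties are established operator by operator and then assembled for the composite.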

2606.13850 2026-06-15 stat.ME stat.ML 新提交

Controller-Augmented Hidden Markov Models: A Computational Framework for Constrained Sequential Inference

控制器增强隐马尔可夫模型:一种用于约束序列推理的计算框架

Lekha Patel, Luis Damiano

AI总结 提出控制器增强隐马尔可夫模型(CHMMs),通过将约束编译为有限状态控制器,在增广链上执行精确的前向后向和维特比递归,实现离散和连续时间下的约束推理,并保证推理精确性、EM单调上升、线性复杂度和误指定下的总变差界。

详情
AI中文摘要

隐马尔可夫模型是序列推理的基础,但其马尔可夫假设在路径约束(如优先级要求、访问基数或单调状态进展)下失效,这些约束引入长程依赖,使标准动态规划算法无效。为此,我们提出控制器增强隐马尔可夫模型(CHMMs),该框架将每个约束编译为一个跟踪最小充分历史的有限状态控制器,之后在增广链上执行标准的前向后向和维特比递归,在离散和连续时间(通过均匀化)下计算精确的约束后验和最大后验路径。我们建立了四个理论保证:约束推理的精确性、约束EM的单调上升、推理复杂度与控制器基数线性相关,以及约束误指定下的总变差界。一个涵盖排序、访问、路径和时间四类共11个约束族的控制器编码目录使该框架可操作。实验上,我们在三个不同性质的真实序列标注任务上评估CHMMs与6种替代解码器:果蝇基因结构解码、CASAS智能家居环境中的自由生活活动识别,以及可穿戴传感器的协议定义人类活动识别。结果揭示了清晰的局部与累积二分法:在累积约束体制下,控制器增强能唯一恢复全局可行轨迹,而在局部主导体制下,简单解码器在有效性上与之匹配。理论和实验共同刻画了何时需要精确控制器增强以及何时简单方法足够。

英文摘要

Hidden Markov models are foundational for sequential inference, but their Markovian assumption fails under pathwise constraints such as precedence requirements, visitation cardinalities, or monotonic state progression, which induce long-range dependencies that invalidate standard dynamic programming algorithms. To deal with this, we present Controller-Augmented Hidden Markov Models (CHMMs), a framework that compiles each constraint into a finite-state controller tracking the minimal sufficient history, after which standard forward-backward and Viterbi recursions on the augmented chain compute exact constrained posteriors and maximum a posteriori paths in both discrete and continuous time, the latter through uniformization. We establish four theoretical guarantees: exactness of constrained inference, monotone ascent of constrained EM, inference complexity linear in the controller cardinality, and a total-variation bound under constraint misspecification. A catalog of controller encodings covering 11 constraint families across the ordering, visitation, path, and temporal categories operationalizes the framework. Empirically, we evaluate CHMMs against 6 alternative decoders on 3 real-world sequence-labeling tasks of substantively different character: gene-structure decoding in Drosophila melanogaster, free-living activity recognition in CASAS smart-home environments, and protocol-defined human activity recognition from wearable sensors. The results reveal a clean local-versus-cumulative dichotomy in which controller augmentation is uniquely able to recover globally feasible trajectories on cumulative-constraint regimes, whilst simpler decoders are matched in validity on locally-dominated regimes. Together, theory and experiment characterize when exact controller augmentation is necessary and when simpler approaches suffice.
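
The idea of running a standard recursion on an augmented chain can be sketched for a single precedence constraint. This is a hedged toy, not the CHMM implementation: the one-bit controller ("has `first` been visited yet?"), the function name `constrained_viterbi`, and the two-state model are assumptions for illustration. The Viterbi recursion itself is unchanged; only the state space becomes (hidden state, controller state).

```python
import math

def constrained_viterbi(obs, states, log_init, log_trans, log_emit, first, then):
    """MAP path among paths in which `then` never occurs before `first` has occurred.

    Passing then=None disables the constraint and gives ordinary Viterbi.
    """
    def allowed(state, seen_first):
        return not (state == then and not seen_first)

    # delta maps an augmented state (state, seen_first) to the best log score so far.
    delta = {}
    for s in states:
        if allowed(s, False):
            delta[(s, s == first)] = log_init[s] + log_emit[s][obs[0]]
    backpointers = []
    for o in obs[1:]:
        new_delta, pointer = {}, {}
        for (s, seen), score in delta.items():
            for s2 in states:
                if not allowed(s2, seen):
                    continue
                key = (s2, seen or s2 == first)  # controller update
                cand = score + log_trans[s][s2] + log_emit[s2][o]
                if cand > new_delta.get(key, -math.inf):
                    new_delta[key] = cand
                    pointer[key] = (s, seen)
        backpointers.append(pointer)
        delta = new_delta
    end = max(delta, key=delta.get)
    path = [end]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return [s for s, _ in path], delta[end]

# Hypothetical two-state model: state "A" mostly emits "a", state "B" mostly emits "b".
states = ["A", "B"]
log_init = {s: math.log(0.5) for s in states}
log_trans = {s: {t: math.log(0.5) for t in states} for s in states}
log_emit = {"A": {"a": math.log(0.9), "b": math.log(0.1)},
            "B": {"a": math.log(0.1), "b": math.log(0.9)}}
obs = ["b", "a", "b"]

free_path, _ = constrained_viterbi(obs, states, log_init, log_trans, log_emit, "A", None)
path, score = constrained_viterbi(obs, states, log_init, log_trans, log_emit, "A", "B")
```

In this toy the unconstrained decoder starts in "B", which the precedence constraint forbids, so the constrained decoder pays the emission penalty for starting in "A". The cost of the augmentation is a factor equal to the number of controller states, here two.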

2604.19725 2026-06-15 math.ST stat.CO stat.TH 版本更新

Fast computation and theoretical guarantees for the NPMLE in exponential family mixtures

指数族混合模型中NPMLE的快速计算与理论保证

Yan Zhang

AI总结 提出数据压缩策略将NPMLE计算中似然评估成本降至样本量的多对数阶,并证明一类近似NPMLE的边际密度估计达到近乎参数收敛速度。

详情
AI中文摘要

本工作在指数族混合模型的(近似)非参数最大似然估计(NPMLE)研究中取得两项进展。首先,我们开发了一种数据压缩策略,将NPMLE计算中重复似然评估的成本降低到样本量的多对数阶。其次,我们证明,对于一大类近似NPMLE,得到的边际密度估计达到近乎参数的收敛速度。

英文摘要

This work makes two advances in the study of the (approximate) nonparametric maximum likelihood estimator (NPMLE) for exponential family mixture models. First, we develop a data-compression strategy that reduces the cost of repeated likelihood evaluations in NPMLE computation to polylogarithmic order in the sample size. Second, we show that, for a broad class of approximate NPMLEs, the resulting marginal density estimator attains an almost parametric rate of convergence.

2506.06542 2026-06-15 stat.ML cs.LG 版本更新

Direct Fisher Score Estimation for Likelihood Maximization

直接Fisher得分估计用于似然最大化

Sherman Khoo, Yakun Wang, Song Liu, Mark Beaumont

发表机构 * School of Mathematics, University of Bristol(布里斯托大学数学学院) School of Biological Sciences, University of Bristol(布里斯托大学生物科学学院)

AI总结 针对似然函数难解但模型模拟易得的问题,提出基于局部得分匹配的顺序梯度优化方法,直接建模Fisher得分,实现快速高效的似然最大化。

详情
AI中文摘要

我们研究当似然函数难以处理但模型模拟易于获得时的似然最大化问题。我们提出一种顺序的、基于梯度的优化方法,该方法基于局部得分匹配技术直接建模Fisher得分,该技术使用来自每个参数迭代周围局部区域的模拟。通过对代理得分模型采用线性参数化,我们的技术允许闭式最小二乘解。这种方法提供了一种快速、灵活且高效的Fisher得分近似,有效平滑了似然目标,并缓解了复杂似然景观带来的挑战。我们为得分估计器提供了理论保证,包括平滑引入的偏差界限。在一系列合成和真实世界问题上的实证结果表明,与现有基准相比,我们的方法具有优越的性能。

英文摘要

We study the problem of likelihood maximization when the likelihood function is intractable but model simulations are readily available. We propose a sequential, gradient-based optimization method that directly models the Fisher score based on a local score matching technique which uses simulations from a localized region around each parameter iterate. By employing a linear parameterization to the surrogate score model, our technique admits a closed-form, least-squares solution. This approach yields a fast, flexible, and efficient approximation to the Fisher score, effectively smoothing the likelihood objective and mitigating the challenges posed by complex likelihood landscapes. We provide theoretical guarantees for our score estimator, including bounds on the bias introduced by the smoothing. Empirical results on a range of synthetic and real-world problems demonstrate the superior performance of our method compared to existing benchmarks.

2210.03964 2026-06-15 stat.ME cs.CG 版本更新

An Efficient and Continuous Voronoi Density Estimator

一种高效且连续的Voronoi密度估计器

Giovanni Luca Marchetti, Vladislav Polianskii, Anastasiia Varava, Florian T. Pokorny, Danica Kragic

AI总结 提出基于Voronoi图的径向Voronoi密度估计器(RVDE),利用局部几何自适应性和线性时间复杂度,解决了传统VDE不连续和计算昂贵的问题,在高维数据上表现优于其他非参数密度估计方法。

Comments 13 pages

详情
AI中文摘要

我们引入了一种非参数密度估计器,称为径向Voronoi密度估计器(RVDE)。RVDE基于Voronoi剖分的几何结构,因此具有局部几何适应性和广泛的收敛性质。由于其径向定义,RVDE是连续的,并且计算复杂度与数据集大小呈线性关系。这弥补了先前研究的VDE的主要缺点,即高度不连续且计算成本高。我们对RVDE的众数(mode)进行了理论研究,并对其在高维数据上的性能进行了实证研究。结果表明,RVDE优于其他非参数密度估计器,包括最近引入的VDE。

英文摘要

We introduce a non-parametric density estimator called the Radial Voronoi Density Estimator (RVDE). RVDE is grounded in the geometry of Voronoi tessellations and as such benefits from local geometric adaptiveness and broad convergence properties. Due to its radial definition, RVDE is continuous and computable in linear time with respect to the dataset size. This remedies the main shortcomings of previously studied VDEs, which are highly discontinuous and computationally expensive. We provide a theoretical study of the modes of RVDE as well as an empirical investigation of its performance on high-dimensional data. Results show that RVDE outperforms other non-parametric density estimators, including recently introduced VDEs.

7. 机器学习统计基础 23 篇

2606.14679 2026-06-15 cs.LG cs.SY eess.SY math.OC stat.ML 新提交

Optimal Hidden-Target Learning for Online Inventory Optimization on General Convex Sets

一般凸集上在线库存优化的最优隐藏目标学习

Anthony Pineci, Yunzong Xu

发表机构 * UIUC(伊利诺伊大学厄巴纳-香槟分校)

AI总结 针对一般凸容量集上的在线库存优化问题,提出隐藏目标投影方法,将遗憾从逆概率依赖改进为平方根逆概率依赖,并证明匹配下界,同时首次给出强凸损失的 polylog 遗憾和动态遗憾保证。

详情
AI中文摘要

在线库存优化(OIO)是具有物理记忆的在线凸优化:库存结转使得可行动作集依赖于过去。一个自然的原则——在随机库存学习以及最近在单一线性容量约束下的OIO中使用——是维护一个由在线学习器选择的隐藏目标,并将其投影到当前可行的订货上限集上。我们证明,对于任意有界凸容量集上的OIO,这一简单原则是最优的。以在线梯度下降为基础学习器,该方法将一般凸集上OIO的最佳已知遗憾保证从对共同需求概率的逆依赖改进为平方根逆依赖,并且我们证明了匹配的下界。同样的原则为强凸损失提供了首个多对数遗憾保证,并为一般凸容量集上的欧几里得路径变化提供了首个动态遗憾保证。分析引入了一个范数对齐原则:正确的状态变量是隐藏目标到可行集的距离,以与投影相同的范数度量。在范数对齐下,该距离路径地演化为一个标量队列,目标移动作为到达,共同需求作为服务。这种简化为一维队列控制解决了状态依赖性,并将保证扩展到一般凸容量集,超出了先前乘积方法的范围。在合成和真实库存数据上的实验证实了该理论。

英文摘要

Online inventory optimization (OIO) is online convex optimization with physical memory: inventory carryover makes the feasible action set depend on the past. A natural principle, used in stochastic inventory learning and recently in OIO under a single linear capacity constraint, is to maintain a hidden target chosen by an online learner and implement its projection onto the currently feasible order-up-to set. We prove that this simple principle is optimal for OIO on arbitrary bounded convex capacity sets. With online gradient descent as the base learner, the method improves the best known regret guarantee for OIO on general convex sets from inverse to inverse-square-root dependence on the common-demand probability, and we prove a matching lower bound. The same principle gives the first polylogarithmic regret guarantee for strongly convex losses and the first dynamic regret guarantee adapting to Euclidean path variation on general convex capacity sets. The analysis introduces a norm alignment principle: the right state variable is the distance from the hidden target to the feasible set, measured in the same norm as the projection. Under norm alignment, this distance evolves pathwise as a scalar queue, with target movement as arrival and common demand as service. This reduction to one-dimensional queue control resolves the state dependence and extends the guarantees to general convex capacity sets, beyond the reach of prior productwise approaches. Experiments on synthetic and real-world inventory data corroborate the theory.
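
The hidden-target principle can be sketched in one dimension. This is an illustrative toy under explicit assumptions, not the paper's algorithm: the capacity set is taken to be the interval [0, capacity], the loss is a newsvendor loss with hypothetical holding cost `h` and backorder cost `b`, stock cannot be disposed of, and the gradient is evaluated at the implemented level. The paper's setting covers general convex capacity sets and may differ in each of these choices.

```python
def project(target, inventory, capacity):
    """Project the hidden target onto the feasible order-up-to interval [inventory, capacity]."""
    return min(max(target, inventory), capacity)

def run_hidden_target(demands, capacity, eta, h=1.0, b=2.0):
    """Online gradient descent on a hidden target; the implemented level is its projection."""
    target, inventory = 0.0, 0.0
    history = []
    for d in demands:
        y = project(target, inventory, capacity)  # order-up-to level actually implemented
        history.append((inventory, y))
        # Subgradient of the newsvendor loss h*max(y-d, 0) + b*max(d-y, 0) with respect to y.
        grad = h if y > d else -b
        # The learner moves the hidden target inside the capacity set, ignoring current stock.
        target = min(max(target - eta * grad, 0.0), capacity)
        # Leftover stock carries over and constrains the next feasible set.
        inventory = max(y - d, 0.0)
    return history

demands = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
history = run_hidden_target(demands, capacity=8.0, eta=0.5)
```

The learner never sees the carryover constraint directly: it updates an unconstrained-in-time target, and feasibility is restored only at implementation time by the projection.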

2606.14592 2026-06-15 stat.ML cs.LG stat.AP stat.ME 新提交

Cluster LOCO: Feature Importance For Interpreting Clusters

Cluster LOCO:用于解释聚类的特征重要性

Claire M. He, Genevera I. Allen

发表机构 * Department of Statistics, Columbia University(哥伦比亚大学统计学系)

AI总结 提出模型无关的聚类特征重要性方法Cluster LOCO,通过特征遮挡和泛化性度量,可靠识别驱动聚类结构的特征。

Comments 36 pages, 12 figures

详情
AI中文摘要

聚类广泛用于探索性分析和科学发现,推动从市场细分到生物数据分析的洞察,但随着现代数据集变得日益庞大和复杂,其输出可能难以解释、审计和重现。聚类的可靠使用需要理解哪些特征驱动了发现的结构,然而与监督学习方法相比,聚类在特征级解释方面仍然稀缺。此外,现有的聚类特征重要性分数通常与特定算法和数据假设相关。为了解决这些挑战,我们提出了Cluster LOCO(Leave-One-Covariate-Out),一个模型无关的聚类特征重要性分数族。Cluster LOCO基于特征遮挡和聚类泛化性,即在一个数据子集上学习的聚类标签能否在保留样本上被准确预测。对于任何选定的聚类算法,Cluster LOCO通过测量移除某个特征对泛化性的降低程度来量化该特征的重要性。我们首先介绍了基于数据分割的Cluster LOCO-Split,然后将其扩展到Cluster LOCO-MP,一种适用于大规模数据的minipatch集成版本。通过合成模拟和在单细胞转录组学中细胞类型发现的应用,我们展示了Cluster LOCO比现有的聚类特征重要性方法更可靠地恢复信息特征。

英文摘要

Clustering is widely used for exploratory analysis and scientific discovery, driving insights from market segmentation to biological data analysis, but its outputs can be difficult to interpret, audit, and reproduce as modern datasets become increasingly large and complex. Reliable use of clustering requires understanding which features drive the discovered structure, yet feature-level explanations for clustering remain scarce compared with methods in supervised learning. Furthermore, existing clustering feature importance scores are often tied to specific algorithms and data assumptions. To address these challenges, we propose Cluster LOCO (Leave-One-Covariate-Out), a family of model-agnostic feature importance scores for clustering. Cluster LOCO is built on feature occlusion and clustering generalizability, defined as whether cluster labels learned on one subset of the data can be accurately predicted on held-out samples. For any chosen clustering algorithm, Cluster LOCO quantifies a feature's importance by measuring how much its removal degrades generalizability. We first introduce Cluster LOCO-Split, which relies on data splitting, and then extend it to Cluster LOCO-MP, a minipatch ensemble-based version designed for large-scale data. Across synthetic simulations and an application to cell-type discovery in single-cell transcriptomics, we show that Cluster LOCO more reliably recovers informative features than existing clustering feature importance methods.

2606.14560 2026-06-15 math.OC cs.LG stat.ML 新提交

Free Heavy-Tailed Lunch for Muon: A Theoretical Justification of Empirical Success

Muon 的免费重尾午餐:实证成功的理论证明

Florian Hübler, Thomas Pethick, Suvrit Sra

发表机构 * Department of Computer Science, ETH Zurich, Switzerland(苏黎世联邦理工学院计算机科学系) Department of Mathematics, Technical University of Munich, Germany(慕尼黑工业大学数学系) Munich Center for Machine Learning (MCML)(慕尼黑机器学习中心)

AI总结 本文在重尾非凸优化中证明,Muon 等非欧几里得方法在核范数平稳性下达到最优样本复杂度,避免了欧几里得方法的维度依赖,并通过大语言模型实验验证。

详情
AI中文摘要

最近,具有矩阵值更新的非欧几里得优化方法(如 Muon 和 Scion)在训练 Transformer 模型方面显示出强大的实证性能,但其相对于欧几里得方法的理论优势仍知之甚少。我们在重尾非凸机制中解决了这一差距,其中随机梯度具有有界的 $p$ 阶中心矩,$p \in (1,2]$。我们表明,某些非欧几里得方法在更强的平稳性度量下实现了最优样本复杂度,而欧几里得方法则会产生额外的维度相关成本。因此,对于 $m \times n$ 矩阵,Muon 在核范数下找到一个 $\varepsilon$-平稳点所需的样本数为 $\mathcal{O}\left(\min\{m, n\} \frac{\Delta_1 L}{\varepsilon^2} \left(\frac \sigma \varepsilon \right)^{\frac p {p-1}}\right)$,吸收了重尾噪声而无需额外的维度依赖,这与欧几里得方法不同。我们进一步证明,对于所有一阶方法在核范数平稳性下,该样本复杂度(包括其维度依赖)是最优的。在大语言模型上的实验支持了我们的理论。令人惊讶的是,我们的结果表明,除了 Muon 的谱几何之外,其他 Schatten 几何在某些设置下也能具有竞争力。

英文摘要

Non-Euclidean optimisation methods with matrix-valued updates, such as Muon and Scion, have recently shown strong empirical performance for training Transformer models, yet their theoretical advantages over Euclidean methods remain poorly understood. We address this gap in the heavy-tailed non-convex regime, where stochastic gradients have bounded $p$-th central moments, $p \in (1,2]$. We show that certain non-Euclidean methods achieve optimal sample complexity under stronger stationarity measures, while Euclidean methods incur additional dimension-dependent costs. As a consequence, for $m \times n$ matrices, Muon finds an $\varepsilon$-stationary point in nuclear norm within $\mathcal{O}\left(\min\{m, n\} \frac{\Delta_1 L}{\varepsilon^2} \left(\frac{\sigma}{\varepsilon}\right)^{\frac{p}{p-1}}\right)$ samples, absorbing heavy-tailed noise without extra dimension dependence, unlike Euclidean methods. We further prove this sample complexity, including its dimension dependence, is optimal for all first-order methods under nuclear-norm stationarity. Experiments on large language models support our theory. Surprisingly, our results suggest that other Schatten geometries beyond the spectral geometry of Muon can perform competitively in certain settings.
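
To make the role of the moment exponent $p$ concrete, the stated bound can be specialised to two values of $p$. This is plain arithmetic on the formula quoted in the abstract, with $m$, $n$, $\Delta_1$, $L$, $\sigma$ and $\varepsilon$ as used there; no constants or further assumptions from the paper are implied.

```latex
\[
p = 2:\quad \frac{p}{p-1} = 2
\;\Longrightarrow\;
\mathcal{O}\!\left(\min\{m,n\}\,\frac{\Delta_1 L}{\varepsilon^{2}}
\left(\frac{\sigma}{\varepsilon}\right)^{2}\right)
= \mathcal{O}\!\left(\min\{m,n\}\,\frac{\Delta_1 L\,\sigma^{2}}{\varepsilon^{4}}\right),
\]
\[
p = \tfrac{3}{2}:\quad \frac{p}{p-1} = 3
\;\Longrightarrow\;
\mathcal{O}\!\left(\min\{m,n\}\,\frac{\Delta_1 L}{\varepsilon^{2}}
\left(\frac{\sigma}{\varepsilon}\right)^{3}\right)
= \mathcal{O}\!\left(\min\{m,n\}\,\frac{\Delta_1 L\,\sigma^{3}}{\varepsilon^{5}}\right).
\]
```

The bounded-variance case $p = 2$ gives an $\varepsilon^{-4}$ dependence, and the exponent $p/(p-1)$ grows without bound as $p \to 1$, so heavier tails make the noise term increasingly dominant.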

2606.14416 2026-06-15 cs.LG stat.ML 新提交

Federated Learning for Feature Generalization with Convex Constraints

基于凸约束的联邦学习特征泛化

Dongwon Kim, Donghee Kim, Sung Kuk Shyn, Kwangsu Kim

发表机构 * Dongwon Kim, Donghee Kim, Sung Kuk Shyn, Kwangsu Kim

AI总结 针对联邦学习中客户端数据异构导致的泛化问题,提出FedCONST方法,利用线性凸约束自适应调整更新幅度,平衡参数学习,并通过梯度信噪比分析验证其有效性,实现跨异构环境的强泛化。

Comments Accepted at the 42nd International Conference on Machine Learning (ICML 2025)

详情
AI中文摘要

联邦学习(FL)常因客户端数据异构而难以泛化。局部模型容易过拟合其局部数据分布,甚至可迁移特征在聚合过程中也可能被扭曲。为应对这些挑战,我们提出FedCONST,一种基于全局模型参数强度自适应调整更新幅度的方法。这可以防止过度强调已学好的参数,同时加强未充分发展的参数。具体而言,FedCONST采用线性凸约束来确保训练稳定性,并在聚合过程中保留局部学到的泛化能力。梯度信噪比(GSNR)分析进一步验证了FedCONST在增强特征可迁移性和鲁棒性方面的有效性。因此,FedCONST有效对齐了局部和全局目标,减轻了过拟合,促进了跨不同FL环境的更强泛化,达到了最先进的性能。

英文摘要

Federated learning (FL) often struggles with generalization due to heterogeneous client data. Local models are prone to overfitting their local data distributions, and even transferable features can be distorted during aggregation. To address these challenges, we propose FedCONST, an approach that adaptively modulates update magnitudes based on the parameter strength of the global model. This prevents over-emphasizing well-learned parameters while reinforcing underdeveloped ones. Specifically, FedCONST employs linear convex constraints to ensure training stability and preserve locally learned generalization capabilities during aggregation. A Gradient Signal to Noise Ratio (GSNR) analysis further validates the effectiveness of FedCONST in enhancing feature transferability and robustness. As a result, FedCONST effectively aligns local and global objectives, mitigating overfitting and promoting stronger generalization across diverse FL environments, achieving state-of-the-art performance.

2606.14390 2026-06-15 cond-mat.dis-nn stat.ML 新提交

Local Coverage Governs Memorization in Diffusion Models

局部覆盖支配扩散模型中的记忆化

Claudia Merger, Sebastian Goldt

AI总结 通过扩散模型与核密度估计的联系,发现记忆化由局部数据覆盖主导:低覆盖区域孤立样本被记忆,高覆盖区域支持插值泛化。

详情
AI中文摘要

扩散模型中的记忆化通常被视为模型或数据集的全局属性。然而在实践中,单个扩散模型可以同时生成记忆化和新颖的样本。哪些训练样本最有可能被记忆?在这项工作中,我们表明记忆化由局部数据覆盖支配。利用扩散模型与核密度估计(KDE)之间的联系,我们推导出一个理论准则,根据训练数据在其邻域内的密度和训练数据集的大小来预测一个点是否被记忆。在高维极限下,这导致一个尖锐的局部转变:低覆盖区域被孤立的训练样本主导,这些样本被记忆,而密集区域支持插值和泛化。我们通过实验验证了这些预测,表明记忆化随局部稀疏性增加,并且扩散模型在同一模型内表现出记忆化和新颖样本的共存。将该框架扩展到多类设置,我们进一步表明,具有更高类内稀疏性(因此更低局部覆盖)的类别被更强烈地记忆。我们的结果提供了扩散模型中记忆化的局部视角,从数据几何角度解释了记忆化何时何地发生。

英文摘要

Memorization in diffusion models is often treated as a global property of the model or dataset. In practice, however, a single diffusion model can simultaneously generate both memorized and novel samples. Which training samples are most likely to be memorized? In this work, we show that memorization is governed by local data coverage. Leveraging the connection between diffusion models and kernel density estimation (KDE), we derive a theoretical criterion that predicts whether a point is memorized based on the density of training data in its neighborhood and the size of the training dataset. In the high-dimensional limit, this leads to a sharp, local transition: regions of low coverage are dominated by isolated training samples, which are memorized, while dense regions support interpolation and generalization. We validate these predictions empirically, showing that memorization increases with local sparsity and that diffusion models exhibit a coexistence of memorized and novel samples within the same model. Extending this framework to multi-class settings, we further show that classes with higher intra-class sparsity (and thus lower local coverage) are more strongly memorized. Our results provide a local view of memorization in diffusion models, explaining when and where memorization occurs in terms of data geometry.
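
The link to kernel density estimation can be illustrated with the standard fact that, for an empirical training distribution corrupted by Gaussian noise, the ideal denoiser is a softmax-weighted average of the training points. The sketch below is an illustration of local coverage only, not the paper's criterion: the helper names, the toy dataset, and the noise level `sigma` are assumptions.

```python
import math

def posterior_weights(x, train, sigma):
    """Posterior over which training point produced the noisy point x (Gaussian noise, scale sigma)."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, t)) / (2 * sigma ** 2) for t in train]
    m = max(logits)  # subtract the maximum for numerical stability
    w = [math.exp(l - m) for l in logits]
    z = sum(w)
    return [v / z for v in w]

def denoise(x, train, sigma):
    """Posterior-mean denoiser: a weighted average of the training points."""
    w = posterior_weights(x, train, sigma)
    return [sum(wi * t[k] for wi, t in zip(w, train)) for k in range(len(x))]

# Toy data (hypothetical): a dense cluster near the origin plus one isolated point.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
sigma = 0.2

w_isolated = posterior_weights((4.9, 5.0), train, sigma)
w_dense = posterior_weights((0.05, 0.05), train, sigma)
pulled = denoise((4.9, 5.0), train, sigma)
```

Near the isolated point a single weight dominates, so the denoiser pulls the query onto that training sample; inside the dense cluster the weights are shared and the output is an average of neighbours, which is the qualitative contrast between memorization and interpolation described in the abstract.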

2606.14268 2026-06-15 stat.ML cs.LG 新提交

Gradient boosting for extremes: sampling theory and application to insurance

极值的梯度提升:抽样理论及其在保险中的应用

Stéphane Lhaut, Olivier Lopez

发表机构 * CREST, CNRS, Ecole polytechnique, Groupe ENSAE-ENSAI, ENSAE Paris, Institut Polytechnique de Paris, Palaiseau, France(CREST、法国国家科学研究中心、巴黎综合理工学院、ENSAE-ENSAI集团、巴黎ENSAE、巴黎理工学院,法国帕莱索)

AI总结 提出梯度提升估计广义帕累托分布的理论,通过正交重参数化改进收敛性,并在保险数据中验证了方法有效性。

Comments 36 pages, 10 figures

详情
AI中文摘要

我们为梯度提升在超阈值建模中估计协变量依赖的广义帕累托(GP)分布开发了统计学习理论。在对GP似然进行正交重参数化以对角化其Fisher信息矩阵后,我们将估计问题纳入经验风险最小化(ERM)框架,并推导了提升估计器的非渐近误差界。我们的分析考虑了过程中的三个不同误差来源:统计波动、GP模型渐近性质固有的近似偏差(在二阶正则变化下控制)以及与有限次提升迭代相关的近似误差,明确了由此产生的偏差-方差权衡。通过模拟,我们展示了重参数化的实际好处,表明它在训练过程中显著降低了梯度相关性并提高了收敛稳定性。该方法应用于德克萨斯州保险部的医疗事故保险数据集,包含超过18000个已结索赔。梯度提升方法对和解成本分布的尾部拟合良好,并揭示出和解天数是对尾部重尾性起主导作用的预测因子,这与准备金文献中的早期发现一致。

英文摘要

We develop a statistical learning theory for gradient boosting applied to the estimation of covariate-dependent Generalized Pareto (GP) distributions in the context of Peaks-over-Threshold modeling. After an orthogonal reparametrization of the GP likelihood that diagonalizes its Fisher information matrix, we cast the estimation problem within the Empirical Risk Minimization (ERM) framework and derive non-asymptotic error bounds for the boosting estimator. Our analysis accounts for three distinct sources of error in the process: statistical fluctuations, the approximation bias inherent to the asymptotic nature of the GP model (controlled under second-order regular variation), and the approximation error associated with the finite number of boosting iterates, making explicit the resulting bias-variance trade-off. We illustrate the practical benefits of the reparametrization through simulations, showing that it significantly reduces gradient correlation during training and improves convergence stability. The methodology is applied to a medical malpractice insurance dataset from the Texas Department of Insurance, comprising over 18,000 closed claims. The gradient boosting approach yields a good fit for the tail of settlement cost distributions and reveals that the number of days to settlement is the dominant predictor of tail heaviness, consistent with earlier findings in the reserving literature.

2606.14053 2026-06-15 stat.ML cs.LG 新提交

Hybrid Uncertainty Sensitivity Analysis Based on the HSIC for High-Dimensional Responses with Aleatory--Epistemic Separation

基于HSIC的混合不确定性灵敏度分析:面向具有偶然-认知分离的高维响应

Shijie Zhong, Jiangfeng Fu, Pengfei Wei

发表机构 * School of Power and Energy, Northwestern Polytechnical University(西北工业大学动力与能源学院)

AI总结 提出双空间张量积RKHS框架,通过分解核函数和双重Möbius反演,将全局依赖度量正交分解为纯偶然效应、纯认知效应及其交互贡献,实现高维响应下混合不确定性的灵敏度分析。

Comments 19 pages, 7 figures

详情
AI中文摘要

量化混合偶然和认知不确定性对高维系统响应的影响仍然是全局灵敏度分析(GSA)中的主要挑战。现有的基于希尔伯特-施密特独立性准则(HSIC)的方法主要局限于单输出设置,并且缺乏对异质不确定性来源及其相互作用的严格分解。为了解决这一局限性,提出了一种新颖的双空间张量积RKHS框架,用于混合不确定性下的灵敏度分析。通过在潜在输入空间和多维输出空间上构造因子化核,推导出并发双重Möbius反演,将全局依赖度量正交分解为纯偶然效应、纯认知效应及其交互贡献。得到的维度灵敏度指数保留了所有输出维度上的不确定性归因结构。为了满足分解所需的独立性假设,引入了基于逆概率积分变换的辅助变量表示,使得能够在统一的潜在空间中处理层次不确定性和Copula诱导的相关性。进一步开发了完全向量化的单循环实现,以避免嵌套蒙特卡洛模拟的计算负担。通过置换检验和Bootstrap置信区间量化统计显著性和估计不确定性。在改进的多输出Ishigami函数和空气动力学压力场问题上的数值研究证明了所提出框架的准确性、可扩展性和实际适用性。

英文摘要

Quantifying the influence of hybrid aleatory and epistemic uncertainties on high-dimensional system responses remains a major challenge in global sensitivity analysis (GSA). Existing Hilbert--Schmidt Independence Criterion (HSIC)-based approaches are primarily restricted to single-output settings and lack a rigorous decomposition of heterogeneous uncertainty sources and their interactions. To address this limitation, a novel double-space tensor-product RKHS framework is proposed for sensitivity analysis under hybrid uncertainty. By constructing factorized kernels over both the latent input space and the multidimensional output space, a concurrent double Möbius inversion is derived to orthogonally decompose the global dependence measure into pure aleatory effects, pure epistemic effects, and their interaction contributions. The resulting dimension-wise sensitivity indices preserve the uncertainty attribution structure across all output dimensions. To satisfy the independence assumptions required by the decomposition, an auxiliary-variable representation based on the inverse probability integral transform is introduced, enabling the treatment of hierarchical uncertainties and Copula-induced correlations within a unified latent space. A fully vectorized single-loop implementation is further developed to avoid the computational burden of nested Monte Carlo simulation. Statistical significance and estimation uncertainty are quantified through permutation testing and Bootstrap confidence intervals. Numerical studies on a modified multi-output Ishigami function and an aerodynamic pressure-field problem demonstrate the accuracy, scalability, and practical applicability of the proposed framework.

2606.14028 2026-06-15 stat.ML cs.LG 新提交

Anytime-Valid Confirmation of Label-Shift Corrections

标签偏移修正的任意有效确认

Seungjin Choi

发表机构 * Seungjin Choi

AI总结 针对标签稀缺时预指定偏移修正的确认问题,提出基于条件e值的任意有效序贯检验方法,利用似然比乘积构造非负鞅,将常规模型监测转化为正式检验。

Comments ICML 2026 Workshop on Hypothesis Testing

详情
AI中文摘要

在小型批次的科学部署中,即使未标记的目标输入可用,标记的目标结果也可能过于稀缺,无法进行可靠的偏移估计。我们解决了互补的设置,其中从业者根据领域知识预先指定了标签偏移修正,并询问传入的标记结果是否支持该修正。我们表明,经过标签偏移修正的预测与源预测之间的每个观测的似然比是一个条件e值,因此其运行乘积是一个非负鞅,Ville不等式产生一个任意有效的确认规则。对数鞅等于源预测与修正预测之间的累积负对数预测密度(NLPD)差距,将常规模型监测转化为正式的序贯检验。拒绝意味着传入数据支持相对于源预测的假定修正,但这不是对偏移程度的精确估计。对于具有高斯标签偏移比率的高斯过程源,存在封闭形式。高斯过程回归模拟验证了类型I控制、有限样本功效、校准敏感性以及基于标签重新估计的可靠先验的小批量优势。

英文摘要

In small-batch scientific deployments, labeled target outcomes may be too scarce for reliable shift estimation even when unlabeled target inputs are available. We address the complementary setting where the practitioner has a pre-specified label-shift correction from domain knowledge and asks whether incoming labeled outcomes support it. We show that the per-observation likelihood ratio between a label-shift-corrected predictive and the source predictive is a conditional e-value, so its running product is a nonnegative martingale and Ville's inequality yields an anytime-valid confirmation rule. The log martingale equals the cumulative negative log-predictive density (NLPD) gap between the source and the corrected predictive, converting routine model monitoring into a formal sequential test. Rejection means the incoming data support the posited correction relative to the source predictive, but it is not a precise estimate of the degree of shift. Closed forms are available for GP sources with Gaussian label-shift ratios. GP regression simulations validate Type I control, finite-sample power, miscalibration sensitivity, and the small-batch advantage of a reliable prior over label-based re-estimation.
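
The confirmation rule described above can be sketched as a running log-likelihood ratio compared against the Ville threshold $\log(1/\alpha)$. This is a minimal illustration under assumptions: both predictives are taken to be Gaussian with per-observation means and standard deviations fixed before the outcome is seen, and the function names and toy numbers are hypothetical.

```python
import math

def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) at y."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (y - mean) ** 2 / (2 * sd * sd)

def confirm(ys, source, corrected, alpha=0.05):
    """Sequential test of the source predictive against the corrected predictive.

    `source` and `corrected` are lists of (mean, sd) pairs, one per observation.
    Returns (stopping_time, log_martingale); stopping_time is None if never rejected.
    """
    threshold = math.log(1.0 / alpha)  # Ville's inequality: reject once M_t >= 1/alpha
    log_m = 0.0
    for t, (y, (m0, s0), (m1, s1)) in enumerate(zip(ys, source, corrected), start=1):
        # Increment = NLPD of the source minus NLPD of the corrected predictive.
        log_m += normal_logpdf(y, m1, s1) - normal_logpdf(y, m0, s0)
        if log_m >= threshold:
            return t, log_m
    return None, log_m

source = [(0.0, 1.0)] * 10      # source predictive: N(0, 1) for every observation
corrected = [(1.0, 1.0)] * 10   # posited correction: mean shifted to 1

stop_shifted, log_m_shifted = confirm([1.0] * 10, source, corrected)
stop_null, log_m_null = confirm([0.0] * 10, source, corrected)
```

With outcomes at the corrected mean, each observation adds 0.5 to the log martingale, so the threshold $\log 20 \approx 3.0$ is crossed at the sixth observation. With outcomes at the source mean the log martingale decreases and the rule never rejects. Because the guarantee holds at every stopping time, the monitoring can be checked after each new label without adjusting $\alpha$.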

2606.14023 2026-06-15 stat.ML cs.LG stat.ME 新提交

Geometric Domain Adaptation via Optimal Transport for Linear Regression in R^2

R^2中线性回归的几何域自适应:基于最优传输

Brian Britos, Mathias Bourel

发表机构 * University of the People

AI总结 针对源域与目标域存在旋转、平移或缩放变换的线性回归问题,提出结合K-means与最优传输的方法估计变换,实现目标数据稀缺时的模型自适应,理论证明p≥2时最优传输恢复变换。

详情
AI中文摘要

最优传输最近通过对齐源分布和目标分布,成为域自适应的一种强大方法。我们研究了一个监督域自适应问题,其中源域和目标域在$\mathbb{R}^2$中通过旋转、平移或缩放相关联。我们证明,当使用$p \geq 2$的$p$-范数成本时,最优传输映射能够恢复底层映射。基于这一见解,我们开发了一种结合$K$-means和最优传输的方法来估计底层映射,从而在目标数据稀缺时实现线性回归模型的自适应。模拟表明,与基线方法相比,性能有所提升。我们不依赖高表达力的深度学习架构,而是专注于经典机器学习模型,以强调可解释性和理论洞察。这一视角使我们能够明确刻画最优传输在恢复旋转、平移和缩放等几何变换中的作用。我们的贡献包括一个将最优传输与$\mathbb{R}^2$中的旋转、平移和缩放联系起来的理论结果,以及一种用于线性回归自适应的实用方法,在该空间的域自适应任务中既提供概念清晰性又具有应用价值。

英文摘要

Optimal Transport has recently become a powerful method for domain adaptation by aligning source and target distributions. We study a supervised domain adaptation problem where source and target domains are related by a rotation, a translation, or a homothety in $\mathbb{R}^2$. We prove that the optimal transport map recovers the underlying map when using a $p$-norm cost with $p \geq 2$. Based on this insight, we develop a method combining $K$-means and optimal transport to estimate the underlying map, enabling adaptation of linear regression models when target data is scarce. Simulations demonstrate improved performance over baseline methods. Rather than relying on highly expressive deep learning architectures, we focus on classical machine learning models to emphasize interpretability and theoretical insight. This perspective allows us to explicitly characterize the role of optimal transport in recovering geometric transformations such as rotations, translations, and homotheties. Our contributions include a theoretical result linking optimal transport and rotations, translations, and homotheties in $\mathbb{R}^2$, and a practical method for adaptation in linear regression offering both conceptual clarity and applied value in domain adaptation tasks in this space.
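
A small discrete check of the recovery property, restricted to a case that is easy to verify by hand: a homothety with positive ratio combined with a translation, under squared Euclidean cost. This is a sketch under assumptions, not the paper's method: the brute-force search over permutations, the point set, and the parameters `scale` and `shift` are illustrative, and the rotation case treated in the paper is not covered here. For two uniform point clouds of equal size, an optimal transport plan can always be taken to be a permutation, which is why a search over matchings suffices.

```python
import itertools
import math

def ot_matching(src, dst):
    """Brute-force optimal matching between two equal-size 2-D point sets (squared Euclidean cost)."""
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(src))):
        cost = sum((src[i][0] - dst[j][0]) ** 2 + (src[i][1] - dst[j][1]) ** 2
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

# Hypothetical source points and a target obtained by scaling and shifting them.
src = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.5), (2.0, 1.0), (1.2, 2.2), (-0.5, 0.8)]
scale, shift = 1.5, (3.0, -1.0)
dst = [(scale * x + shift[0], scale * y + shift[1]) for x, y in src]

perm, cost = ot_matching(src, dst)
```

The optimal matching pairs each source point with its own image, so the transport map coincides with the underlying transformation on the sample. The factorial search is only viable for a handful of points; a linear-programming or assignment solver would be needed at realistic sizes.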

2606.13984 2026-06-15 stat.ML cs.LG stat.ME 新提交

A General Framework for Decision Trees via Bregman Divergences

基于Bregman散度的决策树通用框架

Mathias Bourel

发表机构 * IESTA, Facultad de Ciencias Económicas y de Administración, Universidad de la República, Uruguay(乌拉圭共和国大学经济与管理学院,IESTA) IRL-2030, Instituto Franco-Uruguayo de Matemática e Interacciones (IFUMI)(法乌数学与互动研究所(IFUMI))

AI总结 提出基于Bregman散度的CART推广框架,统一多种损失函数和分裂准则,并研究生成凸函数的强凸性与光滑性对杂质增益、估计器稳定性和一致性的影响。

详情
AI中文摘要

决策树因其可解释性、灵活性以及适应非线性结构的能力,成为统计学习中的基本工具之一。其中,由Breiman、Friedman、Olshen和Stone于1984年引入的分类与回归树(CART)成为最具影响力的算法之一,至今仍是分类和回归问题中最广泛使用的方法之一。另一方面,由Lev Bregman于1967年在凸优化背景下引入的Bregman散度,提供了广泛的一类损失函数,自然地推广了平方欧氏距离。该族包括Kullback-Leibler散度、Poisson散度和Itakura-Saito散度,以及与指数族分布相关的若干损失函数。此外,Bregman散度具有丰富的几何结构,并与凸分析和信息几何有深刻联系。本文提出基于Bregman散度的CART范式推广,从而获得适应不同统计模型和底层几何结构的更广泛的决策树族。尽管CART或经典实现(如rpart)等算法包含了不同的杂质准则,但这些准则通常针对每个特定模型以临时方式引入。相比之下,Bregman散度方法提供了一个统一的框架,使得这些准则可以从共同的凸和几何原理中推导和解释。除了算法构建,我们还研究了这些树的理论性质。特别地,我们研究了生成凸函数的性质(如强凸性或光滑性)如何影响父节点与子节点之间的杂质增益,以及估计器的稳定性和一致性。

英文摘要

Decision trees are one of the fundamental tools in statistical learning due to their interpretability, flexibility, and their ability to adapt to nonlinear structures. Among them, the Classification and Regression Trees, introduced by Breiman, Friedman, Olshen, and Stone in 1984, became one of the most influential algorithms and remains one of the most widely used methods for classification and regression problems. On the other hand, Bregman divergences, introduced by Lev Bregman in 1967 in the context of convex optimization, provide a broad family of loss functions that naturally generalize the squared Euclidean distance. This family includes, among others, the Kullback-Leibler divergence, the Poisson divergence, and the Itakura-Saito divergence, as well as several losses associated with distributions belonging to the exponential family. Moreover, Bregman divergences possess a rich geometric structure and deep connections with convex analysis and information geometry. In this work, we propose a generalization of the CART paradigm based on Bregman divergences, thereby obtaining a broader family of decision trees adapted to different statistical models and underlying geometries. Although algorithms such as CART or classical implementations such as rpart incorporate different impurity criteria, these are usually introduced in an ad hoc manner for each specific model. In contrast, the Bregman divergence approach provides a unified framework that allows these criteria to be derived and interpreted from common convex and geometric principles. Beyond the algorithmic construction, we also investigate theoretical properties of these trees. In particular, we study how properties of the generating convex function -- such as strong convexity or smoothness -- influence impurity gains between parent and child nodes, as well as stability and consistency properties of the estimator.
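
To make the idea of a Bregman-based impurity concrete, the sketch below scores a node by the average Bregman divergence between its responses and their mean, and searches one feature for the split with the largest impurity gain. It relies on the standard fact that the arithmetic mean minimizes the average Bregman divergence to a single representative. The function names, the toy data, and the exhaustive one-dimensional split search are assumptions for illustration, not the paper's algorithm.

```python
import math


def bregman(phi, dphi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - phi'(y) * (x - y)."""
    return phi(x) - phi(y) - dphi(y) * (x - y)


def impurity(phi, dphi, ys):
    """Average divergence from each response to the node mean."""
    mu = sum(ys) / len(ys)
    return sum(bregman(phi, dphi, y, mu) for y in ys) / len(ys)


def best_split(phi, dphi, xs, ys):
    """Return (threshold, gain) maximizing parent impurity minus weighted child impurity."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    parent = impurity(phi, dphi, ys)
    best_threshold, best_gain = None, 0.0
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold separates equal feature values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        children = (len(left) * impurity(phi, dphi, left)
                    + len(right) * impurity(phi, dphi, right)) / n
        gain = parent - children
        if gain > best_gain:
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_gain = gain
    return best_threshold, best_gain


# Generators (phi, phi'): squared loss and the Poisson divergence (positive responses only).
squared = (lambda x: x * x, lambda x: 2 * x)
poisson = (lambda x: x * math.log(x) - x, math.log)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 1.0, 5.0, 5.0]
split_sq = best_split(*squared, xs, ys)
split_po = best_split(*poisson, xs, ys)
```

With the squared generator the impurity is the within-node variance, so this reduces to the usual CART regression criterion. Swapping the generator changes the loss without changing the search loop.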

2606.13982 2026-06-15 stat.ML cs.LG 新提交

Adaptive Nucleus Truncation for Long-Form Reasoning

自适应核截断用于长形式推理

Ousmane Amadou Dia

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出自适应核截断采样(ANTS),通过熵条件控制器动态调整截断宽度,在长文本生成中提升推理性能,在33B参数稀疏MoE模型上平均提升1.9-5.2分。

详情
AI中文摘要

采样在长形式语言模型推理中扮演重要角色。在数千个解码步骤中,候选token集合的微小变化可能累积成不同的推理轨迹、稳定性特征和最终答案。现有的截断方法如top-$p$、min-$p$和固定top-$n\sigma$采样改进了无限制采样,但它们依赖固定阈值,无法适应熵、任务难度、训练阶段或生成预算的变化。我们引入自适应核截断采样(ANTS),将top-$n\sigma$采样从固定解码规则扩展为长形式生成的自适应展开控制机制。ANTS在温度缩放前选择最大logit周围的标准化邻域,使用熵条件控制器自适应调整截断宽度,并保留一个无截断回退臂以在截断不安全时稳定训练。在33B总参数/4B活跃参数的稀疏混合专家推理模型上,ANTS在8K、16K和32K生成预算下,使以百分比计分的基准上的平均性能分别提升1.9、3.8和5.2分。最大提升出现在指令遵循和数学推理上,其中IFBench在32K时提升超过10分,AIME 2025提升7分。代码生成揭示了重要的预算交互:在Codeforces上,ANTS在8K时落后于基线,但在16K和32K时逆转差距并显著提升ELO。这些结果表明,采样器设计不应仅被视为解码超参数,而应被视为稳定和扩展长预算推理的一部分。

英文摘要

Sampling plays an important role in long-form language-model reasoning. Over thousands of decoding steps, small changes in the candidate token set can compound into different reasoning trajectories, stability profiles, and final answers. Existing truncation methods such as top-$p$, min-$p$, and fixed top-$n\sigma$ sampling improve over unrestricted sampling, but they rely on fixed thresholds that cannot adapt to changes in entropy, task difficulty, training stage, or generation budget. We introduce Adaptive Nucleus Truncation Sampling (ANTS), which extends top-$n\sigma$ sampling from a fixed decoding rule into an adaptive rollout-control mechanism for long-form generation. ANTS selects standardized neighborhoods around the maximum logit before temperature scaling, adapts the truncation width using an entropy-conditioned controller, and retains a no-truncation fallback arm to stabilize training when truncation becomes unsafe. On a 33B-total / 4B-active sparse Mixture-of-Experts reasoning model, ANTS improves average performance over percentage-based benchmarks by +1.9, +3.8, and +5.2 points at 8K, 16K, and 32K generation budgets, respectively. The strongest gains appear on instruction following and mathematical reasoning, with IFBench improving by more than 10 points at 32K and AIME 2025 improving by 7 points. Code generation reveals an important budget interaction. On Codeforces, ANTS trails the baseline at 8K, but reverses this gap and substantially improves ELO at 16K and 32K. These results suggest that sampler design should be treated not just as a decoding hyperparameter, but as part of how we stabilize and scale long-budget reasoning.
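
The following sketch shows what a top-$n\sigma$ truncation with an entropy-dependent width might look like. The truncation rule (keep tokens whose logit lies within $n$ standard deviations of the maximum logit, applied before temperature scaling) follows the description above. The linear controller in `adaptive_width`, its bounds `n_min` and `n_max`, and the direction of the dependence on entropy are purely hypothetical stand-ins, since the abstract does not specify the controller, and the no-truncation fallback arm is omitted.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; logits equal to -inf receive probability 0."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(logits):
    """Shannon entropy of softmax(logits), divided by log(vocabulary size)."""
    probs = softmax(logits)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(logits))


def top_n_sigma_mask(logits, n):
    """Keep tokens whose logit is within n standard deviations of the maximum."""
    mean = sum(logits) / len(logits)
    sigma = math.sqrt(sum((l - mean) ** 2 for l in logits) / len(logits))
    threshold = max(logits) - n * sigma
    return [l >= threshold for l in logits]


def adaptive_width(logits, n_min=0.5, n_max=2.0):
    """Hypothetical controller: interpolate the width linearly in normalized entropy."""
    return n_min + (n_max - n_min) * normalized_entropy(logits)


def truncated_probs(logits, temperature=1.0):
    """Select the candidate set on raw logits, then apply temperature and renormalize."""
    mask = top_n_sigma_mask(logits, adaptive_width(logits))
    kept = [l if keep else -math.inf for l, keep in zip(logits, mask)]
    return softmax(kept, temperature)


probs = truncated_probs([10.0, 9.0, 1.0, 0.0, 0.0], temperature=0.7)
```

Because the mask is computed on the raw logits, changing the temperature reshapes the probabilities inside the candidate set without changing which tokens are in it.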

2606.13796 2026-06-15 stat.ML cs.LG 新提交

Recursively Trained Diffusion Models: Limiting Collapse Distribution and Spectral Characterization

递归训练的扩散模型:极限崩溃分布与谱刻画

Naïl B. Khelifa, Richard E. Turner, Ramji Venkataramanan

发表机构 * University of Cambridge(剑桥大学)

AI总结 研究递归训练扩散模型时的分布崩溃问题,证明即使完美学习也会因早期停止导致漂移,并收敛到唯一极限分布,该分布具有低通滤波谱特性。

详情
AI中文摘要

生成模型在其自身输出上的递归训练可能导致模型崩溃,即不断累积地偏离真实数据分布。现有的理论工作给出了扩散模型背景下有限轮误差累积的界,但有两个问题仍然悬而未决:递归收敛到何种分布,以及收敛速度如何?我们回答了这两个问题,并分离出一种不同于不完美学习的机制:即使具有完美的分数估计和精确采样,反向扩散的早期停止(出于数值稳定性需要)也会驱动逐渐偏离数据分布。我们证明该递归几何收敛到唯一的极限分布,该分布具有闭式表征,即数据分布的无限混合,其中每个分量是数据分布的高斯平滑版本,且平滑程度递增。该极限的Hermite谱分解表明,递归训练充当低通滤波器:编码精细非高斯结构的高阶模式比粗模式衰减得更强。这种谱图景启发了一种退火截断调度,该调度在再训练轮次中逐步缩小截断时间;我们证明任何收敛到0的调度都能渐近消除递归复合。最后,我们展示了理想化表征的鲁棒性:在存在离散化和分数估计误差的情况下,学习到的分布保持在理想极限周围的Wasserstein-2球内,且具有模式依赖的收缩率,高阶误差比低阶误差收缩更快。我们在合成高斯混合和CIFAR-10上验证了该理论。

英文摘要

Recursive training of generative models on their own outputs can lead to model collapse, a compounding drift away from the true data distribution. Existing theoretical works bound finite-round error accumulation in the context of diffusion models, but two questions remain open: what distribution does the recursion converge to, and how fast? We answer both, isolating a mechanism distinct from imperfect learning: even with perfect score estimation and exact sampling, the early stopping of the reverse diffusion (required for numerical stability) drives a progressive drift away from the data distribution. We prove that this recursion converges geometrically to a unique limiting distribution, which admits a closed-form characterization as an infinite mixture of increasingly Gaussian-smoothed versions of the data distribution. A Hermite spectral decomposition of this limit reveals that recursive training acts as a low-pass filter: higher-order modes, which encode fine non-Gaussian structure, are attenuated much more strongly than coarse modes. This spectral picture motivates annealed truncation schedules that progressively shrink truncation times across retraining rounds; we prove that any schedule converging to $0$ asymptotically eliminates recursive compounding. Finally, we show our idealized characterization is robust: in the presence of discretization and score estimation errors, the learned distribution remains in a Wasserstein-2 ball around the ideal limit, with mode-dependent contraction rates that contract high-order errors faster than low-order ones. We validate the theory on synthetic Gaussian mixtures and CIFAR-10.

2606.13709 2026-06-15 stat.ML cs.LG 新提交

LoMC: Localized Multidirectional Correction for Refusal Suppression in Routed Foundation Models

LoMC: 路由基础模型中拒绝抑制的局部多方向校正

Yan Hong, Kedong Xiu, Wei Li, Jun Lan, Huijia Zhu, Shuheng Zhou, Zhongcai Lyu, Weiqiang Wang, Jianfu Zhang

发表机构 * Ant Group(蚂蚁集团) Zhejiang University(浙江大学) Shanghai Jiao Tong University(上海交通大学)

AI总结 提出LoMC方法,通过支持门控干预框架在路由MoE和混合MoE模型中实现紧凑的拒绝抑制,提升非拒绝目标响应行为并保持通用能力。

详情
AI中文摘要

我们研究了路由MoE和混合MoE基础模型中的受控后训练拒绝抑制,旨在增加非拒绝目标响应行为,同时在紧凑的干预足迹下保持通用能力。现有的基于广泛方向的编辑可能会扰动通用计算,而仅支持专家编辑通常缺乏足够的容量来纠正异质拒绝表示。为了解决这一限制,我们引入了局部多方向校正(LoMC),一种支持门控干预框架,遵循支持-然后-校正的执行顺序:它首先识别紧凑的编辑支持,然后将原型校正方向聚合成逐层校正方向,最后仅在选定的支持内应用秩一逐层校正。通过使用编辑支持作为结构门控约束,LoMC在不扩大干预范围的情况下增加了校正容量。在四个路由骨干上的纯文本和多模态安全基准实验表明,LoMC在紧凑干预足迹下显著改善了非拒绝目标响应行为,同时保持了通用能力。

英文摘要

We study controlled post-training refusal suppression in routed MoE and hybrid-MoE foundation models, aiming to increase non-refusal target-response behavior while preserving general capability under a compact intervention footprint. Existing broad direction-based edits can perturb general-purpose computation, whereas support-only expert edits often lack sufficient capacity to correct heterogeneous refusal representations. To address this limitation, we introduce Localized Multidirectional Correction (LoMC), a support-gated intervention framework that follows a support-then-correction execution order: it first identifies a compact edit support, then aggregates prototype correction directions into layer-wise correction directions, and finally applies rank-one layer-wise correction only within the selected support. By using the edit support as a structural gating constraint, LoMC increases correction capacity without expanding the intervention scope. Experiments on text-only and multimodal safety benchmarks across four routed backbones show that LoMC substantially improves non-refusal target-response behavior while maintaining general capability under a compact intervention footprint.

2604.18419 2026-06-15 cs.LG cs.CL stat.ML 版本更新

Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning

知道何时退出:LLM推理中动态弃权的原则性框架

Hen Davidov, Nachshon Cohen, Oren Kalinsky, Yaron Fairstein, Guy Kushilevitz, Ram Yazdi, Patrick Rebeschini

发表机构 * Hebrew University of Jerusalem(耶路撒冷希伯来大学)

AI总结 本文提出一个基于正则化强化学习框架的动态弃权原则,通过价值函数与弃权奖励的比较来决定是否提前终止推理,在数学推理和毒性避免任务上优于现有方法。

详情
Journal ref
Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026. Copyright 2026 by the author(s)
AI中文摘要

利用思维链推理的大型语言模型常常因产生冗长且错误的响应而浪费大量计算资源。弃权可以通过抑制可能不正确的输出来缓解这一问题。虽然大多数弃权方法在生成之前或之后决定是否保留输出,但动态的生成中弃权考虑在每个token位置提前终止无前途的推理轨迹。先前的工作探索了这一想法的经验变体,但缺乏对弃权规则的原则性指导。我们提出了LLM动态弃权的形式化分析,将弃权建模为正则化强化学习框架中的一个显式动作。弃权奖励参数控制计算与信息之间的权衡。我们证明,在一般条件下,当价值函数低于该奖励时弃权严格优于自然基线。我们进一步推导了一种原则性且高效的方法来近似价值函数。在数学推理和毒性避免任务上的实证结果支持我们的理论,并展示了相比现有方法改进的选择性准确性。

英文摘要

LLMs utilizing chain-of-thought reasoning often waste substantial compute by producing long, incorrect responses. Abstention can mitigate this by withholding outputs unlikely to be correct. While most abstention methods decide to withhold outputs before or after generation, dynamic mid-generation abstention considers early termination of unpromising reasoning traces at each token position. Prior work has explored empirical variants of this idea, but principled guidance for the abstention rule remains lacking. We present a formal analysis of dynamic abstention for LLMs, modeling abstention as an explicit action within a regularized reinforcement learning framework. An abstention reward parameter controls the trade-off between compute and information. We show that abstaining when the value function falls below this reward strictly outperforms natural baselines under general conditions. We further derive a principled and efficient method to approximate the value function. Empirical results on mathematical reasoning and toxicity avoidance tasks support our theory and demonstrate improved selective accuracy over existing methods.
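
The decision rule described above (abstain as soon as the value function falls below the abstention reward) can be sketched in a few lines. The per-position value estimates are taken as given here. How to approximate them is the paper's contribution and is not reproduced, so the input list and the function name are hypothetical.

```python
def run_with_abstention(value_estimates, abstention_reward):
    """Stop at the first token position whose estimated value is below the reward.

    value_estimates[t] is an (assumed) estimate of the value function after t tokens.
    Returns the decision and the number of tokens generated before stopping.
    """
    for position, value in enumerate(value_estimates):
        if value < abstention_reward:
            return "abstain", position
    return "answer", len(value_estimates)


# A trace whose estimated value decays is cut off early; a promising one runs to completion.
decaying = run_with_abstention([0.8, 0.6, 0.3, 0.2], abstention_reward=0.4)
promising = run_with_abstention([0.8, 0.9], abstention_reward=0.4)
```

Raising `abstention_reward` makes the rule stop earlier and more often, which saves compute at the price of answering fewer questions.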

2605.11558 2026-06-15 cs.LG stat.ML 版本更新

A Composite Activation Function for Learning Stable Binary Representations

一种用于学习稳定二进制表示的复合激活函数

Seokhun Park, Choeun Kim, Kwanho Lee, Sehyun Park, Insung Kong, Yongdai Kim

发表机构 * Department of Statistics(统计学系) Seoul National University(首尔国立大学) Department of Applied Mathematics(应用数学系) University of Twente(特文特大学)

AI总结 本文提出HTAF复合激活函数,通过平滑近似Heaviside函数实现稳定训练,适用于Spiking神经网络等模型,并引入ICBMs模型实现可解释的图像处理。

Comments 32 pages

详情
AI中文摘要

激活函数在神经网络中通过塑造内部表示起核心作用。最近,学习二进制激活表示因其在计算和内存效率以及可解释性方面的优势而受到广泛关注。然而,使用Heaviside激活函数训练神经网络仍具挑战性,因其非可导性阻碍了标准梯度优化。本文提出Heavy Tailed Activation Function (HTAF),一种Heaviside函数的平滑近似,使基于梯度的优化能够稳定训练。我们构造HTAF为sigmoid双曲正切复合函数,并理论证明其在零输入附近保持大梯度质量,同时在尾部区域表现出更慢的梯度衰减。我们展示Spiking神经网络、二进制神经网络和深度Heaviside神经网络可以使用HTAF稳定训练。最后,我们引入隐式概念瓶颈模型(ICBMs),一种利用HTAF诱导离散特征表示的可解释图像模型。在各种架构和图像数据集上的广泛实验表明,ICBMs能够稳定地实现离散化,同时预测性能与标准模型相当或更好。

英文摘要

Activation functions play a central role in neural networks by shaping internal representations. Recently, learning binary activation representations has attracted significant attention due to their advantages in computational and memory efficiency, as well as interpretability. However, training neural networks with Heaviside activations remains challenging, as their non-differentiability obstructs standard gradient-based optimization. In this paper, we propose the Heavy Tailed Activation Function (HTAF), a smooth approximation to the Heaviside function that enables stable training with gradient-based optimization. We construct HTAF as a sigmoid hyperbolic tangent composite function and theoretically show that it maintains a large gradient mass around zero inputs while exhibiting slower gradient decay in the tail regions. We show that Spiking Neural Networks, Binary Neural Networks, and Deep Heaviside Neural Networks can be trained stably using HTAF with gradient-based optimization. Finally, we introduce Implicit Concept Bottleneck Models (ICBMs), interpretable image models that leverage HTAF to induce discrete feature representations. Extensive experiments across various architectures and image datasets demonstrate that ICBMs enable stable discretization while achieving prediction performance comparable to or better than standard models.

2506.14202 2026-06-15 cs.LG cs.AI stat.ML 版本更新

DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation

DiffusionBlocks: 通过扩散解释进行分块神经网络训练

Makoto Shing, Masanori Koyama, Takuya Akiba

发表机构 * Sakana AI The University of Tokyo(东京大学)

AI总结 提出DiffusionBlocks框架,利用残差连接与动力系统的对应关系,将网络转换为去噪过程,通过分数匹配目标实现独立分块训练,在多种Transformer架构上达到与端到端训练相当的性能,同时降低内存需求。

Comments To appear at the 14th International Conference on Learning Representations (ICLR 2026). v4: Fixed typos in experimental details (Appendix E.4)

详情
AI中文摘要

端到端反向传播需要存储所有层的激活值,造成内存瓶颈,限制了模型的可扩展性。现有的分块训练方法提供了缓解该问题的途径,但它们依赖于特设的局部目标,并且在分类任务之外尚未得到充分探索。我们提出DiffusionBlocks,一个将基于Transformer的网络转化为真正独立可训练块的原则性框架,这些块能保持与端到端训练相竞争的性能。我们的关键洞察在于利用残差连接自然对应于动力系统中的更新这一事实。通过对该系统进行最小修改,我们可以将这些更新转换为去噪过程的更新,其中每个块可以通过利用分数匹配目标独立学习。这种独立性使得每次只需为一个块计算梯度进行训练,从而将内存需求按块数量成比例降低。我们在多种Transformer架构(视觉、扩散、自回归、递归深度和掩码扩散)上的实验表明,DiffusionBlocks训练与端到端训练性能匹配,同时能够在实际任务(超越小规模分类)上实现可扩展的分块训练。DiffusionBlocks提供了一种理论上有依据的方法,能够成功扩展到跨多种架构的现代生成任务。代码可在 https://github.com/SakanaAI/DiffusionBlocks 获取。

英文摘要

End-to-end backpropagation requires storing activations throughout all layers, creating memory bottlenecks that limit model scalability. Existing block-wise training methods offer means to alleviate this problem, but they rely on ad-hoc local objectives and remain largely unexplored beyond classification tasks. We propose DiffusionBlocks, a principled framework for transforming transformer-based networks into genuinely independent trainable blocks that maintain competitive performance with end-to-end training. Our key insight leverages the fact that residual connections naturally correspond to updates in a dynamical system. With minimal modifications to this system, we can convert the updates to those of a denoising process, where each block can be learned independently by leveraging the score matching objective. This independence enables training with gradients for only one block at a time, thereby reducing memory requirements in proportion to the number of blocks. Our experiments on a range of transformer architectures (vision, diffusion, autoregressive, recurrent-depth, and masked diffusion) demonstrate that DiffusionBlocks training matches the performance of end-to-end training while enabling scalable block-wise training on practical tasks beyond small-scale classification. DiffusionBlocks provides a theoretically grounded approach that successfully scales to modern generative tasks across diverse architectures. Code is available at https://github.com/SakanaAI/DiffusionBlocks.

2511.19656 2026-06-15 cs.LG math.OC stat.ML 版本更新

Lower Complexity Bounds for Nonconvex-Strongly-Convex Bilevel Optimization with First-Order Oracles

一阶Oracle下非凸-强凸双层优化的复杂度下界

Kaiyi Ji

AI总结 针对光滑非凸-强凸双层优化,在确定性和随机一阶Oracle模型下,分别证明了$\Omega(\kappa^{3/2}\epsilon^{-2})$和$\Omega(\kappa^{5/2}\epsilon^{-4})$的下界,改进了单层非凸优化和极小极大问题的已知最优下界。

Comments Accepted by ICML 2026

详情
AI中文摘要

尽管双层优化的上界保证已被广泛研究,但由于双层结构的复杂性,下界方面的进展有限。本文关注光滑非凸-强凸设定,并开发了新的困难实例,在确定性和随机一阶Oracle模型下得到了非平凡的下界。在确定性情形下,我们证明任何一阶零尊重算法至少需要$\Omega(\kappa^{3/2}\epsilon^{-2})$次Oracle调用才能找到$\epsilon$-精确的稳定点,改进了单层非凸优化和非凸-强凸极小极大问题已知的最优下界。在随机情形下,我们证明至少需要$\Omega(\kappa^{5/2}\epsilon^{-4})$次随机Oracle调用,同样强化了相关设定中的已知最优下界。我们的结果揭示了当前双层优化上下界之间的显著差距,并表明即使在简化设定(如二次下层目标)下,仍需进一步研究以理解标准一阶Oracle下双层优化的最优复杂度。

英文摘要

Although upper bound guarantees for bilevel optimization have been widely studied, progress on lower bounds has been limited due to the complexity of the bilevel structure. In this work, we focus on the smooth nonconvex-strongly-convex setting and develop new hard instances that yield nontrivial lower bounds under deterministic and stochastic first-order oracle models. In the deterministic case, we prove that any first-order zero-respecting algorithm requires at least $\Omega(\kappa^{3/2}\epsilon^{-2})$ oracle calls to find an $\epsilon$-accurate stationary point, improving the optimal lower bounds known for single-level nonconvex optimization and for nonconvex-strongly-convex min-max problems. In the stochastic case, we show that at least $\Omega(\kappa^{5/2}\epsilon^{-4})$ stochastic oracle calls are necessary, again strengthening the best known bounds in related settings. Our results expose substantial gaps between current upper and lower bounds for bilevel optimization and suggest that even simplified regimes, such as those with quadratic lower-level objectives, warrant further investigation toward understanding the optimal complexity of bilevel optimization under standard first-order oracles.

2502.00336 2026-06-15 cs.LG stat.ML 版本更新

Denoising Score Matching with Random Features: Insights on Diffusion Models from Precise Learning Curves

随机特征去噪分数匹配:从精确学习曲线看扩散模型

Anand Jerry George, Rodrigo Veiga, Nicolas Macris

发表机构 * École Polytechnique Fédérale de Lausanne (EPFL)(联邦理工学院洛桑校区)

AI总结 通过随机特征神经网络参数化分数函数,推导去噪分数匹配的渐近精确误差,揭示模型复杂度、数据量和噪声样本数对扩散模型泛化与记忆的影响。

Comments Published at AISTATS 2026

详情
AI中文摘要

我们从理论上研究扩散模型中的泛化和记忆现象。实证研究表明,这些现象受模型复杂度和训练数据集大小的影响。在我们的实验中,我们进一步观察到去噪分数匹配(DSM)中每个数据样本使用的噪声样本数($m$)起着显著且非平凡的作用。我们通过在一个简单理论设置下推导DSM测试误差和训练误差的渐近精确表达式,捕捉这些行为并揭示其机制。分数函数由随机特征神经网络参数化,目标分布为$d$维高斯分布。我们在维度$d$、数据样本数$n$和特征数$p$趋于无穷大,同时保持比率$\psi_n=\frac{n}{d}$和$\psi_p=\frac{p}{d}$固定的情况下进行操作。通过刻画测试和训练误差,我们确定了作为$\psi_n$、$\psi_p$和$m$函数的泛化和记忆区域。我们的理论发现与实证观察一致。

英文摘要

We theoretically investigate the phenomena of generalization and memorization in diffusion models. Empirical studies suggest that these phenomena are influenced by model complexity and the size of the training dataset. In our experiments, we further observe that the number of noise samples per data sample ($m$) used during Denoising Score Matching (DSM) plays a significant and non-trivial role. We capture these behaviors and shed light on their mechanisms by deriving asymptotically precise expressions for test and train errors of DSM under a simple theoretical setting. The score function is parameterized by random features neural networks, with the target distribution being $d$-dimensional Gaussian. We operate in a regime where the dimension $d$, number of data samples $n$, and number of features $p$ tend to infinity while keeping the ratios $\psi_n=\frac{n}{d}$ and $\psi_p=\frac{p}{d}$ fixed. By characterizing the test and train errors, we identify regimes of generalization and memorization as a function of $\psi_n$, $\psi_p$, and $m$. Our theoretical findings are consistent with the empirical observations.

2509.24710 2026-06-15 stat.ML cs.LG cs.NA math.NA 版本更新

MAD: Manifold Attracted Diffusion

MAD: 流形吸引扩散

Dennis Elbrächter, Giovanni S. Alberti, Matteo Santacesaria

发表机构 * Department of Mathematics, University of Vienna(维也纳大学数学系) MaLGa Center, Department of Mathematics, University of Genoa(热那亚大学数学系MaLGa中心)

AI总结 提出流形吸引扩散方法,利用流形假设通过扩展得分函数在推理阶段去除噪声,生成无噪声样本,在玩具问题、合成数据和真实数据上验证有效性。

详情
Journal ref
Forty-third International Conference on Machine Learning, 2026
AI中文摘要

基于得分的扩散模型是从图像分布中生成样本的一种高效方法。我们考虑训练数据来自目标分布的有噪声版本的情况,并提出一种可高效实现的推理过程修改,以生成无噪声样本。我们的方法受流形假设启发,该假设认为有意义的数据集中在高维环境空间的某个低维流形周围。核心思想是,噪声表现为离流形方向上的低幅度变化,而目标分布的相关变化主要限于流形方向。我们引入了扩展得分概念,并表明在简化设置中,它可以将小变化减少为零,同时基本保持大变化不变。我们描述了如何从标准得分的近似中高效计算其近似,并在玩具问题、合成数据和真实数据上展示了其有效性。

英文摘要

Score-based diffusion models are a highly effective method for generating samples from a distribution of images. We consider scenarios where the training data comes from a noisy version of the target distribution, and present an efficiently implementable modification of the inference procedure to generate noiseless samples. Our approach is motivated by the manifold hypothesis, according to which meaningful data is concentrated around some low-dimensional manifold of a high-dimensional ambient space. The central idea is that noise manifests as low magnitude variation in off-manifold directions in contrast to the relevant variation of the desired distribution which is mostly confined to on-manifold directions. We introduce the notion of an extended score and show that, in a simplified setting, it can be used to reduce small variations to zero, while leaving large variations mostly unchanged. We describe how its approximation can be computed efficiently from an approximation to the standard score and demonstrate its efficacy on toy problems, synthetic data, and real data.

2312.14889 2026-06-15 stat.ML cs.CR cs.LG math.ST stat.TH 版本更新

On Rate-Optimal Partitioning Classification from Observable and from Privatised Data

关于可观测数据和私有数据的最优划分分类方法

Balázs Csanád Csáji, László Györfi, Ambrus Tamás, Harro Walk

发表机构 * HUN-REN Institute for Computer Science and Control (SZTAKI)(HUN-REN计算机科学与控制研究所(SZTAKI)) Department of Probability Theory and Statistics, Institute of Mathematics, Eötvös Loránd University (ELTE)(概率论与统计学系,厄特沃什·洛朗大学数学学院(ELTE)) Department of Computer Science and Information Theory, Budapest University of Technology and Economics (BME)(计算机科学与信息理论系,布达佩斯技术与经济大学(BME)) Institute for Stochastics and Applications, University of Stuttgart(概率论与应用研究所,斯图加特大学)

AI总结 本文重新审视划分分类方法,在更宽松条件下(无需强密度假设)推导出可观测和私有数据下分类误差概率的收敛速率,该速率仅依赖于连续输入的内在维度。

详情
AI中文摘要

在本文中,我们重新审视了划分分类的经典方法,并在宽松条件下证明了新的收敛速率,既适用于可观测(非私有化)数据,也适用于私有化数据。我们考虑在 $d$ 维欧几里得空间中的分类问题。先前关于划分分类器的结果依赖于强密度假设(SDA),我们通过简单示例表明该假设具有限制性。在此,我们在更温和的假设下研究该问题。我们预设输入分布是绝对连续分布和离散分布的混合,使得绝对连续分量集中在 $d_a$ 维子空间上。除了标准的 Lipschitz 和边际条件外,还引入了绝对连续分量的一个新特征,据此计算分类误差概率的收敛速率,包括二元和多类情况。该界可以达到使用 SDA 所能达到的极小极大最优收敛速率,但在更温和的分布假设下。有趣的是,该收敛速率仅依赖于连续输入的内在维度 $d_a$,而非 $d$。在隐私约束下,数据无法直接观测,构建的分类器是合适的局部差分隐私机制随机结果的函数。在本文中,我们将拉普拉斯分布噪声添加到特征向量所有可能位置的离散化及其标签中。再次,可以在不使用 SDA 的情况下推导出分类误差概率收敛速率的紧上界,使得该速率依赖于 $2d_a$。

英文摘要

In this paper we revisit the classical method of partitioning classification and prove novel convergence rates under relaxed conditions, both for observable (non-privatised) and for privatised data. We consider the problem of classification in a $d$-dimensional Euclidean space. Previous results on the partitioning classifier worked with the strong density assumption (SDA), which is restrictive, as we demonstrate through simple examples. Here, we study the problem under much milder assumptions. We presuppose that the distribution of the inputs is a mixture of an absolutely continuous and a discrete distribution, such that the absolutely continuous component is concentrated on a $d_a$-dimensional subspace. In addition to the standard Lipschitz and margin conditions, a novel characteristic of the absolutely continuous component is introduced, by which the convergence rate of the classification error probability is computed, both for the binary and for the multi-class cases. This bound can reach the minimax optimal convergence rate achievable using SDA, but under much milder distributional assumptions. Interestingly, this convergence rate depends only on the intrinsic dimension of the continuous inputs, $d_a$, and not on $d$. Under privacy constraints, the data cannot be directly observed, and the constructed classifiers are functions of the randomised outcome of a suitable local differential privacy mechanism. In this paper we add Laplace-distributed noise to the discretisations of all possible locations of the feature vector and to its label. Again, tight upper bounds on the convergence rate of the classification error probability can be derived, without using SDA, such that this rate depends on $2d_a$.
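
A one-dimensional toy version of a partitioning classifier with Laplace privatisation might look as follows. Each user releases, for every cell of the partition, the signed label times the cell indicator plus Laplace noise, and the classifier takes the sign of the per-cell sum. The restriction to features in $[0, 1)$, the equal-width cells, the $\pm 1$ label coding, and the free noise `scale` are assumptions of this sketch. Calibrating the scale to a privacy budget is left out, and the paper's mechanism may differ in its details.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) noise as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def cell_index(x, num_cells):
    """Index of the equal-width cell of [0, 1) containing x."""
    return min(int(x * num_cells), num_cells - 1)


def privatise(x, label, num_cells, scale, rng):
    """One user's release: label * 1{x in cell j} + Laplace noise, for every cell j."""
    own_cell = cell_index(x, num_cells)
    return [(label if j == own_cell else 0.0) + laplace(scale, rng)
            for j in range(num_cells)]


def fit(data, num_cells, scale, seed=0):
    """Sum the privatised releases per cell; only these sums are kept."""
    rng = random.Random(seed)
    totals = [0.0] * num_cells
    for x, label in data:
        for j, z in enumerate(privatise(x, label, num_cells, scale, rng)):
            totals[j] += z
    return totals


def predict(totals, x):
    """Classify by the sign of the (noisy) label sum in the cell containing x."""
    return 1 if totals[cell_index(x, len(totals))] >= 0 else -1


data = [(0.1, 1), (0.2, 1), (0.3, -1), (0.7, -1), (0.8, -1)]
plain = fit(data, num_cells=2, scale=0.0)   # scale 0 recovers the non-private majority vote
noisy = fit(data, num_cells=2, scale=1.0)   # privatised version, seeded for repeatability
```

With `scale=0.0` the rule is the classical majority vote within each cell. Noise is added to every cell, including the ones the user does not occupy, so that the release does not reveal which cell the feature fell into.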

2507.20068 2026-06-15 cs.LG stat.ML 版本更新

PERRY: Policy Evaluation with Confidence Intervals using Auxiliary Data

PERRY: 使用辅助数据的策略评估与置信区间

Aishwarya Mandyam, Jason Meng, Ge Gao, Jiankai Sun, Mac Schwager, Barbara E. Engelhardt, Emma Brunskill

AI总结 提出两种方法,利用辅助数据构建离线策略评估的置信区间,通过共形预测和双重稳健估计,在多个模拟和真实医疗数据集上验证有效性。

详情
AI中文摘要

离线策略评估(OPE)方法在部署前估计新强化学习(RL)策略的价值。最近的研究表明,利用辅助数据集(例如由生成模型合成的数据)可以提高OPE方法的准确性。不幸的是,此类辅助数据集也可能存在偏差,并且现有在OPE中使用数据增强的方法缺乏原则性的不确定性量化。在医疗等高风险领域,可靠的不确定性估计对于确保RL策略的安全和知情部署至关重要。在这项工作中,我们提出了两种方法来构建带有数据增强的OPE的有效置信区间。第一种方法提供了关于$V^{\pi}(s)$的置信区间,即条件于初始状态$s$的策略价值。为此,我们引入了一种适用于具有连续状态空间的马尔可夫决策过程(MDP)的新共形预测方法,将先前工作扩展到更高维度的设置。其次,我们考虑更常见的任务,即估计所有初始状态上的平均策略性能$V^{\pi}$;我们引入了一种方法,该方法借鉴了双重稳健估计和预测驱动推断的思想。在涵盖库存管理、机器人、医疗以及来自MIMIC-IV的真实医疗数据集的模拟器中,我们发现我们的方法可以有效利用辅助数据,并一致地产生覆盖真实策略价值的置信区间,这与先前提出的方法不同。我们的工作使得OPE能够在高风险领域提供严格的不确定性估计成为可能。

英文摘要

Off-policy evaluation (OPE) methods estimate the value of a new reinforcement learning (RL) policy prior to deployment. Recent advances have shown that leveraging auxiliary datasets, such as those synthesized by generative models, can improve the accuracy of OPE methods. Unfortunately, such auxiliary datasets may also be biased, and existing methods for using data augmentation within OPE lack principled uncertainty quantification. In high-stakes domains like healthcare, reliable uncertainty estimates are important for ensuring safe and informed deployment of RL policies. In this work, we propose two methods to construct valid confidence intervals for OPE with data augmentation. The first provides a confidence interval over $V^{\pi}(s)$, the policy value conditioned on an initial state $s$. To do so, we introduce a new conformal prediction method suitable for Markov Decision Processes (MDPs) with continuous state spaces, extending prior work to higher-dimensional settings. Second, we consider the more common task of estimating the average policy performance over all initial states, $V^{\pi}$; we introduce a method that draws on ideas from doubly robust estimation and prediction-powered inference. Across simulators spanning inventory management, robotics, healthcare, and a real healthcare dataset from MIMIC-IV, we find that our methods can effectively leverage auxiliary data and consistently produce confidence intervals that cover the ground truth policy values, unlike previously proposed methods. Our work enables a future in which OPE can provide rigorous uncertainty estimates for high-stakes domains.
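
For readers unfamiliar with prediction-powered inference, the sketch below shows the generic estimator for a mean: average the model predictions on a large auxiliary sample, then correct them with the average residual observed on a small trusted sample. Mapping this onto OPE (predictions as model-based value estimates on auxiliary data, trusted values as returns from real trajectories) is an assumption made for illustration. The paper's estimator also draws on doubly robust ideas that are not shown, and the function and argument names are hypothetical.

```python
from statistics import NormalDist, fmean, variance


def ppi_mean_interval(aux_predictions, labeled_predictions, labeled_values, alpha=0.05):
    """Prediction-powered point estimate and normal-approximation interval for a mean.

    aux_predictions: model predictions on auxiliary inputs with no trusted outcome.
    labeled_predictions, labeled_values: predictions and trusted outcomes on the same inputs.
    Assumes the two samples are independent and each has at least two elements.
    """
    residuals = [y - f for y, f in zip(labeled_values, labeled_predictions)]
    estimate = fmean(aux_predictions) + fmean(residuals)  # prediction mean + bias correction
    std_err = (variance(aux_predictions) / len(aux_predictions)
               + variance(residuals) / len(residuals)) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate, (estimate - z * std_err, estimate + z * std_err)


estimate, (lower, upper) = ppi_mean_interval(
    aux_predictions=[1.0, 2.0, 3.0],
    labeled_predictions=[2.0, 4.0],
    labeled_values=[1.0, 5.0],
)
```

The residual term is what protects against a biased auxiliary model: if the predictions are systematically off, the average residual on the trusted sample shifts the estimate back, at the cost of a wider interval.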

2505.12992 2026-06-15 cs.LG cs.AI cs.CL stat.ML 版本更新

Fractured Chain-of-Thought Reasoning

断裂链式思维推理

Baohao Liao, Hanze Dong, Yuhui Xu, Doyen Sahoo, Christof Monz, Junnan Li, Caiming Xiong

发表机构 * University of Amsterdam(阿姆斯特丹大学) eBay Microsoft(微软) Google Research(谷歌研究) Salesforce

AI总结 提出断裂采样策略,通过截断推理链、调整轨迹数和解数,在推理时实现精度与成本的帕累托最优。

详情
AI中文摘要

推理时扩展技术通过在不重新训练的情况下利用额外的推理计算,显著增强了大型语言模型(LLMs)的推理能力。类似地,链式思维(CoT)提示及其扩展Long CoT通过生成丰富的中间推理轨迹来提高准确性,但这些方法会带来大量的token成本,阻碍了它们在延迟敏感场景中的部署。在这项工作中,我们首先证明截断CoT(即在完成推理前停止并直接生成最终答案)通常在使用显著更少token的情况下与完整CoT采样相匹配。基于这一见解,我们引入了断裂采样,这是一种统一的推理时策略,沿着三个正交轴在完整CoT和仅解决方案采样之间进行插值:(1)推理轨迹的数量,(2)每条轨迹的最终解数量,以及(3)推理轨迹被截断的深度。通过在五个不同的推理基准和多个模型规模上进行大量实验,我们证明断裂采样始终实现优越的精度-成本权衡,在Pass@k与token预算之间产生陡峭的对数线性缩放增益。我们的分析揭示了如何在这些维度上分配计算以最大化性能,为更高效和可扩展的LLM推理铺平了道路。代码可在 https://github.com/BaohaoLiao/frac-cot 获取。

英文摘要

Inference-time scaling techniques have significantly bolstered the reasoning capabilities of large language models (LLMs) by harnessing additional computational effort at inference without retraining. Similarly, Chain-of-Thought (CoT) prompting and its extension, Long CoT, improve accuracy by generating rich intermediate reasoning trajectories, but these approaches incur substantial token costs that impede their deployment in latency-sensitive settings. In this work, we first show that truncated CoT, which stops reasoning before completion and directly generates the final answer, often matches the full CoT sampling while using dramatically fewer tokens. Building on this insight, we introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling along three orthogonal axes: (1) the number of reasoning trajectories, (2) the number of final solutions per trajectory, and (3) the depth at which reasoning traces are truncated. Through extensive experiments on five diverse reasoning benchmarks and several model scales, we demonstrate that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget. Our analysis reveals how to allocate computation across these dimensions to maximize performance, paving the way for more efficient and scalable LLM reasoning. Code is available at https://github.com/BaohaoLiao/frac-cot.

2402.16388 2026-06-15 stat.ML cs.LG 版本更新

Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors

留一法、自助法和交叉共形异常检测器

Oliver Hennhöfer, Christine Preisach

发表机构 * German Federal Ministry for Economic Affairs and Climate Action(德国经济事务和气候行动部)

AI总结 为解决异常检测中校准数据不足的问题,基于共形预测提出留一法、自助法和交叉共形方法,在控制第一类错误率的同时提高数据效率。

Comments Published in 2024 IEEE International Conference on Knowledge Graph (ICKG)

详情
Journal ref
Proc. 2024 IEEE ICKG 15(1): 110-119 (February 2025)
AI中文摘要

异常检测系统中不确定性量化的需求日益重要。在此背景下,有效控制这些系统的第一类错误率而不增加第二类错误率,可以建立信任并减少与错误发现相关的成本。共形异常检测领域通过模型校准提供统计和有限样本有效性保证,成为一种有前景的方法。然而,对校准数据的依赖带来了实际限制,尤其是在低数据场景中。在本工作中,我们基于共形预测领域的方法,正式定义并评估了用于共形异常检测的留一法、自助法和交叉共形方法。超越经典的拆分共形方法,我们展示了用于计算重抽样共形$p$值的派生方法在全共形(直推式)方法的数据效率与拆分共形(归纳式)方法的计算效率之间提供了实用的折衷。我们验证了派生方法,并量化了它们在一类分类器和数据集上的改进。

英文摘要

The need for uncertainty quantification in anomaly detection systems has become increasingly important. In this context, effectively controlling Type I error rates without inflating Type II error rates in these systems can build trust and reduce costs associated with false discoveries. The field of conformal anomaly detection emerges as a promising approach for providing respective statistical and finite-sample validity guarantees through model calibration. However, reliance on calibration data imposes practical limitations, especially in low-data regimes. In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for conformal anomaly detection, building on methods from the field of conformal prediction. Looking beyond the classical split-conformal approach, we show that derived methods for calculating resampling-conformal $p$-values offer a practical compromise between the data efficiency of full-conformal (transductive) approaches and the computational efficiency of split-conformal (inductive) methods. We validate derived methods and quantify their improvements for a range of one-class classifiers and datasets.
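
The difference between split-conformal and cross-conformal $p$-values can be sketched with a deliberately simple detector. Here the "model" is just the training mean and the anomaly score is the absolute distance to it. The detector, the interleaved fold assignment, and the use of the median to aggregate the test point's scores across fold models are all assumptions of this example, not necessarily the choices made in the paper.

```python
from statistics import fmean, median


def fit(train):
    """Toy one-class 'detector': remember the training mean."""
    return fmean(train)


def score(model, x):
    """Anomaly score: larger means more unusual."""
    return abs(x - model)


def conformal_p_value(calibration_scores, test_score):
    """Share of calibration scores at least as extreme as the test score (with +1 correction)."""
    exceed = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + exceed) / (len(calibration_scores) + 1)


def cross_conformal_p_value(data, x, k):
    """Every point serves as calibration data exactly once, scored by a model that never saw it."""
    folds = [data[i::k] for i in range(k)]
    calibration, test_scores = [], []
    for i, held_out in enumerate(folds):
        train = [v for j, fold in enumerate(folds) if j != i for v in fold]
        model = fit(train)
        calibration.extend(score(model, v) for v in held_out)
        test_scores.append(score(model, x))
    return conformal_p_value(calibration, median(test_scores))


normal_data = [0.0, 1.0, -1.0, 0.5, -0.5, 0.2]
p_outlier = cross_conformal_p_value(normal_data, 10.0, k=3)
p_inlier = cross_conformal_p_value(normal_data, 0.0, k=3)
```

A split-conformal detector would use one fixed calibration subset in `conformal_p_value`. The cross-conformal variant reuses all six points for calibration, so the smallest attainable $p$-value drops to $1/7$ without setting any data aside permanently.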

8. 生物统计与医学统计 3 篇

2606.14587 2026-06-15 stat.ME 新提交

Typical Healthcare Pathways as a Basis for Admixture Modeling of Patient Trajectories

典型医疗路径作为患者轨迹混合建模的基础

Maryam Farhadizadeh, Carola S. Heinzel, August Sigle, Harald Binder, Frederik Wenz, Jan Hasenauer, Peter Pfaffelhuber, Nadine Binder

AI总结 提出一个框架,通过规则算法识别典型医疗路径,并利用混合模型将患者表示为典型路径的概率混合,用于患者轨迹的总结和亚组识别。

详情
AI中文摘要

背景:了解患者是否遵循相似或不同的护理模式对于表征临床实践、识别患者亚组和支持质量改进至关重要。然而,常规医疗轨迹难以直接比较,因为患者在诊断检查、治疗顺序、临床事件时间点和记录实践上可能存在差异。尽管存在这种变异,轨迹在队列层面常包含重复模式。方法:为应对这一挑战,我们提出一个框架,明确将队列层面的典型路径识别与患者层面的推断分开。在队列层面,我们使用基于规则的算法推导护理过程的可解释表示,以识别典型医疗路径,生成紧凑的路径图。然后将这些路径建模为马尔可夫链,并作为混合模型中的结构化组件,使每个患者能够表示为典型路径的概率混合,而非分配到单一路径组件。所得的混合权重为亚组表征提供了患者轨迹的紧凑表示。我们进一步评估了跨多个训练-测试分割的已识别路径和推断混合表示的稳定性。结果:在训练-测试分割中,该框架展示了一致的路径结构和患者层面的混合模式。应用于接受根治性前列腺切除术的前列腺癌患者的常规护理数据时,该框架识别了可解释的护理模式,并支持识别具有相似临床事件模式的患者亚组。结论:总体而言,所提出的框架为总结真实世界实践中的治疗路径和表征患者亚组提供了一种可解释且稳定的方法。

英文摘要

Background: Understanding whether patients follow similar or distinct patterns of care is important for characterizing clinical practice, identifying patient subgroups, and supporting quality improvement. However, routine healthcare trajectories are difficult to compare directly because patients may differ in their diagnostic workup, treatment sequencing, timing of clinical events, and documentation practices. Despite this variation, trajectories often contain recurring patterns at the cohort level. Methods: To address this challenge, we present a framework that explicitly separates cohort-level typical pathway identification from patient-level inference. At the cohort level, we derive an interpretable representation of care processes using a rule-based algorithm to identify typical healthcare pathways, resulting in a compact pathway graph. These pathways are then modeled as Markov chains and used as structured components in an admixture model, allowing each patient to be represented as a probabilistic mixture of typical pathways rather than being assigned to a single pathway component. The resulting admixture weights provide a compact representation of patient trajectories for subgroup characterization. We further assess the stability of the identified pathways and inferred admixture representations across multiple train-test splits. Results: Across train-test splits, the framework demonstrated consistent pathway structures and patient-level mixture patterns. Applied to routine care data from prostate cancer patients undergoing radical prostatectomy, the framework identified interpretable care patterns and supported the identification of patient subgroups with similar clinical event patterns. Conclusion: Overall, the proposed framework provides an interpretable and stable approach for summarizing treatment pathways and characterizing patient subgroups in real-world practice.
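
One way to read the admixture step is as a mixture over Markov-chain components at the level of individual transitions, with patient-specific weights fitted by expectation-maximization. The sketch below implements that reading. The per-transition mixture, the EM update, the two toy pathways, and the event names are all assumptions for illustration. The rule-based pathway identification and the authors' actual likelihood are not reproduced.

```python
def transition_prob(chain, a, b):
    """Probability of moving from state a to state b (0 if the chain lacks the transition)."""
    return chain.get(a, {}).get(b, 0.0)


def admixture_weights(sequence, chains, iterations=50):
    """EM for one patient's mixture weights over fixed pathway chains."""
    transitions = list(zip(sequence, sequence[1:]))
    k = len(chains)
    weights = [1.0 / k] * k
    for _ in range(iterations):
        totals = [0.0] * k
        used = 0
        for a, b in transitions:
            joint = [w * transition_prob(c, a, b) for w, c in zip(weights, chains)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # transition explained by no pathway: skip it
            used += 1
            for i in range(k):
                totals[i] += joint[i] / norm  # responsibility of pathway i
        if used == 0:
            break
        weights = [t / used for t in totals]
    return weights


# Hypothetical typical pathways, written as transition tables.
direct_surgery = {"diagnosis": {"surgery": 1.0}, "surgery": {"followup": 1.0}}
surveillance = {"diagnosis": {"psa_test": 1.0},
                "psa_test": {"psa_test": 0.5, "surgery": 0.5}}

pure = admixture_weights(["diagnosis", "surgery", "followup"],
                         [direct_surgery, surveillance])
mixed = admixture_weights(["diagnosis", "psa_test", "surgery", "followup"],
                          [direct_surgery, surveillance])
```

The second patient starts on the surveillance pathway and finishes on the surgery pathway, so the fitted weights split between the two components instead of forcing a single label.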

2606.14403 2026-06-15 stat.AP eess.SP stat.ME stat.ML 新提交

A Deep Zero-Inflated Model of North Atlantic Right Whale Presence To Support Blue Economy Management in the U.S. East Coast

支持美国东海岸蓝色经济管理的北大西洋露脊鲸存在的深度零膨胀模型

Jiaxiang Ji, Laura Nazzaro, Josh Kohut, Ahmed Aziz Ezzat

AI总结 提出深度零膨胀伯努利模型,联合建模潜在物种存在和条件检测概率,从异质协变量中学习复杂栖息地关系,生成高分辨率时空存在图以支持蓝色经济管理。

AI中文摘要

有效建模濒危海洋哺乳动物物种(如北大西洋露脊鲸)对于平衡海洋保护与日益增长的蓝色经济至关重要。自主水下航行器收集的被动声学监测数据为局部海洋物种检测和海洋学传感提供了新机会,但也引入了复杂的统计挑战,如零膨胀、不完美检测和复杂的依赖结构。为此,我们提出了深度零膨胀伯努利(DeepZIB)模型——一种深度统计方法,它联合建模潜在物种存在和条件检测概率,同时从异质协变量信息中学习复杂的栖息地关系。我们建立了模型结构性质的理论结果,并进行了模拟实验,以证明其恢复潜在参数和潜在存在场的能力。应用于美国东海岸北大西洋露脊鲸的真实被动声学监测数据,展示了该模型在捕捉物种动态和空间变化栖息地方面的改进的模型充分性和预测性能。DeepZIB的一个关键优势是能够生成高分辨率、时空变化的存在图,为蓝色经济行业(从海上和海洋能源到渔业管理和海上运输)提供有针对性和风险意识的管理见解。

英文摘要

Effective modeling of endangered marine mammal species, such as the North Atlantic Right Whale, is critical for balancing marine conservation with the growing blue economy. Passive acoustic monitoring data collected by autonomous underwater vehicles provide new opportunities for localized marine species detection and oceanographic sensing, but introduce complex statistical challenges such as zero inflation, imperfect detection, and intricate dependence structures. In response, we propose the Deep Zero-Inflated Bernoulli (DeepZIB) model, a deep statistical method that jointly models latent species presence and conditional detection probabilities while learning complex habitat relationships from heterogeneous covariate information. We establish theoretical results on the model's structural properties and conduct simulation experiments to demonstrate its ability to recover underlying parameters and latent presence fields. Application to real-world passive acoustic monitoring data on the North Atlantic Right Whale along the U.S. East Coast demonstrates improved model adequacy and predictive performance in capturing the species' dynamic and spatially varying habitat. A key advantage of DeepZIB is its ability to generate high-resolution, spatially and temporally varying presence maps, providing valuable insights for targeted and risk-aware management of blue economy industries, ranging from offshore and marine energy to fisheries management and maritime transport.
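
The zero-inflated Bernoulli structure named in this abstract can be illustrated with a minimal sketch. It assumes the textbook form in which a detection requires both presence and detection, so that a recorded zero is either a true absence or a missed detection. The function name and arguments are hypothetical, and the sketch takes the two probabilities as given numbers strictly between 0 and 1, whereas the abstract describes learning them from covariates with deep models.

```python
import math

def zib_log_likelihood(y, presence_prob, detection_prob):
    """Log-likelihood of one binary detection under a zero-inflated Bernoulli.

    y: 1 if a detection was recorded, 0 otherwise.
    presence_prob: P(species present), the latent-presence part (assumed in (0, 1)).
    detection_prob: P(detection | present), the conditional part (assumed in (0, 1)).
    """
    if y == 1:
        # A detection requires the species to be present AND to be detected.
        return math.log(presence_prob * detection_prob)
    # A zero is either a true absence or a presence that went undetected.
    return math.log((1.0 - presence_prob) + presence_prob * (1.0 - detection_prob))
```

With a single binary observation the two probabilities enter only through their product, so they can be separated only with extra structure such as covariates or repeated observations; the abstract does not state which of these the model relies on.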

2510.16180 2026-06-15 stat.AP 版本更新

Estimating Time-Varying Epidemic Severity Rates with Adaptive Deconvolution

使用自适应反卷积估计时变流行病严重率

Jeremy Goldwasser, Addison J. Hu, Alyssa Bilinski, Daniel J. McDonald, Ryan J. Tibshirani

AI总结 针对时变严重率估计中比率估计量偏差大的问题,提出基于泊松-二项模型的自适应反卷积方法,结合趋势过滤惩罚实现平滑局部自适应估计,在COVID-19数据上优于标准方法。

AI中文摘要

公共卫生中的几个关键指标传达了主要事件未来导致更严重次要事件的概率。这些“严重率”会随着新疗法、变异株或公共卫生干预等条件变化而在流行病过程中改变。实践中,诸如病死率等时变参数通常从汇总计数数据中估计。先前工作表明,常用的基于比率的估计量可能存在高度偏差,这促使了新方法的开发。在本文中,我们开发了一种基于近似次要事件的泊松-二项模型的自适应反卷积方法,并通过趋势过滤惩罚对该模型中的最大似然解进行正则化,以产生随时间平滑但局部自适应的严重率估计。这使我们能够回顾性和实时地计算严重率。基于COVID-19死亡和住院数据的实验表明,我们的反卷积估计量通常比标准的基于比率的方法更准确,并且对模型误设表现出合理的鲁棒性。

英文摘要

Several key metrics in public health convey the probability that a primary event will lead to a more serious secondary event in the future. These "severity rates" can change over the course of an epidemic in response to shifting conditions like new therapeutics, variants, or public health interventions. In practice, time-varying parameters such as the case-fatality rate are typically estimated from aggregate count data. Prior work has demonstrated that commonly used ratio-based estimators can be highly biased, motivating the development of new methods. In this paper, we develop an adaptive deconvolution approach based on approximating a Poisson-binomial model for secondary events, and we regularize the maximum likelihood solution in this model with a trend filtering penalty to produce smooth but locally adaptive estimates of severity rates over time. This enables us to compute severity rates both retrospectively and in real time. Experiments based on COVID-19 death and hospitalization data show that our deconvolution estimator is generally more accurate than the standard ratio-based methods, and displays reasonable robustness to model misspecification.
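
To see why a ratio-based estimator can be biased, the following sketch may help. It assumes a simple convolution model in which expected secondary events at time t are a delay-weighted sum of past primary events times their severity rate, and compares the truth with a lagged ratio. The function names, the delay distribution, and the numbers in the usage note are illustrative assumptions; the penalized likelihood estimator described in the abstract is not implemented here.

```python
def expected_secondary(primary, severity, delay):
    """Expected secondary counts under a convolution model.

    mu[t] = sum over k of delay[k] * severity[t - k] * primary[t - k],
    where delay[k] is the assumed probability that a secondary event
    occurs k steps after its primary event.
    """
    mu = []
    for t in range(len(primary)):
        total = 0.0
        for k, weight in enumerate(delay):
            if t - k >= 0:
                total += weight * severity[t - k] * primary[t - k]
        mu.append(total)
    return mu

def lagged_ratio(primary, secondary, lag):
    """Ratio-style estimate secondary[t] / primary[t - lag]; None where undefined."""
    out = []
    for t in range(len(secondary)):
        if t - lag < 0 or primary[t - lag] == 0:
            out.append(None)
        else:
            out.append(secondary[t] / primary[t - lag])
    return out
```

For example, with primary counts doubling as `[100, 200, 400, 800]`, a constant true severity of 0.1, and delay weights `[0.5, 0.5]`, the lag-1 ratio settles near 0.15 rather than 0.1, because part of each step's secondary events comes from the larger, more recent primary counts.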

9. 经济金融与社会科学统计 3 篇

2606.14117 2026-06-15 stat.ME cs.AI 新提交

A Two-Stage Statistical Framework for Evaluating Associative Interference in Large Language Models

评估大语言模型中联想干扰的两阶段统计框架

Achraf Cohen, Andrew Kincaid

发表机构 * Department of Mathematics and Statistics, University of West Florida(数学与统计学系,西佛罗里达大学)

AI总结 提出两阶段统计框架,分离响应遵从性与任务一致性,评估三个LLM在性别-职业等领域的联想干扰,发现效应因模型而异。

Comments 11 pages; 2 figures

AI中文摘要

大语言模型(LLM)越来越多地通过改编人类心理范式来评估偏见,然而方法论上的局限性——特别是将拒绝行为与任务表现混为一谈——阻碍了清晰的解释。在此,我们将内隐联想测验(IAT)改编为一个受控的强制选择框架,并引入一个两阶段建模方法,将响应遵从性与任务一致性分类分开。在三个当代LLM(Claude Sonnet-4、Gemini 2.5 Pro和GPT-5)上,我们评估了联想干扰,定义为不一致条件相对于一致条件下任务一致性的降低。虽然对结构化响应格式的遵从性普遍较高,但干扰效应在模型和领域之间差异很大。Claude Sonnet-4在性别-职业领域表现出强干扰(DeltaP = 0.086, 95% CrI [0.026, 0.173]),在性别-科学领域表现出较小但可信的效应。Gemini 2.5 Pro显示出减弱的干扰,而GPT-5在所有领域表现出最小或不可检测的干扰。这些发现表明,IAT风格的联想不对称性并非LLM的普遍属性,而是取决于模型特定特征。通过将干扰与遵从性分离并对项目水平变异性建模,本研究为评估LLM中的结构化响应模式提供了一个原则性框架。结果强调了模型特定评估的重要性,并表明联想干扰在现代系统中可以得到实质性缓解。

英文摘要

Large language models (LLMs) are increasingly evaluated for bias using adaptations of human psychological paradigms, yet methodological limitations, particularly the conflation of refusal behavior with task performance, have hindered clear interpretation. Here, we adapt the Implicit Association Test (IAT) to a controlled, forced-choice framework and introduce a two-stage modeling approach that separates response compliance from task-consistent classification. Across three contemporary LLMs (Claude Sonnet-4, Gemini 2.5 Pro, and GPT-5), we evaluate associative interference, defined as reduced task-consistency in incongruent relative to congruent conditions. While compliance with the structured response format was uniformly high, interference effects varied substantially across models and domains. Claude Sonnet-4 exhibited strong interference in the Gender-Career domain (DeltaP = 0.086, 95% CrI [0.026, 0.173]) and smaller but credible effects in Gender-Science. Gemini 2.5 Pro showed attenuated interference, and GPT-5 exhibited minimal or no detectable interference across domains. These findings demonstrate that IAT-style associative asymmetries are not a universal property of LLMs, but instead depend on model-specific characteristics. By isolating interference from compliance and modeling item-level variability, this study provides a principled framework for evaluating structured response patterns in LLMs. The results highlight the importance of model-specific assessment and suggest that associative interference can be substantially mitigated in modern systems.

2606.14009 2026-06-15 stat.ME econ.EM 新提交

Reliable Panel Regression: A Default Workflow for Slow-Moving, Mismeasured Variables

可靠的面板回归:针对缓慢变化、测量误差变量的默认工作流程

Andrew S. Rosenberg

AI总结 本文指出固定效应对缓慢变化且存在测量误差的变量会导致系数衰减,并提出了一个包含可靠性估计、修正估计和自相关边界的默认工作流程。

AI中文摘要

政治科学家通常将固定效应下的系数收缩解释为混合关联存在混杂。本文展示了为什么对于缓慢变化且存在测量误差的回归变量,这种推断不可靠。固定效应可能移除大量信号,并从单位内变异中识别系数,而这种变异不成比例地包含测量误差,导致估计值向零衰减。因此,单独的固定效应系数可能无法区分混杂和测量误差。我证明衰减取决于回归变量的经验组内相关系数和测量可靠性。然后,我提出了一个面板回归的默认工作流程。研究者尽可能估计可靠性,报告混合估计和经组内可靠性修正的固定效应估计,当估计值符号相同时使用部分识别边界,当符号不同时报告固定效应作为单位内估计。对于没有可靠性估计的变量,我引入一个自相关边界,直接约束衰减因子。最后,我将此工作流程应用于几个已发表的结果,表明数据通常无法区分衰减和混杂,而该工作流程明确了研究者面临的情况。

英文摘要

Political scientists often interpret coefficient shrinkage under fixed effects as evidence that pooled associations are confounded. This paper shows why that inference is unreliable for slow-moving, mismeasured regressors. Fixed effects can remove much of the signal and identify coefficients from within-unit variation that is disproportionately measurement error, attenuating estimates toward zero. A lone fixed effects coefficient may therefore be unable to distinguish confounding from measurement error. I show that the attenuation depends on a regressor's empirical intraclass correlation and measurement reliability. I then propose a default workflow for panel regression. Researchers estimate reliability when possible, report pooled estimates alongside fixed effects estimates corrected for within-unit reliability, use partial identification bounds when the estimates share a sign, and report fixed effects as a within-unit estimate when they do not. For variables with no reliability estimate, I introduce an autocorrelation frontier that bounds the attenuation factor directly. I conclude by applying this workflow to several published results to show that the data often cannot distinguish attenuation from confounding, and the workflow makes clear which case the researcher faces.
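
The dependence of attenuation on intraclass correlation and reliability can be illustrated with standard classical errors-in-variables algebra. This is a sketch under stated assumptions, not necessarily the paper's own formula: a single regressor, measurement error that is independent across time with no unit-level component, and panels long enough that unit means carry negligible error. Normalizing the observed variance to 1, the signal share is the reliability, the between-unit share is the intraclass correlation (all signal under these assumptions), and what remains within units is partly signal and entirely the home of the error. The function name is hypothetical.

```python
def within_reliability(reliability, icc):
    """Share of within-unit variance in an observed regressor that is signal.

    reliability: Var(true x) / Var(observed x), the pooled reliability.
    icc: intraclass correlation of the observed regressor, i.e. the
         between-unit share of its variance (assumed error-free here).
    Returns (reliability - icc) / (1 - icc), the factor by which a
    fixed effects coefficient is attenuated under the stated assumptions.
    """
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must lie in [0, 1)")
    if not icc <= reliability <= 1.0:
        raise ValueError("reliability must lie in [icc, 1] under these assumptions")
    return (reliability - icc) / (1.0 - icc)
```

For instance, a regressor with pooled reliability 0.9 and intraclass correlation 0.8 has within-unit reliability of about 0.5: the pooled coefficient is attenuated by roughly 10 percent, while the fixed effects coefficient is roughly halved even with no confounding at all.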

2606.13697 2026-06-15 q-fin.PM stat.ME 新提交

On Reference-Regulated Multiperiod Mean-Variance Portfolio Optimization in High Dimensions

关于高维情形下参考调控的多期均值-方差投资组合优化

Yutao Deng, Jianjun Gao, Weichen Wang

AI总结 提出参考调控多期均值-方差框架,通过惩罚偏离参考策略,结合动态策略与参考组合优势,在高维渐近下刻画样本外夏普比率,显著提升多期策略稳定性与表现。

AI中文摘要

多期均值-方差(MV)投资组合优化是Markowitz静态MV投资组合选择框架的重要扩展。与其静态对应物一样,多期MV投资组合仍然容易受到估计误差的影响。我们提出一个参考调控的多期均值-方差(RRMV)框架,该框架惩罚偏离参考策略的行为。因此,这种新的优化成功结合了动态策略和参考投资组合的优势。本文的一个关键贡献是在高维渐近下,考虑均值向量和协方差矩阵的估计误差,刻画了样本外夏普比率。我们展示了参考惩罚和投资期限如何共同影响优化投资组合的表现,以及正则化与单期投资组合优化的不同作用。大量的模拟和真实数据研究表明,所提出的框架显著提高了多期策略的稳定性和样本外夏普比率。

英文摘要

The multiperiod mean-variance (MV) portfolio optimization serves as a vital expansion of Markowitz's static MV portfolio selection framework. Just like its static counterpart, the multiperiod MV portfolio remains susceptible to estimation errors. We propose a reference-regulated multiperiod mean-variance (RRMV) framework that penalizes deviations from a reference policy. Therefore, this new optimization successfully combines the advantages of dynamic strategies and reference portfolios. A key contribution of this paper is the characterization of the out-of-sample Sharpe ratio under high-dimensional asymptotics with estimation errors in both the mean vector and the covariance matrix. We show how the reference penalty and the investment horizon jointly affect the optimized portfolio performance, and how regularization operates differently from the single-period portfolio optimization. Extensive simulation and real data studies demonstrate that the proposed framework improves the stability and out-of-sample Sharpe ratios of multiperiod policies significantly.

10. 数据隐私、稳健性与公平性 3 篇

2606.14506 2026-06-15 stat.ML cs.LG stat.ME 新提交

Beyond the Training Distribution: Evaluating Predictions Under Distribution Shift and Selection Bias

超越训练分布:评估分布偏移和选择偏差下的预测

Annie Ulichney, Amanda Coston

AI总结 针对协变量偏移和选择性标签共存时的模型评估问题,提出双机器学习程序估计目标风险,并通过eICU数据验证其准确性优于单独处理任一种偏差的方法。

AI中文摘要

理解预测模型在新环境中的表现对于防止算法在决策中造成伤害至关重要。模型性能下降的两个常见原因是:(i) 协变量偏移,即目标协变量分布与源分布不同;(ii) 选择性标签,即结果的可观测性取决于历史决策。我们研究在协变量偏移和基于观测特征的选择性标签共同存在下的部署前模型评估。特别地,我们提出了一种双机器学习程序,用于在一般损失函数下估计任意黑箱预测模型的目标风险。我们在标准假设下证明了该估计量的可识别性,并基于目标风险的影响函数推导出偏差校正估计量。最后,我们通过使用eICU电子健康记录数据库的实验评估了我们的估计量,结果表明,与单独处理选择性标签或协变量偏移的方法以及结合标准插值方法的基线相比,我们的估计量更准确地跟踪真实目标风险。

英文摘要

Understanding how a prediction model will perform in a new environment before deployment is essential to preventing harm when algorithms inform decision-making. Two common sources of model performance degradation are (i) covariate shift, where the target covariate distribution differs from the source, and (ii) selective labels, where the observability of outcomes depends on historical decisions. We study pre-deployment model evaluation under the joint presence of covariate shift and labeling of outcomes selectively based on observed features. In particular, we present a double machine learning procedure for estimating the target risk of an arbitrary black-box prediction model under a general loss function. We show identification of this estimand under standard assumptions and derive a bias-corrected estimator based on the influence function of the target risk. Finally, we evaluate our estimator through experiments using the eICU electronic health records database, showing that it tracks the true target risk more accurately than methods that address either selective labels or covariate shift alone, as well as baselines that combine standard plug-in approaches.

2606.13902 2026-06-15 stat.AP 新提交

How Should We Measure Empirical Risk when Synthesizing Population Data?

合成人口数据时应如何衡量经验风险?

Joshua Snoke

AI总结 本文探讨合成全量人口数据时经验风险评估框架的不足,指出成员推断攻击和属性推断攻击等传统指标需重新审视,并强调需根据具体情境调整评估方法。

AI中文摘要

合成数据已成为在共享数据时保护隐私的突出解决方案,但当前的经验风险评估框架从根本上假设了一个基于样本的上下文,这无法转化为对合成人口级别数据集的评估。本文探讨了为进行人口级别数据科学而合成整个群体时的含义,认为传统指标,如成员推断攻击(MIA)和属性推断攻击(AIA),需要重新审视。首先,在群体成员身份是公共知识或不被视为敏感信息的情况下,MIA可能变得无关紧要。其次,由于机密数据包含完整的人口信息,被单独识别的风险更高。此外,属性推断缺乏“样本外”比较组,意味着我们在定义可接受的推断时需要定义其他策略。最后,如果用例确实是实现人口级别数据科学,我们不能简单地依赖在生成合成数据之前返回子抽样。本文强调了在生成和评估合成人口数据时考虑情境的必要性。

英文摘要

Synthetic data has become a prominent solution for preserving privacy while sharing data, but current empirical risk assessment frameworks fundamentally assume a sample-based context that does not translate to the evaluation of synthetic population-level datasets. This commentary explores the implications of synthesizing entire populations in order to do population-level data science, arguing that traditional metrics, such as Membership Inference Attacks (MIA) and Attribute Inference Attacks (AIA), require re-examination. First, MIA may be rendered irrelevant in contexts where population membership is public knowledge or not considered sensitive information. Second, the risk of singling out is heightened because the confidential data contain full population information. Additionally, the absence of an "out-of-sample" comparison group for attribute inference means we need other policies for defining acceptable inferences. Finally, we cannot rely on simply returning to subsampling prior to generating synthetic data if the use case is truly to enable population-level data science. This commentary highlights the necessity of considering context when generating and evaluating synthetic population data.

2602.13848 2026-06-15 cs.LG stat.ML 版本更新

Testing For Distribution Shifts with Conditional Conformal Test Martingales

基于条件共形检验鞅的分布偏移检测

Shalev Shaer, Yarin Bar, Drew Prinster, Yaniv Romano

发表机构 * Technion - Israel Institute of Technology(以色列理工学院)

AI总结 提出一种顺序检验方法,通过固定参考集避免测试污染,利用稳健鞅构造实现任意有效的I型错误控制和渐近功效1,检测速度优于标准共形检验鞅。

AI中文摘要

我们提出了一种用于检测任意分布偏移的顺序检验方法,该方法允许共形检验鞅(CTM)在固定的参考条件设置下工作。现有的CTM检测器通过不断用每个新样本扩展参考集来构建检验鞅,并以此评估新样本相对于过去观测的异常程度。虽然这种设计能实现任意有效的I型错误控制,但它存在测试污染问题:变化发生后,偏移后的观测进入参考集,稀释了分布偏移的证据,增加了检测延迟并降低了功效。相比之下,我们的方法通过将每个新样本与固定的零假设参考数据集进行比较,从设计上避免了污染。我们的主要技术贡献是一种稳健的鞅构造,该构造在条件于零假设参考数据时仍然有效,通过显式考虑有限参考集引起的参考分布估计误差来实现。这实现了任意有效的I型错误控制,同时保证了渐近功效为1和有界期望检测延迟。实验表明,我们的方法比标准CTM更快地检测到偏移,提供了一种强大且可靠的分布偏移检测器。

英文摘要

We propose a sequential test for detecting arbitrary distribution shifts that allows conformal test martingales (CTMs) to work under a fixed, reference-conditional setting. Existing CTM detectors construct test martingales by continually growing a reference set with each incoming sample, using it to assess how atypical the new sample is relative to past observations. While this design yields anytime-valid type-I error control, it suffers from test-time contamination: after a change, post-shift observations enter the reference set and dilute the evidence for distribution shift, increasing detection delay and reducing power. In contrast, our method avoids contamination by design by comparing each new sample to a fixed null reference dataset. Our main technical contribution is a robust martingale construction that remains valid conditional on the null reference data, achieved by explicitly accounting for the estimation error in the reference distribution induced by the finite reference set. This yields anytime-valid type-I error control together with guarantees of asymptotic power one and bounded expected detection delay. Empirically, our method detects shifts faster than standard CTMs, providing a powerful and reliable distribution-shift detector.
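
The idea of comparing each new sample to a fixed reference set can be sketched as follows. This is a naive plug-in version, written only to show the mechanics: a rank-based p-value against the fixed reference scores, fed into a power betting function whose integral over [0, 1] is 1. Conditional on a finite reference set these p-values are not exactly uniform under the null, which is the issue the abstract says the robust construction addresses; that correction is not implemented here. The function names and the choice `epsilon=0.5` are illustrative assumptions.

```python
import math

def conformal_p_value(reference_scores, new_score):
    """Rank-based p-value: share of scores at least as extreme as new_score."""
    n_at_least = sum(1 for s in reference_scores if s >= new_score)
    return (1 + n_at_least) / (len(reference_scores) + 1)

def log_wealth_path(reference_scores, stream_scores, epsilon=0.5):
    """Running log-wealth of a power-betting process against a FIXED reference set.

    Each step multiplies wealth by f(p) = epsilon * p**(epsilon - 1),
    which rewards small p-values. The reference set is never updated,
    so post-shift samples cannot contaminate it.
    """
    log_wealth = 0.0
    path = []
    for score in stream_scores:
        p = conformal_p_value(reference_scores, score)  # always > 0
        log_wealth += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        path.append(log_wealth)
    return path
```

In the usual test-martingale recipe, an alarm is raised once wealth reaches 1/alpha. In this sketch, a run of scores more extreme than everything in the reference set makes the log-wealth grow steadily, while scores that look typical shrink it.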

11. 数据集、软件与应用 8 篇

2606.14601 2026-06-15 cs.LG cs.SY eess.SY math.OC stat.CO 新提交

A Statistical and Machine Learning Framework for Operational Threshold Detection and Deployable Dispatch Controller Development in Hydrogen Multi-Energy Systems

氢多能系统中运行阈值检测与可部署调度控制器开发的统计与机器学习框架

Shadi Heenatigala, Hasanika Samarasinghe

发表机构 * Antioch College(安提阿学院) The Open University of Sri Lanka(斯里兰卡开放大学)

AI总结 提出统计与机器学习框架,利用一年高分辨率运行数据表征氢多能系统,通过统计分析和随机森林揭示非线性动态,并利用强化学习优化调度。

Comments 17 pages, 12 figures

AI中文摘要

本研究提出了一个统计与机器学习框架,利用一年高分辨率运行数据表征氢基多能系统(H-MES)。统计分析揭示了由可再生能源盈余驱动的二元运行模式,其中太阳辐照度解释了氢气生产中45.7%的基于秩的方差,按常规标准属于大效应。只有高辐照度时期才触发有意义的电解槽参与,而电力需求则产生较弱的反向抑制效应($\epsilon^2 = 0.126$)。多元回归证实电解槽功率是主要的线性预测因子,并存在太阳-风协同交互作用。值得注意的是,随机森林分析将风能输出在预测重要性中排名第一,尽管其双变量相关性较弱(r = 0.167),揭示了参数方法无法发现的非线性动态。一个序列模型利用强24小时自相关性(r = 0.845)进行运行预测,而一个强化学习智能体优化了氢气收益调度。核心贡献在于证明了统计和机器学习方法在H-MES建模与控制中是互补的。

英文摘要

This study presents a statistical and machine learning framework for characterizing a hydrogen-based multi-energy system (H-MES) using one year of high-resolution operational data. Statistical analysis revealed a binary operation driven by renewable surplus, with solar irradiance explaining 45.7% of rank-based variance in hydrogen production, a large effect by conventional standards. Only high-irradiance periods triggered meaningful electrolyzer engagement, while electricity demand exerted a weaker inverse suppression effect ($ε^2 = 0.126$). Multiple regression confirmed electrolyzer power as the dominant linear predictor, with a synergistic solar-wind interaction. Notably, Random Forest analysis ranked wind output first in predictive importance despite its weak bivariate correlation (r = 0.167), revealing non-linear dynamics invisible to parametric methods. A sequence model exploited strong 24-hour autocorrelation (r = 0.845) for operational forecasting, while a reinforcement learning agent optimized hydrogen revenue dispatch. The core contribution is demonstrating that statistical and machine learning approaches are complementary for H-MES modeling and control.

2606.14143 2026-06-15 econ.EM stat.CO 新提交

Forecasting with Bayesian Panel Vector Autoregressions Using the R Package bpvars

使用R包bpvars进行贝叶斯面板向量自回归预测

Miguel Sanchez-Martinez, Tomasz Woźniak

AI总结 提出bpvars R包,通过贝叶斯层次面板VAR模型和缺失观测处理方法,实现对动态面板数据的高效预测与评估。

AI中文摘要

R包bpvars旨在预测189个国家的就业、失业和劳动力参与率。然而,由于其建模框架的灵活性和稳健的编码,它通常适用于动态面板数据。它包括一系列贝叶斯层次面板向量自回归(VAR)模型,其特点是:(i) 国家特定的VAR模型,(ii) 其参数的先验分布以全局对应参数为中心,(iii) 具有灵活的多级层次先验分布,(iv) 包含文献中公认的基准选择的多种变体,以及(v) 四种替代规范,包括对国家特定或全局参数进行分组。一个显著的特征是基于模型一致的贝叶斯方法实现缺失观测处理。这些模型伴随贝叶斯预测,提供了广泛可能的规范,旨在提高预测精度并符合各种报告标准。我们还实现了伪样本外递归预测,以评估点预测和密度预测的性能。该包实现了模型规范、估计和预测例程,促进了简单的工作流程和可重复性,包括估计和预测结果的总结和可视化。由于采用了前沿的计量经济学和数值技术以及用C++编写的算法,它实现了非凡的计算速度。

英文摘要

The R package bpvars was designed to forecast employment, unemployment, and labour market participation rates of 189 countries. However, it is generally applicable to dynamic panel data due to the flexibility of its modelling framework and robust coding. It includes a family of Bayesian hierarchical panel Vector Autoregressions (VARs) that are characterised by: (i) country-specific VAR models, (ii) priors for their parameters centred around their global counterparts, (iii) flexible multi-level hierarchical prior distributions, (iv) many variants of benchmark choices well established in the literature, and (v) four alternative specifications, including grouping of country-specific or global parameters. A distinguishing feature is its implementation of missing observation treatment based on a model-coherent Bayesian approach. These models are accompanied by Bayesian prediction, offering a wide range of possible specifications that aim to increase forecasting precision and comply with various reporting standards. We also implement pseudo-out-of-sample recursive forecasting for evaluating point and density forecast performance. The package implements model specification, estimation, and forecasting routines, facilitating simple workflows and reproducibility, including summaries and visualisations of estimation and forecasting results. It achieves extraordinary computational speed thanks to the employment of frontier econometric and numerical techniques, as well as algorithms written in C++.

2606.14111 2026-06-15 physics.bio-ph q-bio.BM stat.ML 新提交

Temperature transferable Machine Learned Coarse Grained model for proteins

温度可迁移的机器学习粗粒化蛋白质模型

Jacopo Venturin, Cecilia Clementi

AI总结 提出一种热力学感知的温度可迁移MLCG框架,将粗粒化势能分解为能量和熵成分,通过精确热力学关系实现跨温度外推,在Chignolin蛋白上验证了温度依赖性的准确复现。

AI中文摘要

粗粒化(CG)分子模拟为研究大型复杂生物系统提供了一种比全原子分子动力学更高效的替代方案。通过引入机器学习粗粒化(MLCG)模型,CG模拟的准确性得到了显著提升。然而,这些模型通常设计用于单一热力学点,缺乏温度可迁移性,无法用于预测温度依赖的量(如热容)。本文提出了一种热力学感知、温度可迁移的蛋白质MLCG框架,该框架明确地将粗粒化平均力势(PMF)分解为能量和熵成分。模型架构强制执行PMF能量与熵成分之间的精确热力学关系,并保证跨温度区间的物理一致外推和内插。我们在一个广泛的数据集上验证了该框架,该数据集涵盖了Chignolin蛋白在300 K至400 K之间五个温度下总计250微秒的分子动力学模拟,结果表明它能够复现参考全原子自由能面的温度依赖性,纠正了不感知温度的基线。此外,我们展示了可以应用一种廉价的、事后温度依赖性校正,无需重新训练MLCG势,即可准确恢复不同温度下的全原子热容。总体而言,这项工作为复杂生物分子系统的热力学可迁移MLCG模拟提供了一条物理基础路径。

英文摘要

Coarse-grained (CG) molecular simulations offer an efficient alternative to atomistic molecular dynamics to study large and complex biological systems. The accuracy of CG simulations has been increased dramatically by the introduction of machine-learned coarse-grained (MLCG) models. However, these models are typically designed to be used at a single thermodynamic point, lack temperature transferability, and cannot be used to predict temperature-dependent quantities like the heat capacity. Here we introduce a thermodynamically informed, temperature-transferable MLCG framework for proteins that explicitly decomposes the CG potential of mean force (PMF) into its energetic and entropic components. The model architecture enforces an exact thermodynamic relation between the energetic and entropic components of the PMF and guarantees physically consistent extrapolation and interpolation across temperature regimes. We validate this framework on an extensive dataset spanning a total of 250 $μ$s of molecular dynamics simulations across five temperatures between 300 K and 400 K for the Chignolin protein, and demonstrate that it reproduces the temperature dependency of the reference atomistic free energy surfaces, correcting the temperature-unaware baselines. Furthermore, we show that it is possible to apply an inexpensive, post-hoc temperature-dependent correction that does not require retraining the MLCG potential, accurately recovering the atomistic heat capacity at different temperatures. Overall, this work provides a physically grounded pathway toward thermodynamically transferable MLCG simulations of complex biomolecular systems.

2606.13742 2026-06-15 cs.LG cs.AI physics.comp-ph physics.flu-dyn stat.ML 新提交

A fully GPU-based workflow for building physics emulators of hypersonic flows

基于全GPU工作流构建高超声速流物理仿真器

Fabian Paischer, Dylan Rubini, Deniz A. Bezgin, Aaron B. Buhendwa, David Hauser, Florian Sestak, Johannes Brandstetter, Sebastian Kaltenbach, Nikolaus A. Adams

发表机构 * TU Munich(慕尼黑工业大学) Institute for Machine Learning, JKU Linz(林茨约翰·开普勒大学机器学习研究所) ELLIS Unit(ELLIS单元) EMMI AI

AI总结 提出全GPU工作流,集成加速数据生成与不确定性量化增强的神经仿真器训练,通过可微求解器JAX-Fluids实现残差驱动改进,提升物理一致性并支持外推。

Comments First authors contributed equally

AI中文摘要

以高保真度和低计算成本解析复杂物理现象的能力是解决现代工程关键挑战的核心。一个典型例子是高超声速流,其中精确预测全流场拓扑,特别是激波位置和强度,至关重要。然而,超声速和高超声速流仍然是传统降阶模型和神经仿真器的绊脚石,这些模型难以在工业相关应用中物理一致地捕捉流态中的陡峭梯度。为此,我们引入了一个完全基于GPU的工作流,该工作流将加速数据生成与通过不确定性量化和物理感知细化增强的神经仿真器训练相结合。我们的工作流由可微高保真求解器(JAX-Fluids)实现,我们利用该求解器进行快速数据集创建和基于残差的神经仿真器改进,以增强物理一致性。在此框架基础上,我们首先提出了一系列模型架构,并分析了它们的缩放行为以揭示其优缺点。然后,我们表明基于残差的细化使得能够在仅提供网格和输入参数的情况下进行训练,显著降低残差并提高物理一致性。可微仿真和基于残差的细化共同产生了在其训练分布之外仍然可靠的物理仿真器,这是在现实工程设计循环中部署代理的关键要求。

英文摘要

The ability to resolve complex physical phenomena with high fidelity and at low computational cost is central to addressing key challenges in modern engineering. A prime example lies in hypersonic flows, where the precise prediction of the full flowfield topology, in particular with respect to shock wave location and intensity, is critical. Yet supersonic and hypersonic flows continue to be a stumbling block for traditional reduced-order models and neural emulators that struggle to capture steep gradients in flow states with physical consistency in applications of industrial relevance. To that end, we introduce a fully GPU-based workflow that integrates accelerated data generation with the training of neural emulators augmented by uncertainty quantification and physics-aware refinement. Our workflow is enabled by a differentiable high-fidelity solver (JAX-Fluids), which we employ for rapid dataset creation and residual-based improvement of the neural emulator to enhance physical consistency. Building on this framework, we first present a suite of model architectures and analyze their scaling behavior to expose their strengths and shortcomings. We then show that residual-based refinement enables training on cases where only mesh and input parameters are available, substantially reducing residuals and improving physical consistency. Together, differentiable simulation and residual-based refinement yield physics emulators that remain reliable beyond their training distribution, a key requirement for deploying surrogates in real-world engineering design loops.

2606.08660 2026-06-15 stat.AP stat.ME stat.OT 新提交

Active Learning with Bayesian Reasoning: A POGIL-Based Pedagogy in Introductory Statistics

基于贝叶斯推理的主动学习:入门统计学中的POGIL教学法

Cheng-Han Yu, Angela Ebeling

AI总结 本文介绍一种面向过程的引导探究学习(POGIL)活动,用于在入门统计学中通过条件概率、贝叶斯定理和信念更新教授贝叶斯推理,并通过准实验比较POGIL与讲授式教学的效果,发现两者在考试表现和满意度上无显著差异。

AI中文摘要

我们介绍了一种面向过程的引导探究学习(POGIL)风格的活动,用于在入门统计学中通过条件概率、贝叶斯定理和信念更新教授贝叶斯推理。该活动自成一体,使用可手工计算的概率,以双向表组织,并让学生参与结构化的团队角色。我们在一个本科入门统计学课程的四个部分中评估了该活动,采用准实验比较了POGIL风格和讲授式教学在贝叶斯定理单元的效果。结果包括学生在贝叶斯定理期末考试问题上的表现以及对教学的满意度。我们使用贝叶斯双变量广义线性模型比较了两种方法,同时考虑了专业类型、性别和种族。结果表明,不同教学风格和人口统计组之间的考试表现相似,高满意度的概率也相似,但存在相当大的不确定性,没有明确证据表明存在有意义的差异。这些发现表明,POGIL风格的活动在该单元中的表现与讲授式教学相当,同时提供了一种主动且课堂就绪的方式来介绍贝叶斯推理,无需困难的计算或模拟。我们提供了可调整的教学材料和一个可复现的贝叶斯分析框架,用于评估入门统计学中的主动学习创新。我们的研究支持在入门课程中可行地纳入贝叶斯推理,并可能帮助考虑主动学习的教师。

英文摘要

We introduce a Process Oriented Guided Inquiry Learning (POGIL)-style activity for teaching Bayesian reasoning in introductory statistics through conditional probability, Bayes' theorem, and belief updating. The activity is self-contained, uses hand-computable probabilities organized in two-way tables, and engages students in structured team roles. We evaluated the activity in four sections of an undergraduate introductory statistics course using a quasi-experimental comparison of POGIL-style and lecture-based instruction for a Bayes' theorem unit. Outcomes included student performance on Bayes' theorem final exam questions and satisfaction with instruction. We used a Bayesian bivariate generalized linear model to compare the two approaches while accounting for major type, gender, and race. The results indicated similar exam performance and similar probabilities of high satisfaction across instructional styles and demographic groups, with considerable uncertainty and no clear evidence of meaningful differences. These findings suggest that the POGIL-style activity performed comparably to lecture-based instruction for this unit while offering an active and classroom-ready way to introduce Bayesian reasoning without requiring difficult computation or simulation. We provide adaptable instructional materials and a reproducible Bayesian analytic framework for evaluating active learning innovations in introductory statistics. Our study supports the feasible inclusion of Bayesian reasoning in introductory courses and may help instructors considering active learning.

2604.23792 2026-06-15 astro-ph.IM astro-ph.CO stat.AP 版本更新

Beyond the Final Label: Exploiting the Untapped Potential of Classification Histories in Astronomical Light Curve Analysis

超越最终标签:利用天文光变曲线分类历史中未开发的潜力

Zhuoyang Zhou, Alex I. Malz, Chad M. Schafer, Konstantin Malanchev, Guillermo Cabrera-Vives, Christopher Hernández

AI总结 提出利用分类历史及其时间演化增强光变曲线分类的框架,结合循环神经网络与注意力机制提升精度,并引入基于Wasserstein距离的新评估指标。

Comments 27 pages, 10 figures; accepted for publication in the Astrophysical Journal

AI中文摘要

维拉·C·鲁宾天文台的遗产时空巡天(LSST)将产生大量瞬变和变天文物体的通量时间序列(光变曲线)。每次新的通量观测时,光变曲线分类器需要生成候选类别的更新概率分布,这些分布将与全球社区共享,用于识别后续观测的有趣目标以及时间敏感性较低的分析应用。利用扩展LSST天文时间序列分类挑战(ELAsTiCC)中参与分类器的合成光变曲线和分类结果,我们研究了一种新颖框架,通过整合分类历史及其时间演化来增强现有光变曲线分类。为了展示该方法的潜力,我们引入了一个结合循环神经网络和加性注意力模块的模型,与挑战中的现有分类器相比,该模型显示出改进的分类精度和更平衡的精确率-召回率性能。此外,目前大多数(如果不是全部)现有分类器都是通过完整光变曲线上的最终分类结果来评估的;我们提出了新指标,通过考虑时间演化的分类概率分布之间的Wasserstein距离,评估分类器在使用有限数据时的稳定性、准确性和早期分类性能。我们的指标通过补充混淆矩阵和精确率-召回率等经典方法,为模型评估提供了更全面的视角。

英文摘要

The Legacy Survey of Space and Time (LSST) on the Vera C. Rubin Observatory will generate a massive collection of time series (light curves) of the measured flux of transient and variable astronomical objects. With each new flux observation, light curve classifiers need to generate updated probability distributions over candidate classes, which will then be shared with the global community for the purpose of identifying interesting targets for follow-up observations as well as less time-sensitive analysis applications. Using the synthetic light curves and classification results of participating classifiers from the Extended LSST Astronomical Time-series Classification Challenge (ELAsTiCC), we investigate a novel framework to enhance existing light curve classifications by incorporating their classification histories and the temporal evolution of these histories. To demonstrate the potential of this approach, we introduce a model that combines a recurrent neural network and an additive attention module, which shows improved classification accuracy and more balanced precision-recall performance compared to existing classifiers from the challenge. Furthermore, at this stage, most, if not all, of the existing classifiers are evaluated by their final classification results on complete light curves; we propose new metrics that evaluate the stability, accuracy, and early classification performance of a classifier's predictions when using limited data by considering the Wasserstein distance between the temporally evolving classification probability distributions. Our metrics offer a more comprehensive perspective for model assessment by supplementing classical methods such as the confusion matrix and precision-recall.

2411.10482 2026-06-15 cs.HC stat.AP 版本更新

The Noisy Work of Uncertainty Visualisation Research: A Review

不确定性可视化研究的嘈杂工作:综述

Harriet Mason, Dianne Cook, Sarah Goodwin, Emi Tanaka, Susan VanderPlas

AI总结 本文综述了数据可视化中不确定性表示的研究现状,指出定义不清导致结果冲突,并提供了可行的定义和示例,旨在指导图形方法和实验研究。

Comments 48 pages with 5 figures, condensed down for journal submission. Submitted to Annual Reviews of Statistics and Its Applications. Fixed mistake in author affiliations

AI中文摘要

更好地表示数据可视化中的不确定性是近期研究活动的焦点。当前文献的一个问题是,关于不确定性的定义及其在图表中表示的含义缺乏清晰性。这种混淆导致文献中存在大量相互矛盾的结果,尤其是在评估不同不确定性表示有效性的实验中。在本综述中,我们总结了当前文献,提供了可行的定义,并通过示例说明这些定义。在此过程中,我们探讨了在统计图形中实现透明度真正需要什么。希望这将有助于指导新的图形方法和实验研究。

英文摘要

Better representation of the uncertainty in a data visualisation is a focus of recent research activity. A problem with the current literature is that there is a lack of clarity about the definition of uncertainty and what it means to represent it in a plot. This confusion results in a significant amount of conflicting results in the literature, especially in experiments that assess the effectiveness of different uncertainty representations. In this review, we summarise the current literature, provide workable definitions, and illustrate these definitions with examples. In doing so, we ask what it really takes to achieve transparency in statistical graphics. It is hoped that it will be useful for guiding new graphics methodology and experimental research.

2512.13069 2026-06-15 cs.LG physics.flu-dyn stat.ML 版本更新

Multi-fidelity aerodynamic data fusion by autoencoder transfer learning

基于自编码器迁移学习的多保真度气动数据融合

Javier Nieto-Centenero, Esther Andrés, Rodrigo Castellanos

发表机构 * Department of Aerospace Engineering, UC3M(航空航天工程系,UC3M) Theoretical and Computational Aerodynamics Group, Flight Physics Department, INTA(理论与计算空气动力学组,飞行物理部门,INTA)

AI总结 提出结合自编码器迁移学习与多分裂保形预测的多保真度深度学习框架,利用低保真数据学习潜在物理表示,微调解码器以极少量高保真数据实现高精度气动压力预测,并生成超过95%点覆盖的不确定度带。

Comments 27 pages, 13 figures

详情
AI中文摘要

准确的气动预测通常依赖于高保真度模拟;然而,其高昂的计算成本严重限制了其在数据驱动建模中的适用性。这一局限性促使了多保真度策略的发展,该策略利用廉价的低保真度信息而不牺牲准确性。针对这一挑战,本文提出了一种多保真度深度学习框架,该框架将基于自编码器的迁移学习与新开发的多分裂保形预测(MSCP)策略相结合,以在极端数据稀缺条件下实现具有不确定度感知的气动数据融合。该方法利用丰富的低保真度(LF)数据学习紧凑的潜在物理表示,该表示作为冻结的知识库,随后使用稀缺的高保真度(HF)样本对解码器进行微调。在NACA翼型(二维)和跨声速机翼(三维)数据库的表面压力分布测试中,该模型成功修正了LF偏差,并使用最少的HF训练数据实现了高精度的压力预测。此外,MSCP框架生成了稳健且可操作的不确定度带,点覆盖超过95%。通过将极端数据效率与不确定度量化相结合,本文为数据稀缺环境下的气动回归提供了一种可扩展且可靠的解决方案。

英文摘要

Accurate aerodynamic prediction often relies on high-fidelity simulations; however, their prohibitive computational costs severely limit their applicability in data-driven modeling. This limitation motivates the development of multi-fidelity strategies that leverage inexpensive low-fidelity information without compromising accuracy. Addressing this challenge, this work presents a multi-fidelity deep learning framework that combines autoencoder-based transfer learning with a newly developed Multi-Split Conformal Prediction (MSCP) strategy to achieve uncertainty-aware aerodynamic data fusion under extreme data scarcity. The methodology leverages abundant Low-Fidelity (LF) data to learn a compact latent physics representation, which acts as a frozen knowledge base for a decoder that is subsequently fine-tuned using scarce High-Fidelity (HF) samples. Tested on surface-pressure distributions from NACA airfoil (2D) and transonic wing (3D) databases, the model successfully corrects LF deviations and achieves high-accuracy pressure predictions using minimal HF training data. Furthermore, the MSCP framework produces robust, actionable uncertainty bands with pointwise coverage exceeding 95%. By combining extreme data efficiency with uncertainty quantification, this work offers a scalable and reliable solution for aerodynamic regression in data-scarce environments.
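
For readers unfamiliar with conformal prediction, the sketch below shows ordinary single-split conformal prediction, which turns held-out residuals into a symmetric uncertainty band. It is not the MSCP strategy of the paper, whose details are not given in the abstract; it only illustrates the basic mechanism that split-based conformal methods build on. The function names and the calibration residuals are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def conformal_halfwidth(cal_residuals: Sequence[float], alpha: float = 0.05) -> float:
    """Half-width q such that [pred - q, pred + q] has marginal coverage of at
    least 1 - alpha, assuming calibration and test points are exchangeable."""
    n = len(cal_residuals)
    # Rank of the finite-sample-corrected quantile of the absolute residuals.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested level: the band is unbounded.
        return math.inf
    return sorted(abs(r) for r in cal_residuals)[k - 1]


def prediction_band(predictions: Sequence[float], halfwidth: float) -> List[Tuple[float, float]]:
    """Attach the same symmetric band to every point prediction."""
    return [(p - halfwidth, p + halfwidth) for p in predictions]


# Hypothetical residuals (true minus predicted pressure coefficient) on a calibration set.
cal_residuals = [0.02, -0.05, 0.01, 0.04, -0.03, 0.06, -0.01, 0.02, 0.03, -0.02]
q = conformal_halfwidth(cal_residuals, alpha=0.2)
band = prediction_band([-0.8, -0.4, 0.1], q)
print(q, band)
```

With very few high-fidelity samples the rank `k` can exceed the calibration size, which is one reason data-scarce settings need more careful splitting schemes than this single split.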

12. 其他/综合统计 16 篇

2606.14687 2026-06-15 math.PR math.OA math.OC math.ST stat.TH 新提交

Lehner's operator norm formulas, semidefinite programming, and spiked matrix models

Lehner的算子范数公式、半定规划与尖峰矩阵模型

Dmitriy Kunisky

AI总结 将Lehner的算子范数公式转化为半定规划,应用于尖峰矩阵模型,证明BBP相变并研究特征向量波动。

Comments 51 pages, 2 figures

详情
AI中文摘要

Lehner (1999) 推导了形如 $\mathfrak{X} = \mathbf{A}_0 \otimes \mathfrak{1} + \sum_{i = 1}^n \mathbf{A}_i \otimes \mathfrak{m}_i$ 的算子的算子范数 $\|\mathfrak{X}\|$ 的优雅公式,并容易推广到谱边 $\lambda_{\max}(\mathfrak{X})$,这些公式涉及正定矩阵上的非线性优化问题。这里 $\mathbf{A}_i$ 是有限维厄米矩阵,$\mathfrak{m}_i$ 是自由半圆或自由Rademacher算子族,$\mathfrak{1}$ 是恒等算子。我们首先证明Lehner的两个非线性优化都可以重写为线性半定规划(SDP),即使在Rademacher情形下Lehner的优化本身不是凸的。我们给出了这些SDP的原始和对偶形式,推导了互补松弛条件及其推论,并建议SDP比Lehner原始工作中提出的迭代数值方案更稳定和准确。然后我们将半圆情形下的SDP应用于尖峰矩阵模型,该模型最近由Bandeira, Cipolloni, Schröder和van Handel (2024) 通过Lehner公式研究。我们通过为相关的原始和对偶SDP构造可行变量,给出了他们在具有各向同性(但可能相关)高斯噪声的模型中建立的Baik-Ben Arous-Péché (BBP) 相变的新证明。结合我们的构造与最优对偶变量的敏感性解释,我们研究了此类模型的主特征向量的波动。我们推测并给出数值证据表明这些波动是高斯但各向异性且非普适的,并且它们的协方差可以通过Lehner公式的对偶优化器来计算,而该优化器近似于与噪声模型协方差相关的完全正算子的主特征矩阵。

英文摘要

Lehner (1999) derived elegant formulas for the operator norm $\|\mathfrak{X}\|$ of operators of the form $\mathfrak{X} = \mathbf{A}_0 \otimes \mathfrak{1} + \sum_{i = 1}^n \mathbf{A}_i \otimes \mathfrak{m}_i$, also easily generalized to the spectral edge $λ_{\max}(\mathfrak{X})$, in terms of nonlinear optimization problems over positive definite matrices. Here the $\mathbf{A}_i$ are finite-dimensional Hermitian matrices, the $\mathfrak{m}_i$ are either free semicircular or free Rademacher families of operators, and $\mathfrak{1}$ is the identity operator. We first show that both of Lehner's nonlinear optimizations can be rewritten as linear semidefinite programs (SDPs), even in the Rademacher case where Lehner's optimization is not itself convex. We give the primal and dual forms of these SDPs, derive the complementary slackness relations and consequences thereof, and propose that the SDPs are more stable and accurate than the iterative numerical scheme proposed in Lehner's original work. We then apply the SDPs from the semicircular case to spiked matrix models, studied recently via Lehner's formula by Bandeira, Cipolloni, Schröder, and van Handel (2024). We give a new proof of the Baik--Ben Arous--Péché (BBP) transition they establish in models with isotropic (but possibly correlated) Gaussian noise by constructing feasible variables for the associated primal and dual SDPs. Combining our construction with a sensitivity interpretation of optimal dual variables, we study the fluctuations of leading eigenvectors of such models. We conjecture and give numerical evidence that these fluctuations are Gaussian but anisotropic and non-universal, and that their covariance may be computed in terms of the optimizer of the dual of Lehner's formula, which in turn is approximately the leading eigenmatrix of a completely positive operator associated to the covariance of the noise model.

2606.14514 2026-06-15 math.ST stat.TH 新提交

Nonparametric inference on Fokker-Planck and McKean-Vlasov models

Fokker-Planck和McKean-Vlasov模型的非参数推断

Adriana Laurindo Monteiro, Roberto Imbuzeiro Oliveira

AI总结 提出基于核的速度场估计器,用于d维相互作用粒子的输运和扩散,建立均方误差率为h^2 + N^{-2/(d+2)}的相合性,涵盖Fokker-Planck和McKean-Vlasov两种设定。

详情
AI中文摘要

我们提出了一种基于核的估计器,用于控制$d$维相互作用粒子输运和扩散的速度场。假设初始位置独立同分布,服从分布$\mu_0$,我们建立了估计量的相合性,其显式均方误差率为$h^2 + N^{-2/(d+2)}$,其中$h$表示时间离散化步长,$N$表示粒子数。该分析涵盖两种不同的设定:Fokker-Planck方程,其中我们恢复潜在势函数;以及McKean-Vlasov方程,其中我们反卷积驱动平均场动力学的相互作用核。

英文摘要

We propose a kernel-based estimator of the velocity field governing the transport and diffusion of $d$-dimensional interacting particles. Assuming the initial positions are i.i.d. with law $μ_0$, we establish consistency of the estimator with an explicit mean-squared error rate of order $h^2 + N^{-2/(d+2)}$, where $h$ denotes the time-discretization step and $N$ the number of particles. The analysis covers two distinct settings: the Fokker-Planck equation, where we recover the underlying potential function, and the McKean-Vlasov equation, where we deconvolve the interaction kernel driving the mean-field dynamics.
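
Reading the stated rate directly, and assuming nothing beyond it, the two error sources can be balanced as follows. The discretization term $h^2$ is no larger than the statistical term $N^{-2/(d+2)}$ exactly when \[ h \le N^{-1/(d+2)}, \] in which case the mean-squared error is of order $N^{-2/(d+2)}$. As an illustrative choice (not taken from the paper), $d = 2$ and $N = 10^4$ give $N^{-2/(d+2)} = 10^{-2}$, so any step $h \le 0.1$ keeps the time discretization from dominating.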

2606.14450 2026-06-15 math.ST math.PR stat.TH 新提交

Universality for Products of Random Matrices with i.i.d. Entries and the Fuss--Catalan Number

具有独立同分布元素随机矩阵乘积的普适性与Fuss-Catalan数

Yanjin Xiang, Kun Chen, Zhihua Zhang

AI总结 研究独立同分布元素随机矩阵乘积的算子范数极限,证明其几乎必然收敛到σ^k γ_k,其中γ_k为k阶自由系数,并揭示Fuss-Catalan数在矩估计中的关键作用。

Comments 34 pages

详情
AI中文摘要

设\((w_{ij})_{i,j\ge1}\)是一个由独立同分布实值或复值条目组成的无限阵列,均值为零,方差为\(\sigma^2\),且具有有限四阶矩。定义\(W_n=(w_{ij})_{1\le i,j\le n}\)和\(X_n=n^{-1/2}W_n\)。对于每个固定的\(k\ge1\),我们确定了由该族构建的若干固定乘积的几乎必然极限算子范数。定义第\(k\)个自由系数为\[ \gamma_k:=\sqrt{\frac{(k+1)^{k+1}}{k^k}}. \] 那么我们证明\[ \|X_n^k\|\to\sigma^k\gamma_k \qquad \text{几乎必然}. \] 对于从任意固定有限个\(X_n\)的独立副本池中有放回抽取的乘积,同样的极限成立;特别地,它适用于\(k\)个独立副本的乘积。因此,自由系数捕捉了在有限四阶矩假设下大随机矩阵之间的非交换特性。从尺度\(\sigma^k(k{+}1)\)到\(\sigma^k \sqrt{k{+}1}\)的经典Bai-Yin型幂估计的改进是我们结果的直接推论。主要技术挑战是通过\(\mathbb{E}\operatorname{Tr}((X_n^kX_n^{*k})^m)\)的高阶矩展开来证明上界。主要的零缺陷迹词是树状的,并由Fuss-Catalan数\[ F_{k,m}= \frac1{km+1}\binom{(k+1)m}{m} \] 计数。该组合工具有助于设计缺陷敏感的全局枚举:如果\(L=km\)且\[ r=(L+1-v)+(L-q), \] 那么具有缺陷\(r\)的可容许词类的数量最多为\(F_{k,m}(Cm)^{Dr}\)。这种\(m\)的多项式损失(次数与缺陷成正比)在对数矩范围内是可和的。

英文摘要

Let \((w_{ij})_{i,j\ge1}\) be a single infinite array of independent identically distributed real- or complex-valued entries of mean zero, variance \(σ^2\), and finite fourth moment. Set \(W_n=(w_{ij})_{1\le i,j\le n}\) and \(X_n=n^{-1/2}W_n\). For every fixed \(k\ge1\), we identify the almost sure limiting operator norm of several fixed products built from this family. Define the \(k\)-th freeness coefficient by \[ γ_k:=\sqrt{\frac{(k+1)^{k+1}}{k^k}}. \] Then we prove \[ \|X_n^k\|\to σ^kγ_k \qquad \text{almost surely}. \] The same limit holds for products sampled with replacement from any fixed finite pool of independent copies of \(X_n\); in particular, it holds for the product of \(k\) independent copies. Thus, the freeness coefficient captures the non-commuting characteristic between large random matrices under the finite fourth moment assumption. The improvement of the classical Bai--Yin-type power estimate from the scale \(σ^k(k{+}1)\) to \(σ^k \sqrt{k{+}1}\) is a direct corollary of our result. The main technical challenge is to prove the upper bound using a high-moment expansion of \(\mathbb{E}\operatorname{Tr}((X_n^kX_n^{*k})^m)\). The leading zero-defect trace words are tree-like and are counted by the Fuss--Catalan number \[ F_{k,m}= \frac1{km+1}\binom{(k+1)m}{m}. \] The combinatorial tool helps to devise a defect-sensitive global enumeration: if \(L=km\) and \[ r=(L+1-v)+(L-q), \] then the number of admissible word classes with defect \(r\) is at most \(F_{k,m}(Cm)^{Dr}\). This polynomial-in-\(m\) loss, with degree proportional to the defect, is summable in the logarithmic moment range.
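
The two formulas in the abstract can be checked against each other numerically. By Stirling's approximation, the Fuss--Catalan numbers grow like \(((k+1)^{k+1}/k^k)^m\) up to polynomial factors, which is the square of the freeness coefficient raised to the power \(m\), consistent with a leading moment count governing the norm limit. The sketch below is only a consistency check on the stated formulas, not part of the authors' proof; the function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def freeness_coefficient(k: int) -> float:
    """gamma_k = sqrt((k+1)^(k+1) / k^k), as defined in the abstract."""
    return math.sqrt((k + 1) ** (k + 1) / k ** k)


def fuss_catalan(k: int, m: int) -> int:
    """F_{k,m} = binom((k+1)m, m) / (km + 1); the division is always exact."""
    return math.comb((k + 1) * m, m) // (k * m + 1)


def growth_rate_estimate(k: int, m: int) -> float:
    """F_{k,m}^(1/m), computed through logarithms to avoid float overflow."""
    return math.exp(math.log(fuss_catalan(k, m)) / m)


# For k = 1 the Fuss--Catalan numbers are the Catalan numbers and gamma_1 = 2.
for k in (1, 2, 3):
    print(k, freeness_coefficient(k) ** 2, growth_rate_estimate(k, 2000))
```

The printed pairs approach each other slowly as `m` grows, because of the polynomial factor in the asymptotics of \(F_{k,m}\).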

2606.14335 2026-06-15 math.ST cs.IT cs.LG math.IT stat.TH 新提交

Recovery thresholds for hidden weighted sparse graphs

隐藏加权稀疏图的恢复阈值

Zhe Hou, Jingcheng Liu

发表机构 * State Key Laboratory for Novel Software Technology(新型软件技术国家重点实验室)

AI总结 研究从带噪加权完全图中恢复隐藏图的阈值,基于Rényi散度与Erdős-Rényi随机图的第一矩阈值建立统一刻画,并扩展到部分恢复和全有或全无现象。

Comments 34 pages, 4 figures

详情
AI中文摘要

从含噪高维数据中恢复结构信息是统计推断的基本任务。我们研究隐藏在随机加权完全图中的图的恢复阈值。具体地,未知图 $H^* \in H_n$ 均匀随机选取,并隐藏在 $n$ 个顶点的完全图中:边 $e \in H$ 的权重独立地服从分布 $P_n$;否则权重独立地服从分布 $Q_n$。目标是从这些边权重中恢复几乎全部的 $H$。假设分布 $P_n$ 和 $Q_n$ 之间的Rényi散度满足局部Lipschitz条件,且图族 $H_n$ 满足温和的密度条件,我们给出了恢复几乎全部 $H$(也称为几乎精确恢复)的信息论极限的统一刻画。该刻画将 $P_n$ 和 $Q_n$ 之间的KL散度与 $H$ 在Erdős-Rényi随机图模型 $G(n,p)$ 中的第一矩阈值的对数联系起来。我们的下界也扩展到部分恢复任务,其中只需恢复 $H$ 的常数 $\lambda$ 比例。最后但同样重要的是,对于某些伯努利和指数分布以及高斯分布,我们能够在指数尺度上展示全有或全无(AoN)阈值现象。

英文摘要

Recovering structural information from noisy high-dimensional data is a fundamental task in statistical inference. We investigate the recovery thresholds for a graph hidden in a randomly weighted complete graph. Specifically, an unknown graph $H^* \in H_n$ is chosen uniformly at random, and hidden in a complete graph of $n$ vertices as follows: the weight of an edge $e \in H$ is distributed independently according to $P_n$; otherwise the weight is distributed independently according to $Q_n$. The goal is to recover almost all of $H$ from these edge weights. Assuming a local Lipschitzness of the Rényi divergence between distributions $P_n$ and $Q_n$, and a mild density condition for the graphs $H_n$, we give a unified characterization of the information-theoretic limit for recovering almost all of $H$ (also known as almost exact recovery). Our characterization connects the KL divergence between $P_n$ and $Q_n$ to the logarithm of the first moment threshold of $H$ in the Erdős-Rényi random graph model $G(n,p)$. Our lower bound also extends to the task of partial recovery, in which only a constant $λ$-fraction of $H$ needs to be recovered. Last but not least, for certain Bernoulli and Exponential regimes, and for Gaussian distributions, we are able to show an All-or-Nothing (AoN) threshold phenomenon at the exponential scale.

2606.14087 2026-06-15 math.ST stat.TH 新提交

Confidence Bands for the Gradient Lines of a Density Function

密度函数梯度线的置信带

Ery Arias-Castro, Wanli Qiao

AI总结 针对密度函数从给定点出发的梯度上升线估计问题,提出基于核密度估计的插件估计量,建立弱收敛性,并利用该结果构造置信区域(含Bootstrap方法)。

详情
AI中文摘要

我们考虑从给定点出发的密度梯度上升线的估计问题。超越简单的一致性,我们为基于核密度估计的插件估计量建立了弱收敛结果。然后利用该结果构造梯度上升线的置信区域,包括通过Bootstrap方法。

英文摘要

We consider the problem of estimating the gradient ascent line of a density originating at a given point. Going beyond mere consistency, we establish a weak convergence result for a plugin estimator based on a kernel density estimator of the density. We then leverage that result to construct a confidence region for the gradient ascent line, including by bootstrap.

2606.13702 2026-06-15 eess.SP math-ph math.MP math.ST physics.data-an stat.TH 新提交

Uniform Asymptotics of the Pseudo Wigner-Ville Distribution for Nonlinear Chirps

非线性啁啾的伪Wigner-Ville分布的均匀渐近分析

Vincenzo Pierro, Maurizio Feo, Maurizio Ricciardi, Rocco P. Croce, Theo Demma, Innocenzo M. Pinto, Paolo Addesso

AI总结 针对非线性啁啾信号,利用振荡积分理论推导了伪Wigner-Ville分布的均匀渐近展开,以统一描述瞬时频率焦散处的过渡行为,并应用于引力波和雷达信号分析。

详情
AI中文摘要

复杂物理系统中非平稳信号的分析通常依赖于时频分布。其中,伪Wigner-Ville分布(PWVD)因其优越的分辨率而脱颖而出,但由于其固有的二次非线性,在数学上具有挑战性。这种非线性在相空间中产生复杂的干涉伪影和交叉项,可能掩盖信号的物理特征,特别是对于非线性啁啾。在这项工作中,我们为一般加窗非线性啁啾的PWVD建立了一个数学基础框架。通过利用具有合并驻点的振荡积分理论,我们推导了一个均匀渐近展开,弥合了启发式信号处理与半经典几何方法(Berry弦构造)之间的差距。得到的闭式表示用对称不完全Airy函数表示,提供了非线性变换行为的统一描述,正则化了瞬时频率焦散处的过渡。虽然该框架是通用的,但我们通过两个示例展示了其威力:引力波天文学中合并双星的高精度非线性啁啾和用于脉冲压缩应用的雷达非线性啁啾。分析结果成功预测了干涉图案的结构,并量化了基于峰值频率估计的系统偏差。因此,本研究建立了非线性数学分析与精密实验物理之间的系统桥梁,验证了PWVD作为高信噪比下详细源表征的鲁棒工具。

英文摘要

The analysis of nonstationary signals in complex physical systems often relies on time-frequency distributions. Among these, the Pseudo Wigner-Ville Distribution (PWVD) stands out for its superior resolution but is mathematically challenging due to its inherent quadratic nonlinearity. This nonlinearity generates complex interference artifacts and cross terms in the phase space, potentially obscuring the physical features of the signal, particularly for nonlinear chirps. In this work, we establish a mathematically grounded framework for the PWVD of general windowed nonlinear chirps. By leveraging the theory of oscillatory integrals with coalescing stationary points, we derive a uniform asymptotic expansion that bridges the gap between heuristic signal processing and semiclassical geometric approaches (Berry's chord construction). The resulting closed-form representation, expressed in terms of symmetric incomplete Airy functions, provides a unified description of the nonlinear transform's behavior, regularizing the transition across the instantaneous frequency caustics. While the framework is general, we show its power on two illustrative examples: the high-precision nonlinear chirps of coalescing binaries in gravitational-wave astronomy and radar nonlinear chirps for pulse compression applications. The analytical results successfully predict the structure of interference patterns and quantify the systematic bias in peak-based frequency estimation. Therefore, this study establishes a systematic bridge between nonlinear mathematical analysis and precision experimental physics, validating the PWVD as a robust tool for detailed source characterization in high signal-to-noise regimes.

2606.12057 2026-06-15 stat.AP 新提交

ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development

ChargeBD:面向电池开发中引导工程的字符感知异构智能体推理

Rui Huang, Zekun Jiang, Mengran Hou, Xingyu Niu, Yuqiang Li, Qinying Gu, Tianhang Zhou

AI总结 提出ChargeBD框架,通过MBTI启发的角色智能体矩阵,结合异构推理,解决液流电池多尺度多目标研发中的自适应问题。

详情
AI中文摘要

液流电池(RFB)研究涵盖分子设计、电解质优化、电极和膜材料、电堆运行、系统管理和安全分析,使其成为一个受约束、多尺度、多目标的储能研发问题。尽管大型语言模型(LLM)可以支持科学知识整合和提案生成,但通用LLM推理在创新导向探索、基于规则的执行、机理建模和系统级权衡方面仍不够自适应。本文介绍ChargeBD,一个用于电池开发中引导工程的字符感知异构智能体推理框架。从50个RFB特定任务集开始,我们构建了500个问题的ESS-LLM基准,并定义了MBTI启发的角色智能体作为结构化认知偏差模板,而非心理测量工具或真实人格表征。选择DeepSeek-V3-Plus作为共享基础模型,评估16个MBTI启发的角色智能体,以构建角色能力矩阵和认知优势矩阵。

英文摘要

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, and multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning remains insufficiently adaptive across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. Here we introduce ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a 50-question RFB-specific task set, we construct a 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is selected as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

2505.16428 2026-06-15 math.ST stat.TH 版本更新

Sharp Asymptotic Minimaxity for Multiple Testing Using One-Group Shrinkage Priors

使用单组收缩先验进行多重检验的渐近极小化极大最优性

Sayantan Paul, Prasenjit Ghosh, Arijit Chakrabarti

AI总结 研究稀疏高斯序列模型中基于全局-局部尺度混合正态先验的贝叶斯多重检验规则的渐近极小化极大性质,证明在已知或未知稀疏度下,适当选择全局收缩参数可使检验规则达到精确极小化极大风险。

详情
AI中文摘要

本文研究了稀疏高斯序列模型中,使用一类广泛的全局-局部尺度混合正态分布作为均值的先验时,贝叶斯多重检验规则的渐近极小化极大性质。在标准误分类损失和由错误发现比例(FDP)与错误非发现比例(FNP)之和构成的复合损失下研究极小化极大性。当稀疏度已知时,我们证明通过基于稀疏度适当选择全局收缩参数,我们提出的检验规则在“beta-min”分离条件下对两种损失均渐近达到精确极小化极大风险。当稀疏度未知时,在关于稀疏度的适当假设下,同一规则的经验贝叶斯和完全贝叶斯自适应版本均渐近达到精确极小化极大风险。我们的结果表明,对于足够广泛的“马蹄型”先验(包括马蹄、Strawderman-Berger、标准双帕累托和某些逆伽马先验等),极小化极大性得以实现。对于非“马蹄型”先验,两种损失函数下的极小化极大性均不成立。据我们所知,这是关于基于全局-局部收缩先验的多重假设检验的首批此类结果。

英文摘要

This paper investigates asymptotic minimaxity properties of Bayesian multiple testing rules in the sparse Gaussian sequence model using a broad class of global-local scale mixtures of normals as priors for the means. Minimaxity is studied under standard misclassification loss and the composite loss given by the sum of the false discovery proportion (FDP) and false non-discovery proportion (FNP). When the sparsity level is known, we show that by suitably choosing the global shrinkage parameter based on the sparsity level, our proposed testing rule achieves the exact minimax risk asymptotically for both losses under the "beta-min" separation condition. When the sparsity level is unknown, both empirical Bayes and fully Bayesian adaptations of the same rule are shown to achieve exact minimax risk asymptotically under suitable assumptions on sparsity. Our results reveal that minimaxity is attained for "horseshoe-type" priors that are broad enough to include the horseshoe, Strawderman-Berger, standard double Pareto, and certain inverse-gamma priors, among others. For non-"horseshoe-type" priors, minimaxity fails to hold for either loss function. To the best of our knowledge, these are the first results of their kind for multiple hypothesis testing based on global-local shrinkage priors.

2601.11478 2026-06-15 nlin.AO physics.app-ph physics.comp-ph physics.data-an stat.ML 版本更新

Temporal Complexity and Self-Organization in an Exponential Dense Associative Memory Model

指数型密集联想记忆模型中的时间复杂性与自组织

Marco Cafiso, Paolo Paradisi

AI总结 研究随机指数型密集联想记忆模型通过时间复杂性框架分析其自组织行为,发现噪声强度区间内出现复杂间歇性及无标度统计,且临界区随记忆负载增加而缩小。

详情
AI中文摘要

密集联想记忆(DAM)模型通过引入n体或指数相互作用推广了经典的Hopfield模型,大大提高了存储容量。虽然DAM模型的临界性已在统计平衡图像下得到广泛研究,但很少关注学习引起的时间自组织行为。在这项工作中,我们通过时间复杂性(TC)的视角研究随机指数型DAM(SEDAM)模型的行为,TC是一种通过有序与无序之间的间歇性转变事件和无标度时间统计来表征复杂系统的框架。与神经雪崩结构的生灭相关的转变事件被用于TC分析,并与基于重合结构的类似转变事件进行比较。我们系统地探索了TC指标如何依赖于控制参数,即噪声强度和记忆负载。我们的结果表明,SEDAM模型表现出复杂间歇性状态,其特征是非平凡的时间相关性和无标度行为,表明自组织动力学的自发涌现。值得注意的是,这种状态出现在噪声强度的有限区间内,而不是单个临界点,这与扩展临界性的概念一致。此外,达到临界区域(自组织行为出现)所需的噪声强度范围随着记忆负载的增加而略微减小。这项研究强调了TC作为理解人工和生物神经系统中学习和信息处理的补充框架的相关性,揭示了记忆负载与网络自组织能力之间的联系。

英文摘要

Dense Associative Memory (DAM) models generalize the classical Hopfield model by incorporating n-body or exponential interactions that greatly enhance storage capacity. While the criticality of DAM models has been largely investigated, mainly within a statistical equilibrium picture, little attention has been devoted to the temporal self-organizing behavior induced by learning. In this work, we investigate the behavior of a stochastic exponential DAM (SEDAM) model through the lens of Temporal Complexity (TC), a framework that characterizes complex systems by intermittent transition events between order and disorder and by scale-free temporal statistics. Transition events associated with birth-death of neural avalanche structures are exploited for the TC analyses and compared with analogous transition events based on coincidence structures. We systematically explore how TC indicators depend on control parameters, i.e., noise intensity and memory load. Our results reveal that the SEDAM model exhibits regimes of complex intermittency characterized by nontrivial temporal correlations and scale-free behavior, indicating the spontaneous emergence of self-organizing dynamics. Notably, such regimes arise over finite intervals of noise intensity rather than at a single critical point, consistent with the concept of extended criticality. Further, the noise intensity range needed to reach the critical region, where self-organizing behavior emerges, slightly decreases as the memory load increases. This study highlights the relevance of TC as a complementary framework for understanding learning and information processing in artificial and biological neural systems, revealing the link between the memory load and the self-organizing capacity of the network.

2512.19805 2026-06-15 cs.LG stat.ME

Guardrailed Uplift Targeting: A Causal Optimization Playbook for Marketing Strategy

Deepit Sapru

发表机构 * Deepit Sapru

详情
英文摘要

This paper introduces a marketing decision framework that optimizes customer targeting by integrating heterogeneous treatment effect estimation with explicit business guardrails. The objective is to maximize revenue and retention while adhering to constraints such as budget, revenue protection, and customer experience. The framework first estimates Conditional Average Treatment Effects (CATE) using uplift learners, then solves a constrained allocation problem to decide whom to target and which offer to deploy. It supports decisions in retention messaging, event rewards, and spend-threshold assignment. Validated through offline simulations and online A/B tests, the approach consistently outperforms propensity and static baselines, offering a reusable playbook for causal targeting at scale.

2511.11353 2026-06-15 stat.ME

Interpolated stochastic interventions based on propensity scores, target policies and treatment-specific costs

Johan de Aguas

详情
Journal ref
Proceedings of the AAAI Conference on Artificial Intelligence, 40(43):36654-36662, 2026
英文摘要

We introduce two families of stochastic interventions with discrete treatments that connect causal modeling to cost-sensitive decision making. The interventions arise from a cost-penalized information projection of the independent product of the organic propensity scores and a reference policy, yielding closed-form Boltzmann-Gibbs couplings. The induced marginals define modified stochastic policies that interpolate smoothly, via a tilt parameter, from the organic law or from the reference law toward a product-of-experts limit when all destination costs are strictly positive. The first family recovers and extends incremental propensity score interventions, retaining identification without global positivity. For inference on the expected outcomes after these policies, we derive the efficient influence functions under a nonparametric model and construct one-step estimators. In simulations, the proposed estimators improve stability and robustness to nuisance misspecification relative to plug-in baselines. The framework can operationalize graded scientific hypotheses under realistic constraints. Because inputs are modular, analysts can sweep feasible policy spaces, prototype candidates, and align interventions with budgets and logistics before committing experimental resources.

2501.15196 2026-06-15 stat.ML cs.LG

A Review on Self-Supervised Learning for Time Series Anomaly Detection: Recent Advances and Open Challenges

Aitor Sánchez-Ferrera, Borja Calvo, Jose A. Lozano

发表机构 * University of the Basque Country UPV/EHU(巴斯克大学UPV/EHU) Basque Center for Applied Mathematics (BCAM)(巴斯克应用数学中心)

详情
英文摘要

Time series anomaly detection presents various challenges due to the sequential and dynamic nature of time-dependent data. Traditional unsupervised methods frequently encounter difficulties in generalization, often overfitting to known normal patterns observed during training and struggling to adapt to unseen normality. In response to this limitation, self-supervised techniques for time series have garnered attention as a potential solution to overcome this obstacle and enhance the performance of anomaly detectors. This paper presents a comprehensive review of the recent methods that make use of self-supervised learning for time series anomaly detection. A taxonomy is proposed to categorize these methods based on their primary characteristics, facilitating a clear understanding of their diversity within this field. The information contained in this survey, along with additional details that will be periodically updated, is available in the following GitHub repository: https://github.com/Aitorzan3/Awesome-Self-Supervised-Time-Series-Anomaly-Detection.

2506.15441 2026-06-15 stat.ME math.ST stat.TH

Causal inference amid missingness-specific independencies and mechanism shifts

Johan de Aguas, Leonard Henckel, Johan Pensar, Guido Biele

详情
Journal ref
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:31-44, 2025
英文摘要

The recovery of causal effects in structural models with missing data often relies on $m$-graphs, which assume that missingness mechanisms do not directly influence substantive variables. Yet, in many real-world settings, missing data can alter decision-making processes, as the absence of key information may affect downstream actions and states. To overcome this limitation, we introduce $lm$-SCMs and $lm$-graphs, which extend $m$-graphs by integrating a label set that represents relevant context-specific independencies (CSI), accounting for mechanism shifts induced by missingness. We define two causal effects within these systems: the Full Average Treatment Effect (FATE), which reflects the effect in a hypothetical scenario had no data been missing, and the Natural Average Treatment Effect (NATE), which captures the effect under the unaltered CSIs in the system. We propose recovery criteria for these queries and present doubly-robust estimators for a graphical model inspired by a real-world application. Simulations highlight key differences between these estimands and estimation methods. Findings from the application case suggest a small effect of ADHD treatment upon test achievement among Norwegian children, with a slight effect shift due to missing pre-test scores.

2506.09087 2026-06-15 cs.LG math.PR q-bio.NC stat.ML

Spiking Neural Models for Decision-Making Tasks with Learning

Sophie Jaffard, Giulia Mezzadri, Patricia Reynaud-Bouret, Etienne Tanré

详情
英文摘要

In cognition, response times and choices in decision-making tasks are commonly modeled using Drift Diffusion Models (DDMs), which describe the accumulation of evidence for a decision as a stochastic process, specifically a Brownian motion, with the drift rate reflecting the strength of the evidence. In the same vein, the Poisson counter model describes the accumulation of evidence as discrete events whose counts over time are modeled as Poisson processes, and has a spiking-neuron interpretation, as these processes are used to model neuronal activity. However, these models lack a learning mechanism and are limited to tasks where participants have prior knowledge of the categories. To bridge the gap between cognitive and biological models, we propose a biologically plausible Spiking Neural Network (SNN) model for decision-making that incorporates a learning mechanism and whose neuronal activities are modeled by a multivariate Hawkes process. First, we show a coupling result between the DDM and the Poisson counter model, establishing that these two models provide similar categorizations and reaction times and that the DDM can be approximated by spiking Poisson neurons. To go further, we show that a particular DDM with correlated noise can be derived from a Hawkes network of spiking neurons governed by a local learning rule. In addition, we designed an online categorization task to evaluate the model predictions. This work provides a significant step toward integrating biologically relevant neural mechanisms into cognitive models, fostering a deeper understanding of the relationship between neural activity and behavior.

2401.16990 2026-06-15 stat.ME

Recovery and inference of causal effects with sequential adjustment for confounding and attrition

Johan de Aguas, Johan Pensar, Tomás Varnet Pérez, Guido Biele

详情
Journal ref
Journal of Causal Inference 13(1):20240009, 2025
英文摘要

Confounding bias and selection bias pose two significant challenges to the validity of conclusions drawn from applied causal inference. The latter can stem from informative missingness, such as in cases of attrition. We introduce the Sequential Adjustment Criteria (SAC), which extend available graphical conditions for recovering causal effects from confounding and attrition using sequential regressions, allowing for the inclusion of post-exposure and forbidden variables in the adjustment sets. We propose an estimator for the recovered Average Treatment Effect (ATE) based on Targeted Minimum-Loss Estimation (TMLE), which exhibits multiple robustness under certain technical conditions. This approach ensures consistency even in scenarios where the Double Inverse Probability Weighting (DIPW) and the naive plug-in sequential regressions approaches fall short. Through a simulation study, we assess the performance of the proposed estimator against alternative methods across different graph setups and model specification scenarios. As a motivating application, we examine the effect of pharmacological treatment for Attention-Deficit/Hyperactivity Disorder (ADHD) upon the scores obtained by diagnosed Norwegian schoolchildren in national tests using observational data ($n=9\,352$). Our findings align with the accumulated clinical evidence, affirming a positive but small impact of medication on academic achievement.

2502.10049 2026-06-15 stat.ME

The Probability of Tiered Benefit: Partial Identification with Robust and Stable Inference

Johan de Aguas, Sebastian Krumscheid, Johan Pensar, Guido Biele

详情
Journal ref
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:90-113, 2025
英文摘要

We define the Probability of Tiered Benefit in scenarios with a binary exposure and an outcome that is either categorical with $K \geq 2$ ordered tiers or continuous partitioned by $K-1$ fixed thresholds into disjoint intervals. Similarly to other pure counterfactual queries, this parameter is not $g$-identifiable without additional assumptions. We demonstrate that strong monotonicity does not suffice for point identification when $K \geq 3$ and provide sharp bounds both with and without such a constraint. Inference and uncertainty quantification for these bounds are challenging tasks due to potential nonregularity induced by ambiguities in the underlying individualized optimization problems. Such ambiguities can arise from immunities or null treatment effects in subpopulations with positive probability, affecting the lower bound estimate and hindering conservative inference. To address these issues, we extend the available Stabilized One-Step Correction (S1S) procedure by incorporating stratum-specific stabilizing matrices. Through simulations, we illustrate the benefits of this approach over existing alternatives. We apply our method to estimate bounds on the probabilities of tiered benefit and harm from pharmacological treatment for ADHD upon academic achievement, employing observational data from diagnosed Norwegian schoolchildren. Our findings indicate that while girls and children with low prior test performance could have moderate chances of both benefit and harm from treatment, a clear-cut recommendation remains uncertain across all strata.