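arXivDaily: a daily academic digest synchronized with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.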

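2606.16941 2026-06-16 stat.ML cs.LG New submission

A nonparametric two-sample test using a parametric integral probability metric

Yuha Park, Yongdai Kim

AI summary: Constructs an integral probability metric from a parametric discriminator class built on a single neural-network node, yielding the nonparametric test statistic PReLU-IPM; proves consistency and asymptotic equivalence, and reports finite-sample power that is higher than or comparable to that of competitors.

Comments 45 pages. Accepted for publication in Statistical Analysis and Data Mining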

Abstract

Detecting distributional differences between two independent samples is a fundamental problem in statistics and machine learning. Nonparametric two-sample testing provides a principled framework for determining whether two samples are drawn from the same underlying distribution, without assuming any specific parametric form for the distribution. In this study, we propose a new two-sample test statistic based on a newly introduced integral probability metric (IPM), using a specially designed parametric discriminator class with a single node of a neural network. We show that the resulting test statistic, called PReLU-IPM, is nonparametric and establish theoretical guarantees for the associated two-sample testing procedure, PReLU-TST, including its consistency and asymptotical equivalence to nonparametric IPM-based tests under regularity conditions. By analyzing multiple simulated and real benchmark datasets, we demonstrate that PReLU-TST achieves higher power across a range of alternatives or performs comparably to its competitors, for finite samples.
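
The permutation logic behind an IPM-based two-sample test can be sketched as follows. This is a minimal illustration and not the authors' PReLU-IPM: the ReLU-type single-node discriminators f(z) = max(w·z + b, 0), the random search over (w, b), and the names `single_node_ipm` and `permutation_test` are all assumptions made for this example.

```python
import numpy as np


def single_node_ipm(x, y, directions, biases):
    """Largest mean gap over single-node discriminators f(z) = max(w.z + b, 0)."""
    # Each column of fx / fy holds one candidate discriminator applied to a sample.
    fx = np.maximum(x @ directions.T + biases, 0.0)
    fy = np.maximum(y @ directions.T + biases, 0.0)
    return float(np.max(np.abs(fx.mean(axis=0) - fy.mean(axis=0))))


def permutation_test(x, y, n_nodes=64, n_perm=200, seed=0):
    """Return (statistic, permutation p-value) for H0: same distribution."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Hypothetical discriminator class: random unit directions and random biases.
    w = rng.normal(size=(n_nodes, d))
    w /= np.linalg.norm(w, axis=1, keepdims=True)
    b = rng.uniform(-1.0, 1.0, size=n_nodes)

    observed = single_node_ipm(x, y, w, b)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        idx = rng.permutation(len(pooled))
        stat = single_node_ipm(pooled[idx[:n]], pooled[idx[n:]], w, b)
        exceed += int(stat >= observed)
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value
```

Calling `permutation_test(x, y)` on two `(n, d)` arrays returns the statistic and a p-value; a small p-value suggests the samples come from different distributions. The paper optimizes over its discriminator class rather than sampling it at random, so this sketch only conveys the overall test structure.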

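2606.16913 2026-06-16 math.ST cs.NA math.NA stat.ML stat.TH New submission

Optimal Multiscale Learning of Linear Operators

Jiaheng Chen, Daniel Sanz-Alonso

AI summary: Studies the statistical and computational limits of learning bounded linear operators between Sobolev spaces from noisy input-output data; proposes a finite-resolution blockwise least-squares estimator that attains the minimax-optimal rate with an adaptive computational cost.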

Comments 48 pages, 2 figures

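2606.16689 2026-06-16 stat.ME New submission

Wild bootstrap for mean response inference in functional linear regression models

Hyemin Yeon, Xiongtao Dai, Daniel Nordman

AI summary: In functional linear regression, residual bootstrap cannot handle heteroscedasticity and paired bootstrap is computationally expensive; the paper proposes a wild bootstrap that is both fast and broadly applicable, and gives a method for selecting truncation levels.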

Abstract

Functional regressors complicate inference in linear regression problems so that the bootstrap can play a useful role in quantifying uncertainty and calibrating intervals. The best bootstrap in practice, though, can depend on factors in the data as well as computational considerations and existing bootstraps can have limitations: residual bootstrap is computationally fast and simple but may fail when the errors are heterogeneous, while paired bootstrap applies more generally in functional linear regression at a cost of much higher computation. To bridge this gap, we develop a wild bootstrap method for functional linear regression, which is akin to a modified version of residual bootstrap but designed to have a wide scope of application like paired bootstrap, including to heteroscedastic errors. Its theoretical consistency is established and numerical studies suggest that wild bootstrap can provide accurate and computationally fast inference. Importantly, we also suggest a practical and effective approach of selecting truncation levels, specifically designed for mean response inference problems. The proposed bootstrap in functional linear regression is further illustrated through a weather data example, and an accompanying R package BTSinFLRM provides numerical implementations.
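
The wild bootstrap idea described above can be sketched in the simplest setting. This is an illustration for ordinary (scalar-covariate) linear regression, not the authors' functional-regression procedure or the BTSinFLRM package: the Rademacher multipliers, the percentile interval, and the function name `wild_bootstrap_mean_response` are assumptions made for this example.

```python
import numpy as np


def wild_bootstrap_mean_response(X, y, x0, n_boot=500, seed=0):
    """Percentile interval for the mean response x0 @ beta via wild bootstrap."""
    rng = np.random.default_rng(seed)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    preds = np.empty(n_boot)
    for b in range(n_boot):
        # Each residual keeps its own magnitude and only its sign is randomized,
        # so observation-specific error variances are preserved.
        signs = rng.choice([-1.0, 1.0], size=len(y))
        y_star = X @ beta + resid * signs
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        preds[b] = x0 @ beta_star

    lo, hi = np.percentile(preds, [2.5, 97.5])
    return float(x0 @ beta), float(lo), float(hi)
```

Because residuals are never shuffled across observations, the resampling respects heteroscedastic errors, which is the property the abstract contrasts with residual bootstrap. The design matrix is reused in every replicate, which is what keeps the cost below that of paired bootstrap.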

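2606.16058 2026-06-16 stat.ME New submission

Jeffreys-Type Penalized GEE for Correlated Binary Data with an Odds-Ratio Parameterization

Anestis Touloumis

AI summary: Addresses GEE failures caused by separation in correlated binary data by proposing a PGEE framework that combines a Jeffreys-prior penalty with a marginalized odds-ratio parameterization, ensuring finite estimates and improving convergence; simulations and a real example show that it outperforms ordinary GEE.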

Abstract

Generalized estimating equations (GEE) are widely used for population-averaged inference on correlated binary responses, but ordinary GEE can fail under separation, a situation that is more likely in small-sample, sparse, or rare-event settings, leading to nonconvergence, infinite or extreme estimates, and unreliable inference. Existing penalized GEE (PGEE) approaches mitigate some of these problems but do not generally guarantee finite estimates under nonindependence working structures and often rely on correlation-coefficient parameterizations whose admissible range shrinks as fitted probabilities approach zero or one, forcing the working association toward independence under separation. We propose a PGEE framework that combines a Jeffreys-prior penalty with marginalized odds-ratio working parameterizations. The odds-ratio parameterization avoids this failure, while the penalty, with tunable strength $δ$ and default $δ= 1/2$, stabilizes estimation under separation. Under working independence, PGEE reduces to the Jeffreys-prior penalized maximum-likelihood estimator, yielding finite estimates for logit, probit, complementary log-log, and cauchit links. Under nonindependence odds-ratio structures, where a formal finiteness guarantee is unavailable, PGEE achieves near-complete empirical convergence even in separated settings. We also propose one-step and hybrid variants, OPGEE and HPGEE, that reduce computational cost. Simulations show that all three variants substantially outperform ordinary GEE under separation while retaining the performance of ordinary GEE in regular settings. We illustrate the method using a respiratory-illness trial in which ordinary GEE fails, and provide an implementation in the R package geer.

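2606.16043 2026-06-16 stat.ME New submission

Bias-Reduced GEE via Adjusted Estimating Equations, with Odds-Ratio Extensions

Anestis Touloumis

AI summary: For small-sample correlated data, proposes a class of GEE estimators that achieve first-order bias reduction through adjusted estimating equations, comprising six bias-reduced and bias-corrected estimators, with an extension to odds-ratio parameterizations for correlated binary data.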

Abstract

Generalized estimating equations (GEE) are widely used for correlated data, but with small to moderate numbers of independent clusters the ordinary GEE regression estimators can be substantially biased. We develop a first-order bias-reduction principle for GEE by viewing the estimator as a clustered-data $M$-estimator and deriving an adjustment to the estimating equations that targets the leading bias term while accounting for the dependence of the working covariance on the mean parameters. The resulting class includes three bias-reduced estimators and three one-step bias-corrected analogs, nesting the bias-corrected estimator of Lunardon and Scharfstein (2017) and the bias-reduced and bias-corrected estimators of Paul and Zhang (2014) as special cases. The framework applies to general response types through correlation-coefficient parameterizations for the association structure and extends to correlated binary data through pairwise odds-ratio parameterizations, yielding the first bias-reduced and bias-corrected GEE estimators under this parameterization, for which the marginal-mean compatibility constraints are far less restrictive than those of correlation-coefficient parameterizations, making them better suited for small-sample settings. Under standard regularity conditions, all six estimators share the same asymptotic distribution as the ordinary GEE. Simulation studies show that the proposed estimators reduce bias while maintaining efficiency and coverage close to those of ordinary GEE across a range of settings, and a clinical trial analysis illustrates the proposed estimators in practice. Software is available in the R package geer.

2606.16013 2026-06-16 cond-mat.dis-nn cs.LG physics.data-an stat.ML New submission

The limits of interpretability in multiple linear regression

Anand Sharma, Chen Liu, Daniele Coslovich, Misaki Ozawa

Affiliations: Indian Institute of Science Education and Research; Innovation and Research Division, Ge-Room Inc.; Dipartimento di Fisica, Università di Trieste; Univ. Grenoble Alpes, CNRS, LIPhy

AI summary: By analyzing the eigenmodes of the feature correlation matrix, the paper explains theoretically how multicollinearity makes linear-regression weights unstable and oscillatory, and therefore hard to interpret, and verifies that Ridge regression mitigates the effect.

Comments 23 pages, 8 figures

Abstract

Interpreting machine-learning models has attracted increasing attention, particularly in the physical sciences, where one often seeks to understand the underlying mechanisms rather than merely make predictions. Multiple linear regression is often regarded as an interpretable alternative to more complex models, such as deep neural networks, because its predictions are expressed as explicit weighted sums of input features. However, when input features are strongly correlated, namely in the presence of multicollinearity, the learned weights can exhibit large dataset-to-dataset fluctuations and oscillatory behavior across physically similar features, making their interpretation difficult or even impossible. Although the instability of the weights under multicollinearity is well known in statistics, its consequences for physical interpretation, in particular its connection to oscillatory weights across physically similar features, have not been systematically clarified. Here, we theoretically discuss the mechanism behind this loss of interpretability by analyzing the eigenmodes of the feature correlation matrix. We show that small-eigenvalue modes associated with multicollinearity amplify fluctuations in the weights and generate oscillatory patterns that do not necessarily reflect meaningful contributions. We test this theoretical picture numerically on physics datasets and show that Ridge regularization suppresses these unstable modes, although the resulting weights must still be interpreted with caution. We further confirm the generality of our findings beyond physics by analyzing a diverse collection of publicly available datasets. Our results clarify why, in the presence of multicollinearity, physical interpretation can remain difficult even for linear regression models.
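
The amplification of weight fluctuations by a small-eigenvalue mode can be reproduced on synthetic data. This is a minimal sketch under stated assumptions, not the paper's analysis: the two nearly identical features, the noise levels, the true weights (1, 1), and the function name `weight_spread` are all hypothetical choices for illustration.

```python
import numpy as np


def weight_spread(alpha, n_datasets=200, n=100, seed=0):
    """Std. dev. of fitted weights across independently drawn datasets.

    alpha = 0 gives ordinary least squares; alpha > 0 gives Ridge regression.
    """
    rng = np.random.default_rng(seed)
    true_w = np.array([1.0, 1.0])
    weights = []
    for _ in range(n_datasets):
        z = rng.normal(size=n)
        # Two almost identical features: X^T X has one large eigenvalue and
        # one tiny eigenvalue along the direction (1, -1).
        X = np.column_stack([z + 0.01 * rng.normal(size=n),
                             z + 0.01 * rng.normal(size=n)])
        y = X @ true_w + 0.5 * rng.normal(size=n)
        # Closed-form (regularized) least-squares solution.
        w = np.linalg.solve(X.T @ X + alpha * np.eye(2), X.T @ y)
        weights.append(w)
    return np.std(np.array(weights), axis=0)
```

Comparing `weight_spread(0.0)` with `weight_spread(1.0)` shows the dataset-to-dataset spread of each weight shrinking sharply once the penalty damps the small-eigenvalue mode. The individual Ridge weights are then stable, but, as the abstract cautions, stability alone does not make them physically meaningful.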

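2606.15836 2026-06-16 math.ST stat.ME stat.TH New submission

Minimax Synthesis of Network Mechanisms

Marios Papamichalis, Regina Ruane

AI summary: For a single observed network that reflects several mechanisms, proposes estimating each mechanism's contribution and the combination rule from the graph itself, obtains valid inference through a bias correction, and gives a density threshold for identifiability of the combination rule.

Comments Under Review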

Abstract

A single observed network reflects several mechanisms at once: communities, hubs, and clustering coexist in one graph, each a different model. We treat the network as a combination of candidate mechanisms and study, from a single graph, how strongly each mechanism contributes and how they combine. We address two questions. The first is how to measure each mechanism's contribution when the mechanisms must themselves be estimated from the graph: fitting the mechanisms and their strengths from the same data biases the strengths toward zero, and a correction removes this bias and yields valid confidence intervals. The second is whether the rule of combination is itself recoverable: when a graph is generated by two mechanisms acting together, the graph alone determines whether they combine additively or interact, exactly when the graph is dense enough, a sharp threshold below which no test can decide. The estimate calibrates the candidate mechanisms against the observed edges. We establish matching minimax rates, against a known-design benchmark and the estimated-design problem itself, confirm the methods in simulation, and apply them to real networks, where the signed coefficients recover known structure and, in one case, a confidence interval excludes any positive contribution from a candidate mechanism.

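2606.15526 2026-06-16 stat.ME New submission

Generalized likelihood ratio test for magnetic anomaly detection: a geometric approach

C. Chenevas-Paule, S. Zozor, L. -L. Rouve, O. J. J. Michel, O. Pinaud, R. Kukla

AI summary: For magnetic anomaly detection, proposes a generalized likelihood ratio test that constrains the signal parameters to a semi-algebraic space (a cone under the dipole model), improving detection performance; numerical simulations show that it outperforms existing methods and comes close to the optimal receiver.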

Abstract

State-of-the-art approaches to magnetic anomaly detection rely on the generalized likelihood ratio test (GLRT). These approaches are based on the formulation of a parametric model of the source to be detected, expressed in a suitable functional basis. One of the primary objectives of this study is to demonstrate that, for a given measurement configuration, the signal is constrained to evolve within a restricted subset of the space generated by these functional bases. The parametric representation of the signal is identified as a semi-algebraic space which, for the dipole model used in this article, turns out to be a cone outside of which the estimated signal does not satisfy the physical equations. Thus, a second objective is to exploit this property to constrain the signal parameters in the GLRT to belong to the semi-algebraic space, in order to improve detection performance. The performance gain of the proposed algorithm is compared with that of conventional approaches; numerical simulations show that the proposed approach not only outperforms state-of-the-art methods but can even provide results close to those of the clear-seeing (optimal) receiver.

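2606.15237 2026-06-16 stat.ME New submission

Optimized Sequential Testing for Binary Ensemble Classifiers

Joseph Kalman, Amit Moscovich

AI summary: Proposes a sequential testing method that cuts the computational cost of binary ensemble classifiers by stopping base-model evaluation early while controlling the rate of disagreement with the full ensemble, with the optimal stopping strategy found by linear programming.

Comments 33 pages, 5 figures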

Abstract

Ensemble classifiers are predictive models that combine the results of simpler base models, often by majority vote. A classic example is random forests, which combine the predictions of decision trees. Ensembles that use more base models can be more accurate but also more costly to train and run. In this paper, we consider strategies for reducing the computational cost of binary classification using an approach from the field of sequential testing. Rather than evaluating all the base models and taking a majority vote, we evaluate the base models sequentially and stop execution when a clear majority emerges. We consider three different notions of optimality for early-stopping strategies that minimize the number of base models executed while controlling the rate of disagreement with the full ensemble. For each notion of optimality and allowable disagreement rate, we show that a linear program can be constructed and solved efficiently to find the optimal stopping strategy. We tested these methods on real-world datasets taken from the UC Irvine Machine Learning repository, and on the benchmark datasets proposed by Grinsztajn et al. We found that on most datasets, these methods provide speed-ups of 4x or more while controlling disagreement at 0.1%.
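
The idea of stopping once a clear majority emerges can be sketched with the most conservative rule: stop only when the remaining base models can no longer change the outcome. This is an assumption-laden baseline, not the paper's method, which tolerates a small disagreement rate and finds stopping boundaries by linear programming; the function name `early_stop_majority` and the callable-based interface are hypothetical.

```python
def early_stop_majority(predictors, x):
    """Evaluate binary base models in order; stop when the vote is decided.

    `predictors` is a list of callables returning 0 or 1 for input `x`.
    Returns (predicted_label, number_of_models_evaluated).
    """
    total = len(predictors)
    ones = zeros = 0
    for used, predict in enumerate(predictors, start=1):
        if predict(x):
            ones += 1
        else:
            zeros += 1
        remaining = total - used
        # Even if every remaining model voted for the trailing class,
        # it could not overtake the leader, so the majority is settled.
        if ones > zeros + remaining:
            return 1, used
        if zeros > ones + remaining:
            return 0, used
    # Only reached on an exact tie (even ensemble size); ties go to class 0,
    # which is an arbitrary convention chosen for this sketch.
    return (1 if ones > zeros else 0), total
```

This rule always agrees with the full ensemble, so it corresponds to a disagreement rate of zero. Allowing a small nonzero rate, as the paper does, permits stopping earlier than this baseline.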

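2606.15097 2026-06-16 stat.ME stat.AP New submission

Separate versus pooled winsorization for group mean contrasts: a finite-sample theory

Chao Cheng, Chenshan Hu, Yukai Huang

AI summary: For group mean contrasts with heavy-tailed data, proves that pooled winsorization cannot attain the sub-Gaussian rate whereas separate winsorization can, with smaller bias and better concentration; recommends winsorizing within groups rather than after pooling.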

Abstract

Comparing group means is foundational to many statistical areas, including two-sample studies, randomized trials, and difference-in-differences designs, yet heavy-tailed outcomes can make conventional estimators unstable. A common remedy is to winsorize the data before estimating the target mean contrast. The dominant approach, pooled winsorization, computes winsorization thresholds from the combined sample across all groups, while the rarely used alternative, separate winsorization, computes them within each group. We study finite-sample deviation bounds for these two winsorization strategies, and we prove an impossibility result: no deterministic rule for selecting the pooled winsorization level can attain the sub-Gaussian rate. In contrast, separate winsorization attains this rate, and the guarantee extends to general linear contrasts of group means. Simulation studies confirm that pooled winsorization can have substantial bias, while separate winsorization remains nearly unbiased and concentrates well around the truth. These results support a simple recommendation: winsorize within each group rather than after pooling.
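
The difference between the two strategies comes down to where the clipping thresholds are computed. The sketch below is a minimal illustration assuming quantile-based thresholds at a fixed level `q`; the function names `winsorize` and `mean_contrast` and the default `q=0.05` are hypothetical and not taken from the paper.

```python
import numpy as np


def winsorize(values, lower_q, upper_q, reference=None):
    """Clip `values` to quantiles computed from `reference` (default: itself)."""
    ref = values if reference is None else reference
    lo, hi = np.quantile(ref, [lower_q, upper_q])
    return np.clip(values, lo, hi)


def mean_contrast(a, b, q=0.05, pooled=False):
    """Winsorized estimate of mean(a) - mean(b).

    pooled=True takes thresholds from the combined sample;
    pooled=False takes them within each group separately.
    """
    ref = np.concatenate([a, b]) if pooled else None
    wa = winsorize(a, q, 1 - q, ref)
    wb = winsorize(b, q, 1 - q, ref)
    return float(wa.mean() - wb.mean())
```

When the groups are well separated, pooled thresholds clip only the lower tail of the lower group and the upper tail of the upper group, which pulls the two winsorized means toward each other and shrinks the estimated contrast. Separate thresholds clip both tails within each group, so for symmetric errors the two distortions roughly cancel.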

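2606.14921 2026-06-16 stat.ME New submission

Flexible Method Comparison with the Probability of Agreement

Nathaniel T. Stevens

AI summary: Proposes a flexible inference framework based on the probability of agreement (PoA) that relaxes earlier assumptions to broaden its applicability to method comparison, illustrated with a tPSA measurement example and simulations.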

Abstract

The comparison of methods of measurement is a common problem in clinical practice; as novel methods are developed, establishing their agreement with existing methods is crucial. The probability of agreement (PoA) has previously been proposed as an intuitive and informative means of assessing agreement between two methods of measurement. It straightforwardly quantifies the likelihood that two measurements by different methods on the same subject are clinically indistinguishable. In this paper, we overhaul and extend the PoA methodology by developing an inference framework that relaxes several restrictive assumptions made in previous implementations, ultimately increasing its utility in a wider range of applications. We illustrate this more flexible methodology in an example that compares methods of measuring total Prostatic Specific Antigen (tPSA), and we thoroughly investigate its performance via simulation. This work dramatically increases the flexibility, availability, and hence impact of the PoA approach for method comparison.
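
As a worked illustration of the quantity itself, the probability that two measurements differ by at most a clinically acceptable amount δ can be computed in closed form when the difference is assumed to be normally distributed. That normality assumption, the symbols `mean_diff`, `sd_diff`, and `delta`, and the function names are assumptions of this sketch; the abstract indicates the paper relaxes restrictive assumptions of earlier implementations, so this should be read as the simplest special case only.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def probability_of_agreement(mean_diff, sd_diff, delta):
    """P(|Y1 - Y2| <= delta) when Y1 - Y2 ~ Normal(mean_diff, sd_diff**2)."""
    upper = (delta - mean_diff) / sd_diff
    lower = (-delta - mean_diff) / sd_diff
    return normal_cdf(upper) - normal_cdf(lower)
```

With no systematic difference and unit standard deviation, `probability_of_agreement(0.0, 1.0, 1.96)` is about 0.95, while a large systematic bias relative to δ drives the value toward zero.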

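2606.14837 2026-06-16 stat.ME New submission

Bartlett adjustment for Gaussian random effects meta-analysis

Haben Michael

AI summary: Because asymptotic methods break down when a meta-analysis includes few studies, derives a Bartlett correction for the Gaussian random-effects model, correcting a formula in the literature.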

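2605.13092 2026-06-16 stat.ML cs.LG stat.ME Updated version

Adaptive Kernel Density Estimation with Pre-training

Ruitong Zhang, Ke Deng

Affiliations: Department of Statistics and Data Science, Tsinghua University

AI summary: Proposes pre-training to make adaptive kernel density estimation more efficient in high dimensions, with a neural network recommending suitable kernels; experiments show clear gains when the target distribution is close to the pre-training distributions.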

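2604.26819 2026-06-16 math.PR cs.IT math.IT math.ST stat.ML stat.TH Updated version

Sharp One-Dimensional Sub-Gaussian Comparison in Convex Order

Yihan Zhang

AI summary: Proves that if the moment generating function of a random variable X is bounded pointwise by that of a standard normal G, then X is dominated in convex order by G/𝔼[|G|]; the uniform distribution and the absolute-value function show that the result is optimal.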

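2412.17470 2026-06-16 math.ST econ.EM stat.ME stat.TH Updated version

A Necessary and Sufficient Condition for Size Controllability of Heteroskedasticity Robust Test Statistics

Benedikt M. Pötscher, David Preinerstorfer

AI summary: For testing a single restriction in a regression model, gives a necessary and sufficient condition for size controllability of heteroskedasticity-robust test statistics, improving on existing results that provide only sufficient conditions.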

Comments Two footnotes added

2602.17587 2026-06-16 math.ST cs.LG stat.ML stat.TH Updated version

Asymptotically Optimal Sequential Testing with Markovian Data

Alhad Sethi, Kavali Sofia Sagar, Shubhada Agrawal, Debabrota Basu, P. N. Karthik

Affiliations: Indian Institute of Science, Bangalore; Indian Institute of Technology, Hyderabad; Univ. Lille, Inria, CNRS, Centrale Lille, UMR 9189 – CRIStAL

AI summary: For data generated by an ergodic finite-state Markov chain, proposes an asymptotically optimal sequential hypothesis test whose expected stopping time asymptotically matches an instance-dependent lower bound, with applications to detecting model misspecification in Markov chain Monte Carlo and to testing structural properties of Markov decision processes.

Comments ICML 2026

Abstract

We study one-sided and $α$-correct sequential hypothesis testing for data generated by an ergodic, finite-state Markov chain. The null hypothesis is that the unknown transition matrix belongs to a prescribed set $P$ of stochastic matrices, and the alternative corresponds to a disjoint set $Q$. We establish a non-asymptotic instance-dependent lower bound on the expected stopping time of any valid sequential test under the alternative, which is asymptotically tight. Our novel analysis improves the existing lower bounds, which are either asymptotic or provably sub-optimal in this setting. Our lower bound incorporates both the stationary distribution and the transition structure induced by the unknown Markov chain. We further propose an optimal test whose expected stopping time matches this lower bound asymptotically as $α\to 0$. We illustrate the usefulness of our framework through applications to sequential detection of model misspecification in Markov Chain Monte Carlo and to testing structural properties, such as the linearity of transition dynamics, in Markov decision processes. Our findings yield a sharp and general characterization of optimal sequential testing procedures under Markovian dependence.

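2602.05807 2026-06-16 stat.ME Updated version

SpARCD: A Spectral Graph Framework for Revealing Differential Functional Connectivity in fMRI Data

Shira Yoffe, Ziv Ben-Zion, Guy Gurevitch, Talma Hendler, Malka Gorfine, Ariel Jaffe

AI summary: Proposes the SpARCD framework, which uses distance correlation and spectral filtering to detect differences in brain connectivity between two experimental conditions and obtains region-level significance maps through permutation testing; it outperforms conventional methods under complex dependence structures.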

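AI abstract

Identifying brain regions whose functional connectivity changes across cognitive or emotional states is a key problem in neuroscience. Existing methods, such as edge-wise tests, seed-based psychophysiological interaction (PPI) analysis, or comparisons of correlation networks, often suffer from low statistical power, arbitrary thresholds, and limited ability to capture distributed or nonlinear dependence patterns. We propose SpARCD (Spectral Analysis for Revealing Connectivity Differences), a new statistical framework for detecting differences in brain connectivity between two experimental conditions. SpARCD uses distance correlation, a dependence measure sensitive to both linear and nonlinear associations, to construct a weighted graph for each condition. A differential operator is then built through spectral filtering, and its leading eigenvectors are computed to reveal connectivity changes. Inference is carried out with a permutation-based testing scheme that produces interpretable region-level significance maps. Extensive simulation studies show that SpARCD has higher power than conventional edge-wise or univariate methods, particularly in the presence of complex dependence structures. Applied to fMRI data from 113 early-stage PTSD patients performing an emotional face-matching task, the method reveals distinct networks associated with emotional reactivity and regulation. Overall, SpARCD provides a statistically rigorous and computationally efficient framework for comparing high-dimensional connectivity structures, broadly applicable to neuroimaging and other network-based scientific fields.

English abstract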

Identifying brain regions that exhibit altered functional connectivity between cognitive or emotional states is a fundamental problem in neuroscience. We propose SpARCD (Spectral Analysis for Revealing Connectivity Differences), a statistical framework for detecting condition-specific patterns of functional connectivity. SpARCD uses distance correlation, a dependence measure sensitive to both linear and nonlinear associations, to construct weighted region-wise connectivity graphs for each condition. A differential operator obtained through spectral filtering is then used to identify connectivity changes via its leading eigenvectors. To assess statistical significance, we develop a permutation-based testing procedure that yields interpretable region-level significance maps. We establish finite-sample validity of the permutation test and derive asymptotic guarantees for the stability of the resulting region rankings. Simulation studies demonstrate improved power relative to conventional edge-wise and univariate approaches, particularly in settings with nonlinear dependence structures. We applied SpARCD to fMRI data from 113 individuals with early-stage PTSD and 42 controls during emotional and neutral task conditions. The method identified distinct connectivity networks associated with visual processing in both PTSD and control participants. Resting-state comparisons between PTSD and control participants highlighted similar visual networks. SpARCD provides a statistically rigorous and computationally efficient framework for comparing high-dimensional connectivity patterns.

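2507.05689 2026-06-16 math.ST stat.ML stat.TH Updated version

Optimal structure learning and conditional independence testing

Ming Gao, Yuhao Wang, Bryon Aragam

AI summary: Establishes an optimality link between structure learning and conditional independence testing, proving that the minimax-optimal rate of structure learning is determined by conditional independence testing, and proposes an improved PC algorithm on that basis.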

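2508.08564 2026-06-16 stat.ME math.ST stat.ML stat.TH Updated version

Kernel Two-Sample Testing via Directional Components Analysis

Rui Cui, Yuhao Li, Xiaojun Song

AI summary: Noise in the trailing directional components lowers the power of standard kernel two-sample tests; the paper proposes a kernel test that truncates the spectral decomposition of the MMD to keep the leading directional components and, combined with an efficient parametric bootstrap procedure, achieves better power and robustness in high-dimensional and unbalanced settings.

Comments Major Revision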

Abstract

Standard kernel two-sample tests, such as those based on the Maximum Mean Discrepancy (MMD), aggregate squared differences across all directions in a Reproducing Kernel Hilbert Space (RKHS). However, in finite samples, trailing directional components are noisy, which degrades test power. We propose a novel kernel-based test that resolves this by truncating the spectral decomposition of the MMD, retaining only the well-estimated leading eigen-directions. By aggregating these robust components, our method achieves superior power and robustness, particularly in high-dimensional and unbalanced settings. Furthermore, we introduce a computationally efficient parametric bootstrap procedure for approximating critical values, which is theoretically justified and significantly faster than permutation-based alternatives. Extensive simulations and empirical studies demonstrate that our method maintains strict Type I error control while delivering higher power than existing MMD-based tests.

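2312.01265 2026-06-16 math.PR stat.ME Updated version

The optimal sub-Gaussian normalisation for randomised monotone functions

Thomas Anton, Rabee Tourky

AI summary: Studies the optimal sub-Gaussian normalisation for randomised monotone functions; by analyzing probability inequalities at finite sample sizes, derives a tight relationship between the normalising scale and a logarithmic function.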

Comments 41 pages, 1 figure. Copy editing. Signed measure processes

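2207.05190 2026-06-16 stat.ME math.ST stat.TH Updated version

Estimation of High-Dimensional Normal Means through Inferential Models

Samuel J. Eschker, Chuanhai Liu

AI summary: For estimating high-dimensional normal means, proposes a class of prior-free point estimators based on inferential models, using a generalized probability integral transform and an ordered-uniform predictive random set built on a reweighted Anderson-Darling statistic, to obtain valid inference and explain Stein's paradox.

Comments 29 pages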

Abstract

The estimation of the multivariate normal mean is a fundamental problem, highlighted by the inadmissibility of the MLE for $n\geq 3$ under quadratic loss. While shrinkage and empirical Bayes methods leverage joint structure through geometric reasoning or hierarchical modeling, this paper proposes a class of point estimators derived from the prior-free framework of inferential models. We develop a generalized probability integral transform for independent, non-i.i.d observations, creating a bijective mapping from the sample to an ordered-uniform reference distribution. By combining this bijection with an ordered-uniform predictive random set based on a reweighted Anderson-Darling statistic, we ensure valid and efficient inference that captures the global shape structure revealed by the ordered observations. We further introduce a maximin (bottleneck) criterion for combining multiple plausibility contours. To ensure computability, we develop a sampling-with-replacement surrogate that connects the exact formulation to over-parameterized (g)-modeling. Our approach provides a structural explanation of Stein's paradox, showing that the MLE corresponds to a zero-density point of the joint auxiliary distribution, revealing its implausibility from an auxiliary perspective. Numerical studies show that our estimators are competitive with state-of-the-art auto-modeling methods and outperform classical shrinkage and empirical Bayes methods.
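
The inadmissibility of the MLE mentioned above is commonly illustrated with the classical James-Stein estimator, which is one of the shrinkage baselines the abstract compares against and not the inferential-model estimator the paper proposes. The sketch below assumes unit-variance observations and shrinkage toward the origin; the function names `james_stein` and `compare_risk` are hypothetical.

```python
import numpy as np


def james_stein(x):
    """Plain James-Stein estimate for x ~ Normal(theta, I), dimension >= 3."""
    p = x.size
    shrink = 1.0 - (p - 2) / float(x @ x)
    return shrink * x


def compare_risk(theta, n_rep=2000, seed=0):
    """Monte Carlo quadratic risk of the MLE (x itself) and of James-Stein."""
    rng = np.random.default_rng(seed)
    mle_loss = 0.0
    js_loss = 0.0
    for _ in range(n_rep):
        x = theta + rng.normal(size=theta.size)
        mle_loss += float(np.sum((x - theta) ** 2))
        js_loss += float(np.sum((james_stein(x) - theta) ** 2))
    return mle_loss / n_rep, js_loss / n_rep
```

Running `compare_risk(np.zeros(10))` gives an MLE risk close to the dimension, 10, and a visibly smaller James-Stein risk. The gain shrinks as the true mean vector moves away from the shrinkage target, but in three or more dimensions the James-Stein risk never exceeds that of the MLE.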

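2606.17005 2026-06-16 cs.AI stat.ME New submission

Bayesian Inference and Decision Audits for Public Archives of Frontier AI Evaluations

Yanan Long

AI summary: Uses Bayesian inference and audit methods to analyze selective reporting and missing data in public AI evaluation archives, finds that a single terminal record is compatible with multiple histories, and verifies that audit gates filter out unsupported claims.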

Abstract

Public AI evaluations are often read as terminal leaderboards, yet the underlying evidence is a selective time series shaped by reporting rules, benchmark revisions, and missingness. Repeated public archives for LiveBench and Open LLM Leaderboard v2 serve as the primary longitudinal record; LMArena provides a preference stress test; and GAIA and tau-bench contribute limited agentic pilots. Together, these archives instantiate a Bayesian inference problem: under a fixed reporting convention, one constructed terminal-only example over $1{,}000$ systems is compatible with two pre-terminal histories, yielding times of $23.03$ or $75.13$ to reach within $0.05$ of the ceiling under the same terminal-tail model. In synthetic posterior comparisons, action-facing diagnostics differ across observation regimes. The candidate selection-aware frontier model fails synthetic recovery, objective-archive prediction, preference transfer, and uncertainty calibration; correspondingly, fixed audit gates reject its stronger claims. An archive-and-adjudication protocol reconstructs public evaluation histories, isolates a verified timing boundary, and falsifies unsupported frontier claims.


2606.16923 2026-06-16 cs.AI stat.ML 新提交

MA-SBI: Misspecification-Aware Simulation-Based Inference via Side-Channel Guidance

MA-SBI: 通过侧信道引导的误设定感知仿真推断

Arunkumar V, Manoranjan Gandhudi, Gangadharan G. R., Arun Prakash, S. Senthilkumar

发表机构 * University College of Engineering, Anna University Tiruchirappalli（安娜大学蒂鲁吉拉伯利工程学院）； Central University of Karnataka（卡纳塔克中央大学）； National Institute of Technology Tiruchirappalli（蒂鲁吉拉伯利国立理工学院）； School of Computer & Systems Sciences, Jawaharlal Nehru University（贾瓦哈拉尔·尼赫鲁大学计算机与系统科学学院）

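AI summary: To address simulator misspecification, the paper proposes MA-SBI, a calibration-free framework that uses side-channel text to correct the posterior; theory bounds the achievable bias reduction, and experiments show that text alone suffices to match the oracle posterior.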

Comments 23 pages, 9 figures, 12 tables


AI中文摘要

潜在参数的仿真推断（SBI）常受仿真器误设定困扰，即由于固有的建模简化导致的仿真观测与真实观测之间的不匹配。最新的鲁棒SBI方法RoPE通过真实与仿真观测学习表示之间的最优传输来解决此问题，但需要真实参数校准对，而这在需要SBI的设置中通常不可用。实践者拥有的是非结构化侧信息，如制度标签、指令文本和政策公告。我们提出误设定感知仿真推断（MA-SBI），一个无需校准的框架，将侧信道转化为后验校正。学习到的校正器将侧信道文本映射到观测空间偏移，应用于任何预训练的摊销后验之前，无需重新训练也无需参数真实值。我们的主要定理通过误设定与侧信道之间的互信息界定了可实现的偏差减少，通过Donsker-Varadhan扩展到所有次高斯噪声的非平凡常数。在隐藏校准基准上，仅使用文本的MA-SBI在10个种子和两个骨干网络上匹配oracle后验（TOST等价），而使用更多数据的RoPE则不能。两种方法互补：当误设定是结构性的且可从参数对中恢复时，RoPE占优，正如理论所预测。随机变体在真实COVID和OxCGRT流行病学数据上提高了后验预测对数似然，并在一个良好设定的认知科学语料库上正确保持后验不变。

英文摘要

Simulation-based inference (SBI) of latent parameters is often hindered by simulator misspecification, the mismatch between simulated and real-world observations caused by inherent modeling simplifications. RoPE, the recent state-of-the-art for robust SBI, addresses this through optimal transport between learned representations of real and simulated observations, but requires ground-truth parameter calibration pairs that are typically unavailable in the very settings where SBI is needed. What practitioners do have is unstructured side-information such as regime labels, instruction text, and policy bulletins. We propose Misspecification-Aware Simulation-Based Inference (MA-SBI), a calibration-free framework that turns this side-channel into a posterior correction. A learned corrector maps side-channel text to an observation-space shift applied before any pre-trained amortized posterior, requiring no retraining and no parameter ground-truth. Our main theorem bounds achievable bias reduction by the mutual information between misspecification and side-channel, with a non-vacuous constant that extends to all sub-Gaussian noise via Donsker-Varadhan. On hide-the-calibration benchmarks, MA-SBI with text alone matches the oracle posterior across 10 seeds and two backbones (TOST equivalence), while RoPE given more data does not. The two approaches are complementary: where misspecification is structural and recoverable from parameter pairs, RoPE dominates, as the theory predicts. A stochastic variant improves posterior-predictive log-likelihood on real COVID and OxCGRT epidemiological data, and correctly leaves the posterior unchanged on a well-specified cognitive-science corpus.


2606.16683 2026-06-16 stat.ME stat.OT 新提交

Two fully specified Bayes factors for hypothesis testing and sensitivity analysis in process tracing

过程追踪中用于假设检验和敏感性分析的两个完全指定的贝叶斯因子

Matias López, Jake Bowers, Daniel Gajardo Cooper

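AI summary: Proposes two fully specified generative models for deriving evidence probabilities, addressing the bias that arises when probabilities for Bayes factors in process tracing are specified by hand, with sensitivity analysis used to support the conclusions.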

2606.16524 2026-06-16 cs.LG astro-ph.CO stat.ML 新提交

Neural Bayesian Anomaly Mitigation: A Robust Loss that Doubles as an Unsupervised Contamination Classifier

神经贝叶斯异常缓解：一种兼具无监督污染分类器功能的鲁棒损失函数

S. A. K. Leeney, W. J. Handley, H. T. J. Bevins, E. de Lera Acedo

发表机构 * Astrophysics Group, Cavendish Laboratory, University of Cambridge（剑桥大学卡文迪许实验室天体物理组）； Institute of Astronomy, University of Cambridge（剑桥大学天文研究所）

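AI summary: Proposes the Neural Bayesian Anomaly Mitigation (NBAM) loss, based on a Bayesian latent-variable mixture model, which provides a robust supervised loss and also outputs an unsupervised contamination posterior; it outperforms Huber and other baselines on CIFAR-10.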

Comments 13 pages, 4 figures


AI中文摘要

工程化的鲁棒损失函数（如Huber、Student-$t$和广义交叉熵）使监督模型能够容忍污染，但无法回答哪些观测被破坏。我们引入神经贝叶斯异常缓解（NBAM），一种通用的即插即用损失函数，源自贝叶斯潜在开关混合模型：边际似然定义了一个鲁棒的监督损失，相关的后验定义了一个无监督的污染分类器。与Huber或Student-$t$类似，NBAM可以替换任何监督流程中的标准训练损失；与它们不同，NBAM还学习了一个结构化的污染模型，并返回每个样本的校准污染后验。学习到的输入相关先验$π_ϕ(x)$捕获污染的空间局部性，使得靠近已知损坏的样本更可能被标记，同时自动出现奥卡姆惩罚并正则化以防止过度标记。在具有非对称标签污染的CIFAR-10上，NBAM无需监督即可恢复污染过程的结构：污染后验将干净样本与污染样本分开，学习到的异常头识别每个标签翻转对的方向。除了这些能力之外，在0.2-0.6的污染率下，NBAM的性能优于本文考虑的四种鲁棒损失基线。

英文摘要

Engineered robust losses such as Huber, Student-$t$, and generalised cross-entropy make supervised models tolerant of contamination but cannot answer which observations are corrupted. We introduce Neural Bayesian Anomaly Mitigation (NBAM), a general-purpose drop-in loss derived from a Bayesian latent-switch mixture model: the marginal likelihood defines a robust supervised loss, and the associated posterior defines an unsupervised contamination classifier. Like Huber or Student-$t$, NBAM can replace the standard training loss in any supervised pipeline; unlike them, it additionally learns a structured contamination model and returns a calibrated per-sample contamination posterior. A learned input-dependent prior $π_ϕ(x)$ captures the spatial locality of contamination, so that samples near known corruptions are more likely to be flagged, while an Occam penalty emerges automatically and regularises against over-flagging. On CIFAR-10 with asymmetric label contamination, NBAM recovers the structure of the corruption process without supervision: the contamination posterior separates clean from corrupted samples, and the learned anomaly head identifies the direction of every label-flip pair. Alongside these capabilities, NBAM outperforms the four robust-loss baselines considered here at contamination rates 0.2-0.6.
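The latent-switch idea in the abstract above can be sketched for a single scalar residual. This is only an illustration under stated assumptions, not NBAM's actual formulation: both components are taken to be zero-mean Gaussians, the contamination prior `pi` is a fixed number rather than the learned input-dependent prior $π_ϕ(x)$, and the function names and scale parameters are hypothetical.

```python
import math


def gaussian_logpdf(r, sigma):
    """Log-density of a zero-mean Gaussian with standard deviation sigma at r."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * (r / sigma) ** 2


def mixture_loss_and_posterior(residual, pi, sigma_clean=1.0, sigma_contam=10.0):
    """Two-component latent-switch mixture for one residual (illustrative only).

    Returns (loss, posterior), where loss is the negative log marginal
    likelihood and posterior is P(contaminated | residual).
    Assumes 0 < pi < 1.
    """
    log_clean = math.log1p(-pi) + gaussian_logpdf(residual, sigma_clean)
    log_contam = math.log(pi) + gaussian_logpdf(residual, sigma_contam)
    # Log-sum-exp for numerical stability.
    m = max(log_clean, log_contam)
    log_marginal = m + math.log(math.exp(log_clean - m) + math.exp(log_contam - m))
    posterior = math.exp(log_contam - log_marginal)
    return -log_marginal, posterior


if __name__ == "__main__":
    for r in (0.0, 2.0, 8.0):
        loss, post = mixture_loss_and_posterior(r, pi=0.1)
        print(f"residual={r:4.1f}  loss={loss:7.3f}  P(contaminated)={post:.3f}")
```

In this toy setting the same two log-terms yield both quantities: their log-sum gives the training loss, and their normalized ratio gives the per-sample contamination probability, which is the dual role the abstract describes.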


2606.16224 2026-06-16 stat.AP 新提交

A Bayesian hierarchical model for meta-analysis

用于元分析的贝叶斯分层模型

Jing Dai, Sijie Xu, Shufei Ge

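AI summary: Proposes a Bayesian hierarchical meta-analysis framework that uses analytic integration to obtain robust parameter estimates in small samples, and applies it to compare the safety of oxcarbazepine and carbamazepine for treating epilepsy, finding a lower risk of side effects for the former.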

Comments 8 pages, 1 figure, 8 tables

2606.16080 2026-06-16 stat.ME 新提交

Bayesian joint modelling using semiparametric accelerated failure time approaches

使用半参数加速失效时间方法的贝叶斯联合建模

Ding Ma, Patrick Maher, Andrew Martin

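AI summary: Proposes a class of semiparametric accelerated failure time joint models that directly model covariate effects on event timing and flexibly capture longitudinal-event associations; estimation is Bayesian, and the models are more flexible and interpretable than proportional hazards models.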


AI中文摘要

纵向临床研究通常收集生物标志物或健康相关生活质量的重复测量以及时间至事件结局。这些过程本质上是相互关联的：纵向轨迹可能预测事件风险，而事件发生或其预期可能导致纵向过程的信息性删失。联合模型为处理这种依赖性提供了原则性框架，但大多数现有公式依赖于比例风险假设，这可能具有限制性，并在时间尺度上提供有限的可解释性。我们提出了一类半参数加速失效时间联合模型，直接建模协变量对事件时间的影响，同时灵活捕捉纵向-事件关联。生存部分通过加速失效时间模型指定，基线部分由灵活基展开表示，允许广泛的平滑基线规范。我们使用Bernstein多项式基线表示说明该框架，并引入重缩放策略以提高时间扭曲下的数值稳定性和参数可识别性。在贝叶斯框架内进行估计，实现对纵向、生存和关联参数的联合推断。模拟研究反映了现实的纵向轨迹、删失机制和依赖结构，用于评估有限样本性能。当事件风险依赖于潜在纵向过程时，所提出的模型与独立的线性混合模型相比，显示出对纵向治疗效应的更好恢复。总体而言，该框架通过提供一种灵活且可解释的比例风险方法替代方案，扩展了现有的联合建模方法。

英文摘要

Longitudinal clinical studies often collect repeated measurements of biomarkers or health-related quality of life together with a time-to-event outcome. These processes are intrinsically linked: longitudinal trajectories may predict event risk, while event occurrence, or its anticipation, can induce informative censoring of the longitudinal process. Joint models provide a principled framework for handling this dependence, but most existing formulations rely on proportional hazards assumptions that may be restrictive and offer limited interpretability on the time scale. We propose a class of semiparametric accelerated failure time joint models that directly model covariate effects on event timing while flexibly capturing longitudinal-event associations. The survival component is specified through an accelerated failure time model with the baseline component represented by a flexible basis expansion, allowing a broad class of smooth baseline specifications. We illustrate the framework using Bernstein polynomial baseline representations and introduce rescaling strategies to improve numerical stability and parameter identifiability under time-warping. Estimation is conducted within a Bayesian framework, enabling joint inference for longitudinal, survival, and association parameters. Simulation studies reflecting realistic longitudinal trajectories, censoring mechanisms, and dependence structures are used to evaluate finite-sample performance. The proposed models show improved recovery of longitudinal treatment effects compared with a standalone linear mixed model when event risk depends on the underlying longitudinal process. Overall, the framework extends existing joint modelling methodology by offering a flexible and interpretable alternative to proportional hazards-based approaches.


2606.15837 2026-06-16 cs.CV cs.LG stat.ME stat.ML 新提交

Learning a Sampling-Free Variational DNN Plugin from Tiny Training Sets to Refine OOD Segmentation With Uncertainty Estimation

学习一种无采样的变分DNN插件，从微小训练集精炼OOD分割并估计不确定性

Jimut B. Pal, Suyash P. Awate

发表机构 * Centre for Machine Intelligence and Data Science (C-MInDS), Indian Institute of Technology (IIT) Bombay（印度理工学院孟买分校机器智能与数据科学中心）； Computer Science and Engineering (CSE) Department, Indian Institute of Technology (IIT) Bombay（印度理工学院孟买分校计算机科学与工程系）

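AI summary: Proposes VarDeepPCA, a lightweight variational DNN framework that learns a distribution of valid anatomical geometries from small in-distribution datasets, without target-domain data or pre-training; a reinterpretation of the softmax mapping enables sampling-free inference with uncertainty estimates, and the method significantly improves the anatomical plausibility and accuracy of OOD segmentation across 4 clinical applications.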

Comments Accepted at the Journal of Machine Learning for Biomedical Imaging


AI中文摘要

深度神经网络（DNN）由于扫描仪和采集协议的变化，经常无法泛化到分布外（OOD）的医学图像。由于获取和标注新医学数据集的成本高昂，重新训练DNN模型以应对这些分布偏移通常不切实际。为了解决这个问题，我们引入了VarDeepPCA，一种新颖的轻量级变分DNN框架，旨在通过利用内在几何先验来恢复/精炼退化的分割图。与需要目标域数据或大量预训练的现有方法不同，我们的VarDeepPCA仅使用小的分布内（ID）数据集显式学习有效解剖几何的分布。理论上，我们的新颖变分学习框架利用对softmax映射的重新解释来隐式执行精确分布建模，从而实现计算高效、无采样的学习和推理。这也使VarDeepPCA能够为其恢复的分割图提供不确定性估计。我们在4种不同的临床应用上，使用14个公开可用的数据集，涉及心肌、神经视网膜边缘、前列腺和胎儿头部分割，对我们的框架进行了实证验证。与15种现有方法的比较表明，VarDeepPCA一致地恢复了现有方法在OOD数据上产生的分割图，以（i）显著提高几何的解剖合理性和分割的临床实用性，以及（ii）显著减少误差，而不需要比现有方法更多的训练数据。

英文摘要

Deep neural networks (DNNs) frequently fail to generalize to out-of-distribution (OOD) medical images because of variations in scanners and acquisition protocols. Retraining DNN models to address these distribution shifts is often impractical due to the high cost of acquiring and annotating new medical datasets. To address this, we introduce VarDeepPCA, a novel lightweight variational DNN framework designed to restore/refine degraded segmentation maps by leveraging intrinsic geometric priors. Unlike existing approaches that require target-domain data or extensive pre-training, our VarDeepPCA explicitly learns a distribution of valid anatomical geometries using only small in-distribution (ID) datasets. Theoretically, our novel variational learning framework leverages a reinterpretation of the softmax mapping to implicitly perform exact distribution modeling, thereby enabling computationally efficient, sampling-free learning and inference. This also enables VarDeepPCA to provide uncertainty estimates associated with its restored segmentation maps. We empirically validate our framework across 4 distinct clinical applications, using 14 publicly available datasets, involving segmentation of the myocardium, neuroretinal rim, prostate, and fetal head. Comparisons against 15 existing methods demonstrate that VarDeepPCA consistently restores segmentation maps produced by the existing methods on OOD data to (i) significantly improve anatomical plausibility of geometries and clinical utility of the segmentations, and (ii) significantly reduce errors, without needing any more training data than that used by existing methods.


2606.15525 2026-06-16 stat.AP stat.ME 新提交

Modeling Nonlinear Ability Trajectories and Learner Heterogeneity in Online Learning: A Bayesian Nonparametric Dynamic IRT Framework

在线学习中非线性能力轨迹与学习者异质性建模：一种贝叶斯非参数动态IRT框架

Zhihua Ma, Alice Xu, Icy Zhang, Guanyu Hu

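AI summary: Proposes a Bayesian nonparametric dynamic IRT framework that uses B-spline basis functions to capture nonlinear effects and an MFM prior to determine the number of clusters automatically, overcoming the limitations of linearity assumptions, pre-specified cluster counts, and the inability to track longitudinal dynamics; applied to data from 198 undergraduates, it identifies four learner profiles.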


AI中文摘要

在线学习放大了理解学生参与模式如何影响学习成果的需求，特别是在技术中介环境的灵活性下。为此，我们提出了一种贝叶斯非参数动态项目反应理论（IRT）框架，用于追踪个体内部在教学单元中的能力轨迹。该模型整合了B样条基函数展开以捕捉参与行为对能力漂移的非线性效应，同时采用有限混合混合（MFM）先验自动确定潜在学习者聚类的数量。该框架克服了现有文献中的三个局限：（1）参与-能力关系中的刚性线性假设，（2）对预设聚类数的依赖，以及（3）无法追踪纵向能力动态。我们将该模型应用于198名本科生在CourseKata上完成9章入门统计学课程的纵向数据。模型自动识别出四种不同的学习者轮廓：挣扎下降型（11%）、低稳定型（23%）、主流稳定型（55%）和高进步型（12%）。结果表明，能力轨迹在各章节中保持显著稳定，且参与数量指标未能显著预测能力漂移。这些发现表明，在入门级在线统计教育中，学术能力主要反映一种稳定的预先存在的特征，而非动态可变的课程结果。最终，该框架为学习者画像提供了一种灵活工具，以指导适应性教学设计。

英文摘要

Online learning has amplified the need to understand how student engagement patterns influence learning outcomes, particularly given the flexibility of technology-mediated environments. To address this, we propose a Bayesian nonparametric dynamic item response theory (IRT) framework that tracks within-individual ability trajectories across instructional units. The proposed model integrates B-spline basis expansions to capture nonlinear effects of engagement behaviors on ability drift, alongside a Mixture-of-Finite-Mixtures (MFM) prior to automatically determine the number of latent learner clusters. This framework overcomes three limitations in the existing literature: (1) rigid linearity assumptions in engagement-ability relationships, (2) dependence on pre-specified cluster counts, and (3) the inability to track longitudinal ability dynamics. We apply the model to longitudinal data from 198 undergraduates completing a 9-chapter introductory statistics course on CourseKata. The model automatically identified four distinct learner profiles: struggling-declining (11%), low-stable (23%), mainstream-stable (55%), and high-improving (12%). Results indicate that ability trajectories remained remarkably stable across chapters, and engagement quantity metrics did not significantly predict ability drift. These findings suggest that in introductory online statistics education, academic ability primarily reflects a stable pre-existing characteristic rather than a dynamically malleable course outcome. Ultimately, this framework offers a flexible tool for learner profiling to inform adaptive instructional design.


2606.14800 2026-06-16 stat.ME cs.LG eess.IV stat.ML 新提交

Bridging data-driven priors via the score function for posterior sampling -- Comparative review and experimental study

通过得分函数桥接数据驱动先验进行后验采样——比较综述与实验研究

Elhadji Cisse Faye, Mame Diarra Fall, Sylvain Delchini, Nicolas Dobigeon

发表机构 * IDP, Univ Orléans（IDP，奥尔良大学）； LITIS, Univ Rouen Normandie（LITIS，鲁昂-诺曼底大学）； Bureau de Recherches Géologiques et Minières Orléans, France（奥尔良地质与矿业研究局，法国）； IRIT, Univ Toulouse（图卢兹大学IRIT）

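AI summary: This paper reviews how a variety of data-driven priors for Bayesian inverse problems can be unified through their score functions, shows that they integrate effectively into a sampling algorithm, and validates the efficiency and generality of the approach with image inpainting and super-resolution experiments.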


AI中文摘要

本文综述了贝叶斯逆问题中常用的多种数据驱动先验如何通过各自的得分函数统一起来。通过将这些先验置于这一共同视角下，我们表明它们可以受益于直接且有效地集成到最近提出的采样算法中。通过考虑几种数据驱动先验，即去噪正则化、基于归一化流的先验、基于得分的生成模型和凸脊正则化，说明了这一通用框架的适用性。对于这四种特定的先验，在图像修复和单图像超分辨率任务中评估了该方法的性能。这些结果以及在地质背景下恢复真实图像的结果证明了该方法的效率。这一统一框架证明足够通用，能够处理由广泛类别的基于得分函数的先验定义的任何后验分布，而不仅限于本文考虑的具体情况。

英文摘要

This paper reviews how a diverse set of popular data-driven priors commonly used in Bayesian inverse problems can be unified through their respective score functions. By framing these priors under this common perspective, we show that they can be integrated straightforwardly and effectively into a recently proposed sampling algorithm. The applicability of this common framework is illustrated by considering several data-driven priors, namely regularization-by-denoising, normalizing flow-based priors, score-based generative models, and convex-ridge regularizers. For these four particular priors, the performance of the method is evaluated when conducting image inpainting and single image super-resolution. These results, as well as those obtained when restoring real images acquired in a geological context, demonstrate the efficiency of the method. This unified framework proves versatile enough to handle any posterior distribution defined by a broad class of score function-based priors, beyond the specific cases considered in this paper.


2509.21734 2026-06-16 stat.ME 版本更新

Optimal Stopping for Sequential Bayesian Experimental Design

序贯贝叶斯实验设计的最优停止

Chen Cheng, Xun Huan

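AI summary: For the question of when to stop in sequential experimental design, proposes a Bayesian optimal stopping framework based on a Markov decision process and uses a curriculum learning strategy to avoid the local-optimum traps of joint training.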


AI中文摘要

序贯贝叶斯实验设计通常假设实验次数在数据收集开始前是固定的。然而，在实际操作中，实验可能需要提前终止，因为额外的测量相对于其成本可能提供递减的信息，从而引发核心决策问题：何时应该停止？常见的基于阈值的停止规则易于实现但目光短浅，因为它们将当前状态与固定标准进行比较，而未考虑未来实验的预期价值。本文通过将停止和设计表述为马尔可夫决策过程中的耦合决策，为序贯实验设计开发了一个贝叶斯最优停止框架。我们证明，对于任何设计策略，最优停止规则恰好当立即终止奖励超过预期继续价值时终止。然后，我们推导出一种用于学习基于价值的停止和设计策略的策略梯度方法。朴素的联合训练可能产生循环依赖，使学习陷入早期停止的局部最优。我们通过一种课程学习策略解决了这一困难，该策略在训练过程中逐渐从强制继续过渡到自适应停止。在线性高斯基准、一维非线性测试问题以及污染物源检测问题上的数值研究表明，所提出的方法学习了稳定的设计-停止策略，并提高了资源感知性能，在具有强序贯依赖的设置中增益最大。

英文摘要

Sequential Bayesian experimental design is often formulated as a fixed-horizon policy optimization problem, in which the number of experiments is specified before data collection begins. In practical campaigns, however, additional measurements may provide diminishing information relative to their cost, making termination an integral part of experimental design. Common threshold-based stopping rules are easy to implement but myopic, because they compare the current state with a fixed criterion rather than the expected value of future experiments. This work develops a Bayesian optimal stopping framework for sequential experimental design by treating design and stopping as coupled decisions in a finite-horizon sequential decision problem. We prove that, for any fixed design policy, the optimal stopping rule terminates when the immediate terminal reward is no smaller than the expected continuation value. We then derive a policy-gradient method for learning continuous design policies with value-based stopping. The resulting optimization is challenging because the design policy, continuation value, and stopping boundary are mutually dependent, and naïve training can become trapped in early-stopping local optima. To address this difficulty, we introduce a curriculum strategy that gradually transitions from forced continuation to adaptive stopping during training. Numerical studies on a linear-Gaussian benchmark, a nonlinear test case, and a contaminant source detection problem show that the proposed approach learns stable, resource-aware design-stopping policies, with the largest gains in settings with strong sequential dependence.
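The stopping rule stated in the abstract above (terminate when the immediate terminal reward is no smaller than the expected continuation value) can be illustrated by backward induction on a small finite-horizon problem. The sketch below is a toy under stated assumptions and not the paper's policy-gradient method: the design policy is held fixed, the state is a hypothetical integer "information count", and the reward `sqrt(s) - cost * t` and the coin-flip transitions are invented for illustration.

```python
import math
from functools import lru_cache


def solve_stopping(terminal_reward, transitions, horizon, s0):
    """Backward induction for a finite-horizon stopping problem.

    terminal_reward(t, s): reward received if we stop at stage t in state s.
    transitions(t, s): list of (probability, next_state) pairs if we continue.
    Returns (value at the initial state, dict mapping (t, s) -> stop decision).
    """
    stop = {}

    @lru_cache(maxsize=None)
    def value(t, s):
        reward_now = terminal_reward(t, s)
        if t == horizon:
            stop[(t, s)] = True  # forced to stop at the horizon
            return reward_now
        continuation = sum(p * value(t + 1, s2) for p, s2 in transitions(t, s))
        # Stop exactly when stopping is at least as good as continuing.
        stop[(t, s)] = reward_now >= continuation
        return max(reward_now, continuation)

    return value(0, s0), stop


def coin_transitions(t, s):
    """Each experiment adds one unit of information with probability 0.5."""
    return [(0.5, s), (0.5, s + 1)]


def make_reward(cost):
    """Diminishing information gain minus a linear per-experiment cost."""
    return lambda t, s: math.sqrt(s) - cost * t


if __name__ == "__main__":
    for cost in (0.0, 0.2, 10.0):
        v0, stop = solve_stopping(make_reward(cost), coin_transitions, 5, 0)
        print(f"cost={cost:5.1f}  value={v0:6.3f}  stop at start={stop[(0, 0)]}")
```

A threshold rule would compare only the current state with a fixed criterion; here the decision at each state depends on the expected value of all future experiments, which is the non-myopic behavior the abstract contrasts with threshold-based stopping.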


2511.03954 2026-06-16 stat.ME stat.CO 版本更新

关系结构因果模型

Adiba Ejaz, Elias Bareinboim

发表机构 * Causal Artificial Intelligence Lab, Columbia University（哥伦比亚大学因果人工智能实验室）

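AI summary: Proposes relational structural causal models, extending structural causal models to settings where objects and relations vary; relational causal graphs and symbolic identification criteria enable identification of causal and observational queries about unseen combinations, and the proposed relational neural causal models outperform non-relational baselines on traffic scenes.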

Comments Proceedings of the Forty-Third International Conference on Machine Learning


AI中文摘要

人工智能必须拥有一个因果的环境模型，支持关于干预和反事实的推理，同时具有组合性，支持对未见过的对象组合进行泛化。在这项工作中，我们正式研究了何时以及如何学习这样的模型。我们开发了关系结构因果模型，将结构因果模型（Pearl 2009）扩展到对象及其关系变化的场景。首先，我们展示了在没有进一步假设的情况下，不仅因果查询，而且关于未见对象组合的观测查询的答案也无法被识别。为了实现这种识别——包括在存在未观测混杂的情况下——我们定义了关系因果图并推导了符号识别准则。最后，我们提出了关系神经因果模型，这是一种可证明正确的方法，在具有不同汽车、信号和行人的模拟交通场景中优于非关系基线。

英文摘要

An artificial intelligence must have a model of its environment that is causal, supporting reasoning about interventions and counterfactuals, and also combinatorial, supporting generalization to unseen combinations of objects. In this work, we formally study when and how such a model can be learned. We develop relational structural causal models, extending structural causal models (Pearl 2009) to settings where objects and their relations vary. First, we show that answers to not only causal but also observational queries about unseen combinations of objects cannot be identified without further assumptions. To enable such identification--including in the presence of unobserved confounding--we define relational causal graphs and derive symbolic identification criteria. Finally, we propose relational neural causal models, a provably correct approach that outperforms non-relational baselines on simulated traffic scenes with varying cars, signals, and pedestrians.


2606.14840 2026-06-16 stat.ME 新提交

Causal Sufficient Dimension Reduction for Multiple Continuous Exposures with an Application to Environmental Mixtures

多连续暴露的因果充分降维及其在环境混合物中的应用

Thomas W. Hsiao, Howard H. Chang, Razieh Nabi

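AI summary: Proposes a causal sufficient dimension reduction (CSDR) framework that characterizes the causal exposure-response surface through low-dimensional exposure summaries, designs a two-stage estimator, and validates it in an environmental mixtures study.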


AI中文摘要

图对齐凸松弛中的相变

Laurent Massoulié, Sushil Mahavir Varma, Louis Vassaux, Irène Waldspurger

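AI summary: Studies graph alignment for correlated GOE matrices, analyzes convex relaxations, proves that the solution concentrates around the true permutation when the correlation parameter satisfies σ = o(n^{-1/2}/log^4 n), and characterizes the phase-transition threshold.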

Comments Accepted for presentation at the Conference on Learning Theory (COLT) 2026


AI中文摘要

我们研究了相关高斯正交系综（GOE）矩阵的图对齐问题，目标是在给定两个相关对称高斯矩阵$(A, B)$（相关性为$1/\sqrt{1+σ^2}$）的情况下恢复隐藏的顶点排列。虽然最大似然估计在信息论上是最优的，但其计算归结为二次分配问题，难以处理。受此启发，我们分析了基于在双随机矩阵集和单位超立方体上最小化$\|AX - XB\|_F$的凸松弛。我们证明，当相关参数满足$σ = o(n^{-1/2}/\log^4 n)$时，任一松弛的解$(X^\star)$集中在真实排列矩阵$(Π^\star)$附近，即$\|X^\star-Π^\star\|_F^2 = o(n)$，这意味着在简单的后处理后可以恢复除消失比例顶点外的所有顶点。结合现有下界，我们的结果精确刻画了$\|X^\star-Π^\star\|_F^2$从$σ = \tilde{o}(n^{-1/2})$时的$o(n)$到$σ = \tilde{Ω}(n^{-1/2})$时的$Ω(n)$的转变。在此过程中，我们的分析显著收紧了先前的结果，并将其扩展到双随机松弛之外。

英文摘要

We study the graph alignment problem for correlated Gaussian Orthogonal Ensemble (GOE) matrices, where the goal is to recover a hidden vertex permutation given two correlated symmetric Gaussian matrices $(A, B)$ with correlation $1/\sqrt{1+σ^2}$. While the maximum likelihood estimator is information-theoretically optimal, its computation, which reduces to a quadratic assignment problem, is intractable. Motivated by this, we analyze convex relaxations based on minimizing $\|AX - XB\|_F$ over the set of doubly stochastic matrices and the unit hypercube. We show that when the correlation parameter satisfies $σ = o(n^{-1/2}/\log^4 n)$, the solution of either relaxation $(X^\star)$ concentrates around the ground-truth permutation matrix $(Π^\star)$, i.e., $\|X^\star-Π^\star\|_F^2 = o(n)$, implying recovery of all but a vanishing fraction of vertices after simple post-processing. Combined with existing lower bounds, our results precisely characterize the transition of $\|X^\star-Π^\star\|_F^2$ from $o(n)$ for $σ = \tilde{o}(n^{-1/2})$ to $Ω(n)$ for $σ = \tilde{Ω}(n^{-1/2})$. In doing so, our analysis significantly tightens prior results and extends them beyond doubly stochastic relaxations.


2605.25855 2026-06-16 stat.ME math.ST stat.ML stat.TH 版本更新

High-Dimensional Robust Change-Point Detection via Angular Kernel Statistics

高维稳健变点检测：基于角核统计量

Jyotishka Ray Choudhury, Yao Xie

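AI summary: For high-dimensional, low sample size (HDLSS) data, proposes a dimension-averaged angular kernel scan framework that aggregates bounded one-dimensional angular discrepancies across coordinates to achieve nonparametric, hyperparameter-free, moment-agnostic change-point detection, with statistical inference guarantees for both offline and online procedures.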


AI中文摘要

我们研究在必须从少量观测批次中进行推断的高维数据变点检测问题。主要关注高维低样本量（HDLSS）情形，其中序列长度固定而环境维度发散。我们提出一种维度平均的角核扫描框架，用于检测边际分布变化。该统计量聚合跨坐标的有界一维角差异，得到一个完全非参数、无超参数且不依赖矩的估计量，该估计量在无需指定、估计或假设有限边际矩（例如在重尾或污染分布下）的情况下仍然定义良好。对于离线单变点问题，我们推导出精确的总体均值分解为通用确定性形状函数和标量信号因子，将零假设协方差结构表征至标量长期方差因子，并建立了跨坐标混合下的HDLSS多元中心极限定理。这些结果导致插件高斯校准、渐近第一类错误控制以及功效和定位保证，包括$d^{-1/2}$局部检测尺度。我们进一步将离线过程扩展为针对高维流数据的固定窗口序贯监测过程，并获得了ARL校准和最坏情况EDD界。模拟研究表明，所提方法能够在具有挑战性的HDLSS和流设置中准确检测和定位变化，而基于矩或超参数敏感的程序可能不可靠。

英文摘要

We study nonparametric change-point detection for high-dimensional data in regimes where inference must be performed from small batches of observations. Our primary focus is the high-dimensional, low sample size (HDLSS) regime, where the sequence length is fixed while the ambient dimension diverges. We propose a dimension-averaged angular kernel scan framework for detecting marginal distributional shifts. The statistic aggregates bounded one-dimensional angular discrepancies across coordinates, yielding a fully nonparametric, hyperparameter-free, and moment-agnostic estimator that remains well-defined without specifying, estimating, or assuming finite marginal moments; for example, under heavy-tailed or contaminated distributions. For the offline single-change problem, we derive an exact population mean factorization into a universal deterministic shape function and a scalar signal factor, and characterize the exact null covariance structure up to a scalar variance factor, both valid for any fixed sample size and dimension. We also establish an HDLSS multivariate central limit theorem under cross-coordinate strong mixing which leads to a variance-calibrated asymptotically distribution-free test, asymptotic type-I error control, and lower bounds on power and localization accuracy. We further extend the offline procedure to a fixed-window sequential monitoring procedure for high-dimensional streaming data, and obtain ARL calibration and worst-case Pollak EDD bounds. Simulation studies demonstrate that the proposed method can accurately detect and localize changes in many challenging HDLSS and streaming high-dimensional settings where moment-based or hyperparameter-sensitive procedures may be extremely unstable or inaccurate.


2512.12003 2026-06-16 stat.ME math.ST stat.TH 版本更新

Debiased Inference for High-Dimensional Regression Models Based on Profile M-Estimation

基于剖面M估计的高维回归模型去偏推断

Yuhao Deng, Yi Wang, Yu Gu, Yuanjia Wang, Donglin Zeng

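AI summary: Proposes the Debiased Profile M-Estimation (DPME) framework, which applies a one-step Newton-Raphson correction to obtain asymptotically normal inference for regularized estimators in high-dimensional regression models, without explicit projections and at low computational cost.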


AI中文摘要

高维回归模型的去偏推断近年来受到广泛关注，以确保正则化估计量具有有效的推断。许多现有方法通过显式构造到 nuisance 参数空间的投影来实现 Neyman 正交性，但当投影的显式形式不可用时，这些方法不可行。我们引入了一个通用的去偏框架，即去偏剖面 $M$-估计（DPME），它适用于广泛的模型类别，并且不需要像现有方法那样进行模型特定的 Neyman 正交化或投影推导。我们的方法首先通过优化惩罚目标函数获得参数的初始估计量。为了纠正惩罚引入的偏差，我们使用牛顿-拉夫森更新构造一个一步估计量，该更新应用于剖面函数的梯度，其中剖面函数定义为在保持感兴趣参数固定时的最优目标函数。我们使用数值微分，无需显式计算梯度。得到的 DPME 估计量被证明是渐近线性和正态分布的。通过大量模拟，我们证明了所提出的方法在显著降低计算成本的同时，实现了比现有替代方法更好的覆盖率。最后，我们通过将方法应用于估计多发性骨髓瘤的治疗规则来说明其实用性。

英文摘要

Debiased inference for high-dimensional regression models has received substantial recent attention as a way to ensure that regularized estimators yield valid inference. Many existing methods focus on achieving Neyman orthogonality through explicitly constructing projections onto the space of nuisance parameters, which is infeasible when an explicit form of the projection is unavailable. We introduce a general debiasing framework, Debiased Profile $M$-Estimation (DPME), which applies to a broad class of models and does not require model-specific Neyman orthogonalization or projection derivations as in existing methods. Our approach begins with obtaining an initial estimator of the parameters by optimizing a penalized objective function. To correct for the bias introduced by penalization, we construct a one-step estimator using the Newton--Raphson update, applied to the gradient of a profile function defined as the optimal objective function with the parameter of interest held fixed. We use numerical differentiation without requiring explicit calculation of the gradients. The resulting DPME estimator is shown to be asymptotically linear and normally distributed. Through extensive simulations, we demonstrate that the proposed method achieves better coverage rates than existing alternatives at substantially lower computational cost. Finally, we illustrate the utility of our method by applying it to estimate a treatment rule for multiple myeloma.
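The one-step construction described above (a Newton-Raphson update applied to a profile function, with derivatives obtained numerically) can be sketched in one dimension. This is a toy under stated assumptions, not the DPME estimator itself: the objective, its closed-form inner minimizer, the starting value, and the step size `h` are all hypothetical, and no penalization or high-dimensional nuisance parameter is involved.

```python
def profile(theta):
    """Profile of the toy objective L(theta, eta) = (theta-2)^2 + (eta-theta)^2 + eta^2.

    For fixed theta the inner minimizer is eta = theta / 2 (set dL/d eta = 0),
    so the profile function is the objective evaluated at that value.
    """
    eta_hat = theta / 2.0
    return (theta - 2.0) ** 2 + (eta_hat - theta) ** 2 + eta_hat ** 2


def one_step_newton(f, theta0, h=1e-4):
    """One Newton-Raphson step on f using central finite differences."""
    grad = (f(theta0 + h) - f(theta0 - h)) / (2.0 * h)
    hess = (f(theta0 + h) - 2.0 * f(theta0) + f(theta0 - h)) / (h * h)
    return theta0 - grad / hess


if __name__ == "__main__":
    theta_init = 1.0  # stands in for an initial (biased) estimate
    theta_one_step = one_step_newton(profile, theta_init)
    print(f"initial={theta_init:.4f}  one-step={theta_one_step:.4f}")
```

Because this toy profile is quadratic, a single step lands on its minimizer 4/3 from any starting point; in general a one-step update only corrects the initial estimate to first order, which is why the quality of the initial estimator matters.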


2510.13715 2026-06-16 stat.ME 版本更新

Exact Coordinate Descent for High-Dimensional Regularized Huber Regression

高维正则化Huber回归的精确坐标下降法

Younghoon Kim, Po-Ling Loh, Sumanta Basu

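AI summary: Proposes an exact coordinate descent algorithm for high-dimensional Huber regression with an elastic net penalty; adaptive variable screening rules accelerate convergence, and the algorithm remains stable and efficient under heavy-tailed noise and highly correlated predictors.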


AI中文摘要

在这项研究中，针对弹性网惩罚的高维Huber回归，开发了一种精确坐标下降算法。与现有的梯度下降或坐标下降类方法不同，即使当协变量之间由于重尾分布而产生高相关性导致Hessian矩阵病态时，该算法仍然有效。对于每个坐标，边际增量仅来自内点观测值，而导数在基于部分残差构建的网格上保持单调递增。基于传统的坐标下降框架，提出了自适应变量筛选规则，以选择性地确定每次迭代中更新哪些变量，从而加速收敛。对所提出算法的收敛性进行了正式分析，并提出了实用的计算策略以加速其执行。这些增强确保了算法即使在具有挑战性的场景下也能快速稳定地运行。涉及重尾噪声和高相关预测变量的大量模拟研究以及实际数据应用，展示了该方法的实际效率以及计算增强的益处。

英文摘要

In this study, an exact coordinate descent algorithm is developed for high-dimensional Huber regression regularized with an elastic net penalty. Unlike existing gradient descent or coordinate descent-type methods, this algorithm remains effective even when the Hessian becomes ill-conditioned due to high correlations between covariates drawn from heavy-tailed distributions. For each coordinate, marginal increments arise solely from inlier observations, while the derivatives remain monotonically increasing over a grid constructed from the partial residuals. Building on conventional coordinate descent frameworks, adaptive variable screening rules are proposed to selectively determine which variables to update at each iteration, thereby accelerating convergence. The convergence of the proposed algorithm is formally analyzed, and practical computational strategies are presented to speed up its execution. These enhancements ensure that the algorithm operates rapidly and stably even under challenging scenarios. Extensive simulation studies involving heavy-tailed noise and highly correlated predictors, along with a real-world data application, demonstrate both the practical efficiency of this method and the benefits of the computational enhancements.


2306.02244 2026-06-16 math.ST stat.ME stat.TH 版本更新

KL-BSS: Rethinking optimality for neighbourhood selection in structural equation models

KL-BSS：重新思考结构方程模型中邻域选择的最优性

Ming Gao, Wai Ming Tai, Bryon Aragam

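AI summary: Proposes KL-BSS, which exploits the underlying structure in structural equation models to recover the support of linear models with fewer samples under weaker eigenvalue conditions, and shows experimentally that it outperforms BSS and the Lasso.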


DOI: 10.1093/jrsssb/qkag078

AI中文摘要

我们提出了一种在线性结构方程模型中进行邻域选择的新方法，该方法改进了经典方法如最佳子集选择（BSS）和Lasso。我们的方法称为KL-BSS，利用了SEM中存在的潜在结构——即使这种结构未知——并且可以轻松使用现有求解器实现。与BSS和Lasso相比，在更弱的特征值条件下，KL-BSS能够以更少的样本可证明地恢复线性模型的支持集。我们建立了KL-BSS获得的逐点和极小极大样本复杂度。在真实和模拟数据上的大量实验证实了KL-BSS带来的改进。虽然众所周知Lasso在结构化依赖下会遇到困难，但较少人知道即使是BSS也会遇到麻烦，并且可以显著改进。这些结果对图模型中的结构学习具有启示意义，因为图模型通常依赖邻域选择作为子程序。

英文摘要

We introduce a new method for neighbourhood selection in linear structural equation models that improves over classical methods such as best subset selection (BSS) and the Lasso. Our method, called KL-BSS, takes advantage of the existence of underlying structure in SEM -- even when this structure is unknown -- and is easily implemented using existing solvers. Under weaker eigenvalue conditions compared to BSS and the Lasso, KL-BSS can provably recover the support of linear models with fewer samples. We establish both the pointwise and minimax sample complexity for support recovery, which KL-BSS obtains. Extensive experiments on both real and simulated data confirm the improvements offered by KL-BSS. While it is well-known that the Lasso encounters difficulties under structured dependencies, it is less well-known that even BSS runs into trouble and can be substantially improved. These results have implications for structure learning in graphical models, which often relies on neighbourhood selection as a subroutine.


2508.20278 2026-06-16 stat.ME 版本更新

Interpretable Scalar-on-Image Linear Regression Models via the Generalized Dantzig Selector

通过广义Dantzig选择器的可解释标量对图像线性回归模型

Sijia Liao, Xiaoxiao Sun, Ning Hao, Hao Helen Zhang

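AI summary: Proposes the Generalized Dantzig Selector, which jointly imposes sparsity and smoothness constraints to improve the interpretability of the coefficient function in scalar-on-image regression, with its advantages supported by theory and experiments.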


AI中文摘要

标量对图像回归模型通过估计二元系数函数来研究标量响应与二元函数（例如图像）之间的关联。现有方法通常施加平滑性约束以控制偏差-方差权衡，从而防止过拟合。然而，这种假设可能阻碍可解释性，尤其是当只有图像的某些区域影响响应变化时。在这种情况下，通过对系数函数施加稀疏性假设可以更好地捕捉可解释性。为了解决这一挑战，我们提出了广义Dantzig选择器，一种联合在系数函数上施加稀疏性和平滑性的新方法。所提出的方法通过准确识别对响应变化无贡献的区域来增强可解释性，同时保持估计的稳定性。广泛的模拟研究和实际数据应用表明，新方法具有高度可解释性，并且比现有方法有显著改进。此外，我们严格建立了估计误差的非渐近界，为所提出的框架提供了强有力的理论保证。

英文摘要

The scalar-on-image regression model examines the association between a scalar response and a bivariate function (e.g., images) through the estimation of a bivariate coefficient function. Existing approaches often impose smoothness constraints to control the bias-variance trade-off, and thus prevent overfitting. However, such assumptions can hinder interpretability, especially when only certain regions of an image influence changes in the response. In such a scenario, interpretability can be better captured by imposing sparsity assumptions on the coefficient function. To address this challenge, we propose the Generalized Dantzig Selector, a novel method that jointly enforces sparsity and smoothness on the coefficient function. The proposed approach enhances interpretability by accurately identifying regions with no contribution to the changes of response, while preserving stability in estimation. Extensive simulation studies and real data applications demonstrate that the new method is highly interpretable and achieves notable improvements over existing approaches. Moreover, we rigorously establish non-asymptotic bounds for the estimation error, providing strong theoretical guarantees for the proposed framework.

2407.09964 2026-06-16 math.ST stat.ML stat.TH 版本更新

TrIM: Transformed Iterative Mondrian Forests for Gradient-based Dimension Reduction and High-Dimensional Regression

TrIM: 基于梯度的降维和高维回归的变换迭代Mondrian森林

Ricardo Baptista, Eliza O'Reilly, Yangxinyu Xie

AI总结提出一种计算高效的梯度线性降维和高维回归算法，通过Mondrian森林估计期望梯度外积矩阵，并迭代更新特征和权重以提升回归性能，理论保证一致性。

Comments 49 pages, 19 figures

2606.17014 2026-06-16 cs.LG math.ST stat.ML stat.TH 新提交

Filtered Conformal Ellipsoids for Graph-Native Time Series

图原生时间序列的过滤共形椭球

Yannick Limmer

发表机构 * DRW London（DRW伦敦）

AI总结提出过滤共形椭球方法，结合状态空间滤波与共形校准，为多元时间序列生成联合预测集，控制单事件并适应跨坐标依赖，通过可观测预测律商分析保证覆盖界。

AI中文摘要

多元时间序列的联合预测集应控制单个事件，同时适应跨坐标依赖性。我们研究过滤共形椭球：一个冻结的状态空间滤波器输出一步预测均值和协方差，并对得到的马氏距离分数应用分割共形校准。滤波器用于选择椭球形状；共形校准选择标量半径，因此该构造受益于学习到的预测协方差，而不依赖高斯尾部概率来保证覆盖。主要困难在于过滤分数是依赖的，且学习到的循环滤波器不需要在其原始隐藏状态上收缩；因此，我们分析可观测预测律商中的收缩，该商识别产生相同未来发射高斯律序列的隐藏状态。在稳定的贝叶斯高斯投影滤波器、协方差界和有限时域可观测性费舍尔条件下，小超额高斯负对数似然意味着学习到的发射律的收缩。结合阈值自协方差包络，这给出了依赖下过滤分割共形预测的切比雪夫型近似覆盖界；更尖锐的伯恩斯坦型界需要额外的几何混合集中假设。在高斯预言可实现性下，我们还在条件有效的高斯椭球规则类中获得了接近预言的log体积比较。我们使用具有对角加低秩协方差的GCN-GRU滤波器实例化该框架。在中等规模的图原生交通基准（METRLA-$20$和PEMSBAY-$50$）上，学习到的滤波器比静态协方差和非滤波基线给出更尖锐的目标椭球；在全图规模和非图原生数据集上，因子和copula基线可能更强。

英文摘要

Joint prediction sets for multivariate time series should control a single event while adapting to cross-coordinate dependence. We study filtered conformal ellipsoids: a frozen state-space filter emits a one-step predictive mean and covariance, and split-conformal calibration is applied to the resulting Mahalanobis scores. The filter is used to choose the ellipsoid shape; conformal calibration chooses the scalar radius, so the construction benefits from a learned predictive covariance without relying on Gaussian tail probabilities for coverage. The main difficulty is that filtered scores are dependent and learned recurrent filters need not contract in their raw hidden state; we therefore analyse contraction in an observable predictive-law quotient that identifies hidden states producing the same future sequence of emitted Gaussian laws. Under a stable Bayes Gaussian-projection filter, covariance bounds, and a finite-horizon observability Fisher condition, small excess Gaussian negative log-likelihood implies contraction of the learned emitted laws. Combined with a threshold-autocovariance envelope this yields a Chebyshev-type approximate coverage bound for filtered split-conformal prediction under dependence; a sharper Bernstein-type bound requires an additional geometric-mixing concentration assumption. Under Gaussian oracle realisability we also obtain a near-oracle log-volume comparison within the class of conditionally valid Gaussian ellipsoid rules. We instantiate the framework with a GCN-GRU filter with diagonal-plus-low-rank covariance. On moderate-size graph-native traffic benchmarks (METRLA-$20$ and PEMSBAY-$50$), the learned filter gives sharper at-target ellipsoids than static-covariance and non-filter baselines; at full-graph scale and on non-graph-native datasets, factor and copula baselines can be stronger.
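The split between "filter picks the shape, conformal calibration picks the radius" can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the filter is replaced by a fixed 2-D mean and covariance (the paper uses a learned state-space filter), the score is taken to be the squared Mahalanobis distance, and the data are i.i.d. rather than dependent, so this shows only the calibration step, not the paper's dependence analysis.

```python
import math
import random


def mahalanobis_sq(y, mean, cov):
    """Squared Mahalanobis distance of a 2-D point under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = y[0] - mean[0], y[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def conformal_radius(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this level
    return sorted(scores)[k - 1]


def in_ellipsoid(y, mean, cov, radius):
    """Membership test for the set {y : score(y) <= radius}."""
    return mahalanobis_sq(y, mean, cov) <= radius


rng = random.Random(0)
# Hypothetical stand-in for the filter output: one fixed mean and covariance.
mean = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 2.0))


def draw():
    """Scale-mixture draws, so the data are deliberately not Gaussian under cov."""
    s = rng.choice([1.0, 2.0])
    return (s * rng.gauss(0.0, 1.0), s * rng.gauss(0.0, 1.4))


calibration = [draw() for _ in range(500)]
scores = [mahalanobis_sq(y, mean, cov) for y in calibration]
radius = conformal_radius(scores, alpha=0.1)

test_points = [draw() for _ in range(2000)]
coverage = sum(in_ellipsoid(y, mean, cov, radius) for y in test_points) / len(test_points)
print(f"radius={radius:.3f}, empirical coverage={coverage:.3f}")
```

The covariance here is misspecified on purpose: the radius is read off the calibration scores rather than a chi-squared table, which is why the coverage target does not depend on Gaussian tails.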

2606.16773 2026-06-16 econ.EM stat.ME stat.ML 新提交

Generative Predictive Distributions for Time Series

时间序列的生成式预测分布

Jordi Llorens-Terrazas, Mika Meitz

AI总结提出基于生成式表示的灵活框架，用于建模非线性多变量时间序列的预测分布，通过条件生成对抗网络估计，并建立弱时间依赖下的统计一致性。

AI中文摘要

我们提出了一个灵活的框架，用于建模非线性、可能多变量时间序列的预测分布。我们的方法基于测度论概率中的一个民间结果，在适当的生成式表示中表达一般的预测分布。这种表示为预测分布提供了直接的基于模拟的近似，从而能够直接计算条件均值和方差的预测、扇形图、风险价值、预期亏损、联合尾部风险以及其他感兴趣的量。我们使用条件生成对抗网络的一个版本来估计这种生成式表示，并提供了弱时间依赖下估计的形式化统计分析。具体来说，估计被表述为一个特定的极小极大问题，并且我们建立了其近似解在豪斯多夫距离下的一致性。通过应用于股票收益、已实现方差和已实现协方差的例子，说明了该方法的实证相关性。所提出的方法在计算上也是可管理的，在我们的应用中，在标准笔记本电脑上估计大约需要一分钟。

英文摘要

We propose a flexible framework for modeling the predictive distributions of nonlinear, possibly multivariate time series. Our approach expresses a general predictive distribution in an appropriate generative representation that is based on a folklore result from measure theoretic probability. This representation provides a direct simulation-based approximation to the predictive distribution, enabling straightforward computation of forecasts for the conditional mean and variance, fan charts, value at risk, expected shortfall, joint tail risks, and other quantities of interest. We estimate this generative representation using a version of conditional generative adversarial networks and provide a formal statistical analysis of estimation under weak temporal dependence. Specifically, estimation is expressed as a particular minimax problem and we establish consistency of its approximate solutions in Hausdorff distance. The empirical relevance of the approach is illustrated using applications to equity returns, realized variance, and realized covariances. The proposed method is also computationally manageable, with estimation in our applications taking approximately one minute on a standard laptop.
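To make the "simulation-based approximation" concrete, the sketch below draws from a generator and reads forecasts and risk measures off the simulated sample. The `generator` function is a hypothetical stand-in with made-up coefficients, not the paper's trained conditional GAN, and the quantile convention for value at risk is one common choice among several.

```python
import random
import statistics


def generator(x_prev, z):
    """Hypothetical stand-in for a trained generator: maps the conditioning
    value and one noise draw to one simulated next-period return."""
    scale = 0.01 + 0.5 * abs(x_prev)  # volatility responds to the last return
    return 0.1 * x_prev + scale * z


def predictive_summary(x_prev, n_draws=20000, level=0.05, seed=0):
    """Approximate the predictive distribution by simulation and summarise it."""
    rng = random.Random(seed)
    draws = sorted(generator(x_prev, rng.gauss(0.0, 1.0)) for _ in range(n_draws))
    k = max(1, int(level * n_draws))  # number of draws in the lower tail
    tail = draws[:k]
    return {
        "mean": statistics.fmean(draws),
        "variance": statistics.pvariance(draws),
        # Losses are negated returns; VaR is the empirical level-quantile loss.
        "value_at_risk": -draws[k - 1],
        # Expected shortfall averages the losses at or beyond the VaR.
        "expected_shortfall": -statistics.fmean(tail),
    }


summary = predictive_summary(x_prev=-0.02)
for name, value in summary.items():
    print(f"{name}: {value:.5f}")
```

Any functional of the predictive distribution (fan-chart bands, joint tail probabilities) can be computed from the same sorted draws, which is the practical appeal of a generative representation.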

2606.16677 2026-06-16 stat.AP 新提交

Distributional Forecasting of EU Asylum Applications with Dynamic Multivariate Count Models

欧盟庇护申请分布预测的动态多元计数模型

Gregor Zens, Jakub Bijak

AI总结提出贝叶斯框架联合预测EU-27月度庇护申请分布，分解潜在强度为国家随机游走和共同因子，结合厚尾或随机波动冲击，发现联合模型优于单国模型，尤其在上尾风险中表现显著。

AI中文摘要

我们提出了一个贝叶斯框架，用于联合预测EU-27各国月度庇护申请的分布。该模型将潜在申请强度分解为国家特定的随机游走和共同因子，并允许特质性和共同冲击表现出厚尾或随机波动性。使用2008年至2026年的欧盟统计局数据，我们在滚动样本外预测中评估预测分布，对整体分布准确性和上尾风险进行评分。三个发现浮现：第一，最优规格因国家、评分规则和预测期而异，强调了模型需与政策特定损失函数对齐。第二，联合EU-27模型优于单国基准模型，在上尾（准备成本最相关）中增益最大。第三，随机游走对数强度为国家庇护申请动态提供了有用的短期描述，尤其当与灵活的创新动态结合时。最后，我们讨论了这些发现对涉及庇护预测和准备规划的国家及欧盟层面机构的启示。

英文摘要

We propose a Bayesian framework for joint distributional forecasting of monthly asylum applications across the EU-27. The model decomposes latent application intensities into country-specific random walks and common factors, with idiosyncratic and shared shocks allowed to exhibit heavy tails or stochastic volatility. Using Eurostat data from 2008 to 2026, we evaluate predictive distributions in a rolling out-of-sample exercise, scoring overall distributional accuracy and upper-tail risk. Three findings emerge. First, the preferred specification varies across countries, scoring rules, and horizons, underscoring the need to align models with policy-specific loss functions. Second, joint EU-27 models improve on country-by-country benchmarks, with the largest gains in the upper tail, where preparedness costs are most relevant. Third, random-walk log-intensities provide a useful short-run description of national asylum-application dynamics, especially when combined with flexible innovation dynamics. We conclude by discussing implications for national and EU-level agencies involved in asylum forecasting and preparedness planning.

2606.15953 2026-06-16 stat.ME 新提交

耦合振荡器概率建模的Kuramoto-von Mises时间序列模型

Yun Hwang, Todd P. Coleman

AI总结提出一种不假设热力学平衡的耦合振荡器概率分布估计方法，基于Langevin动力学构建，在高采样率下具有闭式解，在非平衡模拟数据和真实脑/胃电生理数据中优于现有方法。

Comments 15 pages, 4 figures

AI中文摘要

耦合振荡器系统为建模广泛的物理和生物现象提供了基本框架。在神经科学中，中枢神经系统与相邻脑区表现出同步振荡活动，例如在睡眠期间产生行波动力学。类似地，在胃肠系统中，神经肌肉细胞协调其振荡以产生慢波活动的传播波。为了估计多变量相位关系的概率分布，现有方法通常依赖于平衡热力学，通过成对指数族分布以玻尔兹曼形式表达系统。然而，这些假设在现实系统中常常被违反，现实系统本质上是动态的，并经常在平衡和非平衡状态之间转换。为了解决这个问题，我们提出了一种估计耦合振荡器概率分布的有效方法，该方法不假设热力学平衡。通过基于Langevin动力学的构建，该方法即使在非平衡状态下也能实现精确建模。最大似然估计方法在高采样率条件下具有闭式代数解，这一条件通常被现代数据采集系统满足，使其易于实际应用。我们在模拟数据上展示了其鲁棒性，在非平衡设置中优于现有方法，并进一步说明了其在表征脑刺激响应中的动态脑行波以及在人胃电生理记录背景下的假设检验中的实用性。

英文摘要

A system of coupled oscillators provides a fundamental framework for modeling a wide range of physical and biological phenomena. In neuroscience, the central nervous system exhibits synchronized oscillatory activity with adjacent brain regions, giving rise to traveling wave dynamics, for instance during sleep. Similarly, in the gastrointestinal system, neuromuscular cells coordinate their oscillations to generate propagating waves of slow wave activity. To estimate probability distributions of multivariate phase relationships, existing approaches typically rely on equilibrium thermodynamics, expressing the system in a Boltzmann form through a pairwise exponential family distribution. However, these assumptions are often violated in real-world systems, which are inherently dynamic and frequently transition between equilibrium and non-equilibrium regimes. To address this, we propose an efficient method for estimating the probability distribution of coupled oscillators that does not assume thermodynamic equilibrium. Using a Langevin dynamics-based construction, the approach enables accurate modeling even in non-equilibrium regimes. The maximum likelihood estimation method is shown to have a closed-form algebraic solution in the high sampling rate regime, a condition commonly satisfied by modern data acquisition systems, which makes it readily applicable in practice. We demonstrate its robustness on simulated data, where it outperforms existing approaches in non-equilibrium settings, and further illustrate its utility for characterizing dynamic brain traveling waves in response to brain stimulation and in hypothesis testing within the context of electrophysiologic recordings of the human stomach.

2603.00874 2026-06-16 stat.ME 版本更新

Detecting Distributional Differences in Spatially Correlated Multivariate Data via Kernel-Smoothed Rank-Based Empirical Copula Tests

通过核平滑秩经验Copula检验检测空间相关多变量数据的分布差异

Marco Mandap

AI总结针对非正态和空间自相关的多变量产量分布比较，提出基于核平滑经验Copula过程的非参数空间Cramer-von Mises型检验，通过秩变换和空间核权重控制类型I错误，并建立弱收敛理论。

Comments An error was identified in the underlying distribution proof used for the empirical copula test. The authors are withdrawing this version while finalizing a formally verified proof of the distribution in Lean 4

AI中文摘要

比较跨空间参考农业田块的多变量产量质量分布因两个普遍特征而复杂化：非正态性和空间自相关。经典程序如ANOVA、MANOVA和标准秩检验假设独立性，因此在存在空间依赖性时表现出严重的类型I错误膨胀。我们提出了一种基于从池化分量秩构建的核平滑经验Copula过程的非参数空间Cramer-von Mises型检验。空间核权重明确考虑了局部依赖性，而秩变换消除了对边际分布形式的敏感性。在固定域填充渐近性和多项式α混合条件下，我们建立了平滑经验Copula过程向均值为零的高斯极限的弱收敛，并证明了所得二次检验统计量收敛到限制在K-1维对比子空间上的卡方随机变量的加权和。通过在高斯Copula模型下使用精确离散空间协方差算子校准的Satterthwaite近似获得实际推断。双变量对数正态空间数据的蒙特卡洛实验表明，与变得严重反保守的经典参数和非空间秩方法相比，所提出的检验在不同强度的空间依赖性下保持了名义大小。该程序为精准农业及相关应用领域中比较多变量空间产量分布提供了一个理论上合理且计算可行的框架。

英文摘要

Comparing multivariate yield quality distributions across spatially referenced agricultural fields is complicated by two pervasive features: non-normality and spatial autocorrelation. Classical procedures such as ANOVA, MANOVA, and standard rank tests assume independence and therefore exhibit severe Type I error inflation when spatial dependence is present. We propose a nonparametric spatial Cramer-von Mises-type test based on kernel-smoothed empirical copula processes constructed from pooled componentwise ranks. Spatial kernel weights account explicitly for local dependence, while the rank transformation removes sensitivity to marginal distributional form. Under fixed-domain infill asymptotics and polynomial alpha-mixing conditions, we establish weak convergence of the smoothed empirical copula process to a mean-zero Gaussian limit and show that the resulting quadratic test statistic converges to a weighted sum of chi-squared random variables restricted to the K-1-dimensional contrast subspace. Practical inference is obtained through a Satterthwaite approximation calibrated using the exact discrete spatial covariance operator under a Gaussian copula model. Monte Carlo experiments with bivariate log-normal spatial data demonstrate that the proposed test maintains nominal size across varying strengths of spatial dependence, in contrast to classical parametric and non-spatial rank-based methods, which become severely anti-conservative. The procedure provides a theoretically justified and computationally tractable framework for comparing multivariate spatial yield distributions in precision agriculture and related applied settings.

2412.20316 2026-06-16 stat.ME 版本更新

A Rank-Based Test for Comparing Multiple Fields' Yield Quality Distributions Under Spatial Dependence

空间依赖下多个田地产量质量分布比较的基于秩的检验

Marco Mandap

AI总结针对空间依赖和非正态性，提出基于秩的检验框架，利用空间核平滑构建稳健经验分布函数，并证明统计量渐近服从加权卡方分布，通过Satterthwaite近似校正空间方差膨胀。

Comments An error was identified in the underlying distribution proof used for the empirical copula test. The authors are withdrawing this version while finalizing a formally verified proof of the distribution in Lean 4

AI中文摘要

比较多个农业田地的产量质量分布是评估管理实践的基础，但两个普遍存在的数据特征——非正态性和空间自相关——使其复杂化。传统的参数检验（如ANOVA）在空间依赖性违反独立性假设时，常遭受严重的I类错误膨胀。本文引入一种新颖的基于秩的检验框架，利用空间核平滑构建稳健的经验分布函数（EDF）。我们建立了在$\alpha$-混合条件下检验统计量的渐近性质，证明其收敛到加权卡方随机变量之和。为便于实际推断，我们采用Satterthwaite近似推导有效自由度，以考虑空间方差'膨胀'。详细发展了理论框架，为所提方法提供了严格基础。模拟研究和实际产量质量数据的应用留待未来工作。

英文摘要

Comparing yield quality distributions across multiple agricultural fields is fundamental for evaluating management practices, yet it is complicated by two pervasive data characteristics: non-normality and spatial autocorrelation. Traditional parametric tests, such as ANOVA, frequently suffer from severe Type I error inflation when the independence assumption is violated by spatial dependence. This paper introduces a novel rank-based test framework that utilizes spatial kernel smoothing to construct robust empirical distribution functions (EDFs). We establish the asymptotic properties of the test statistic under $α$-mixing conditions, proving its convergence to a weighted sum of chi-squared random variables. To facilitate practical inference, we employ a Satterthwaite approximation to derive effective degrees of freedom that account for the spatial 'inflation' of variance. The theoretical framework is developed in detail, providing a rigorous foundation for the proposed method. Simulation studies and applications to real yield quality data are left to future work.

2511.18553 2026-06-16 math.ST stat.ML stat.TH 版本更新

Matching correlated VAR time series

匹配相关VAR时间序列

Ernesto Araya, Hemant Tyagi

AI总结研究匹配两个相关VAR时间序列数据库的问题，提出概率框架，通过线性分配估计器实现完美或部分恢复，并利用凸松弛高效求解。

AI中文摘要

我们研究了匹配相关VAR时间序列数据库的问题，其中多元时间序列与其扰动和置换版本同时被观测，目标是恢复它们之间的未知匹配。为此，我们引入了一个概率框架，其中两个时间序列$(x_t)_{t\in[T]},(x^\#_t)_{t\in[T]}$联合生成，使得$x^\#_t=x_{\pi^*(t)}+\sigma \tilde{x}_{\pi^*(t)}$，其中$(x_t)_{t\in[T]},(\tilde{x}_t)_{t\in[T]}$是独立同分布的一阶向量自回归（VAR）时间序列，具有高斯增量，$\pi^*$是隐藏的。目标是从观测$(x_t)_{t\in[T]},(x^\#_t)_{t\in[T]}$中恢复$\pi^*$。这推广了经典的匹配独立点云问题到时间序列设置。我们推导了最大似然估计（MLE），导致在排列上的二次优化，并从理论上分析了基于线性分配的估计器。对于后一种方法，我们建立了恢复保证，识别出允许完美或部分恢复的$\sigma$阈值。此外，我们提出通过考虑排列矩阵的凸松弛（例如，在Birkhoff多面体上）来求解MLE。这允许通过交替最小化高效估计$\pi^*$和VAR参数。实验上，我们发现线性分配通常匹配或优于基于MLE松弛的方法。

英文摘要

We study the problem of matching correlated VAR time series databases, where a multivariate time series is observed along with a perturbed and permuted version, and the goal is to recover the unknown matching between them. To model this, we introduce a probabilistic framework in which two time series $(x_t)_{t\in[T]},(x^\#_t)_{t\in[T]}$ are jointly generated, such that $x^\#_t=x_{π^*(t)}+σ\tilde{x}_{π^*(t)}$, where $(x_t)_{t\in[T]},(\tilde{x}_t)_{t\in[T]}$ are independent and identically distributed vector autoregressive (VAR) time series of order $1$ with Gaussian increments, for a hidden $π^*$. The objective is to recover $π^*$, from the observation of $(x_t)_{t\in[T]},(x^\#_t)_{t\in[T]}$. This generalizes the classical problem of matching independent point clouds to the time series setting. We derive the maximum likelihood estimator (MLE), leading to a quadratic optimization over permutations, and theoretically analyze an estimator based on linear assignment. For the latter approach, we establish recovery guarantees, identifying thresholds for $σ$ that allow for perfect or partial recovery. Additionally, we propose solving the MLE by considering convex relaxations of the set of permutation matrices (e.g., over the Birkhoff polytope). This allows for efficient estimation of $π^*$ and the VAR parameters via alternating minimization. Empirically, we find that linear assignment often matches or outperforms MLE relaxation based approaches.
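A minimal sketch of the generative model and of matching by linear assignment, under stated assumptions: the cost is taken to be the squared Euclidean distance between $x^\#_t$ and each candidate $x_s$ (the abstract does not spell out the exact cost), the VAR(1) matrix `A` and noise level are made up, and the assignment is solved by brute force over permutations, which only works for tiny $T$. A practical version would use a polynomial-time solver such as SciPy's `linear_sum_assignment`.

```python
import itertools
import random


def simulate_var1(T, A, rng):
    """Simulate a 2-D VAR(1) path x_t = A x_{t-1} + standard Gaussian noise."""
    x = [0.0, 0.0]
    path = []
    for _ in range(T):
        x = [
            A[0][0] * x[0] + A[0][1] * x[1] + rng.gauss(0.0, 1.0),
            A[1][0] * x[0] + A[1][1] * x[1] + rng.gauss(0.0, 1.0),
        ]
        path.append(x)
    return path


def match_by_linear_assignment(x, x_sharp):
    """Return p with p[t] = index s minimising the total squared distance
    between x_sharp[t] and x[s], searched over all permutations."""
    T = len(x)
    cost = [
        [sum((a - b) ** 2 for a, b in zip(x_sharp[t], x[s])) for s in range(T)]
        for t in range(T)
    ]
    best = min(
        itertools.permutations(range(T)),
        key=lambda p: sum(cost[t][p[t]] for t in range(T)),
    )
    return list(best)


rng = random.Random(1)
T, sigma = 6, 0.01                      # illustrative values only
A = [[0.5, 0.1], [0.0, 0.3]]            # hypothetical stable VAR(1) matrix
x = simulate_var1(T, A, rng)
x_tilde = simulate_var1(T, A, rng)      # independent copy used as the perturbation

pi_star = list(range(T))
rng.shuffle(pi_star)                    # hidden permutation
x_sharp = [
    [x[s][i] + sigma * x_tilde[s][i] for i in range(2)] for s in pi_star
]

pi_hat = match_by_linear_assignment(x, x_sharp)
print("recovered:", pi_hat == pi_star)
```

Over permutations, minimising total squared distance is equivalent to maximising the total inner product between matched pairs, because the sum of squared norms does not depend on the permutation.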

2302.14505 2026-06-16 stat.AP stat.ME 版本更新

Nonlinear regression models to forecast PM$_{2.5}$ concentration

基于非线性回归模型的PM$_{2.5}$浓度预测

Jinghong Zeng

AI总结提出基于非线性回归的PM$_{2.5}$浓度预测模型，包括单值和区间预测，结合NCEP CFS2提高精度，在武汉数据上验证有效。

Comments In Chinese, supervised by Prof. Yurong Chen

2606.16985 2026-06-16 stat.ML cs.LG eess.SP nlin.CD stat.ME 新提交

Dynestyx: A Probabilistic Programming Library for Dynamical Systems

Dynestyx: 一个面向动态系统的概率编程库

Daniel Waxman, Dmitry Batenkov, John Feser, Andy Zane, Eli Bingham, Youssef Marzouk, Matthew E. Levine

AI总结提出dynestyx库，通过统一接口支持状态空间模型的先验指定、混合效应推断及状态与参数估计，实现贝叶斯动态系统分析。

Comments 7 pages

2606.16138 2026-06-16 stat.ML cs.LG 新提交

Closing the Approximation Gap in Simulation-free Latent SDEs

弥合无模拟潜在随机微分方程中的近似差距

Henry D. Smith, Brian L. Trippe, Scott W. Linderman

发表机构 * Stanford University（斯坦福大学）

AI总结针对现有无模拟变分推断算法因参数化限制导致后验推断和参数学习性能下降的问题，提出Helmholtz-SDE算法，通过优化与指定边际分布兼容的路径律来弥合近似差距，在保持高效的同时恢复更准确的动力学。

AI中文摘要

从含噪声观测中恢复动力系统是包括神经科学和物理学在内的科学领域中的反复挑战。潜在随机微分方程通过将系统建模为根据可学习SDE演化并生成观测的未观测状态来解决这一问题。变分推断为拟合潜在SDE提供了可处理的目标。传统的VI算法通过在时间离散化上进行数值模拟来评估该目标，在保真度和计算成本之间进行权衡。最近一类算法，即无模拟VI，通过其瞬时边际而不是漂移来参数化后验，从而避开了这种权衡。在这项工作中，我们表明现有无模拟VI算法的效率是有代价的：它们的参数化将近似后验限制为基于模拟的方法可用的SDE的子集，降低了后验推断和参数学习。我们提出了Helmholtz-SDE，一种无模拟VI算法，通过优化与指定边际分布集合兼容的路径律来弥合这一差距。Helmholtz-SDE比先前的无模拟方法更忠实地恢复动力学，在高后验不确定性下增益最大。它进一步以一小部分运行时间匹配基于模拟的VI的性能。

英文摘要

Recovering dynamical systems from noisy observations is a recurring challenge across scientific domains, including neuroscience and physics. Latent stochastic differential equations (SDEs) address this by modeling the system as an unobserved state that evolves according to a learnable SDE and generates the observations. Variational inference (VI) provides a tractable objective for fitting latent SDEs. Traditional VI algorithms evaluate this objective by numerical simulation over a time discretization, trading fidelity for computational cost. A recent class of algorithms, simulation-free VI, sidesteps this tradeoff by parameterizing the posterior through its instantaneous marginals rather than its drift. In this work, we show that the efficiency of existing simulation-free VI algorithms comes at a price: their parameterizations restrict the approximate posterior to a subset of the SDEs available to simulation-based methods, degrading posterior inference and parameter learning. We propose Helmholtz-SDE, a simulation-free VI algorithm that closes this gap by optimizing over path laws compatible with a prescribed collection of marginals. Helmholtz-SDE recovers dynamics more faithfully than prior simulation-free methods, with the largest gains under high posterior uncertainty. It further matches the performance of simulation-based VI at a fraction of the runtime.

2606.16073 2026-06-16 cs.LG stat.ML 新提交

Stop the Sampler! Classifier-Based Adaptive Stopping for Sampling Kernels

停止采样器！基于分类器的采样核自适应停止

Kirill Korolev, Nikita Morozov, Stepan Pavlenko, Esmeralda S. Whitammer, Sergey Samsonov

发表机构 * Stanford University（斯坦福大学）

AI总结提出将MCMC轨迹终止作为可学习组件，利用非无环生成流网络训练状态依赖分类器，在保证详细平衡条件下自适应停止采样，显著缩短轨迹长度并改善模式覆盖与混合。

Comments ICML 2026 SPIGM Workshop

AI中文摘要

从复杂、未归一化的概率密度中采样是贝叶斯推断和概率建模中的基本挑战。虽然马尔可夫链蒙特卡罗（MCMC）方法提供了渐近保证，但由于固定或手动调整的轨迹长度，它们常常遭受慢混合和高计算成本。在这项工作中，我们提出了一种新颖的框架，将轨迹终止视为采样动力学的可学习组件。通过将MCMC置于非无环生成流网络（GFlowNets）的理论中，我们训练状态依赖的神经分类器来决定轨迹何时到达高密度区域并应终止。我们通过详细平衡条件从理论上建立了最优分类器与目标密度之间的联系，并引入了一种多级训练方案以促进复杂几何中的探索。在各种基准密度上的实验结果表明，与标准MCMC基线相比，我们的方法显著减少了平均轨迹长度，同时改善了模式覆盖和混合。

英文摘要

Sampling from complex, unnormalized probability densities is a fundamental challenge in Bayesian inference and probabilistic modeling. While Markov chain Monte Carlo (MCMC) methods provide asymptotic guarantees, they often suffer from slow mixing and high computational costs due to fixed or manually tuned trajectory lengths. In this work, we propose a novel framework that treats trajectory termination as a learnable component of the sampling dynamics. By framing MCMC within the theory of non-acyclic generative flow networks (GFlowNets), we train state-dependent neural classifiers to decide when a trajectory has reached a high-density region and should terminate. We theoretically establish the connection between optimal classifiers and the target density via detailed balance conditions and introduce a multilevel training scheme to facilitate exploration in complex geometries. Experimental results across various benchmark densities demonstrate that our approach significantly reduces average trajectory lengths while improving mode coverage and mixing compared to standard MCMC baselines.

2606.15962 2026-06-16 stat.ME cs.LG 新提交

p-PSO: A Penalized Particle Swarm Optimization Technique for Finding D-Optimal Designs with Mixed Factors in Generalized Linear Models

p-PSO: 一种用于广义线性模型中混合因子D-最优设计的惩罚粒子群优化技术

Shrabanti Chowdhury, Abhyuday Mandal

发表机构 * Icahn School of Medicine at Mount Sinai（伊坎医学院）； University of Georgia（佐治亚大学）

AI总结提出一种新的惩罚粒子群优化方法p-PSO，通过通用惩罚公式解决广义线性模型中混合因子D-最优设计问题，高效且可直接使用现成PSO算法。

AI中文摘要

寻找广义线性模型（GLMs）的D-最优设计具有挑战性，因为Fisher信息矩阵依赖于未知参数且缺乏闭式解，尤其当输入因子包含离散和连续变量时。尽管经典算法和最近的元启发式方法提供了部分解决方案，但仍需要稳健且计算高效的方法。本文提出了一种惩罚粒子群优化（PSO）方法，称为$p$-PSO。我们引入了一种新的、通用的约束优化惩罚公式，并展示了其在最优设计问题中的有效性。该公式与算法无关，适用于一大类黑箱优化方法。结果表明，该方法非常高效，其主要贡献在于提出了一种惩罚公式，使得可以直接使用现成的PSO算法，并自然地扩展到更一般的约束优化任务。

英文摘要

Finding D-optimal designs for generalized linear models (GLMs) is challenging due to the dependence of the Fisher information matrix on unknown parameters and the lack of closed-form solutions, particularly when input factors include both discrete and continuous variables. Although classical algorithms and recent metaheuristic approaches have offered partial solutions, there remains a need for robust and computationally efficient methods. In this paper, we propose a penalized Particle Swarm Optimization (PSO) approach, named $p$-PSO. Here we introduce a new, general-purpose penalty formulation for constrained optimization and demonstrate its effectiveness in optimal design problems. The formulation is algorithm-agnostic and applicable to a broad class of black-box optimization methods. Results show that the method is highly efficient, with its primary contribution being a penalty formulation that enables the direct use of an off-the-shelf PSO algorithm and extends naturally to more general constrained optimization tasks.

2606.15871 2026-06-16 stat.CO cs.LG stat.ML 新提交

Amortized mean-shift interacting particles

摊销均值漂移交互粒子

Ali Siahkoohi

发表机构 * Department of Computer Science, University of Central Florida（中佛罗里达大学计算机科学系）

AI总结提出摊销均值漂移交互粒子方法，通过学习映射从观测和少量后验样本直接输出加权节点，无需评估密度或得分，实现比同等数量蒙特卡洛样本更精确的积分估计。

AI中文摘要

逆问题的贝叶斯推断用于评估积分——后验期望、尾部概率和风险——跨观测流。标准估计通过对后验样本的积分求平均，其误差仅随样本量的平方根衰减，因此精度需要大量样本——当每个样本调用偏微分方程正演模型时，这是禁止的。均值漂移交互粒子需要的样本少得多：它们返回一小组带符号权重的节点——一种确定性求积，其加权平均值估计这些积分。然而，寻找节点是一个每次观测的优化，在其最精确的形式中，每一步都读取后验得分——返回它本意要节省的成本。我们引入了摊销均值漂移交互粒子，一种学习映射，在单次前向传递中从观测和几个后验样本输出加权节点。训练仅需要联合参数-观测样本和一个可供抽样的后验——条件归一化流、经验条件或用户能抽样的任何参考——映射仅从样本学习积分该后验，既不评估其密度也不评估其得分。一旦训练完成，它泛化到未见过的观测和任意节点预算的积分，并以两种方式改进独立样本：通过重新加权，证明不劣于蒙特卡洛的等权重；通过移动它们，经验上进一步降低误差。在闭式、抽样、学习和基于物理的后验中——直到一千个系数的地下水场——它在每个预算下比相同数量的样本更准确地积分，并且后验白化、维度感知核消除了高维障碍。结果是蒙特卡洛积分的帕累托改进，而非与抽取更多样本竞争。

英文摘要

Bayesian inference for inverse problems is run to evaluate integrals -- posterior expectations, tail probabilities, and risks -- across a stream of observations. The standard estimate averages the integrand over posterior samples, a Monte-Carlo average whose error decays only as the square root of the sample size, so accuracy demands many samples -- prohibitive when each one calls a partial-differential-equation forward model. Mean-shift interacting particles need far fewer: they return a small set of signed-weight nodes -- a deterministic quadrature whose weighted averages estimate those integrals. Finding the nodes, however, is a per-observation optimization that, in its most accurate form, reads the posterior score at every step -- returning the cost it meant to save. We introduce amortized mean-shift interacting particles, a learned map that emits the weighted nodes from an observation and a few posterior samples in a single forward pass. Training asks only for joint parameter-observation samples and a posterior to draw from -- a conditional normalizing flow, an empirical conditional, or any reference the user can sample -- and the map learns to integrate that posterior from samples alone, evaluating neither its density nor its score. Once trained, it generalizes to unseen observations and integrands at any node budget and improves on independent samples in two ways: by reweighting them, provably no worse than the equal weights of Monte-Carlo; and by moving them, which empirically lowers it further. Across closed-form, sampled, learned, and physics-based posteriors -- up to a thousand-coefficient groundwater field -- it integrates more accurately than the same number of samples at every budget, and a posterior-whitened, dimension-aware kernel removes the high-dimensional wall. The result is a Pareto improvement on Monte-Carlo integration, not a competitor to drawing more samples.

2606.15793 2026-06-16 cs.LG cs.AI stat.ML 新提交

Proximal Policy Optimization for Amortized Discrete Sampling

用于摊销离散采样的近端策略优化

Anna Zykova-Myzina, Timofei Gritsaev, Daniil Tiapkin, Nikita Morozov

发表机构 * HSE University（高等经济学院）； Constructor University（康斯特大学）； CMAP, CNRS, École polytechnique, IPP（CMAP，CNRS，巴黎综合理工学院，IPP）

AI总结本文在生成流网络框架下，推导了策略梯度算法并首次应用近端策略优化，提升了离散概率分布采样的收敛速度和数据效率。

2606.15725 2026-06-16 stat.CO stat.ML 新提交

Score-Based Martingale Posteriors for Deep Neural Networks

基于得分的鞅后验分布用于深度神经网络

Abylay Zhumekenov, Ajay Jasra, Mohamed Maama, Raul Tempone

AI总结研究将基于得分的鞅后验分布（SMP）应用于大规模机器学习，通过随机梯度上升递归构建参数鞅序列，实现快速不确定性量化，并与蒙特卡洛方法对比。

Comments 20 pages, 7 figures, 6 tables, appendix

2606.15679 2026-06-16 stat.ML cs.LG cs.NA math.NA 新提交

Stochastic trace estimation with tensor train random vectors

基于张量列随机向量的随机迹估计

Zvonimir Bujanović, Daniel Kressner, Hrvoje Olić

发表机构 * University of Zagreb, Faculty of Science, Department of Mathematics（Zagreb大学科学学院数学系）； Institute of Mathematics, EPFL（EPFL数学研究所）

AI总结研究使用高斯随机张量列向量进行随机迹估计，证明适当秩下可恢复维度无关保证，并应用于Nyström++框架。

AI中文摘要

随机迹估计是一种标准工具，用于近似仅通过矩阵-向量乘积可获得的大规模矩阵的迹。然而，在张量结构设置中，非结构化的高斯或Rademacher测试向量在存储和计算上可能过于昂贵，而更便宜的秩一张量积向量可能需要随张量阶数指数增长的样本复杂度。本文研究高斯随机张量列向量作为随机迹估计的结构化替代方案。我们证明，通过适当选择张量列秩，随机张量列向量可以恢复Girard-Hutchinson估计器的维度无关保证。特别地，基于张量列秩$r \geq d-1$的中位数均值变体在精度$\varepsilon$和失败概率$\delta$上实现了与基于非结构化高斯向量的经典估计器相同的依赖性。我们进一步证明了由独立高斯随机张量列向量形成的草图的一个无意识子空间注入结果：张量列秩$r\geq d-1$和$\mathcal{O}(\varepsilon^{-2}(k+\log(1/δ)))$个样本足以用于$k$维目标子空间。最后，我们研究了此类草图在Nyström++框架中的应用。我们证明，在额外的谱尾条件下，所得估计器可以实现所需的$\mathcal{O}(\varepsilon^{-1})$样本复杂度。这些结果阐明了随机张量列向量在随机迹估计中的潜力和局限性。

英文摘要

Stochastic trace estimation is a standard tool for approximating the trace of a large-scale matrix available only through matrix-vector products. However, in tensor-structured settings, unstructured Gaussian or Rademacher test vectors may be prohibitively expensive to store and compute with, while cheaper rank-one tensor-product vectors can require sample complexities that grow exponentially with the tensor order. This work studies Gaussian random tensor train vectors as a structured alternative for stochastic trace estimation. We show that, with a suitable choice of the tensor train rank, random tensor train vectors recover dimension-independent guarantees for the Girard--Hutchinson estimator. In particular, a median-of-means variant with tensor train rank $r \geq d-1$ achieves the same dependence on the accuracy $\varepsilon$ and failure probability $δ$ as the classical estimator based on unstructured Gaussian vectors. We further prove an oblivious subspace injection result for sketches formed from independent Gaussian random tensor train vectors: tensor train rank $r\geq d-1$ and $\mathcal{O}(\varepsilon^{-2}(k+\log(1/δ)))$ samples suffice for a $k$-dimensional target subspace. Finally, we investigate the use of such sketches within the Nyström++ framework. We show that the resulting estimator can achieve the desired $\mathcal{O}(\varepsilon^{-1})$ sample complexity under an additional spectral-tail condition. These results clarify both the potential and the limitations of random tensor train vectors in stochastic trace estimation.
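For orientation, the classical baseline the abstract compares against can be sketched in a few lines. This is the plain Girard-Hutchinson estimator with unstructured Gaussian test vectors, not the tensor train variant studied in the paper; the 4x4 test matrix and sample count are arbitrary choices for illustration.

```python
import random


def hutchinson_trace(matvec, n, num_samples, rng):
    """Estimate trace(A) as the average of z^T (A z) over Gaussian vectors z,
    using only matrix-vector products with A."""
    total = 0.0
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        Az = matvec(z)
        total += sum(zi * yi for zi, yi in zip(z, Az))
    return total / num_samples


# Small symmetric test matrix: diagonal 1..4, all off-diagonal entries 0.5.
n = 4
A = [[float(i + 1) if i == j else 0.5 for j in range(n)] for i in range(n)]
true_trace = sum(A[i][i] for i in range(n))


def matvec(v):
    """Matrix-vector product; the estimator never inspects A directly."""
    return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]


estimate = hutchinson_trace(matvec, n, num_samples=5000, rng=random.Random(0))
print(f"true trace = {true_trace}, estimate = {estimate:.3f}")
```

For Gaussian test vectors and symmetric $A$, each sample has variance $2\|A\|_F^2$, so the error of the average shrinks like the inverse square root of the number of samples. The storage cost of each dense `z` is what becomes prohibitive in tensor-structured settings.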

2606.15458 2026-06-16 stat.ML cs.LG 新提交

变分维度提升用于非线性随机动力学的鲁棒跟踪

Yonatan L. Ashenafi

AI总结提出一种变分维度提升框架，将非线性状态空间模型转化为高维线性随机表示，从而利用高效线性滤波技术跟踪非线性随机动力学，并通过三个模型验证其准确性和鲁棒性。

AI中文摘要

非线性随机运动对贝叶斯粒子跟踪提出了重大挑战。为了解决这一挑战，我们提出了一种提升框架，该框架构建了非线性状态空间模型的高维线性随机表示。所得的替代模型能够使用计算高效的线性滤波技术，同时保持与底层非线性动力学的直接联系。本文利用伊藤引理和变分法推导了此类变换的必要条件，并在双稳态三次运动模型、径向布朗运动模型和具有乘性噪声的逻辑模型上展示了该方法。模拟结果证实，变换后的线性系统在投影回原空间时，能够准确重建非线性动力学，并且在刚性和奇异性的不同区域中，跟踪精度与传统滤波器相当，同时避免了它们的结构不稳定性。

英文摘要

Nonlinear stochastic motion presents significant challenges for Bayesian particle tracking. To address this challenge, we propose a lifting framework that constructs a higher-dimensional linear stochastic representation of nonlinear state-space models. The resulting surrogates enable the use of computationally efficient linear filtering techniques while retaining a direct connection to the underlying nonlinear dynamics. The paper derives the necessary conditions for such transformations using Ito's lemma and variational calculus, and illustrates the method on a bistable cubic motion model, radial Brownian process model, and a logistic model with multiplicative noise. Simulations confirm that the transformed linear systems, when projected back, accurately reconstruct the nonlinear dynamics and, in distinct regimes of stiffness and singularity, yield tracking accuracy competitive with conventional filters, while avoiding their structural instabilities.

2603.21075 2026-06-16 stat.CO 版本更新

Neural Inference Functions for Margins for Time Series Copula Models

时间序列Copula模型的边际神经推断函数

Daniel Fynn, David Gunawan, Andrew Zammit-Mangion

AI总结提出基于神经网络的N-IFM方法，用于高效估计多元时间序列Copula模型参数，在保持推断精度的同时大幅降低计算成本。

Comments 86 pages, 29 figures

2502.07396 2026-06-16 stat.CO cs.CE stat.ML 版本更新

Optimality in importance sampling: a gentle survey

重要性采样中的最优性：一个温和的综述

Fernando Llorente, Luca Martino

AI总结综述重要性采样中提议密度的最优性概念，涵盖边际似然近似、多提议密度、退火后验序列及噪声场景（如ABC和强化学习）等框架，并提供理论与实证比较。

2512.20566 2026-06-16 math.OC stat.ML 版本更新

Random Gradient-Free Optimization in Infinite Dimensional Spaces

无限维空间中的随机无梯度优化

Caio Peixoto, Daniel Csillag, Bernardo F. P. da Costa, Yuri F. Saporito

AI总结提出一种仅需方向导数的无限维希尔伯特空间无梯度优化方法，通过预基和随机方向导数实现可证明收敛，并应用于物理信息神经网络求解偏微分方程。

Comments 23 pages, 4 figures

AI中文摘要

我们提出了一种新的无梯度方法，用于希尔伯特空间中的无限维优化，该方法仅需计算方向导数。尽管函数优化通常通过有限维梯度下降（例如神经网络）在参数化上求解，但我们转而利用优化问题的函数性质来获得可证明的保证。然而，无限维梯度在实践中往往难以计算，使得朴素的函数梯度下降难以处理。为克服这一限制，我们的框架仅利用方向导数和希尔伯特空间的预基（即一个线性无关集，其张成空间稠密）。这解决了可处理性问题，因为预基比完全正交基或再生核（甚至可能不存在）更容易获得，且单个方向导数可通过自动微分计算。我们展示了该方法在物理信息神经网络（PINNs）求解偏微分方程中的应用，有效实现了可证明的收敛。

英文摘要

We propose a new gradient-free method for infinite-dimensional optimization in Hilbert spaces that requires only the computation of directional derivatives. Though functional optimization is often solved through finite-dimensional gradient descent over a parametrization, such as neural networks, we instead propose to leverage the functional nature of the optimization problem to enable provable guarantees. However, infinite-dimensional gradients are often hard to compute in practice, rendering naïve functional gradient descent intractable. To overcome this limitation, our framework leverages only directional derivatives and a pre-basis for the Hilbert space, i.e., a linearly independent set whose span is dense. This resolves the tractability issue, as pre-bases are much more accessible than full orthonormal bases or reproducing kernels -- which may not even exist -- and individual directional derivatives can be computed using automatic differentiation. We showcase the use of our method to solve partial differential equations à la physics-informed neural networks (PINNs), where it effectively enables provable convergence.

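2506.08328 2026-06-16 stat.ME updated version

Diffusion Non-Additive Model for Multi-Fidelity Simulations with Tunable Precision

Junoh Heo, Romain Boutelet, Wenjia Wang, Chih-Li Sung

AI summary: Proposes the Diffusion Non-Additive (DNA) model, which uses Gaussian processes to capture nonlinear dependencies and extrapolate to the exact solution, improving accuracy and providing uncertainty quantification for multi-fidelity simulations.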

Comments 35 pages including references and 27 pages supplementary

Abstract

Computer simulations are indispensable for analyzing complex systems, yet high-fidelity models often incur prohibitive computational costs. Multi-fidelity frameworks address this challenge by combining inexpensive low-fidelity simulations with costly high-fidelity simulations to improve both accuracy and efficiency. However, certain scientific problems demand even more accurate results than the highest-fidelity simulations available, particularly when a tuning parameter controlling simulation accuracy is available, but the exact solution corresponding to a zero-valued parameter remains out of reach. In this paper, we introduce the Diffusion Non-Additive (DNA) model, inspired by generative diffusion models, which captures nonlinear dependencies across fidelity levels using Gaussian process priors and extrapolates to the exact solution. The DNA model: (i) accommodates complex, non-additive relationships across fidelity levels; (ii) employs a nonseparable covariance kernel to model interactions between the tuning parameter and input variables, improving predictive performance; (iii) provides closed-form expressions for the posterior predictive mean and variance, allowing efficient inference and uncertainty quantification; and (iv) establishes rigorous theoretical bounds on the prediction error, leading to an optimal experimental design strategy. The methodology is validated on a suite of numerical studies and real-world case studies. An R package implementing the proposed methodology is available to support practical applications.

2509.03945 2026-06-16 stat.CO cs.DC cs.NA math.NA stat.ML updated version

Prob-GParareal: A Probabilistic Numerical Parallel-in-Time Solver for Differential Equations

Guglielmo Gattiglio, Lyudmila Grigoryeva, Massimiliano Tamborrino

AI summary: Proposes Prob-GParareal, which models the Parareal correction function with Gaussian processes to provide uncertainty quantification for the parallel-in-time solution of differential equations, and validates its accuracy and robustness on five benchmark ODE systems.

Abstract

We introduce Prob-GParareal, a probabilistic extension of the GParareal algorithm designed to provide uncertainty quantification for the Parallel-in-Time (PinT) solution of (ordinary and partial) differential equations (ODEs, PDEs). The method employs Gaussian processes (GPs) to model the Parareal correction function, in line with GParareal, further enabling the propagation of numerical uncertainty across time and yielding probabilistic forecasts of the system's evolution. Furthermore, Prob-GParareal accommodates probabilistic initial conditions and maintains compatibility with classical numerical solvers, ensuring its straightforward integration into existing Parareal frameworks. Here, we first conduct a theoretical analysis of the computational complexity and derive error bounds of Prob-GParareal. Then, we numerically demonstrate the accuracy and robustness of the proposed algorithm on five benchmark ODE systems, including chaotic, stiff, and bifurcation problems. To showcase the flexibility and potential scalability of the proposed algorithm, we also consider Prob-nnGParareal, a variant obtained by replacing the GPs in Parareal with the nearest-neighbors GPs, illustrating its increased performance on an additional PDE example. This work bridges a critical gap in the development of probabilistic counterparts to established PinT methods.
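
For readers unfamiliar with the underlying scheme, the sketch below shows the classic deterministic Parareal iteration that Prob-GParareal builds on. It does not include the Gaussian-process model of the correction term or any uncertainty propagation. The test equation, the explicit Euler solvers, and all names and step counts are assumptions chosen for illustration.

```python
def euler(y, t0, t1, steps, rhs):
    """Integrate y' = rhs(t, y) from t0 to t1 with explicit Euler."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + dt * rhs(t, y)
        t += dt
    return y


def parareal(y0, t_end, windows, iterations, rhs, coarse_steps=1, fine_steps=100):
    """Classic Parareal iteration on `windows` time slices of [0, t_end]."""
    times = [t_end * n / windows for n in range(windows + 1)]

    def coarse(y, n):
        return euler(y, times[n], times[n + 1], coarse_steps, rhs)

    def fine(y, n):
        return euler(y, times[n], times[n + 1], fine_steps, rhs)

    # Iteration 0: a cheap serial sweep with the coarse solver.
    u = [y0]
    for n in range(windows):
        u.append(coarse(u[n], n))

    for _ in range(iterations):
        # These evaluations are independent across windows and could run in parallel.
        fine_old = [fine(u[n], n) for n in range(windows)]
        coarse_old = [coarse(u[n], n) for n in range(windows)]
        # Serial sweep: predict with the coarse solver, correct with (fine - coarse).
        new_u = [y0]
        for n in range(windows):
            new_u.append(coarse(new_u[n], n) + fine_old[n] - coarse_old[n])
        u = new_u
    return u


def decay(t, y):
    """Hypothetical test problem: y' = -y."""
    return -y


WINDOWS = 8
approx = parareal(1.0, 2.0, WINDOWS, iterations=3, rhs=decay)

# Serial fine-solver reference for comparison.
reference = [1.0]
for n in range(WINDOWS):
    reference.append(
        euler(reference[n], 2.0 * n / WINDOWS, 2.0 * (n + 1) / WINDOWS, 100, decay)
    )

print(max(abs(a - r) for a, r in zip(approx, reference)))
```

The term `fine_old[n] - coarse_old[n]` is the correction function; per the abstract, GParareal and Prob-GParareal model that quantity with Gaussian processes.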

2606.17048 2026-06-16 cs.LG cs.CV stat.ML new submission

Exact Posterior Score Estimation for Solving Linear Inverse Problems

Abbas Mammadov, Ozgur Kara, Kaan Oktay, Iskander Azangulov, Adil Kaan Akan, Hyungjin Chung, James Matthew Rehg, Yee Whye Teh

Affiliations: University of Oxford; UIUC; EverEx

AI summary: Proposes the Exact Posterior Score (EPS) method, which uses a closed-form posterior score to turn linear inverse problems into denoising problems without gradients or projections, and outperforms existing methods on FFHQ and ImageNet.

Abstract

Diffusion and flow-based models learn powerful data priors by training a denoiser to reverse Gaussian corruption. To use this prior to solve a linear inverse problem, one needs to sample from the posterior, but the score that the prior provides is the unconditional score, not the posterior score. Existing methods either steer a fixed pretrained denoiser with approximate measurement-matching corrections, or train a conditional restoration model that abandons the denoising structure of the prior. We derive the exact posterior score in closed form for linear Gaussian inverse problems under general Gaussian interpolants, and show that posterior sampling reduces to a denoising problem at an operator-dependent shifted pivot under an anisotropic noise covariance. We turn this identity into Exact Posterior Score (EPS), a denoising training objective that preserves the input/output structure of standard pretraining and can therefore be trained from scratch or fine-tuned from a pretrained denoiser. At inference, EPS uses the same sampler as the underlying backbone, with no likelihood gradients or projections. We evaluate EPS on five linear inverse problems across FFHQ and ImageNet, where it outperforms training-free and training-based baselines on fidelity, perceptual, and distributional metrics, while using roughly an order of magnitude fewer denoiser evaluations than gradient-based posterior samplers.

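2606.16975 2026-06-16 stat.ML cs.LG new submission

Sobolev Approximation by Fixed-Size Neural Networks with Arbitrary Accuracy

Baicheng Li, Haizhao Yang, Shijun Zhang

AI summary: Proposes new activation functions ($\mathrm{EUAF}$, $\mathrm{DUAF}_{\infty}$, and others) that allow fixed-size neural networks to approximate functions in Sobolev spaces with arbitrary accuracy, with explicit width and depth bounds.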

Abstract

In this work, we investigate new activation functions for achieving arbitrary-accuracy Sobolev approximation by fixed-size neural networks. We first show that any function in $W^{2,\infty}((a,b)^d)$ can be approximated with arbitrary accuracy, measured in the $W^{1,\infty}$-norm, by a fixed-size neural network using the Elementary Universal Activation Function ($\mathrm{EUAF}$). To extend this result to $W^{s,\infty}((a,b)^d)$ for $s\in\mathbb{N}$, we introduce a smooth activation $\mathrm{DUAF}_{\infty}$ from the family of Differentiable Universal Activation Functions ($\mathrm{DUAF}_n$). We prove that any function in $W^{s,\infty}((a,b)^d)$ can be approximated with arbitrary accuracy in the $W^{s-1,\infty}$-norm by a fixed-size $\mathrm{DUAF}_{\infty}$-activated network. We further construct sigmoidal variants $\widetilde{\mathrm{DUAF}}_n$ and show that, for every $1\leq s\leq n$, fixed-size $\widetilde{\mathrm{DUAF}}_n$-activated networks still approximate any $f\in W^{s,\infty}((a,b)^d)$ with arbitrary accuracy in the $W^{s-1,\infty}$-norm. In all these results, the width and depth bounds are computed explicitly, and the proposed activations are elementary.

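2606.16926 2026-06-16 math.OC cs.LG stat.ML new submission

Functional Gradient Descent with Adaptive Representations

Daniel Csillag, Rodrigo Schuller, Pedro Dall'Antonia, Leonidas Guibas, Luiz Velho, Tiago Novello

AI summary: Proposes a functional gradient descent algorithm with adaptive representations that incorporates the approximation error into the analysis, converging to a stationary point for smooth losses and to a global minimizer under a PL condition, and outperforming fixed-approximation FGD and neural network baselines on regression, PDE solving, and computer vision.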

Abstract

Functional optimization problems are typically solved by optimizing the parameters of a fixed representation, such as a neural network, resulting in highly nonconvex losses that complicate both training and theoretical analysis. An interesting alternative is functional gradient descent (FGD), that is, gradient descent directly in function space, which benefits from strong convergence results and admits a clean theory. However, FGD is difficult to implement in practice because functional gradients are infinite-dimensional, and thus cannot be fully computed nor stored in memory. Existing implementations therefore rely on fixed approximations, which introduce approximation error. We propose a new, theoretically-grounded FGD algorithm that adapts the representation of the functional gradients over the course of optimization. By explicitly incorporating this approximation into the analysis, we establish convergence to a stationary point (for smooth losses) and to a global minimizer (under smoothness + a Polyak-Lojasiewicz-type condition) regardless of our approximations. To the best of our knowledge, this is the first implementable FGD method with such guarantees in a general setting. We demonstrate the effectiveness of our method on regression, numerical solution of PDEs, and modern computer vision. Across settings, our method consistently outperforms both FGD with fixed approximations and neural network baselines in efficiency and accuracy.

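2606.16730 2026-06-16 stat.ML cs.AI cs.LG new submission

Attention is Just Another Name for Coupling?: A Fast-Slow ODE Perspective on Hierarchical Pretraining

Zhengyuan Gao

AI summary: Proposes a fast-slow ODE perspective that treats causal self-attention as a coupling mechanism and introduces a slow sub-system fed back into the fast path through a zero-initialised gate; theory and experiments relate it to the stationary distribution of the master equation.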

Abstract

Causal self-attention is a coupling mechanism: each token's hidden state is updated by a learned mixture of preceding tokens at the same timescale. This paper asks whether a second, temporally slower coupling (a slow sub-system operating on a temporally downsampled view of the sequence and fed back into the fast path through a zero-initialised gate) complements it. The question is framed in the language of singularly perturbed ordinary differential equations (ODEs), where the fast variable $x$ evolves at the token rate, the slow variable $y$ evolves at one update per $P$ tokens, and the timescale ratio $\varepsilon = 1/P$ is enforced structurally by causal block-mean pooling. The paper instantiates the fast-slow ODE formalism as a concrete neural network: a fast path of standard causal attention over $T$ tokens, a slow path of full attention over $T/P$ pooled tokens ($P^2 \times$ cheaper per layer), and a zero-initialised additive gate. In addition, under a linear-generator assumption on the fast dynamics, we prove that the equilibrium manifold $x = \phi(y)$ is exactly the master-equation (ME) stationary distribution $p_{\mathrm{st}}(y)$; in that regime a learned MLP $\phi_\theta(y)$ is a variational approximation of it (the trained block is not a generator, so this identity is the structured limit, not a claim about the network as trained). Empirically, at $500$k tokens the coupling is neutral -- the gate stays closed and the coupled and frozen ablations are within run-to-run noise -- at a wall-clock cost comparable to a dense baseline. The contribution is the precise, gap-marked mapping itself, not a performance gain.
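
The three structural ingredients named in the abstract (block-mean pooling by a factor $P$, causality of the slow signal, and a zero-initialised additive gate) can be sketched on a toy scalar sequence. This is only an illustration under assumptions: the paper's actual pooling, attention layers, and gate parameterization are not specified here, and all function names below are hypothetical.

```python
def block_mean_pool(tokens, block):
    """Average each complete block of `block` consecutive tokens into one slow token."""
    usable = len(tokens) - len(tokens) % block
    return [sum(tokens[i:i + block]) / block for i in range(0, usable, block)]


def causal_slow_context(tokens, block):
    """For each fast position, return the latest slow token built only from
    blocks that are already complete at that position (None if there is none)."""
    slow = block_mean_pool(tokens, block)
    context = []
    for t in range(len(tokens)):
        completed = (t + 1) // block  # number of full blocks ending at or before t
        context.append(slow[completed - 1] if completed > 0 else None)
    return context


def gated_update(fast, slow_context, gate=0.0):
    """Additive feedback from the slow path; gate=0.0 leaves the fast path unchanged."""
    return [x if s is None else x + gate * s for x, s in zip(fast, slow_context)]


fast_tokens = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]
P = 4
slow_tokens = block_mean_pool(fast_tokens, P)
context = causal_slow_context(fast_tokens, P)
at_init = gated_update(fast_tokens, context, gate=0.0)

# Pairwise attention cost scales with the square of the sequence length.
cost_ratio = len(fast_tokens) ** 2 / len(slow_tokens) ** 2
print(slow_tokens, context, at_init == fast_tokens, cost_ratio)
```

Because the gate starts at zero, the combined model initially reproduces the fast path exactly, and the slow path sees $T/P$ tokens, which is where the $P^2$ per-layer saving quoted in the abstract comes from.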

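2606.16610 2026-06-16 stat.ML cs.LG new submission

Diffusion Flow Matching: Dimension-Improved KL Bounds and Wasserstein Guarantees

Marta Gentiloni Silveri, Giovanni Conforti, Alain Durmus

AI summary: For diffusion flow matching based on Brownian motion, derives improved convergence bounds on the discretization error in KL divergence and 2-Wasserstein distance, achieving optimal scaling in the dimension dependence.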

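2606.16301 2026-06-16 cs.LG stat.ML new submission

One-Step Generalization Ratio Guided Optimization for Domain Generalization

Sumin Cho, Dongwon Kim, Kwangsu Kim

Affiliations: Korea Advanced Institute of Science and Technology (KAIST)

AI summary: Proposes the GENIE optimizer, which uses the One-Step Generalization Ratio (OSGR) to dynamically equalize parameter updates, suppress spurious correlations, and promote domain-invariant feature learning, outperforming existing optimizers on domain generalization tasks.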

Comments 29 pages, accepted at the 42nd International Conference on Machine Learning (ICML 2025)

Abstract

Domain Generalization (DG) aims to train models that generalize to unseen target domains but often overfit to domain-specific features, known as undesired correlations. Gradient-based DG methods typically guide gradients in a dominant direction but often inadvertently reinforce spurious correlations. Recent work has employed dropout to regularize overconfident parameters, but has not explicitly adjusted gradient alignment or ensured balanced parameter updates. We propose GENIE (Generalization-ENhancing Iterative Equalizer), a novel optimizer that leverages the One-Step Generalization Ratio (OSGR) to quantify each parameter's contribution to loss reduction and assess gradient alignment. By dynamically equalizing OSGR via a preconditioning factor, GENIE prevents a small subset of parameters from dominating optimization, thereby promoting domain-invariant feature learning. Theoretically, GENIE balances convergence contribution and gradient alignment among parameters, achieving higher OSGR while retaining SGD's convergence rate. Empirically, it outperforms existing optimizers and enhances performance when integrated with various DG and single-DG methods.

2606.16273 2026-06-16 stat.ML cs.LG stat.ME new submission

Generative Modeling on Metric Graphs via Neural Optimal Transport

Alessandro Micheli, Yueqi Cao, Anthea Monod, Samir Bhatt

Affiliations: Imperial College London; KTH Royal Institute of Technology; Statens Serum Institut; University of Copenhagen

AI summary: Proposes the first deep generative modeling framework for continuous distributions on metric graphs: it embeds the graph, solves an entropic Kantorovich problem via a neural semidual, and projects back onto the original graph, with a proof of convergence and experiments that improve on discrete graph OT baselines.

Abstract

We introduce, to our knowledge, the first deep generative modeling framework for probability distributions continuously supported on compact metric graphs. Given source and target measures on a metric graph, our method embeds the graph into a smooth ambient space, solves an entropic Kantorovich problem via a neural semidual parameterization, and projects generated samples back onto the original graph. We study two embedded geometries: an extrinsic Euclidean realization and the intrinsic tropical Abel--Jacobi embedding into the Jacobian torus. In both cases, the resulting generator is graph-supported by construction. We prove that, in the joint limit of increasing neural expressivity, the learned generator converges weakly to a valid transport coupling between the original graph measures. Empirically, across a range of geometrically distinct graphs, our method matches or improves upon heuristic transport baselines based on discrete graph OT, while scaling more favorably. Finally, we demonstrate scalability on real-world urban mobility data by training our model on one million Uber pickup locations in Manhattan, New York City.

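2606.15897 2026-06-16 cs.LG cs.AI stat.ML new submission

Topological Flow Matching

Kacper Wyrwal, İsmail İlkan Ceylan, Alexander Tong

AI summary: Proposes topological flow matching, which augments the reference process with a Laplacian drift to capture the topology of the underlying domain while retaining the stability and simulation-free objective of flow matching; it applies to structured data such as brain fMRI and ocean currents.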

Comments Accepted at ICLR 2026. 26 pages, 24 figures. Code: https://github.com/KacperWyrwal/topological-flow-matching

Abstract

Flow matching is a powerful generative modeling framework, valued for its simplicity and strong empirical performance. However, its standard formulation treats signals on structured spaces, such as fMRI data on brain graphs, as points in Euclidean space, overlooking the rich topological features of their domains. To address this, we introduce topological flow matching, a topology-aware generalization of flow matching. We interpret flow matching as a framework for solving a degenerate Schrödinger bridge problem and inject topological information by augmenting the reference process with a Laplacian-derived drift. This principled modification captures the structure of the underlying domain while preserving the desirable properties of flow matching: a stable, simulation-free objective and deterministic sample paths. As a result, our framework serves as a drop-in replacement for standard flow matching. We demonstrate its effectiveness on diverse structured datasets, including brain fMRIs, ocean currents, seismic events, and traffic flows.

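2606.15665 2026-06-16 stat.ML cs.LG math.ST stat.TH new submission

Information Gap and Feasibility-Aware Inference in Binomial Logistic Mixtures

Yuta Hayashida, Shonosuke Sugasawa

AI summary: Studies the information gap between mixture detection and label recovery in binomial logistic mixtures, and proposes feasibility-aware inference based on a posterior-entropy penalty that avoids misleading component selection and improves the calibration of posterior label probabilities.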

Comments 33 pages (main) + 30 pages (supplement)

Abstract

This paper studies the information gap between mixture detection and label recovery in binomial logistic mixtures. Standard likelihood-based criteria such as the Bayesian information criterion (BIC) can detect the presence of two components, but this does not guarantee that the corresponding labels are recoverable. We show that this gap is intrinsic to binomial logistic mixtures with a fixed number of trials: observed-data evidence for mixture structure and per-observation information for label recovery have different local orders in the component separation, and only the former accumulates with the sample size. As a result, there exists a detectable-but-unrecoverable regime in which BIC selects two components while the posterior labels remain essentially uninformative. To address this issue, we propose two feasibility-aware inference procedures: a recoverability-aware BIC with a posterior-entropy penalty and an entropy-regularized estimator that mitigates the tendency of the maximum likelihood estimator to produce overly separated components and overly concentrated posterior responsibilities. Numerical experiments confirm the predicted gap and demonstrate that the proposed methods avoid misleading component selections and improve the calibration of posterior label probabilities.

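2606.15569 2026-06-16 cs.LG math.ST stat.ML stat.TH new submission

A Decision-Theoretic View of Test-Time Training: When, How Far, and Which Directions to Adapt

Tomoya Wakayama

AI summary: Uses decision theory to view test-time training as implicit Bayesian inference in the kernel regime, shows how the choice of update steps and subspace affects performance, and proposes adaptive strategies, a PAC-Bayes guarantee, and a rule for selecting the optimal subspace.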

Abstract

Test-time training (TTT) adapts a pretrained model to each prompt via parameter updates, improving accuracy under pretraining-to-test distribution shifts. Yet, its performance often suffers from instability and sensitivity to hyperparameters such as update steps and subspace. We explain this behavior through a decision-theoretic lens, treating TTT as implicit Bayesian inference in the kernel regime. Under a Gaussian process benchmark, we show that TTT reduces prediction error when updates are spectrally matched to the prompt's signal-to-noise ratio and aligned with query-relevant eigen-directions. This perspective underpins the following results: (1) we show when fixed update steps and subspaces fail under distribution shifts, motivating adaptive strategies; (2) we prove that selecting update steps via prompt evidence admits a PAC-Bayes guarantee against overfitting; and (3) we characterize the Bayes-optimal update subspace under a linear-Gaussian correction model, yielding a scoring rule for selecting Transformer blocks and heads. Our theory helps explain the empirical instability of TTT, taking a step toward principled guidance for when, how far, and which directions to adapt.

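2606.15555 2026-06-16 math.OC cs.AI cs.LG stat.ML new submission

Service-Induced Congestion in Memory-Constrained LLM Serving

Ruicheng Ao, Jing Dong, Gan Luo, David Simchi-Levi

Affiliations: Institute for Data, Systems, and Society, Massachusetts Institute of Technology; Columbia Business School, Columbia University; School of Mathematical Sciences, Peking University

AI summary: Uses a discrete-time dynamical model to study service-induced congestion caused by key-value cache growth in memory-constrained LLM serving; finds that for homogeneous workloads the eviction-free equilibrium is unstable and the system converges to a worst-case limit cycle, that for heterogeneous workloads stability depends on whether decoding lengths are coprime, and derives scheduling design principles.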

Comments 101 pages

Abstract

In large language model (LLM) serving, each request accumulates persistent graphics processing unit (GPU) memory during service as its key-value cache grows with every generated token. Under high concurrency, aggregate memory usage therefore increases endogenously over time: the service process itself creates future capacity pressure. When memory capacity is exceeded, systems evict active requests, discarding cached state and restarting them later, which wastes computation and reduces throughput. We develop a discrete-time dynamical model of memory-constrained LLM inference that captures admission, memory growth, and eviction under continuous batching. In the saturated-input regime, the system admits both eviction-free fixed points and limit cycles with evictions. For homogeneous workloads, we show that the eviction-free equilibrium is unstable and that, except for a Lebesgue-measure-zero exact-capture set, the system converges to a unique worst-case limit cycle that is asymptotically stable outside this exceptional set, with throughput losses as large as 50%. For heterogeneous workloads, we prove a stability criterion in the two-class common-input setting and explain how the survival-polynomial mechanism generalizes to multiple classes and heterogeneous-input lengths. Under an input-dominated scaling regime, coprime decoding lengths stabilize the eviction-free equilibrium, while non-coprime lengths create synchronized modes that drive instability. These results characterize when workload heterogeneity desynchronizes completions and helps stabilize memory-constrained serving. More broadly, we identify service-induced congestion as a structural instability mechanism and derive scheduling design principles for sustaining high throughput.

2606.15482 2026-06-16 stat.ML cs.LG new submission

Ricci-Filtration: Boosting Retrieval-Augmented Generation Reranker to Query-Answer Tasks by Discrete Ricci Flow

Tian Qin, Wei-Min Huang

AI summary: Proposes Ricci-Filtration, a geometric reranker enhancement based on discrete curvature and Ricci flow that models the query and retrieved chunks as a network and uses curvature to filter out noisy chunks, improving RAG generation performance.

Abstract

Ricci flow is a curvature-guided diffusion process that deforms space by shrinking regions of high positive curvature and expanding those with negative curvature. Similarly, discrete Ricci flow on weighted graphs modifies edge weights by shrinking edges with positive Ricci curvature and stretching those with negative Ricci curvature, effectively increasing the separation between clusters. Inspired by these two cornerstone works, we propose a geometry-based RAG reranker enhancement procedure called Ricci-Filtration. By modeling the input query and initial retrieved chunks as a network, where the input query and chunks serve as nodes and embedding-based pairwise relations define an initial graph, Ricci-Filtration leverages discrete curvature and Ricci flow to evaluate the structural importance of each chunk with respect to the user query. The system first filters the initial chunks based on their geometric curvature relative to the query; then, a reranker processes the remaining chunks to enhance generative performance. We theoretically prove that normalized discrete Ricci flow can detect community structures by identifying distinct asymptotic behaviors in edge weights. This supports the removal of "noisy" document chunks characterized by large weights and negative Ricci curvature relative to the query node. Extensive experiments confirm that Ricci-Filtration outperforms several baseline reranking methods in accuracy, precision, recall, and F1 scores. Furthermore, ablation studies demonstrate that Ricci-Filtration generally outperforms the baseline under various settings, highlighting the framework's robustness across different architectures.

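2606.15219 2026-06-16 cs.LG cs.DS math.ST stat.ML stat.TH new submission

Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

Siyu Chen, Beining Wu, Miao Lu, Zhuoran Yang, Tianhao Wang

AI summary: Proposes a unified gradient-based algorithm that trains a two-layer neural network to learn Gaussian single-index models in polynomial time, with sample complexity matching the SQ lower bound, and extends it to the sparse setting.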

Comments 96 pages, 4 figures

Abstract

In this work, we tackle the following question: Can neural networks trained with gradient-based methods achieve the optimal computational-statistical tradeoff in learning Gaussian single-index models? Prior research has shown that any polynomial-time algorithm under the statistical query (SQ) framework requires $\Omega(d^{s^\star/2}\lor d)$ samples, where $s^\star$ is the generative exponent representing the intrinsic difficulty of learning the underlying model. However, it remains unknown whether neural networks can achieve this sample complexity. Inspired by prior techniques such as label transformation and landscape smoothing for learning single-index models, we propose a unified gradient-based algorithm for training a two-layer neural network in polynomial time. Our method is adaptable to a variety of loss and activation functions, covering a broad class of existing approaches. We show that our algorithm learns a feature representation that strongly aligns with the unknown signal $\theta^\star$, with sample complexity $\widetilde{O}(d^{s^\star/2} \lor d)$, matching the SQ lower bound up to a polylogarithmic factor for all generative exponents $s^\star\geq 1$. Furthermore, we extend our approach to the setting where $\theta^\star$ is $k$-sparse for $k = o(\sqrt{d})$ by introducing a novel weight perturbation technique that leverages the sparsity structure. We derive a corresponding SQ lower bound of order $\widetilde{\Omega}(k^{s^\star})$, matched by our method up to a polylogarithmic factor. Our framework, especially the weight perturbation technique, is of independent interest, and suggests potential gradient-based solutions to other problems such as sparse tensor PCA.

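2606.15217 2026-06-16 stat.ML cs.LG new submission

Conformal Candidate Certification for Offline Model-Based Optimization

Seungjin Choi

AI summary: Proposes Conformal Candidate Certification (CCC), which uses weighted conformal prediction to attach a calibrated one-sided lower bound to each candidate design in offline model-based optimization and certifies only candidates whose bound exceeds the target threshold, addressing statistical reliability under distribution shift.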

Comments ICML 2026 Workshop on Decision-Making from Offline Datasets to Online Adaptation: Black-Box Optimization to Reinforcement Learning

Abstract

Offline model-based optimization (MBO) proposes candidates by optimizing a surrogate trained on a fixed historical dataset. Because candidates are deliberately out-of-distribution, surrogate rankings are least reliable exactly where the optimizer is most aggressive, yet existing methods provide no per-candidate statistical certificate that a design meets a target threshold. We propose \emph{Conformal Candidate Certification} (CCC), a post-hoc wrapper that attaches a calibrated one-sided lower bound to each candidate and advances only those whose bound exceeds the target. We show that entropy-regularized surrogate maximization induces a Gibbs-tilted proposal, so the same surrogate supplies importance weights for weighted conformal prediction without a separate density-ratio estimation step. In a controlled synthetic study, CCC certifies $16.7\%$ of an aggressive proposal pool with empirical coverage 0.990 at nominal 0.90, while standard conformal prediction ignoring the covariate shift collapses to 0.416 coverage.
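
The certification rule can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the Gibbs tilt gives importance weights proportional to $\exp(f(x)/τ)$ for a surrogate $f$ and a hypothetical temperature $τ$, uses the residual $f(x) - y$ as the conformity score, and applies the standard weighted conformal quantile with the test point's mass placed at $+\infty$. All function and parameter names are invented for the example.

```python
import math


def gibbs_weight(surrogate_value, tau):
    """Unnormalized importance weight exp(f(x) / tau); tau is an assumed temperature."""
    return math.exp(surrogate_value / tau)


def certified_lower_bound(cal_preds, cal_targets, cal_weights,
                          test_pred, test_weight, alpha):
    """One-sided weighted conformal lower bound on the test outcome.

    Scores are f(x_i) - y_i, i.e. how much the surrogate overestimates.
    The bound is test_pred minus the weighted (1 - alpha) quantile of the
    scores, where the test point contributes its own mass at +infinity.
    """
    scores = [p - y for p, y in zip(cal_preds, cal_targets)]
    total = sum(cal_weights) + test_weight
    cumulative = 0.0
    for score, weight in sorted(zip(scores, cal_weights)):
        cumulative += weight / total
        if cumulative >= 1.0 - alpha:
            return test_pred - score
    # Calibration mass never reaches 1 - alpha: the quantile is +infinity.
    return -math.inf


def certify(lower_bound, target):
    """Advance a candidate only if its lower bound exceeds the target."""
    return lower_bound > target
```

In this sketch, a candidate far outside the calibration data receives a large weight, so the quantile can become infinite and the candidate is not certified. That is the conservative behaviour one would hope for when proposals are aggressive.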

2606.14929 2026-06-16 cs.LG cs.AI stat.ML New submission

Simultaneous Latent Budget Trees for Stratified Classification

Cristian Buoncompagni, Stefano Pellegrino, Giulia Vannucci, Raffaele Dubbioso, Roberta Siciliano

AI summary: Proposes the Simultaneous Latent Budget Trees framework, which handles stratification factors through a model-based split rule to give interpretable classification, and applies it to the analysis of gender-related differences in Amyotrophic Lateral Sclerosis.

Abstract

In the era of Explainable Artificial Intelligence, there is a renewed focus on single trees for their ease of interpretation. This paper introduces Simultaneous Latent Budget Trees, a probabilistic machine learning framework for classification trees in the presence of a stratification factor, such as a temporal, spatial, or demographic variable, acting as a control variable or potential confounder. Standard tree growth procedures are not designed to optimize a conditional split rule. A model-based split rule is proposed in which child nodes are interpreted as latent components of a simultaneous mixture model, such as the Simultaneous Latent Budget Model and its constrained versions, fitted to the parent node. Mixing parameters drive the observations to the child nodes, differently for each group, whereas latent budget parameters update the response-class profile of each level of the control variable. Parameters are estimated by least squares, adopting a neural network perspective of the model. An informative tree structure can be interactively visualized with interpretation aids on the nodes and the paths, including visual pruning and a decision tree selection procedure. Suitable measures are proposed to handle an unbalanced response class distribution. The proposed methodology is applied to investigate gender-related differences in disease progression of Amyotrophic Lateral Sclerosis. The SLBT library with the various tree-based algorithms is available in the linked GitHub repository.

2605.03289 2026-06-16 stat.ML cs.LG math.ST stat.TH Updated version

Imbalanced Classification under Capacity Constraints

Daniel Fraiman, Ricardo Fraiman

Affiliations: Departamento de Matemática y Ciencias, Universidad de San Andrés; CONICET, Argentina; PEDECIBA Matemática, Uruguay

AI summary: For minority-class detection under capacity constraints, proposes a formal classification framework, shows that the optimal classifier is equivalent to a Bayes classifier with reweighted prior probabilities, and introduces a capacity-adjusted performance metric; experiments show it outperforms classical methods and SMOTE.

Abstract

Detecting observations from a minority class under severe class imbalance is a central challenge in applications such as fraud detection, medical screening, and industrial quality control. In these settings, each positive prediction triggers a costly follow-up action (for example, an MRI scan or a transaction audit) whose execution is subject to real operational constraints. This paper proposes a formal classification framework under capacity constraints: given a user-defined bound $b$ on the proportion of observations that can be labeled as belonging to the minority class, the goal is to find the classifier that maximizes sensitivity on that class. We characterize the optimal classifier under this constraint and establish its equivalence with the classical Bayes classifier under a reweighting of the prior probabilities. We also introduce a capacity-adjusted performance metric $M$ that accounts for the effective detection rate when the capacity constraint is binding. The framework is implemented on top of standard learning methods (k-NN, SVM, random forests, and neural networks), and statistical consistency is established for each. We further show that these methods reduce to post-hoc thresholding when no hyperparameters are oriented toward the capacity-constrained objective, and introduce a capacity-aware support vector machine that exploits the constraint during training and achieves the strongest empirical performance. Experiments on the Taiwanese credit card default dataset confirm that capacity-constrained classifiers substantially outperform both classical approaches and SMOTE under high-imbalance regimes. The framework extends naturally to multiclass settings and online environments.
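
The post-hoc thresholding baseline mentioned in the abstract can be sketched as follows. This is an illustrative assumption about how a capacity bound might be enforced on top of any scoring model, not the authors' capacity-aware SVM: the observations with the highest scores are flagged until the fraction $b$ is used up. The function names and the toy data are hypothetical.

```python
import math


def capacity_constrained_labels(scores, b):
    """Flag at most floor(b * n) observations, taking the highest scores first."""
    n = len(scores)
    budget = math.floor(b * n + 1e-9)  # small tolerance guards against float error
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:budget])
    return [1 if i in flagged else 0 for i in range(n)]


def sensitivity(labels, y_true):
    """Fraction of true minority-class observations that were flagged."""
    positives = sum(y_true)
    true_positives = sum(1 for l, y in zip(labels, y_true) if l == 1 and y == 1)
    return true_positives / positives


# Toy data: ten observations, three of which belong to the minority class.
scores = [0.9, 0.1, 0.4, 0.8, 0.3, 0.2, 0.7, 0.05, 0.6, 0.15]
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
labels = capacity_constrained_labels(scores, b=0.2)
print(labels, sensitivity(labels, y_true))
```

With $b = 0.2$ and ten observations, only two can be flagged, so at most two of the three minority cases can be detected however good the scores are. A capacity-adjusted metric would need to account for this kind of ceiling.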

2505.24275 2026-06-16 cs.LG math.OC stat.ML Updated version

GradPower: Powering Gradients for Faster Language Model Pre-Training

Jinbo Wang, Mingze Wang, Jiaqi Zhang, Wei Wang, Peng Pei, Xunliang Cai, Weinan E, Lei Wu

Affiliations: University of Science and Technology of China

AI summary: Proposes GradPower, a lightweight gradient-transformation technique for accelerating language model pre-training. It applies an elementwise sign-power transformation and feeds the transformed gradient into a base optimizer without modifying the optimizer's internal logic or hyperparameters, achieving lower terminal loss across architectures, parameter scales, datasets, and learning-rate schedules.

Comments 24 pages, accepted by ICML 2026

Abstract

We propose GradPower, a lightweight gradient-transformation technique for accelerating language model pre-training. Given a gradient vector $g=(g_i)_i$, GradPower first applies the elementwise sign-power transformation: $φ_p(g)=({\rm sign}(g_i)|g_i|^p)_{i}$ for a fixed $p>0$, and then feeds the transformed gradient into a base optimizer. Notably, GradPower requires only a single-line code change and no modifications to the base optimizer's internal logic, including the hyperparameters. When applied to Adam (termed AdamPower), GradPower consistently achieves lower terminal loss across diverse architectures (LLaMA, Qwen2MoE), parameter scales (66M to 2B), datasets (C4, OpenWebText), and learning-rate schedules (cosine, warmup-stable-decay). The most pronounced gains are observed when training modern mixture-of-experts models with warmup-stable-decay schedules. GradPower also integrates seamlessly with other state-of-the-art optimizers, such as Muon, yielding further improvements. Finally, we provide theoretical analyses that reveal the underlying mechanism of GradPower and highlight the influence of gradient noise.
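
The sign-power transformation is simple enough to sketch directly. The snippet below is a minimal illustration in plain Python rather than the authors' code; the SGD step stands in for an arbitrary base optimizer, and the function names, the learning rate, and the choice $p = 0.5$ are assumptions made for the example.

```python
import math


def sign_power(grad, p):
    """Elementwise phi_p(g) = sign(g_i) * |g_i|**p for a fixed p > 0."""
    return [math.copysign(abs(g) ** p, g) for g in grad]


def sgd_step(params, grad, lr):
    """Placeholder base optimizer: plain gradient descent."""
    return [w - lr * g for w, g in zip(params, grad)]


params = [1.0, -2.0, 0.5, 3.0]
grad = [-4.0, 0.0, 0.25, 9.0]

# The only change relative to the usual loop: transform the gradient first.
params = sgd_step(params, sign_power(grad, p=0.5), lr=0.1)
```

For $p < 1$ the transform enlarges entries with magnitude below one and shrinks larger ones, while $p = 1$ leaves the gradient unchanged.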

2605.18324 2026-06-16 cs.CV cs.AI cs.GR cs.LG stat.ML Updated version

Local Kernel Projection Outlyingness: A Two-Stage Approach for Multi-Modal Outlier Detection

Akira Tamamori

Affiliations: Department of Computer Science, Aichi Institute of Technology

AI summary: Proposes the Two-Stage LKPLO framework, which combines adaptive loss functions, global kernel PCA, and local clustering to address multi-modal outlier detection, achieving state-of-the-art performance on 10 benchmark datasets.

Comments 12 pages, 5 figures; accepted by The IEICE Transactions on Information and Systems

Abstract

This paper presents Two-Stage LKPLO, a novel multi-stage outlier detection framework that overcomes the coexisting limitations of conventional projection-based methods: their reliance on a fixed statistical metric and their assumption of a single data structure. Our framework uniquely synthesizes three key concepts: (1) a generalized loss-based outlyingness measure (PLO) that replaces the fixed metric with flexible, adaptive loss functions like our proposed SVM-like loss; (2) a global kernel PCA stage to linearize non-linear data structures; and (3) a subsequent local clustering stage to handle multi-modal distributions. Comprehensive 5-fold cross-validation experiments on 10 benchmark datasets, with automated hyperparameter optimization, demonstrate that Two-Stage LKPLO achieves state-of-the-art performance. It significantly outperforms strong baselines on datasets with challenging structures where existing methods fail, most notably on multi-cluster data (Optdigits) and complex, high-dimensional data (Arrhythmia). Furthermore, an ablation study empirically confirms that the synergistic combination of both the kernelization and localization stages is indispensable for its superior performance. This work contributes a powerful new tool for a significant class of outlier detection problems and underscores the importance of hybrid, multi-stage architectures.

2510.06647 2026-06-16 stat.ML cs.LG Updated version

Q-Learning with Fine-Grained Gap-Dependent Regret

Haochen Zhang, Zhong Zheng, Lingzhou Xue

Affiliations: Department of Statistics, The Pennsylvania State University

AI summary: For tabular Markov decision processes, establishes fine-grained gap-dependent regret bounds, improving both UCB-based and non-UCB-based algorithms, and corrects design flaws in the AMB algorithm.

Abstract

We study fine-grained gap-dependent regret bounds for model-free reinforcement learning in episodic tabular Markov Decision Processes. Existing model-free algorithms achieve minimax worst-case regret, but their gap-dependent bounds remain coarse and fail to fully capture the structure of suboptimality gaps. We address this limitation by establishing fine-grained gap-dependent regret bounds for both UCB-based and non-UCB-based algorithms. In the UCB-based setting, we develop a novel analytical framework that explicitly separates the analysis of optimal and suboptimal state-action pairs, yielding the first fine-grained regret upper bound for UCB-Hoeffding (Jin et al., 2018). To highlight the generality of this framework, we introduce ULCB-Hoeffding, a new UCB-based algorithm inspired by AMB (Xu et al., 2021) but with a simplified structure, which enjoys fine-grained regret guarantees and empirically outperforms AMB. In the non-UCB-based setting, we revisit the only known algorithm, AMB, and identify two key issues in its design and analysis: improper truncation in the $Q$-updates and violation of the martingale difference condition in its concentration argument. We propose a refined version of AMB that addresses these issues, establishing the first rigorous fine-grained gap-dependent regret bound for a non-UCB-based method, with experiments demonstrating improved performance over AMB.

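2510.01175 2026-06-16 cs.LG eess.SP math.OC stat.ML Updated version

On the Benefits of Weight Normalization for Overparameterized Matrix Sensing

Yudong Wei, Liang Zhang, Bingcong Li, Niao He

Affiliations: ETH Zurich

AI summary: Proves that in overparameterized matrix sensing, weight normalization combined with Riemannian optimization achieves linear convergence, an exponential speedup over methods without normalization, and that iteration and sample complexity decrease polynomially as the degree of overparameterization grows.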

2501.19401 2026-06-16 cs.LG stat.ML Updated version

DAL: A Practical Prior-Free Black-Box Framework for Piecewise Stationary Bandits

Argyrios Gerogiannis, Yu-Han Huang, Subhonmesh Bose, Venugopal V. Veeravalli

Affiliations: Georgia Institute of Technology; University of California, Berkeley

AI summary: Proposes the Detection Augmented Learning (DAL) framework, which requires no prior knowledge of the non-stationarity and combines any optimal stationary bandit algorithm with a change detector, outperforming existing methods across a range of non-stationary scenarios.

Comments 28 pages, 12 figures

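2508.03867 2026-06-16 math.AG cs.LG stat.ML Updated version

Constraining the outputs of ReLU neural networks

Yulia Alexandr, Guido Montúfar

Affiliations: University of California, Los Angeles; Max Planck Institute for Mathematics in the Sciences

AI summary: Introduces algebraic varieties associated with ReLU networks, derives polynomial equations from rank constraints within activation regions to characterize the functions representable by the network, and studies conditions under which the varieties attain their expected dimension.

Comments 33 pages, 4 figures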

2505.06589 2026-06-16 stat.ML cs.AI math.OC Updated version

Optimal Transport for Machine Learners

Gabriel Peyré

AI summary: This book introduces optimal transport (OT) techniques from a machine learning perspective, covering core methods from Monge maps and Kantorovich duality to the Sinkhorn algorithm, and shows their use in ML tasks such as loss functions, generative models, domain adaptation, and gradient flows.

Abstract

Modern machine learning repeatedly manipulates probability measures: empirical datasets, generated samples, latent distributions, class-conditional laws, particle systems, weights of wide networks and attention patterns. Optimal transport is useful in this setting because it compares such objects by asking how mass should move. It therefore combines a statistically meaningful notion of discrepancy with a geometry of interpolation, dual certificates and variational dynamics. This makes OT a common language for losses, generative modeling, domain adaptation, robust learning, barycenters, gradient flows and mean-field descriptions of learning algorithms. This book presents the main OT techniques with these machine-learning uses in mind. It starts from finite assignment and the Monge map viewpoint, passes to Kantorovich couplings and dual potentials, and then explains the algorithmic ideas that make transport usable: linear programming, semi-discrete cells, Sinkhorn scaling and low-dimensional projections. The same objects are then reused as a geometry of measures, giving Wasserstein distances, barycenters, gradient flows, dynamic formulations and Gaussian/Bures formulas. The final chapters emphasize the variants most relevant to modern ML: divergences and adversarial losses, entropic and unbalanced relaxations, robust or spectral ground geometries, Gromov and quantum extensions, and transport-based views of generative models, mean-field networks and attention dynamics. The goal is to keep the mathematics explicit while exposing the computational and geometric intuitions needed to turn OT into a working toolbox for machine learners.

2409.18909 2026-06-16 cs.LG cs.IT math.IT stat.ML Updated version

Best Arm Identification with Minimal Regret

Junwen Yang, Vincent Y. F. Tan, Tianyuan Jin

Affiliations: Institute of Operations Research and Analytics, National University of Singapore; Department of Mathematics, Department of Electrical and Computer Engineering, and Institute of Operations Research and Analytics, National University of Singapore; Department of Mathematics, National University of Singapore

AI summary: Poses the problem of identifying the best arm with confidence level δ while minimizing cumulative regret, derives a lower bound using information-theoretic techniques, and designs the asymptotically optimal Double KL-UCB algorithm.

Abstract

Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This variant of the multi-armed bandit problem elegantly amalgamates two of its most ubiquitous objectives: regret minimization and BAI. More precisely, the agent's goal is to identify the best arm with a prescribed confidence level $δ$, while minimizing the cumulative regret up to the stopping time. Focusing on single-parameter exponential families of distributions, we leverage information-theoretic techniques to establish an instance-dependent lower bound on the expected cumulative regret. Moreover, we present an impossibility result that underscores the tension between cumulative regret and sample complexity in fixed-confidence BAI. Complementarily, we design and analyze the Double KL-UCB algorithm, which achieves asymptotic optimality as the confidence level tends to zero. Notably, this algorithm employs two distinct confidence bounds to guide arm selection in a randomized manner. Our findings elucidate a fresh perspective on the inherent connections between regret minimization and BAI.

2405.15768 2026-06-16 stat.ML cs.AI cs.LG Updated version

Canonical Variates in Wasserstein Metric Space

Jia Li, Lin Lin

Affiliations: Department of Statistics, The Pennsylvania State University; Department of Biostatistics and Bioinformatics, Duke University

AI summary: For classification of distributional data, proposes a dimension reduction method that maximizes Fisher's ratio defined with Wasserstein distances, implemented through an iterative optimization algorithm; experiments show it substantially improves classification performance.

Comments single space 39 pages, 10 figures

Abstract

In this paper, we address the classification of instances represented by distributions on a vector space rather than single points. We consider classification algorithms based on pairwise distances, specifically the Wasserstein metric between distributions. Central to our investigation is dimension reduction within the Wasserstein metric space to enhance classification accuracy. We introduce a novel approach grounded in the principle of maximizing Fisher's ratio, defined as the quotient of between-class variation to within-class variation. The directions in which this ratio is maximized are termed discriminant coordinates or canonical variate axes. In practice, both between-class and within-class variations are defined as the average squared Wasserstein distances between pairs of distributions, with the pairs belonging either to different classes or to the same class, respectively. This ratio optimization is achieved through an iterative algorithm, which alternates between optimal transport and maximization steps within the vector space. Empirical studies are conducted to assess the algorithm's convergence, and experimental results demonstrate that the dimension reduction technique substantially enhances classification performance. Moreover, the new method outperforms well-established algorithms that operate on vector representations derived from distributional data. It also exhibits robustness to variations in how instances are summarized by distributions, such as the number of components in a Gaussian mixture model (GMM) representation.

2606.16726 2026-06-16 q-bio.QM stat.AP New submission

Too Few or Too Many? Sample Size Estimation for Differential Abundance Studies

Michael Agronah, Benjamin M. Bolker

AI summary: Proposes a sample size calculation method based on effect size, mean abundance, and statistical power, implemented in the R package power.nb and evaluated with 30 real microbiome datasets; finds that sample sizes in existing studies are inadequate.

Abstract

Determining an appropriate sample size for a study is a crucial step in planning scientific research. Appropriate sample size planning avoids both inadequate and inflated sample sizes. Inflated sample sizes waste resources, the time and effort of human subjects, and the lives of experimental animals. Inadequate sample sizes, a much more common problem, waste even more resources through the inability to detect biologically meaningful differences and encourage questionable research practices like $p$-hacking. Microbiome studies are particularly challenged by small sample sizes, especially in studies of human subjects or expensive animal models. In practice, the statistical power of taxa within a differential abundance study is influenced by the effect size (typically quantified as fold change), mean abundance of individual taxa, and the number of samples. We present a novel approach for sample size calculation for differential abundance studies as a function of effect size, mean abundance and statistical power of taxa. Our method is implemented in the power.nb R package, available at https://michaelagronah.com/power.nb/articles/stub.html. We applied our model for sample size calculation using estimates of mean abundance and fold change of taxa obtained from thirty real-world microbiome datasets. Our results showed that differential abundance microbiome studies require larger sample sizes than are currently prevalent in the literature to achieve adequate statistical power. Our framework will help researchers make informed decisions about appropriate sample sizes.

2606.16460 2026-06-16 stat.AP stat.ME New submission

Module-Structured Mixture Factor Models to Identify Outcome-Specific Signatures in Gene Expression Data

Jinran Wu, Geoffrey J. McLachlan, Saumyadipta Pyne

AI summary: Proposes a module-structured mixture factor model that combines finite mixture modelling with low-rank factor representations at the gene-module level, decomposing expression variability to enable interpretable unsupervised identification of disease subtypes.

Comments 24 pages, 2 figures

Abstract

High-throughput gene expression data exhibit high dimensionality, complex intergene dependence, and pronounced biological heterogeneity across samples, presenting major challenges for unsupervised clustering and disease subtype discovery. We introduce a module-structured mixture factor model that combines finite mixture modelling with low-rank latent factor representations defined at the gene-module level. By explicitly modelling gene modules in both the mean and covariance structure, the proposed framework decomposes expression variability into global gene-specific effects, cluster-specific module-level shifts, latent dependence within modules, and gene-specific residual noise. An Expectation--Conditional Maximisation algorithm is developed for parameter estimation, allowing stable and scalable inference in high-dimensional transcriptomic settings. This framework enables interpretable unsupervised identification of disease-associated molecular subtypes and phenotypic heterogeneity across two autoimmune diseases using a large clinical transcriptomic dataset.

2606.15478 2026-06-16 stat.ME New submission

A Bayesian Functional Accelerated Failure-Time Model with Varying Effects Correcting for Measurement Error

Joseph Yang, Roger Zoh, Carmen Tekwe, Lan Xue

AI summary: Proposes a Bayesian functional accelerated failure-time model that uses a Gaussian process single-index structure to model how the functional coefficient varies with time and scalar covariates, and uses an instrumental variable to handle measurement error in the functional covariate, in order to analyze the relationship between step-count activity from wearable devices and time-to-death from ischemic stroke.

Abstract

Functional data collected as continuously observed trajectories arise naturally in many biomedical settings, and a key inferential goal is understanding how such functional covariates relate to time-to-event outcomes while allowing that relationship to vary across subgroups defined by scalar characteristics. Existing frequentist approaches to functional accelerated failure-time (AFT) models struggle to flexibly capture the joint, nonlinear influence of scalar covariates and time on the functional effect, and none adequately address the measurement error that frequently contaminates functionally observed exposures. We propose a Bayesian functional AFT model in which the functional coefficient is a varying effect function of both time and a set of scalar covariates, modeled through a Gaussian process single-index structure that provides a flexible, nonlinear framework for subgroup modification. Measurement error in the functional covariate is handled by pairing a proxy observation with an instrumental variable that accommodates nonlinear associations with the latent functional exposure. A prominent source of such functional data is wearable devices, which can continuously monitor physical activity (PA) behavioral patterns over time, yet whose outputs are well known to be prone to measurement error and to exhibit heterogeneous associations with health outcomes across demographic subgroups. Through simulations, we show that our approach recovers the true varying functional effects and reduces bias relative to naïve models that ignore measurement error. We apply our methods to the Reasons for Geographical and Racial Differences in Stroke (REGARDS) study to investigate how step-count physical activity relates to time-to-death from ischemic stroke across racial and regional groups.

2606.15445 2026-06-16 stat.ME New submission

Interim Monitoring as an Information-Time Alignment Problem: The WCR Framework for Time-to-Event Trials

Haitao Pan, Zhongheng Cai

AI summary: Proposes the WCR framework, which parameterizes follow-up maturity through a locked cohort and a calibrated follow-up requirement, resolving the timing tension between event-driven and enrollment-driven designs and controlling type I error and power while balancing calendar time, maturity, and decision lag.

Comments Main manuscript with supplementary material. R package WCRBayesDesign is available on CRAN. Submitted to Biometrics

Abstract

Interim monitoring in time-to-event trials must balance inferential maturity with operationally meaningful timing. Event-driven designs align analyses with event accumulation but can produce substantial and unpredictable calendar delays, whereas enrollment-driven designs provide predictable timing but may rely on immature follow-up. We propose the Window-Cohort with Calibrated Follow-Up Requirement (WCR) framework, which directly parameterizes follow-up maturity through a locked cohort size and a post-lock follow-up requirement. The interim analysis is conducted after the prespecified cohort has accrued the calibrated minimum follow-up, while enrollment may continue and later patients are reserved for the final analysis. The framework distinguishes restricted follow-up for landmark survival estimands from unrestricted follow-up for proportional hazards estimands, thereby linking the effective information horizon to the estimand. Design parameters and decision thresholds are jointly calibrated through constrained optimization to control type I error and power while balancing calendar time, interim maturity, and decision-lag burden. Simulation studies motivated by a rare pediatric oncology trial show that WCR attains target operating characteristics under the calibration model and offers more stable and interpretable interim timing than conventional event-driven and enrollment-driven approaches. The methodology is implemented in the open-source R package WCRBayesDesign, available on CRAN. WCR reframes interim monitoring as an information-time alignment problem and provides a practical design strategy for single-arm trials with sparse events, slow accrual, and long-horizon endpoints.

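2606.15397 2026-06-16 q-bio.PE stat.AP New submission

On the Equivalence of Instantaneous and Mechanistic Reproduction Numbers

Jeremy Goldwasser, Ryan J. Tibshirani, Alyssa Bilinski

AI summary: Proves that, under a homogeneous mixing assumption, the instantaneous reproduction number defined through the renewal equation is equivalent to the mechanistic reproduction number in compartmental models such as SEIR, and derives the generation interval distribution implied by SEIR dynamics.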

2606.15145 2026-06-16 stat.ME New submission

On the estimation of the median odds ratio for measuring contextual effects in multilevel binary data from complex survey designs

Shafayet Khan Shafee, M. Shafiqur Rahman

AI summary: For multilevel binary data, proposes Delta-method interval estimation of the median odds ratio (MOR) for two- and three-level models; simulations show small bias and satisfactory coverage probability for moderate to large samples.

Comments 16 pages, 1 figure, 3 tables and 3 supplementary tables; supplementary material included as an appendix within the same file

Abstract

In studies with clustered or hierarchical data structures, quantifying between-cluster heterogeneity, referred to as contextual effects, is crucial for valid cluster-level inference. The median odds ratio (MOR), derived from random effects (RE) logistic regression models for clustered binary data, provides an intuitive assessment of contextual effects. Most existing research focuses on point estimation of the MOR for two-level models, with limited exploration of its statistical properties under complex multilevel structures. However, the development of corresponding interval estimators is essential for statistical inference. Moreover, many real-world datasets, particularly those from multistage surveys, involve hierarchical structures beyond two levels, where contextual effects at each level are of interest. This paper discusses the estimation of the MOR for both two- and three-level binary data, with particular emphasis on interval estimation. Since the MOR is a post-estimation measure based on variance components of the RE logit model, its confidence interval is derived using the Delta method, treating the log-transformed MOR as asymptotically normal. The approach is demonstrated across different model specifications in two- and three-level settings. An extensive simulation study evaluated the performance of the MOR estimators across diverse scenarios in hierarchical data settings. The results showed that the estimators exhibited negligible bias and satisfactory coverage probability of a 95% confidence interval for moderate to large samples, with small-sample bias mainly due to variance component estimation. An application of the methods for estimating the contextual effect on C-section delivery demonstrated that the proposed framework enhances interpretability and supports more informed statistical and policy-oriented analyses.
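
For a two-level random-intercept logistic model, the MOR and a Delta-method interval can be computed from the estimated random-intercept variance $σ_u^2$ and its standard error. The sketch below assumes the standard definition $\mathrm{MOR} = \exp\left(\sqrt{2σ_u^2}\,Φ^{-1}(0.75)\right)$ and applies the Delta method on the log scale; the function name and the input values are hypothetical, and the paper's three-level extensions are not reproduced here.

```python
import math
from statistics import NormalDist

Z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of the standard normal


def mor_with_ci(var_u, se_var_u, level=0.95):
    """Median odds ratio and a Delta-method interval built on the log scale.

    var_u    : estimated random-intercept variance (assumed input)
    se_var_u : standard error of that variance estimate (assumed input)
    """
    log_mor = Z75 * math.sqrt(2.0 * var_u)
    # Delta method: d log(MOR) / d var_u = Z75 / sqrt(2 * var_u)
    se_log_mor = Z75 / math.sqrt(2.0 * var_u) * se_var_u
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(log_mor - z * se_log_mor)
    upper = math.exp(log_mor + z * se_log_mor)
    return math.exp(log_mor), lower, upper


mor, lower, upper = mor_with_ci(var_u=1.0, se_var_u=0.2)
print(round(mor, 3), round(lower, 3), round(upper, 3))
```

Building the interval on the log scale and exponentiating keeps both limits positive, which is one reason the log-transformed MOR is a natural quantity to treat as asymptotically normal.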

2606.14902 2026-06-16 stat.ME stat.AP 新提交

Biarchetype analysis of univariate functional data with an application to macroeconomic and financial time series

Aleix Alcacer, Rafael Benitez, Vicente J. Bolos, Irene Epifanio

Affiliations: Jaume I University; University of València

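AI summary: Introduces biarchetype analysis, which identifies archetypal structures across both cases and time simultaneously; applied to 10-year government bond yields of European countries, it reveals three time regimes and three country archetypes.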

Comments 6 pages, 2 figures. To be published in the proceedings of SIS-FENStatS 2026, Sapienza University of Rome, Italy, June 22-25, 2026

AI中文摘要

我们首次在单变量函数数据背景下引入双原型分析。这种无监督方法通过同时识别案例（在我们的应用中为国家）和时间参数上的原型结构，扩展了原型分析。案例和时间点都被表示为双原型的混合，从而得到复杂函数观测的简洁且高度可解释的表示。尽管双原型分析并非旨在作为一种聚类技术，但与双聚类方法相比，它提供了更优的可解释性，因为它基于极端的、有代表性的模式而非平均质心，从而增强了人类的理解。我们将所提出的方法应用于2001-2025年期间欧洲国家的10年期政府债券收益率。结果识别出三个不同的时间区间（危机前时期、欧元区主权债务危机时期和危机后时期），并揭示了德国、希腊和匈牙利作为国家原型。

英文摘要

We introduce biarchetype analysis for the first time in the context of univariate functional data. This unsupervised methodology extends archetype analysis by simultaneously identifying archetypal structures across both the cases (countries, in our application) and the temporal argument. Both cases and time points are expressed as mixtures of biarchetypes, yielding a concise and highly interpretable representation of complex functional observations. Although biarchetype analysis is not intended as a clustering technique, it offers superior interpretability compared with biclustering approaches, as it is based on extreme, representative patterns rather than average centroids, thereby enhancing human comprehension. We apply the proposed method to 10-year government bond yields of European countries over the period 2001-2025. The results identify three distinct time regimes (the pre-crisis period, the euro-area sovereign debt crisis, and the post-crisis period), and reveal Germany, Greece, and Hungary as country archetypes.

2606.15876 2026-06-16 stat.ME stat.AP 新提交

Archetypal analysis of European 10-year government bond yields with multidimensional scaling of two-mode three-way asymmetric dissimilarities

基于二模三向非对称相异度多维缩放的欧洲10年期政府债券收益率原型分析

Aleix Alcacer, Rafael Benitez, Vicente J. Bolos, Irene Epifanio

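AI summary: Proposes a method for extracting archetypal profiles from three-way asymmetric proximity data; applied to asymmetric dissimilarity matrices built from the directed wavelet squared coherence of 10-year government bond yields for 23 European countries, it uses h-plot visualization and archetypal analysis to identify archetypal countries and quantify asymmetry.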

Comments 6 pages, 1 figure. To be published in the proceedings of SIS-FENStatS 2026, Sapienza University of Rome, Italy, June 22-25, 2026

2606.15755 2026-06-16 q-fin.RM q-fin.ST stat.ME 新提交

The two faces of Schur damping: high-dimensional pseudo-likelihood and portfolio allocation

Peter Cotton

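AI summary: Shows that the Schur complement in spatial statistics (used for high-dimensional Gaussian pseudo-likelihood estimation) and residual risk in portfolio construction (used in hierarchical risk parity and minimum-variance portfolios) are the same mathematical object, unified through reliability shrinkage, and that the optimal damping has a closed form.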

AI中文摘要

两个很少相互引用的领域——拟合高维天气场的空间统计学家和构建投资组合的量化投资者——独立地得到了相同的数学对象：一个由单个可解释参数阻尼的Schur补。在空间建模中，Schur补是条件协方差，使得高斯（Vecchia）伪似然在规模上可估计，最近的工作通过向基模型收缩来正则化它。在资产配置中，它是净对冲后的残余风险，相同的参数在层次风险平价和最小方差投资组合之间插值。我们证明这些是同一操作——条件高斯分布的可靠性收缩——因此天气模型在站点数超过观测数时需要保持可估计的阻尼，与投资组合在资产数超过回报数时需要保持稳定的阻尼逐项相同。最优量是闭式可靠性，一种同时是Ledoit-Wolf强度的James-Stein收缩。收缩机制是经典的，但这一恒等式似乎是新的：据我们所知，两个文献都没有注意到空间模型拟合的条件收缩与投资组合选择的分散化-方差倾斜是同一个量。我们精确地建立了对应关系，指出两个文献各自提供了对方所缺乏的内容，并报告了一个关于唯一真正开放的选择——如何设置阻尼——的小实验，表明空间社区的拟合强度（如果有的话）是更好的配方。

英文摘要

Two communities that rarely cite each other -- spatial statisticians fitting high-dimensional weather fields, and quantitative investors building portfolios -- have independently arrived at the same mathematical object: a Schur complement, damped by one interpretable parameter. In spatial modeling the Schur complement is the conditional covariance that makes a Gaussian (Vecchia) pseudo-likelihood estimable at scale, and recent work regularizes it by shrinking toward a base model. In allocation it is the residual risk of a bet net of its hedge, and the same parameter interpolates hierarchical risk parity and the minimum-variance portfolio. We show these are one operation -- reliability shrinkage of a conditional Gaussian -- so that the damping a weather model needs to remain estimable when stations outnumber observations is, term for term, the damping a portfolio needs to remain stable when assets outnumber returns. The optimal amount is a closed-form reliability, a James-Stein shrinkage that is simultaneously a Ledoit-Wolf intensity. The shrinkage machinery is classical, but the identity appears to be new: to our knowledge neither literature has noted that the conditional shrinkage a spatial model fits and the diversification-variance tilt a portfolio chooses are one and the same quantity. We make the correspondence precise, note that the two literatures have each supplied what the other lacks, and report a small experiment on the one genuinely open choice -- how to set the damping -- suggesting the spatial community's fitted intensity is, if anything, the better recipe.

2601.20875 2026-06-16 stat.AP cs.LG econ.EM stat.ME stat.ML 版本更新

Drivers, Receivers, and Dynamic Linkages: The Directed Structure of SDG Interdependence, 2000--2024

驱动者、接收者与动态联系：可持续发展目标相互依赖的有向结构，2000-2024

Md Muhtasim Munif Fahim, Md Jahid Hasan Imran, Md. Naim Molla, Luknath Debnath, Tonmoy Shil, Ehsanul Bashar Pranto, Md Mostafizur Rahman Likhon, Md Shafin Sanyan Saad, Md. Rezaul Karim

Affiliations: Data Science Research Lab, Department of Statistics, University of Rajshahi

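AI summary: Uses panel Granger causality tests and local projections to analyze the directed interdependence network of the 17 Sustainable Development Goals across 114 countries from 2000 to 2024; finds 84 significant linkages (40 synergies, 44 trade-offs) and fragile driver-receiver rankings, with peace and strong institutions as a net receiver and poverty reduction as an effect-size-weighted driver.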

Comments 27 pages, 5 figures. Panel Granger non-causality and local projections on 114 countries (2000-2024). Submitted to Sustainability Science

AI中文摘要

财政和行政能力有限的政府需要知道哪些可持续发展目标（SDGs）通过目标系统传播进展以及传播速度有多快。我们利用2000年至2024年每年观测的114个国家的平衡面板数据，绘制了所有17个目标的有向相互依赖结构。目标序列具有持续性、趋势性和横截面依赖性，因此我们应用了两种适用于该机制的估计量：对一阶差分序列运行的Dumitrescu-Hurlin面板格兰杰非因果性检验，以恢复有向交互网络；以及具有Driscoll-Kraay标准误的面板局部投影，以测量31个理论推导的指标联系的动态幅度。在272个有向目标对中，84个联系通过了错误发现控制（40个协同，44个权衡；网络密度0.31）。协同和权衡以相当的强度出现，因此没有单一目标表现为通用加速器，目标层级本身也很脆弱。驱动者-接收者排名在滞后阶数和中心性指标上弱相关，并且在国家自助法下只有两个角色与零可区分：和平与强大机构作为最清晰的净接收者，以及减贫作为最可能的效应量加权驱动者。支持的联系是动态的，在四到五年内累积：卫生设施和贫困改善是降低儿童死亡率的最强预测因子，教育-儿童健康关联在183个国家的独立世界发展指标数据中得到证实。这些结果警示基于排名的加速器政策，并支持基于通过组成指标监测的、有支持的时间滞后联系构建的自适应投资组合。

英文摘要

Governments with limited fiscal and administrative capacity need to know which Sustainable Development Goals (SDGs) propagate progress through the goal system and how quickly. We map the directed interdependence structure of all seventeen goals using a balanced panel of 114 countries observed annually from 2000 to 2024. The goal series are persistent, trending, and cross-sectionally dependent, so we apply two estimators matched to this regime: a Dumitrescu-Hurlin panel Granger non-causality test, run on first-differenced series, to recover the directed interaction network, and panel local projections with Driscoll-Kraay standard errors to measure the dynamic magnitude of 31 theory-derived indicator linkages. Of 272 directed goal pairs, 84 linkages survive false-discovery control (40 synergies, 44 trade-offs; network density 0.31). Synergies and trade-offs occur at comparable strength, so no single goal behaves as a universal accelerator, and the goal-level hierarchy itself is fragile. Driver-receiver rankings correlate weakly across lag orders and centrality metrics, and under a country bootstrap only two roles are distinguishable from zero: peace and strong institutions as the clearest net receiver, and poverty reduction as the most probable effect-size-weighted driver. The supported linkages are dynamic, accruing over four to five years: sanitation and poverty improvements are the strongest predictors of lower child mortality, and the education-child-health association is corroborated in independent World Development Indicators data across 183 countries. These results caution against rankings-based accelerator policy and support adaptive portfolios built on supported, time-lagged linkages monitored through constituent indicators.
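
The counts in the abstract can be checked with a little arithmetic: 17 goals give 17 x 16 = 272 directed pairs, and 84 surviving linkages give a density of 84 / 272, about 0.31. The sketch below reproduces that calculation and shows one common false-discovery procedure. Benjamini-Hochberg is an assumption here, since the abstract does not name the procedure used, and the p-values are made up for illustration only.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return booleans marking hypotheses rejected at FDR level q (step-up rule)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its BH threshold.
        if pvalues[idx] <= q * rank / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


n_goals = 17
n_pairs = n_goals * (n_goals - 1)      # ordered pairs, self-pairs excluded
density = 84 / n_pairs                 # 84 linkages reported in the abstract

# Illustrative p-values only; these are not taken from the paper.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(n_pairs, round(density, 2), benjamini_hochberg(toy_p))
```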

2601.04608 2026-06-16 q-fin.MF q-fin.CP stat.ML 版本更新

Forecasting the U.S. Treasury Yield Curve: A Distributionally Robust Machine Learning Approach for Interest Rate Risk Management

预测美国国债收益率曲线：一种用于利率风险管理的分布鲁棒机器学习方法

Jinjun Liu, Ming-Yen Cheng

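AI summary: To address distributional uncertainty in yield curve forecasting, proposes a distributionally robust ensemble framework that combines parametric factor models with machine learning, improves out-of-sample forecasting performance by penalizing tail risk, and supports DV01-based interest rate risk management.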

Comments 44 pages (including e-companion), 6 figures, under journal review

2512.08219 2026-06-16 cs.DL stat.AP 版本更新

Any Old Tom, Dick or Harry: The Citation Impact of First Name Genderedness

任何老汤姆、迪克或哈里：名字性别化的引用影响

Maxime Holmberg Sainte-Marie, Vincent Larivière

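AI summary: Merges a Wikidata-derived first name genderedness table with Web of Science data on 2010-2019 articles by US-affiliated authors to analyze the relationship between first name genderedness and citation distributions; finds a pervasive citation disadvantage for femininely- and neutrally-gendered names across all disciplinary groups and author roles.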

AI中文摘要

本文研究了作者名字的性别化程度与学术产出中引用分布之间的关系。通过将来自Wikidata的名字性别化表与2010年至2019年间发表、由美国附属作者撰写、被Web of Science索引的文章的文献计量数据合并，我们开发了一个相对分布框架，沿连续性别化谱比较名字、文章和引用计数。结果表明，语料库的词汇结构（由唯一名字（类型）与其出现次数（标记）之间的关系捕捉）在作者角色间高度稳定。生产力分析显示，学科群体在女性化与男性化名字之间的不平衡方向上存在显著差异，物理科学呈现一致的男性化偏向，而社会科学倾向于女性化一端。然而，引人注目的是，分布差异的幅度在学科群体间相对稳定，而方向差异显著。引用分析揭示了所有学科群体和作者角色中女性化和中性名字普遍存在的引用赤字。这种不对称在生命科学中跨作者角色最为一致，而在物理科学中除了中间作者外不存在，中间作者的不成比例份额可能反映了大型合作结构的引用动态。总体而言，尽管数据和研究设计支持关联性而非因果性主张，但所揭示的趋势与假设一致，即名字性别化通过低审慎评估环境中的隐性偏见影响引用认可。

英文摘要

This paper examines the relationship between the genderedness of authors' first names and citation distributions in scholarly production. Merging a first name genderedness table derived from Wikidata with bibliometric data from articles by US-affiliated authors published between 2010 and 2019 and indexed in the Web of Science, we develop a relative distributional framework that compares name, article, and citation counts along a continuous genderedness spectrum. Results show that the lexical structure of the corpus, as captured by the relationship between unique first names (types) and the number of their occurrences (tokens), proves highly stable across author roles. Productivity analyses reveal that disciplinary groups diverge substantially in the direction of the imbalance between femininely- and masculinely-gendered names, with physical sciences showing a consistent masculine skew and social sciences a tendency toward the feminine end of the spectrum. Strikingly, however, the amplitude of distributional divergence remains relatively stable across disciplinary groups, in contrast to its substantial variations in direction. Citation analyses reveal a pervasive citation deficit for femininely- and neutrally-gendered names across all disciplinary groups and author roles. This asymmetry is most consistent across author roles in the life sciences, and absent in the physical sciences except among middle authors, whose unparalleled share plausibly reflects the citation dynamics of large collaborative structures. Overall, although the data and research design support associative rather than causal claims, the trends they reveal are nonetheless consistent with the hypothesis that first name genderedness influences citation recognition through implicit bias operating in low-deliberation evaluative contexts.

2512.08144 2026-06-16 stat.ME 版本更新

Propensity score adjustment when errors in achievement measures inform treatment assignment

当成绩测量误差影响处理分配时的倾向得分调整

Joshua Wasserman, Ben B. Hansen, Michael R. Elliott

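AI summary: To handle noisy subgroup average scores caused by measurement error when evaluating achievement-gap interventions, proposes propensity score estimates that balance subgroup average true scores, improving overlap and reducing the bias of matching estimators; demonstrated through simulation and a Texas initiative on summer learning loss.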

Comments 30 pages, 3 figures

AI中文摘要

美国州教育机构将人口统计子群体之间存在学业成绩差距的学校标记为需要改进。一些学校在这些子群体中可能只有少数学生，因此期末考试成绩的平均值只能有噪声地衡量“真实”平均分——即如果学生多次参加考试所期望的分数。除了公开评估数据中掩盖小群体平均值的问题之外，这给旨在缩小成绩差距的干预措施评估带来了挑战。我们引入了旨在平衡子群体真实平均分的倾向得分估计。即使当噪声测量不可用时，这些估计也可用，并且与忽略测量误差的估计相比改善了重叠性，从而更大程度地减少了匹配估计的偏差。我们通过模拟和在德克萨斯州一项旨在遏制暑期学习损失的州级倡议中的应用来展示我们的方法。

英文摘要

U.S. state education agencies mark schools displaying achievement gaps between demographic subgroups as needing improvement. Some schools may have few students in these subgroups, such that average end-of-year test scores only noisily measure the average "true" score, that is, the score one would expect if students took the test many times. This, in addition to the masking of small subgroup averages in publicly available assessment data, poses challenges for evaluating interventions aimed at closing achievement gaps. We introduce propensity score estimates designed to achieve balance on subgroup average true scores. These estimates are available even when noisy measurements are not and improve overlap compared to those that ignore measurement error, leading to greater bias reduction of matching estimators. We demonstrate our methods through simulation and an application to a statewide initiative in Texas for curbing summer learning loss.

2502.06530 2026-06-16 econ.TH math.ST stat.TH

Ranking Statistical Experiments via the Linear Convex Order and the Lorenz Zonoid: Economic Applications

通过线性凸序和洛伦兹带状体对统计实验进行排序：经济应用

Kailin Chen

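AI summary: Proposes the linear-Blackwell order for comparing experiments in binary-action decision problems and decision problems with quasiconcave payoffs, as well as in moral hazard problems and screening problems with ex post signals.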

Comments The main text ends on page 45, and the supplementary material follows thereafter. This paper was previously circulated under the title "Experiments in the Linear Convex Order".

2606.16952 2026-06-16 cs.LG cs.AI stat.AP stat.ME stat.ML 新提交

Phantoms and Disclosures: a Causal Framework for Auditing Synthetic Data

幻象与披露：合成数据审计的因果框架

Kareem Amin, Rudrajit Das, Alessandro Epasto, Adel Javanmard, Dennis Kraft, Mónica Ribero, Sergei Vassilvitskii

Affiliations: Google; University of Southern California

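AI summary: Proposes a customizable empirical auditing framework that distinguishes true disclosures from phantom disclosures and uses statistical hypothesis testing to detect privacy leakage in synthetic data; it requires no model access or reference model and gives tighter lower bounds on privacy leakage than prior methods.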

Comments 35 pages, 10 tables, 5 figures

AI中文摘要

生成式AI和大语言模型（LLMs）的快速普及激发了人们对合成数据的兴趣，将其作为敏感真实数据集的隐私保护替代方案。然而，生成高实用性合成数据往往存在记忆和复述训练语料中隐私信息的风险。在这项工作中，我们提出了一个可定制的实证审计框架，旨在检测和解释此类数据披露。我们的框架引入了一种机制来区分“真实披露”——系统直接复现用户信息的情况，以及“幻象披露”——系统偶然生成用户数据的情况。通过将输入数据划分为训练集和保留集，并应用严格的统计假设检验，我们确定观察到的披露是否与严格的隐私基线（如零学习或特定的差分隐私（DP）边界）一致。关键的是，这种方法不需要模型访问、不需要插入金丝雀数据，也不需要参考模型训练——仅需要合成输出和保留的控制集。我们证明，该框架有效地充当了成员推断攻击，提供了比先前基于数据的审计方法更紧的隐私泄露经验下界。我们的方法是模型无关的，适用于任何合成数据生成机制，并且所需的计算资源比影子模型或基于金丝雀的替代方法少几个数量级。

英文摘要

The rapid adoption of generative AI and Large Language Models (LLMs) has spurred interest in synthetic data as a privacy-preserving alternative to sensitive real-world datasets. However, generating high-utility synthetic data often carries the risk of memorizing and regurgitating private information from the training corpus. In this work, we present a customizable empirical auditing framework designed to detect and explain such data disclosures. Our framework introduces a mechanism to distinguish between "true disclosures", where the system directly reproduces a user's information, and "phantom disclosures", where the system incidentally generates a user's data. By partitioning input data into training and holdout sets and applying rigorous statistical hypothesis testing, we determine if observed disclosures are consistent with strict privacy baselines, such as zero-learning or specific Differential Privacy (DP) bounds. Crucially, this approach requires no model access, no canary insertion, and no reference model training: only the synthetic output and a held-out control set. We demonstrate that this framework effectively functions as a membership inference attack, providing empirical lower bounds on privacy leakage that are tighter than prior data-based auditing methods. Our approach is model-agnostic, applies to any synthetic data generation mechanism, and requires orders of magnitude fewer computational resources than shadow-model or canary-based alternatives.

2606.16872 2026-06-16 stat.ME 新提交

Towards Fair Predictions: Group Conditional Concordance Index to Quantify Fairness in Time-to-Event Prognostication

迈向公平预测：用于量化时间至事件预测中公平性的组条件一致性指数

Haoyuan Wang, Riddhiman Bhattacharya, Richardo Henao, Daniel Wojdyla, Chuan Hong, Matthew Engelhard

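AI summary: Proposes the group-conditional Concordance Index (xCI), which extends Harrell's Concordance Index to quantify between-group fairness in survival analysis and can be estimated under right censoring; case studies show that it detects biases overlooked by existing metrics.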

Comments 28 pages

AI中文摘要

公平性度量对于严格定义、量化和减轻预测模型中的偏差至关重要。虽然大多数现有度量侧重于二元分类任务，但时间至事件分析中的公平性受到的关注有限。为了解决这一差距，我们提出了一种新的组公平性度量——组条件一致性指数（xCI），它通过以组成员身份为条件扩展了Harrell一致性指数（CI）。xCI在存在右删失数据的情况下测量组内和跨组的排序准确性。我们正式定义了xCI，证明了CI是所有可能组对之间xCI的加权平均值，并利用逆删失概率加权（IPCW）开发了一致估计量。通过分析推导和模拟研究，我们进一步研究了xCI与预测风险评分之间的关系。为了展示其实用性，我们提出了两个案例研究：（i）评估基于Framingham后代、MESA和ARIC研究协调数据训练的生存模型的公平性，以及（ii）使用大规模电子健康记录（EHR）数据库Truveta评估现有心血管疾病（CVD）风险预测模型的公平性。我们的结果表明，xCI有效地检测了现有指标忽略的跨人口统计组的偏差。总体而言，xCI为生存分析中的公平性评估提供了有价值的工具，特别是在资源分配受限的环境中，并补充了现有的公平性评估方法。

英文摘要

Fairness metrics are essential for rigorously defining, quantifying, and mitigating biases in predictive models. While most existing metrics focus on binary classification tasks, fairness in time-to-event analyses has received limited attention. To address this gap, we propose a novel group fairness metric, the group-conditional Concordance Index (xCI), which extends Harrell's Concordance Index (CI) by conditioning on group membership. The xCI measures both within-group and cross-group ranking accuracy in the presence of right-censored data. We formally define the xCI, prove that CI is a weighted average of xCIs across all possible group pairs, and develop a consistent estimator using inverse probability of censoring weights (IPCW). We further investigate the relationship between xCI and predicted risk scores through analytical derivations and simulation studies. To demonstrate its practical utility, we present two case studies: (i) assessing the fairness of survival models trained on harmonized data from the Framingham Offspring, MESA, and ARIC studies, and (ii) evaluating fairness in existing cardiovascular disease (CVD) risk prediction models using Truveta, a large-scale electronic health record (EHR) database. Our results show that xCI effectively detects biases across demographic groups that are overlooked by existing metrics. Overall, xCI provides a valuable tool for fairness assessment in survival analysis, particularly in constrained resource allocation settings, and complements existing fairness evaluation approaches.
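
To make the idea of conditioning a concordance index on group membership concrete, here is a minimal sketch. It restricts Harrell's comparable pairs to those whose earlier-event subject belongs to group `a` and whose later subject belongs to group `b`. That pairing rule is an assumption of this example, not the paper's definition, and the sketch omits the inverse-probability-of-censoring weights the abstract mentions. Under this reading the overall index is a pair-count-weighted average of the group-pair indices, which the last lines check on toy data.

```python
from itertools import product


def pair_counts(time, event, risk, group=None, a=None, b=None):
    """Return (concordant, comparable) pair counts; risk ties count one half."""
    conc, comp = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                      # subject i must have an observed event
        if a is not None and group[i] != a:
            continue
        for j in range(n):
            if time[i] < time[j] and (b is None or group[j] == b):
                comp += 1
                if risk[i] > risk[j]:
                    conc += 1.0
                elif risk[i] == risk[j]:
                    conc += 0.5
    return conc, comp


def c_index(time, event, risk, group=None, a=None, b=None):
    conc, comp = pair_counts(time, event, risk, group, a, b)
    return conc / comp if comp else float("nan")


# Toy data, made up for illustration.
time = [2, 4, 5, 7, 8, 10]
event = [1, 1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.6, 0.5, 0.3, 0.1]
group = ["A", "B", "A", "B", "A", "B"]

overall = c_index(time, event, risk)
parts = [pair_counts(time, event, risk, group, a, b) for a, b in product("AB", repeat=2)]
recombined = sum(c for c, _ in parts) / sum(m for _, m in parts)
print(overall, recombined)
```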

2606.16488 2026-06-16 stat.ME 新提交

An Energy-Driven Framework for Privacy-Aware Synthetic Data Generation

一种能量驱动的隐私感知合成数据生成框架

Pierpaolo Massoli, Fabio Spagnuolo

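AI summary: Proposes an energy-driven framework that uses constrained stochastic exploration with explicit, interpretable penalties to balance statistical fidelity against disclosure risk in mixed-type data, enabling privacy-aware synthetic data generation.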

Comments First release of the paper

AI中文摘要

官方统计和数据密集型应用中对微观数据访问的需求日益增长，这引发了关于披露风险、推断有效性和统计效用保留的重要挑战。本文提出了一种可解释的能量驱动框架，用于混合类型数据中的隐私感知合成数据生成。该方法结合了判别建模、贝叶斯网络提议机制、Metropolis-Hastings采样和生成后优化，在一个约束概率框架内进行。与基于扰动的方法不同，隐私感知行为通过由明确的合理性、隐私性、多样性和结构一致性惩罚引导的约束随机探索来实现。该框架专门针对具有稀疏配置、异构变量类型和复杂多元依赖结构的混合类型表格数据设计。生成过程被表述为一个多目标采样问题，在保留预测效用的同时平衡统计保真度和披露风险。使用一个包含人口统计、行为和健康相关变量的混合类型个体级数据集进行了广泛的实证评估。验证策略结合了统计保真度诊断、预测分析、多样性度量、最近邻风险分析、成员推断攻击和分裂共形预测。实证结果表明，所提出的框架能够保留原始数据的大部分预测和多元结构，同时限制精确记忆现象并保持有利的隐私感知行为。该方法为在竞争效用和隐私约束下生成合成数据提供了一个可解释的框架。

英文摘要

The increasing demand for access to microdata in official statistics and data-intensive applications raises important challenges concerning disclosure risk, inferential validity and preservation of statistical utility. This paper proposes an interpretable energy-driven framework for privacy-aware synthetic data generation in mixed-type data. The proposed methodology combines discriminative modelling, Bayesian-Network proposal mechanisms, Metropolis--Hastings sampling and post-generation optimization within a constrained probabilistic framework. Unlike perturbation-based approaches, privacy-aware behaviour is achieved through constrained stochastic exploration guided by explicit plausibility, privacy, diversity and structural-coherence penalties. The framework is specifically designed for mixed-type tabular data characterized by sparse configurations, heterogeneous variable types and complex multivariate dependency structures. The generation process is formulated as a multi-objective sampling problem balancing statistical fidelity and disclosure-risk while preserving predictive utility. An extensive empirical evaluation is conducted using a mixed-type individual-level dataset containing demographic, behavioural and health-related variables. The validation strategy combines statistical fidelity diagnostics, predictive analyses, diversity measures, nearest-neighbour risk analysis, membership inference attacks and Split Conformal Prediction. The empirical results suggest that the proposed framework is capable of preserving a substantial portion of the predictive and multivariate structure of the original data while limiting exact memorization phenomena and maintaining favourable privacy-aware behaviour. The proposed methodology provides an interpretable framework for synthetic data generation under competing utility and privacy constraints.

2606.15964 2026-06-16 stat.ML cs.LG 新提交

PromptShift-CRC: Drift-Aware Conformal Risk Control for Foundation Models Under Prompt and Domain Shift

PromptShift-CRC: 面向提示和领域漂移的基础模型的漂移感知保形风险控制

Jeffery Opoku, David Banahene

Affiliations: The University of Texas Rio Grande Valley; Florida International University

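AI summary: Proposes PromptShift-CRC, which embeds prompts and responses, measures drift, reweights calibration examples, and updates the risk level online to control the risk of foundation-model outputs under prompt and domain shift.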

AI中文摘要

基础模型现在被用于其接收的提示可能快速变化的场景。用户变化、主题变化、策略变化，模型可能突然面临在校准数据中罕见的请求类型。这使得固定校准变得有风险。保形预测和保形风险控制提供了与模型无关的控制错误的方法，但当校准数据与未来数据相似时效果最佳。本文开发了PromptShift CRC，一种面向提示和领域漂移的基础模型输出的漂移感知保形风险控制方法。该方法嵌入提示和响应，测量当前提示流与校准池的偏离程度，对相关或最近的校准示例赋予更大权重，并在观察到违规后在线更新风险水平。它报告三个实用诊断指标：实现风险误差、提示漂移和有效校准大小。我们给出了该方法在分布不匹配和加权分位数不确定性项下控制风险的条件。在一个合成提示漂移基准中，静态保形风险控制在漂移后急剧失效，而PromptShift-CRC在所考虑的适应性基线中提供了最佳覆盖。然后，我们在公开基准的派生流上评估相同的校准层，包括问答、毒性、摘要事实性和长上下文幻觉风险。

英文摘要

Foundation models are now used in settings where the prompts they receive can change quickly. Users change, topics change, policies change, and the model may suddenly face a kind of request that was rare in the calibration data. This makes fixed calibration risky. Conformal prediction and conformal risk control give model-agnostic ways to control error, but they work best when the calibration data still look like the future data. This paper develops PromptShift-CRC, a drift-aware conformal risk control method for foundation-model outputs under prompt and domain shift. The method embeds prompts and responses, measures how far the current prompt stream has moved from the calibration pool, gives more weight to relevant or recent calibration examples, and updates the risk level online after observed violations. It reports three practical diagnostics: realized risk error, prompt drift, and effective calibration size. We give conditions under which the method controls risk up to terms for distribution mismatch and weighted quantile uncertainty. In a synthetic prompt-shift benchmark, static conformal risk control fails sharply after drift, while PromptShift-CRC gives the best coverage among the adaptive baselines considered. We then evaluate the same calibration layer on streams derived from public benchmarks for question answering, toxicity, summarization factuality, and long-context hallucination risk.

2606.15474 2026-06-16 cs.AI stat.AP 新提交

Who Drifted: the System or the Judge? Anytime-Valid Attribution in LLM Evaluation Pipelines

谁漂移了：系统还是裁判？LLM评估流水线中的随时有效归因

Yitao Li

Affiliations: University of California, Berkeley

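AI summary: Proposes a method based on a fixed anchor set and betting-based tests that distinguishes score drift caused by product degradation from drift caused by a changed judge model in LLM evaluation, and proves its anytime-validity and attribution accuracy.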

AI中文摘要

对LLM产品的持续评估依赖于一个被视为基准真值的强大LLM裁判：一个廉价的监控器对每次交互进行评分，当分数下降时团队会收到警报。但裁判本身是一个API背后的模型，静默的版本升级或评分提示更新会改变其评分方式，因此每次漂移警报在更差的产品和变化的裁判之间是模糊的。我们通过一个固定的人工标注锚点集（当前裁判以稳定间隔重新评分）、一个关于裁判与人类差距的第二个投注e过程，以及一个返回{无, 系统, 裁判}判决的守卫窗口规则来解决这种模糊性。我们证明了随时有效性、单向识别（只有裁判可以移动锚点）、一个归因竞赛（其设计法则是锚点必须跑赢它们守卫的主过程）以及过程正交性。在两个真实的裁判变化中，静默版本升级在60/60次运行中被检测为裁判漂移，且零次误归因为系统；而一个污染性的严格提示变化在守卫宽度为300时，120次运行中有110次被正确归因，而行业默认的滚动z检验在75%的无漂移流上产生误报。每个实验在第二个领域（TL;DR摘要）上重复，无需重新调整参数，并且当领域不同时，差异正是竞赛所预测的：严格提示变化在那里更强烈地改变分数，因此锚点触发更快，归因变得完美（240/240）。该监控器的运行成本约为对每个项目使用强裁判的0.64倍，或在更便宜但更聋的模式下为0.21倍。

英文摘要

Continuous evaluation of LLM products relies on a strong LLM judge treated as ground truth: a cheap monitor scores every interaction and a team is paged when the score drifts down. But the judge is itself a model behind an API, and a silent version bump or scoring-prompt update changes how it scores -- so every drift alarm is ambiguous between a worse product and a changed judge. We resolve the ambiguity with a fixed, human-labeled anchor set that the current judge re-scores at a steady interleave, a second betting e-process on the judge-versus-human gap, and a guard-window rule returning a verdict in {none, system, judge}. We prove anytime-validity, one-way identification (only the judge can move the anchors), an attribution race whose design law is that the anchors must out-run the main process they guard, and process orthogonality. On two real judge changes, a silent version bump is detected as judge drift in 60/60 runs with zero judge-to-system misattribution, and a contaminating strict-prompt change is correctly attributed on 110 of 120 runs at guard width 300 -- while the industry-default rolling z-test false-alarms on 75% of drift-free streams. Every experiment replicates on a second domain (TL;DR summarization) with nothing re-tuned, and where the domains differ the differences are the ones the race predicts: the strict-prompt change shifts scores harder there, so the anchors fire faster and attribution becomes perfect (240/240). The monitor runs at approximately 0.64 of the cost of strong-judging every item, or 0.21 in a cheaper-but-deafer regime.
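
A betting e-process of the kind mentioned above can be sketched in a few lines. The version below tests whether the mean judge-versus-human gap on anchor items stays at or below a tolerance `m0`, using a fixed bet size `lam`. All names, the fixed bet, and the assumption that gaps are scaled to [0, 1] are choices made for this illustration; the paper's actual betting strategy and guard-window rule are not reproduced here. By Ville's inequality, a nonnegative supermartingale starting at 1 crosses 1/alpha with probability at most alpha under the null, which is what makes the stopping rule valid at any time.

```python
def betting_eprocess(gaps, m0=0.1, lam=2.0, alpha=0.05):
    """Return the first step at which wealth reaches 1/alpha, or None.

    gaps:  judge-versus-human gaps on anchor items, assumed to lie in [0, 1].
    m0:    largest mean gap allowed under the null (hypothetical tolerance).
    lam:   fixed bet size; must satisfy 0 <= lam < 1/m0 so wealth stays nonnegative.
    """
    if not 0.0 <= lam < 1.0 / m0:
        raise ValueError("lam must lie in [0, 1/m0)")
    wealth = 1.0
    for t, g in enumerate(gaps, start=1):
        wealth *= 1.0 + lam * (g - m0)   # wealth grows when gaps exceed m0
        if wealth >= 1.0 / alpha:
            return t                     # evidence of judge drift at level alpha
    return None


print(betting_eprocess([0.1] * 100))     # stable judge: no alarm
print(betting_eprocess([0.6] * 100))     # shifted judge: alarm after a few anchors
```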

2606.14909 2026-06-16 stat.ML cs.LG 新提交

Audited Conformal Prediction for Classification under Unknown Distribution Shift

未知分布漂移下分类问题的审计共形预测

Yanfei Zhou, Rizal Fathony, Nam H. Nguyen, Matteo Sesia

Affiliations: Department of Data Sciences and Operations, University of Southern California; AI Foundations, Capital One; Department of Data Sciences and Operations, Thomas Lord Department of Computer Science, University of Southern California

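AI summary: Proposes Audited Conformal Prediction, which uses a small labeled dataset from the target population to train an audit model that identifies inputs where the legacy model is likely to fail, and integrates it into the conformal prediction framework to raise conditional coverage while guaranteeing marginal coverage, with theoretical guarantees.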

AI中文摘要

我们考虑在未知分布漂移下部署的预训练分类模型的不确定性量化问题。我们提出了审计共形预测（ACP），该方法利用来自目标群体的小标注数据集训练一个辅助审计模型，以识别旧模型可能失败的输入。通过将审计模型的输出整合到共形预测框架中，ACP 产生的预测集在保证边际覆盖的同时，在实践中比现有方法实现了更高的条件覆盖。我们开发并分析了两种互补的整合策略——一种针对边际覆盖并改善条件性能，另一种提供明确的组条件覆盖保证——并为两者建立了理论保证。在合成和真实世界数据集上的实验验证了该方法，并说明了预测集大小与条件覆盖之间的权衡。

英文摘要

We consider the problem of uncertainty quantification for a pretrained classification model deployed under unknown distribution shift. We propose Audited Conformal Prediction (ACP), a method that leverages a small labeled dataset from the target population to train an auxiliary audit model identifying inputs where the legacy model is likely to fail. By integrating the audit model's outputs into the conformal prediction framework, ACP produces prediction sets that guarantee marginal coverage while achieving substantially higher conditional coverage in practice than existing approaches. We develop and analyze two complementary integration strategies -- one targeting marginal coverage with improved conditional performance, the other providing explicit group-conditional coverage guarantees -- and establish theoretical guarantees for both. Experiments on synthetic and real-world datasets validate the method and illustrate trade-offs between prediction set size and conditional coverage.
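
For readers unfamiliar with the baseline that ACP builds on, the sketch below shows plain split conformal prediction for classification, with the nonconformity score taken as one minus the model's probability for the true label. It does not include the audit model or either of the paper's integration strategies; the function names and the choice of score are assumptions of this example.

```python
import math


def conformal_threshold(cal_scores, alpha=0.1):
    """Calibrated score threshold giving marginal coverage of at least 1 - alpha."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the finite-sample quantile
    if k > n:
        return float("inf")                # too few calibration points: keep every label
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, threshold):
    """Labels whose score 1 - p(label | x) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


# Calibration scores are 1 - p(true label) on held-out labeled data (toy values).
cal_scores = [0.1, 0.2, 0.3, 0.4]
thr = conformal_threshold(cal_scores, alpha=0.5)
print(thr, prediction_set([0.75, 0.2, 0.05], thr))
```

The marginal guarantee relies on calibration and test points being exchangeable, which is exactly the assumption that distribution shift breaks and that motivates methods such as the one in this abstract.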

2504.11775 2026-06-16 stat.ML cs.CY cs.LG q-fin.RM 版本更新

hyreg2: An R package for estimating latent classes on a mixture of continuous and dichotomous data

Svenja Elkenkamp, John Grosser, Kim Rand

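AI summary: Presents the hyreg2 R package, which estimates latent class models for mixed outcome types with a joint likelihood approach, supports continuous and dichotomous data, is implemented with the EM algorithm, and offers a user-friendly interface.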

Comments Package hyreg2 available on CRAN

AI中文摘要

R包hyreg2引入了一个频率学派框架，用于使用联合似然方法估计混合结果类型的潜在类别模型。该方法在假设两种结果类型来自共同底层数据生成过程的情况下，结合了连续和二分类数据。在实现的模型中，连续响应假设服从正态分布，而二分类响应使用二项分布建模。这类模型在多个科学学科中用于估计不同类型数据（例如临床试验、计量经济学和健康经济学）的共同参数集。潜在类别估计使用广泛使用的R包flexmix中实现的期望最大化算法。hyreg2包提供了该联合似然框架的用户友好实现，允许用户无需显式编程似然函数即可估计模型。异方差性和删失数据可以被考虑。除了模型估计，该包还提供了专门的汇总和可视化函数以促进结果解释。本文介绍了该包所依据的方法论框架，并通过基于EQ-5D-5L价值集估计的示例说明了其功能。

英文摘要

The R package hyreg2 introduces a frequentist framework for estimating latent class models for mixed outcome types using a joint likelihood approach. The method combines continuous and dichotomous data under the assumption that both outcome types arise from a common underlying data-generating process. In the implemented model, continuous responses are assumed to follow a normal distribution, while dichotomous responses are modeled using a binomial distribution. Such models are used in various scientific disciplines to estimate a common set of parameters across different types of data (e.g. clinical trials, econometrics and health economics). Latent class estimation is performed using the expectation-maximization algorithm as implemented in the widely used R package flexmix. The hyreg2 package offers a user-friendly implementation of this joint likelihood framework, allowing users to estimate models without explicitly programming the likelihood function. Heteroskedasticity as well as censored data can be taken into account. In addition to model estimation, the package provides dedicated summary and visualization functions to facilitate the interpretation of results. The article presents the methodological framework underlying the package and illustrates its functionality through an example based on the estimation of an EQ-5D-5L value set.

2606.15933 2026-06-16 stat.ME stat.CO 新提交

A Comparison of $\texttt{R}$ Packages for Estimating Generalized Linear Mixed Models

用于估计广义线性混合模型的 $\texttt{R}$ 包比较

Xiang Li, Mirko Signorelli

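AI summary: Uses Monte Carlo simulation to systematically compare seven representative R packages for estimating generalized linear mixed models in terms of convergence, computational speed, estimation accuracy, and hypothesis testing performance, and offers recommendations for choosing among them in practice.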

Comments 22 pages, 13 figures

AI中文摘要

广义线性混合模型（GLMM）广泛用于分析相关数据，如纵向和多层次数据。由于CRAN上有超过15个R包可用于拟合GLMM，实践者面临艰难选择：哪个包能提供准确估计、可靠收敛和合理计算速度？现有比较要么局限于单个包内的方法，要么只关注速度等狭窄标准。为填补这一空白，我们系统比较了七个代表性R包——lme4、GLMMadaptive、glmmTMB、MASS、hglm、brms和rstanarm——它们实现了不同的估计框架。通过24种情景的蒙特卡洛模拟，我们评估了每个包的收敛率、计算时间、估计精度和假设检验性能。结果表明，lme4_AGQ和GLMMadaptive具有最高的精度和收敛率，尽管GLMMadaptive在复杂随机效应结构下变慢。lme4_LA和glmmTMB计算速度快，但收敛率较低且偏差较大，尤其是方差分量。MASS和hglm也很快，但MASS产生宽松的单变量检验，hglm缺乏对相关随机效应和多元检验的支持。在两个贝叶斯包中，rstanarm可靠收敛并产生有效的单变量检验，而brms极慢，限制了其实用性。基于这些发现，我们为应用研究中选择GLMM工具提供了实用建议。

英文摘要

Generalized linear mixed models (GLMMs) are widely used for analyzing correlated data, such as longitudinal and multilevel data. With over 15 $\texttt{R}$ packages available on $\texttt{CRAN}$ for fitting GLMMs, practitioners face a difficult choice regarding which package yields accurate estimates, converges reliably, and offers reasonable computational speed. Existing comparisons are either limited to methods within a single package or focus on narrow criteria such as speed alone. To address this gap, we systematically compared seven representative $\texttt{R}$ packages -- $\texttt{lme4}$, $\texttt{GLMMadaptive}$, $\texttt{glmmTMB}$, $\texttt{MASS}$, $\texttt{hglm}$, $\texttt{brms}$, and $\texttt{rstanarm}$ -- that implement different estimation frameworks. By using Monte Carlo simulations across 24 scenarios, we evaluated each package in terms of convergence ratios, computational time, estimation accuracy, and hypothesis testing performance. Our results showed that $\texttt{lme4_AGQ}$ and $\texttt{GLMMadaptive}$ yield the highest accuracy and convergence ratios, although $\texttt{GLMMadaptive}$ becomes slower under complex random-effect structures. $\texttt{lme4_LA}$ and $\texttt{glmmTMB}$ are computationally fast but exhibit lower convergence ratios and larger bias, especially for variance components. $\texttt{MASS}$ and $\texttt{hglm}$ are also fast, but $\texttt{MASS}$ yields liberal univariate tests and $\texttt{hglm}$ lacks support for correlated random effects and multivariate testing. Between the two Bayesian packages, $\texttt{rstanarm}$ converges reliably and produces valid univariate tests, whereas $\texttt{brms}$ is extremely slow, limiting its practical utility. Based on these findings, we provide practical recommendations for choosing a GLMM tool in applied research.

2606.15760 2026-06-16 cs.LG stat.ML 新提交

The Data Manifold under the Microscope

显微镜下的数据流形

Marios Koulakis, Constantin Seibold

Affiliations: University of California, Berkeley

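AI summary: To address the gap between deep learning theory and practice, proposes a benchmarking framework that extends the dSprites and COIL-20 datasets and pairs them with finite-difference estimators to recover curvature, reach, and volume at near-ground-truth accuracy, for calibrating geometric estimators and probing theoretical assumptions.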

Comments Accepted at ICML 2026. Camera-ready version

Abstract

A significant gap exists between theory and practice in deep learning. Generalization and approximation error bounds are often derived for simplified models or are too loose to be informative. Many rely on the manifold hypothesis and on geometric regularity such as intrinsic dimension, curvature, and reach. Progress requires insight into data-manifold geometry and suitable benchmarks, yet existing options are polarized: analytic manifolds with known geometry but limited applicability, or real-world datasets where geometry is only coarsely estimable. We introduce a benchmarking framework for studying data geometry. We repurpose and extend dSprites and COIL-20 with additional transformation dimensions and dense, axis-aligned sampling, and pair them with finite-difference estimators that recover curvature, reach, and volume at near-ground-truth accuracy in a regime where general-purpose estimators are unreliable or difficult to deploy. The framework is intended as a controlled testbed, useful as a calibration environment for geometric estimators and a sandbox for probing theoretical assumptions. To illustrate its use, we present two application studies, namely assessing the scaling behavior of the bounds of Genovese et al. and Fefferman et al., and tracking the layer-wise geometry of a $β$-VAE, highlighting the behavior of current bounds and the value of controlled benchmarks for guiding and validating future theory. A reference implementation is available at https://github.com/koulakis/manifold-microscope.

2606.15314 2026-06-16 cs.LG cs.AI stat.ML New submission

LLMs on Tabular Data with Limited Semantics: Evidence from Industrial Car Retrofit Prediction

Aina Vila Pons, Ioannis Tzachristas, Constantinos Antoniou

Affiliations: Technical University of Munich; BMW Group

AI summary: Compares LLM strategies (embeddings, direct classification, hybrid stacking) with classical tree ensembles on industrial tabular data; LLMs are of limited use when semantics are restricted, but embeddings and hybrid methods remain valuable.

Abstract

Industrial retrofit planning depends on structured operational data rather than free text: planners must estimate whether a newly registered prototype will require a retrofit, which retrofit package it will need, and how long the work will take. We study an industrial dataset linking a prototype-registration system (284,271 vehicles) with a retrofit-management system (48,716 cleaned visits), and compare strong tabular machine learning baselines with three LLM-based strategies on row-serialized inputs: embedding features (Amazon Titan), direct prompted classification (Claude Sonnet 4), and an ML+LLM stacking approach. Across binary occurrence prediction, 15-way retrofit-type classification, per-visit duration regression, and an aggregated monthly benchmark, classical tree ensembles remain the strongest standalone models. However, the LLM results reveal a consistent pattern: embeddings remain useful on tables (binary AUC = 0.982), direct prompting collapses once semantic signal is stripped by hashing (binary AUC = 0.500; multiclass weighted F1 = 0.018), and hybrid stacking yields the best manually built multiclass model (weighted F1 = 0.626). On the monthly benchmark, lag-based machine learning outperforms time-series foundation models, though Chronos-small remains competitive in zero-shot forecasting. The results suggest that on privacy-constrained industrial tables, LLMs are more effective as complementary components than as replacements for strong tabular baselines.

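2606.15201 2026-06-16 stat.AP math.DS stat.ML New submission

A Koopman-PINN Framework for Epidemic Models: Parameter Inference and Forecasting

Achraf Zinihi, Matthias Ehrhardt, Moulay Rchid Sidi Ammi

AI summary: Proposes a Koopman-enhanced physics-informed neural network (K-PINN) framework that combines Koopman operator theory with physics-informed learning for parameter inference and long-term forecasting in nonlinear epidemic models; it outperforms classical PINNs and Koopman-EDMD on synthetic Mpox data and real SARS-CoV-2 datasets.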

Abstract

We propose a Koopman-enhanced physics-informed neural network (K--PINN) framework for parameter inference and forecasting in nonlinear epidemic models. This method combines Koopman operator theory and physics-informed learning. It maps epidemic states into a latent observable space where the dynamics evolve approximately linearly while satisfying the governing epidemic equations through automatic differentiation. This integration improves interpretability, parameter identifiability, and long-term predictive stability. We apply the proposed framework to a normalized SEIRSD epidemic model and evaluate it using synthetic monkeypox (Mpox) data and real-world datasets from Germany, Morocco, and Sweden for the SARS-CoV-2 virus. Synthetic trajectories are generated using a structure-preserving, nonstandard finite difference scheme to ensure reliable training data. Numerical results demonstrate that K--PINN achieves more accurate parameter estimation, trajectory reconstruction, and long-term forecasting than classical PINNs and Koopman-EDMD approaches. These results suggest that K--PINN is an effective machine learning framework for epidemic modeling that can be extended to more complex systems.

2606.07622 2026-06-16 cs.LG stat.AP New submission

Airport Terminal Passenger Queue Forecasting for Departure Gates and Security Checkpoints

Juhwan Lee, Seokbin Yoon, Keumjin Lee, Hojong Baik, Seyeon Jung

Affiliations: Korea Aerospace University; Korea Airports Corporation

AI summary: Proposes a Transformer-based framework that uses historical queue lengths, waiting times, and passenger throughput to forecast queue length and waiting time at departure gates and security checkpoints up to two hours ahead, supporting proactive queue management.

Comments 10 pages, 6 figures, accepted at DASC 2026

Abstract

Accurate passenger queue forecasting in airport terminals is essential for efficient departure operations, as it enables proactive congestion management. However, time-varying passenger demand and heterogeneous facility usage across multiple departure facilities make forecasting challenging. In this work, we propose a passenger queue forecasting framework that learns historical passenger flow patterns from operational data. The proposed model employs a Transformer-based architecture to capture temporal dependencies and inter-facility correlations using past queue length and waiting time at departure gates and security checkpoints, together with passenger throughput at check-in islands. The learned representations are mapped to two facility-specific prediction heads to predict queue length and waiting time at departure gates and security checkpoints. Experimental results demonstrate accurate forecasts up to two hours ahead. The proposed approach offers practical real-time decision support for proactive queue management and staff reallocation in airport terminal operations.

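2603.12881 2026-06-16 stat.AP stat.ME Updated version

Multivariate lattice deformation: A spatially explicit framework for assessing crop rotation impacts on soil nutrient dynamics

Marco Mandap

AI summary: Proposes a multivariate lattice model that treats soil as a 4D tensor, represents crop rotations as force vectors, models lateral nutrient movement with kernel smoothing, and quantifies cumulative impact by Euclidean distance in N-P-K space to identify risk zones and guide site-specific management.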

Comments An error was identified in the underlying distribution proof used for the empirical copula test. The authors are withdrawing this version while finalizing a formally verified proof of the distribution in Lean 4

Abstract

Crop rotation impacts on soil nutrients are typically assessed using field-averaged or single-nutrient analyses that ignore spatial heterogeneity and multivariate interactions. We propose a multivariate lattice model treating soil as a 4D tensor (space, time, and N, P, K channels). Crop rotations are represented as force vectors, with soil buffering capacity ("stiffness") varying spatially with texture. Lateral nutrient movement is introduced via kernel smoothing. Cumulative impact is quantified by Euclidean distance in N-P-K space, with significance assessed via Cramer-von Mises permutation tests. Simulating a three-year corn-soybean-wheat rotation on a 20 x 20 heterogeneous grid shows mean stress of 0.63 after one cycle, with maximum 0.91 in sandy areas. Phosphorus depletion (17.9%) exceeds nitrogen (10.8%), dominating stress in 19% of cells - obscured by single-nutrient analyses. Continuous corn increases mean stress by 41%. Cramer-von Mises tests detect significant deviation (p <= 0.002), and Moran's I (0.29-0.30) confirms spatial autocorrelation. Our framework identifies risk zones and guides site-specific management, bridging geostatistics with mechanistic crop models.
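
A minimal Python sketch of the two summary quantities named in the abstract: per-cell stress as the Euclidean distance between nutrient states in N-P-K space, and Moran's I for spatial autocorrelation. The function names, the rook (4-neighbour) binary weights, and the toy inputs are assumptions made for illustration; the paper's lattice dynamics, stiffness scaling, and permutation test are not reproduced here.

```python
import math


def stress(before, after):
    """Euclidean distance between two (N, P, K) nutrient states."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))


def morans_i(grid):
    """Moran's I on a 2D grid with rook (4-neighbour) binary weights.

    The weight scheme is an assumption; the abstract does not state it.
    """
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    cross = 0.0   # sum of w_ij * (x_i - mean) * (x_j - mean)
    w_sum = 0.0   # sum of all weights w_ij
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < rows and 0 <= b < cols:
                    cross += (grid[i][j] - mean) * (grid[a][b] - mean)
                    w_sum += 1.0
    spread = sum((v - mean) ** 2 for row in grid for v in row)
    return (n / w_sum) * (cross / spread)


# Hypothetical usage: stress per cell, then autocorrelation of the stress map.
stress_map = [
    [stress((1.0, 1.0, 1.0), (1.0 - 0.1 * j, 1.0 - 0.2 * j, 1.0)) for j in range(4)]
    for _ in range(4)
]
autocorrelation = morans_i(stress_map)
```

Values of Moran's I near zero indicate no spatial structure, while positive values indicate that neighbouring cells carry similar stress, which is the pattern the abstract reports.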

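2510.14092 2026-06-16 stat.ML cs.LG Updated version

deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss

Julio Enrique Castrillon-Candas, Hanfeng Gu, Caleb Meredith, Yulin Li, Xiaojing Tang, Pontus Olofsson, Mark Kon

AI summary: Proposes a deforestation detection pipeline that fuses optical and SAR data, builds anomaly maps from the residual space of a discrete KL expansion, and classifies forest state with an HMM; tests on an Amazon region show the hybrid method outperforms a recent state-of-the-art method and is more robust to sparse optical data.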

DOI: 10.1109/TGRS.2026.3689741
Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 64, 2026, Art no. 4409213

Abstract

In this paper we develop a deforestation detection pipeline that incorporates optical and Synthetic Aperture Radar (SAR) data. A crucial component of the pipeline is the construction of anomaly maps of the optical data, which is done using the residual space of a discrete Karhunen-Loève (KL) expansion. Anomalies are quantified using a concentration bound on the distribution of the residual components for the nominal state of the forest. This bound does not require prior knowledge of the distribution of the data. This is in contrast to statistical parametric methods that assume knowledge of the data distribution, an impractical assumption that is especially infeasible for high-dimensional data such as ours. Once the optical anomaly maps are computed, they are combined with SAR data, and the state of the forest is classified using a Hidden Markov Model (HMM). We test our approach with Sentinel-1 (SAR) and Sentinel-2 (optical) data on a $92\,km \times 92\,km$ region in the Amazon forest. The results show that both the hybrid optical-radar and optical-only methods achieve high accuracy that is superior to the recent state-of-the-art hybrid method. Moreover, the hybrid method is significantly more robust in the case of sparse optical data, which are common in highly cloudy regions.

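2502.10182 2026-06-16 stat.ME stat.AP Updated version

Scalable Generalised Accuracy Estimation for Multisource Register-based Official Statistics

Nina Deliu, Piero Demetrio Falorsi, Stefano Falorsi, Diego Chianella, Giorgio Alleva

AI summary: For the multiple error sources in multisource register-based statistics, proposes an analytical approximation of the global error under a multinomial logistic model, giving interpretable, flexible, and computationally scalable accuracy quantification.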

Comments 49 pages (main manuscript and supplementary material); 7 tables, 5 figures

Abstract

Official statistics are undergoing a significant transformation, as national statistical institutes transition from traditional single-source data production systems to integrated systems of statistical registers combining administrative, census, and survey data. The resulting multisource register-based estimates are prone to multiple interacting sources of error, yet rigorous scalable frameworks for quantifying their accuracy remain underdeveloped. This work discusses and validates a global measure of error assessment for such multisource register-based statistics. Focusing on two central sources of uncertainty, sampling and modelling, we derive an analytical solution that accurately approximates the global error of mass-imputation procedures under a multinomial logistic model. The proposed measure is interpretable, flexible, and computationally scalable, enabling on-the-fly accuracy quantification for user-defined, unplanned domain-specific statistics on population totals. Its validity is established theoretically and confirmed through simulation studies. An application to education data from the Italian National Institute of Statistics is presented.

2606.17022 2026-06-16 math.ST cs.LG stat.ML stat.TH New submission

Learning the Geometry of Data: A Mathematical Review of Shape Space Analysis

Gary P. T. Choi, Khanh Dao Duc, Shira Faigenbaum-Golovin, Karen Habermann, Emmanuel Hartman, Christoph von Tycowicz, Chi Zhang, Wenjun Zhao, Felix Zhou

AI summary: Surveys shape space analysis, drawing on differential geometry, statistics, and machine learning to organize an analytical pipeline from shape representation to geometry-aware learning for characterizing nonlinear structure in geometric data.

Comments 79 pages, 10 figures, 8 tables

Abstract

A central objective of machine learning is to identify structure and patterns in data. Advances in data acquisition have increasingly produced datasets whose observations possess rich geometric form, giving rise to shape spaces that encode variability in object geometry. Such datasets arise across a wide range of disciplines, including biology, medicine, anthropology, and computer vision, where subtle geometric differences often carry important scientific information. Traditional machine learning methods, however, are frequently ill-equipped to account for the nonlinear geometric structure underlying these data. This survey synthesizes a rapidly growing body of work on shape space analysis, which provides a mathematical and computational framework for the study of geometric data. Drawing on ideas from differential geometry, statistics, and machine learning, we organize the literature around a common analytical pipeline: shape representation and parameterization, the rigorous construction of robust geodesic metrics, statistical analysis on shape spaces, and geometry-aware learning methods. We discuss how these tools enable the characterization of shape variability, the comparison of geometric objects, and the analysis of structural trajectories across populations and time. To illustrate the breadth of the field, we highlight applications spanning multiple scales of biological organization, including studies of subcellular morphology and primate tooth evolution. Across these and many other domains, researchers face common challenges arising from complex, nonlinear, and often unaligned geometric variation. The review concludes by identifying key theoretical and computational challenges, as well as emerging opportunities driven by increasingly large and diverse geometric datasets.

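2606.16715 2026-06-16 cs.IT math.IT math.PR math.ST stat.TH New submission

Testing for a Hidden Geometry in Random Graphs

Amit Silber, Mor Oren-Loberman, Wasim Huleihel

AI summary: Studies the detection of a hidden geometric signal in random graphs, derives information-theoretic lower bounds for when detection is impossible, and reveals an easy-hard-impossible phase transition.

Comments Accepted to COLT 2026; 54 pages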

Abstract

We study the problem of detecting a faint geometric signal hidden in an otherwise random graph. Formally, we consider a hypothesis testing problem in which, under the null, the observed graph is an Erdős-Rényi random graph $\mathcal{G}(n,q)$, while under the alternative a random geometric graph $\mathcal{G}(k,q,d)$ is planted on $k\le n$ vertices. The planted subgraph is generated from independent random points on the unit sphere $\mathbb{S}^{d-1}$, with edges determined by latent geometric proximity and calibrated to have edge density $q$. Our goal is to characterize the statistical and computational limits of detecting this hidden geometry. We derive sharp information-theoretic lower bounds that identify regimes where detection is impossible and provide algorithms that achieve these limits whenever detection is feasible. We further investigate the computational complexity of the problem and determine when efficient polynomial-time tests exist. The model exhibits an easy-hard-impossible phase transition: some regimes allow efficient detection, others permit detection only with computationally intractable procedures, and still others render detection impossible even with unlimited computational power. As evidence for the computational barrier, we prove that all low-degree polynomial algorithms fail throughout the conjecturally hard regime, demonstrating a sharp gap between statistical and computational feasibility.

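2606.16482 2026-06-16 math.AG math.CO math.ST stat.TH New submission

Euler Stratifications of Second Hypersimplices via Delta-matroids

Janike Oldekop

AI summary: Studies the Euler characteristic of scaled tori of the second hypersimplex through a correspondence between Delta-matroids and the nonzero factors of the principal A-determinant, and proves the conjecture of Clarke et al. (2024) on the minimal ML degree.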

Comments 20 pages, 1 figure, 6 tables

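2606.16393 2026-06-16 stat.AP math.ST physics.app-ph physics.data-an stat.TH New submission

Calibrating the Brody exponent as a quantitative measure of short-range exclusion in 2D spatial point processes

Dawid Kucharski

AI summary: Calibrates the Brody distribution as a quantitative measure of short-range exclusion in 2D spatial point processes by recalibrating the complete-spatial-randomness baseline (β=0.96±0.15) and establishing an empirical β-exclusion-radius calibration (Spearman ρ=0.988), with applications to manufactured surfaces, phase-extracted interferometry, and prime-number embeddings.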

Comments 22 pages, 6 figures, 3 tables, 33 references; submitted to a peer-reviewed journal

Abstract

The Brody distribution, originally a phenomenological interpolation between Poisson and Wigner level-spacing statistics in quantum chaos, is calibrated here as a quantitative measure of short-range exclusion in 2D spatial point processes. Two results form the core. First, the 2D complete-spatial-randomness baseline is recalibrated to $β=0.96\pm0.15$, correcting the inappropriate 1D Poisson reference. Second, an empirical $β$--$r_{\text{excl}}$ calibration is validated against the effective hard-core radius with Spearman $ρ=0.988$. The framework is demonstrated on 58 manufactured surfaces (10 materials, 10 processes), phase-extracted interferometric profilometry of a certified roundness standard, and 2D binary embeddings of prime numbers. A sparse-integer control proves the prime $β=2.15$ signal is genuinely arithmetic ($Δβ=+0.68$ over random-integer control), while a Cantor-embedding null result ($β=1.40$, TOST $p<0.01$) demonstrates that 2D exclusion is embedding-created rather than intrinsic. Density-thinning experiments establish that $β$ captures exclusion strength rather than point density, while absolute values are density-dependent. A distinct CSR baseline for binary fields at low fill fraction is identified, with a decision table provided. The $β$--$r_{\text{excl}}$ calibration, the CSR baseline correction, and the control protocols together constitute a calibrated measurement framework for reproducible characterisation of short-range exclusion in 2D spatial point processes.
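
A minimal Python sketch of the Brody density in its standard form for unit-mean spacings, $P(s) = (β+1)\,b\,s^{β}\exp(-b\,s^{β+1})$ with $b = Γ((β+2)/(β+1))^{β+1}$, which reduces to the Poisson spacing law at β=0 and to the Wigner surmise at β=1. The grid-search maximum-likelihood fit, the function names, and the search range are assumptions for illustration; the abstract does not say how the paper extracts spacings or estimates β.

```python
import math


def brody_logpdf(s, beta):
    """Log of the Brody density for a unit-mean spacing s > 0."""
    b = math.gamma((beta + 2.0) / (beta + 1.0)) ** (beta + 1.0)
    return (math.log(beta + 1.0) + math.log(b)
            + beta * math.log(s) - b * s ** (beta + 1.0))


def brody_pdf(s, beta):
    """Brody density for a unit-mean spacing s > 0."""
    return math.exp(brody_logpdf(s, beta))


def fit_beta(spacings, lo=0.0, hi=5.0, steps=501):
    """Grid-search maximum-likelihood estimate of beta (assumed procedure).

    Spacings are rescaled to unit mean; non-positive values are dropped
    because the log-density is undefined there.
    """
    positive = [x for x in spacings if x > 0]
    mean = sum(positive) / len(positive)
    scaled = [x / mean for x in positive]
    best_beta, best_ll = lo, float("-inf")
    for k in range(steps):
        beta = lo + (hi - lo) * k / (steps - 1)
        ll = sum(brody_logpdf(s, beta) for s in scaled)
        if ll > best_ll:
            best_beta, best_ll = beta, ll
    return best_beta
```

Working with the log-density avoids underflow in the exponential term when a large spacing is evaluated at a large trial β.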

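2606.16373 2026-06-16 math.ST math.PR math.SP stat.TH New submission

Higher-order spectral perturbation expansions II: Kernel matrices and manifold learning

Bernhard Stankewitz, Martin Wahl

AI summary: Under weak assumptions, establishes spectral concentration bounds for kernel matrices as approximations of kernel integral operators, handling large multiplicities, large effective dimension, and heavy-tailed distributions, with applications to infinite-dimensional PCA, manifold learning, and Bayesian nonparametric statistics.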

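2606.05672 2026-06-16 math.ST stat.TH Updated version

Trace-Class Results for MCMC Algorithms for Student-t Regression Models

Yasuyuki Hamura

AI summary: Studies MCMC algorithms for Student-t regression models, using the trace-class property to assess Markov chain efficiency: the standard data augmentation algorithm is not trace-class under a noninformative prior whereas the collapsed Gibbs algorithm is; under a normal-inverse-gamma prior the standard algorithm is trace-class.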

Comments 22 pages

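2606.05072 2026-06-16 math.ST stat.TH Updated version

Adaptive Sequential Change Detection using Mixtures of Predictive Distributions

Topi Halme, H. Vincent Poor, Visa Koivunen

AI summary: For sequential change detection in independent observations with an unknown post-change distribution, proposes PM-CuSum, an algorithm based on a sliding-window mixture of predictive distributions that achieves first-order asymptotic optimality with a smaller remainder term in the asymptotic delay.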

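2605.28429 2026-06-16 math.ST stat.TH Updated version

On Extending Type-I Error to Data-Dependent Levels

Nick W. Koning

AI summary: Proves from three axioms that the extension of Type-I error to data-dependent levels is unique, and uses this to support the common definition of the E-value.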

2504.11320 2026-06-16 cs.LG cs.AI cs.DC math.OC stat.ML Updated version

Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints

Ruicheng Ao, Gan Luo, David Simchi-Levi, Xinshang Wang

Affiliations: Institute for Data, Systems, and Society, Massachusetts Institute of Technology; School of Mathematical Sciences, Peking University; Alibaba Group

AI summary: Proposes fluid-guided online scheduling with the WAIT threshold algorithm and Nested WAIT to optimize LLM inference latency and capacity under memory constraints, reducing latency under overload.

Comments 79 pages, 20 figures

Abstract

Large language models now serve millions of users daily, with providers incurring costs exceeding $700,000 per day. Each request requires token-by-token inference, making GPU scheduling central to latency, capacity, and cost. The difficulty is endogenous memory growth: generated tokens expand the Key-Value (KV) cache, and overflow can evict in-progress requests and waste prior computation. We formulate inference as a multi-stage online scheduling problem with endogenous memory growth, linear iteration times, and GPU-resident KV-cache constraints. We introduce a fluid model that characterizes equilibrium batch composition, memory requirement, and stability region. Guided by the fluid model, we design WAIT (Waiting for Accumulated Inference Threshold), a threshold-based admission rule for known output lengths, and Nested WAIT, which extends the rule to unknown output lengths by regulating how requests advance across decode-stage segments. Both algorithms approximate the fluid benchmark asymptotically under the stated memory conditions. Nested WAIT uses an additional safety buffer of moderate scale to hedge against memory-overflow-induced evictions under unknown output lengths. In Vidur simulations configured for Llama-2-7B on an A100 GPU, with supplemental real-GPU validation reported in the appendix, the policies enlarge the empirically observed stable operating range relative to widely used baseline algorithms and reduce latency especially in near-overloaded and overloaded regimes.

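2603.29463 2026-06-16 math.ST stat.TH Updated version

Robustified Gaussian quasi-BIC for volatility

Shoichi Eguchi, Hiroki Masuda

AI summary: For non-ergodic continuous volatility regression models contaminated by finite-activity jumps, proposes two Schwarz-type statistics based on density-power weighting and Hölder-inequality normalization, and proves their model-selection consistency.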

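Stein's method for the matrix normal distribution

Robert E. Gaunt, Frédéric Ouimet, Donald Richards

AI summary: Presents the first systematic development of Stein's method for matrix distributions: establishes a Stein identity via a matrix Ornstein-Uhlenbeck diffusion, gives a semigroup representation and regularity estimates for the solution of the Stein equation, and applies them to the matrix central limit theorem, matrix T distribution approximation, and covariance estimation.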

Comments 25 pages, 0 figures

Abstract

This work presents the first systematic development of Stein's method for matrix distributions. We establish the basic essential ingredients of Stein's method for matrix normal approximation: we derive an extended-generator-based Stein identity from a matrix Ornstein-Uhlenbeck diffusion with two-sided scales, provide an explicit semigroup representation for the solution of the Stein equation, and obtain regularity estimates for the solution. The new methodology is demonstrated in three examples: (i) smooth Wasserstein distance bounds to quantify the matrix central limit theorem (a didactic example), (ii) a Wasserstein distance bound for the matrix normal approximation of the centered matrix $T$ distribution, and (iii) a Stein's method-of-moments approach to estimating the row and column covariance factors of the matrix normal, yielding a flexible class of weighted flip-flop Stein estimators that generalize Dutilleul's classical flip-flop algorithm and naturally accommodate row/column importance weights, systematic missingness, and projection onto structured covariance families. The latter two examples are intrinsically matrix-valued and cannot be treated using naive vectorization.

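2510.02666 2026-06-16 math.ST stat.TH Updated version

Robustified Gaussian quasi-likelihood inference for volatility

Shoichi Eguchi, Hiroki Masuda

AI summary: For high-frequency data contaminated by finite-activity jumps and spike noise, proposes a robustified Gaussian quasi-likelihood estimator based on density-power weighting and Hölder-inequality normalization, proves its asymptotic mixed normality, and shows robustness to contamination in both the covariate and response processes.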

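2410.05517 2026-06-16 math.ST stat.TH Updated version

Functional Extreme-PLS

Stéphane Girard, Cambyse Pakzad

AI summary: Proposes an extreme dimension-reduction method for a discretized functional framework that combines PLS and SIR ideas, captures tail information by maximizing a projected covariance, estimates the index under a nonlinear inverse single-index model, and proves asymptotic consistency.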

Comments 44 pages, 9 figures

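AI abstract (translated)

We propose an extreme dimension-reduction method that extends Extreme-PLS to a discretized functional framework, in which the covariates lie in the infinite-dimensional Hilbert space $L^2([0,1])$ but are only partially observed on a dense time grid. The idea is partly inspired by partial least squares (PLS) and sliced inverse regression (SIR). The method projects the covariates onto a subspace and maximizes the covariance between the projection and the response, conditional on an extreme event that captures tail information. The covariates and the heavy-tailed response are linked through a nonlinear inverse single-index model, and our goal is to infer the index in this regression framework. We propose a new family of estimators and establish their asymptotic consistency and convergence rates under the model. Under mild assumptions on the noise, most assumptions are stated in terms of regular variation, unlike the standard SIR and single-index regression literature. We further extend the theoretical analysis to obtain almost-sure consistency of empirical tail moments in general separable Hilbert spaces, without model assumptions. Finally, finite-sample studies on synthetic functional data and high-frequency financial data illustrate the results and highlight the effectiveness of dimension reduction for capturing tail dependence and for extreme risk management.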

Bayesian optimization by kernel regression and density-based exploration

Tansheng Zhu, Hongyu Zhou, Ke Jin, Xusheng Xu, Qiufan Yuan, Lijie Ji

Affiliations: Zhiyuan College, Shanghai Jiao Tong University, Shanghai 200240, P. R. China; School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai 200240, P. R. China; Shanghai Institute of Aerospace Systems Engineering, Shanghai 201109, P. R. China; Department of Mathematics, Shanghai University, Shanghai 200444, P. R. China; Newtouch Center for Mathematics of Shanghai University, Shanghai University, Shanghai 200444, P. R. China

AI summary: Proposes BOKE, a Bayesian optimization algorithm that combines kernel regression with density-based exploration, reduces computational cost to quadratic complexity, and is shown to converge and perform well both theoretically and experimentally.

Abstract

Bayesian optimization is highly effective for optimizing expensive-to-evaluate black-box functions, but it faces significant computational challenges due to the cubic per-iteration cost of Gaussian processes, which results in a total time complexity that is quartic with respect to the number of iterations. To address this limitation, we propose a novel algorithm, Bayesian optimization by kernel regression and density-based exploration (BOKE). BOKE uses kernel regression for efficient function approximation, kernel density for exploration, and integrates them into the confidence bound criteria to guide the optimization process, thus reducing computational costs to quadratic. Our theoretical analysis rigorously establishes the global convergence of BOKE under noisy evaluations. Through extensive numerical experiments on both synthetic and real-world optimization tasks, we demonstrate that BOKE not only performs competitively compared to Gaussian process-based methods and several other baseline methods but also exhibits superior computational efficiency. These results highlight BOKE's effectiveness in resource-constrained environments, providing a practical approach for optimization problems in engineering applications.
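
A minimal one-dimensional Python sketch of the general idea described in the abstract: a Nadaraya-Watson kernel regression estimate supplies the predicted value, and the summed kernel weights act as an unnormalized density that shrinks an exploration bonus where many points have already been evaluated. The Gaussian kernel, the bonus form `kappa / sqrt(density)`, and all names and defaults are assumptions for illustration and are not taken from the paper, whose exact confidence bound criterion is not given in the abstract.

```python
import math


def gaussian_kernel(x, xi, h):
    """Gaussian kernel weight between query x and observed point xi."""
    return math.exp(-0.5 * ((x - xi) / h) ** 2)


def kernel_stats(x, xs, ys, h):
    """Return (Nadaraya-Watson estimate, summed kernel weight) at x."""
    weights = [gaussian_kernel(x, xi, h) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return 0.0, 0.0  # no nearby data: fall back to a neutral estimate
    mean = sum(w * y for w, y in zip(weights, ys)) / total
    return mean, total


def upper_bound_score(x, xs, ys, h=0.2, kappa=1.0, eps=1e-9):
    """Assumed acquisition score: estimate plus a density-based bonus."""
    mean, density = kernel_stats(x, xs, ys, h)
    return mean + kappa / math.sqrt(density + eps)


def next_query(candidates, xs, ys, **kwargs):
    """Pick the candidate with the highest score (maximization)."""
    return max(candidates, key=lambda x: upper_bound_score(x, xs, ys, **kwargs))
```

Each score evaluation costs time linear in the number of observations, with no matrix factorization, which is consistent with the quadratic total cost the abstract contrasts with Gaussian processes.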

2503.12147 2026-06-16 math.ST stat.TH

Two statistical problems for multivariate mixture distributions

Ricardo Fraiman, Leonardo Moreno, Thomas Ransford

Comments 41 pages, 12 figures

2511.10911 2026-06-16 stat.ME

Improving Variance and Confidence Interval Estimation in Small-Sample Propensity Score Analyses: Bootstrap vs. Asymptotic Methods

Baoshan Zhang, Sean M. O'Brien, Yuan Wu, Laine E. Thomas

DOI: 10.1002/sim.70643
Journal ref: Statistics in Medicine (2026)

Abstract

Propensity score (PS) methods are widely used to estimate treatment effects in non-randomized studies. Variance is typically estimated using sandwich or bootstrap methods, which can either treat the PS as estimated or fixed. The latter is thought to be conservative. The sandwich and bootstrap estimators have been compared in moderate to large samples, with results favoring the bootstrap estimator. With the growing interest in treatments for rare diseases and externally controlled clinical trials, very small sample sizes are not uncommon and the asymptotic properties of sandwich estimators may not hold. Bootstrap methods that allow for PS re-estimation can also generate problems with quasi-separation in small samples. It is unclear whether it is safe to prefer sandwich estimators or to assume that treating the PS as fixed is conservative. We conducted a Monte Carlo simulation to compare the performance of bootstrap versus sandwich variance and CI estimators for average treatment effects estimated with PS methods. We systematically evaluated the impact of treating the PS as fixed versus re-estimating it. These methodological comparisons were performed using Inverse Probability of Treatment Weighting (IPTW) and Augmented Inverse Probability of Treatment Weighting (AIPW) estimators. Simulations assessed performance under various conditions, including small sample sizes and different outcome and treatment prevalences. We illustrate the differences in our motivating example, the LIMIT-JIA trial. We show that the sandwich estimators can perform quite poorly in small samples, and fixed PS methods are not necessarily conservative. A stratified bootstrap avoids quasi-separation and performs well. Differences were large enough to alter statistical conclusions in our motivating example, LIMIT-JIA.
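
A minimal Python sketch of an IPTW estimate of the average treatment effect with a bootstrap standard error that resamples within each treatment arm. Several choices here are assumptions for illustration and are not taken from the paper: the normalized (Hajek) form of the estimator, stratification by treatment arm only, and propensity scores passed in as fixed values. The abstract notes that treating the PS as fixed is not necessarily conservative; re-estimating it would mean refitting the PS model inside the bootstrap loop.

```python
import math
import random


def iptw_ate(y, t, ps):
    """Normalized IPTW estimate of the average treatment effect.

    y: outcomes, t: 0/1 treatment indicators, ps: propensity scores in (0, 1).
    """
    w1 = [ti / pi for ti, pi in zip(t, ps)]
    w0 = [(1 - ti) / (1 - pi) for ti, pi in zip(t, ps)]
    mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0


def stratified_bootstrap_se(y, t, ps, n_boot=500, seed=0):
    """Bootstrap standard error, resampling treated and control separately.

    Resampling within arms keeps both arms non-empty in every replicate.
    """
    rng = random.Random(seed)
    treated = [i for i, ti in enumerate(t) if ti == 1]
    control = [i for i, ti in enumerate(t) if ti == 0]
    estimates = []
    for _ in range(n_boot):
        idx = ([rng.choice(treated) for _ in treated]
               + [rng.choice(control) for _ in control])
        estimates.append(iptw_ate([y[i] for i in idx],
                                  [t[i] for i in idx],
                                  [ps[i] for i in idx]))
    mean = sum(estimates) / n_boot
    return math.sqrt(sum((e - mean) ** 2 for e in estimates) / (n_boot - 1))
```

A percentile or normal-approximation confidence interval can be built from the same replicates; which interval performs best in small samples is one of the questions the paper's simulations address.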


2508.06580 2026-06-16 stat.AP q-bio.PE

Actuarial Analysis of an Infectious Disease Insurance based on an SEIARD Epidemiological Model

Achraf Zinihi, Matthias Ehrhardt, Moulay Rchid Sidi Ammi

1705.08544 2026-06-16 stat.OT

Data Visualization on Day One: Bringing Big Ideas into Intro Stats Early and Often

Xiaofei Wang, Cynthia Rush, Nicholas Jon Horton

Comments Accepted in Technology Innovations in Statistics Education

1. Statistical Theory and Methods (28 papers)

A nonparametric two-sample test using a parametric integral probability metric

Optimal Multiscale Learning of Linear Operators

Statistical methods for assessing non-replicable, outlying, and influential studies

Moment-Free Kunchenko Stochastic Polynomials via Empirical Characteristic Function

On the Geometry of Separation in Finite Gaussian Mixtures

Wild bootstrap for mean response inference in functional linear regression models

Jeffreys-Type Penalized GEE for Correlated Binary Data with an Odds-Ratio Parameterization

Bias-Reduced GEE via Adjusted Estimating Equations, with Odds-Ratio Extensions

The limits of interpretability in multiple linear regression

Minimax Synthesis of Network Mechanisms

Latent Variable Models for Distributional Features

Kernel Density Estimation by Spectral Decomposition: Data-Driven Tapering and Superposition

Limit theorems of Azadkia-Chatterjee's conditional graph correlation

Finite Resources False Discovery Rate Control in Structured Hypothesis Spaces

Generalized likelihood ratio test for magnetic anomaly detection: a geometrical approach

Optimized Sequential Testing for Binary Ensemble Classifiers

Separate versus pooled winsorization for group mean contrasts: a finite-sample theory

Flexible Method Comparison with the Probability of Agreement

Bartlett adjustment for Gaussian random effects meta-analysis

Adaptive Kernel Density Estimation with Pre-training

Sharp One-Dimensional Sub-Gaussian Comparison in Convex Order

A Necessary and Sufficient Condition for Size Controllability of Heteroskedasticity Robust Test Statistics

Asymptotically Optimal Sequential Testing with Markovian Data

SpARCD: A Spectral Graph Framework for Revealing Differential Functional Connectivity in fMRI Data

Optimal structure learning and conditional independence testing

Kernel Two-Sample Testing via Directional Components Analysis

The optimal sub-Gaussian normalisation for randomised monotone functions

Estimation of High-Dimensional Normal Means through Inferential Models

2. Bayesian Statistics and Probabilistic Modeling (12 papers)

Bayesian Inference and Decision Audits for Public Archives of Frontier AI Evaluations

MA-SBI: Misspecification-Aware Simulation-Based Inference via Side-Channel Guidance

Two fully specified Bayes factors for hypothesis testing and sensitivity analysis in process tracing

Neural Bayesian Anomaly Mitigation: A Robust Loss that Doubles as an Unsupervised Contamination Classifier

A Bayesian hierarchical model for meta-analysis

Bayesian joint modelling using semiparametric accelerated failure time approaches

Learning a Sampling-Free Variational DNN Plugin from Tiny Training Sets to Refine OOD Segmentation With Uncertainty Estimation

Modeling Nonlinear Ability Trajectories and Learner Heterogeneity in Online Learning: A Bayesian Nonparametric Dynamic IRT Framework

Bridging data-driven priors via the score function for posterior sampling -- Comparative review and experimental study

Optimal Stopping for Sequential Bayesian Experimental Design

Nonparametric Modeling of Continuous-Time Markov Chains

Separate Exchangeability as Modeling Principle in Bayesian Nonparametrics

3. Causal Inference and Experimental Design (5 papers)

Bounding Causal Effects for Ordinal Outcomes Under Positive Dependence

Relational Structural Causal Models

Causal Sufficient Dimension Reduction for Multiple Continuous Exposures with an Application to Environmental Mixtures

Experimentation for Different Scheduling Policies on Queues: Mixed Differences-in-Q Estimators Based on Little's Law

Detecting Where Effects Occur by Testing Hypotheses in Order

4. High-Dimensional Statistics and Regularization (10 papers)

Spectral Sparsification of Laplacian-Constrained Gaussian and Hüsler-Reiss Graphical Models

Paired Sample Tests for High-dimensional Uncorrelatedness via Random Integration

Bias-Aware External-Model-Assisted Inference in High-Dimensional Regression

Phase Transition in Convex Relaxations for Graph Alignment

High-Dimensional Robust Change-Point Detection via Angular Kernel Statistics

Debiased Inference for High-Dimensional Regression Models Based on Profile M-Estimation

Exact Coordinate Descent for High-Dimensional Regularized Huber Regression

KL-BSS: Rethinking optimality for neighbourhood selection in structural equation models

Interpretable Scalar-on-Image Linear Regression Models via the Generalized Dantzig Selector

TrIM: Transformed Iterative Mondrian Forests for Gradient-based Dimension Reduction and High-Dimensional Regression

5. Time Series and Spatial Statistics (11 papers)

Filtered Conformal Ellipsoids for Graph-Native Time Series

Generative Predictive Distributions for Time Series

Distributional Forecasting of EU Asylum Applications with Dynamic Multivariate Count Models

Drift-Aware Spectral Conformal Prediction for Non-Exchangeable Streaming Data

Spectral Adaptive Conformal Prediction for Structured Non-Exchangeable Data

PHINN: Persistent Homology Inspired Neural Network for Rare-Event Time Series Generation

A Kuramoto-von Mises Time Series Model for Probabilistic Modeling of Coupled Oscillators

Detecting Distributional Differences in Spatially Correlated Multivariate Data via Kernel-Smoothed Rank-Based Empirical Copula Tests

A Rank-Based Test for Comparing Multiple Fields' Yield Quality Distributions Under Spatial Dependence

Matching correlated VAR time series

Nonlinear regression models to forecast PM$_{2.5}$ concentration

6. Computational Statistics and MCMC (23 papers)

Dynestyx: A Probabilistic Programming Library for Dynamical Systems

Closing the Approximation Gap in Simulation-free Latent SDEs

Stop the Sampler! Classifier-Based Adaptive Stopping for Sampling Kernels

p-PSO: A Penalized Particle Swarm Optimization Technique for Finding D-Optimal Designs with Mixed Factors in Generalized Linear Models

Amortized mean-shift interacting particles

Proximal Policy Optimization for Amortized Discrete Sampling

Score-Based Martingale Posteriors for Deep Neural Networks

Stochastic trace estimation with tensor train random vectors