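arXivDaily: a daily academic digest synchronized with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.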

2604.09513 2026-04-13 math.ST stat.ME stat.TH

Harmonic Map Regression: Rate-Optimal Nonparametric Estimation on Manifolds with Topological Recovery

Xiaoyu Chen

Abstract

We study harmonic map regression, a nonparametric estimator for manifold-valued responses that penalizes the empirical Fréchet risk by the Dirichlet energy. By connecting penalized regression to the theory of harmonic maps, the estimator acquires a structural theory that parallels the classical Euclidean smoothing spline. The Euler-Lagrange equation characterizes the solution as a piecewise-geodesic spline, an equivalent kernel controls pointwise risk at the rate $n^{-2/3}$, and the infinite-dimensional variational problem reduces exactly to a finite-dimensional optimization. This newly established connection reveals a topological phenomenon that has no analogue in Euclidean nonparametric regression and, to our knowledge, has not been studied in the manifold regression literature. On manifolds whose regression curves can wrap around in topologically distinct ways, maps in distinct homotopy classes are separated by energy barriers intrinsic to the geometry of the target, and the Dirichlet penalty makes the estimator sensitive to this structure, recovering the correct topological class with probability tending to one, a phase transition we call topological recovery. A curvature-dependent oracle inequality yields the minimax rate $n^{-2s/(2s+1)}$ for Sobolev order $s$, matching the Euclidean constant on non-positively curved targets, while five geometric obstructions show that the full structural theory is unique to the Dirichlet energy ($s=1$). Simulations on $S^2$, $\mathbb{H}^2$, $SO(3)$, $\mathrm{Sym}^+(2)$, and $T^2$ corroborate the theory, and an application to wind-direction data on $S^1$ illustrates practical advantages.
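
As a quick consistency check on the two rates quoted in this abstract (a reader's note, not part of the paper), substituting the Dirichlet-energy case $s=1$ into the general minimax rate reproduces the pointwise rate stated earlier:

$$
n^{-2s/(2s+1)}\Big|_{s=1} = n^{-2/(2+1)} = n^{-2/3}.
$$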

2604.09467 2026-04-13 stat.ME stat.AP

A Multi-Stage Drop-the-Loser Design with Superiority Boundaries

Peter Greenstreet, Manel Khan, Salmaan Kanji, Pouya Motazedian, Andrew Seely, Stephanie Sibley, Tim Ramsay

Comments 27 pages, 1 figure

2604.09376 2026-04-13 stat.ME

Maximum-of-Differences Test for Comparing Multivariate K-Sample Distributions

Wei Lan, Long Feng, Runze Li, Chih-Ling Tsai

2604.09375 2026-04-13 math.OC math.ST stat.ML stat.TH

Data-Efficient Non-Gaussian Semi-Nonparametric Density Estimation for Nonlinear Dynamical Systems

Aaron R. Liao, Kenshiro Oguri, Michele D. Carpenter

2604.09319 2026-04-13 stat.AP

ZINBGT: Exploratory Data Analysis of Single-Cell Transcriptomic Expression Using Mixture Models

Toby Kettlewell, Yiyi Cheng, Thomas D. Otto, Vincent Macaulay, Mayetri Gupta

Comments 11 pages, 28 pages with appendix, 6 figures, 14 figures with appendix

2604.09309 2026-04-13 stat.ML cs.LG stat.CO

Iterative Identification Closure: Amplifying Causal Identifiability in Linear SEMs

Ziyi Ding, Xiao-Ping Zhang

2604.09286 2026-04-13 stat.CO stat.ML

High-dimensional Adaptive MCMC with Reduced Computational Complexity

Max Hird, Samuel Livingstone

2603.08939 2026-04-13 math.ST stat.TH

Shape-constrained density estimation with Wasserstein projection

Takeru Matsuda, Ting-Kam Leonard Wong

Comments 31 pages, 4 figures. Revised

2602.14286 2026-04-13 stat.ME stat.ML

Online LLM watermark detection via e-processes

Weijie Su, Ruodu Wang, Zinan Zhao

2602.05862 2026-04-13 stat.ML cs.LG math.ST stat.TH

Distribution-free two-sample testing with blurred total variation distance

Rohan Hore, Rina Foygel Barber

Comments 47 pages, 4 figures

2601.03105 2026-04-13 stat.AP cs.MA cs.SI physics.soc-ph

Computationally Efficient Estimation of Localized Treatment Effects for Multi-Level, Multi-Component Interventions to Address the Opioid Crisis

Abdulrahman A. Ahmed, M. Amin Rahimian, Qiushi Chen, Praveen Kumar

Comments repository link: https://github.com/abdulrahmanfci/gpr-metamodel/

Abstract

The opioid epidemic remains a major public health challenge in the United States, requiring a multi-pronged intervention approach to mitigate harms to communities. Given the heterogeneity of the epidemic across the country, it is crucial for policymakers to understand localized treatment effects of different intervention components and utilize limited resources efficiently. While locally calibrated simulation models offer a useful computational tool to project the epidemic outcomes for any given intervention policy, collecting simulation results for all intervention combinations to estimate localized treatment effects for each community is impractical because the number of possible intervention combinations grows exponentially with the number of interventions and levels at which they are applied. To tackle this, we develop a bi-level metamodel framework with a two-stage sequential design for efficient sampling. The metamodel consists of a response function linking health outcomes to each intervention component's treatment effect, and a Gaussian process regression to learn spatial and socio-economic structures of the treatment effects based on locally-contextualized covariates. With two-stage sequential sampling, we leverage spatial correlations and posterior uncertainty to sequentially sample the most informative counties and treatment conditions. We apply this framework to estimate treatment effects of buprenorphine dispensing and naloxone distribution on overdose mortality rates using a calibrated agent-based opioid epidemic model in PA counties. Our approach achieves approximately 5% average relative error using one-tenth the number of runs required for an exhaustive simulation. Our bi-level framework provides a computationally efficient approach to support policymakers in evaluating resource-allocation strategies to mitigate the opioid epidemic in local communities.

2512.17038 2026-04-13 stat.AP

Do Generalized-Gamma Scale Mixtures of Normals Fit Large Image Datasets?

Brandon Marks, Yash Dave, Zixun Wang, Hannah Chung, Riya Patwa, Simon Cha, Michael Murphy, Alexander Strang

Comments 22 pages main text, 21 figures, 7 tables, 10 pages appendix

2509.26258 2026-04-13 physics.ao-ph physics.data-an stat.AP stat.ML

EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules

Maybritt Schillinger, Maxim Samarin, Xinwei Shen, Reto Knutti, Nicolai Meinshausen

Comments Updates according to suggestions by anonymous reviewers: improved methodology for temporal consistency; add preliminary results for extrapolation to unseen GCMs; add further evaluation via histograms, ACFs and for climate change signal; improved explanations and wordings in several places

Abstract

The practical use of future climate projections from global circulation models (GCMs) is often limited by their coarse spatial resolution, requiring downscaling to generate high-resolution data. Regional climate models (RCMs) provide this refinement, but are computationally expensive. To address this issue, machine learning (ML) models can learn the downscaling function, mapping coarse GCM outputs to high-resolution fields. Among these, generative approaches aim to capture the full conditional distribution of RCM data given coarse-scale GCM data, which is characterized by large variability and thus challenging to model accurately. We introduce EnScale, a generative ML framework emulating the full GCM-to-RCM map by training on multiple pairs of GCM and corresponding RCM data. It first adjusts large-scale mismatches between GCM and coarsened RCM data, followed by a super-resolution step to generate high-resolution fields. To efficiently model the high-dimensional output, the super-resolution step employs a novel class of sparse local stochastic layers. Both steps employ generative models optimized with the energy score, a proper scoring rule. Compared to state-of-the-art ML downscaling approaches, our setup reduces computational cost by about one order of magnitude. EnScale jointly emulates multiple variables -- temperature, precipitation, solar radiation, and wind -- spatially consistent over Central Europe. In addition, we propose a variant EnScale-t that enables temporally consistent downscaling. We establish a comprehensive evaluation framework across various categories including calibration, spatial and temporal structure, extremes, and multivariate dependencies. Comparison with diverse benchmarks demonstrates EnScale(-t)'s competitive performance and computational efficiency, offering a promising approach for accurate and temporally consistent RCM emulation.
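
The energy score named in this abstract is a proper scoring rule for multivariate forecasts. The following is a minimal sketch of its standard sample-based estimator, assuming NumPy; the function name `energy_score` and the toy data are illustrative and are not taken from EnScale.

```python
import numpy as np


def energy_score(samples: np.ndarray, obs: np.ndarray) -> float:
    """Sample estimate of ES(P, y) = E||X - y|| - 0.5 * E||X - X'||.

    samples: (m, d) array of draws from the forecast distribution P.
    obs:     (d,) observed vector y.
    Lower values indicate a better forecast.
    """
    samples = np.asarray(samples, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Mean distance between each forecast draw and the observation.
    term_obs = np.linalg.norm(samples - obs, axis=1).mean()
    # Mean pairwise distance between draws (the i == j zero terms are included).
    diffs = samples[:, None, :] - samples[None, :, :]
    term_pair = np.linalg.norm(diffs, axis=2).mean()
    return float(term_obs - 0.5 * term_pair)


# Toy comparison: a forecast centred on the observation versus a shifted one.
rng = np.random.default_rng(0)
y = np.zeros(3)
good = rng.normal(0.0, 1.0, size=(500, 3))
bad = rng.normal(3.0, 1.0, size=(500, 3))
print(energy_score(good, y) < energy_score(bad, y))
```

The pairwise term costs O(m^2 d) memory and time, which is acceptable for a few hundred draws; larger ensembles would need a subsampled or chunked variant.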

2501.19038 2026-04-13 stat.ML cs.LG

Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity

Thomas Mortier, Alireza Javanmardi, Yusuf Sale, Eyke Hüllermeier, Willem Waegeman

2411.15627 2026-04-13 math.ST stat.TH

Community detection for binary graphical models in high dimension

Julien Chevallier, Guilherme Ost

2407.03619 2026-04-13 stat.ME

Multivariate Representations of Univariate Marked Hawkes Processes

Louis Davis, Conor Kresin, Boris Baeumer, Ting Wang

Comments 26 pages, 3 figures, submitted to the Annals of Statistics

2404.01478 2026-04-13 stat.AP

A Multidimensional Fractional Hawkes Process for Multiple Earthquake Mainshock Aftershock Sequences

Louis Davis, Boris Baeumer, Ting Wang

Comments 37 pages, 10 tables, 3 figures

2403.00916 2026-04-13 gr-qc math-ph math.MP math.ST quant-ph stat.TH

Characterizing Signalling: Connections between Causal Inference and Space-time Geometry

Maarten Grothus, V. Vilasini

Comments 31 + 25 pages, 12 figures. This work includes significantly improved versions of initial results presented in MG's master's thesis arXiv:2211.03593. v3 is close to the version accepted for publication at Classical and Quantum Gravity, and contains numerous clarifications and some minor corrections

Abstract

Causality is pivotal to our understanding of the world, presenting itself in different forms: information-theoretic and relativistic, the former linked to the flow of information, the latter to the structure of space-time. Leveraging a framework introduced in PRA, 106, 032204 (2022), which formally connects these two notions in general physical theories, we study their interplay. Here, information-theoretic causality is defined through a causal modelling approach. First, we improve the characterization of information-theoretic signalling as defined through so-called affects relations. Specifically, we provide conditions for identifying redundancies in different parts of such a relation, introducing techniques for causal inference in unfaithful causal models (where the observable data does not "faithfully" reflect the causal dependences). In particular, this demonstrates the possibility of causal inference using the absence of signalling between certain nodes. Second, we define an order-theoretic property called conicality, showing that it is satisfied for light cones in Minkowski space-times with $d>1$ spatial dimensions but violated for $d=1$. Finally, we study the embedding of information-theoretic causal models in space-time without violating relativistic principles such as no superluminal signalling (NSS). In general, we observe that constraints imposed by NSS in a space-time and those imposed by purely information-theoretic causal inference behave differently. We then prove a correspondence between conical space-times and faithful causal models: in both cases, there emerges a parallel between these two types of constraints. This indicates a connection between informational and geometric notions of causality, and offers new insights for studying the relations between the principles of NSS and no causal loops in different space-time geometries and theories of information processing.

2403.00142 2026-04-13 stat.AP

A Fractional Model for Earthquakes

Louis Davis, Boris Baeumer, Ting Wang

Comments 16 pages, 7 figures, submitted to the Journal of the Royal Statistical Society Series C

2310.16260 2026-04-13 stat.ME

Differentially Private Estimation and Inference in High-Dimensional Regression with FDR Control

Zhanrui Cai, Sai Li, Xintao Xia, Linjun Zhang

2604.09259 2026-04-13 stat.ME stat.AP

Exact Bayesian Planning for Simple Step-Stress Accelerated Life Testing with Competing Risks

Kiran Prajapat

Comments 37 pages, 8 figures

2604.09256 2026-04-13 stat.ME

Nobody Puts Bonferroni in a Corner

Mårten Schultzberg

2604.09208 2026-04-13 stat.ML cs.LG

A Predictive View on Streaming Hidden Markov Models

Gerardo Duran-Martin

2604.09175 2026-04-13 cs.LG cs.AI math.ST stat.ML stat.TH

Generalization and Scaling Laws for Mixture-of-Experts Transformers

Mansour Zoubeirou a Mayaki

2604.09143 2026-04-13 cs.LG stat.ME

Score-Driven Rating System for Sports

Vladimír Holý, Michal Černý

2604.09135 2026-04-13 stat.ML cs.LG math.ST stat.ME stat.TH

Identifying Causal Effects Using a Single Proxy Variable

Silvan Vollmer, Niklas Pfister, Sebastian Weichwald

Comments Equal contribution between Pfister and Weichwald

2604.09108 2026-04-13 stat.ME

A Practical Guide to Interpret a Randomized Controlled Trial

Ibrahim Halil Tanboga

2604.09078 2026-04-13 math.ST stat.TH

Node-Private Community Detection in Stochastic Block Models

Olga Klopp, Ilias Zadik

2604.09055 2026-04-13 stat.ME

Constructing confidence intervals for constrained parameters via valid prior-free inferential models

Hezhi Lu, Qijun Wu

2604.09012 2026-04-13 stat.ME

Spatially varying distributed lag non-linear models using Laplacian P-splines

Sara Rutten, Thomas Neyens, Elisa Duarte, Antonio Gasparrini, Christel Faes

2604.08978 2026-04-13 stat.ME

Model-Robust Direct Effect Under Confounder-Mediator Ambiguity

AmirEmad Ghassami

2604.08969 2026-04-13 stat.ML cs.LG math.ST stat.TH

Online Quantile Regression for Nonparametric Additive Models

Haoran Zhan

2604.08935 2026-04-13 stat.ML cs.LG

A novel hybrid approach for positive-valued DAG learning

Yao Zhao

Comments 13 pages, 2 tables. Accepted at CLeaR 2026

2604.08853 2026-04-13 stat.ME

The Illusion of Learning from Observational Data: An Empirical Bayes Perspective

Bohan Wu, Sebastian Salazar, Donald P. Green, David M. Blei

2604.08829 2026-04-13 cs.LG cs.NE stat.ML

Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis

Giansalvo Cirrincione

Comments 20 pages, 3 figures, 8 tables, submitted to Neurocomputing

2604.08821 2026-04-13 cs.GT econ.TH stat.ME

Buying Data of Unknown Quality: Fisher Information Procurement Auctions

Yuchen Hu, Martin J. Wainwright, Stephen Bates

2604.08804 2026-04-13 stat.ML cs.LG stat.ME

Policy-Aware Design of Large-Scale Factorial Experiments

Xin Wen, Xi Chen, Will Wei Sun, Yichen Zhang

2604.08798 2026-04-13 stat.ME econ.EM stat.CO

Identification of Latent Group Effects under Conditional Calibration

Marcell T. Kurbucz

Comments 31 pages, 5 figures, 5 tables

2604.08755 2026-04-13 cs.CE cs.LG stat.ML

Accurate and Reliable Uncertainty Estimates for Deterministic Predictions: Extensions to Under and Overpredictions

Rileigh Bandy, Enrico Camporeale, Andong Hu, Thomas Berger, Rebecca Morrison

2604.08676 2026-04-13 stat.ME

StationarityToolkit: Comprehensive Time Series Stationarity Analysis in Python

Bhanu Suraj Malla, Yuqing Hu

Comments Submitted to Journal of Open Source Software

2604.08625 2026-04-13 stat.ML cs.LG math.ST stat.TH

Spectral-Transport Stability and Benign Overfitting in Interpolating Learning

Gustav Olaf Yunus Laitinen-Lundström Fredriksson-Imanov

Comments 50 pages, 7 figures, 4 tables. Research article. Includes full proofs, model-specific corollaries, and synthetic supporting experiments. Submitted to Machine Learning

2604.06468 2026-04-13 cs.LG stat.ML

Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise

Yuanjie Shi, Peihong Li, Zijian Zhang, Janardhan Rao Doppa, Yan Yan

Comments Accepted for Publication at the 29th International Conference on Artificial Intelligence and Statistics (AISTATS), 2026

2604.03936 2026-04-13 stat.ML cs.LG stat.ME

Biconvex Biclustering

Sam Rosen, Eric C. Chi, Jason Xu

Comments 34 pages, 5 figures

2604.03488 2026-04-13 stat.ME

Inference for Clustering: Conformal Sets for Cluster Labels

YoonHaeng Hur, Anirban Nath, Genevera Allen

2604.00504 2026-04-13 stat.ME econ.EM

Conformal Inference for Experimental Attrition in Social Science Research

Xiangyu Song

2603.25348 2026-04-13 math.ST stat.TH

Quantitative analysis of non-exchangeability in bivariate copulas: Sharp bounds, statistical tests and mixing constructions

Manuel Úbeda-Flores

2603.21917 2026-04-13 stat.ME econ.EM

The Cascade Identity: 2SLS as a Policy Parameter in Capacity-Constrained Settings

Niklas Bengtsson, Per Engström

Comments 67 pages, 3 figures, 10 tables

2603.02622 2026-04-13 cs.LG stat.ML

Implicit Bias in Deep Linear Discriminant Analysis

Jiawen Li

2602.18358 2026-04-13 stat.AP q-fin.ST

Forecasting the Evolving Composition of Inbound Tourism Demand: A Bayesian Compositional Time Series Approach Using Platform Booking Data

Harrison Katz

2602.11129 2026-04-13 math.PR cs.IT math.IT math.ST stat.TH

Information-Theoretic Thresholds for Bipartite Latent-Space Graphs under Noisy Observations

Andreas Göbel, Marcus Pappik, Leon Schiller

Comments Corrected the steps leading to equation (5.1) and the proof of lemma 6.2

2601.21763 2026-04-13 math.PR stat.CO

Spectral Gap of Metropolis Algorithms for Non-smooth Distributions under Isoperimetry

Shuigen Liu, Xin T. Tong

2601.13930 2026-04-13 math.ST stat.TH

On spectral clustering under non-isotropic Gaussian mixture models

Kohei Kawamoto, Yuichi Goto, Koji Tsukuda

Comments 8 pages

2601.08588 2026-04-13 quant-ph cs.IT cs.LG math.IT math.ST stat.TH

Sample Complexity of Composite Quantum Hypothesis Testing

Jacob Paul Simpson, Efstratios Palias, Sharu Theresa Jose

Comments Accepted to ISIT 2026

2512.03760 2026-04-13 stat.AP stat.ME

A decay-adjusted spatio-temporal model to account for the impact of mass drug administration on neglected tropical disease prevalence

Emanuele Giorgi, Claudio Fronterre, Peter J. Diggle

Comments Under review

2512.01708 2026-04-13 stat.ML cs.LG

Differentially Private and Federated Structure Learning in Bayesian Networks

Ghita Fassy El Fehri, Aurélien Bellet, Philippe Bastien

2512.00405 2026-04-13 stat.ME

Evaluating Surrogates in Individualized Treatment Rules

Zeyu Xu, Xiaojie Mao, Hao Mei, Yue Liu

Comments 42 pages, no figures

2511.04903 2026-04-13 stat.OT

Efficacy Analysis in Clinical Trials: A Comprehensive Review of Statistical and Machine Learning Approaches

Dhrubajyoti Ghosh, Samhita Pal

2509.26451 2026-04-13 stat.ME

Non-Parametric Simulation of Multivariate Extreme Events via Spectral Bootstrap

Nisrine Madhar, Juliette Legrand, Maud Thomas

Comments arXiv admin note: text overlap with arXiv:2406.08019

2507.21807 2026-04-13 stat.ML cs.LG

MIBoost: A gradient boosting algorithm for variable selection after multiple imputation

Robert Kuchen

Comments 18 pages, 2 algorithms, includes a simulation study

2507.16376 2026-04-13 stat.ME

A Bayesian Geoadditive Model for Spatial Disaggregation

Sara Rutten, Thomas Neyens, Elisa Duarte, Christel Faes

2506.03074 2026-04-13 stat.ML cs.LG

GL-LowPopArt: A Nearly Instance-Wise Minimax-Optimal Estimator for Generalized Low-Rank Trace Regression

Junghyun Lee, Kyoungseok Jang, Kwang-Sung Jun, Milan Vojnović, Se-Young Yun

Comments AISTATS 2026 (58 pages, 2 tables, 1 figure) (ver5: fixed some stuff from camera-ready version, significant revisions)

2505.24078 2026-04-13 stat.AP econ.GN q-fin.EC

Evaluating Gender Wage Inequality in Academia using Causal Inference Methods for Observational Data

Zihan Zhang, Jan Hannig

2505.23542 2026-04-13 econ.EM stat.ML

Large SVARs

Jonas E. Arias, Juan F. Rubio-Ramírez, Daniel Rudolf, Minchul Shin

Comments 58 pages, 14 figures

2504.04143 2026-04-13 stat.AP q-bio.PE

The Rhythm of Aging: Stability and Drift in the Individual Rate of Senescence

Silvio Cabral Patricio

2501.04581 2026-04-13 stat.AP

Mediation analysis in longitudinal intervention studies with an ordinal treatment-dependent confounder

Mikko Valtanen, Tommi Härkänen, Matti Uusitupa, Jaakko Tuomilehto, Jaana Lindström, Kari Auranen

2410.15001 2026-04-13 cs.LG stat.ML

FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening

Shubhajit Roy, Hrriday Ruparel, Kishan Ved, Anirban Dasgupta

Comments Published in Transactions on Machine Learning Research (TMLR), 2026. Available at https://openreview.net/forum?id=g7r7y2I7Sz

2410.09355 2026-04-13 cs.LG cs.AI stat.ML

On Divergence Measures for Training GFlowNets

Tiago da Silva, Eliezer de Souza da Silva, Diego Mesquita

Comments Accepted at NeurIPS 2024, https://openreview.net/forum?id=N5H4z0Pzvn

2305.10524 2026-04-13 stat.ME

Dynamic Matrix Recovery

Ziyuan Chen, Ying Yang, Fang Yao

Comments Journal of the American Statistical Association (2023)