arXivDaily: arXiv daily academic digest, updated Monday through Friday
2511.17537 2026-04-15 cs.NI cs.AI

HiFiNet: Hierarchical Fault Identification in Wireless Sensor Networks via Edge-Based Classification and Graph Aggregation

Nguyen Tri Nghia, Nguyen Van Son, Nguyen Thi Hanh

Comments Accepted to CITA 2026

English abstract

Wireless Sensor Networks (WSNs) are the backbone of essential monitoring applications, but their deployment in unfavourable conditions increases the risk to data integrity and system reliability. Traditional fault detection methods often struggle to balance accuracy and energy consumption, and they may not fully leverage the complex spatio-temporal correlations inherent in WSN data. In this paper, we introduce HiFiNet, a novel hierarchical fault identification framework that addresses these challenges through a two-stage process. First, edge classifiers with a Long Short-Term Memory (LSTM) stacked autoencoder perform temporal feature extraction and output an initial fault-class prediction for each sensor node. Using these results, a Graph Attention Network (GAT) then aggregates information from neighboring nodes to refine the classification by integrating topological context. Our method produces more accurate predictions by capturing both local temporal patterns and network-wide spatial dependencies. To validate this approach, we constructed synthetic WSN datasets by introducing specific, predefined faults into the Intel Lab Dataset and NASA's MERRA-2 reanalysis data. Experimental results demonstrate that HiFiNet significantly outperforms existing methods in accuracy, F1-score, and precision, showcasing its robustness and effectiveness in identifying diverse fault types. Furthermore, the framework's design allows for a tunable trade-off between diagnostic performance and energy efficiency, making it adaptable to different operational requirements.

2511.00179 2026-04-15 physics.chem-ph cs.AI cs.LG

Generative Modeling Enables Molecular Structure Retrieval from Coulomb Explosion Imaging

Xiang Li, Till Jahnke, Rebecca Boll, Jiaqi Han, Minkai Xu, Michael Meyer, Maria Novella Piancastelli, Daniel Rolles, Artem Rudenko, Florian Trinter, Thomas J. A. Wolf, Jana B. Thayer, James P. Cryan, Stefano Ermon, Phay J. Ho

Journal ref Nat Commun 17, 3430 (2026)

English abstract

Capturing the structural changes that molecules undergo during chemical reactions in real space and time is a long-standing dream and an essential prerequisite for understanding and ultimately controlling femtochemistry. A key approach to tackle this challenging task is Coulomb explosion imaging, which benefited decisively from recently emerging high-repetition-rate X-ray free-electron laser sources. With this technique, information on the molecular structure is inferred from the momentum distributions of the ions produced by the rapid Coulomb explosion of molecules. Retrieving molecular structures from these distributions poses a highly non-linear inverse problem that remains unsolved for molecules consisting of more than a few atoms. Here, we address this challenge using a diffusion-based Transformer neural network. We show that the network reconstructs unknown molecular geometries from ion-momentum distributions with a mean absolute error below one Bohr radius, which is half the length of a typical chemical bond.

2510.10073 2026-04-15 cs.CR cs.CV

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying, Yangguang Shao, Jianle Gan, Gan Xu, Wenxin Zhang, Quanchen Zou, Junzheng Shi, Zhenfei Yin, Mingchuan Zhang, Aishan Liu, Xianglong Liu

Comments ACL

English abstract

Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in real-world environments, they face serious security risks, motivating the design of security evaluation benchmarks. Existing benchmarks provide only partial coverage, typically restricted to narrow scenarios such as user-level prompt manipulation, and thus fail to capture the broad range of agent vulnerabilities. To address this gap, we present SecureWebArena, the first holistic benchmark for evaluating the security of LVLM-based web agents. SecureWebArena first introduces a unified evaluation suite comprising six simulated but realistic web environments (e.g., e-commerce platforms, community forums) and includes 2,970 high-quality trajectories spanning diverse tasks and attack settings. The suite defines a structured taxonomy of six attack vectors spanning both user-level and environment-level manipulations. In addition, we introduce a multi-layered evaluation protocol that analyzes agent failures across three critical dimensions: internal reasoning, behavioral trajectory, and task outcome, facilitating a fine-grained risk analysis that goes far beyond simple success metrics. Using this benchmark, we conduct large-scale experiments on 9 representative LVLMs, which fall into three categories: general-purpose, agent-specialized, and GUI-grounded. Our results show that all tested agents are consistently vulnerable to subtle adversarial manipulations and reveal critical trade-offs between model specialization and security. By providing (1) a comprehensive benchmark suite with diverse environments and a multi-layered evaluation pipeline, and (2) empirical insights into the security challenges of modern LVLM-based web agents, SecureWebArena establishes a foundation for advancing trustworthy web agent deployment.

2509.10547 2026-04-15 q-bio.NC cs.AI cs.LG

Pursuit of biomarkers of brain diseases: Beyond cohort comparisons

Pascal Helson, Arvind Kumar

English abstract

Despite the diversity and volume of brain data acquired and the advanced AI-based algorithms available to analyze them, brain features are rarely used in clinics for diagnosis and prognosis. Here we argue that the field continues to rely on cohort comparisons to seek biomarkers, despite the well-established degeneracy of brain features. Using a thought experiment (Brain Swap), we show that more data and more powerful algorithms will not be sufficient to identify biomarkers of brain diseases. We argue that instead of comparing patients versus healthy controls using a single data type, we should use multimodal (e.g. brain activity, neurotransmitters, neuromodulators, brain imaging) and longitudinal brain data to guide the grouping before defining multidimensional biomarkers for brain diseases.

2509.02648 2026-04-15 q-bio.GN cs.LG q-bio.QM stat.AP

Optimizing Prognostic Biomarker Discovery in Pancreatic Cancer Through Hybrid Ensemble Feature Selection and Multi-Omics Data

John Zobolas, Anne-Marie George, Alberto López, Sebastian Fischer, Marc Becker, Tero Aittokallio

Comments 52 pages, 5 figures, 9 Supplementary Figures, 1 Supplementary Table

Journal ref BioData Mining (2026)

English abstract

Prediction of patient survival using high-dimensional multi-omics data requires systematic feature selection methods that ensure predictive performance, sparsity, and reliability for prognostic biomarker discovery. We developed a hybrid ensemble feature selection (hEFS) approach that combines data subsampling with multiple prognostic models, integrating both embedded and wrapper-based strategies for survival prediction. Omics features are ranked using a voting-theory-inspired aggregation mechanism across models and subsamples, while the optimal number of features is selected via a Pareto front, balancing predictive accuracy and model sparsity without any user-defined thresholds. When applied to multi-omics datasets from three pancreatic cancer cohorts, hEFS identifies significantly fewer and more stable biomarkers compared to the conventional, late-fusion CoxLasso models, while maintaining comparable discrimination performance. Implemented within the open-source mlr3fselect R package, hEFS offers a robust, interpretable, and clinically valuable tool for prognostic modelling and biomarker discovery in high-dimensional survival settings.
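
The Pareto-front step described above can be illustrated with a minimal sketch. This is not the mlr3fselect implementation (which is an R package); the function name `pareto_front` and the candidate values are hypothetical, and the rule for picking a final point on the front is deliberately left out because the abstract does not specify it.

```python
from typing import List, Tuple


def pareto_front(candidates: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Return the non-dominated (n_features, score) pairs.

    A candidate is dominated if another candidate uses no more features
    and scores at least as well, with at least one strict improvement.
    """
    front: List[Tuple[int, float]] = []
    best_score = float("-inf")
    # Sort by feature count ascending, then by score descending, so the
    # best candidate for each feature count is seen first.
    for n_features, score in sorted(candidates, key=lambda c: (c[0], -c[1])):
        if score > best_score:
            front.append((n_features, score))
            best_score = score
    return front


# Hypothetical (number of features, discrimination score) pairs,
# invented for illustration and not taken from the paper.
candidates = [(2, 0.60), (5, 0.66), (5, 0.64), (10, 0.65), (20, 0.70), (40, 0.70)]
front = pareto_front(candidates)
```

Every point left on the front is a defensible trade-off between sparsity and accuracy; choosing one of them without a user-defined threshold is the part the abstract attributes to hEFS.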

2507.22485 2026-04-15 physics.geo-ph cs.AI

Physics-constrained generative machine learning-based high-resolution downscaling of Greenland's surface mass balance and surface temperature

Nils Bochow, Philipp Hess, Alexander Robinson

English abstract

Accurate, high-resolution projections of the Greenland ice sheet's surface mass balance (SMB) and surface temperature are essential for understanding future sea-level rise, yet current approaches are either computationally demanding or limited to coarse spatial scales. Here, we introduce a novel physics-constrained generative modeling framework based on a consistency model (CM) to downscale low-resolution SMB and surface temperature fields by a factor of up to 32 (from 160 km to 5 km grid spacing) in a few sampling steps. The CM is trained on monthly outputs of the regional climate model MARv3.12 and conditioned on ice-sheet topography and insolation. By enforcing a hard conservation constraint during inference, we ensure approximate preservation of SMB and temperature sums on the coarse spatial scale as well as robust generalization to extreme climate states without retraining. On the test set, our constrained CM achieves a continuous ranked probability score (CRPS) of 6.31 mmWE for the SMB and 0.1 K for the surface temperature, outperforming interpolation-based downscaling. Together with spatial power-spectral analysis, we demonstrate that the CM faithfully reproduces variability across spatial scales. We further apply bias-corrected outputs of the NorESM2 Earth System Model as inputs to our CM, to demonstrate the potential of our model to directly downscale ESM fields. Our approach delivers realistic, high-resolution climate forcing for ice-sheet simulations with fast inference and can be readily integrated into Earth-system and ice-sheet model workflows to improve projections of the future contribution to sea-level rise from Greenland and potentially other ice sheets and glaciers.
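
One simple way to picture a conservation constraint of this kind is an additive correction that makes each block of fine-grid cells average to its coarse-grid value. The sketch below is an assumption made for illustration: the abstract does not say which correction the authors apply, and the function name `enforce_block_mean` and the toy arrays are hypothetical.

```python
from typing import List


def enforce_block_mean(
    fine: List[List[float]], coarse: List[List[float]], factor: int
) -> List[List[float]]:
    """Shift each factor x factor block of `fine` so its mean equals the
    matching cell of `coarse`. Preserving the block mean is equivalent to
    preserving the block sum up to the constant factor * factor."""
    out = [row[:] for row in fine]
    for bi, coarse_row in enumerate(coarse):
        for bj, target in enumerate(coarse_row):
            rows = range(bi * factor, (bi + 1) * factor)
            cols = range(bj * factor, (bj + 1) * factor)
            block_mean = sum(fine[i][j] for i in rows for j in cols) / (factor * factor)
            shift = target - block_mean  # additive correction for this block
            for i in rows:
                for j in cols:
                    out[i][j] = fine[i][j] + shift
    return out


# Toy 2 x 4 fine field and 1 x 2 coarse field (downscaling factor 2).
fine = [[1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 4.0]]
coarse = [[3.0, 0.5]]
constrained = enforce_block_mean(fine, coarse, factor=2)
```

The correction leaves the fine-scale pattern inside each block untouched and only moves its level, which is why such a step can be applied at inference time without retraining.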

2507.01770 2026-04-15 math.NA cs.AI cs.DC cs.MS cs.NA math.OC

Global optimization tailored for graphics processing units: Complete and rigorous search for large-scale nonlinear minimization

Guanglu Zhang, Qihang Shan, Jonathan Cagan

Comments 35 pages, 4 figures

Journal ref PNAS Nexus, 5(4), pp. pgag103 (2026)

English abstract

This paper introduces a numerical method to enclose the global minimum of a nonlinear function subject to simple bounds on the variables. Using interval analysis, coupled with the computational power and architecture of graphics processing units (GPUs), the method iteratively rules out the regions in the search domain where the global minimum cannot exist and leaves a finite set of regions where the global minimum must exist. For effectiveness, because of the rigor of interval analysis, the method is guaranteed to enclose the global minimum even in the presence of rounding errors. For efficiency, the method employs a novel GPU-based single program, single data parallel programming style to circumvent major GPU performance bottlenecks, and a variable cycling technique is also integrated into the method to reduce computational cost when minimizing large-scale nonlinear functions. The method is validated by minimizing 11 benchmark test functions with scalable dimensions, including the well-known Ackley function, Griewank function, Levy function, Rastrigin function, and Rosenbrock function. These benchmark test functions represent grand challenges of global optimization, and enclosing the guaranteed global minimum of these benchmark test functions with more than 80 dimensions has not been reported in the literature. Our method completely searches the feasible domain and successfully encloses the guaranteed global minimum of these 11 benchmark test functions with up to 10,000 dimensions using only one GPU in a reasonable computation time, far exceeding the reported results in the literature due to the unique method design and implementation based on GPU architecture.
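
The rule-out loop described above can be sketched in one dimension on a CPU. This is a minimal illustration of interval branch-and-bound, not the authors' GPU method: it omits variable cycling and parallelism, and it does not use outward rounding, so unlike the paper's method it is not rigorous against floating-point error. The test function `x*x - 2*x` and all names are chosen for illustration.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def i_sub(a: Interval, b: Interval) -> Interval:
    return (a[0] - b[1], a[1] - b[0])


def i_mul(a: Interval, b: Interval) -> Interval:
    products = (a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1])
    return (min(products), max(products))


def f_interval(x: Interval) -> Interval:
    """Natural interval extension of f(x) = x*x - 2*x."""
    return i_sub(i_mul(x, x), i_mul((2.0, 2.0), x))


def f_point(x: float) -> float:
    return x * x - 2.0 * x


def enclose_minimum(lo: float, hi: float, tol: float = 1e-3):
    """Bisect boxes and discard those whose interval lower bound exceeds
    the best function value found so far."""
    boxes: List[Interval] = [(lo, hi)]
    best_upper = f_point(0.5 * (lo + hi))
    while any(b - a > tol for a, b in boxes):
        next_boxes: List[Interval] = []
        for a, b in boxes:
            mid = 0.5 * (a + b)
            best_upper = min(best_upper, f_point(mid))
            for half in ((a, mid), (mid, b)):
                # Keep a box only if it could still contain the minimum.
                if f_interval(half)[0] <= best_upper:
                    next_boxes.append(half)
        boxes = next_boxes
    lower = min(f_interval(box)[0] for box in boxes)
    return boxes, (lower, best_upper)


boxes, (lower, upper) = enclose_minimum(-3.0, 3.0)
```

For this function the true minimum is -1 at x = 1, so the surviving boxes cluster around x = 1 and the pair (`lower`, `upper`) brackets -1.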

2507.00866 2026-04-15 astro-ph.IM cs.LG

Template-Fitting Meets Deep Learning: Redshift Estimation Using Physics-Guided Neural Networks

Jonas Chris Ferrao, Dickson Dias, Pranav Naik, Glory D'Cruz, Anish Naik, Siya Khandeparkar, Manisha Gokuldas Fal Dessai

English abstract

Accurate photometric redshift estimation is critical for observational cosmology, especially in large-scale surveys where spectroscopic measurements are impractical. Traditional approaches include template fitting and machine learning, each with distinct strengths and limitations. We present a hybrid method that integrates template fitting with deep learning using physics-guided neural networks. By embedding spectral energy distribution templates into the network architecture, our model encodes physical priors into the training process. The system employs a multimodal design, incorporating cross-attention mechanisms to fuse photometric and image data, along with Bayesian layers for uncertainty estimation. We evaluate our model on the publicly available PREML dataset, which includes approximately 400,000 galaxies from the Hyper Suprime-Cam PDR3 release, with 5-band photometry, multi-band imaging, and spectroscopic redshifts. Our approach achieves an RMS error of 0.0507, a 3-sigma catastrophic outlier rate of 0.13%, and a bias of 0.0028. The model satisfies two of the three LSST photometric redshift requirements for redshifts below 3. These results highlight the potential of combining physically motivated templates with data-driven models for robust redshift estimation in upcoming cosmological surveys.

2506.12770 2026-04-15 quant-ph cs.AI physics.atom-ph

Solving tricky quantum optics problems with assistance from (artificial) intelligence

Manas Pandey, Bharath Hebbe Madhusudhana, Saikat Ghosh, Dmitry Budker

Comments 9 pages, 3 figures

English abstract

The capabilities of modern artificial intelligence (AI) as a "scientific collaborator" are explored by engaging it with three nuanced problems in quantum optics: state populations in optical pumping, resonant transitions between decaying states (the Burshtein effect), and degenerate mirrorless lasing. Through iterative dialogue, the authors observe that AI models, when prompted and corrected, can reason through complex scenarios, refine their answers, and provide expert-level guidance, closely resembling the interaction with an adept colleague. The findings highlight that AI democratizes access to sophisticated modeling and analysis, shifting the focus in scientific practice from technical mastery to the generation and testing of ideas, and reducing the time for completing research tasks from days to minutes.

2506.01979 2026-04-15 cs.DC cs.AI

SpecBranch: Speculative Decoding via Hybrid Drafting and Rollback-Aware Branch Parallelism

Yuhao Shen, Junyi Shen, Quan Kong, Tianyu Liu, Yao Lu, Cong Wang

Comments The paper has been accepted by ICLR2026

English abstract

Recently, speculative decoding (SD) has emerged as a promising technique to accelerate LLM inference by employing a small draft model to propose draft tokens in advance and validating them in parallel with the large target model. However, existing SD methods remain fundamentally constrained by their serialized execution, which causes mutual waiting bubbles between the draft and target models. To address this challenge, we draw inspiration from branch prediction in modern processors and propose a novel framework, SpecBranch, to unlock branch parallelism in SD. Specifically, we first conduct an in-depth analysis of the potential of branch parallelism in SD and recognize that the key challenge lies in the trade-offs between parallelization and token rollback. Based on the analysis, we strategically introduce parallel speculative branches to preemptively hedge against likely rejections. Meanwhile, to enhance parallelism, we jointly orchestrate adaptive draft lengths with a hybrid combination of implicit draft model confidence and explicit reuse of target model features. Extensive experiments across various models and benchmarks show that SpecBranch achieves 1.8$\times$ to 4.5$\times$ speedups over auto-regressive decoding and reduces rollback tokens by 50% for poorly aligned models, demonstrating its applicability to real-world deployments.
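
For readers unfamiliar with the baseline, the serialized draft-then-verify loop and the notion of rollback tokens can be sketched as follows. This is plain greedy speculative decoding, not SpecBranch: it has no parallel branches and no adaptive draft length. The two toy "models" are hypothetical deterministic functions standing in for real language models.

```python
from typing import Callable, List, Tuple

Model = Callable[[List[int]], int]


def target_model(prefix: List[int]) -> int:
    """Toy stand-in for the large model: next token from the prefix."""
    return (sum(prefix) * 7 + 3) % 11


def draft_model(prefix: List[int]) -> int:
    """Toy draft: agrees with the target unless len(prefix) % 4 == 0."""
    tok = target_model(prefix)
    return (tok + 1) % 11 if len(prefix) % 4 == 0 else tok


def greedy_decode(prompt: List[int], n_new: int, model: Model) -> List[int]:
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    prompt: List[int], n_new: int, draft_len: int, draft: Model, target: Model
) -> Tuple[List[int], int]:
    out = list(prompt)
    rolled_back = 0
    while len(out) - len(prompt) < n_new:
        # Draft phase: propose draft_len tokens autoregressively.
        proposal: List[int] = []
        for _ in range(draft_len):
            proposal.append(draft(out + proposal))
        # Verify phase: a real system scores all positions in one parallel
        # target pass; here the target is called position by position.
        accepted = 0
        for i, tok in enumerate(proposal):
            if tok != target(out + proposal[:i]):
                break
            accepted += 1
        out.extend(proposal[:accepted])
        rolled_back += len(proposal) - accepted  # rejected draft tokens
        # The target always contributes one token: a correction or a bonus.
        out.append(target(out))
    return out[: len(prompt) + n_new], rolled_back


result, rolled_back = speculative_decode([1, 2], 12, 4, draft_model, target_model)
```

With greedy verification the output matches ordinary greedy decoding with the target model, and `rolled_back` counts the wasted draft tokens that the abstract aims to reduce.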

2506.01256 2026-04-15 eess.AS cs.CL cs.LG cs.SD

Gradient boundaries through confidence intervals for forced alignment estimates using model ensembles

Matthew C. Kelley

Comments accepted for publication; 12 pages, 4 figures

English abstract

Forced alignment is a common tool to align audio with orthographic and phonetic transcriptions. Most forced alignment tools provide only point-estimates of boundaries. The present project introduces a method of producing gradient boundaries by deriving confidence intervals using neural network ensembles. Ten different segment classifier neural networks were previously trained, and the alignment process is repeated with each classifier. The ensemble is then used to place the point-estimate of a boundary at the median of the boundaries in the ensemble, and the gradient range is placed using a 97.85% confidence interval around the median constructed using order statistics. Gradient boundaries are taken here as a more realistic representation of how segments transition into each other. Moreover, the range indicates the model uncertainty in the boundary placement, facilitating tasks like finding boundaries that should be reviewed. As a bonus, on the Buckeye and TIMIT corpora, the ensemble boundaries show a slight overall improvement over using just a single model. The gradient boundaries can be emitted during alignment as JSON files and a main table for programmatic and statistical analysis. For familiarity, they are also output as Praat TextGrids using a point tier to represent the edges of the boundary regions.
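
The 97.85% figure is consistent with a standard order-statistics interval for the median of ten values. The sketch below assumes the interval runs from the second-smallest to the second-largest ensemble boundary; that choice reproduces the stated coverage, but whether it is the exact construction used in the paper is an assumption. The boundary times are invented for illustration.

```python
from math import comb
from statistics import median
from typing import List, Tuple


def median_ci_coverage(n: int, k: int) -> float:
    """Coverage of [X_(k), X_(n-k+1)] (1-indexed order statistics) as a
    distribution-free confidence interval for the median."""
    tail = sum(comb(n, i) for i in range(k)) / 2 ** n
    return 1.0 - 2.0 * tail


def gradient_boundary(times: List[float], k: int = 2) -> Tuple[float, float, float]:
    """Return (lower edge, point estimate, upper edge) for one boundary."""
    ordered = sorted(times)
    n = len(ordered)
    return ordered[k - 1], median(ordered), ordered[n - k]


coverage = median_ci_coverage(10, 2)  # 1 - 2 * (1 + 10) / 1024

# Hypothetical boundary times in seconds from ten aligner runs.
times = [0.101, 0.098, 0.105, 0.100, 0.099, 0.103, 0.102, 0.097, 0.110, 0.100]
lower, point, upper = gradient_boundary(times)
```

Here `coverage` evaluates to 0.978515625, which rounds to the 97.85% quoted in the abstract.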

2505.23737 2026-04-15 stat.ML cs.IT cs.LG math.IT math.OC

On the Convergence Analysis of Muon

Wei Shen, Ruichuan Huang, Minhui Huang, Cong Shen, Jiawei Zhang

English abstract

The majority of parameters in neural networks are naturally represented as matrices. However, most commonly used optimizers treat these matrix parameters as flattened vectors during optimization, potentially overlooking their inherent structural properties. Recently, an optimizer called Muon has been proposed, specifically designed to optimize matrix-structured parameters. Extensive empirical evidence shows that Muon can significantly outperform traditional optimizers when training neural networks. Nonetheless, the theoretical understanding of Muon's convergence behavior and the reasons behind its superior performance remain limited. In this work, we present a comprehensive convergence rate analysis of Muon and its comparison with Gradient Descent (GD). We characterize the conditions under which Muon can outperform GD. Our theoretical results reveal that Muon can benefit from the low-rank structure of Hessian matrices, a phenomenon widely observed in practical neural network training. Our experimental results support and corroborate the theoretical findings.

2505.19134 2026-04-15 cs.GT cs.LG stat.ML

Incentivizing High-Quality Human Annotations with Golden Questions

Shang Liu, Zhongze Cai, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li

Comments Corrected bugs in the proofs by specifying a further assumption. arXiv admin note: text overlap with arXiv:2502.06387

English abstract

Human-annotated data plays a vital role in training large language models (LLMs), for example in supervised fine-tuning and human preference alignment. However, it is not guaranteed that paid human annotators produce high-quality data. In this paper, we study how to incentivize human annotators to do so. We start from a principal-agent model of the dynamics between the company (the principal) and the annotator (the agent), where the principal can only monitor the annotation quality by examining $n$ samples. We investigate the maximum likelihood estimators (MLE) and the corresponding hypothesis testing to incentivize annotators: the agent is given a bonus if the MLE passes the test. By analyzing the variance of the outcome, we show that the strategic behavior of the agent makes the hypothesis testing very different from traditional ones: unlike the exponential rate proved by the large deviation theory, the principal-agent model's hypothesis testing rate is of $\Theta(1/\sqrt{n \log n})$. Our theory implies two criteria for the golden questions used to monitor the performance of the annotators: they should be of (1) high certainty and (2) similar format to normal ones. In that light, we select a set of golden questions in human preference data. By doing incentive-compatible experiments, we find that the annotators' behavior is better revealed by those golden questions, compared to traditional survey techniques such as instructed manipulation checks.

2504.00890 2026-04-15 stat.ML cs.LG

Privacy-Preserving Transfer Learning for Community Detection using Locally Distributed Multiple Networks

Xiao Guo, Xuming He, Xiangyu Chang, Shujie Ma

English abstract

Modern applications increasingly involve highly sensitive network data, where raw edges cannot be shared due to privacy constraints. We propose TransNet, a new spectral clustering-based transfer learning framework that improves community detection on a target network by leveraging heterogeneous, locally stored, and privacy-preserved auxiliary source networks. Our focus is the local differential privacy regime, in which each local data provider perturbs edges via randomized response before release, requiring no trusted third party. TransNet aggregates source eigenspaces through a novel adaptive weighting scheme that accounts for both privacy and heterogeneity, and then regularizes the weighted source eigenspace with the target eigenspace to optimally balance the two. Theoretically, we establish an error-bound-oracle property: the estimation error for the aggregated eigenspace depends only on informative sources, ensuring robustness when some sources are highly heterogeneous or heavily privatized. We further show that the error bound of TransNet is no greater than that of estimators using only the target network or only (weighted) sources. Empirically, TransNet delivers strong gains across a range of privacy levels and heterogeneity patterns. For completeness, we also present TransNetX, an extension based on Gaussian perturbation of projection matrices under the assumption that trusted local data curators are available.
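
The randomized-response step mentioned above can be sketched in its textbook form: each edge indicator is flipped independently with probability 1 / (1 + e^ε), which gives ε-local differential privacy per edge. The parameterization and the debiasing helper below are the standard ones and are assumed here; the abstract does not state which variant TransNet uses, and all function names are hypothetical.

```python
import math
import random
from typing import List


def flip_probability(epsilon: float) -> float:
    """Probability of flipping one edge bit under epsilon-LDP."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(
    adj: List[List[int]], epsilon: float, rng: random.Random
) -> List[List[int]]:
    """Perturb the upper triangle of a symmetric 0/1 adjacency matrix and
    mirror it, leaving the diagonal at zero."""
    n = len(adj)
    q = flip_probability(epsilon)
    noisy = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            bit = adj[i][j]
            if rng.random() < q:
                bit = 1 - bit
            noisy[i][j] = noisy[j][i] = bit
    return noisy


def debias(noisy_value: float, epsilon: float) -> float:
    """Invert E[noisy] = q + (1 - 2q) * true; requires epsilon > 0."""
    q = flip_probability(epsilon)
    return (noisy_value - q) / (1.0 - 2.0 * q)


adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
noisy = randomized_response(adj, epsilon=1.0, rng=random.Random(0))
```

Because the flip probability is public, a downstream estimator can remove the bias in expectation with `debias`, at the cost of higher variance for small ε.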

2503.10471 2026-04-15 cond-mat.mtrl-sci cs.AI

Siamese Foundation Models for Crystal Structure Prediction

Liming Wu, Wenbing Huang, Rui Jiao, Jianxing Huang, Liwei Liu, Yipeng Zhou, Hao Sun, Yang Liu, Fuchun Sun, Yuxiang Ren, Jirong Wen

English abstract

Predicting crystal structures from chemical compositions is a fundamental challenge in materials discovery, complicated by complex 3D geometries that distinguish it from fields like protein folding. Here, we present Diffusion-based Crystal Omni (DAO), a pretrain-finetune framework for crystal structure prediction integrating two Siamese foundation models: a structure generator and an energy predictor. The generator is pretrained via a two-stage pipeline on a vast dataset of stable and unstable structures, leveraging the predictor to relax unstable configurations and guide the generative sampling. Across two well-known benchmarks, pretraining significantly enhances performance across multiple backbone architectures. Ablation studies confirm that the synergy between the generator and predictor mutually benefits both components. We further validate DAO on three real-world superconductors ($\text{Cr}_6\text{Os}_2$, $\text{Zr}_{16}\text{Rh}_8\text{O}_4$, and $\text{Zr}_{16}\text{Pd}_8\text{O}_4$) typically inaccessible to conventional computation. For $\text{Cr}_6\text{Os}_2$, DAO achieves a 100% match rate with experimental references and an atomic-position error of 0.0012 under 20-shot generation, performing over 2000$\times$ faster per iteration than DFT-based structure predictors. These compelling results collectively highlight the potential of our approach for advancing materials science research.

2502.13571 2026-04-15 cs.SI cs.LG

Influence Strength Estimation in Hyperbolic Space for Social Influence Maximization

Hongliang Qiao, Shanshan Feng, Min Zhou, Xutao Li, Yunming Ye, Fan Li, Shuo Shang, Yew-Soon Ong

Comments 14 pages, 10 figures

English abstract

The Influence Maximization (IM) problem aims to find a small set of influential users to maximize their influence spread in a social network. Traditional methods rely on fixed diffusion models with known parameters, limiting their generalization to real-world scenarios. In contrast, graph representation learning-based methods have gained wide attention for overcoming this limitation by learning user representations to capture influence characteristics. However, existing studies are built on Euclidean space, which fails to effectively capture the latent hierarchical features of social influence distribution. As a result, users' influence spread cannot be effectively measured through the learned representations. To alleviate these limitations, we propose HIM, a novel diffusion model agnostic method that leverages hyperbolic representation learning to estimate users' potential influence spread from social propagation data. HIM consists of two key components. First, a hyperbolic influence representation module encodes influence spread patterns from network structure and historical influence activations into expressive hyperbolic user representations. Hence, the influence magnitude of users can be reflected through the geometric properties of hyperbolic space, where highly influential users tend to cluster near the space origin. Second, a novel adaptive seed selection module is developed to flexibly and effectively select seed users using the positional information of learned user representations. Extensive experiments on five network datasets demonstrate the superior effectiveness and efficiency of our method for the IM problem with unknown diffusion model parameters, highlighting its potential for large-scale real-world social networks.
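
The geometric intuition, that users closer to the origin of hyperbolic space are more influential, can be made concrete with a small sketch. It assumes a Poincaré ball of curvature -1, where the distance from the origin to a point x is 2·artanh(‖x‖); the abstract does not say which hyperbolic model HIM uses, and the embeddings and the simple ranking rule below are hypothetical, not HIM's seed selection module.

```python
import math
from typing import Dict, Sequence


def distance_to_origin(x: Sequence[float]) -> float:
    """Hyperbolic distance from the origin in the Poincare ball (curvature -1)."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm >= 1.0:
        raise ValueError("point must lie strictly inside the unit ball")
    return 2.0 * math.atanh(norm)


# Hypothetical 2-D user embeddings, invented for illustration.
embeddings: Dict[str, Sequence[float]] = {
    "alice": (0.05, 0.02),
    "bob": (0.60, -0.30),
    "carol": (0.10, 0.90),
}

# Rank users from closest to the origin (assumed most influential) outward.
ranked = sorted(embeddings, key=lambda user: distance_to_origin(embeddings[user]))
```

Distances grow without bound as a point approaches the boundary of the ball, which is what lets a low-dimensional hyperbolic embedding separate a few central users from a large periphery.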

2502.07415 2026-04-15 stat.ML cs.LG

The Illusion of Fit: Spatially Resolved Assessment of Constitutive Model Validity in Elastography and Physics-Based Inverse Problems

Vincent C. Scholz, P. S. Koutsourelakis

Comments 29 pages, 12 figures

English abstract

Inferring the mechanical properties of soft tissues from measured deformations is a fundamental challenge in elastography. A rarely examined assumption underlying existing approaches is that the assumed constitutive law correctly describes the imaged material. When it fails, inversion still yields plausible-looking estimates - an illusion of fit with no indication of local model invalidity, which can mislead clinical interpretation. We propose a probabilistic framework that transforms constitutive model validity from an implicit assumption into an explicit, spatially resolved inference target. The key is to treat the stress field as an independent latent variable rather than deriving it from the constitutive law. This enables a pointwise comparison between the stress required by mechanical equilibrium and the stress predicted by the assumed constitutive model. Both governing equations enter the probabilistic learning objective as virtual observables with separate precision hyperparameters: the conservation law precision is set a priori to a small value reflecting its undisputed validity, while the constitutive precision is inferred under a sparsity-promoting prior. The resulting constitutive precision field provides an uncertainty-aware map of where the assumed model is supported by the data and where it is not. Inference is carried out via stochastic variational inference and is forward-model-free. We validate the framework on synthetic harmonic elastography experiments on a brain-slice geometry with an anisotropic inclusion. The inferred precision field identifies the inclusion with a five-order-of-magnitude precision contrast against the valid domain, robustly across 25-35 dB noise and four-fold sparser observations. A phantom experiment with ultrasound measurements on a linear elastic material yields no false-positive violations and recovers the true stiffness contrast.

2502.06595 2026-04-15 math.NA cs.LG cs.NA

Surrogate models for diffusion on graphs via sparse polynomials

Giuseppe Alessio D'Inverno, Kylian Ajavon, Simone Brugiapaglia

English abstract

Diffusion kernels over graphs have been widely utilized as effective tools in various applications due to their ability to accurately model the flow of information through nodes and edges. However, there is a notable gap in the literature regarding the development of surrogate models for diffusion processes on graphs. In this work, we fill this gap by proposing sparse polynomial-based surrogate models for parametric diffusion equations on graphs with community structure. In tandem, we provide convergence guarantees for both least squares and compressed sensing-based approximations by showing the holomorphic regularity of parametric solutions to these diffusion equations. Our theoretical findings are accompanied by a series of numerical experiments conducted on both synthetic and real-world graphs that demonstrate the applicability of our methodology.

2206.00939 2026-04-15 stat.ML cs.LG

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs

Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion

Comments corrected Proposition 1

English abstract

The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution. Yet, despite some recent progress, a complete theory explaining its success is still missing. This article presents, for orthogonal input vectors, a precise description of the gradient flow dynamics of training one-hidden layer ReLU neural networks for the mean squared error at small initialisation. In this setting, despite non-convexity, we show that the gradient flow converges to zero loss and characterise its implicit bias towards minimum variation norm. Furthermore, some interesting phenomena are highlighted: a quantitative description of the initial alignment phenomenon and a proof that the process follows a specific saddle to saddle dynamics.

2108.00480 2026-04-15 q-fin.CP cs.CL cs.LG

Realised Volatility Forecasting: Machine Learning via Financial Word Embedding

Eghbal Rahimikia, Stefan Zohren, Ser-Huang Poon

English abstract

We examine whether news can improve realised volatility forecasting using a modern yet operationally simple NLP framework. News text is transformed into embedding-based representations, and forecasts are evaluated both as a standalone, news-only model and as a complement to standard realised volatility benchmarks. In out-of-sample tests on a cross-section of stocks, news contains useful predictive information, with stronger effects for stock-related content and during high volatility days. Combining the news-based signal with a leading benchmark yields consistent improvements in statistical performance and economically meaningful gains, while explainability analysis highlights the news themes most relevant for volatility.

2604.13034 2026-04-15 cs.DC cs.DB

DySkew: Dynamic Data Redistribution for Skew-Resilient Snowpark UDF Execution

Chenwei Xie, Urjeet Shrestha, Corbin McElhanney, Lukas Lorimer, Gopal V, Zihao Ye, Yi Pan, Nic Crouch, Elliott Brossard, Florian Funke, Yuxiong He

English abstract

Snowflake revolutionized data warehousing with an elastic architecture that decouples compute and storage, enabling scalable solutions for diverse data analytics needs. Building on this foundation, Snowflake has advanced its AI Data Cloud vision by introducing Snowpark, a managed turnkey solution that supports data engineering and AI/ML workloads using Python and other programming languages. While Snowpark's User-Defined Function (UDF) execution model offers high throughput, it is highly vulnerable to performance degradation from data skew, where uneven data partitioning causes straggler tasks and unpredictable latency. The non-uniform computational cost of arbitrary user code further exacerbates this classic challenge. This paper presents DySkew, a novel, data-skew-aware execution strategy for Snowpark UDFs. Built upon Snowflake's new generalized skew handling solution, an adaptive data distribution mechanism utilizing per-link state machines, DySkew addresses the unique challenges of user-defined logic with goals of fine-grained per-row mitigation, dynamic runtime adaptation, and low-overhead, cost-aware redistribution. Specifically, for Snowpark, we introduce crucial optimizations, including an eager redistribution strategy and a Row Size Model to dynamically manage overhead for extremely large rows. This dynamic approach overcomes the limitations of the previous static round-robin method. We detail the architecture of this framework and showcase its effectiveness through performance evaluations and real-world case studies, demonstrating significant improvements in the execution time and resource utilization for large-scale Snowpark UDF workloads.

2604.13033 2026-04-15 quant-ph cs.IT math-ph math.IT math.MP

Partial majorization and Schur concave functions on the sets of quantum and classical states

M. E. Shirokov

Comments 20 pages, 3 figures, any comments are welcome

English abstract

For a Schur concave function $f$ on the set of quantum states, we construct a tight upper bound on the difference $f(\rho)-f(\sigma)$, where $\rho$ is a quantum state with finite $f(\rho)$ and $\sigma$ is any quantum state $m$-partially majorized by $\rho$ in the sense described in [1]. We also obtain a tight upper bound on this difference under the additional condition $\frac{1}{2}\|\rho-\sigma\|_1\leq\varepsilon$ and find simple sufficient conditions for this bound to vanish as $\,\min\{\varepsilon,1/m\}\to0\,$. The obtained results are applied to the von Neumann entropy. The concept of $\varepsilon$-sufficient majorization rank of a quantum state with finite entropy is introduced, and a tight upper bound on this quantity is derived and applied to the Gibbs states of a quantum oscillator. We also show how the obtained results can be reformulated for Schur concave functions on the set of probability distributions with a finite or countable set of outcomes.
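For orientation, the standard (full) majorization relation and Schur concavity can be written as follows. This is the textbook definition, given as background only; the $m$-partial variant used in the abstract is defined in its reference [1] and is not reproduced here. The notation $\lambda_i^{\downarrow}(\cdot)$ for eigenvalues in non-increasing order is an assumption of this note.

$$
\sigma \prec \rho
\quad\Longleftrightarrow\quad
\sum_{i=1}^{k} \lambda_i^{\downarrow}(\sigma) \;\le\; \sum_{i=1}^{k} \lambda_i^{\downarrow}(\rho)
\quad \text{for all } k \ge 1,
$$

$$
f \text{ is Schur concave}
\quad\Longleftrightarrow\quad
\bigl(\sigma \prec \rho \;\Rightarrow\; f(\sigma) \ge f(\rho)\bigr).
$$

Under full majorization, $f(\rho)-f(\sigma)\le 0$ for any Schur concave $f$, with the von Neumann entropy as a standard example. A nontrivial upper bound on this difference is therefore of interest only when the majorization condition is relaxed, as in the partial setting the abstract describes.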

2604.13032 2026-04-15 quant-ph

Zeno Blockade Enabling Photonic Quantum Optimization

Mohammad-Ali Miri, Uchenna Chukwu, Nicholas Chancellor

Comments 43 pages 18 figures

English abstract

In this work we explore the potential of implementing an optical quantum optimizer using non-linear optics, specifically using sum-frequency generation and/or two-photon absorption. This proposal uses Zeno effects to enforce independence constraints and then a linear protocol to find a maximum independent set in a way that allows the elements of the set to be weighted. Our proposal can be viewed either as an implementation of the entropy computing paradigm presented in [Nguyen et al., Communications Physics 1, 411, 8] which uses real rather than imaginary time evolution, or as quantum annealing within a Zeno-constrained subspace. We discuss how such a device could be built, as well as considerations such as error mitigation, particularly for photon-loss errors. We numerically study aspects of the protocol, including the effect of coherent versus incoherent incarnations of the Zeno effect, finding superior performance from the former.

2604.13031 2026-04-15 astro-ph.GA astro-ph.HE

Obscured at the Core: Evidence for Nuclear Dust in Reddened Type-1 AGN

Miguel A. Montalvo Hernandez, Andy D. Goulding, Jenny E. Greene

Comments 16 pages, 8 figures, 3 tables. Resubmitted to ApJ after addressing comments from referee

English abstract

Reddened Type-1 quasars offer a unique window into the structure and evolution of active galactic nuclei (AGN), yet their physical origin and the source of their reddening remain uncertain. Optical surveys often miss these dust-obscured objects, resulting in an incomplete view of the quasar population. In this work, we construct a sample of 6,600 Type-1 quasars at redshifts $0.5 \leq z \leq 1.2$ by combining deep optical imaging from HSC with mid-infrared photometry from WISE, enabling a more complete selection that is not biased against reddened objects. We perform detailed SED modeling using the CIGALE code, enhanced by synthetic photometry derived from SDSS spectra to better constrain the optical continuum. We classify quasars into blue and reddened Type-1 populations based on their continuum slopes and compare their SEDs and emission line properties. As expected from this definition, reddened Type-1 AGN show higher dust extinction, with a median $A_V = 0.60^{+0.32}_{-0.19}$ mag, compared to $A_V = 0.06^{+0.10}_{-0.03}$ mag for blue objects. However, they also exhibit smaller torus half-opening angles, with a median of $25.7^{+10.1}_{-8.7}$ deg, compared to $33.3^{+11.1}_{-5.9}$ deg for blue objects. While such extinction could arise on either galaxy or nuclear scales, the systematically stronger narrow-line equivalent widths and weaker Balmer broad lines in reddened Type-1s indicate that the obscuration acts on nuclear scales, likely from dust concentrated near the polar axis. We discuss the possibility that these structural differences may be linked to a sub-pc outflow that carries dusty gas into the polar region and evacuates the torus region.

2604.13027 2026-04-15 quant-ph cond-mat.quant-gas

Floquet Many-Body Cages

Tom Ben-Ami, Roderich Moessner, Markus Heyl

Comments 9 pages, 6+2 figures

English abstract

Many-body cages have very recently emerged as a general route for nonergodic behaviour in quantum matter. Here, we show that new types of many-body cages can be engineered in Floquet circuits with the potential to realize novel nonequilibrium quantum states. For that purpose, we first identify an explicit, general construction of Floquet circuits capable of hosting many-body cages. We then present a generic strategy to engineer and structure Floquet many-body cages. We demonstrate the developed scheme for the quantum hard disk model as a generic constrained model system, realizable for instance in Rydberg atom arrays. We construct Floquet circuits yielding Floquet many-body cages with topological properties and $\pi$-quasienergy modes, implying 'time crystalline' spatiotemporal order. Our results can be directly extended to general quantum circuits, thus providing a new tool to engineer nonequilibrium behaviour in driven systems.

2604.13026 2026-04-15 quant-ph cond-mat.stat-mech cs.CC

A complexity phase transition at the EPR Hamiltonian

Kunal Marwaha, James Sud

Comments 47 pages, 8 figures

English abstract

We study the computational complexity of 2-local Hamiltonian problems generated by a positive-weight symmetric interaction term, encompassing many canonical problems in statistical mechanics and optimization. We show these problems belong to one of three complexity phases: QMA-complete, StoqMA-complete, and reducible to a new problem we call EPR*. The phases are physically interpretable, corresponding to the energy level ordering of the local term. The EPR* problem is a simple generalization of the EPR problem of King. Inspired by empirically efficient algorithms for EPR, we conjecture that EPR* is in BPP. If true, this would complete the complexity classification of these problems, and imply EPR* is the transition point between easy and hard local Hamiltonians. Our proofs rely on perturbative gadgets. One simple gadget, when recursed, induces a renormalization-group-like flow on the space of local interaction terms. This gives the correct complexity picture, but does not run in polynomial time. To overcome this, we design a gadget based on a large spin chain, which we analyze via the Jordan-Wigner transformation.

2604.13025 2026-04-15 cs.DS cs.DM math.CO

Asymptotically faster algorithms for recognizing $(k,\ell)$-sparse graphs

Bence Deák, Péter Madarasi

English abstract

The family of $(k,\ell)$-sparse graphs, introduced by Lorea, plays a central role in combinatorial optimization and has a wide range of applications, particularly in rigidity theory. A key algorithmic problem is to decide whether a given graph is $(k,\ell)$-sparse and, if not, to produce a vertex set certifying the failure of sparsity. While pebble game algorithms have long yielded $O(n^2)$-time recognition throughout the classical range $0 \leq \ell < 2k$, and $O(n^3)$-time algorithms in the extended range $2k \leq \ell < 3k$, substantially faster bounds were previously known only in a few special cases. We present new recognition algorithms for the parameter ranges $0 \le \ell \le k$, $k < \ell < 2k$, and $2k \leq \ell < 3k$. Our approach combines bounded-indegree orientations, reductions to rooted arc-connectivity, augmenting-path techniques, and a divide-and-conquer method based on centroid decomposition. This yields the first subquadratic, and in fact near-linear-time, recognition algorithms throughout the classical range when instantiated with the fastest currently available subroutines. Under purely combinatorial implementations, the running times become $O(n\sqrt n)$ for $0 \leq \ell \leq k$ and $O(n\sqrt{n\log n})$ for $k< \ell <2k$. For $2k \leq \ell < 3k$, we obtain an $O(n^2)$-time algorithm when $\ell \leq 2k+1$ and an $O(n^2\log n)$-time algorithm otherwise. In each case, the algorithm can also return an explicit violating set certifying that the input graph is not $(k,\ell)$-sparse.
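As a point of reference for the recognition problem, the sketch below checks the sparsity condition by brute force: it looks for a vertex set $X$ spanning more than $k|X|-\ell$ edges. This is an exponential-time illustration of the definition and of what a violating set is, not any of the algorithms from the paper. The function name and the `min_size` parameter are hypothetical, and restricting the check to sets of at least two vertices is an assumed convention suited to the classical range.

```python
from itertools import combinations


def find_violating_set(n, edges, k, l, min_size=2):
    """Return a vertex set X spanning more than k*|X| - l edges, or None.

    Brute force over all vertex subsets of size >= min_size; exponential
    in n, so suitable only for tiny graphs.
    """
    for size in range(min_size, n + 1):
        for subset in combinations(range(n), size):
            members = set(subset)
            spanned = sum(1 for u, v in edges if u in members and v in members)
            if spanned > k * size - l:
                return members  # certificate that the graph is not sparse
    return None  # every checked subset satisfies the sparsity count


triangle = [(0, 1), (1, 2), (0, 2)]
k4 = list(combinations(range(4), 2))  # complete graph on 4 vertices

print(find_violating_set(3, triangle, 2, 3))  # triangle satisfies the (2,3) count
print(find_violating_set(4, k4, 2, 3))        # K4 has 6 edges > 2*4 - 3 = 5
```

The triangle passes the $(2,3)$ count on every subset, while for $K_4$ the full vertex set is returned as the violating set.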

2604.13014 2026-04-15 math.NA cs.NA math.AP

Finite element approximation of an anisotropic porous medium equation with fractional pressure

Stefano Fronzoni

Comments arXiv admin note: text overlap with arXiv:2404.18901

English abstract

We study a nonlocal diffusion equation of porous medium type featuring a generalised fractional pressure with spatial anisotropy. We construct a finite element method for the numerical solution of the equation on a bounded open Lipschitz polytopal domain $\Omega\subset \mathbb{R}^{d}$, where $d = 2$ or $3$. The pressure in the model is defined as the solution of a fractional elliptic problem involving the fractional power of a second-order differential operator, in terms of its spectral definition. Under suitable assumptions on the fractional order and the coefficients of the operator, we rigorously prove convergence of the numerical scheme. The analysis is carried out in two stages: first passing to the limit in the spatial discretization, and then in the time step, ultimately showing that a subsequence of the sequence of finite element approximations defined by the proposed numerical method converges to a bounded and nonnegative weak solution of the initial-boundary-value problem under consideration. Finally, we present numerical experiments in two dimensions illustrating the computational aspects of the method and highlighting the interplay between nonlocal effects and spatial anisotropy under different configurations. We also show numerically the failure of the comparison principle and the exponential decay of the numerical solution to a steady state.

2604.13012 2026-04-15 astro-ph.CO gr-qc hep-th

Probing Scalar-Tensor-Induced Gravitational Waves in the nHz Band: $\texttt{NANOGrav}$ and SKA

William Iania, Angelo Ricciardone

Comments 30 pages, 11 figures, 2 tables

English abstract

Scalar-induced gravitational waves (SIGWs) have recently attracted considerable interest, both as a possible explanation for the nanohertz signal reported by the Pulsar Timing Array (PTA) collaboration and for their connection with primordial black hole (PBH) physics. In addition to SIGWs, scalar-tensor-induced gravitational waves (STGWs) have emerged as a promising cosmological source of the stochastic gravitational wave background (SGWB). In this paper, we compute the STGWs generated during a generic matter-dominated (MD) era, as well as during an early matter-dominated (eMD) epoch followed by a sudden transition to the standard radiation-dominated (RD) stage, working in the Poisson gauge. We find that, in a purely MD age, the corresponding energy density rapidly dilutes, whereas in the presence of an eMD phase it remains non-vanishing due to the short duration of the eMD period. We then investigate whether the STGW signal could provide a dominant contribution to the $\texttt{NANOGrav 15-year}$ dataset and we forecast the prospects for its detection with future observations by the Square Kilometre Array (SKA). In particular, we consider STGWs generated during both eMD and RD eras, including their linear-order contributions. Our results show that the GWs induced by scalar-tensor mixing constitute a viable target for future, more sensitive detections of the SGWB.

2604.13009 2026-04-15 physics.chem-ph

EOM-fpCCSD: An Accurate Alternative to EOM-CCSD for Doubly Excited and Charge-Transfer States

Katharina Boguslawski, Paweł Tecmer

Comments 5 figures

English abstract

We introduce a new equation-of-motion coupled-cluster method based on a pair coupled-cluster doubles (pCCD) reference, termed frozen-pair EOM-CCSD (EOM-fpCCSD). This approach combines the computational efficiency of the pCCD ansatz with a dynamical correlation correction, enabling a reliable description of electronically excited states within the EOM framework. The method has been implemented in the open-source PyBEST software package. Its performance is systematically benchmarked against standard EOM-CCSD and its pair-tailored variant (EOM-ptCCSD), using both canonical Hartree-Fock and pCCD natural orbitals. For charge-transfer (CT) excitations taken from the QUEST database, EOM-fpCCSD yields excitation energies very close to those of EOM-CCSD, as well as to the theoretical best estimates (TBEs), outperforming EOM-ptCCSD. Working within the localized pCCD natural orbital basis allows us to determine the directed CT character, which quantifies the directed charge flow from one molecular domain to another. Numerical results show that EOM-fpCCSD, EOM-CCSD, and EOM-ptCCSD provide nearly identical descriptions of the directed CT character, despite changes in excitation energies. The true advantage of EOM-fpCCSD becomes evident for the challenging QUEST subset of doubly excited states. While EOM-ptCCSD performs similarly to standard EOM-CCSD, EOM-fpCCSD significantly outperforms both methods for these problematic states when compared to TBEs. In addition to improving the accuracy of excitation energies, EOM-fpCCSD also converges for several states for which standard EOM-CCSD and EOM-ptCCSD fail to converge. These results demonstrate that EOM-fpCCSD offers a promising and computationally efficient route toward a more accurate description of complex electronic excitations.