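arXivDaily is a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.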

2604.23874 2026-06-11 physics.flu-dyn cs.LG math.DS physics.comp-ph physics.geo-ph Updated version

Deep Learning of Solver-Aware Turbulence Closures from Nudged LES Dynamics

Ashwin Suriyanarayanan, Dibyajyoti Chakraborty, Romit Maulik

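Affiliations: School of Mechanical Engineering; Purdue University; College of Information Sciences and Technology; Pennsylvania State University

AI summary: Proposes a deep-learning method built on the continuous data assimilation framework that trains turbulence closure models a priori from sparsely observed DNS data, without modifying or differentiating through the LES solver, while keeping deployment stable and explicitly conditioning the model on the numerical scheme so that it adapts to different discretizations.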

Abstract

The differentiable physics paradigm may be leveraged as an a-posteriori approach for discovering turbulence closure models by embedding a neural network parameterization directly inside the solver and optimizing it given potentially sparse target data. This addresses a key limitation of a-priori learning where direct numerical simulation (DNS) data is used to approximate the subgrid stress with the assumption of a low-pass filter. Closures trained in this a-priori manner frequently lead to unstable deployments due to the mismatch between the assumed filter and the effect of numerical discretizations and coarse-graining. In comparison, while typically stable during deployment, a-posteriori learning incurs high computational costs due to the need to backpropagate through a large eddy simulation (LES) solver. Furthermore, a-posteriori methods are challenging to apply broadly since they require significant modification of existing solvers. Finally, both approaches are limited when generalization is desired across different numerical schemes with their implicit filtering characteristics. In this work, we present a deep-learning approach for turbulence closure modeling built on the continuous data assimilation framework. Our approach enables the a-priori training of closures using sparsely observed DNS data without modifying or differentiating through the LES solver, while preserving stability during deployment for the recovery of invariant statistics. We focus on the model's ability to adapt to different discretizations by explicitly conditioning it on the numerical scheme. We use two- and three-dimensional canonical cases to test our framework and show that the learned correction systematically tracks the discretization error of the coarse solver.
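
For orientation, continuous data assimilation is usually written as a relaxation (nudging) term added to the coarse model. The form below is the generic one from that literature and is an assumption about notation, not an equation quoted from the paper: $\tilde{u}$ is the coarse LES state, $F_h$ the discretized LES right-hand side, $I_H$ an interpolation of the sparse observations, $u^{\mathrm{DNS}}$ the observed DNS field, and $\mu > 0$ a relaxation strength.

```latex
\frac{\partial \tilde{u}}{\partial t}
  = F_h(\tilde{u}) + \mu \left( I_H u^{\mathrm{DNS}} - I_H \tilde{u} \right)
```

How the closure network is fit to the nudged dynamics is described in the paper and is not reproduced here.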

2604.25018 2026-06-11 cs.ET cs.AI cs.DC cs.NI Updated version

Internet of Everything in the 6G Era: Paradigms, Enablers, Potentials and Future Directions

Driss Choukri, Essaid Sabir, Elmahdi Driouch, Abdelkrim Haqiq

Affiliations: Computer Networks, Mobility and Modeling Laboratory (IR2M), FST, Hassan I University of Settat, Morocco, and the Department of Science and Technology, TÉLUQ, University of Quebec, Montreal, H2S 3L4, Canada; Department of Science and Technology, TÉLUQ, University of Quebec, Montreal, H2S 3L4, Canada; Department of Computer Science, University of Quebec at Montreal (UQAM), Montreal, H2L 2C4, Canada

AI summary: This survey reviews the Internet of Everything (IoE): its concept, core components, architectural foundations, enabling technologies, and research challenges. It also discusses open research directions for intelligent, 6G-oriented IoE systems, with a focus on scalability, security, privacy, and energy efficiency.

Comments 48 pages, 15 figures, 6 tables, 272 references

2604.24662 2026-06-11 physics.data-an cs.AI cs.IT math.IT Updated version

Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data

K. Michael Martini, Eslam Abdelaleem, Paarth Gulati, Ilya Nemenman

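Affiliations: Department of Physics, Emory University; Initiative in Theory and Modeling of Living Systems, Emory University; Schools of Physics and Psychology, Georgia Institute of Technology; Department of Biology, Emory University

AI summary: Proposes DySIB, which learns low-dimensional dynamical representations from high-dimensional time-series data without supervision by maximizing the predictive mutual information between past and future observation windows while penalizing representation complexity. In a physical pendulum experiment it recovers a two-dimensional representation that matches the true phase space.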

Comments 12 pages including references, 7 figures, 4 appendix pages with 4 appendix figures

Abstract

Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across physical sciences. The challenge is that the state variables are not directly observable and must be inferred from raw high-dimensional data without supervision. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck) as a method to learn low-dimensional representations of time-series data by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. This objective operates entirely in latent space and avoids reconstruction of the observations. We apply DySIB to an experimental video dataset of a physical pendulum, where the underlying state space is known. The method, with hyperparameters of the learning architecture set self-consistently by the data, recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.

2603.21639 2026-06-11 cs.CY cs.LG Updated version

A Multi-Modal Sensor Fusion Instrument for Measuring Regional Human Mobility: The Distributed Human Data Engine (DHDE)

Amil Khanzada, Takuji Takemoto

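Affiliations: Headquarters for Regional Revitalization, University of Fukui, Japan

AI summary: Proposes the Distributed Human Data Engine (DHDE), which fuses Edge-AI cameras, digital intent signals, behavioral records, and meteorological data to address sensor sparsity and behavioral heterogeneity when measuring human mobility in peripheral regions. The work validates a sparse-sensor compensation approach and identifies an "Under-Vibrancy Paradox".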

Comments 32 pages, 4 figures, 3 tables. Pre-print of a manuscript submitted for peer review (v2)

Abstract

Accurately estimating human mobility in peripheral regional economies presents a fundamental measurement challenge: physical ground-truth sensors are sparse, behavioral intent signals are heterogeneous, and environmental friction introduces systematic bias into demand inference. We introduce the Distributed Human Data Engine (DHDE), a multi-modal sensor fusion architecture that addresses this challenge by integrating physical instrumentation (Edge-AI cameras), digital intent signals (route search impression metrics), behavioral records (90,350 spending records, 97,719 standardized survey responses), and meteorological data across four geographically distributed nodes in Fukui, Japan. The primary measurement-science contribution is the design, deployment, and cross-node validation of the DHDE as a sparse-sensor compensation instrument: a heterogeneous sensor fusion architecture that anchors non-stationary digital intent signals to concurrent physical ground-truth counts, correcting for systematic bias introduced by meteorological planning friction. The instrument is implemented as an ensemble inference pipeline (Random Forest and Ordinary Least Squares with Newey-West robust inference), calibrated across 397 daily observations and validated by chronological holdout replication across four geographically distinct node types. The primary OLS specification achieved an in-sample explanatory power of R² = 0.810 and a chronological out-of-sample predictive performance of R² = 0.683. Results identify an Under-Vibrancy Paradox where macro-regional visitor satisfaction correlates positively with crowd density (Spearman rank correlation r_s = +0.150, p = 0.002). We estimate an annual proxy gap of 865,917 intent-implied visits, corresponding to JPY 11.96 billion (USD 72.6 million) in foregone revenue.

2603.19225 2026-06-11 cs.CE cs.AI cs.CL cs.IR q-fin.CP Updated version

FinTradeBench: A Financial Reasoning Benchmark for LLMs

Yogesh Agrawal, Aniruddha Dutta, Md Mahadi Hasan, Santu Karmaker, Aritra Dutta

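Affiliations: University of Central Florida

AI summary: Proposes the FinTradeBench benchmark, which combines company fundamentals with trading signals to evaluate large language models on financial reasoning, and finds that retrieval augmentation offers limited help for numerical and time-series reasoning.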

Comments 9 pages main text, 31 pages total (including references and appendix). 5 figures, 16 tables. Preprint under review. Code and data will be made available upon publication

Abstract

Real-world financial decision-making is a challenging problem that requires reasoning over heterogeneous signals, including company fundamentals derived from regulatory filings and trading signals computed from price dynamics. Recently, with advances in Large Language Models (LLMs), financial analysts have begun to use them for financial decision-making tasks. However, existing financial question-answering benchmarks for testing these models primarily focus on company balance sheet data and rarely evaluate reasoning about how company stocks trade in the market or their interactions with fundamentals. To leverage the strengths of both approaches, we introduce FinTradeBench, a benchmark for evaluating financial reasoning that integrates company fundamentals and trading signals. FinTradeBench contains 1,400 questions grounded in NASDAQ-100 companies over a ten-year historical window. The benchmark is organized into three reasoning categories: fundamentals-focused, trading-signal-focused, and hybrid questions requiring cross-signal reasoning. To ensure reliability at scale, we adopt a calibration-then-scaling framework that combines expert seed questions, multi-model response generation, intra-model self-filtering, numerical auditing, and human-LLM judge alignment. We evaluate 14 LLMs under zero-shot prompting and retrieval-augmented settings and witness a clear performance gap. Retrieval substantially improves reasoning over textual fundamentals, but provides limited benefit for trading-signal reasoning. These findings highlight fundamental challenges in the numerical and time-series reasoning for current LLMs and motivate future research in financial intelligence.

2603.14762 2026-06-11 math.OC cs.LG cs.SY eess.SY Updated version

Online Learning for Supervisory Switching Control

Haoyuan Sun, Ali Jadbabaie

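Affiliations: Massachusetts Institute of Technology

AI summary: Studies online learning for supervisory switching control of partially observed linear dynamical systems and proposes a non-asymptotic analysis that adapts multi-armed bandit algorithms to identify a stabilizing controller while also performing system identification.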

Abstract

We study supervisory switching control for partially-observed linear dynamical systems. The objective is to identify and deploy a suitable controller for the unknown system by periodically selecting among a collection of $N$ candidate controllers, some of which may destabilize the underlying system. While classical estimator-based supervisory control guarantees asymptotic stability, it lacks quantitative finite-time performance bounds. Conversely, current non-asymptotic methods in both online learning and system identification require restrictive assumptions that are incompatible in a control setting, such as system stability, which preclude testing potentially unstable controllers. To bridge this gap, we propose a novel, non-asymptotic analysis of supervisory control that adapts multi-armed bandit algorithms to a control-theoretic setting. The proposed data-driven algorithm evaluates candidate controllers via scoring criteria that leverage system observability to isolate the effects of state history, enabling both detection of destabilizing controllers and accurate system identification. We present two algorithmic variants with dimension-free, finite-time guarantees, where each identifies the matching controller in $O(N \log^2 N)$ steps, while simultaneously achieving finite $L_2$-gain with respect to system disturbances.

2603.13854 2026-06-11 cs.LO cs.AI cs.SC Updated version

Power Term Polynomial Algebra for Boolean Logic

Emanuele Sansone, Armando Solar-Lezama

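Affiliations: CSAIL, MIT; ESAT, KU Leuven; KU Leuven

AI summary: Proposes power term polynomial algebra, a representation language for Boolean formulae that sits between CNF and ANF. Power terms and power term polynomials encode CNF clauses and families of monomials directly, avoiding auxiliary variables and constraints while supporting algebraic operations and rewrite rules.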

Comments Pragmatics of SAT

Abstract

We introduce power term polynomial algebra, a representation language for Boolean formulae designed to bridge conjunctive normal form (CNF) and algebraic normal form (ANF). The language is motivated by the tiling mismatch between these representations: direct CNF<->ANF conversion may cause exponential blowup unless formulas are decomposed into smaller fragments, typically through auxiliary variables and side constraints. In contrast, our framework addresses this mismatch within the representation itself, compactly encoding structured families of monomials while representing CNF clauses directly, thereby avoiding auxiliary variables and constraints at the abstraction level. We formalize the language through power terms and power term polynomials, define their semantics, and show that they admit algebraic operations corresponding to Boolean polynomial addition and multiplication. We prove several key properties of the language: disjunctive clauses admit compact canonical representations; power terms support local shortening and expansion rewrite rules; and products of atomic terms can be systematically rewritten within the language. Together, these results yield a symbolic calculus that enables direct manipulation of formulas without expanding them into ordinary ANF. The resulting framework provides a new intermediate representation and rewriting calculus that bridges clause-based and algebraic reasoning and suggests new directions for structure-aware CNF<->ANF conversion and hybrid reasoning methods.
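
The exponential blowup mentioned in the abstract can be seen on a single clause. The sketch below is a minimal illustration that assumes only the textbook identity x1 OR ... OR xn = 1 XOR prod(1 XOR xi) over GF(2). The function names `clause_to_anf` and `eval_anf` are illustrative and not taken from the paper, and the code shows plain ANF expansion, not the power term representation itself.

```python
def clause_to_anf(variables):
    """Return the ANF of the positive clause (v1 OR ... OR vn) over GF(2).

    A polynomial is a set of monomials; a monomial is a frozenset of
    variable names, and the empty frozenset is the constant 1.
    """
    poly = {frozenset()}  # start from the constant 1
    for v in variables:
        # Multiply poly by (1 XOR v): keep every monomial and add its
        # product with v. XOR of polynomials is symmetric difference.
        shifted = set()
        for m in poly:
            shifted ^= {m | {v}}  # x*x = x, and equal monomials cancel
        poly ^= shifted
    return poly ^ {frozenset()}  # OR = 1 XOR prod(1 XOR v_i)


def eval_anf(poly, assignment):
    """Evaluate an ANF polynomial (0 or 1) under a dict of 0/1 values."""
    return sum(all(assignment[v] for v in m) for m in poly) % 2


for n in (2, 4, 8):
    names = [f"x{i}" for i in range(n)]
    print(n, len(clause_to_anf(names)))  # 2^n - 1 monomials: 3, 15, 255
```

A clause with n distinct variables therefore expands to 2^n - 1 monomials, which is the cost a direct CNF-to-ANF conversion pays unless the formula is first split into smaller pieces.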

2602.06547 2026-06-11 cs.CR cs.AI cs.CL cs.ET Updated version

"Do Not Mention This to the User": Detecting and Understanding Malicious Agent Skills in the Wild

Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Leo Yu Zhang

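Affiliations: Griffith University; Nanyang Technological University; University of New South Wales; Zhejiang Key Laboratory of Digital Fashion and Data Governance, Zhejiang Sci-Tech University

AI summary: Through a systematic security analysis of 98,380 skills from two major registries, combining static pattern matching with dynamic behavioral verification, the paper identifies 157 malicious skills, documents 632 distinct vulnerabilities across 13 attack techniques, and finds that attack sophistication correlates with concealment investment.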

Comments Accepted to the 35th USENIX Security Symposium (USENIX Security 2026)

Abstract

LLM-based coding agents increasingly rely on third-party extensions called skills, which bundle natural language instructions and helper scripts that execute with full user privileges. Community registries have emerged to distribute these skills, but the security implications remain unstudied due to the absence of labeled threat data. This paper presents a systematic security analysis of 98,380 skills collected from two major registries. Through a combination of static pattern matching and dynamic behavioral verification, we identify 157 skills exhibiting confirmed malicious behavior, encompassing 632 distinct vulnerabilities across 13 attack techniques. Our analysis reveals that these threats are deliberate rather than accidental: each malicious skill contains an average of 4.03 vulnerabilities spanning multiple attack phases. We identify two dominant attack strategies with statistically significant negative correlation -- credential theft via remote code execution, and agent manipulation through adversarial instructions embedded in documentation. Over half of all confirmed cases originate from a single threat actor employing templated brand impersonation at scale. We further observe that attack sophistication correlates with concealment investment, with advanced skills universally employing undocumented capabilities while also exploiting platform-native trust mechanisms. Following responsible disclosure, registry maintainers removed all 157 (100%) of the reported skills. Our dataset and detection pipeline are publicly available to facilitate future research on securing LLM agent ecosystems.

2603.11678 2026-06-11 eess.AS cs.SD Updated version

RAF: Relativistic Adversarial Feedback For Universal Speech Synthesis

Yongjoon Lee, Jung-Woo Choi

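Affiliations: Korea Advanced Institute of Science and Technology (KAIST)

AI summary: Proposes Relativistic Adversarial Feedback (RAF), a training objective that uses speech self-supervised models and relativistic pairing to improve the in-domain fidelity and generalization of GAN vocoders, outperforming the LSGAN-trained BigVGAN with 88% fewer parameters.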

Comments Accepted to Interspeech 2026 Long paper track. Code: https://github.com/infected4098/Relativistic-Adversarial-Feedback

Abstract

We propose Relativistic Adversarial Feedback (RAF), a novel training objective for GAN vocoders that improves in-domain fidelity and generalization to unseen scenarios. Although modern GAN vocoders employ advanced architectures, their training objectives often fail to promote generalizable representations. RAF addresses this problem by leveraging speech self-supervised learning models to assist discriminators in evaluating sample quality, encouraging the generator to learn richer representations. Furthermore, we utilize relativistic pairing for real and fake waveforms to improve the modeling of the training data distribution. Experiments across multiple datasets show consistent gains in both objective and subjective metrics on GAN-based vocoders. Importantly, the RAF-trained BigVGAN-base outperforms the LSGAN-trained BigVGAN in perceptual quality using only 12% of the parameters. Comparative studies further confirm the effectiveness of RAF as a training framework for GAN vocoders.
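
To make "relativistic pairing" concrete, the sketch below implements the generic relativistic paired GAN loss, in which the critic scores a real sample relative to a fake one instead of scoring each in isolation. This is an assumption about the general technique only: the exact RAF objective, its use of self-supervised speech features, and the names `relativistic_d_loss` and `relativistic_g_loss` are not taken from the paper.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def relativistic_d_loss(real_scores, fake_scores):
    """Critic loss: mean of -log sigmoid(C(real) - C(fake)) over pairs."""
    pairs = list(zip(real_scores, fake_scores))
    return sum(softplus(-(r - f)) for r, f in pairs) / len(pairs)


def relativistic_g_loss(real_scores, fake_scores):
    """Generator loss: mean of -log sigmoid(C(fake) - C(real)) over pairs."""
    pairs = list(zip(real_scores, fake_scores))
    return sum(softplus(-(f - r)) for r, f in pairs) / len(pairs)


# Hypothetical critic outputs for three real/fake waveform pairs.
real = [2.0, 1.5, 0.3]
fake = [-1.0, 0.5, 0.4]
print("critic loss:", relativistic_d_loss(real, fake))
print("generator loss:", relativistic_g_loss(real, fake))
```

Only the difference between the two scores enters the loss, so the generator is rewarded for making fakes look more realistic than the paired real sample, not for pushing an absolute score.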

2505.03649 2026-06-11 stat.ML cs.LG math.CO math.PR Updated version

Weighted Random Dot Product Graphs

Bernardo Marenco, Paola Bermolen, Marcelo Fiori, Federico Larroca, Gonzalo Mateos

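Affiliations: Facultad de Ingeniería, Universidad de la República; Dept. of Electrical and Computer Engineering, University of Rochester

AI summary: Proposes the Weighted Random Dot Product Graph (WRDPG) model, in which inner products of nodal latent positions specify the higher-order moments of the edge-weight distribution, and provides statistical guarantees for spectral-embedding estimates together with a generative framework.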

Comments 30 pages, 12 figures, code to generate Figures 3 to 12 available at https://github.com/bmarenco/wrdpg. Updated to match the published version

DOI: 10.1214/26-EJS2538
Journal ref: Electronic Journal of Statistics, 20(1), 2456-2499, 2026

Abstract

Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in other higher-order moments. We derive statistical guarantees for an estimator of the nodes' latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.
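
The point about equal means but different higher-order moments can be illustrated with a toy two-node graph. The sketch below assumes, as a simplified reading of the abstract, that the k-th moment of an edge weight equals the inner product of the two endpoints' k-th latent positions. The indexing convention, the helper names, and the numbers are hypothetical and not taken from the paper.

```python
import math


def inner(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def edge_moments(latent, i, j):
    """Moments of the weight on edge (i, j): the k-th moment is the inner
    product of the two nodes' k-th latent positions (k = 1, 2, ...)."""
    return [inner(per_moment[i], per_moment[j]) for per_moment in latent]


# Hypothetical one-dimensional latent positions for a two-node graph.
# latent[k - 1][i] is node i's latent position for the k-th moment.
constant_weights = [[[1.0], [1.0]],                      # E[W] = 1
                    [[1.0], [1.0]]]                      # E[W^2] = 1
spread_weights = [[[1.0], [1.0]],                        # E[W] = 1
                  [[math.sqrt(2.0)], [math.sqrt(2.0)]]]  # E[W^2] = 2

for name, latent in (("constant", constant_weights), ("spread", spread_weights)):
    mean, second = edge_moments(latent, 0, 1)
    print(name, "mean:", mean, "variance:", second - mean ** 2)
```

The first setting matches a weight that is always 1, and the second matches a weight that is 0 or 2 with equal probability. A model that tracks only the mean cannot tell the two apart, while the second-moment positions can.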

2603.09276 2026-06-11 stat.ML cs.LG Updated version

On Regret Bounds of Thompson Sampling for Bayesian Optimization

Shion Takeno, Shogo Iwazaki

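Affiliations: Nagoya University; MI-6 Ltd.

AI summary: For Gaussian process Thompson sampling (GP-TS), under the assumption that the objective is a GP sample path, the paper derives a regret lower bound, an upper bound on the second moment of cumulative regret, expected lenient regret upper bounds, and an improved cumulative regret upper bound, filling a gap in high-probability regret bounds for GP-TS.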

Comments 43 pages, Accepted to ICML 2026

Abstract

We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on $\delta$, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on $T$.
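
A short worked step shows why a second-moment bound improves the dependence on $\delta$. The derivation below is the standard Markov-inequality argument and is offered as a sketch; the notation $R_T$, $x^{*}$, and $x_t$ is assumed here, and the paper's own constants and proof may differ.

```latex
R_T = \sum_{t=1}^{T} \bigl( f(x^{*}) - f(x_t) \bigr) \ge 0,
\qquad x^{*} \in \arg\max_{x} f(x)

\Pr\!\left( R_T \ge \frac{\mathbb{E}[R_T]}{\delta} \right) \le \delta
\qquad \text{(Markov's inequality applied to } R_T \text{)}

\Pr\!\left( R_T \ge \sqrt{\frac{\mathbb{E}[R_T^{2}]}{\delta}} \right) \le \delta
\qquad \text{(Markov's inequality applied to } R_T^{2} \text{)}
```

An expected-regret bound alone therefore gives a high-probability bound that scales as $1/\delta$, whereas a bound on $\mathbb{E}[R_T^{2}]$ gives one that scales as $1/\sqrt{\delta}$.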

2602.23461 2026-06-11 physics.flu-dyn cs.LG Updated version

Neural ensemble Kalman filter: Data assimilation for compressible flows with shocks

Xu-Hui Zhou, Lorenzo Beronilla, Michael K. Sleeman, Hangchuan Hu, Matthias Morzfeld, Andrew M. Stuart, Tamer A. Zaki

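Affiliations: University of California, San Diego; University of Cambridge

AI summary: Addresses the failure of the ensemble Kalman filter (EnKF) on compressible flows with shocks, which is caused by bimodal forecast distributions. The proposed neural EnKF maps the forecast ensemble to the parameter space of a neural network and assimilates in that space, using physics-informed transfer learning to avoid spurious oscillations and nonphysical features.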

Abstract

Data assimilation (DA) for compressible flows with shocks is challenging because many classical DA methods generate spurious oscillations and nonphysical features near uncertain shocks. We focus here on the ensemble Kalman filter (EnKF). We show that the poor performance of the EnKF may be attributed to the bimodal forecast distribution that can arise in the vicinity of an uncertain shock location; this violates the assumptions underpinning the EnKF, which assume a forecast that is close to Gaussian. To address this issue we introduce the new neural EnKF. The basic idea is to systematically embed neural function approximations within ensemble DA by mapping the forecast ensemble of shocked flows to the parameter space (weights and biases) of a deep neural network (NN) and to subsequently perform DA in that space. The nonlinear mapping encodes sharp and smooth flow features in an ensemble of NN parameters. Neural EnKF updates are therefore well-behaved only if the NN parameters vary smoothly within the neural representation of the forecast ensemble. We show that such a smooth variation of network parameters can be enforced via physics-informed transfer learning, and demonstrate that in so doing the neural EnKF avoids the spurious oscillations and nonphysical features that plague the EnKF. The applicability of the neural EnKF is demonstrated through a series of systematic numerical experiments with the inviscid Burgers' equation, the Sod shock tube, and a two-dimensional blast wave.
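
For reference, the classical update that the abstract criticizes can be sketched in a few lines. This is a minimal perturbed-observation EnKF for a single, directly observed scalar state. The bimodal toy ensemble, the observation value, and the function name `enkf_update` are illustrative assumptions, and the neural EnKF itself is not reproduced here.

```python
import random
import statistics


def enkf_update(forecast, y, obs_var, rng):
    """Stochastic (perturbed-observation) EnKF update for a scalar state
    that is observed directly (observation operator H = 1)."""
    p_f = statistics.variance(forecast)  # sample forecast variance
    gain = p_f / (p_f + obs_var)         # scalar Kalman gain
    return [x + gain * (y + rng.gauss(0.0, obs_var ** 0.5) - x)
            for x in forecast]


rng = random.Random(0)
# Toy bimodal forecast: half the members sit near 0 and half near 1,
# mimicking the state at a fixed point on either side of an uncertain shock.
forecast = ([rng.gauss(0.0, 0.05) for _ in range(100)]
            + [rng.gauss(1.0, 0.05) for _ in range(100)])
analysis = enkf_update(forecast, y=1.0, obs_var=0.04, rng=rng)

print("forecast mean:", statistics.mean(forecast))
print("analysis mean:", statistics.mean(analysis))
```

The gain is a single linear factor computed from the ensemble variance, so members from the mode far from the observation are shifted toward it linearly. That linear, Gaussian-based correction is what the abstract identifies as the source of trouble when the forecast is bimodal.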

2602.19718 2026-06-11 cs.SE cs.AI Updated version

Carbon-Aware Governance Gates: An Architecture for Sustainable GenAI Development

Mateen A. Abbasi, Tommi J. Mikkonen, Petri J. Ihantola, Muhammad Waseem, Pekka Abrahamsson, Niko K. Mäkitalo

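Affiliations: University of Helsinki; Aalto University

AI summary: To address the added carbon footprint of generative AI in software development, proposes the Carbon-Aware Governance Gates architecture, which embeds carbon budgets, energy provenance, and sustainability-aware validation orchestration to reduce environmental impact.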

Comments 5 pages, 1 figure. Preprint version under review

Abstract

The rapid adoption of Generative AI (GenAI) in the software development life cycle (SDLC) increases computational demand, which can raise the carbon footprint of development activities. At the same time, organizations are increasingly embedding governance mechanisms into GenAI-assisted development to support trust, transparency, and accountability. However, these governance mechanisms introduce additional computational workloads, including repeated inference, regeneration cycles, and expanded validation pipelines, increasing energy use and the carbon footprint of GenAI-assisted development. This paper proposes Carbon-Aware Governance Gates (CAGG), an architectural extension that embeds carbon budgets, energy provenance, and sustainability-aware validation orchestration into human-AI governance layers. CAGG comprises three components: (i) an Energy and Carbon Provenance Ledger, (ii) a Carbon Budget Manager, and (iii) a Green Validation Orchestrator, operationalized through governance policies and reusable design patterns.

2602.07840 2026-06-11 cs.IR cs.AI Updated version

SAGE: Scalable AI Governance & Evaluation

Benjamin Le, Xueying Lu, Nick Stern, Wenqiong Liu, Igor Lapchuk, Xiang Li, Baofen Zheng, Kevin Rosenberg, Jiewen Huang, Zhe Zhang, Abraham Cabangbang, Satej Milind Wagle, Jianqiang Shen, Raghavan Muthuregunathan, Abhinav Gupta, Mathew Teoh, Andrew Kirk, Thomas Kwan, Jingwei Wu, Wenjing Zhang

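Affiliations: LinkedIn Corporation

AI summary: Presents the SAGE framework, which uses a bidirectional calibration loop to turn high-quality human product judgment into a scalable evaluation signal. It addresses the governance gap in relevance evaluation for large-scale search systems and supports model iteration and policy oversight at 92× lower cost.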

Abstract

Evaluating relevance in large-scale search systems is fundamentally constrained by the governance gap between nuanced, resource-constrained human oversight and the high-throughput requirements of production systems. While traditional approaches rely on engagement proxies or sparse manual review, these methods often fail to capture the full scope of high-impact relevance failures. We present SAGE (Scalable AI Governance & Evaluation), a framework that operationalizes high-quality human product judgment as a scalable evaluation signal. At the core of SAGE is a bidirectional calibration loop where natural-language Policy, curated Precedent, and an LLM Surrogate Judge co-evolve. SAGE systematically resolves semantic ambiguities and misalignments, transforming subjective relevance judgment into an executable, multi-dimensional rubric with near human-level agreement. To bridge the gap between frontier model reasoning and industrial-scale inference, we apply teacher-student distillation to transfer high-fidelity judgments into compact student surrogates at 92× lower cost. Deployed within LinkedIn Search ecosystems, SAGE guided model iteration through simulation-driven development, distilling policy-aligned models for online serving and enabling rapid offline evaluation. In production, it powered policy oversight that measured ramped model variants and detected regressions invisible to engagement metrics. Collectively, these drove a 0.25% lift in LinkedIn daily active users.


2601.12164 2026-06-11 cs.CY cs.CL 版本更新

The Language You Ask In: Language-Conditioned Ideological Divergence in LLM Analysis of Contested Political Documents

提问的语言：语言条件对LLM分析争议性政治文件时的意识形态分歧的影响

Oleg Smirnov

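Affiliations: Microsoft

AI summary: Using semantically equivalent prompts in Russian and Ukrainian, the study finds that ChatGPT and Claude Opus produce systematically divergent ideological framings when analyzing the same Ukrainian civil society document, and that the size of the divergence depends on the model.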


AI中文摘要

大型语言模型（LLM）越来越多地被部署为跨多语言语境的分析工具，但其输出可能带有由提示语言条件引起的系统性偏差。本研究对LLM生成的乌克兰公民社会文件政治分析进行了实验比较，使用俄语和乌克兰语的语义等价提示，分别对来自不同开发者的两个前沿模型——ChatGPT 5.2和Claude Opus 4.5进行测试。尽管源材料相同且查询结构平行，两个模型沿同一轴线出现分歧：俄语输出倾向于去合法化框架，将公民社会行为者描述为限制民主授权的外部资助精英，而乌克兰语输出则将同一行为者视为民主竞争中的合法利益相关者。然而，这种分歧的程度因模型而异。ChatGPT的俄语输出再现了俄罗斯国家话语的特征词汇；Claude Opus的输出则保持在主流批评语境内，并在两种语言中对其判断进行限定。这些发现表明，仅提示语言就能系统性地改变分析相同内容的同一模型的意识形态取向。这种转变是多语言LLM的一个普遍属性，其严重程度及其与宣传叙事的对齐程度因系统而异。这些影响涉及AI在极化信息环境中的部署、跨语言研究以及多语言社会中的AI治理。

英文摘要

Large language models (LLMs) are increasingly deployed as analytical tools across multilingual contexts, yet their outputs may carry systematic biases conditioned by the language of the prompt. This study presents an experimental comparison of LLM-generated political analyses of a Ukrainian civil society document, using semantically equivalent prompts in Russian and Ukrainian administered to two frontier models from different developers, ChatGPT 5.2 and Claude Opus 4.5. Despite identical source material and parallel query structures, both models diverged along the same axis: Russian-language outputs leaned toward delegitimizing framings, characterizing civil society actors as externally funded elites constraining a democratic mandate, while Ukrainian-language outputs treated the same actors as legitimate stakeholders in democratic contestation. The magnitude of this divergence, however, was model-dependent. ChatGPT's Russian output reproduced vocabulary characteristic of Russian state discourse; Claude Opus's stayed in a mainstream critical idiom and hedged its judgments in both languages. These findings demonstrate that prompt language alone can systematically shift the ideological orientation of an unchanged model analyzing identical content. The shift is a general property of multilingual LLMs whose severity, and whose alignment with propaganda narratives, varies across systems. The implications reach AI deployment in polarized information environments, cross-lingual research, and AI governance in multilingual societies.


2510.07750 2026-06-11 stat.ML cs.LG 版本更新

Calibrating Decision Robustness via Inverse Conformal Risk Control

通过逆保形风险控制校准决策鲁棒性

Wenbin Zhou, Shixiang Zhu


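AI summary: Proposes an inverse conformal risk control framework that gives distribution-free, finite-sample guarantees on miscoverage and regret for robust optimization policies, and traces the Pareto frontier so that decision-makers can calibrate the robustness level to their cost-risk preferences.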


AI中文摘要

鲁棒优化通过针对最坏情况优化来保护决策免受不确定性影响，但其有效性取决于预先指定的鲁棒性水平，该水平通常是临时选择的，导致保护不足或过度保守且成本高昂的解决方案。最近使用保形预测的方法构建了具有有限样本覆盖保证的数据驱动不确定性集，但它们仍然事先固定覆盖目标，并且对选择鲁棒性水平提供的指导很少。我们提出了一个新框架，该框架为任何鲁棒预测-然后优化策略族提供了无分布、有限样本的误覆盖和遗憾保证。我们的方法构建了有效的估计量，这些估计量描绘出误覆盖-遗憾帕累托前沿，使决策者能够根据其成本-风险偏好可靠地评估和校准鲁棒性水平。该框架易于实现，广泛适用于经典优化公式，并实现了更优的有限样本性能。本文提供了一种原则性的数据驱动方法，用于指导鲁棒性选择，并使从业者能够在高风险决策中平衡鲁棒性和保守性。

英文摘要

Robust optimization safeguards decisions against uncertainty by optimizing against worst-case scenarios, yet its effectiveness hinges on a prespecified robustness level that is often chosen ad hoc, leading to either insufficient protection or overly conservative and costly solutions. Recent approaches using conformal prediction construct data-driven uncertainty sets with finite-sample coverage guarantees, but they still fix coverage targets a priori and offer little guidance for selecting robustness levels. We propose a new framework that provides distribution-free, finite-sample guarantees on both miscoverage and regret for any family of robust predict-then-optimize policies. Our method constructs valid estimators that trace out the miscoverage-regret Pareto frontier, enabling decision-makers to reliably evaluate and calibrate robustness levels according to their cost-risk preferences. The framework is simple to implement, broadly applicable across classical optimization formulations, and achieves sharper finite-sample performance. This paper offers a principled data-driven methodology for guiding robustness selection and empowers practitioners to balance robustness and conservativeness in high-stakes decision-making.


2601.21817 2026-06-11 stat.ML cs.LG 版本更新

A Judge-Aware Ranking Framework for Evaluating Large Language Models without Ground Truth

一种面向评委的排名框架：无需真实标签评估大语言模型

Mingyuan Xu, Xinzi Tan, Jiawei Wu, Doudou Zhou

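Affiliations: University of Technology Sydney

AI summary: Proposes a judge-aware ranking framework that extends the Bradley-Terry-Luce model with judge-specific discrimination parameters and jointly estimates latent model quality and judge reliability without reference labels, improving agreement with human preferences and data efficiency and yielding calibrated uncertainty quantification.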


AI中文摘要

评估大语言模型（LLMs）在开放性任务上无需真实标签的评估越来越通过LLM-as-a-judge范式进行。一个关键但未充分建模的问题是，评判LLMs在可靠性上存在显著差异；将所有评委视为同等对待会导致偏见的排行榜和误导性的不确定性估计。更多的数据在不正确的聚合下可能导致评估更加自信地错误。我们提出了一种面向评委的排名框架，通过引入评委特定的辨别参数扩展Bradley-Terry-Luce模型，在不参考标签的情况下联合估计潜在模型质量和评委可靠性。我们建立了可识别性，直到自然归一化，并证明最大似然估计的一致性和渐近正态性，从而能够为分数差异和排名比较生成置信区间。在多个公开基准和一个新收集的数据集上，我们的方法提高了与人类偏好的一致性，比无权基线实现了更高的数据效率，并产生了校准的LLM排名不确定性量化。

英文摘要

Evaluating large language models (LLMs) on open-ended tasks without ground-truth labels is increasingly done via the LLM-as-a-judge paradigm. A critical but under-modeled issue is that judge LLMs differ substantially in reliability; treating all judges equally can yield biased leaderboards and misleading uncertainty estimates. More data can make evaluation more confidently wrong under misspecified aggregation. We propose a judge-aware ranking framework that extends the Bradley-Terry-Luce model by introducing judge-specific discrimination parameters, jointly estimating latent model quality and judge reliability from pairwise comparisons without reference labels. We establish identifiability up to natural normalizations and prove consistency and asymptotic normality of the maximum likelihood estimator, enabling confidence intervals for score differences and rank comparisons. Across multiple public benchmarks and a newly collected dataset, our method improves agreement with human preferences, achieves higher data efficiency than unweighted baselines, and produces calibrated uncertainty quantification for LLM rankings.
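One way to picture the judge-specific discrimination parameter is the minimal sketch below. The abstract does not spell out the likelihood, so the logistic form `sigmoid(gamma_k * (theta_i - theta_j))` and the names `theta`, `gamma`, `win_probability`, and `log_likelihood` are assumptions made for illustration, not the paper's exact model.

```python
import math


def win_probability(theta_i, theta_j, gamma_k):
    """P(model i beats model j according to judge k).

    Assumed form: a Bradley-Terry-Luce logistic link whose slope is the
    judge's discrimination parameter gamma_k (hypothetical parameterization).
    """
    return 1.0 / (1.0 + math.exp(-gamma_k * (theta_i - theta_j)))


def log_likelihood(comparisons, theta, gamma):
    """Sum of log-probabilities of the observed pairwise outcomes.

    comparisons: iterable of (judge, winner, loser) index triples
    theta:       latent quality score per model
    gamma:       discrimination (reliability) per judge
    """
    total = 0.0
    for judge, winner, loser in comparisons:
        total += math.log(win_probability(theta[winner], theta[loser], gamma[judge]))
    return total
```

With `gamma_k = 0` a judge's verdicts carry no information (every comparison is a coin flip), while a larger `gamma_k` makes the judge track the latent quality gap more closely. Fitting would maximize `log_likelihood` under a normalization of `theta` and `gamma`, which this sketch leaves out.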


2601.14031 2026-06-11 stat.ML cs.LG 版本更新

Intermittent time series forecasting: local vs global models

间歇性时间序列预测：局部模型与全局模型

Stefano Damato, Nicolò Rubattu, Dario Azzimonti, Giorgio Corani

Affiliations: Supplementary Institute of Science and Technology

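AI summary: For intermittent time series forecasting, presents the first systematic comparison of probabilistic local models and global models (such as TiDE), finding that TiDE, a simple neural network architecture, beats local models in both accuracy and computational cost, and that the Tweedie distribution head gives the best estimates of the highest quantiles.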

Comments Submitted to the Journal of the Operational Research Society


AI中文摘要

预测包含零值的间歇性时间序列是供应链中的一个关键挑战，因为库存策略需要概率预测来建立安全水平。间歇性时间序列通常使用局部模型进行预测，即对每个时间序列单独训练。近年来，基于大量时间序列训练的全局模型在时间序列预测中变得流行。全局模型通常基于神经网络或梯度提升树。我们进行了首次研究，比较了最先进的概率性局部模型和全局模型在间歇性时间序列上的表现。对于全局模型，我们考虑了三种适用于间歇性时间序列的不同分布头：负二项、障碍移位负二项和Tweedie。据我们所知，这是后两者首次与神经网络结合使用。我们在五个数据集上进行了实验，这些数据集总共包含超过40,000个真实世界的时间序列。在全局模型中，TiDE（一种简单的神经网络架构）取得了最佳精度；它还持续优于局部模型，并且计算需求更低。大型全局模型反而计算需求更高且精度更低。在分布头中，Tweedie提供了最高分位数的最佳估计。

英文摘要

Forecasting intermittent time series, which contain zeros, is a crucial challenge in supply chains as inventory policies require probabilistic forecasts to establish safety levels. Intermittent time series are commonly forecast using local models, trained individually on each time series. In the last years global models, trained on a large collection of time series, have become popular for time series forecasting. Global models are often based on neural networks or gradient boosted trees. We carry out the first study comparing state-of-the-art probabilistic local and global models on intermittent time series. For global models we consider three different distribution heads suitable for intermittent time series: negative binomial, hurdle-shifted negative binomial and Tweedie. To the best of our knowledge, this is the first use of the latter two with neural networks. We perform experiments on five datasets comprising overall more than 40'000 real-world time series. Among global models, TiDE, a simple neural network architecture, achieves the best accuracy; it also consistently outperforms local models and has lower computational requirements. Large global models are instead much more computationally demanding and less accurate. Among the distribution heads, the Tweedie provides the best estimates of the highest quantiles.


2505.00571 2026-06-11 stat.ML cs.LG 版本更新

Discovery and inference beyond linearity for epidemiological data by integrating Bayesian regression, tree ensembles and Shapley values

通过整合贝叶斯回归、树集成和Shapley值对流行病学数据进行线性之外的发现与推断

Giorgio Spadaccini, Marjolein Fokkema, Mark A. van de Wiel

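Affiliations: Leiden University; Amsterdam UMC

AI summary: Proposes RuleSHAP, a framework that combines Bayesian sparse regression, an improved tree-based rule generator, and Shapley values to detect nonlinear and interaction effects with individual-level uncertainty quantification, and applies it to epidemiological data to identify factors associated with high cholesterol and blood pressure.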


AI中文摘要

机器学习在流行病学和医疗健康研究中越来越受欢迎，用于无假设地发现风险和保护因素。机器学习在发现非线性和交互作用方面很强，但这种能力因缺乏可靠的推断而受损。尽管Shapley值提供了特征效应的局部度量，但这些效应通常缺乏有效的不确定性量化，从而排除了统计推断。我们提出RuleSHAP，一个通过结合专用贝叶斯稀疏回归模型、改进的基于树的规则生成器和Shapley值归因来解决这一局限性的框架。RuleSHAP能够检测非线性和交互效应，其关键贡献在于个体水平的不确定性量化。我们推导了一个在该框架内计算边际Shapley值的有效公式。我们将RuleSHAP应用于一个流行病学队列的数据，以检测和推断高胆固醇和血压的几种效应，例如年龄、性别、种族、BMI和血糖水平等特征之间的非线性交互效应。最后，我们在模拟数据上证明了我们框架的有效性。

英文摘要

Machine Learning (ML) is gaining popularity in epidemiology and healthcare studies for hypothesis-free discovery of risk and protective factors. ML is strong at discovering nonlinearities and interactions, but this power is compromised by a lack of reliable inference. Although Shapley values provide local measures of features' effects, valid uncertainty quantification for these effects is typically lacking, thus precluding statistical inference. We propose RuleSHAP, a framework that addresses this limitation by combining a dedicated Bayesian sparse regression model with an improved tree-based rule generator and Shapley value attribution. RuleSHAP provides detection of nonlinear and interaction effects, with uncertainty quantification at the individual level as a key contribution. We derive an efficient formula for computing marginal Shapley values within this framework. We apply RuleSHAP to data from an epidemiological cohort to detect and infer several effects for high cholesterol and blood pressure, such as nonlinear interaction effects between features like age, sex, ethnicity, BMI and glucose level. To conclude, we demonstrate the validity of our framework on simulated data.


2512.22219 2026-06-11 cs.DC cs.LG cs.PL 版本更新

MPK: A Compiler and Runtime for Mega-Kernelizing Tensor Programs

MPK：一种用于将张量程序转化为巨型内核的编译器和运行时系统

Xinhao Cheng, Zhihao Zhang, Yu Zhou, Jianan Ji, Jinchen Jiang, Zepeng Zhao, Ziruo Xiao, Zihao Ye, Yingyi Huang, Ruihang Lai, Hongyi Jin, Bohan Hou, Mengdi Wu, Yixin Dong, Anthony Yip, Zihao Ye, Songting Wang, Wenqin Yang, Xupeng Miao, Tianqi Chen, Zhihao Jia

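Affiliations: Carnegie Mellon University; Tsinghua University; NVIDIA; University of Michigan; Independent Researcher; Peking University

AI summary: Presents MPK, the first compiler and runtime system that automatically transforms multi-GPU model inference into a single high-performance mega-kernel; its SM-level graph representation enables cross-operator software pipelining and fine-grained overlap of computation and communication, significantly reducing inference latency.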

Comments 14 pages


AI中文摘要

我们介绍了Mirage Persistent Kernel (MPK)，这是首个自动将多GPU模型推理转化为单个高性能巨型内核的编译器和运行时系统。MPK引入了一种SM级图表示，该表示在单个流式多处理器（SM）的粒度上捕获数据依赖关系，从而实现跨算子软件流水线、计算与通信的细粒度重叠，以及在传统每算子内核执行模型下不可行的其他优化。MPK编译器将张量程序降级为优化的SM级任务图，并为每个任务生成快速的CUDA实现，而MPK内核内并行运行时则通过跨SM的分散调度在单个持久巨型内核内执行这些任务。这些组件共同提供了端到端的内核融合，且开发工作量极小，同时保留了现有编程模型的灵活性。我们的评估表明，MPK显著优于现有的每算子内核LLM服务系统，实现了高达1.7倍的端到端推理延迟降低，并将LLM推理性能推近底层硬件的极限。MPK已在 https://github.com/mirage-project/mirage 公开发布。

英文摘要

We introduce Mirage Persistent Kernel (MPK), the first compiler and runtime system that automatically transforms multi-GPU model inference into a single high-performance mega-kernel. MPK introduces an SM-level graph representation that captures data dependencies at the granularity of individual streaming multiprocessors (SMs), enabling cross-operator software pipelining, fine-grained overlap of computation and communication, and other optimizations that are infeasible under the conventional kernel-per-operator execution model. The MPK compiler lowers tensor programs into optimized SM-level task graphs and generates fast CUDA implementations for each task, while the MPK in-kernel parallel runtime executes these tasks within a single persistent mega-kernel using decentralized scheduling across SMs. Together, these components provide end-to-end kernel fusion with minimal developer effort, while preserving the flexibility of existing programming models. Our evaluation shows that MPK significantly outperforms existing kernel-per-operator LLM serving systems, achieving up to 1.7× lower end-to-end inference latency and pushing LLM inference performance close to the limits of the underlying hardware. MPK is publicly available at https://github.com/mirage-project/mirage.


2512.19245 2026-06-11 eess.SY cs.RO cs.SY 版本更新

Vision-Aided Relative State Estimation for Approach and Landing on a Moving Platform with Inertial Measurements

基于视觉辅助的相对状态估计用于移动平台进近与着陆的惯性测量

Tarek Bouazza, Alessandro Melis, Soulaimane Berkane, Robert Mahony, Tarek Hamel

Affiliations: I3S, CNRS, Université Côte d’Azur; Département d’informatique et d’ingénierie, Université du Québec en Outaouais, and Department of Electrical Engineering, Lakehead University; Systems Theory and Robotics Group, Australian National University; Institut Universitaire de France (IUF)

AI summary: Proposes a cascade observer that combines a complementary filter on SO(3) with a linear Riccati observer, using IMUs and a monocular camera to estimate the relative pose and velocity between a UAV and a moving platform, with almost global asymptotic stability under persistent excitation.

Comments 13 pages, 4 figures. To appear in proceedings of IFAC World Congress 2026


AI中文摘要

本文解决了在进近和着陆过程中，无人机与经历任意三维运动的平面平台之间的相对位置、姿态和速度的估计问题。该估计依赖于安装在两个系统上的惯性测量单元（IMU）的测量值，假设存在合适的通信信道来交换数据，以及由机载单目相机提供的视觉信息，从中提取平台中心的方位（视线方向）和其平面表面的法向量。我们提出了一种级联观测器，在$\mathbf{SO}(3)$上采用互补滤波器来重构相对姿态，随后使用线性Riccati观测器进行相对位置和速度估计。在持续激励条件下建立了两个观测器的收敛性，并证明了级联是几乎全局渐近和局部指数稳定的。我们进一步将设计扩展到平台旋转限制在其法向轴的情况，并表明可以利用其测量的线性加速度来恢复剩余不可观测的旋转角。提供了该情况下局部指数收敛的充分条件。通过大量仿真验证了所提出的观测器。

英文摘要

This paper tackles the problem of estimating the relative position, orientation, and velocity between a UAV and a planar platform undergoing arbitrary 3D motion during approach and landing. The estimation relies on measurements from Inertial Measurement Units (IMUs) mounted on both systems, assuming there is a suitable communication channel to exchange data, together with visual information provided by an onboard monocular camera, from which the bearing (line-of-sight direction) to the platform's center and the normal vector of its planar surface are extracted. We propose a cascade observer with a complementary filter on $\mathbf{SO}(3)$ to reconstruct the relative attitude, followed by a linear Riccati observer for relative position and velocity estimation. Convergence of both observers is established under persistently exciting conditions, and the cascade is shown to be almost globally asymptotically and locally exponentially stable. We further extend the design to the case where the platform's rotation is restricted to its normal axis and show that its measured linear acceleration can be exploited to recover the remaining unobservable rotation angle. A sufficient condition for local exponential convergence in this setting is provided. The proposed observers are validated through extensive simulations.


2512.13765 2026-06-11 eess.IV cs.AI cs.LG 版本更新

Towards Deep Learning Surrogate for the Forward Problem in Electrocardiology: A Scalable Alternative to Physics-Based Models

面向心电学正问题的深度学习代理模型：一种可扩展的物理模型替代方案

Shaheim Ogbomo-Harmitt, Cesare Magnetti, Chiara Spota, Jakub Grzelak, Oleg Aslanidi

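Affiliations: School of Biomedical Engineering and Imaging Sciences, King’s College London; PhysicsX

AI summary: Proposes an attention-based sequence-to-sequence deep learning framework as a surrogate for the forward problem in electrocardiology, predicting ECG signals from cardiac voltage propagation maps; it reaches high accuracy on 2D tissue simulations (mean R² = 0.99 ± 0.01) and offers a scalable, low-cost alternative to physics-based models.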

Comments Accepted to CinC conference 2025


AI中文摘要

心电学中的正问题，即从心脏电活动计算体表电位，传统上使用基于物理的模型（如双域或单域方程）求解。虽然准确，但这些方法计算成本高，限制了其在实时和大规模临床中的应用。我们提出一个概念验证的深度学习（DL）框架，作为正问题求解器的高效代理。该模型采用基于时间依赖注意力机制的序列到序列架构，从心脏电压传播图预测心电图（ECG）信号。引入了一种混合损失函数，结合Huber损失和谱熵项，以保持时域和频域的保真度。使用包含健康、纤维化和缝隙连接重塑条件的2D组织模拟，模型实现了高精度（平均$R^2 = 0.99 \pm 0.01$）。消融研究证实了卷积编码器、时间感知注意力和谱熵损失的贡献。这些发现突显了DL作为物理求解器的可扩展、低成本替代方案的潜力，适用于临床和数字孪生应用。

英文摘要

The forward problem in electrocardiology, computing body surface potentials from cardiac electrical activity, is traditionally solved using physics-based models such as the bidomain or monodomain equations. While accurate, these approaches are computationally expensive, limiting their use in real-time and large-scale clinical applications. We propose a proof-of-concept deep learning (DL) framework as an efficient surrogate for forward solvers. The model adopts a time-dependent, attention-based sequence-to-sequence architecture to predict electrocardiogram (ECG) signals from cardiac voltage propagation maps. A hybrid loss combining Huber loss with a spectral entropy term was introduced to preserve both temporal and frequency-domain fidelity. Using 2D tissue simulations incorporating healthy, fibrotic, and gap junction-remodelled conditions, the model achieved high accuracy (mean $R^2 = 0.99 \pm 0.01$). Ablation studies confirmed the contributions of convolutional encoders, time-aware attention, and spectral entropy loss. These findings highlight DL as a scalable, cost-effective alternative to physics-based solvers, with potential for clinical and digital twin applications.


2512.13666 2026-06-11 cs.CR cs.DC cs.IT cs.LG math.IT 版本更新

SEDULity: A Proof-of-Learning Framework for Distributed and Secure Blockchains with Efficient Useful Work

SEDULity：一种面向分布式安全区块链的高效有用工作证明学习框架

Weihang Cao, Mustafa Doger, Sennur Ulukus

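Affiliations: Department of Electrical and Computer Engineering

AI summary: Proposes SEDULity, a Proof-of-Learning framework that encodes the block template into the training process and replaces the PoW puzzle with a useful function that is difficult to solve but relatively easy to verify, training machine learning models efficiently while preserving blockchain security.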


AI中文摘要

工作量证明（PoW）的安全性和去中心化已在现有区块链系统中得到充分验证，但其巨大的能源浪费引发了可持续性担忧。有用工作证明（PoUW）旨在将无意义的计算重定向到有意义任务（如解决机器学习问题），从而催生了学习证明（PoL）分支。尽管已有研究提出了多种PoL，但它们都在一定程度上存在安全性、去中心化或效率问题。本文提出一种PoL框架，在完全分布式环境中高效训练机器学习模型，同时维护区块链安全性。我们将该框架命名为SEDULity，代表安全、高效、分布式和有用的基于学习的区块链系统。具体而言，我们将区块模板编码到训练过程中，并设计一种难解但相对易验的有用函数，作为PoW谜题的替代。我们证明该框架是分布式、安全的，并能高效训练机器学习模型。进一步展示所提出的PoL框架可扩展到其他类型的有用工作，并设计激励机制以激励任务验证。理论上证明，在精心设计的系统参数下，理性矿工有动机完全诚实地进行训练。最后，通过仿真结果展示框架性能并验证分析。

英文摘要

The security and decentralization of Proof-of-Work (PoW) have been well-tested in existing blockchain systems. However, its tremendous energy waste has raised concerns about sustainability. Proof-of-Useful-Work (PoUW) aims to redirect the meaningless computation to meaningful tasks such as solving machine learning (ML) problems, giving rise to the branch of Proof-of-Learning (PoL). While previous studies have proposed various PoLs, they all, to some degree, suffer from security, decentralization, or efficiency issues. In this paper, we propose a PoL framework that trains ML models efficiently while maintaining blockchain security in a fully distributed manner. We name the framework SEDULity, which stands for a Secure, Efficient, Distributed, and Useful Learning-based blockchain system. Specifically, we encode the template block into the training process and design a useful function that is difficult to solve but relatively easy to verify, as a substitute for the PoW puzzle. We show that our framework is distributed, secure, and efficiently trains ML models. We further demonstrate that the proposed PoL framework can be extended to other types of useful work and design an incentive mechanism to incentivize task verification. We show theoretically that a rational miner is incentivized to train fully honestly with well-designed system parameters. Finally, we present simulation results to demonstrate the performance of our framework and validate our analysis.


2512.11982 2026-06-11 astro-ph.IM cs.AI cs.CV cs.LG 版本更新

Semantic search for 100M+ galaxy images using AI-generated captions

基于AI生成描述的1亿+星系图像语义搜索

Nolan Koblischke, Liam Parker, Francois Lanusse, Jo Bovy, Irina Espejo, Shirley Ho

Affiliations: New York University; University of Toronto; Dunlap Institute for Astronomy & Astrophysics; University of California, Berkeley; Center for Data Science; Lawrence Berkeley National Lab; Flatiron Institute; Université Paris-Saclay; CEA; CNRS; AIM; Princeton University

AI summary: Uses vision-language models to generate descriptions of galaxy images and contrastively aligns a pre-trained astronomy foundation model with them to build searchable embeddings, enabling semantic search over galaxy images at scale with state-of-the-art performance at finding rare phenomena.

Comments ApJ, in press


AI中文摘要

通过缓慢的手动标注活动寻找科学上有趣的现象严重限制了我们对望远镜产生的数十亿星系图像的探索能力。在这项工作中，我们开发了一个流水线，从完全未标记的图像数据创建语义搜索引擎。我们的方法利用视觉语言模型（VLM）为星系图像生成描述，然后将预训练的天文学基础模型与这些嵌入的描述进行对比对齐，以产生大规模可搜索的嵌入。我们发现当前的VLM提供的描述信息足够丰富，可以训练一个语义搜索模型，该模型优于直接图像相似性搜索。我们的模型AION-Search在寻找稀有现象方面实现了最先进的零样本性能，尽管训练是在随机选择的图像上进行的，没有针对稀有情况进行刻意策划。此外，我们引入了一种基于VLM的重排序方法，该方法在top-100结果中对我们最具挑战性的目标的召回率几乎翻倍。首次，AION-Search实现了对超过1亿张星系图像的灵活语义搜索，使得从以前不可行的搜索中能够发现新现象，包括识别出36个新的河外恒星流候选体。更广泛地说，我们的工作提供了一种方法，使大型、未标记的科学图像档案变得可语义搜索，扩展了从地球观测到显微镜等领域的数据探索能力。代码、数据和应用程序可在以下网址公开获取：https://github.com/NolanKoblischke/AION-Search

英文摘要

Finding scientifically interesting phenomena through slow manual labeling campaigns severely limits our ability to explore the billions of galaxy images produced by telescopes. In this work, we develop a pipeline to create a semantic search engine from completely unlabeled image data. Our method leverages Vision-Language Models (VLMs) to generate descriptions for galaxy images, then contrastively aligns a pre-trained astronomy foundation model with these embedded descriptions to produce searchable embeddings at scale. We find that current VLMs provide descriptions that are sufficiently informative to train a semantic search model that outperforms direct image similarity search. Our model, AION-Search, achieves state-of-the-art zero-shot performance on finding rare phenomena despite training on randomly selected images with no deliberate curation for rare cases. Furthermore, we introduce a VLM-based re-ranking method that nearly doubles the recall for our most challenging targets in the top-100 results. For the first time, AION-Search enables flexible semantic search for over 100 million galaxy images, enabling discovery from previously infeasible searches, including the identification of 36 new extragalactic stellar stream candidates. More broadly, our work provides an approach for making large, unlabeled scientific image archives semantically searchable, expanding data exploration capabilities in fields from Earth observation to microscopy. The code, data, and app are publicly available at https://github.com/NolanKoblischke/AION-Search
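Once text descriptions and images live in the same embedding space, a query reduces to a nearest-neighbor lookup. The following is a minimal sketch of that lookup using cosine similarity; the function names, the dictionary layout, and the toy vectors are hypothetical, and the abstract does not say which similarity measure or index AION-Search actually uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_embedding, image_embeddings, k=3):
    """Return the ids of the k images whose embeddings are most similar to the query.

    image_embeddings: dict mapping an image id to its embedding vector.
    """
    ranked = sorted(
        image_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:k]]


# Toy 2-D embeddings standing in for real image embeddings.
toy_archive = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
nearest = top_k([1.0, 0.1], toy_archive, k=2)
```

Sorting the whole archive is only workable for small collections; at the scale of 100 million images an approximate nearest-neighbor index would presumably replace the full sort.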


2512.11081 2026-06-11 stat.ML cs.LG stat.ME 版本更新

Provable Recovery of Locally Important Signed Features and Interactions from Random Forest

从随机森林中可证明地恢复局部重要符号特征和交互

Kata Vuk, Nicolas Alexander Ihlo, Merle Behr

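Affiliations: Faculty of Informatics and Data Science, University of Regensburg, Germany

AI summary: Proposes a local, model-specific feature and interaction importance method that combines global and local decision-path patterns; under a Locally Spike Sparse model it provably recovers the true signal features and their interactions and identifies whether large or small feature values drive a prediction.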


AI中文摘要

特征与交互重要性（FII）方法在监督学习中至关重要，用于评估复杂预测模型中输入变量及其交互的相关性。在许多领域，如个性化医疗，通常需要针对单个预测的局部解释，而不是总结整体特征重要性的全局分数。随机森林（RF）在这些场景中被广泛使用，现有的可解释性方法通常利用树结构和分裂统计量来提供模型特定的见解。然而，对RF的局部FII方法的理论理解仍然有限，这使得如何解释单个预测的高重要性分数变得不明确。我们提出了一种新颖的、局部的、模型特定的FII方法，该方法识别特征在决策路径上的频繁共现，将全局模式与特定测试点路径上的模式相结合。我们证明，在局部尖峰稀疏（LSS）模型下，我们的方法一致地恢复真实的局部信号特征及其交互，并识别出大或小的特征值是否驱动预测。通过模拟研究和真实数据示例，我们展示了我们的方法和理论结果的有用性。

英文摘要

Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many domains, such as personalized medicine, local interpretations for individual predictions are often required, rather than global scores summarizing overall feature importance. Random Forests (RFs) are widely used in these settings, and existing interpretability methods typically exploit tree structures and split statistics to provide model-specific insights. However, theoretical understanding of local FII methods for RF remains limited, making it unclear how to interpret high importance scores for individual predictions. We propose a novel, local, model-specific FII method that identifies frequent co-occurrences of features along decision paths, combining global patterns with those observed on paths specific to a given test point. We prove that our method consistently recovers the true local signal features and their interactions under a Locally Spike Sparse (LSS) model and also identifies whether large or small feature values drive a prediction. We illustrate the usefulness of our method and theoretical results through simulation studies and a real-world data example.


2512.03077 2026-06-11 cs.CY cs.AI 版本更新

Irresponsible AI: big tech's influence on AI research and associated impacts

不负责任的人工智能：大型科技公司对AI研究的影响及相关影响

Alex Hernandez-Garcia, Alexandra Volokhova, Ezekiel Williams, Dounia Shaaban Kabakibo, Mélisande Teng


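AI summary: Argues that big tech's disproportionate influence on AI research drives irresponsible AI development and intensifies negative environmental and societal impacts, and calls on researchers to counter it through collective action.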

Comments Presented as a spotlight oral at the International Conference on Machine Learning 2026 (Position Paper Track). First version presented at NeurIPS 2025 Workshop on Algorithmic Collective Action


AI中文摘要

人工智能系统的加速开发、部署和采纳得益于大型科技公司在AI领域的日益深入。这一趋势伴随着日益增长的伦理关切以及加剧的社会和环境影响。本文立场认为，不负责任的AI发展在很大程度上是由大型科技公司在该领域的影响和参与所驱动的。首先，我们审视了大型科技公司在AI研究中日益增长且不成比例的影响，并认为其对规模化和通用系统的追求从根本上与负责任、合乎伦理和可持续的AI发展相悖。其次，我们回顾了当前AI的主要负面环境和社会影响，并追溯其与大型科技公司影响的联系。第三，我们讨论了推动大型科技公司行动的基本经济力量。最后，作为行动号召，我们邀请AI研究者通过基于相关行为者责任和集体行动的策略，来对抗大型科技公司对不负责任AI发展的影响。

英文摘要

The accelerated development, deployment and adoption of artificial intelligence systems has been fuelled by the increasing presence of big tech in the AI field. This trend has been accompanied by growing ethical concerns and intensified societal and environmental impacts. This position paper argues that irresponsible AI development is strongly driven by big tech's influence and involvement in the field. First, we examine the growing and disproportionate influence of big tech in AI research and argue that its drive for scaling and general-purpose systems is fundamentally at odds with the responsible, ethical, and sustainable development of AI. Second, we review key current environmental and societal negative impacts of AI and trace their connections to big tech's influence. Third, we discuss the underlying economic forces driving big tech's actions. Finally, as a call to action, we invite AI researchers to counter big tech's influence in irresponsible AI development through strategies that build on the responsibility of implicated actors and collective action.


2411.12193 2026-06-11 stat.AP cs.LG stat.ML 版本更新

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

分布式能源采纳的分层概率保形预测

Wenbin Zhou, Shixiang Zhu

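Affiliations: Carnegie Mellon University

AI summary: To handle uncertainty and the hierarchical grid structure in forecasting distributed energy resource adoption, proposes an uncertainty quantification framework based on a multivariate Hawkes process and split conformal prediction that stays statistically valid under aggregation and outperforms baselines on Indianapolis data.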


AI中文摘要

分布式能源（DERs）的快速增长为电网管理带来了机遇和运营挑战。准确预测DER采纳对于主动基础设施规划至关重要，但DER增长的固有不确定性和空间差异使传统预测方法复杂化。此外，配电网的分层结构要求预测在电路和变电站层面均满足统计保证，这是可靠决策的非平凡要求。本文提出了一种新的DER采纳预测不确定性量化框架，确保在分层电网结构中的有效性。利用多元霍克斯过程建模DER采纳动态，并采用定制的分裂保形预测算法，我们引入了一种新的非一致性分数，在保持预测效率的同时，在聚合下保留统计保证。我们在温和条件下建立了理论有效性，并通过印第安纳州印第安纳波利斯的客户级太阳能电池板安装数据实证评估，表明我们的方法在预测准确性和不确定性校准方面始终优于现有基线。

英文摘要

The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.
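For readers unfamiliar with split conformal prediction, the sketch below shows the textbook version with an absolute-residual nonconformity score. It is not the paper's tailored score and does not include the Hawkes process model or the hierarchical aggregation; the function names and the choice of score are assumptions made for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the calibration set is too small for the
    requested level and the only valid threshold is infinity.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(cal_preds, cal_targets, new_pred, alpha=0.1):
    """Prediction interval for a new point from held-out calibration residuals."""
    # Nonconformity score: absolute residual on the calibration split.
    scores = [abs(y - p) for p, y in zip(cal_preds, cal_targets)]
    q = conformal_quantile(scores, alpha)
    return new_pred - q, new_pred + q
```

Under exchangeability of calibration and test points, an interval built this way covers the true value with probability at least `1 - alpha`. Per the abstract, the paper's contribution is a score for which this kind of guarantee survives aggregation from circuits to substations.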


2510.23320 2026-06-11 eess.AS cs.CL cs.SD 版本更新

LibriConvo: Simulating Conversations from Read Literature for ASR and Diarization

LibriConvo：从阅读文献模拟对话用于ASR和说话人日志

Máté Gedeon, Péter Mihajlik

Affiliations: Department of Telecommunications and Artificial Intelligence, Budapest University of Technology and Economics; Speechtex Ltd.; ELTE Research Centre for Linguistics

AI summary: Presents LibriConvo, a synthetic conversational speech corpus built with the Speaker-Aware Simulated Conversation framework for benchmarking speaker diarization and ASR; it contains 240.1 hours of audio, and baseline experiments show Sortformer outperforming pyannote in diarization and Fast Conformer-CTC outperforming Whisper in ASR.

Comments Accepted by TSD 2026


AI中文摘要

我们介绍了LibriConvo，一个用于说话人日志和自动语音识别（ASR）的合成对话语音语料库，通过在数据集和基准测试设置中实例化先前提出的说话人感知模拟对话（SASC）框架构建而成。本文的主要贡献是基于该框架的语料库构建流程和基准测试。为了使数据更适合下游ASR和说话人日志，我们使用外部语音活动检测从英语CallHome估计对话时间统计信息，压缩长停顿，按书籍分组LibriTTS话语以改善局部语义连续性，并通过空间合理性启发式选择房间脉冲响应。生成的语料库包含240.1小时的音频，涉及830个说话人的1496个对话，划分为说话人不重叠的训练、验证和测试集。我们报告了说话人日志和ASR的基线结果。在测试集上，Sortformer在说话人日志中优于pyannote流水线（DER 11.1%对比24.4%）。对于ASR，使用序列化输出训练微调的Fast Conformer-CTC XLarge模型实现了7.29%的WER和6.97%的cpWER，优于零样本Whisper-large-v3。这些结果使LibriConvo成为研究合成对话语音和评估多说话人语音处理系统的实用基准。

英文摘要

We introduce LibriConvo, a synthetic conversational speech corpus for speaker diarization and automatic speech recognition (ASR), built by instantiating the previously proposed Speaker-Aware Simulated Conversation (SASC) framework in a dataset and benchmarking setting. The main contribution of this paper is a corpus construction pipeline and benchmark derived from that framework. To make the data more suitable for downstream ASR and diarization, conversational timing statistics are estimated from English CallHome using external voice activity detection, long pauses are compressed, LibriTTS utterances are grouped by book to improve local semantic continuity, and room impulse responses are selected with a spatial-plausibility heuristic. The resulting corpus contains 240.1 hours of audio across 1,496 dialogues involving 830 speakers, partitioned into speaker-disjoint train, validation, and test splits. We report baseline results for both diarization and ASR. On the test split, Sortformer outperforms the pyannote pipeline in diarization (11.1% vs. 24.4% DER). For ASR, a Fast Conformer-CTC XLarge model fine-tuned with Serialized Output Training achieves 7.29% WER and 6.97% cpWER, outperforming zero-shot Whisper-large-v3. These results position LibriConvo as a practical benchmark for studying synthetic conversational speech and for evaluating multi-speaker speech processing systems.
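The WER figures above follow the standard definition: the number of word substitutions, deletions, and insertions needed to turn the reference into the hypothesis, divided by the reference length. Below is a minimal sketch of that computation; the function name is illustrative, text normalization is reduced to whitespace splitting, and cpWER and DER, which also require speaker handling, are not covered.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.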


2510.22397 2026-06-11 cs.NI cs.LG 版本更新

NetBurst: Event-Centric Forecasting of Bursty, Intermittent Time Series

NetBurst: 以事件为中心的突发间歇性时间序列预测

Satyandra Guthula, Jaber Daneshamooz, Charles Fleming, Kesheng Wu, Walter Willinger, Arpit Gupta

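Affiliations: University of California, Santa Barbara; Cisco Research; Lawrence Berkeley National Laboratory; Northwestern University

AI summary: Targeting the "wild" statistical regime of network telemetry, with rare bursts separated by long low-activity intervals, proposes NetBurst, an event-centric pipeline that collapses low-activity periods, separates burst timing and magnitude streams, and learns a unified representation; it substantially outperforms baselines such as Chronos-2 in forecasting error, burst-distribution matching, and anomaly describability.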


English abstract

Network operators monitor their infrastructure by collecting telemetry data such as packet counts, byte rates, or flow volumes, yet answering the questions that effective operations demand (forecasting future load, diagnosing and characterizing anomalies, and searching for and retrieving historical precedents) requires more than raw measurements. Bridging this gap calls for learned representations: compact per-entity summaries that capture temporal dynamics from each entity's univariate time series. Time-series foundation models are the natural starting point, but they are designed for dense, periodic benchmark datasets, the "mild" statistical regime. However, network telemetry data inhabits the "wild" regime: operationally relevant events are rare, separated by variable-length stretches of low or no activity ("ebbs"), with intermittent bursts of heavy-tailed extremes ("tides"). We present NetBurst, an event-centric pipeline that collapses ebbs, separates each time series into a stream of burst timings and a stream of burst magnitudes, and learns a single representation serving all three operational tasks. Compared to the strongest competitors among eight baselines (including Amazon's Chronos-2 and Datadog's Toto) and across nine production telemetry configurations, NetBurst reduces median forecasting error by 1.3× to 116× on wild-regime data with a 1.0× to 7.5× better match to the true burst distribution, and matches baselines on mild-regime benchmarks. For characterizing anomalies, NetBurst produces balanced, well-spread clusters that are 16× more describable in operator-familiar terms under a novel interpretability score, and cluster-filtered search delivers 7.5× faster end-to-end retrieval.
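The decomposition described in the abstract (collapsing ebbs and splitting a series into burst timings and burst magnitudes) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed `threshold` used to detect bursts, the use of the peak value as the magnitude, and the name `split_bursts` are all assumptions made for illustration, since the abstract does not specify how bursts are detected or summarized.

```python
from typing import List, Tuple


def split_bursts(series: List[float], threshold: float) -> Tuple[List[int], List[float]]:
    """Split a series into a burst-timing stream and a burst-magnitude stream.

    Assumption: a burst is a maximal run of samples strictly above `threshold`.
    gaps[i] is the length of the ebb (in time steps) preceding burst i.
    magnitudes[i] is the peak value of burst i.
    """
    gaps: List[int] = []
    magnitudes: List[float] = []
    last_end = -1  # index of the final sample of the previous burst
    i = 0
    n = len(series)
    while i < n:
        if series[i] > threshold:
            start = i
            peak = series[i]
            # Extend the burst while consecutive samples stay above threshold.
            while i + 1 < n and series[i + 1] > threshold:
                i += 1
                peak = max(peak, series[i])
            gaps.append(start - last_end - 1)  # ebb collapsed to a single number
            magnitudes.append(peak)
            last_end = i
        i += 1
    return gaps, magnitudes


# Two bursts: samples 2-3 (peak 7.0) and sample 7 (peak 9.0).
series = [0.0, 0.0, 5.0, 7.0, 0.0, 0.0, 0.0, 9.0, 0.0]
print(split_bursts(series, threshold=1.0))  # ([2, 3], [7.0, 9.0])
```

Representing each ebb by its length alone is what makes the series compact: long quiet stretches contribute one integer each, regardless of their duration.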

2510.17816 2026-06-11 eess.SP cs.CV Updated version

Cross-Domain Multi-Person Human Activity Recognition via Near-Field Wi-Fi Sensing

Xin Li, Jingzhi Hu, Yinghui He, Hongbo Wang, Jin Gan, Jun Luo

Affiliations: College of Computing and Data Science, Nanyang Technological University, Singapore

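AI summary: To address the difficulty of cross-domain adaptation in Wi-Fi-based multi-person activity recognition, the paper proposes the WiAnchor framework, which enlarges inter-class feature margins during pre-training and introduces an anchor matching mechanism during fine-tuning to filter subject-specific interference. It achieves efficient cross-domain recognition with missing activity categories, at over 90% accuracy.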

English abstract

Wi-Fi-based human activity recognition (HAR) provides substantial convenience and has emerged as a thriving research field, yet the coarse spatial resolution inherent to Wi-Fi significantly hinders its ability to distinguish multiple subjects. By exploiting the near-field domination effect, establishing a dedicated sensing link for each subject through their personal Wi-Fi device offers a promising solution for multi-person HAR under native traffic. However, due to the subject-specific characteristics and irregular patterns of near-field signals, HAR neural network models require fine-tuning (FT) for cross-domain adaptation, which becomes particularly challenging when certain activity categories are unavailable. In this paper, we propose WiAnchor, a novel training framework for efficient cross-domain adaptation in the presence of incomplete activity categories. This framework processes Wi-Fi signals embedded with irregular time information in three steps: during pre-training, we enlarge inter-class feature margins to enhance the separability of activities; in the FT stage, we introduce an anchor matching mechanism for cross-domain adaptation, filtering subject-specific interference informed by incomplete activity categories, rather than attempting to extract complete features from them; finally, the recognition of input samples is further improved based on their feature-level similarity with anchors. We construct a comprehensive dataset to thoroughly evaluate WiAnchor, which achieves over 90% cross-domain accuracy with absent activity categories.
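The final step described above, recognizing a sample by its feature-level similarity to anchors, can be sketched as nearest-anchor classification. The abstract does not state which similarity measure WiAnchor uses or how anchors are built, so cosine similarity, one anchor vector per activity, and the names `cosine_similarity` and `match_anchor` are assumptions for illustration only.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_anchor(feature: Sequence[float], anchors: Dict[str, Sequence[float]]) -> str:
    """Return the label of the anchor most similar to `feature`.

    Assumption: each activity class is represented by a single anchor vector.
    """
    if not anchors:
        raise ValueError("at least one anchor is required")
    return max(anchors, key=lambda label: cosine_similarity(feature, anchors[label]))


# Hypothetical two-dimensional features and activity labels.
anchors = {"walk": [1.0, 0.0], "sit": [0.0, 1.0]}
print(match_anchor([0.9, 0.2], anchors))  # walk
```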
