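arXivDaily daily academic digest: synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.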

2606.11581 2026-06-11 eess.AS cs.SD New submission

Sensitivity Analysis of Generative Spatial Audio Metrics: A Study on Responsiveness, Smoothness, and Symmetry

生成式空间音频指标的敏感性分析：响应性、平滑性和对称性研究

Purnima Kamath, Adrian S. Roman, Koichi Saito, Yuki Mitsufuji, Juan P. Bello

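Affiliations: New York University; Sony AI; Sony Group Corporation

AI summary: Proposes a framework for analyzing how sensitive generative spatial audio metrics are to changes in spatial parameters, defines three desiderata (Responsiveness, Smoothness, and Symmetry), and, after evaluating standard metrics, finds that FAD and acoustic maps perform best.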

Comments Accepted for publication at Interspeech 2026

AI Chinese abstract

由于对指标如何响应方位角和仰角等空间参数变化的理解有限，评估一阶环绕声（FOA）的生成式空间音频仍然具有挑战性。我们借鉴参数化声音合成中的敏感性分析原理，提出了一个沿连续空间轨迹分析指标敏感性的框架。通过使用复杂度递增的受控FOA场景，我们定义了指标行为的三个期望属性：响应性、平滑性和对称性。我们评估了标准基于分布和基于样本的指标，包括Fréchet音频距离（FAD）、强度向量和声学地图。我们的发现表明，采用定位专用嵌入的FAD以及声学地图在不同条件下均具有高响应性以及稳健的平滑性和对称性，而强度向量随着场景复杂度的增加而退化。这是研究生成式空间音频指标敏感性的第一步。

English abstract

Evaluating generative spatial audio for First-Order Ambisonics (FOA) remains challenging due to a limited understanding of how metrics respond to changes in spatial parameters such as azimuth and elevation. We propose a framework to analyze metric sensitivity along continuous spatial trajectories, drawing on principles of sensitivity analysis in parametric sound synthesis. Using controlled FOA scenes with increasing scene complexity, we define three desiderata for metric behavior: Responsiveness, Smoothness, and Symmetry. We assess standard distribution-based and sample-based metrics, including Fréchet Audio Distance (FAD), intensity vectors, and acoustic maps. Our findings show that FAD using localization-specific embeddings and acoustic maps yield high Responsiveness and robust Smoothness and Symmetry across conditions, while intensity vectors degrade with increasing scene complexity. This is the first step towards investigating the sensitivity of metrics for generative spatial audio.

2606.11570 2026-06-11 stat.ML cs.LG stat.ME New submission

Enhancing Spectral Embedding through Robust and Flexible Knowledge Transfer in Electronic Health Records

通过电子健康记录中的鲁棒且灵活的知识迁移增强谱嵌入

Feiqing Huang, Zongqi Xia, Rong Ma, Tianxi Cai

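Affiliations: Harvard T.H. Chan School of Public Health; Dana-Farber Cancer Institute; Harvard Medical School; University of Pittsburgh

AI summary: Proposes a spectral, unsupervised representation learning framework that extracts a knowledge matrix from a broader population and relaxes signal-alignment assumptions to produce low-dimensional embeddings for rare-disease cohorts; it outperforms existing methods on simulations and on real multiple sclerosis data.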

AI Chinese abstract

我们提出了一种基于谱的无监督表示学习框架，用于从电子健康记录中为罕见病队列的临床概念和患者导出低维嵌入，其中数据是高维的但样本量有限。为了克服这一挑战，我们引入了一个从更广泛人群中提取的知识矩阵，该矩阵与罕见病队列共享部分重叠的子空间。我们的方法不同于现有方法，它放松了潜在数据矩阵和知识矩阵之间严格的一对一信号对齐假设，允许更灵活和现实的结构化共享形式。我们引入了一种新颖的两步谱嵌入过程：首先，我们从知识矩阵中识别并移除不相关的成分；然后，我们应用基于投影的方法分别恢复共享和异质成分。模拟和对真实世界多发性硬化症队列的分析表明，所提出的方法优于竞争方法，特别是在共享信号较弱且仅部分对齐的挑战性场景中，这在罕见病数据中很常见。

English abstract

We propose a spectral-based, unsupervised representation learning framework to derive low-dimensional embeddings for clinical concepts and patients in rare disease cohorts from electronic health records, where data are high-dimensional but sample sizes are limited. To overcome this challenge, we incorporate a knowledge matrix extracted from a broader population that shares a partially overlapping subspace with the rare-disease cohort. Our method departs from existing approaches by relaxing restrictive one-to-one signal-alignment assumptions between the latent data matrix and knowledge matrix, allowing more flexible and realistic forms of structured sharing. We introduce a novel two-step spectral embedding procedure: first, we identify and remove irrelevant components from the knowledge matrix; then, we apply a projection-based method to separately recover shared and heterogeneous components. Simulations and an analysis of a real-world multiple sclerosis cohort show that the proposed method outperforms competing approaches, particularly in challenging scenarios where shared signals are weak and only partially aligned, as is common in rare-disease data.

2606.11560 2026-06-11 cs.DB cs.AI New submission

LLMs+Graphs: Toward Graph-Native, Synergistic AI Systems

LLMs+Graphs：迈向图原生的协同人工智能系统

Arijit Khan, Longxu Sun, Xin Huang

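Affiliations: Bowling Green State University; Hong Kong Baptist University

AI summary: This paper surveys three synergies between large language models and graph computation (graph-augmented reasoning, bidirectional integration with knowledge graphs, and AI agents strengthened by graph algorithms), discusses new capabilities for graph data management and graph machine learning, and aims to offer a unified perspective for building next-generation graph-native AI systems.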

Comments 10 pages, Accepted at PAKDD 2026 Tutorial

AI Chinese abstract

大语言模型（LLMs）发展迅速，但它们在结构化和多跳推理方面的局限性凸显了对图原生、协同人工智能（AI）系统的需求。图结构数据支撑着社交、生物、金融、交通、网络和知识领域的关键应用，因此理解LLMs如何利用图计算进行基于上下文的扎实推理至关重要。三种互补的协同方式正在涌现：通过图计算增强LLMs进行检索和推理；LLMs与知识图谱（KGs）的双向集成，其中LLMs支持KG构建和整理，而KGs强制执行语义约束和事实一致性；以及通过图算法增强的AI代理进行规划、决策和多步推理。同时，LLMs通过自然语言接口和混合LLM-图神经网络（GNN）流水线，为图数据管理和图机器学习（ML）引入了新能力。本教程综合了推动这些融合方向的算法、系统和设计原则，为数据科学和数据挖掘研究人员提供了将LLMs、图数据管理、图挖掘、图ML和代理计算集成到下一代图原生AI系统中的统一视角。

English abstract

Large Language Models (LLMs) have advanced rapidly, but their limitations in structured and multi-hop reasoning underscore the need for graph-native, synergistic artificial intelligence (AI) systems. Graph-structured data underpins critical applications across social, biological, financial, transportation, web, and knowledge domains, making it essential to understand how LLMs can leverage graph computation for grounded, context-rich inference. Three complementary synergies are emerging: LLMs augmented with graph computation for retrieval and reasoning; bidirectional integration between LLMs and knowledge graphs (KGs), where LLMs support KG construction and curation while KGs enforce semantic constraints and factual consistency; and AI agents strengthened by graph algorithms for planning, decision making, and multi-step reasoning. In parallel, LLMs introduce new capabilities for graph data management and graph machine learning (ML) through natural language interfaces and hybrid LLM-graph neural network (GNN) pipelines. This tutorial synthesizes the algorithms, systems, and design principles driving these converging directions, offering data science and data mining researchers a unified perspective on integrating LLMs, graph data management, graph mining, graph ML, and agentic computation into next-generation graph-native AI systems.

2606.11556 2026-06-11 cs.CR cs.AI cs.LG New submission

Privacy-Preserving Federated Autoencoder for ECG Anomaly Detection on Edge Devices

面向边缘设备上心电图异常检测的隐私保护联邦自编码器

Kaan Arda Akyol, Jakub Kacper Szeląg, Aydin Abadi, Maha Alghamdi, Ghadah Albalawi, Ghouse Ibrahim Kaleelullah, Hilal Tutus, Sarah Al Subaiei, Shardul Kapse, Syed Mohammed Raheeb, Mujeeb Ahmed, Rehmat Ullah

Affiliations: Google Research, New York, NY; University of California, Berkeley; University of Cambridge; University of Toronto; University of Melbourne; University of Sydney

AI summary: Proposes an end-to-end system combining federated learning, differential privacy, and INT8 quantization for unsupervised 12-lead ECG anomaly detection on the PTB-XL dataset, addressing privacy, real-time inference, and non-IID data requirements.

Comments 9 pages, 4 figures, 6 tables. Preprint prepared in IEEE conference format. Submitted to: FLTA 2026

AI Chinese abstract

连续心电图监测可以在心律异常演变为心血管事件之前发现它们。然而，一个可部署的系统必须同时满足三个要求：法律级别的隐私（GDPR、HIPAA）、在受限边缘硬件上的实时推理以及在非IID跨医院数据下的检测质量。我们设计并评估了一个端到端的联邦系统，在PTB-XL数据集上解决了无监督12导联ECG异常检测的所有三个要求，结合了三种自编码器家族（VanillaAE、ConvAE、VAE）、基于Flower的联邦平均（FedAvg）跨十个模拟医院、客户端差分隐私SGD（DP-SGD）与Rényi-DP隐私核算器（accountant），以及使用Raspberry Pi 4基准测试的8位整数（INT8）训练后量化。我们的主要贡献是：这些机制如何组合的经验性特征、实用的DP特定建议，以及针对临床敏感环境的技术和安全见解。联邦学习在所有架构上匹配或超过集中基线（ConvAE联邦ROC曲线下面积AUROC为0.782），并且ε扫描确定ε=4为推荐的临床操作点。INT8量化大致将模型大小减半，并将Pi 4延迟降低多达44%，AUROC损失小于0.12%。关键的是，DP和量化的惩罚在经验上是独立的，因此从业者不需要为了紧凑的边缘部署体积而牺牲强大的隐私保证。据我们所知，这是第一个结合联邦学习、形式化(ε,δ)-DP、无监督重建检测和量化AArch64部署的系统。

English abstract

Continuous electrocardiography (ECG) monitoring could surface rhythm abnormalities before they escalate into cardiovascular events. However, a deployable system must satisfy three requirements simultaneously: legal-grade privacy (GDPR, HIPAA), real-time inference on constrained edge hardware, and detection quality under non-IID cross-hospital data. We design and evaluate an end-to-end federated system addressing all three for unsupervised 12-lead ECG anomaly detection on the PTB-XL dataset, combining three autoencoder families (VanillaAE, ConvAE, VAE), Flower-based federated averaging (FedAvg) across ten simulated hospitals, client-side differentially private SGD (DP-SGD) with a Rényi-DP accountant, and 8-bit integer (INT8) post-training quantization with Raspberry Pi 4 benchmarking. Our main contributions are: an empirical characterization of how these mechanisms compose, practical DP-specific recommendations, and technical and security insights for a clinically sensitive setting. Federated learning matches or exceeds the centralized baseline across all architectures (ConvAE federated area under the ROC curve, AUROC, $0.782$), and an $\varepsilon$ sweep identifies $\varepsilon=4$ as the recommended clinical operating point. INT8 quantization roughly halves model size and cuts Pi 4 latency by up to $44\%$ with $<0.12\%$ AUROC loss. Crucially, DP and quantization penalties are empirically independent, so practitioners need not trade a strong privacy guarantee for a compact edge footprint. To our knowledge, this is the first system combining federated learning, formal $(\varepsilon,\delta)$-DP, unsupervised reconstruction-based detection, and quantized AArch64 deployment.
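
Editor's illustration: the abstract names federated averaging (FedAvg) as the aggregation step. The sketch below shows only the generic FedAvg rule, a sample-count-weighted mean of client parameter vectors. It does not use Flower, and the function name `fedavg`, the three toy "hospitals", and their sizes are assumptions made for illustration, not the authors' setup.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share one parameter shape")

    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        share = n / total  # this client's fraction of all training samples
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


# Hypothetical example: three hospitals with toy 2-parameter models.
hospital_weights = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
hospital_sizes = [100, 100, 200]
global_model = fedavg(hospital_weights, hospital_sizes)
print(global_model)
```

In a DP-SGD setting such as the one the abstract describes, clipping and noise would be applied on each client before its update reaches this averaging step; that part is omitted here.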

2606.11555 2026-06-11 q-bio.NC cs.AI cs.LG New submission

End-to-End Machine Learning for Depressive State Classification via EEG and fNIRS

基于EEG和fNIRS的抑郁状态分类的端到端机器学习

Riki Sakurai, Simon Kojima, Mihoko Otake-Matsuura, Shin'ichiro Kanoh, Tomasz M. Rutkowski

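Affiliations: RIKEN AIP, Tokyo, Japan

AI summary: Proposes an end-to-end machine learning framework that classifies depressive states from EEG and fNIRS signals, aiming to overcome the subjectivity of conventional diagnosis and provide an objective, automated diagnostic tool for clinical use.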

Comments 4 pages, 4 figures, Accepted for publication in the Proc. 48th Annu. Int. Conf. IEEE EMBS (EMBC 2026), Toronto, Canada, July 20-24, 2026

AI Chinese abstract

随着社会压力的增加，对心理医疗的需求不断上升，凸显了传统精神病学诊断的局限性。传统方法——主要依赖临床访谈和患者自我报告——本质上容易受到主观偏见和从业者不同的经验判断的影响。为了满足定量评估的需求，基于生物信号的检测，包括脑电图（EEG）和功能性近红外光谱（fNIRS），已成为一种有前景的客观替代方案。这类技术对于识别可能未被受试者自身意识到的潜在抑郁状态尤为重要。此外，在老龄化人群中，抑郁症与痴呆症的高共病性要求早期区分，以防止症状相互恶化并维持生活质量（QoL）。这项针对11名健康学生的初步研究建立了一个基于生物信号的抑郁症检测框架，为临床使用的自动化、客观诊断工具奠定了基础。

English abstract

The escalating demand for mental healthcare, driven by rising societal stress, highlights the limitations of traditional psychiatric diagnostics. Conventional methods - relying primarily on clinical interviews and patient self-reports - are inherently vulnerable to subjective bias and the varying empirical judgment of practitioners. To address the need for quantitative evaluation, biological signal-based detection, including electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS), has emerged as a promising objective alternative. Such technology is particularly vital for identifying latent depressive states that may be unrecognized by the subjects themselves. Furthermore, in aging populations, the high comorbidity between depression and dementia necessitates early differentiation to prevent mutual symptom exacerbation and maintain Quality of Life (QoL). This pilot study of eleven healthy students establishes a framework for biological signal-based depression detection, serving as a foundational step toward automated, objective diagnostic tools for clinical use.

2606.11534 2026-06-11 physics.ao-ph cs.LG New submission

Urban Heat MiniCubes: An AI-Ready dataset for urban heat research

城市热微型数据立方体：面向城市热研究的人工智能就绪数据集

Jonathan Starfeldt, Maria J. Molina, Alexander Kerr, Adam Yang, Thomas R. H. Holmes, Christopher R. Hain

Affiliations: Department of Atmospheric and Oceanic Science, University of Maryland, College Park, MD, USA; Department of Computer Science, University of Maryland, College Park, MD, USA; NASA Goddard Space Flight Center, Greenbelt, MD, USA; NASA Marshall Space Flight Center, Huntsville, AL, USA

AI summary: Presents the Urban Heat MiniCubes dataset, which integrates multi-source satellite data (Landsat 8/9, Sentinel-1, GOES-R, and others) into 90 x 90 km gridded data cubes for 48 cities to support machine learning in urban heat research.

Comments 53 pages, 26 figures, Submitted to Nature Scientific Data

AI Chinese abstract

城市热效应因不透水表面和异质建筑环境而加剧，但街道尺度的变异性仍难以量化，因为多传感器观测很少以一致、分析就绪的形式在必要的时空尺度上可用。我们提出了“Urban Heat MiniCubes”，一个公开可用、符合FAIR原则的数据集，专为城市热研究中的机器学习应用而设计。该数据集提供了西半球48个城市在2022-2023年间的统一90×90公里网格化数据立方体，变量被重新投影并配准到公共网格，以减少预处理（例如，重投影、重采样和时空对齐）。Urban Heat MiniCubes包括两种互补模态：（i）来自Landsat 8/9（例如，地表反射率）和Sentinel-1（例如，合成孔径雷达后向散射）的高空间分辨率、低频观测，以及（ii）来自GOES-R（例如，长波红外亮温）和微波地表温度产品的更高时间频率、较粗分辨率观测。我们记录了变量和元数据，并通过变量间分析和基于自编码器的像素类别（例如，水和云）重建误差总结提供了技术评估。还讨论了潜在用例和局限性。

English abstract

Urban heat is amplified by impermeable surfaces and heterogeneous built environments, yet street-level variability remains difficult to quantify because multi-sensor observations are rarely available in consistent, analysis-ready form at the necessary spatiotemporal scales. We present "Urban Heat MiniCubes," a publicly available, FAIR-oriented dataset designed for machine learning applications in urban heat research. The dataset provides harmonized 90 x 90 km gridded data cubes for 48 cities in the Western Hemisphere spanning 2022-2023, with variables reprojected and collocated to a common grid to reduce preprocessing (e.g., reprojection, resampling, and spatiotemporal alignment). Urban Heat MiniCubes includes two complementary modalities: (i) higher-spatial-resolution, lower-frequency observations from Landsat 8/9 (e.g., surface reflectances) and Sentinel-1 (e.g., synthetic aperture radar backscatter), and (ii) higher-temporal-frequency, coarser observations from GOES-R (e.g., longwave infrared brightness temperatures) and a microwave land surface temperature product. We document variables and metadata and provide technical assessment using inter-variable analyses and autoencoder-based reconstruction-error summaries across pixel classes (e.g., water and cloud). Potential use cases and limitations are also discussed.

2606.11533 2026-06-11 cs.CY cs.AI cs.ET cs.LG New submission

AI Researchers Must Help Lead Arms Control to Mitigate Military AI Risks

AI研究人员必须主导军备控制以降低军事AI风险

Ted Fujimoto, Jacob Benz

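Affiliations: arXiv

AI summary: Argues that AI researchers should lead arms control research, drawing on lessons from nuclear deterrence and driving technical innovation in verification and diplomacy, to reduce the pressing risks posed by military AI applications.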

Comments 9 pages, 1 figure, ICML 2026 Position Paper

AI Chinese abstract

AI能力的进步迫使研究人员和公众更加关注其潜在的全球影响。一个紧迫的近期问题是军事AI应用的监管。武器制造商和国防承包商正在加大对AI能力的投资，并与AI公司建立合作伙伴关系，形成了一个新兴的联盟，要求军事领导人、军备控制外交专家和AI研究人员合作，以确保更安全的未来。虽然AI研究人员通常关注超级智能AI的长期影响，但这种方法可能无法充分应对军事应用中AI带来的直接挑战。成功需要承认并减轻前沿AI模型（计划集成到国防应用中，如军事AI系统）的新兴风险。军备控制已经减少了过去的灾难性风险，因此从核威慑中吸取的经验教训可以指导AI安全与安保研究，推动验证和外交方面的创新。然而，AI研究人员必须协助主导技术研究，明确定义并缓解军事环境中的不稳定性。鉴于这些新责任以及缺乏足够可靠的解决方案，我们认为AI研究人员必须在推进军备控制研究以最小化军事AI应用风险方面发挥主导作用。

English abstract

The advancement of AI capabilities compels researchers and the public to be more aware of its potential worldwide impact. A pressing near-term concern is the regulation of military AI applications. Armament manufacturers and defense contractors are increasingly investing in AI capabilities and forging partnerships with AI companies, creating a burgeoning coalition that demands military leaders, arms control diplomacy experts, and AI researchers collaborate to ensure a safer future. While AI researchers often focus on the long-term implications of superintelligent AI, this approach may not adequately address the immediate challenges posed by AI in military applications. Success requires acknowledging and mitigating the emerging risks of frontier AI models that are slated for integration into defense applications, such as military AI systems. Arms control has reduced past catastrophic risks, so lessons learned from nuclear deterrence can guide AI safety and security research towards innovations in verification and diplomacy. AI researchers, however, must assist in leading the technical research that clearly defines and alleviates instability in military settings. Given these new responsibilities and the lack of sufficiently reliable solutions, we argue that AI researchers must take a leading role in advancing arms control research to minimize risk in military AI applications.

2606.11529 2026-06-11 cs.GR cs.CV cs.PF New submission

XPR: An Extensible Cross-Platform Point-Based Differentiable Renderer

XPR：一个可扩展的跨平台基于点的可微分渲染器

Steve Rhyner, Sankeerth Durvasula, Aleksandr Kovalev, Hansel Jia, Adrian Zhao, Mrutunjayya Mrutunjayya, Nilesh Ahuja, Selvakumar Panneer, Christina Giannoula, Nandita Vijaykumar

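Affiliations: University of Toronto; Vector Institute; Intel; Max Planck Institute for Software Systems

AI summary: Proposes the XPR framework, whose high-level programming interface and modular rendering pipeline let new methods such as 3DGS be implemented in a small amount of code and run across platforms via the XLA compiler.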

AI Chinese abstract

基于点的可微分渲染支撑着现代3D重建、新视角合成和基于学习的图形管线，但开发新的渲染方法通常需要大量的底层实现、硬件特定的内核以及手动编写的反向传播。这限制了快速原型设计、可重复性、探索和部署，尤其是在不同的硬件平台上。本文提出了XPR，一个可扩展的跨平台基于点的可微分渲染框架。XPR引入了一个高级编程接口，将方法特定的逻辑与共享的渲染管线分离，允许用户用几行代码实现新方法。其管线将渲染分解为模块化的、静态形状的并行操作，这些操作可以由跨平台编译器向下编译（lowering）到GPU、TPU、CPU和其他ML加速器。我们展示了3DGS、3DGUT和LinPrim的实现，仅需几百行Python代码，每个都可以通过XLA编译器编译到一系列硬件平台。这些结果表明，XPR为新兴的基于点的可微分渲染系统实现了快速实验和可移植执行。

English abstract

Point-based differentiable rendering underpins modern 3D reconstruction, novel-view synthesis, and learning-based graphics pipelines, but developing new rendering methods often requires extensive low-level implementation, hardware-specific kernels, and manually written backward passes. This limits rapid prototyping, reproducibility, exploration, and deployment, especially across diverse hardware platforms. This paper presents XPR, an extensible cross-platform framework for point-based differentiable rendering. XPR introduces a high-level programming interface that separates method-specific logic from the shared rendering pipeline, allowing users to implement new methods in a few lines of code. Its pipeline decomposes rendering into modular, statically shaped parallel operations that can be lowered by a cross-platform compiler to GPUs, TPUs, CPUs, and other ML accelerators. We demonstrate implementations of 3DGS, 3DGUT, and LinPrim in only a few hundred lines of Python code, each of which can be compiled to a range of hardware platforms with the XLA compiler. These results show that XPR enables fast experimentation and portable execution for emerging point-based differentiable rendering systems.

2606.11500 2026-06-11 eess.IV cs.CE cs.IT cs.LG math.IT q-bio.NC New submission

FlexiBrain: Resolution-Agnostic Voxel-Level Encoding for Native fMRI

FlexiBrain: 面向原生fMRI的分辨率无关体素级编码

Mo Wang, Wenhao Ye, Junfeng Xia, Minghao Xu, Hongkai Wen, Quanying Liu

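Affiliations: Southern University of Science and Technology; University of Warwick

AI summary: Proposes FlexiBrain, a resolution-agnostic voxel-level encoding framework based on Mamba-JEPA that uses dynamic patch resizing to process native fMRI data directly, avoiding destructive spatial standardization; it improves performance by up to 12 percentage points across five downstream tasks and substantially reduces preprocessing cost.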

AI Chinese abstract

大规模深度学习模型在神经科学中的成功从根本上受到严重数据异质性的制约。从不同来源聚合的原生fMRI数据在空间和时间分辨率上表现出显著差异。因此，大多数现有框架依赖于冗长、僵化的预处理流程，以强制数据集之间的一致性。这种做法引入了两个关键限制：（1）可能退化受试者特定的解剖信息；（2）显著的计算开销，通常每个受试者需要数小时的处理。在此，我们提出FlexiBrain，一种基于Mamba-JEPA的分辨率无关体素级编码框架，用于原生fMRI。FlexiBrain以真实物理单位定义补丁大小，并采用动态补丁调整，从而绕过破坏性的空间标准化，同时允许直接摄取原生空间中的数据。我们使用高效的Mamba-JEPA骨干网络实例化该框架，以建模高维4D fMRI信号。在五个不同的下游神经科学任务中，FlexiBrain持续优于近期最先进的方法，在不使用外部数据增强的情况下实现了高达12个百分点的提升。重要的是，FlexiBrain作为一个无缝插件模块，显著降低了预处理成本，并加速了稳健的体素级fMRI基础模型的开发。代码可在该https URL获取。

English abstract

The success of large-scale deep learning models in neuroscience is fundamentally constrained by severe data heterogeneity. Native fMRI data aggregated from diverse sources exhibit substantial variation in both spatial and temporal resolutions. Consequently, most existing frameworks rely on lengthy, rigid preprocessing pipelines that enforce uniformity across datasets. This practice introduces two critical limitations: (1) potential degradation of subject-specific anatomical information; (2) significant computational overhead, often requiring hours of processing per subject. Here, we propose FlexiBrain, a resolution-agnostic voxel-level encoding framework for native fMRI based on Mamba-JEPA. FlexiBrain defines patch sizes in real-world physical units and employs dynamic patch resizing, thereby bypassing destructive spatial standardization while enabling direct ingestion of data in native space. We instantiate the framework using an efficient Mamba-JEPA backbone to model high-dimensional 4D fMRI signals. Across five diverse downstream neuroscience tasks, FlexiBrain consistently outperforms recent state-of-the-art methods, achieving gains of up to 12 percentage points without external data augmentation. Importantly, FlexiBrain functions as a seamless plug-in module, substantially reducing preprocessing costs and accelerating the development of robust voxel-level fMRI foundation models. Code is available at https://github.com/OneMore1/FlexiBrain.
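
Editor's illustration: the abstract says patch sizes are defined in physical units rather than voxels. The sketch below shows only the unit conversion that idea implies, turning one patch edge length in millimetres into per-axis voxel counts for scans with different spacings. The function name `patch_shape_in_voxels`, the 12 mm patch, and the voxel spacings are assumptions; the abstract does not describe how FlexiBrain's dynamic patch resizing is actually implemented.

```python
from typing import Sequence, Tuple


def patch_shape_in_voxels(patch_mm: float,
                          voxel_size_mm: Sequence[float]) -> Tuple[int, ...]:
    """Convert a physical patch edge length (mm) to voxel counts per axis."""
    if patch_mm <= 0:
        raise ValueError("patch size must be positive")
    shape = []
    for spacing in voxel_size_mm:
        if spacing <= 0:
            raise ValueError("voxel spacing must be positive")
        # Round to the nearest whole voxel, but never below one voxel.
        shape.append(max(1, round(patch_mm / spacing)))
    return tuple(shape)


# The same 12 mm patch covers different voxel counts at different resolutions.
coarse = patch_shape_in_voxels(12.0, (3.0, 3.0, 3.0))
fine = patch_shape_in_voxels(12.0, (2.0, 2.0, 2.0))
anisotropic = patch_shape_in_voxels(12.0, (2.0, 2.0, 4.0))
print(coarse, fine, anisotropic)
```

Because the patch is fixed in millimetres, each patch covers the same physical extent regardless of acquisition resolution, which is the property that would let scans stay in native space.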

2606.11482 2026-06-11 cs.SI cs.CL New submission

Building Social World Models with Large Language Models

用大型语言模型构建社会世界模型

Haofei Yu, Yining Zhao, Guanyu Lin, Jiaxuan You

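Affiliations: University of Science and Technology of China

AI summary: Proposes the Social World Model (SWM) framework, which uses LLMs to mine temporal patterns in social data and learn state-transition functions for social beliefs without human annotations or census data, and outperforms time-series foundation models on a prediction-market benchmark.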

Comments 9 pages. ICML 2026

AI Chinese abstract

理解和预测社会信念如何因事件（从政策变化到科学突破）而演变仍然是社会科学中的一个基本挑战。鉴于LLM的常识知识和社会智能，我们提出：LLM能否模拟社会事件后社会信念的动态？在这项工作中，我们引入了社会世界模型（SWM）的概念，这是一个通用框架，旨在捕捉社会信念如何因重大事件而演变。SWM通过挖掘社会数据中的时间模式并优化证据下界来学习社会信念的状态转移函数，无需将事件与信念转变联系起来的人工标注，也无需昂贵的普查数据。为了评估SWM，我们引入了一个基准SWM-bench，该基准源自真实世界的预测市场，特别是Kalshi和Polymarket。SWM-bench包含超过12k个数据点，用于跨政治、金融和加密货币等不同领域的社会信念预测任务。我们的实验结果表明，SWM显著优于时序基础模型，在Kalshi数据上取得了最先进的结果，并在Polymarket数据上展示了竞争性能，同时为社会信念动态的潜在机制提供了可解释的见解。

English abstract

Understanding and predicting how social beliefs evolve in response to events -- from policy changes to scientific breakthroughs -- remains a fundamental challenge in social science. Given LLMs' commonsense knowledge and social intelligence, we ask: Can LLMs model the dynamics of social beliefs following social events? In this work, we introduce the concept of the Social World Model (SWM), a general framework designed to capture how social beliefs evolve in response to major events. SWM learns state-transition functions for social beliefs by mining temporal patterns in social data and optimizing the evidence lower bound, without the need for explicit human annotations linking events to belief shifts, or for expensive census data. To evaluate SWM, we introduce a benchmark, SWM-bench, derived from real-world prediction markets, specifically Kalshi and Polymarket. SWM-bench includes over 12k data points for social belief prediction tasks spanning diverse domains such as politics, finance, and cryptocurrency. Our experimental results show that SWM significantly outperforms time-series foundation models, achieving state-of-the-art results on Kalshi data and demonstrating competitive performance on Polymarket data, while offering interpretable insights into the underlying mechanisms of social belief dynamics.

2606.11471 2026-06-11 cs.CR cs.LG New submission

Evaluating and Combating the Impact of Concept Drift on the Performance of Machine Learning-Based Phishing Detection Systems

评估与对抗概念漂移对基于机器学习的钓鱼检测系统性能的影响

Warren Fernando, Nikos Komninos

Affiliations: Department of Computer Science, School of Mathematics, Computer Science and Engineering, City St George’s, University of London, UK

AI summary: Studies the impact of concept drift on machine learning-based phishing email detection systems and proposes strategies to mitigate the resulting performance degradation.

AI Chinese abstract

数字领域的扩展导致数字通信大幅增加，电子邮件已成为最突出的渠道之一。电子邮件通信的普及在专业和个人环境中都很明显，从而为恶意行为者创造了大量可利用的漏洞。垃圾邮件作为一种未经请求的通信形式，通常对收件人带有恶意意图，自电子邮件技术诞生以来一直是电子邮件用户面临的持续挑战，而数字景观的增长加剧了这一问题。电子邮件垃圾邮件过滤器是电子邮件客户端的组成部分，旨在识别潜在有害消息并提醒用户其恶意内容。钓鱼攻击通常是基于恶意软件攻击的初始阶段，并且随着时间推移，恶意软件变得越来越复杂，钓鱼攻击也在迅速演变。检测恶意软件和垃圾邮件领域中恶意活动的一种广泛采用的方法是应用机器学习。我们的目标是评估垃圾邮件领域内的演变对这些基于机器学习的检测系统的影响，并探索减轻相关性能下降的策略。

English abstract

The expansion of the digital domain has resulted in a substantial increase in digital communication, with email emerging as one of the most prominent channels. The proliferation of email communication is apparent in both professional and personal contexts, thereby creating numerous vulnerabilities for malicious actors to exploit. Spam emails, a form of unsolicited correspondence often bearing malicious intent towards recipients, have been an ongoing challenge for email users since the inception of email technology, and this problem has been exacerbated by the growth of the digital landscape. Email spam filters are integral components of email clients, engineered to identify potentially harmful messages and alert users to their malicious content. Phishing, frequently the initial phase of malware-based attacks, is evolving rapidly, with malware becoming increasingly sophisticated over time. A widely adopted approach for detecting malicious activity within malware and spam domains is the application of machine learning. Our aim is to assess the impact of the evolution within the spam email domain on these machine learning-based detection systems and to explore strategies for mitigating associated performance degradation.

2606.11469 2026-06-11 cs.DS cs.LG math.ST stat.TH New submission

Density estimation for Hellinger via minimum-distance estimators: mixtures of Gaussians, log-concave, and more

基于最小距离估计量的Hellinger密度估计：高斯混合、对数凹等

Spencer Compton, Jerry Li

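Affiliations: Stanford University; University of Washington

AI summary: Extends the minimum-distance estimation approach from total variation distance to Hellinger distance via reverse data processing inequalities, yielding near-linear-time learning of mixtures of log-concave densities and mixtures of Gaussians (with arbitrary variances) with near-optimal sample complexity.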

AI Chinese abstract

我们研究密度估计任务，希望从$n$个样本中准确估计概率密度。在总变差距离下，密度估计的经典方法是最小距离估计量方法，其中我们仅通过限制特定概念类（即Yatracos类）的VC维即可得到算法和分析。虽然该技术最初主要针对总变差距离给出了精确保证，但在本文中，我们将最小距离估计量方法扩展到Hellinger距离下的学习。我们的主要观察是，通过联系最近得到反向数据处理不等式的结果，我们可以为Hellinger距离生成类似的方案（其中我们只需要限制相关概念类的VC维）。该方案足够灵活，可以容纳最初为总变差距离设计的快速算法；通过修改Acharya等人（2017）的方法，我们首次得到了近线性时间算法，用于学习包括单变量对数凹密度混合和高斯混合（具有任意方差）在内的类别，且样本复杂度接近最优。

English abstract

We study the task of density estimation, where we hope to accurately estimate a probability density from $n$ samples. A textbook method for density estimation in total variation distance is the minimum-distance estimator approach, where we conclude both the algorithm and the analysis merely from bounding the VC dimension of a particular concept class (the so-called Yatracos class). While this technique has originally yielded sharp guarantees primarily for total variation distance, in this work we extend the minimum-distance estimator approach for learning within Hellinger distance. Our main observation is that we may produce an analogous recipe for Hellinger (where we only require bounding the VC dimension of a related concept class) by drawing connections to recent results yielding reverse data processing inequalities. This recipe is flexible enough to accommodate fast algorithms originally designed for total variation distance; by modifying the approach of Acharya et al. (2017) we conclude the first near-linear time algorithm for learning classes including univariate mixtures of log-concave densities and mixtures of Gaussians (with arbitrary variances), with near-optimal sample complexity.
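
Editor's illustration: the abstract contrasts learning in total variation (TV) distance with learning in Hellinger distance. The sketch below computes both for two small discrete distributions and checks the standard relation H^2 <= TV <= sqrt(2) * H, which holds under the normalization H^2 = (1/2) * sum (sqrt(p_i) - sqrt(q_i))^2. The distributions `p` and `q` are made-up examples; this is not the paper's estimator.

```python
import math
from typing import Sequence


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """TV distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def hellinger(p: Sequence[float], q: Sequence[float]) -> float:
    """Hellinger distance with the 1/2 normalization, so 0 <= H <= 1."""
    squared = 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                        for a, b in zip(p, q))
    return math.sqrt(squared)


# Two hypothetical distributions over three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

tv = total_variation(p, q)
h = hellinger(p, q)
print(f"TV = {tv:.4f}, H = {h:.4f}")
print("H^2 <= TV <= sqrt(2) * H:", h * h <= tv <= math.sqrt(2) * h)
```

The gap between H^2 and TV is why a guarantee in one distance does not transfer to the other without loss, and why a Hellinger-specific analysis is of separate interest.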

2606.11437 2026-06-11 cs.DS cs.AI cs.LG stat.ML New submission

The Power of Test-Time Training for Approximate Sampling

测试时训练对近似采样的威力

Noah Golowich, Ankur Moitra, Dhruv Rohatgi

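Affiliations: Microsoft Research NYC; MIT

AI summary: Formalizes test-time training (TTT) as the problem of sampling from a distribution in a known class, proves a quadratic lower bound on query complexity, and shows that the bound can be circumvented when the size of the class is suitably bounded, providing a theoretical framework for TTT.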

AI Chinese abstract

从复杂概率分布中高效采样是一个基本问题，近年来随着生成式AI的兴起，这一问题变得越来越重要，因为从大语言模型（LLM）中提出的复杂采样程序已被用于解决具有挑战性的推理问题。然而，这类采样算法的有效性受到LLM与特定采样任务之间关系的限制，这推动了测试时训练（TTT）框架的发展。TTT通过根据推理时收到的部分生成和奖励反馈更新模型权重来工作，从而适应特定问题。在这项工作中，我们提出了一种TTT的形式化，将其定义为从属于已知分布类$F$的给定概率测度$\mu^\star$中生成样本的问题，给定一个提供$\mu^\star$近似密度估计的预言机$\hat \mu$。这与Jerrum、Valiant和Vazirani（1986）以及Jerrum和Sinclair（1989）的开创性工作中研究的将采样约化为近似计数的问题密切相关：即当$F$是所有分布的类时，它恰好与上述计数到采样的约化一致。在本文中，我们首先证明了在给定对$\hat \mu$的查询访问的情况下，从$\mu^\star$采样的查询复杂度的二次下界（对于足够大的类$F$），从而表明Jerrum和Sinclair（1989）提出并由Hayes和Sinclair（2010）改进的随机游走方法是最优的。这回答了Hayes和Sinclair提出的一个开放问题。然后，我们证明如果$F$的大小适当受限，这个下界可以被规避。正如我们所讨论的，后一个结果可以被视为TTT的抽象，因此代表了为TTT发展一个原则性理论框架的起点。

English abstract

Efficiently sampling from a complex probability distribution is a fundamental problem which has become increasingly pertinent in recent years with the rise of generative AI, as sophisticated sampling procedures from LLMs have been proposed to solve challenging reasoning problems. The efficacy of such sampling algorithms is limited, however, by the relationship between the LLM and the particular sampling task at hand, which has motivated the framework of test-time training (TTT). TTT works by updating a model's weights in response to partial generations and reward feedback received at inference time, thus adapting to the particular problem. In this work, we propose a formalization for TTT as the problem of producing a sample from a given probability measure $\mu^\star$ belonging to a known class $F$ of distributions, given an oracle $\hat\mu$ which yields approximate density estimates for $\mu^\star$. This is closely related to the problem of reducing sampling to approximate counting studied in seminal works of Jerrum, Valiant & Vazirani (1986) and Jerrum & Sinclair (1989): namely, when $F$ is the class of all distributions, it coincides exactly with the aforementioned counting-to-sampling reduction. In this paper, we first show a quadratic lower bound on the query complexity of sampling from $\mu^\star$ given query access to $\hat\mu$ (for sufficiently large classes $F$), thus showing that the random walk approach proposed by Jerrum & Sinclair (1989), and refined by Hayes & Sinclair (2010), is optimal. This answers an open question posed by Hayes & Sinclair. We then show that this lower bound can be circumvented if the size of $F$ is bounded appropriately. As we discuss, this latter result can be viewed as an abstraction of TTT, and thus represents a starting point for the development of a principled theoretical framework for TTT.

2606.11430 2026-06-11 cs.DL cs.AI cs.LO New submission

Towards a Bridge Layer Between Bibliographic and Formalized Mathematical Knowledge

迈向文献与形式化数学知识之间的桥梁层

A. Mayeux

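Affiliations: GitHub

AI summary: Proposes a relational bridge database that aligns publication metadata with formal artifacts and introduces a paper-level formalization score, estimated via cross-document alignment, as a step toward integrating the bibliographic and formal mathematics ecosystems.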

AI Chinese abstract

数学知识分散在文献数据库（如MathSciNet、zbMATH Open）和形式化证明库（如Lean mathlib）中，阻碍了已发表结果与其形式化之间的统一访问。我们提出了一个关系型桥接数据库，将出版物元数据与形式化工件对齐，为数学文献和机器可验证证明提供互操作层。我们引入了一个论文级形式化评分，衡量一篇出版物在形式化系统中的覆盖程度。作为可行性研究，我们展示了如何通过非正式文本与Lean形式化之间的跨文档对齐来估计此类评分，从而实现对形式化覆盖度的大规模分析。该框架是将文献和形式化数学生态系统整合为可扩展、机器可操作的知识图谱的第一步，该图谱将出版物与形式化证明对象关联起来。

English abstract

Mathematical knowledge is split between bibliographic databases (e.g., MathSciNet, zbMATH Open) and formal proof libraries (e.g., Lean mathlib), preventing unified access between published results and their formalizations. We propose a relational bridge-database that aligns publication metadata with formal artifacts, providing an interoperability layer between mathematical literature and machine-verifiable proofs. We introduce a paper-level formalization score that measures how much of a publication is covered in formal systems. As a feasibility study, we show how such scores can be estimated via cross-document alignment between informal texts and Lean formalizations, enabling large-scale analysis of formalization coverage. This framework is a first step toward integrating bibliographic and formal mathematical ecosystems into scalable, machine-actionable knowledge graphs linking publications to formal proof objects.
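
Editor's illustration: the abstract describes a relational bridge database linking publications to formal artifacts, plus a paper-level formalization score. The sketch below is one possible minimal shape for such a schema, using an in-memory SQLite database. Every table name, column name, and row is hypothetical, and the score used here (the fraction of a paper's statements that have at least one formal link) is an assumed stand-in, since the abstract does not give the paper's actual definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publication (
    pub_id INTEGER PRIMARY KEY,
    title  TEXT NOT NULL
);
CREATE TABLE statement (          -- theorems, lemmas, ... found in a paper
    stmt_id INTEGER PRIMARY KEY,
    pub_id  INTEGER NOT NULL REFERENCES publication(pub_id),
    label   TEXT NOT NULL
);
CREATE TABLE formal_link (        -- bridge rows: statement -> formal object
    stmt_id     INTEGER NOT NULL REFERENCES statement(stmt_id),
    formal_name TEXT NOT NULL
);
""")

# Hypothetical data: one paper, four statements, two of them formalized.
conn.execute("INSERT INTO publication VALUES (1, 'Hypothetical paper')")
conn.executemany(
    "INSERT INTO statement VALUES (?, 1, ?)",
    [(1, "Theorem 1"), (2, "Lemma 2"), (3, "Theorem 3"), (4, "Corollary 4")],
)
conn.executemany(
    "INSERT INTO formal_link VALUES (?, ?)",
    [(1, "Hypothetical.theorem_one"), (3, "Hypothetical.theorem_three")],
)


def formalization_score(db: sqlite3.Connection, pub_id: int) -> float:
    """Fraction of a paper's statements with at least one formal link."""
    total, linked = db.execute(
        """
        SELECT COUNT(DISTINCT s.stmt_id), COUNT(DISTINCT l.stmt_id)
        FROM statement AS s
        LEFT JOIN formal_link AS l ON l.stmt_id = s.stmt_id
        WHERE s.pub_id = ?
        """,
        (pub_id,),
    ).fetchone()
    return linked / total if total else 0.0


score = formalization_score(conn, 1)
print(score)
```

The `LEFT JOIN` keeps unformalized statements in the count, so the denominator is the whole paper rather than only its linked parts.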

2606.11429 2026-06-11 eess.AS cs.CL cs.SD New submission

Gumbel-BEARD: Automatic Layer Selection for Self-Supervised Adaptation of Whisper in Low-Resource Domains

Gumbel-BEARD：低资源领域Whisper自监督自适应的自动层选择

Zilai Wang, Natarajan Balaji Shankar, Mohan Shi, Kaiyuan Zhang, Abeer Alwan

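Affiliations: University of California, Los Angeles, USA

AI summary: Proposes the Gumbel-BEARD framework, which automatically selects Whisper encoder layers with a trainable Gumbel-Softmax selector and combines this with the BEST-RQ self-supervised objective for low-resource domain adaptation, achieving state-of-the-art word error rates on child speech datasets and robust gains on adult dialectal speech.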

Comments Accepted by Interspeech 2026

AI Chinese abstract

语音基础模型在低资源领域常因领域不匹配和数据稀缺而表现不佳。我们提出Gumbel-BEARD，一种领域自适应框架，通过端到端可训练的硬Gumbel-Softmax选择器自动选择Whisper编码器层。它利用BEST-RQ目标实现自监督自适应，无需手动调整即可动态适应目标声学特征。在MyST儿童语音语料库上的实验证明了其效率和可扩展性：使用10小时标注数据进行微调，我们的方法匹配了在完整133小时标注集上训练的完全监督基线。我们在MyST上使用Whisper-medium建立了8.21%的新最先进词错误率（WER），在OGI自发言语数据集上使用Whisper-small达到11.06%。在CORAAL上的评估进一步证实了对成人方言领域偏移的鲁棒性，相对WER降低高达6%，突显了我们的方法对多样低资源条件的泛化能力。

English abstract

Speech foundation models often struggle in low-resource domains due to domain mismatch and data scarcity. We propose Gumbel-BEARD, a domain adaptation framework that automates Whisper encoder layer selection via an end-to-end trainable hard Gumbel-Softmax selector. It enables self-supervised adaptation with a BEST-RQ objective that dynamically adapts to target acoustic characteristics without manual tuning. Experiments on the MyST child speech corpus demonstrate efficiency and scalability: with 10 h of labeled data for fine-tuning, our method matches a fully supervised baseline trained on the complete 133 h labeled set. We establish new state-of-the-art word error rates (WERs) of 8.21% using Whisper-medium on MyST and 11.06% using Whisper-small on the OGI Spontaneous dataset. Evaluation on CORAAL further confirms robustness to adult dialectal domain shifts, with up to 6% relative WER reduction, highlighting the generalizability of our approach to diverse low-resource conditions.
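
Editor's illustration: the abstract's selector is described as a hard Gumbel-Softmax. The sketch below shows the forward pass of that generic technique in plain Python: add Gumbel noise to per-layer logits, apply a temperature-scaled softmax, and emit a one-hot choice. The logits, the temperature, and the function name `hard_gumbel_softmax` are assumptions. The straight-through gradient that makes the selector trainable needs an autodiff framework and is not shown, and nothing here reflects the authors' actual architecture.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple


def hard_gumbel_softmax(logits: Sequence[float], tau: float = 1.0,
                        rng: Optional[random.Random] = None
                        ) -> Tuple[List[float], List[float]]:
    """Return (one-hot selection, soft probabilities) for one noisy draw."""
    if tau <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    noisy = []
    for logit in logits:
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() lies in [0, 1)
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel) / tau)

    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    norm = sum(exps)
    soft = [e / norm for e in exps]

    # "Hard" output: one-hot at the argmax of the soft probabilities.
    winner = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == winner else 0.0 for i in range(len(soft))]
    return hard, soft


# Hypothetical logits for four candidate encoder layers.
layer_logits = [0.1, 0.5, 2.0, 0.3]
one_hot, probs = hard_gumbel_softmax(layer_logits, tau=0.5,
                                     rng=random.Random(0))
selected_layer = one_hot.index(1.0)
print(selected_layer, [round(p, 3) for p in probs])
```

In training, the one-hot vector would be used in the forward pass while gradients flow through the soft probabilities, which is what allows the layer choice to be learned end to end.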

URL PDF HTML ☆

赞 0 踩 0

2606.11425 2026-06-11 cs.CR cs.AI 新提交

JailbreakOPT: Tool-Assisted Iterative Jailbreak Prompt Optimization

JailbreakOPT: 工具辅助的迭代越狱提示优化

Ge Shi, Jun Yin, Donglin Xie, Fangyi Liu, Yucan Li, Menglin Liu

发表机构 * University of California, Davis（加州大学戴维斯分校）； The Renmin University of China（中国人民大学）； Independent Researcher（独立研究员）； Nankai University（南开大学）； Cornell University（康奈尔大学）； The Chinese University of Hong Kong, Shenzhen（香港中文大学（深圳））

AI总结提出JailbreakOPT框架，通过工具库和上下文Thompson采样优化单轮越狱提示，在多个LLM上提高攻击成功率并减少攻击次数。

详情

AI中文摘要

越狱攻击暴露了大语言模型（LLM）中持续存在的安全弱点，但现有的无状态单轮方法面临权衡：手工制作的提示具有表现力但静态，而迭代提示优化可以适应但通常依赖于需要多次目标查询的低级突变。我们提出了JailbreakOPT，一个用于改进迭代单轮越狱提示优化的工具辅助框架。JailbreakOPT将多样化的原子越狱提示组织成一个攻击工具库，并通过统一的回合内优化抽象组合它们，以生成更强的独立攻击提示。为了跨攻击回合重用经验，JailbreakOPT进一步将工具选择框架化为上下文赌博机问题，并应用上下文汤普森采样基于过去的结果指导探索和利用。在多个目标LLM和攻击目标上的实验表明，与原子单轮攻击和现有的迭代优化基线相比，JailbreakOPT提高了攻击成功率（ASR），同时减少了成功所需的攻击次数（No.A）。本文可能包含冒犯性或有害内容。

英文摘要

Jailbreak attacks expose persistent safety weaknesses in large language models (LLMs), but existing stateless single-turn methods face a trade-off: hand-crafted prompts are expressive but static, while iterative prompt optimization can adapt but often relies on low-level mutations that require many target queries. We propose JailbreakOPT, a tool-assisted framework for improving iterative single-turn jailbreak prompt optimization. JailbreakOPT organizes diverse atomic jailbreak prompts into an attack tool library and composes them through a unified intra-episode optimization abstraction to generate stronger standalone attack prompts. To reuse experience across attack episodes, JailbreakOPT further frames tool selection as a contextual bandit problem and applies contextual Thompson sampling to guide exploration and exploitation based on past outcomes. Experiments across multiple target LLMs and attack goals show that JailbreakOPT improves attack success rate (ASR) while reducing the number of attacks until success (No.A) compared with atomic single-turn attacks and existing iterative optimization baselines. This paper may contain offensive or harmful content.

URL PDF HTML ☆

赞 0 踩 0

2606.11416 2026-06-11 cs.CR cs.AI 新提交

MPC-Patch-Bench: Security-Aware LLM Code Patch for Multi-Party Computation

MPC-Patch-Bench：面向多方计算的安全感知LLM代码补丁

Yukuan Zhang, Mengxin Zheng, Qian Lou

发表机构 * University of Central Florida（中央佛罗里达大学）

AI总结针对多方计算（MPC）软件缺乏仓库级代码修复基准的问题，提出MPC-Patch-Bench，包含数据筛选框架和MPC验证器，评估LLM在MPC仓库级修复中的安全性和数值保真度。

Comments preprint

详情

AI中文摘要

目前尚不存在用于评估大型语言模型（LLM）在安全多方计算（MPC）软件上代码修复的仓库级基准，直接移植SWE-bench等通用基准在三个结构层面失败：（i）MPC仓库主要由通用Python基础设施而非密码学逻辑主导；（ii）高价值MPC修复缺乏严格提取流程所需的标准化测试；（iii）标准失败到通过评估对于必须同时保证密码学安全的代码是不充分的。MPC越来越多地部署于隐私保护机器学习、生物医学协作和安全分析。现有的MPC特定代码合成工作仅涵盖算子级或单框架任务；在真实仓库级MPC修复上评估LLM代理反而需要MPC感知的数据筛选和与MPC程序必须遵守的安全性和数值保真度保证相匹配的验证器，而现有基准均未提供。我们提出MPC-Patch-Bench，一个围绕两个框架组织的仓库级基准。（1）数据筛选框架结合了一个领域特定筛选代理，该代理通过三个密码学层过滤原始拉取请求，并配备一个人类-AI补全引擎，合成缺失的问题描述和失败到通过/通过到通过测试，生成205个完全验证的实例。（2）MPC验证器通过针对明文预言机的动态差分测试和MPC特定静态分析规则（标记不安全泄露、不安全算术和非法公共/私有转换）提供专门的安全性和数值保真度检查。评估的最强LLM在功能上仅解决了22.9%的MPC-Patch-Bench任务；MPC验证器进一步将验证通过率降至17.1%，其中高达40%的功能通过补丁因密码学或数值保真度违规而被拒绝。

英文摘要

Repository-level benchmarks for evaluating Large Language Model (LLM) code repair on Secure Multi-Party Computation (MPC) software do not yet exist, and directly transplanting general-purpose benchmarks such as SWE-bench fails on three structural fronts: (i) MPC repositories are dominated by generic Python infrastructure rather than cryptographic logic; (ii) high-value MPC fixes lack the standardized tests rigid extraction pipelines require; and (iii) standard fail-to-pass evaluation is insufficient for code that must also be cryptographically safe. MPC is increasingly deployed for privacy-preserving machine learning, biomedical collaboration, and secure analytics. Existing MPC-specific code-synthesis efforts cover only operator-level or single-framework tasks; evaluating LLM agents on real repository-level MPC repair instead demands MPC-aware data curation and a verifier matched to the security and numerical-fidelity guarantees MPC programs must obey, neither of which existing benchmarks provide. We introduce MPC-Patch-Bench, a repository-level benchmark organized around two frameworks. (1) The Data Curation Framework combines a domain-specific curation agent that filters raw pull requests through three cryptographic layers with a human-AI completion engine that synthesizes missing problem statements and Fail-to-Pass/Pass-to-Pass tests, yielding 205 fully verified instances. (2) The MPC Verifier provides dedicated security and numerical-fidelity checks via dynamic differential testing against plaintext oracles and MPC-specific static analysis rules that flag unsafe reveals, insecure arithmetic, and illegal public/private casts. The strongest evaluated LLM functionally resolves only 22.9% of MPC-Patch-Bench tasks; the MPC Verifier further reduces verified resolution to 17.1%, with up to 40% of functionally-passing patches rejected for cryptographic or numerical-fidelity violations.
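
The idea of differential testing against a plaintext oracle can be sketched as follows. Everything here is hypothetical: `fixed_point_mul` is only a stand-in for a fixed-point operation of the kind MPC frameworks use (no secret sharing is involved), and `differential_test`, the 16 fractional bits, and the tolerance are illustrative choices, not the benchmark's actual verifier.

```python
import math


def to_fixed(x, frac_bits=16):
    """Encode a float as a fixed-point integer (hypothetical encoding)."""
    return round(x * (1 << frac_bits))


def from_fixed(v, frac_bits=16):
    """Decode a fixed-point integer back to a float."""
    return v / (1 << frac_bits)


def fixed_point_mul(a, b, frac_bits=16):
    """Stand-in for a secure multiply: multiply encodings, then truncate."""
    product = to_fixed(a, frac_bits) * to_fixed(b, frac_bits)
    return from_fixed(product >> frac_bits, frac_bits)


def differential_test(candidate, oracle, cases, abs_tol):
    """Return the input cases where candidate and plaintext oracle disagree."""
    failures = []
    for case in cases:
        got = candidate(*case)
        want = oracle(*case)
        if not math.isclose(got, want, rel_tol=0.0, abs_tol=abs_tol):
            failures.append(case)
    return failures
```

A patch would count as numerically faithful under this sketch only if `differential_test` returns an empty list for the chosen inputs; the tolerance has to reflect the precision of the fixed-point encoding.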

URL PDF HTML ☆

赞 0 踩 0

2606.11415 2026-06-11 q-bio.NC cs.LG physics.data-an q-bio.QM 新提交

Spatially Masked Regression Reveals Local and Distributed Predictability in Electrophysiological Recordings

空间掩蔽回归揭示电生理记录中的局部和分布式可预测性

Maryam Ostadsharif Memar, Nima Dehghani

发表机构 * Department of Electrical and Computer Engineering, IUT（电气与计算机工程系）； McGovern Institute for Brain Research, Massachusetts Institute of Technology (MIT)（脑科学研究所，麻省理工学院（MIT））

AI总结提出空间掩蔽回归（SMR）框架，通过逐步增大掩蔽区域量化电极信号中局部与分布式信息的贡献，应用于颅内和头皮脑电数据，发现邻近电极贡献显著但非全部，表明信号同时包含局部冗余和全局结构。

详情

AI中文摘要

神经记录通常被解释为局部测量，但任何单个传感器的信号也可能反映分布在整个网络中的结构化活动。这引出一个基本问题：电极信号在多大程度上反映底层系统中的局部信息与分布式信息？更具体地说，电极的活动有多少由其邻近区域携带，又有多少嵌入在阵列的更广泛分布中？我们通过空间掩蔽回归（SMR）框架解决这一问题，该框架从其余电极重建每个电极的时间序列，同时排除目标周围可配置的邻域。通过逐步增大掩蔽，空间局部性成为实验控制，用于量化在移除附近通道后有多少预测信息幸存。我们将SMR应用于具有异质电极覆盖的颅内脑电图（iEEG）和具有标准化导联组合的感觉运动皮层头皮脑电图（EEG）。使用原始信号与重建信号之间的距离相关性，我们发现两种模态中均存在强烈的受试者内重建，即使排除局部邻域后仍有显著的可预测性，且EEG中的跨受试者转移明显强于iEEG。掩蔽显示邻近电极对重建贡献显著，但并非全部，表明单个通道既反映局部冗余也反映更广泛的分布式结构。保留选定边际或谱特性但破坏相位结构或时间顺序的替代数据显著降低了性能，支持SMR依赖于结构化时间和跨通道组织而非仅边际统计的结论。这些结果将SMR定位为量化记录中局部与分布式信息平衡的可解释框架。

英文摘要

Neural recordings are often interpreted as local measurements, yet the signal at any one sensor can also reflect structured activity distributed across the broader network. This raises a basic question: to what extent does an electrode's signal reflect local versus distributed information in the underlying system? More specifically, how much of an electrode's activity is carried by its immediate neighborhood, and how much is embedded more broadly across the array? We address this with a Spatially Masked Regression (SMR) framework that reconstructs each electrode's timeseries from the remaining electrodes while excluding a configurable neighborhood around the target. By progressively increasing this mask, spatial locality becomes an experimental control for quantifying how much predictive information survives after nearby channels are withheld. We apply SMR to intracranial EEG with heterogeneous electrode coverage and to scalp EEG with standardized montages over sensorimotor cortex. Using distance correlation between original and reconstructed signals, we find strong within-subject reconstruction in both modalities, substantial residual predictability even when local neighbors are excluded, and markedly stronger cross-subject transfer in EEG than in iEEG. Masking shows that nearby electrodes contribute strongly to reconstruction but do not account for all of it, indicating that individual channels reflect both local redundancy and broader distributed structure. Surrogates that preserve selected marginal or spectral properties while disrupting phase structure or temporal ordering substantially reduce performance, supporting the conclusion that SMR depends on structured temporal and cross-channel organization rather than on marginal statistics alone. These results position SMR as an interpretable framework for quantifying the balance between local and distributed information in recordings.
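
The masking step described above can be sketched in a few lines. This covers only the choice of predictor electrodes for one target; the regression itself is left out. The function name `masked_predictors`, the use of Euclidean distance, and the strict-inequality cutoff are assumptions made for illustration and may differ from how the paper defines neighborhoods.

```python
import math


def masked_predictors(positions, target, radius):
    """Indices of electrodes allowed to predict `target`.

    positions: list of coordinate tuples, one per electrode (hypothetical layout).
    Electrodes within `radius` of the target (and the target itself) are masked.
    """
    target_pos = positions[target]
    return [
        i
        for i, pos in enumerate(positions)
        if i != target and math.dist(pos, target_pos) > radius
    ]
```

Sweeping `radius` from zero upward and refitting the reconstruction at each value would give the curve of residual predictability versus mask size that the abstract describes.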

URL PDF HTML ☆

赞 0 踩 0

2606.11361 2026-06-11 cs.IR cs.CL 新提交

A PubMed-Scale Dataset of Structured Biomedical Abstracts

一个PubMed规模的生物医学结构化摘要数据集

Chia-Hsuan Chang, Haerin Song, Brian Ondov, Hua Xu

发表机构 * Department of Biomedical Informatics & Data Science, School of Medicine, Yale University（耶鲁大学生物医学信息学与数据科学系，医学院）

AI总结针对PubMed中大量非结构化摘要阻碍下游文本处理的问题，构建了包含2320万条记录的结构化摘要语料库，其中590万条来自官方XML，1720万条通过大语言模型自动标注，统一为五段格式。

Comments Data and code for this work are available at https://doi.org/10.5281/zenodo.20336717 and https://github.com/BIDS-Xu-Lab/StructuredPubMed, respectively

详情

AI中文摘要

结构化摘要对于生物医学文献处理至关重要，它有助于信息检索、文本挖掘和知识综合。然而，PubMed中索引的绝大部分摘要仍然是非结构化的，这给下游文本处理工作流程和应用带来了重大瓶颈。为解决这一限制，我们引入了Structured PubMed，这是一个从完整PubMed数据库编译而来的全面语料库，包含超过2320万条研究文章记录，每条记录都带有节标签。该语料库分为两个不同的子集：一个包含590万条作者结构化摘要的集合，这些摘要从官方XML文件中解析而来；另一个包含1720万条原本非结构化摘要的自动标注集合，这些摘要通过逐字提取的大语言模型流水线进行结构化。每条记录都统一在统一的五节模式下，并映射到其原始PubMed标识符、出版类型和出版日期。该数据集可用于训练句子分类模型、基准测试文本分割架构，并在前所未有的PubMed范围内进行大规模、特定节的信息提取。

英文摘要

Structured abstracts are important for biomedical literature processing, by facilitating information retrieval, text mining, and knowledge synthesis. However, a vast portion of abstracts indexed in PubMed remain unstructured, presenting a significant bottleneck for downstream text-processing workflows and applications. To resolve this limitation, we introduce Structured PubMed, a comprehensive corpus of section-labeled biomedical abstracts compiled from the complete PubMed database, encompassing over 23.2 million research-article records. The corpus is divided into two distinct subsets: a collection of 5.9 million author-structured abstracts parsed from official XML files, and an automatically labeled collection of 17.2 million originally unstructured abstracts structured via a verbatim-extraction Large Language Model pipeline. Every record is harmonized under a unified five-section schema and mapped to its original PubMed identifier, publication type, and publication date. This dataset can be utilized to train sentence-classification models, benchmark text-segmentation architectures, and perform large-scale, section-specific information extraction at an unprecedented PubMed-wide scale.

URL PDF HTML ☆

赞 0 踩 0

2606.11357 2026-06-11 cs.DC cs.AI cs.AR cs.PF 新提交

TileFuse: A Fused Mixed-Precision Kernel Library for Efficient Quantized LLM Inference on AMD NPUs

TileFuse：用于AMD NPU上高效量化LLM推理的融合混合精度内核库

Wesley Pang, Gregory Hyegang Jun, Feiyang Liu, Deming Chen

发表机构 * University of Illinois Urbana-Champaign（伊利诺伊大学厄巴纳-香槟分校）； Department of Electrical and Computer Engineering（电气与计算机工程系）

AI总结针对边缘NPU上量化LLM部署困难，提出TileFuse库，通过融合解包、反量化与GEMM/GEMV内核，并设计交错预分块布局与数据流，在XDNA2上实现AWQ格式原生支持，性能提升最高281%，能耗降低64.6%。

Comments 13 pages excluding reference, 11 figures

详情

AI中文摘要

随着设备端LLM推理需求的增长，边缘SoC越来越多地集成NPU，以在严格的功耗和热预算下提高性能和能效。然而，当前客户端NPU上的实际LLM部署仍然困难：广泛使用的量化格式（如AWQ）无法干净地映射到许多现有NPU软件栈上，这些软件栈通常是专有的，并且暴露有限底层控制。在这项工作中，我们提出了TileFuse，一个面向AMD XDNA2 NPU的近底层混合精度内核库，针对量化LLM推理中的Transformer线性层。TileFuse将实用的低位格式（如AWQ风格的W4A16和W8A16）直接引入XDNA2，而不是迫使模型围绕NPU特定的量化方案重新调整。TileFuse协同设计了权重布局、元数据放置、混合精度微内核和阵列级数据流。具体来说，它将解包、反量化以及GEMM/GEMV执行融合到单个内核流中，引入了一种支持高达32K GEMM维度的交错预分块布局，并重新设计了GEMV数据流以利用完整的4x8 AIE阵列。在内核级评估中，与全精度基线相比，TileFuse在GEMM上性能提升高达121.6%，在GEMV上提升281%，同时在GEMM上相比强iGPU基线实现了超过2倍的性能和能效提升。在Ryzen AI笔记本电脑上的端到端LLM实验中，TileFuse实现了高达2.0倍的预填充延迟降低，能耗降低超过64.6%。这些结果共同表明，XDNA2是AWQ风格边缘LLM推理的实用目标，并且对现成量化的原生NPU支持可以使NPU在实际客户端部署中更加可用。

英文摘要

With the growing demand for on-device LLM inference, edge SoCs increasingly integrate NPUs to improve performance and energy efficiency under tight power and thermal budgets. However, practical LLM deployment on current client NPUs remains difficult: widely used quantization formats such as AWQ do not map cleanly onto many existing NPU software stacks, which are often proprietary and expose limited low-level control. In this work, we present TileFuse, a close-to-metal mixed-precision kernel library for AMD XDNA2 NPUs that targets transformer linear layers in quantized LLM inference. TileFuse brings practical low-bit formats such as AWQ-style W4A16 and W8A16 directly onto XDNA2, rather than forcing the model to be reshaped around an NPU-specific quantization scheme. TileFuse co-designs weight layout, metadata placement, mixed-precision microkernels, and array-level dataflow. Specifically, it fuses unpacking, dequantization, and GEMM/GEMV execution into a single kernel flow, introduces an interleaved pre-tiling layout that supports GEMM dimensions up to 32K, and redesigns GEMV dataflow to utilize the full 4x8 AIE array. Across kernel-level evaluations, TileFuse improves performance by up to 121.6% for GEMM and 281% for GEMV over full-precision baselines, while delivering more than 2x performance and energy-efficiency gains over strong iGPU baselines on GEMM. In end-to-end LLM experiments on Ryzen AI laptops, TileFuse achieves up to 2.0x lower prefilling latency with more than 64.6% lower energy consumption. Together, these results show that XDNA2 is a practical target for AWQ-style edge LLM inference and that native NPU support for off-the-shelf quantization can make NPUs substantially more usable in real client deployments.
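
What it means to fuse unpacking, dequantization, and GEMV into one pass can be shown with a scalar reference sketch. It is written in plain Python for readability and is not TileFuse code: the nibble order (low nibble first), the per-row group scales and zero points, and the names `pack_int4` and `fused_w4a16_gemv` are assumptions chosen for illustration, and real AWQ checkpoints use a different packing order.

```python
def pack_int4(values):
    """Pack unsigned 4-bit integers two per byte, low nibble first (assumed layout)."""
    assert len(values) % 2 == 0
    return bytes(
        (values[i] & 0xF) | ((values[i + 1] & 0xF) << 4)
        for i in range(0, len(values), 2)
    )


def fused_w4a16_gemv(packed_rows, scales, zeros, x, group_size):
    """Compute y = W @ x where W is stored as packed 4-bit weights.

    Each weight is unpacked, dequantized with its group's scale and zero
    point, and accumulated immediately, so no full-precision copy of W
    is ever materialized.
    """
    out = []
    for r, row in enumerate(packed_rows):
        acc = 0.0
        for byte_idx, byte in enumerate(row):
            for nibble, q in enumerate((byte & 0xF, byte >> 4)):
                col = 2 * byte_idx + nibble
                group = col // group_size
                weight = (q - zeros[r][group]) * scales[r][group]
                acc += weight * x[col]
        out.append(acc)
    return out
```

The point of the fusion is that the dequantized weight lives only in a register-like temporary (`weight` above); an unfused pipeline would first write the whole dequantized matrix to memory and then read it back for the multiply.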

URL PDF HTML ☆

赞 0 踩 0

2606.11347 2026-06-11 stat.ML cs.LG math.OC 新提交

Annealed Entropic Allocation for Ranking and Selection

退火熵分配用于排序与选择

Xin Fei, Juergen Branke

发表机构 * Business School（商学院）； The University of Edinburgh（爱丁堡大学）； Warwick Business School（沃里克商学院）； The University of Warwick（沃里克大学）

AI总结提出退火熵分配框架，通过加权log-sum-exp替代非光滑极大极小大偏差率目标，结合鞍点近似提升有限预算下的区分能力，数值实验表明在多个候选接近时性能优异。

详情

AI中文摘要

我们提出了退火熵分配，一种用于排序与选择中顺序预算分配的退火加权软最小化框架。核心思想是用加权log-sum-exp替代非光滑的极大极小大偏差率目标，该替代通过软最小化权重聚合特定候选对的得分，从而在多个候选几乎同时活跃时缓解硬切换。为了提升有限预算下的区分能力，我们引入了鞍点近似——一种从精细化的成对尾部渐近性导出的次指数修正。由于这些修正是次指数的，且平滑参数退火至零，该替代保持了与经典极大极小公式相同的一阶大偏差目标。我们证明了该替代一致收敛于硬最小值，软最小化权重集中于活跃候选，并且在固定权重下，诱导的目标分配映射在单纯形内部是连续的。在高斯和指数实例上的数值实验展示了竞争性能，尤其是在多个候选几乎持平时。

英文摘要

We propose Annealed Entropic Allocation, an annealed weighted soft-min framework for sequential budget allocation in ranking and selection. The central idea is to replace the non-smooth maximin large-deviation rate objective with a weighted log-sum-exp surrogate that aggregates challenger-specific pairwise scores through soft-min weights, mitigating hard switching when several challengers are nearly active. To improve finite-budget discrimination, we incorporate the saddlepoint approximation -- a sub-exponential correction derived from refined pairwise tail asymptotics. Because these corrections are sub-exponential and the smoothing parameter is annealed to zero, the surrogate preserves the same first-order large-deviation target as the classical maximin formulation. We show that the surrogate converges uniformly to the hard minimum, that the soft-min weights concentrate on the active challengers, and that, under fixed weights, the induced target allocation map is continuous on the simplex interior. Numerical experiments on Gaussian and exponential instances demonstrate competitive performance, especially when multiple challengers are nearly tied.
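
The replacement of a hard minimum by a weighted log-sum-exp can be illustrated with a generic sketch. The form below is the standard soft-min, $-\tau \log \sum_i w_i e^{-s_i/\tau}$; whether the paper uses exactly this weighting is not stated in the abstract, so the function `soft_min`, the uniform default weights, and the symbol `tau` for the smoothing parameter are assumptions.

```python
import math


def soft_min(scores, tau, weights=None):
    """Weighted log-sum-exp surrogate for min(scores), plus soft-min weights.

    Returns (surrogate value, list of weights that sum to 1).
    As tau -> 0 the value approaches min(scores) and the weights
    concentrate on the smallest score(s).
    """
    n = len(scores)
    if weights is None:
        weights = [1.0 / n] * n
    # Shift by the minimum for numerical stability before exponentiating.
    smallest = min(scores)
    terms = [w * math.exp(-(s - smallest) / tau) for w, s in zip(weights, scores)]
    total = sum(terms)
    value = smallest - tau * math.log(total)
    soft_weights = [t / total for t in terms]
    return value, soft_weights
```

With nearly tied scores and a moderate `tau`, the weights are spread over the tied entries instead of jumping between them, which is the "hard switching" that the smooth surrogate is meant to avoid; annealing `tau` toward zero recovers the hard minimum.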

URL PDF HTML ☆

赞 0 踩 0

2606.11339 2026-06-11 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 新提交

Quantized Stochastic Primal-Dual Methods for Distributed Optimization under Relaxed Global Geometry

松弛全局几何下分布式优化的量化随机原始-对偶方法

Susmit Sarkar, Abhinav Raghuvanshi, Kushal Chakrabarti, Mayank Baranwal

发表机构 * Indian Institute of Technology Bombay（印度理工学院孟买分校）； Tata Consultancy Services Research（塔塔咨询服务公司研究院）

AI总结提出量化随机原始-对偶方法q-PDGD，在松弛全局几何下证明线性收敛到邻域或O(1/k)收敛，匹配最优集中随机复杂度。

Comments Accepted to UAI

2606.11304 2026-06-11 physics.ins-det cs.LG hep-ex hep-ph 新提交

SPADE: Split-and-Delay Embeddings for Autoregressive High-Granularity Calorimeter Simulation

SPADE: 用于自回归高粒度量热器模拟的分裂与延迟嵌入

Joschka Birk, Frank Gaede, Anna Hallin, Gregor Kasieczka, Martina Mozzanica, Henning Rose

发表机构 * Institute for Experimental Physics, Universität Hamburg（实验物理研究所，汉堡大学）； Deutsches Elektronen-Synchrotron DESY（德国电子同步辐射光源DESY）

AI总结提出SPADE自回归变压器，通过独立嵌入多特征令牌并延迟特征流，利用标准自注意力学习令牌内相关性，在ILD探测器点云簇射生成中优于现有模型。

Comments 20 pages, 13 figures

2606.11295 2026-06-11 astro-ph.CO cs.LG 新提交

Interpretable Neural Marked Statistics for Cosmological Inference

可解释的神经标记统计用于宇宙学推断

Federico Semenzato, Benjamin D. Wandelt, Michele Liguori, Alvise Raccanelli

发表机构 * University of Cambridge（剑桥大学）； University of Waterloo（滑铁卢大学）

AI总结提出一种神经标记方案，通过可解释的物理变换从形态学层面提取宇宙学信息，在对比学习目标下优化标记统计，显著提高对σ₈和Ωₘ的约束精度。

Comments 11 pages, 6 figures. Accepted to the Workshop on AI for Physics (ICML 2026)

详情

AI中文摘要

恢复超出功率谱的宇宙学信息是即将进行的宇宙学调查的核心目标，因为物质密度中的晚期非高斯信号无法仅通过两点统计获得。标记统计通过使用非线性函数对场进行重新加权，将部分信息折叠回两点水平。我们提出了一种神经标记方案，通过一组可解释的、物理驱动的变换来推广这一过程，这些变换直接允许在形态学层面解释宇宙学信息的增益。我们采用对比学习目标将可学习的标记摘要与底层宇宙学参数对齐。在$k_{\max}=0.2\,h\,\mathrm{Mpc}^{-1}$处，与经典标记相比，我们的神经标记将$\sigma_8$的边缘化约束提高了$2.9\times$，将$\Omega_m$提高了$1.8\times$，在Fisher信息层面打破了$\Omega_m-\sigma_8$简并。它进一步将参数MSE在整个宇宙学参数先验上比最佳经典标记降低了$1.45\times$。学习到的潜在几何与参数空间中的$\Omega_m$和$\sigma_8$方向对齐，表明对比目标恢复了宇宙学信息的主导轴。我们的方法为更强大、可解释的宇宙学推断摘要统计打开了大门。

英文摘要

Recovering cosmological information beyond the power spectrum is a central goal for upcoming cosmological surveys, since late-time non-Gaussian signal in the matter density cannot be accessed through two-point statistics alone. Marked statistics fold part of this information back into the two-point level by reweighting the field with non-linear functions. We propose a neural marking scheme to generalize this process through a set of interpretable, physically motivated transformations that make the gain in cosmological information directly interpretable at the morphological level. We employ a contrastive learning objective to align learnable marked summaries with the underlying cosmological parameters. At $k_{\max}=0.2\,h\,\mathrm{Mpc}^{-1}$, our neural mark tightens the marginalized constraint on $\sigma_8$ by $2.9\times$ and on $\Omega_m$ by $1.8\times$ compared to classical marks, breaking the $\Omega_m-\sigma_8$ degeneracy at the Fisher information level. It further reduces the parameter MSE across our cosmological parameter prior by $1.45\times$ over the best classical mark. The learned latent geometry aligns with the $\Omega_m$ and $\sigma_8$ directions in parameter space, indicating that the contrastive objective recovers the dominant axes of cosmological information. Our approach opens the door to more powerful, interpretable summary statistics for cosmological inference.
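
For context on what "reweighting the field with non-linear functions" looks like, here is a sketch of one commonly used classical mark from the marked-statistics literature, $m = \left[(1+\delta_s)/(1+\delta_s+\delta_R)\right]^p$, where $\delta_R$ is the smoothed overdensity. The abstract does not say which classical marks the paper compares against, so treating this form as the baseline is an assumption, and the default parameter values below are illustrative only.

```python
def density_mark(delta_R, p=2.0, delta_s=0.25):
    """Classical density-dependent mark (assumed form, illustrative defaults).

    delta_R: smoothed local overdensity at a point (must keep the base positive).
    p > 0 up-weights underdense regions and down-weights overdense ones.
    """
    return ((1.0 + delta_s) / (1.0 + delta_s + delta_R)) ** p


def apply_mark(densities, smoothed, p=2.0, delta_s=0.25):
    """Weight each cell's density (1 + delta) by the mark of its smoothed overdensity."""
    return [
        density_mark(d_R, p, delta_s) * (1.0 + d)
        for d, d_R in zip(densities, smoothed)
    ]
```

A two-point statistic of the weighted field then carries information that the unweighted power spectrum misses; the neural scheme in the abstract replaces this fixed functional form with learned, interpretable transformations.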

URL PDF HTML ☆

赞 0 踩 0

2606.11287 2026-06-11 eess.IV cs.CV 新提交

Intelligent Skin Cancer Detection Using a Multispectral Metasurface and a Hybrid

基于多光谱超表面和混合深度学习的智能皮肤癌检测

Afsane Saee Arezoomand

发表机构 * PhD in Communication Engineering, Young Researchers and Elite Club, Islamic Azad University, Urmia Branch, Urmia, Iran（通信工程博士，青年研究者与精英俱乐部，伊斯兰阿扎德大学乌尔米亚分校，乌尔米亚，伊朗）

AI总结提出结合多光谱超表面成像与CNN-ViT混合深度学习架构，实现皮肤癌高精度检测，准确率达98%，灵敏度95%，特异性99%。

Comments 8 pages

详情

Journal ref: New Researches in the Smart City, Vol. 4, No. 1, Autumn 2025

AI中文摘要

皮肤癌是全球最常见的恶性肿瘤之一，早期检测对于提高患者生存率和降低治疗成本至关重要。传统的皮肤镜和视觉成像技术主要局限于可见光谱，通常无法捕捉与早期恶性肿瘤相关的细微光谱特征。本研究提出了一种创新框架，将多光谱超表面成像与基于卷积神经网络和视觉Transformer的混合深度学习架构相结合。设计的超表面能够非侵入性地获取对组织变化高度敏感的丰富光谱信息，而混合CNN-ViT模型同时提取局部和全局特征，以稳健地对皮肤病变进行分类。基于模拟的评估表明，所提方法实现了约98%的准确率、95%的灵敏度和99%的特异性，优于传统的基于RGB和单一架构的方法。使用注意力图进行的定性分析显示，模型关注临床相关的病变区域，提高了可解释性。总体而言，结果表明，将基于超表面的多光谱成像与混合深度学习相结合，可以引入新一代皮肤病学诊断工具，并为便携、快速且高精度的临床系统铺平道路。

英文摘要

Skin cancer is among the most prevalent malignancies worldwide, and its early detection is essential for improving patient survival and reducing treatment costs. Conventional dermoscopic and visual imaging techniques are primarily limited to the visible spectrum and often fail to capture subtle spectral signatures associated with early-stage malignancies. This study proposes an innovative framework that integrates a multispectral metasurface for imaging with a hybrid deep learning architecture based on Convolutional Neural Networks and Vision Transformers. The designed metasurface enables noninvasive acquisition of rich spectral information highly sensitive to tissue alterations, while the hybrid CNN-ViT model simultaneously extracts local and global features to robustly classify skin lesions. Simulation-based evaluations demonstrate that the proposed method achieves approximately 98% accuracy, 95% sensitivity, and 99% specificity, surpassing conventional RGB-based and single-architecture approaches. Qualitative analyses using attention maps reveal that the model focuses on clinically relevant lesion regions, improving interpretability. Overall, the results indicate that combining metasurface-based multispectral imaging with hybrid deep learning can introduce a new generation of diagnostic tools in dermatology and pave the way for portable, fast, and highly accurate clinical systems.

URL PDF HTML ☆

赞 0 踩 0

2606.11284 2026-06-11 cs.MA cs.GT cs.LG 新提交

Phi-Actor-Critic: Steering General-Sum Games to Pareto-Efficient Correlated Equilibria

Phi-Actor-Critic: 引导一般和博弈走向帕累托高效关联均衡

Wongyu Lee, Francesco Lelli, Omran Ayoub, Massimo Tornatore

发表机构 * Politecnico di Milano（米兰理工大学）； Tilburg University（蒂尔堡大学）； University of Applied Sciences and Arts of Southern Switzerland（瑞士南部应用科学与艺术大学）

AI总结提出Φ-Actor-Critic框架，通过交换遗憾最小化引导多智能体学习向高社会福利的关联均衡收敛，并采用集中式注意力批评家高效估计反事实遗憾，结合拉格朗日均衡选择机制优化社会福利。

Comments Accepted to IJCAI 2026

详情

AI中文摘要

现实世界的多智能体系统，从交通协调到资源分配，通常被建模为一般和博弈，其中个体激励与集体福利相冲突。在这些设定中，核心挑战不仅是找到均衡，而是在许多次优纳什均衡中选择社会期望的结果。标准的深度多智能体强化学习（MARL）方法难以解决这个问题，因为价值分解方法受单调性假设约束，而策略梯度方法往往收敛到稳定但社会效率低下的均衡。为了解决这一限制，我们提出了Φ-Actor-Critic（Φ-AC），一个利用交换遗憾最小化引导学习向高福利关联均衡（CE）收敛的框架。为了使反事实遗憾估计在深度MARL中易于处理，Φ-AC采用了一个集中式注意力批评家，在单次前向传播中预测向量值遗憾，避免了计算昂贵的反事实模拟。我们进一步引入了一个基于拉格朗日的均衡选择机制，通过遗憾约束优化社会福利同时确保稳定性。在矩阵博弈、多智能体粒子环境（MPE）和Melting Pot Harvest场景上的实验表明，Φ-AC在多样的混合动机设定中学习到高效且稳定的协调策略，同时保持高集体回报和竞争公平性。

英文摘要

Real-world multi-agent systems, from traffic coordination to resource allocation, are often modeled as general-sum games where individual incentives conflict with collective welfare. In these settings, the central challenge is not merely finding an equilibrium, but selecting socially desirable outcomes among many suboptimal Nash equilibria. Standard deep multi-agent reinforcement learning (MARL) methods struggle with this problem, as value-decomposition approaches are constrained by monotonicity assumptions and policy-gradient methods often converge to stable but socially inefficient equilibria. To address this limitation, we propose $Φ$-Actor-Critic ($Φ$-AC), a framework that leverages swap regret minimization to steer learning toward high-welfare correlated equilibria (CE). To make counterfactual regret estimation tractable in deep MARL, $Φ$-AC employs a centralized attention critic that predicts vector-valued regrets in a single forward pass, avoiding computationally expensive counterfactual simulations. We further introduce a Lagrangian-based equilibrium selection mechanism that optimizes social welfare while enforcing stability through regret constraints. Experiments on matrix games, Multi-Agent Particle Environments (MPE), and the Melting Pot Harvest scenario demonstrate that $Φ$-AC learns efficient and stable coordination strategies across diverse mixed-motive settings while maintaining high collective return and competitive fairness.

URL PDF HTML ☆

赞 0 踩 0

2606.11283 2026-06-11 cs.DS cs.LG stat.ML 新提交

Fixed-Parameter Tractability of Private Synthetic Data Generation

私有合成数据生成的固定参数可处理性

Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi

发表机构 * Google DeepMind（谷歌DeepMind）； Institute for Mathematical and Computational Engineering, Faculty of Mathematics and School of Engineering, Pontificia Universidad Católica de Chile（智利天主教大学数学学院与工程学院数学与计算工程研究所）

AI总结研究差分隐私下合成数据生成问题，通过查询族关联图的树宽参数建立固定参数可处理性，提出两种最优算法。

2606.11279 2026-06-11 eess.AS cs.CL cs.LG cs.SD 新提交

Massive Open-Vocabulary Keyword Spotting

大规模开放词汇关键词识别

Leonor Barreiros, Raul Monteiro, Afonso Mendes, Gonçalo M. Correia

发表机构 * Priberam Labs（Priberam实验室）； Instituto Superior Técnico（里斯本高等理工学院）； Instituto de Telecomunicações（电信研究所）

AI总结提出一种内存占用更小的开放词汇关键词识别系统，无需微调即可处理大规模数据库，在未见语言中达到与未压缩方案相当的实体召回率。

Comments Accepted to Interspeech 2026

2606.11274 2026-06-11 cs.MA cs.LG physics.flu-dyn 新提交

Multi-agent rendezvous in fluid flows via reinforcement learning

基于强化学习的多智能体在流体中的会合

Bocheng Li, Jingran Qiu, Lihao Zhao

发表机构 * AML, Department of Engineering Mechanics, Tsinghua University（AML，工程力学系，清华大学）； Department of Physics, Gothenburg University（物理系，哥德堡大学）

AI总结采用多智能体强化学习（MARL）在涡旋流中开发物理信息会合策略，显著提高会合率，并具有跨涡旋强度、尺度和群体规模的迁移性，通过打破状态-动作图对称性防止智能体被困在分离涡旋中。

详情

AI中文摘要

会合是多智能体系统的一项关键任务，要求智能体协调以在未指定位置相遇。然而，在流体环境中实现这一目标具有挑战性，因为尚不清楚智能体如何利用底层流体运动学来促进收敛。在本研究中，我们采用多智能体强化学习（MARL）方法在涡旋流中开发物理信息会合策略。与智能体向其对应方导航的朴素策略相比，MARL策略显著提高了会合率。MARL策略还表现出跨不同涡旋强度、涡旋尺度和群体规模的可迁移性。通过打破状态-动作图的对称性，MARL策略利用一种非直观的机制，防止智能体被困在分离的涡旋中，从而提高会合成功率。此外，从学习到的策略中提取了一种启发式策略，其性能也优于朴素策略。进一步的理论分析表明，流体变形阻碍了会合过程。大的有限时间李雅普诺夫指数识别出流体效应分离相邻智能体的区域，表明应在弱变形区域规划目标。我们的发现揭示了智能体-流体相互作用在多智能体任务中的重要作用，并突出了MARL在复杂流动环境中探索群体智能的能力。

英文摘要

Rendezvous is a critical task for multi-agent systems, requiring agents to coordinate to meet at an unspecified location. However, achieving this in fluid environments presents a challenge, as it remains unclear how agents can exploit underlying fluid kinematics to facilitate convergence. In this study, we adopt a multi-agent reinforcement learning (MARL) approach to develop physics-informed rendezvous strategies in vortical flows. Compared to a naive strategy, where agents navigate toward their counterparts, MARL strategies significantly improve the rendezvous rate. MARL strategies also show transferability across varying vortex intensities, vortex scales, and swarm sizes. By breaking the symmetry of the state-action map, MARL strategy leverages a non-intuitive mechanism that prevents agents from becoming trapped in separate vortices, thereby enhancing rendezvous success. Additionally, a heuristic strategy is extracted from the learned strategy and also outperforms the naive strategy. Furthermore, a theoretical analysis demonstrates that fluid deformation impedes the rendezvous process. Large finite-time Lyapunov exponents identify where fluid effects separate adjacent agents, suggesting that targets should be planned in weak-deformation regions. Our findings reveal the important role that agent-fluid interactions play in multi-agent tasks and highlight the MARL capability to explore swarm intelligence in complex flow environments.

URL PDF HTML ☆

赞 0 踩 0

2606.11265 2026-06-11 cs.CR cs.AI 新提交

When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipelines

当投毒在检索后失败：重新审视分块与重排序管道下的语料库投毒

Xi Nie, Hongwei Li, Shenghao Wu, Mingxuan Li, Jiachen Li, Wenbo Jiang

发表机构 * School of Computer Science, Shandong University（山东大学计算机学院）； School of Information, Shandong University（山东大学信息学院）； School of Software Engineering, Shandong University（山东大学软件学院）

AI总结针对RAG系统，提出CRCP框架，通过联合优化检索相关性、重排序一致性和分块边界鲁棒性，解决现有投毒方法在真实多阶段检索管道中因分块和重排序导致效果下降的问题。

详情

AI中文摘要

检索增强生成（RAG）系统容易受到语料库投毒攻击，这些攻击通过恶意知识注入操纵下游模型输出。现有研究主要在简化的检索设置下评估投毒，忽视了涉及文档分块、密集检索、重排序和有依据的生成等环节的实际RAG管道。在本文中，我们重新审视了在现实多阶段检索管道下的语料库投毒，并表明许多现有攻击在重排序后效果显著下降，尽管在检索阶段实现了高相关性。我们识别出检索粒度不匹配是这种失败的关键原因：文档级别的对抗信号在分块过程中经常被碎片化，而重排序器偏好局部连贯且包含答案的段落，而非全局优化的语义相似性。基于这一观察，我们提出了分块感知和重排序一致的投毒（CRCP），这是一个联合优化检索相关性、重排序一致性和分块边界鲁棒性的投毒框架。CRCP在优化过程中显式建模分块变换，以生成在变化的分块配置下仍然有效的局部自包含对抗段落。在多个检索器和重排序器的标准RAG基准上的实验表明，现有投毒方法对分块大小和重排序策略高度敏感，而CRCP在现实检索管道中实现了显著更高的攻击成功率和更强的鲁棒性。我们的发现凸显了当前RAG安全评估中的一个重要现实差距，并表明现代RAG系统中的投毒应被视为一个多阶段检索一致性问题，而不仅仅是检索问题。

英文摘要

Retrieval-Augmented Generation (RAG) systems are vulnerable to corpus poisoning attacks that manipulate downstream model outputs through malicious knowledge injection. Existing studies mainly evaluate poisoning under simplified retrieval settings, overlooking practical RAG pipelines involving document chunking, dense retrieval, reranking, and grounded generation. In this paper, we revisit corpus poisoning under realistic multi-stage retrieval pipelines and show that many existing attacks substantially degrade after reranking despite achieving high retrieval-stage relevance. We identify retrieval granularity mismatch as a key reason for this failure: document-level adversarial signals are often fragmented during chunking, while rerankers favor locally coherent and answer-bearing passages rather than globally optimized semantic similarity. Based on this observation, we propose Chunk-aware and Rerank-Consistent Poisoning (CRCP), a poisoning framework that jointly optimizes retrieval relevance, reranker consistency, and chunk-boundary robustness. CRCP explicitly models chunking transformations during optimization to generate locally self-contained adversarial passages that remain effective under varying chunking configurations. Experiments on standard RAG benchmarks with multiple retrievers and rerankers show that existing poisoning methods are highly sensitive to chunk size and reranking strategies, whereas CRCP achieves substantially higher attack success rates and stronger robustness across realistic retrieval pipelines. Our findings highlight an important realism gap in current RAG security evaluation and suggest that poisoning in modern RAG systems should be studied as a multi-stage retrieval consistency problem rather than a retrieval-only problem.

URL PDF HTML ☆

赞 0 踩 0