Large Language Models / LLM - arXivDaily Topic

2606.19721 2026-06-19 cs.LG cs.AI New submission 65%

OnDeFog: Online Decision Transformer under Frame Dropping

Daiki Yotsufuji, Kenta Nishihara, Shoma Shimizu, Kento Uchida, Shinichi Shirakawa

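Affiliations: Yokohama National University

Topic match (Other LLM): proposes an online decision transformer that handles frame dropping.

AI summary: To address the performance degradation caused by frame dropping, the paper proposes OnDeFog, which combines the DeFog mechanisms with the online decision transformer (ODT) and learns policies through direct environment interaction. It outperforms ODT in environments with high frame-drop rates and outperforms DeFog on low-reward datasets.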

Comments Accepted to PRICAI 2025

DOI: 10.1007/978-981-95-7072-0_10

Abstract

In challenging real-world reinforcement learning applications, communication delays or sensor failures often cause frame dropping, in which the agent cannot receive the dropped states and associated rewards. To address the performance degradation caused by frame dropping, the Decision Transformer under Random Frame Dropping (DeFog) was developed by incorporating additional mechanisms into the decision transformer to tackle frame dropping. Although DeFog can mitigate performance degradation in frame-dropping environments, since DeFog is an offline learning method, it struggles to effectively generalize to novel states not adequately represented in the training dataset. In this study, we propose OnDeFog, which integrates the mechanisms in DeFog with the online decision transformer (ODT), an online reinforcement learning method that learns policies through direct environmental interaction. Comprehensive experimental evaluation demonstrates that our proposed OnDeFog achieves superior performance compared to ODT in environments characterized by high dropping frame rate and outperforms DeFog on datasets containing a large amount of low-reward data.
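
The frame-dropping setting described in the abstract can be sketched as a small simulation. This is a minimal illustration, not the authors' environment: the function name `observe_with_drops`, the choice to repeat the last delivered state, and the consecutive-drop counter are all assumptions made for the example.

```python
import random

def observe_with_drops(trajectory, drop_rate, seed=0):
    """Return what the agent sees for each step of `trajectory`.

    trajectory: list of (state, reward) pairs from the true environment.
    Assumption for this sketch: a dropped frame repeats the last delivered
    state, hides the reward (None), and increments a counter of
    consecutive dropped frames.
    """
    rng = random.Random(seed)
    observed = []
    last_state = None
    drop_span = 0
    for t, (state, reward) in enumerate(trajectory):
        # The first frame always arrives so there is a state to fall back on.
        dropped = t > 0 and rng.random() < drop_rate
        if dropped:
            drop_span += 1
            observed.append((last_state, None, drop_span))
        else:
            drop_span = 0
            last_state = state
            observed.append((state, reward, drop_span))
    return observed
```

With `drop_rate=0.0` the agent sees the true trajectory; as the rate approaches 1 it acts almost entirely on stale states, which is the regime where the abstract reports the largest gap to ODT.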

2606.19587 2026-06-19 stat.ML cs.LG New submission 60%

A Solver-Free Training Method for Predict-then-Optimize

Beichen Wan, Mo Liu

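Affiliations: Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, NC, USA

Topic match (Other LLM): proposes a solver-free training method for optimizing prediction models; classified as an LLM application.

AI summary: Proposes a decision-focused learning pipeline based on a measure transformation principle. A solver-free surrogate loss enables efficient training of prediction models in predict-then-optimize, with a theoretical guarantee of Fisher consistency and training time reduced by orders of magnitude.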

Comments Accepted by ICML 2026

Abstract

We propose a scalable method for training prediction (machine learning) models in the predict-then-optimize paradigm, where model outputs serve as coefficients for a subsequent linear optimization task. Directly minimizing the empirical decision regret is intractable for linear programming and combinatorial optimization since the decision mapping is piecewise constant, and the gradients are zero almost everywhere. While existing methods address this by smoothing the differentiation process, they suffer from scalability issues, since a computationally expensive solver call is required for every gradient evaluation. To address this, we propose a decision-focused learning pipeline based on a measure transformation principle, which yields a new surrogate loss that is completely optimization-solver-free during training. We establish theoretical guarantees, including Fisher consistency and excess risk bounds. Empirically, our method achieves decision quality competitive with state-of-the-art methods while reducing training time by orders of magnitude.
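
To see why the empirical decision regret gives no gradient signal, here is a minimal Python sketch on a toy problem. The two-vertex feasible set, the cost vectors, and the helper names `decide` and `regret` are illustrative assumptions, not taken from the paper, and the sketch does not implement the paper's surrogate loss.

```python
import numpy as np

# Feasible decisions: vertices of a toy polytope (hypothetical example data).
VERTICES = np.array([[1.0, 0.0], [0.0, 1.0]])

def decide(cost):
    """Return the vertex minimizing the linear objective cost @ w."""
    return VERTICES[np.argmin(VERTICES @ cost)]

def regret(predicted_cost, true_cost):
    """Extra true cost incurred by optimizing against the prediction."""
    chosen = decide(predicted_cost)
    best = decide(true_cost)
    return float(true_cost @ chosen - true_cost @ best)

true_cost = np.array([1.0, 2.0])
for shift in (0.0, 0.5, 0.9, 1.1, 2.0):
    predicted = np.array([1.0 + shift, 2.0])
    print(shift, regret(predicted, true_cost))
```

The regret stays flat while the predicted costs keep the same ordering and jumps only when the chosen vertex changes, so its derivative with respect to the prediction is zero wherever it exists.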

2606.19410 2026-06-19 stat.ML cs.LG New submission 60%

The Representational Limit of Scalar Interactions: An Interventional Decomposition

Potito Aghilar, Sabino Roccotelli, Stanislao Fidanza, Vito Walter Anelli, Sebastiano Stramaglia, Tommaso Di Noia

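Affiliations: Polytechnic University of Bari; University of Bari Aldo Moro

Topic match (Other LLM): proposes a feature-interaction decomposition method that can be used for model interpretation.

AI summary: The paper proves that scalar interaction scores conflate uniqueness, redundancy, and synergy, and proposes Stochastic Hi-Fi, which decomposes each feature's U/R/S profile by interventional masked inference and recovers structure missed by scalar baselines on tabular and image tasks.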

Abstract

Signed pairwise interaction scores fundamentally conflate uniqueness (U), redundancy (R), and synergy (S). We prove this on a minimal 3-way XOR structural causal model: faithful indices such as Shapley-Taylor return zero per pair, whereas projective indices such as Shapley Interaction spread the third-order effect into pair scalars that conflate the three mechanisms. We introduce Stochastic Hi-Fi, a post-hoc, retraining-free predictability decomposition that estimates per-feature U/R/S profiles by interventional masked inference. The estimator provides exact interventional semantics, finite-sample Monte Carlo bounds, strict variance reduction from coupled diamond sampling, and uniform finite-vocabulary convergence. Across tabular SCMs, Stochastic Hi-Fi recovers structure missed by scalar baselines (up to 411x larger interaction-magnitude recovery ratios). It also separates redundant and synergistic heads in the GPT-2 IOI circuit. On NIH ChestX-ray14, Stochastic Hi-Fi matches GradCAM on Pointing Game and improves substantially on Deletion AUC.
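
The 3-way XOR example can be made concrete with a few lines of Python. This sketch only illustrates why any view restricted to feature pairs sees nothing in a 3-way XOR under uniform binary inputs; it does not compute Shapley-Taylor or Shapley Interaction indices, and it is not the Stochastic Hi-Fi estimator. The helper name `conditional_means` is an assumption for the example.

```python
from collections import defaultdict
from itertools import combinations, product

def conditional_means(subset):
    """Mean of y = x0 ^ x1 ^ x2 for each setting of the features in `subset`,
    averaging uniformly over the remaining features."""
    groups = defaultdict(list)
    for x in product((0, 1), repeat=3):
        y = x[0] ^ x[1] ^ x[2]
        groups[tuple(x[i] for i in subset)].append(y)
    return {setting: sum(ys) / len(ys) for setting, ys in groups.items()}

# Every pair of features leaves the label at chance level...
for pair in combinations(range(3), 2):
    print(pair, conditional_means(pair))

# ...while all three features together determine it exactly.
print(conditional_means((0, 1, 2)))
```

Because every pair leaves the label at its base rate of 0.5, the predictive signal is purely third-order, which is the kind of synergy the abstract says a single pairwise scalar cannot represent faithfully.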

2606.20518 2026-06-19 cs.AI New submission 60%

FlowEdit: Associative Memory for Lifelong Pronunciation Adaptation in Flow-Matching TTS

Harshit Singh, Ayush Pratap Singh, Nityanand Mathur

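Affiliations: University of Maryland; TU Darmstadt; Smallest AI

Topic match (Other LLM): lifelong pronunciation adaptation for flow-matching TTS.

AI summary: Flow-matching TTS cannot correct mispronounced proper nouns after deployment. The proposed FlowEdit framework learns pronunciation corrections as latent conditioning edits rather than weight updates, stores and retrieves them with a Modern Hopfield Network, and reduces Phoneme Error Rate by 92.7% on a benchmark of 312 multilingual proper nouns.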

Abstract

Flow-matching text-to-speech systems achieve remarkable zero-shot quality but remain static after deployment: pronunciation errors on out-of-vocabulary proper nouns persist unless the model is retrained. We introduce FlowEdit, a life-long adaptation framework for frozen flow-matching TTS that learns pronunciation corrections as latent conditioning edits rather than weight updates. When corrective feedback is provided, FlowEdit optimizes a token-level perturbation in the text embedding space, then stores the correction in a Modern Hopfield Network serving as content-addressable episodic memory. At inference, corrections are retrieved via soft attention with a similarity gate, enabling fuzzy morphological matching. On our curated benchmark of 312 multilingual proper nouns across 18 language families, FlowEdit reduces target-word Phoneme Error Rate by 92.7% relative to the zero-shot baseline while maintaining identical general-speech quality. Corrections complete in approximately 15 seconds on a single GPU.
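
The retrieval step (soft attention over stored corrections, guarded by a similarity gate) might look roughly like the sketch below. It assumes cosine similarity between a query embedding and stored keys, a softmax with inverse temperature `beta`, and a hard threshold `gate`; these names, default values, and the zero-edit fallback are assumptions for illustration, not FlowEdit's actual interface.

```python
import numpy as np

def retrieve_correction(query, keys, values, beta=8.0, gate=0.7):
    """Softmax-attention read over stored corrections with a similarity gate.

    query:  (d,) embedding of the word being synthesized.
    keys:   (m, d) embeddings of previously corrected words.
    values: (m, p) stored perturbations (the learned edits).
    Returns a zero edit when no stored key is similar enough.
    """
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = k @ q                      # cosine similarity to each stored key
    if sims.max() < gate:             # gate: leave unrelated words untouched
        return np.zeros(values.shape[1])
    weights = np.exp(beta * (sims - sims.max()))  # numerically stable softmax
    weights /= weights.sum()
    return weights @ values
```

A soft read of this kind lets a query that is close to, but not identical with, a stored key still pull in most of that key's edit, which is one way to obtain the fuzzy matching the abstract mentions.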

2606.20431 2026-06-19 cs.LG New submission 60%

Sparsity, Superposition, and Forgetting: A Mechanistic Study of Representation Retention in Continual Learning

Jan Wasilewski, Jędrzej Kozal, Michał Woźniak, Bartosz Krawczyk

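Affiliations: Rochester Institute of Technology; Wrocław University of Science and Technology

Topic match (Other LLM): studies forgetting mechanisms in continual learning; related to LLMs.

AI summary: Studies forgetting mechanisms in continual learning with a controllable toy framework. It finds that superposition increases over time with transient dips at task boundaries, that high sparsity increases superposition without necessarily causing forgetting, and that task-level effective rank grows with sparsity.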

Abstract

Continual learning (CL) systems often forget previously acquired knowledge, yet the mechanisms driving forgetting remain hard to isolate in practice because real datasets entangle many factors. We present a controlled, toy-world framework that makes these mechanisms observable and testable. Using a synthetic generator-separator pipeline, we define ground-truth latent features, build tasks with tunable sparsity and overlap, and introduce measurable quantities for representation strength and superposition (directional overlap among features). We then study retention dynamics (the temporal change of representation strength) by fitting sparse dynamical relations (via SINDy) between retention, superposition, and exposure history. A complementary task-level analysis based on effective rank characterizes how representational capacity is allocated across tasks. Our controlled experiments yield three takeaways. (1) Superposition tends to increase over time with transient dips at task boundaries, suggesting boundary-specific interference rather than steady drift. (2) Higher feature sparsity induces more superposition yet does not inevitably cause forgetting; when representations remain strong, forgetting can be reduced despite overlap. (3) Task-level effective rank grows with sparsity, indicating broader capacity usage under sparse regimes. Together, these results nuance the common intuition that more superposition leads to more forgetting by showing that overlap interacts with representation strength and capacity allocation. Our toy analysis provides falsifiable hypotheses and diagnostic tools for CL.
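
Effective rank, which the task-level analysis relies on, can be computed in a few lines. The sketch below assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values); the abstract does not state which definition the authors use, so treat this as one plausible reading.

```python
import numpy as np

def effective_rank(matrix, eps=1e-12):
    """Entropy-based effective rank of a matrix (assumed definition).

    Normalizes the singular values into a probability vector and returns
    exp(entropy), which ranges from 1 (one dominant direction) up to
    min(matrix.shape) (all directions used equally).
    """
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]                     # drop numerically zero directions
    return float(np.exp(-(p * np.log(p)).sum()))
```

A representation that spreads its variance over many directions scores close to its full dimension, while a collapsed representation scores close to 1, which is how the measure can indicate broader capacity usage.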

2606.20254 2026-06-19 cs.CR New submission 60%

Quantization as a Malicious Task: Removing Quantization-Conditioned Backdoors via Task Arithmetic

Kaihsun Yang, Min-Yan Tsai, Chia-Mu Yu

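Topic match (Other LLM): defends against quantization backdoors; concerns model security.

AI summary: Proposes QVec, which treats the quantization-induced weight change as a malicious task vector and corrects parameters before deployment, defending against quantization-conditioned backdoors without retraining or trigger samples.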

Abstract

Model quantization is widely adopted to reduce memory usage and inference cost when deploying deep neural networks on resource-constrained devices. However, recent studies have revealed a new security threat known as Quantization-Conditioned Backdoors (QCBs), where a model behaves normally in full precision but activates malicious behavior only after quantization. Existing defenses typically modify quantization procedures or correct activation statistics, often introducing additional computational overhead or relying on specific quantization settings. Here, we present QVec, a parameter-space perspective for defending against QCBs. We observe that the weight difference between a full-precision model and its quantized counterpart encodes a structured behavioral shift, which can be interpreted as a malicious task vector rather than random quantization noise. Based on this insight, QVec counteracts this malicious direction through controlled parameter correction prior to deployment. QVec requires no retraining, no trigger samples, and only a single quantization pass to estimate the parameter shift, together with a lightweight hyperparameter search. Extensive experiments across image classification benchmarks and multiple Large Language Model (LLM) attack scenarios demonstrate that QVec consistently suppresses backdoor activation while preserving clean performance.
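
The parameter-space idea can be sketched as ordinary task-vector negation. Everything concrete here is an assumption: the abstract does not give the correction formula, so the update `w - alpha * (Q(w) - w)`, the helper names, and the uniform round-to-nearest quantizer are stand-ins chosen for illustration rather than QVec's actual procedure.

```python
import numpy as np

def fake_quantize(w, num_levels=16):
    """Uniform round-to-nearest quantization (stand-in quantizer)."""
    lo, hi = float(w.min()), float(w.max())
    if hi == lo:
        return w.copy()
    scale = (hi - lo) / (num_levels - 1)
    return np.round((w - lo) / scale) * scale + lo

def corrected_weights(w_full, alpha, quantize=fake_quantize):
    """Shift full-precision weights against the quantization-induced change.

    task_vector is the weight difference between the quantized and the
    full-precision model; alpha is the strength found by hyperparameter
    search (hypothetical name). The corrected weights are what would then
    be quantized and deployed.
    """
    task_vector = quantize(w_full) - w_full
    return w_full - alpha * task_vector
```

Only one quantization pass is needed to form `task_vector`, which matches the abstract's description of the cost; choosing `alpha` would correspond to the lightweight hyperparameter search.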

2606.19910 2026-06-19 cs.CL cs.SD eess.AS New submission 60%

Light-weight Pronunciation Assessment via Discrete Speech Token Surprisal

Syeda Faiza Ahmed Sara, Shammur Absar Chowdhury

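Affiliations: Qatar Computing Research Institute, Doha, Qatar

Topic match (Other LLM): uses a language model to compute speech-token surprisal for pronunciation assessment.

AI summary: Proposes a lightweight pronunciation assessment framework trained only on native speech resources. It computes surprisal with a language model over discretized speech tokens, adds transcript-guided alignment features, and approaches supervised performance with no supervision or light calibration.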

Comments Accepted to Interspeech 2026

Abstract

Training automated pronunciation assessment often relies on labeled learner errors or non-native corpora that are costly to collect. We propose a lightweight framework trained only on native speech resources, operating unsupervised or lightly calibrated with a small set of scored utterances. At inference, learner speech is discretized with an SSL encoder and a K-means codebook. A token language model trained on native sequences computes surprisal where higher surprisal indicates phonotactic deviation. We add a transcript-guided Text2DUnit--DTW module that predicts native token sequences from reference text and aligns them to acoustic tokens to derive error-sensitive features. Surprisal and alignment features are fused via simple regression. On SpeechOcean762, PCC improves from 0.60 to 0.66 with transcript guidance, near supervised baselines. Cross-dataset evaluation on L2-ARCTIC shows consistent gains.
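
The surprisal signal can be illustrated with a toy token language model. The sketch below substitutes an add-alpha smoothed bigram model for whatever token LM the authors train, and uses small integer sequences in place of real K-means speech units; the function names are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sequences, vocab_size, alpha=1.0):
    """Fit an add-alpha smoothed bigram model on native token sequences."""
    pair_counts, context_counts = Counter(), Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    def prob(prev, cur):
        return (pair_counts[(prev, cur)] + alpha) / (
            context_counts[prev] + alpha * vocab_size
        )

    return prob

def mean_surprisal(seq, prob):
    """Average surprisal in bits per token transition."""
    bits = [-math.log2(prob(prev, cur)) for prev, cur in zip(seq, seq[1:])]
    return sum(bits) / len(bits)

native = [[0, 1, 2, 0, 1, 2]] * 5          # toy "native" unit sequences
prob = train_bigram(native, vocab_size=3)
print(mean_surprisal([0, 1, 2, 0], prob))  # follows native transitions
print(mean_surprisal([0, 2, 1, 0], prob))  # unseen transitions
```

The second sequence uses transitions the model never saw, so its mean surprisal is higher; in the framework described above, that score would be one of the features passed to the regression.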

2606.19734 2026-06-19 cs.LG New submission 60%

Federated Bilevel Performative Prediction

Liangxin Qian, Chang Liu, Xuanyu Cao, Jun Zhao, Kwok-Yan Lam

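Affiliations: Nanyang Technological University; Zhejiang University; Washington State University

Topic match (Other LLM): studies bilevel optimization in federated learning under distribution shift.

AI summary: Studies bilevel optimization in federated learning where client data distributions depend on the deployed decisions. It introduces the federated bilevel performatively stable point and two methods for computing it, with experiments validating the stability thresholds and showing improved meta-generalization.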

Comments Accepted by ICML 2026

Abstract

Federated bilevel optimization is widely used for nested learning problems across distributed clients, such as federated hyperparameter tuning and meta-learning under privacy and communication constraints. Most existing formulations assume fixed client data distributions, which can be violated by performativity, where deployed decisions reshape client behavior and data collection, inducing client-specific, decision-dependent distribution shift. We study federated bilevel performative prediction, where both upper-level (UL) and lower-level (LL) objectives are evaluated under client-dependent, decision-dependent distributions. We formalize the federated bilevel performatively stable (FBPS) point under a decoupled-risk perspective and provide sufficient conditions for its existence and uniqueness. We then develop two federated methods to compute the FBPS solution: FBi-RRM, which converges linearly under a contraction condition, and FBi-SGD, a communication-efficient stochastic method based on federated hypergradient estimation with convergence guarantees under diminishing step sizes when sensitivities are sufficiently small. Experiments on strategic regression and meta strategic classification validate the predicted stability thresholds and demonstrate improved meta-generalization over non-performative baselines, and CNN-based classification further demonstrates the practical effectiveness of the proposed methods in nonconvex neural network settings.

2606.19603 2026-06-19 cs.LG New submission 60%

Comparing Linear Probes with Mahalanobis Cosine Similarity

Zhuofan Josh Ying, Peter Hase, Nikolaus Kriegeskorte

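Affiliations: Columbia University; Stanford University; Schmidt Sciences

Topic match (Other LLM): studies how to compare linear probes; related to LLM interpretability.

AI summary: Proves a linear relationship between Mahalanobis cosine similarity and OOD AUROC, provides a theoretical explanation, and validates the measure as a metric for comparing linear probes.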

Comments 16 pages, 10 figures

Abstract

Linear probes are widely used in interpretability research and often compared by cosine similarity. The Mahalanobis cosine similarity (MCS) between two directions, which reweights the inner product by test data covariance, is a natural task-aware refinement. Ying et al. (2026) report that a probe's MCS to a reference probe trained on the out-of-distribution (OOD) data near-perfectly linearly predicts the probe's OOD AUROC (R^2 = 0.98). Here, we extend this empirical finding across models, layers, and concept domains, and prove this general phenomenon in closed form: For balanced classes whose projections are Gaussian, OOD AUROC and MCS to the reference probe are linear because both are sigmoid-shaped functions of the probe's signal-to-noise ratio (SNR) on the test data. The theory also predicts when this linearity fails, which we verify empirically. MCS offers a theoretically grounded and empirically effective alternative to Euclidean cosine similarity for comparing linear probes.
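
A covariance-reweighted cosine can be written directly from the description above. The sketch assumes the inner product is reweighted by the covariance matrix itself, so the score equals the correlation between the two probes' projections of the data; the paper's exact definition may differ (for example, by using an inverse covariance), and the function name is hypothetical.

```python
import numpy as np

def mahalanobis_cosine(u, v, cov):
    """Cosine similarity under the inner product <u, v> = u^T cov v.

    Assumption: `cov` is the covariance of the test data, so the result is
    the correlation between the projections u^T x and v^T x.
    """
    num = u @ cov @ v
    return float(num / np.sqrt((u @ cov @ u) * (v @ cov @ v)))

u = np.array([1.0, 1.0])
v = np.array([1.0, -1.0])
print(mahalanobis_cosine(u, v, np.eye(2)))            # ordinary cosine
print(mahalanobis_cosine(u, v, np.diag([1.0, 0.0])))  # data varies only along axis 0
```

The two probes are orthogonal in the Euclidean sense, yet when the data vary only along the first axis they produce identical projections, so the reweighted score treats them as the same probe. This is the sense in which the measure is task-aware.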

2606.19411 2026-06-19 cs.LG New submission 60%

Spectral DPPs via NEPv: A Scalable Continuous Relaxation of Determinantal MAP for Diversity-Aware Data Selection

Richard Yi Da Xu

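Affiliations: Hong Kong Baptist University; TadReamk Limited

Topic match (Other LLM): diversity-aware data selection, applicable to LLM data curation.

AI summary: Recasts the NP-hard DPP-MAP selection problem as continuous optimization over the Stiefel manifold and solves it in near-linear time via self-consistent field iteration on a nonlinear eigenvalue problem (NEPv), suitable for large-scale data selection.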

Abstract

Selecting a small, diverse, high-quality subset from a massive pool of candidates is a recurring primitive in modern machine learning -- data curation and coreset selection for training and fine-tuning large models, active-learning batch acquisition, prompt and exemplar selection for in-context learning, retrieval diversification, and experimental design. Determinantal Point Processes (DPPs) give a principled, well-calibrated notion of diversity for this task, but their MAP objective -- pick a size-$k$ subset $S$ maximizing $\log\det(L_S)$ -- is NP-hard, and the standard greedy and sampling algorithms scale superlinearly in the ground-set size $n$. This cost is prohibitive precisely in the data-centric regime where diversity matters most, where $n$ ranges over millions to billions of candidate examples, features, or embeddings. We recast DPP-MAP as a continuous optimization problem over the Stiefel manifold, and show that its first-order optimality conditions form a Nonlinear Eigenvalue Problem with eigenvector dependency (NEPv) of a previously unstudied form. This NEPv admits a self-consistent field (SCF) iteration with a spectral-gap-based local contraction guarantee, giving a principled iterative solver where the diversity objective drives an eigenvector-dependent operator. The resulting algorithm, OurMethod, requires only matrix-vector products with the kernel and runs in time $O\!\big((ndk+nk^2)\,t\big)$ for a small number of iterations $t$, scaling near-linearly in $n$ and integrating directly with low-rank and feature-map kernels common in ML. This paper focuses on the relaxation, solver, and scaling analysis; full real-data benchmarking is left to a planned empirical study.
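
For orientation, the MAP objective and the standard greedy baseline mentioned in the abstract can be sketched as follows. This is the naive greedy heuristic, not the paper's NEPv/SCF solver; the toy feature matrix and the helper names are assumptions for the example.

```python
import numpy as np

def log_det_subset(L, subset):
    """log det of the kernel submatrix L_S indexed by `subset`."""
    sign, logdet = np.linalg.slogdet(L[np.ix_(subset, subset)])
    return logdet if sign > 0 else -np.inf

def greedy_map(L, k):
    """Naive greedy baseline: repeatedly add the item that most increases
    log det(L_S). Each step rescans all candidates, which is the kind of
    superlinear cost the abstract refers to."""
    chosen = []
    for _ in range(k):
        candidates = [i for i in range(L.shape[0]) if i not in chosen]
        best = max(candidates, key=lambda i: log_det_subset(L, chosen + [i]))
        chosen.append(best)
    return chosen

# Toy data: items 0 and 1 are near-duplicates, item 2 points elsewhere.
X = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
L = X @ X.T
print(greedy_map(L, 2))
```

The determinant of `L_S` is the squared volume spanned by the selected feature vectors, so the near-duplicate pair scores poorly and the selection keeps one of the duplicates plus the orthogonal item.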

2606.19539 2026-06-19 astro-ph.SR cs.AI New submission 60%

Review of Machine Learning Models for Solar Energetic Particle Prediction

Spiridon Kasapis, Pouya Hosseinzadeh, Kathryn Whitman, Ricky Egeland, Manolis Georgoulis, Angelos Vourlidas, Athanasios Papaioannou, Eleni Lavasa, Anastasios Anastasiadis, Giorgos Giannopoulos, Andres Munoz-Jaramillo, Bala Poduval, Irina N. Kitiashvili, Alexander G. Kosovichev, Viacheslav Sadykov, Soukaina Filali Boubrahimi, Tate T. Hutchins, Hameedullah A. Farooki, Manuel E. Cuesta, Leng Y. Khoo, Sungmin Pak, Robert Czarnota, Jamie S. Rankin, Jamey Szalay, Mitchell M. Shen, Georgios Livadiotis, Zigong Xu, David J. McComas, Nikolaos Sarlis, Dionissios Hristopulos, Arik Posner, Alec J. Engell, Mohammed AbuBakr Ali, Ali G. A. Abdelkawy, Abdelrazek M. K. Shaltout, M. M. Beheary, Christina O. Lee, Sigiava Aminalragia-Giamini, Constantinos Papadimitriou, Ingmar Sandberg, Savvas Raptis, Shah Muhammad Hamdi, Monica Laurenza, Mirko Stumpo, Sumanth A. Rotti, India Jackson, Aatiya Ali, Atilim Gunes Baydin, Nathan Schwadron, Subhamoy Chatterjee, Maher A. Dayeh, Gelu M. Nita, Patrick M. O'Keefe, Chun Jie Chong, Paul Kosovich, Russell D. Marroquin, Berkay Aydin, Petrus C. Martens, Lulu Zhao, Yang Chen, Yian Yu, Monica G. Bobra, Ward Manchester, Tamas Gombosi, Ming Zhang, Jesse Torres, Philip K. Chan, Mohamed Nedal, Kamen Kozarev, Peijin Zhang, Kimberly Moreland, Hazel M. Bain, Samuel Hart, Michael J. Starkey, Alan G. Ling, Simone Benella

Affiliations: Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA; Computational Physics Branch, NASA Ames Research Center, Moffett Field, CA, USA; Department of Computer Science, Utah State University, Logan, UT, USA; Space Radiation Analysis Group, NASA Johnson Space Center, Houston, TX, USA; Johns Hopkins Applied Physics Lab, 11100 Johns Hopkins Rd, Laurel, MD 20723, United States; Research Center for Astronomy and Applied Mathematics of the Academy of Athens, 4 Soranou Efesiou Street, Athens 11527, Greece; Institute for Astronomy, Astrophysics, Space Applications; Southwest Research Institute, Boulder, CO, USA; Space Science Center, University of New Hampshire, Durham, NH, USA; Department of Physics, New Jersey Institute of Technology, Newark, NJ, USA; Astronomy Department, Georgia State University, Atlanta, GA, USA; Department of Computer Science, Princeton University, Princeton, NJ, USA; Department of Mathematics, Rowan University, Glassboro, NJ, USA; Astronomy, California Institute of Technology, Pasadena, CA, USA; Department of Physics, National and Kapodistrian University of Athens, Athens, Greece; School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece; Department of Astronomy and Meteorology, Faculty of Science, Al-Azhar University, Cairo, Egypt; Space Sciences Lab, University of California, Berkeley, CA, USA; Research Consultancy, Athens, Greece; Institute for Space Astrophysics; Department of Physics and Astronomy, Georgia State University, Atlanta, GA 30303, USA; Aryabhatta Research Institute of Observational Sciences (ARIES), Manora Peak, Nainital-263001, Uttarakhand, India; Department of Computer Science, Oxford University, Oxford, England; Southwest Research Institute, San Antonio, TX, USA; Computer Science Department, New Jersey Institute of Technology, Newark, NJ, USA; Department of Physics, University of California San Diego, La Jolla, CA 92093, USA; Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA; Department of Climate; Engineering, University of Michigan, Ann Arbor, MI, USA; Department of Statistics, University of Michigan, Ann Arbor, MI, USA; Department of Electrical Engineering and Computer Science, Florida Institute of Technology, Melbourne, FL, USA; Astrophysics Section, School of Cosmic Physics, Dublin Institute for Advanced Studies, DIAS Dunsink Observatory, Dublin D15 XR2R, Ireland; Institute of Astronomy of the Bulgarian Academy of Sciences, Sofia, Bulgaria; Center for Solar-Terrestrial Research, New Jersey Institute of Technology, Newark, NJ 07102, USA; Cooperative Programs for the Advancement of Earth System Science, University Corporation for Atmospheric Research, Boulder, CO, USA; CIRES, University of Colorado Boulder, Boulder, CO, USA; Space Weather Prediction Center, NOAA, Boulder, CO, USA; Astronomy, College of Science, The University of Texas at San Antonio, San Antonio, TX, USA; Space Weather Prediction Center, National Oceanic; The University of Texas at San Antonio, San Antonio, TX, USA; Environmental Research, Inc., MA, USA

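Topic match (Other LLM): a review of machine learning models; not core LLM work.

AI summary: Reviews machine learning models for solar energetic particle prediction, comparing datasets, architectures, inputs, and outputs, and offers recommendations for future research.

Comments Review Paper, Main text: 23 pages, References: 5 pages, Appendix: 42 pages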

Abstract

Solar energetic particle (SEP) events have attracted increasing attention due to their significant radiation hazards for aviation, spacecraft electronics, and human missions beyond Earth's magnetosphere. From a scientific perspective, SEP events are intriguing because they arise from a set of physical processes extending from the solar surface and corona through the heliosphere, offering insight into particle acceleration and transport mechanisms that are widely applicable across astrophysics. Therefore, advancing our ability to understand and predict SEP events is essential both for deepening our knowledge of such mechanisms and for safeguarding space technologies and exploration. Traditionally, researchers have modeled SEPs using physics-based simulations and empirical methods. More recently, machine learning (ML) has emerged as a new tool for understanding and predicting SEP events. The purpose of this manuscript is to review the currently available ML models for SEP prediction, identify the datasets used for training, compare their architectures, inputs, and outputs, and, based on these insights, outline good practices and recommendations for future research.

2606.16106 2026-06-19 cs.PF cs.AR cs.DC New submission 60%

Edge-Inference Governors Need Memory-Clock State

Beyond CPU-GPU Frequency: Memory-Clock and Tail Effects on Edge-Inference Latency Estimation

Jaehoon Kang

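Topic match (Other LLM): studies LLM latency estimation for edge inference.

AI summary: Measurements on an NVIDIA Jetson Orin Nano show that the memory clock is a missing dimension, that aggregate miss rates hide burstiness, and that frequency switching incurs delay. These phenomena fall outside the scope of conventional frequency-aware latency models.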

Comments 20 pages, 13 figures, 11 tables. Code and data: https://github.com/dankang21/jetson-latency-lab ; traces: https://doi.org/10.5281/zenodo.20745228

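AI abstract (translated from Chinese)

Frequency-aware latency estimators enable deadline-aware DVFS for edge ML inference by modeling latency over CPU and GPU frequencies. We conduct a measurement study on an NVIDIA Jetson Orin Nano and show three phenomena outside that modeling scope. (1) The memory clock is a missing dimension: across the realistic upper EMC range (2133->3199 MHz), it shifts median latency by +11% to +48% depending on the workload, and at the highest GPU clock we observe a repeatable non-monotonic case (-9%) for a synthetic L2-resident kernel. A GPU-frequency estimator profiled under one power profile and deployed under another therefore underestimates latency by up to 32%; tabulating the four lockable EMC points fixes most workloads, whereas a parametric 1/f_emc term does not. (2) Aggregate miss rates hide burstiness: at fixed clocks, 100k-cycle runs show a knife-edge distribution whose deadline-miss cliff spans about 1 ms, but misses cluster far beyond independence. At a 0.1% aggregate miss rate, the probability that the next cycle also misses is as high as 74% (740x the independent baseline). Gaussian mu+3sigma bounds exceed the 0.1% miss target by 13x to 29x, whereas out-of-sample generalized Pareto bounds stay within ~2x across all eight configurations. (3) Frequency switching is not free: per-domain transition stalls are below 100 microseconds, but a new operating point takes 1/5/8 ms (CPU/GPU/EMC) to take effect, which is a large fraction of a typical inference cycle for a governor that acts every inference cycle. We release the full measurement harness and discuss implications for next-generation frequency-aware estimators and governors.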

Abstract

Frequency-aware latency estimators let deadline-aware DVFS governors schedule edge ML inference by modeling latency over CPU and GPU clocks, but they cannot observe the memory clock (EMC) -- a missing deployment state that decides whether a governor meets its deadlines and at what energy. We show this with a deployed, measured governor on a Jetson Orin NX: an EMC-blind GPU-only fit misses 25-28% of cycles at tight deadlines, whereas an EMC-aware refit holds misses to at most 1.3% under a 2% QoS miss budget by selecting a budget-feasible clock -- the energy-minimal one for periodic vision (calibrated module-rail power). The failure generalizes across three workload classes -- MobileNetV2, a ViT transformer, and Qwen2.5 LLM token decode (where saturated decode makes the aware policy lower-energy than the infeasible blind choice): a CPUxGPU estimator sends the deployed governor to an infeasible operating point, and only an EMC-aware model identifies the feasible side of the energy frontier. The effect is real and outside the CPUxGPU state abstraction: across two Orin SKUs sharing the same lockable EMC points it shifts median latency by up to ~45%, replicates on both, and survives a fused TensorRT fp16 engine. CPUxGPU models do not absorb it: per-lockable-point EMC tables are needed, a scoped inversion shows monotone assumptions can pick the wrong direction, and clustered misses make aggregate QoS rates understate deployment risk. We release the harness; this complements, not rebuts, the state of the art within its CPUxGPU scope.

2605.05481 2026-06-19 cs.LG Version update 60%

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

Dillon Sandhu, Ronald Parr

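Topic match (Other LLM): proposes approximate next policy sampling; reinforcement learning, not core LLM content.

AI summary: Proposes Approximate Next Policy Sampling (ANPS), which addresses the "chicken-and-egg" problem in reinforcement learning by modifying the training distribution rather than constraining policy updates. Building on it, the Stable Value Approximate Policy Iteration (SV-API) algorithm performs larger target policy updates on Atari and continuous control tasks while matching or improving performance.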

Abstract

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to safely improve a policy, the value function must be accurate on the state-visitation distribution of the updated policy. That distribution over states is unknown and cannot be sampled for the purposes of training the value function. Conservative updates solve this problem, but at the cost of shrinking the policy update. This paper explores an alternative solution, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS is satisfied if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and efficacy of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop to hold the target policy fixed while an iteratively updated behavioral policy gathers relevant experience. It only commits to a new policy once a convergence criterion has been met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise, it remains no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while executing substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

2604.07328 2026-06-19 cs.LG updated version 60%

How to sketch a learning algorithm

Sam Gunn

Affiliations: UC Berkeley

Topic match (Other LLM): Proposes a data deletion scheme for deep learning models.

AI summary: Proposes a data deletion scheme that, under a stability assumption, locally sketches an arithmetic circuit by computing higher-order derivatives in random complex directions. It predicts deep learning model outputs with vanishing error $\varepsilon$ and failure probability $\delta$, with precomputation and prediction only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference.

Comments: Improved presentation and simplified Algorithm 4

Abstract

How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $\delta$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/\delta)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.

2604.06464 2026-06-19 cs.LG physics.app-ph stat.ML updated version 60%

Weighted Bayesian Conformal Prediction

Xiayin Lou, Peng Luo

Affiliations: Technical University of Munich; Massachusetts Institute of Technology

Topic match (Other LLM): A weighted Bayesian conformal prediction method.

AI summary: Proposes Weighted Bayesian Conformal Prediction (WBCP), which generalizes Bayesian conformal prediction to importance-weighted settings through a weighted Dirichlet distribution, proves that the effective sample size governs the posterior variance, and provides richer uncertainty information about conditional coverage.

Abstract

Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell & Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose Weighted Bayesian Conformal Prediction (WBCP), which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\mathrm{Dir}(1,\ldots,1)$ with a weighted Dirichlet $\mathrm{Dir}(n_{\mathrm{eff}} \cdot \tilde{w}_1, \ldots, n_{\mathrm{eff}} \cdot \tilde{w}_n)$, where $n_{\mathrm{eff}}$ is Kish's effective sample size. We prove four theoretical results: (1) $n_{\mathrm{eff}}$ is the unique concentration parameter matching frequentist and Bayesian variances; (2) posterior standard deviation decays as $O(1/\sqrt{n_{\mathrm{eff}}})$; (3) BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4) the HPD threshold provides $O(1/\sqrt{n_{\mathrm{eff}}})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as Geographical BQ-CP, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
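
As a rough illustration of the weighted Dirichlet named in the abstract, the sketch below computes Kish's effective sample size from importance weights and draws from the Dirichlet whose parameters are the effective sample size times the normalized weights. It is a minimal sketch, not the authors' implementation: the function names and the example weights are assumptions, and the step that turns Dirichlet draws into prediction thresholds is omitted because the abstract does not specify it.

```python
import random


def kish_effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_dirichlet_draw(weights, rng):
    """One draw from the Dirichlet with parameters n_eff * normalized weights.

    A Dirichlet draw is obtained by normalizing independent Gamma(alpha_i, 1)
    variables, so only the standard library is needed.
    """
    total = sum(weights)
    n_eff = kish_effective_sample_size(weights)
    alphas = [n_eff * w / total for w in weights]  # n_eff times normalized weight
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    norm = sum(gammas)
    return [g / norm for g in gammas]


# Hypothetical importance weights for five calibration points.
example_weights = [1.0, 1.0, 2.0, 0.5, 0.5]
rng = random.Random(0)
draws = [weighted_dirichlet_draw(example_weights, rng) for _ in range(2000)]
posterior_mean = [sum(d[i] for d in draws) / len(draws) for i in range(5)]
print(kish_effective_sample_size(example_weights))  # 25 / 6.5
print(posterior_mean)  # roughly the normalized weights
```

With equal weights the effective sample size equals the number of points, so every Dirichlet parameter becomes 1 and the uniform Dirichlet of the i.i.d. case is recovered.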

2603.10184 2026-06-19 stat.ML cs.LG updated version 60%

Stabilizing Bandits using Regularization: Precise Regret and A Quantitative Central Limit Theorem

Budhaditya Halder, Ishan Sengupta, Koustav Chowdhury, Samya Praharaj, Koulik Khamaru

Affiliations: Department of Statistics, Rutgers University; Indian Statistical Institute, Kolkata

Topic match (Other LLM): Studies the stability of bandit algorithms; weakly related to LLMs.

AI summary: Proposes a refined stability condition, shows that regularized stochastic-mirror-descent-style algorithms satisfy it, and derives a non-asymptotic Berry-Esseen bound for empirical reward estimates under adaptive sampling, matching upper and lower regret bounds, and asymptotic normality under adversarial corruption, while showing that regularization is a necessary price for valid inference.

Comments: Updated rate of convergence and precise regret in version 2

Abstract

Statistical inference with bandit data presents fundamental challenges owing to adaptive sampling, which violates the independence assumptions underlying classical asymptotic theory. Recent work has identified stability (Lai and Wei, 1982) as a sufficient condition for valid inference under adaptivity. This paper first provides a refined stability condition, stated in terms of the iterates of an online algorithm, and shows that a large class of regularized stochastic-mirror-descent-style algorithms satisfy it. This refined condition allows us to strengthen the asymptotic results of Lai and Wei (1982) in several ways. First, we derive a non-asymptotic Berry-Esseen bound for the empirical reward estimates under adaptive sampling. Second, we derive matching non-asymptotic upper and lower bounds on the regret of the proposed algorithm, yielding a precise characterization of its regret. Third, we show that these regularized algorithms preserve asymptotic normality and valid inference under a prescribed level of adversarial corruption. Finally, we show that regularization is necessary rather than incidental: Lai-Wei stability is incompatible with the optimal $O(\sqrt{T})$ regret rate (the rate attained by unregularized algorithms such as EXP3), so that a controlled, polylogarithmic inflation in regret is the price of valid inference.

2511.22283 2026-06-19 cs.LG updated version 60%

The Hidden Cost of Approximation in Online Mirror Descent

Ofir Schlisselberg, Uri Sherman, Tomer Koren, Yishay Mansour

Affiliations: Tel Aviv University; Google Research

Topic match (Other LLM): Studies the robustness of online mirror descent to approximation errors; related to optimization.

AI summary: Studies the robustness of online mirror descent (OMD) to approximation errors and finds that regularizer smoothness is closely tied to error tolerance: uniformly smooth regularizers admit a tight bound on excess regret, negative entropy over the simplex requires exponentially small errors, and log-barrier and Tsallis regularizers remain robust when errors are only polynomially small.

Abstract

Online mirror descent (OMD) is a fundamental algorithmic paradigm that underlies many algorithms in optimization, machine learning and sequential decision-making. The OMD iterates are defined as solutions to optimization subproblems which, oftentimes, can be solved only approximately, leading to an inexact version of the algorithm. Nonetheless, existing OMD analyses typically assume an idealized error-free setting, thereby limiting our understanding of performance guarantees that should be expected in practice. In this work we initiate a systematic study into inexact OMD, and uncover an intricate relation between regularizer smoothness and robustness to approximation errors. When the regularizer is uniformly smooth, we establish a tight bound on the excess regret due to errors. Then, for barrier regularizers over the simplex and its subsets, we identify a sharp separation: negative entropy requires exponentially small errors to avoid linear regret, whereas log-barrier and Tsallis regularizers remain robust even when the errors are only polynomial. Finally, we show that when the losses are stochastic and the domain is the simplex, negative entropy regains robustness, but this property does not extend to all subsets, where exponentially small errors are again necessary to avoid suboptimal regret.
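
For readers unfamiliar with the subproblem the abstract refers to, the sketch below shows the exact OMD step on the simplex with the negative-entropy regularizer, where the subproblem has the well-known closed form of a multiplicative-weights update. It is a minimal sketch under stated assumptions: the function names, step size, and loss sequence are hypothetical, and no error model is included because the abstract does not specify how the paper perturbs the iterates.

```python
import math


def omd_entropy_step(x, loss_vector, step_size):
    """Exact OMD step on the simplex with the negative-entropy regularizer.

    Solves argmin_y  step_size * <loss_vector, y> + KL(y || x) over the
    simplex, whose closed form is the multiplicative-weights update.
    """
    unnormalized = [xi * math.exp(-step_size * gi)
                    for xi, gi in zip(x, loss_vector)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run_omd(loss_vectors, step_size):
    """Run exact OMD from the uniform point; return (final iterate, regret)."""
    n = len(loss_vectors[0])
    x = [1.0 / n] * n
    learner_loss = 0.0
    cumulative = [0.0] * n
    for g in loss_vectors:
        learner_loss += sum(xi * gi for xi, gi in zip(x, g))
        cumulative = [c + gi for c, gi in zip(cumulative, g)]
        x = omd_entropy_step(x, g, step_size)
    # Regret is measured against the best fixed action in hindsight.
    return x, learner_loss - min(cumulative)


# Hypothetical loss sequence in which action 0 is always best.
losses = [[0.0, 1.0, 1.0]] * 50
final_x, regret = run_omd(losses, step_size=0.5)
print(final_x, regret)
```

An inexact variant would return a point near the output of `omd_entropy_step` instead of the exact minimizer; how near it must be to keep regret sublinear is the question the paper studies.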

2509.23806 2026-06-19 cs.SE cs.LG updated version 60%

Influence-Guided Concolic Testing of Transformer Robustness

Chih-Duo Hong, Chih-Cheng Yang, Yu Wang, Fang Yu

Affiliations: Department of Management Information Systems

Topic match (Other LLM): Tests Transformer robustness, but focuses mainly on software testing.

AI summary: Proposes a concolic testing method that ranks path predicates by SHAP influence, implements multi-head attention semantics in pure Python, and makes the softmax boundary explicit. On CIFAR-10 it achieves a 60% attack success rate across the evaluated models, compared with 15% for a differential evolution baseline, and predicate prioritization reduces median attack time by 51%.

Comments: Accepted at the 26th International Conference on Software Quality, Reliability, and Security

Abstract

Concolic testing for neural networks alternates concrete execution with constraint solving to search for inputs that flip model decisions. We present a concolic tester for Transformer classifiers that uses SHAP estimates to rank pending path predicates by their impact on the current prediction. To support self-attention with multiple heads in execution backed by SMT solving, we implement attention semantics in pure Python that are compatible with the solver and make the softmax boundary explicit by concretizing exponentiation arguments. We evaluate our method on CIFAR-10 across three compact Transformer classifiers, ResNet18, and VGG16 under a one-pixel budget and a 900s horizon. Across the 500 model-input pairs in this matched comparison, our method achieves 60% success, compared with 15% for a differential evolution baseline that treats the model as a black box. In the primary two-layer Transformer branch-ordering study, SHAP-based predicate prioritization raises success from 56% to 60% and reduces median attack time by 51%. These results show that influence-guided path exploration can make concolic testing a practical way to find adversarial examples in Transformer models.

2507.05169 2026-06-19 cs.LG cs.AI cs.CL cs.CV cs.RO updated version 60%

Critique of World Model

Eric Xing, Mingkai Deng, Jinyu Hou

Topic match (Other LLM): A survey of world model architectures involving generative prediction; related to LLMs.

AI summary: Starting from the notion of "hypothetical thinking" in psychology, argues that the primary goal of a world model is to simulate all actionable possibilities of the real world, and proposes a Generative Latent Prediction (GLP) architecture based on stateful, hierarchical, multi-level, and mixed continuous/discrete representations.

Abstract

World Model, the algorithmic simulator of the real-world environment which biological agents experience and act upon, has been an emerging topic in recent years due to the rising need to develop virtual agents with artificial (general) intelligence. There has been much discussion on what a world model really is, how to build it, how to use it, and how to evaluate it. In this essay, starting from the imagination in the famed Sci-Fi classic Dune, and drawing inspiration from the concept of "hypothetical thinking" in psychology literature, we argue the primary goal of a world model to be simulating all actionable possibilities of the real world for purposeful reasoning and acting. We examine the key design dimensions of world modeling: data, representation, architecture, learning objective, and usage, surveying existing approaches and analyzing their tradeoffs. Building on this examination, we propose a new Generative Latent Prediction (GLP) architecture for a general-purpose world model, based on stateful, hierarchical, multi-level, and mixed continuous/discrete representations, and a generative and self-supervised learning framework, with an outlook of a Physical, Agentic, and Nested (PAN) AGI system enabled by such a model.

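2502.03227 2026-06-19 cs.LG cs.CV updated version 60%

Adversarial Dependence Minimization

Pierre-François De Plaen, Tinne Tuytelaars, Marc Proesmans, Luc Van Gool

Affiliations: CVL, ETH Zürich, Switzerland; INSAIT, Sofia University, Bulgaria

Topic match (Other LLM): The algorithm can be applied to self-supervised learning to prevent dimensional collapse.

AI summary: Proposes the ADM algorithm, which minimizes statistical dependence between feature dimensions through an adversarial game, proves that the feature dimensions are mutually independent at the global optimum, and applies the method to nonlinear decorrelation, improved generalization in image classification, and prevention of dimensional collapse in self-supervised learning.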

2306.12679 2026-06-19 cs.CL 60%

Constructing Colloquial Dataset for Persian Sentiment Analysis of Social Microblogs

Mojtaba Mazoochi, Leila Rabiei, Farzaneh Rahmani, Zeinab Rajabi

Affiliations: Faculty member in ICT Research Institute; Iran Telecommunication Research Center (ITRC); Faculty member in Computer Department; Mehralborz University; Hazrat-e Masoumeh University

Topic match (Other LLM): Builds a Persian sentiment analysis dataset and uses a CNN model.

AI summary: Builds a colloquial Persian dataset and proposes a CNN-based model to improve sentiment analysis of colloquial text in social microblogs, reporting 72% accuracy.

Journal ref: Multimedia Tools and Applications, 2025

DOI: 10.1007/s11042-025-20777-3

Abstract

Introduction: Microblogging websites have amassed rich data sources for sentiment analysis and opinion mining. In this regard, sentiment classification has frequently proven inefficient because microblog posts typically lack syntactically consistent terms and representatives, since users on these social networks do not like to write lengthy statements. Also, there are some limitations to low-resource languages. The Persian language has exceptional characteristics and demands unique annotated data and models for the sentiment analysis task, which are distinct from text features in English. Method: This paper first constructs a user opinion dataset called ITRC-Opinion in a collaborative environment and in an insource way. Our dataset contains 60,000 informal and colloquial Persian texts from social microblogs such as Twitter and Instagram. Second, this study proposes a new architecture based on the convolutional neural network (CNN) model for more effective sentiment analysis of colloquial text in social microblog posts. The constructed dataset is used to evaluate the presented architecture. Furthermore, some models, such as LSTM, CNN-RNN, BiLSTM, and BiGRU with different word embeddings, including Fasttext, Glove, and Word2vec, were evaluated on our dataset. Results: The results demonstrate the benefit of our dataset and the proposed model (72% accuracy), displaying meaningful improvement in sentiment classification performance.

2606.19366 2026-06-19 cs.LG cs.AI eess.SP new submission 55%

Information Lattice Learning as Probabilistic Graphical Model Structure Learning

Haizi Yu, Lav R. Varshney

Affiliations: Kocree, Inc.; AI Innovation Institute, Stony Brook University

Topic match (Other LLM): Information lattice learning is related to probabilistic graphical models rather than to LLMs.

AI summary: Interprets information lattice learning (ILL) as structure learning for probabilistic graphical models, in which interpretable rules are learned by projecting onto a partition lattice, and establishes connections to maximum entropy and factor graphs.

Abstract

Information lattice learning (ILL) learns interpretable rules of a signal by alternately projecting the signal onto a partition lattice that encodes a hierarchy of abstractions and lifting selected rules back to the signal domain. When the signal is a probability mass function, we show the probabilistic rules learned by ILL admit a natural probabilistic graphical model (PGM) interpretation and develop this interpretation in detail. A partition in ILL induces a deterministic quotient variable, and a rule is the marginal law of that quotient variable. A rule set is therefore a collection of marginal constraints over interpretable abstractions. General lifting is the feasible family of all joint distributions satisfying those constraints, while special lifting chooses a maximum-ignorance reconstruction, implemented in ILL by an L2 uniformity principle closely related to maximum entropy. Under a Shannon-entropy lifting, the same constraints yield a log-linear factor graph whose factors are indexed by learned abstractions. The information lattice itself, however, is not a Bayesian network: its edges encode refinement and coarsening of abstractions, not conditional dependence. Thus ILL is best viewed as structure learning for interpretable constraint-based factor graphs over quotient variables. This view clarifies how ILL relates to graphical models and maximum entropy models, while suggesting new directions for inference, identifiability, and hybrid symbolic-probabilistic learning.
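
The two operations the abstract describes, projecting a probability mass function onto a partition and lifting a rule back to the signal domain, can be illustrated on a toy example. This is a minimal sketch under stated assumptions, not the ILL algorithm itself: the function names, the parity partition, and the example distribution are hypothetical, and the lifting handles only a single rule, for which spreading each cell's mass uniformly over its members is the least-committal reconstruction.

```python
from collections import defaultdict
from fractions import Fraction


def rule_of_partition(pmf, cell_of):
    """Marginal law of the quotient variable induced by a partition.

    pmf maps outcomes to probabilities; cell_of maps each outcome to the
    label of its partition cell.
    """
    rule = defaultdict(Fraction)
    for outcome, prob in pmf.items():
        rule[cell_of(outcome)] += prob
    return dict(rule)


def uniform_lifting(rule, outcomes, cell_of):
    """Lift one rule back to the signal domain by spreading each cell's
    mass uniformly over the outcomes in that cell."""
    outcomes = list(outcomes)
    cell_sizes = defaultdict(int)
    for outcome in outcomes:
        cell_sizes[cell_of(outcome)] += 1
    return {o: rule[cell_of(o)] / cell_sizes[cell_of(o)] for o in outcomes}


def parity(n):
    """Hypothetical abstraction: partition integers into even and odd."""
    return "even" if n % 2 == 0 else "odd"


# Hypothetical signal: a pmf over the integers 0..5.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 8), 2: Fraction(1, 8),
       3: Fraction(1, 8), 4: Fraction(1, 16), 5: Fraction(1, 16)}

rule = rule_of_partition(pmf, parity)
lifted = uniform_lifting(rule, pmf.keys(), parity)
print(rule)
print(lifted)
```

The lifted distribution differs from the original signal but reproduces the same rule when projected again, which is the sense in which a rule acts as a marginal constraint on the joint distribution.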

2602.05533 2026-06-19 cs.AI updated version 55%

Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach

Zhengyi Guo, Wenpin Tang, Renyuan Xu

Affiliations: Department of Industrial Engineering and Operations Research, Columbia University; Department of Management Science and Engineering, Stanford University

Topic match (Other LLM): Conditional generation with diffusion models; weakly related to LLMs.

AI summary: Proposes a conditional diffusion guidance framework based on Doob's h-transform and martingale representation, estimates the conditioning function and its gradient with a martingale loss and a martingale-covariation loss, ensures that hard constraints are satisfied, and gives non-asymptotic guarantees.

Abstract

We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. Such constraints arise naturally in safety-critical applications and in rare-event simulation, where soft or reward-based guidance methods offer no guarantee of constraint satisfaction. Building on a probabilistic interpretation of diffusion models, we develop a principled conditional diffusion guidance framework based on Doob's h-transform, martingale representation and quadratic variation process. Specifically, the resulting guided dynamics augment a pretrained diffusion with an explicit drift correction involving the logarithmic gradient of a conditioning function, without modifying the pretrained score network. Leveraging martingale and quadratic-variation identities, we propose two novel off-policy learning algorithms based on a martingale loss and a martingale-covariation loss to estimate h and its gradient using only trajectories from the pretrained model. We provide non-asymptotic guarantees for the resulting conditional sampler in both total variation and Wasserstein distances, explicitly characterizing the impact of score approximation and guidance estimation errors. Numerical experiments demonstrate the effectiveness of the proposed methods in enforcing hard constraints and generating rare-event samples. The code of the numerical experiments can be found at https://github.com/ZhengyiGuo2002/CDG_Finance.
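
The drift correction by the logarithmic gradient of a conditioning function can be seen in the textbook case where that function is known in closed form. The sketch below conditions a standard Brownian motion to end at a fixed point, which yields a Brownian bridge. It is a minimal sketch under stated assumptions: plain Brownian motion stands in for the pretrained diffusion, the function names and parameters are hypothetical, and the paper's learned estimate of h is replaced by the exact Gaussian formula.

```python
import math
import random


def guided_drift(t, x, target, horizon):
    """Gradient of log h for Brownian motion conditioned on X_T = target.

    Here h(t, x) is the Gaussian transition density from (t, x) to
    (horizon, target), so d/dx log h(t, x) = (target - x) / (horizon - t).
    """
    return (target - x) / (horizon - t)


def sample_conditioned_path(target, horizon=1.0, num_steps=1000, seed=0):
    """Euler-Maruyama for dX = grad log h dt + dW, started at X_0 = 0."""
    rng = random.Random(seed)
    dt = horizon / num_steps
    x = 0.0
    path = [x]
    for k in range(num_steps):
        t = k * dt  # always strictly below the horizon, so no division by zero
        noise = rng.gauss(0.0, math.sqrt(dt))
        x = x + guided_drift(t, x, target, horizon) * dt + noise
        path.append(x)
    return path


path = sample_conditioned_path(target=2.0)
print(path[-1])  # close to the target, up to discretization noise
```

The unguided process would end at a normally distributed point, whereas the guided one lands on the target up to an error of the order of the square root of the step size.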
