语言大模型 / LLM

2604.00626 2026-06-19 cs.LG cs.CL 版本更新 90%

A Survey of On-Policy Distillation for Large Language Models

大型语言模型的在线策略蒸馏综述

Mingyang Song, Mao Zheng

发表机构 * Tencent, China（腾讯，中国）

专题命中后训练：综述在线策略蒸馏方法，涉及LLM后训练

AI总结本文综述了大型语言模型的在线策略蒸馏方法，探讨了蒸馏过程中如何通过反馈减少累积误差，提出了基于f-散度最小化的蒸馏框架，并分析了蒸馏与强化学习之间的联系。

Comments Ongoing Work

详情

AI中文摘要

随着大型语言模型（LLMs）在能力和成本上的持续增长，将前沿能力转移到更小、可部署的学生模型已成为核心工程问题，知识蒸馏仍然是这一转移的主导技术。工业流水线中普遍采用的静态模仿教师生成文本的方法存在结构性缺陷，随着任务变得更长且需要更多推理，这种缺陷变得更加严重。因为学生是在完美教师前缀上训练的，但在推理时必须生成自己的文本，小错误往往会积累成学生很少被训练来恢复的轨迹，导致的暴露偏差已被证明与序列长度的平方成比例。在线策略蒸馏（OPD）围绕这一观察重新组织训练循环，通过让教师对学生实际生成的内容提供反馈，以减少累积项趋于线性，并将蒸馏重新定义为迭代修正过程，而不是单次模仿。由此产生的文献在分歧设计、奖励引导优化和自我对抗方面有所扩展，但贡献仍然分散在知识蒸馏、RLHF和模仿学习社区中，缺乏统一的处理。本文提供了这样的处理。我们正式将OPD定义为学生采样轨迹上的f-散度最小化，将该领域沿三个设计轴（优化什么、信号来源在哪里、以及如何在实践中稳定训练）组织起来，并整合成功条件、反复失败模式以及OPD与KL约束强化学习之间的联系。最后，我们提出了由此综合而产生的开放性问题，包括蒸馏扩展定律、不确定反馈、代理蒸馏以及知识蒸馏与强化学习之间的日益增长的重叠。

英文摘要

As Large Language Models continue to grow in both capability and cost, transferring frontier capabilities into smaller, deployable students has become an important engineering problem, and knowledge distillation remains a common technique for this transfer. The prevailing recipe in industrial pipelines, static imitation of teacher-generated text, carries a structural weakness that grows more severe as tasks become longer and more reasoning-intensive. Because the student is trained on flawless teacher prefixes but generates its own at inference, small errors tend to accumulate into trajectories it has rarely been trained to recover from, and the resulting exposure bias has been shown to scale roughly with the square of sequence length. On-Policy Distillation reorganizes the training loop around this observation by having the teacher provide feedback on what the student actually produces, with the goal of reducing the compounding term toward linear and reframing distillation as an iterative correction process rather than single-pass imitation. The resulting literature has expanded along divergence design, reward-guided optimization, and self-play, yet contributions remain scattered across the knowledge distillation, RLHF, and imitation learning communities without a unified treatment. This survey provides such a treatment. We formalize OPD as f-divergence minimization over student-sampled trajectories, organize the field along three design axes (what to optimize, where the signal comes from, and how to stabilize training in practice), and consolidate success conditions, recurring failure modes, and the connection between OPD and KL-constrained reinforcement learning. We close with open problems that emerge from this synthesis, including distillation scaling laws, uncertainty-aware feedback, agent-level distillation, and the growing overlap between knowledge distillation and RL.

URL PDF HTML ☆

赞 0 踩 0

2602.22495 2026-06-19 cs.LG cs.AI 版本更新 90%

Reinforcement-aware Knowledge Distillation for LLM Reasoning

面向LLM推理的强化学习感知知识蒸馏

Zhaoyang Zhang, Shuli Jiang, Yantao Shen, Yuting Zhang, Dhananjay Ram, Shuo Yang, Zhuowen Tu, Wei Xia, Stefano Soatto

发表机构 * Meta ； Guo et al. ； Lin et al. ； Xu et al. ； Shao et al. ； Schulman et al. ； Xie et al.

专题命中后训练：强化学习感知知识蒸馏用于LLM推理

AI总结提出RL感知蒸馏（RLAD），通过信任区域比率蒸馏（TRRD）在强化学习后训练中实现选择性模仿，解决分布不匹配和目标干扰问题，在逻辑推理和数学基准上优于现有方法。

详情

AI中文摘要

强化学习（RL）后训练最近推动了长链思维推理大语言模型（LLM）的重大进展，但这类模型的高推理成本促使将其蒸馏到更小的学生模型中。大多数现有的知识蒸馏（KD）方法是为监督微调（SFT）设计的，依赖于固定的教师轨迹或基于教师-学生KL散度的正则化。当与RL结合时，这些方法常常遭受分布不匹配和目标干扰：教师监督可能与学生不断变化的rollout分布不一致，并且KL正则化项可能与奖励最大化竞争，需要仔细的损失平衡。为了解决这些问题，我们提出了RL感知蒸馏（RLAD），它在RL期间执行选择性模仿——仅在改进当前策略更新时引导学生向教师学习。我们的核心组件，信任区域比率蒸馏（TRRD），用基于PPO/GRPO风格似然比的目标替代教师-学生KL正则化项，该目标锚定到教师-旧策略混合，从而在学生rollout上产生优势感知、信任区域约束的蒸馏，并自然平衡探索、利用和模仿。在多种逻辑推理和数学基准上，RLAD始终优于离线蒸馏、标准GRPO和基于KL的在策略教师-学生知识蒸馏。

英文摘要

Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference cost of such models motivates distillation into smaller students. Most existing knowledge distillation (KD) methods are designed for supervised fine-tuning (SFT), relying on fixed teacher traces or teacher-student Kullback-Leibler (KL) divergence-based regularization. When combined with RL, these approaches often suffer from distribution mismatch and objective interference: teacher supervision may not align with the student's evolving rollout distribution, and the KL regularizer can compete with reward maximization and require careful loss balancing. To address these issues, we propose RL-aware distillation (RLAD), which performs selective imitation during RL -- guiding the student toward the teacher only when it improves the current policy update. Our core component, Trust Region Ratio Distillation (TRRD), replaces the teacher-student KL regularizer with a PPO/GRPO-style likelihood-ratio objective anchored to a teacher--old-policy mixture, yielding advantage-aware, trust-region-bounded distillation on student rollouts and naturally balancing exploration, exploitation, and imitation. Across diverse logic reasoning and math benchmarks, RLAD consistently outperforms offline distillation, standard GRPO, and KL-based on-policy teacher-student knowledge distillation.

URL PDF HTML ☆

赞 0 踩 0

2509.25148 2026-06-19 cs.AI 版本更新 90%

AAPA: Adversarially Anchored Preference Alignment for Post-Training of Large Language Models

AAPA：用于大型语言模型后训练的对抗锚定偏好对齐

Faqiang Qian, Kang An, Weikun Zhang, Ziliang Wang, Xuhui Zheng, Liangjian Wen, Yong Dai, Mengya Gao, Yichao Wu

发表机构 * Southwest University of Finance and Economics（西南财经大学）

专题命中后训练：提出对抗锚定偏好对齐框架，增强后训练目标

AI总结提出AAPA框架，通过固定轻量判别器对策略输出与专家响应进行句子级对抗锚定，增强SFT、GRPO等后训练目标，在指令遵循基准上持续提升性能。

详情

AI中文摘要

大型语言模型的后训练对齐通常结合了专家演示上的监督微调（SFT）和来自偏好或可验证反馈的强化学习（RL）。SFT提供了有用的行为锚点，但可能过拟合静态演示，而RL鼓励探索但可能偏离专家行为或利用不完美的奖励。我们提出\textbf{AAPA}（\emph{对抗锚定偏好对齐}），这是一个插件式框架，通过句子级对抗锚定信号增强现有的后训练目标。AAPA使用固定的轻量判别器将策略生成结果与离线预收集的专家响应进行比较，因此在策略优化期间既不需要在线教师推理，也不需要判别器协同训练。相同的锚定项可以添加到SFT、GRPO和CHORD中，同时保留其原始训练流程。在指令遵循基准上的实验表明，AAPA在不同模型规模上一致地改善了相应的基础目标。特别是，分阶段的AAPA配置在\texttt{Qwen3-0.6B}上比强GRPO基线提高了5.77%，在\texttt{Qwen3-4B}上提高了3.75%。对响应长度、对数概率分布和判别器变体的进一步分析表明，对抗锚定为偏好优化提供了稳定的语义基础信号。代码可在\url{this https URL}获取。

英文摘要

Post-training alignment of large language models often combines supervised fine-tuning (SFT) on expert demonstrations with reinforcement learning (RL) from preference or verifiable feedback. SFT provides a useful behavioral anchor but can overfit to static demonstrations, whereas RL encourages exploration but may drift from expert behavior or exploit imperfect rewards. We propose \textbf{AAPA} (\emph{Adversarially Anchored Preference Alignment}), a plug-in framework that augments existing post-training objectives with a sentence-level adversarial anchoring signal. AAPA compares policy rollouts with offline, pre-collected expert responses using a fixed lightweight discriminator, and therefore requires neither online teacher inference nor discriminator co-training during policy optimization. The same anchoring term can be added to SFT, GRPO, and CHORD while preserving their original training pipelines. Experiments on instruction-following benchmarks show that AAPA consistently improves the corresponding base objectives across model scales. In particular, the staged AAPA configuration improves over a strong GRPO baseline by 5.77\% on \texttt{Qwen3-0.6B} and 3.75\% on \texttt{Qwen3-4B}. Further analyses on response length, log-probability distributions, and discriminator variants suggest that adversarial anchoring provides a stable semantic grounding signal for preference optimization. Code is available at \url{https://github.com/IsFaqq/AAPA}.

URL PDF HTML ☆

赞 0 踩 0

2602.09689 2026-06-19 cs.LG 版本更新 80%

Model soups need only one ingredient

模型汤只需一种成分

Alireza Abdollahpoorrostam, Nikolaos Dimitriadis, Adam Hazimeh, Pascal Frossard

发表机构 * EPFL（瑞士联邦理工学院）； EPFL LTS4（瑞士联邦理工学院 LTS4）

专题命中后训练：MonoSoup方法通过SVD实现单检查点模型汤

AI总结提出MonoSoup方法，利用SVD分解单检查点的层更新，通过熵有效秩自动重加权成分，实现强分布内-分布外平衡，无需多检查点。

详情

AI中文摘要

在目标分布上微调大型预训练模型通常会提高分布内（ID）准确性，但代价是分布外（OOD）鲁棒性下降，因为表示会专门适应微调数据。权重空间集成方法，如模型汤（Model Soups），通过平均多个检查点来缓解这一影响，但它们在计算上代价高昂，需要训练和存储数十个微调模型。在本文中，我们介绍了MonoSoup，一种简单、无数据、无超参数的事后方法，仅使用单个检查点即可实现强大的ID-OOD平衡。我们的方法对每一层的更新应用奇异值分解（SVD），将其分解为捕捉任务特定适应的高能量方向和引入噪声但可能仍编码对鲁棒性有用的残余信号的低能量方向。然后，MonoSoup使用基于熵的有效秩自动重新加权这些分量，并考虑模型的谱和几何结构的逐层系数。在ImageNet上微调并在自然分布偏移下评估的CLIP模型，以及在数学推理和多选题基准上测试的Qwen语言模型上的实验表明，这种即插即用方法是多检查点方法的实用且有效的替代方案，保留了其大部分好处而无需计算开销。

英文摘要

Fine-tuning large pre-trained models on a target distribution often improves in-distribution (ID) accuracy, but at the cost of out-of-distribution (OOD) robustness as representations specialize to the fine-tuning data. Weight-space ensembling methods, such as Model Soups, mitigate this effect by averaging multiple checkpoints, but they are computationally prohibitive, requiring the training and storage of dozens of fine-tuned models. In this paper, we introduce MonoSoup, a simple, data-free, hyperparameter-free, post-hoc method that achieves a strong ID-OOD balance using only a single checkpoint. Our method applies Singular Value Decomposition (SVD) to each layer's update and decomposes it into high-energy directions that capture task-specific adaptation and low-energy directions that introduce noise but may still encode residual signals useful for robustness. MonoSoup then uses entropy-based effective rank to automatically re-weigh these components with layer-wise coefficients that account for the spectral and geometric structure of the model. Experiments on CLIP models fine-tuned on ImageNet and evaluated under natural distribution shifts, as well as on Qwen language models tested on mathematical reasoning and multiple-choice benchmarks, show that this plug-and-play approach is a practical and effective alternative to multi-checkpoint methods, retaining much of their benefits without their computational overhead.

URL PDF HTML ☆

赞 0 踩 0

2602.14696 2026-06-19 cs.LG 版本更新 90%

A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)

对目标指令选择的批判性审视：厘清什么重要（以及什么不重要）

Nihal V. Nayak, Paula Rodriguez-Diaz, Neha Hulkund, Sara Beery, David Alvarez-Melis

发表机构 * Harvard University（哈佛大学）； MIT（麻省理工学院）； Kempner Institute（凯门研究所）

专题命中指令微调：系统分析指令微调中目标指令选择的核心要素

AI总结本文系统解构指令微调中目标指令选择的两大核心要素——数据表示与选择算法，发现基于梯度的表示结合贪心轮询选择在低预算下表现最佳，但收益随预算增加而减弱，并统一了多种算法为近似距离最小化。

Comments ICML 2026

详情

AI中文摘要

大型语言模型（LLM）的指令微调通常涉及从大型候选池中选择一个指令训练子集，使用来自目标任务的小型查询集。尽管兴趣日益增长，关于目标指令选择的文献仍然支离破碎且不透明：方法在选择预算上差异很大，经常省略零样本基线，并且常常混淆关键组件的贡献。因此，实践者缺乏针对其目标任务选择指令的可操作指导。在这项工作中，我们旨在通过解构和系统分析两个核心要素：数据表示和选择算法，为这一领域带来清晰度。我们的框架支持跨模型、任务和预算的受控比较。我们发现，只有基于梯度的数据表示选择的子集，其与查询的相似性能够一致地预测跨数据集、模型和候选池的性能。虽然没有单一方法占主导地位，但基于梯度的表示与贪心轮询选择相结合，在低预算下平均表现最佳，但这些收益在较大预算下会减弱。最后，我们将几种现有的选择算法统一为所选子集与查询集之间近似距离最小化的形式，并用新的泛化界限支持这一观点。更广泛地说，我们的发现为LLM微调中更原则性的数据选择提供了关键见解和基础。代码可在该 https URL 获取。

英文摘要

Instruction fine-tuning of large language models (LLMs) often involves selecting a subset of instruction training data from a large candidate pool, using a small query set from the target task. Despite growing interest, the literature on targeted instruction selection remains fragmented and opaque: methods vary widely in selection budgets, often omit zero-shot baselines, and frequently entangle the contributions of key components. As a result, practitioners lack actionable guidance on selecting instructions for their target tasks. In this work, we aim to bring clarity to this landscape by disentangling and systematically analyzing the two core ingredients: data representation and selection algorithms. Our framework enables controlled comparisons across models, tasks, and budgets. We find that only gradient-based data representations choose subsets whose similarity to the query consistently predicts performance across datasets, models, and candidate pools. While no single method dominates, gradient-based representations paired with greedy round-robin selection often perform best on average at low budgets, but these gains diminish at larger budgets. Finally, we unify several existing selection algorithms as forms of approximate distance minimization between the selected subset and the query set, and support this view with new generalization bounds. More broadly, our findings provide critical insights and a foundation for more principled data selection in LLM fine-tuning. The code is available at https://github.com/dcml-lab/targeted-instruction-selection.

URL PDF HTML ☆

赞 0 踩 0

2602.04306 2026-06-19 cs.CL cs.AI 版本更新 85%

DeFrame: Debiasing Large Language Models Against Framing Effects

DeFrame: 消除大语言模型中的框架效应偏差

Kahee Lim, Soyeon Kim, Steven Euijong Whang

发表机构 * KAIST（韩国科学技术院）

专题命中指令微调：提出框架感知去偏方法，增强LLM跨框架一致性

AI总结针对大语言模型在语义等价但不同表述的提示下产生不一致偏见的问题，提出框架感知的去偏方法，通过量化框架差异并增强跨框架一致性，有效降低整体偏见并提升鲁棒性。

Comments Accepted to Findings of ACL 2026

详情

AI中文摘要

随着大语言模型（LLMs）在现实应用中的日益部署，确保其在不同人口群体中的公平响应变得至关重要。尽管做出了许多努力，但一个持续的挑战是隐藏的偏见：LLMs 在标准评估下表现公平，但在这些评估设置之外可能产生有偏见的响应。在本文中，我们识别出框架——语义等价的提示在表达方式上的差异（例如，“A 比 B 好” vs. “B 比 A 差”）——作为导致这一差距的一个未被充分探索的因素。我们首先引入“框架差异”的概念来量化框架对公平性评估的影响。通过用替代框架扩充公平性评估基准，我们发现（1）公平性得分随框架变化显著，以及（2）现有的去偏方法改善了整体（即框架平均）公平性，但往往未能减少框架引起的差异。为了解决这个问题，我们提出了一种框架感知的去偏方法，鼓励 LLMs 在不同框架之间更加一致。实验表明，我们的方法减少了整体偏见，并提高了对框架差异的鲁棒性，使 LLMs 能够产生更公平和更一致的响应。

英文摘要

As large language models (LLMs) are increasingly deployed in real-world applications, ensuring their fair responses across demographics has become crucial. Despite many efforts, an ongoing challenge is hidden bias: LLMs appear fair under standard evaluations, but can produce biased responses outside those evaluation settings. In this paper, we identify framing -- differences in how semantically equivalent prompts are expressed (e.g., "A is better than B" vs. "B is worse than A") -- as an underexplored contributor to this gap. We first introduce the concept of "framing disparity" to quantify the impact of framing on fairness evaluation. By augmenting fairness evaluation benchmarks with alternative framings, we find that (1) fairness scores vary significantly with framing and (2) existing debiasing methods improve overall (i.e., frame-averaged) fairness, but often fail to reduce framing-induced disparities. To address this, we propose a framing-aware debiasing method that encourages LLMs to be more consistent across framings. Experiments demonstrate that our approach reduces overall bias and improves robustness against framing disparities, enabling LLMs to produce fairer and more consistent responses.

URL PDF HTML ☆

赞 0 踩 0

2605.16865 2026-06-19 cs.CL 版本更新 80%

MixSD: Mixed Contextual Self-Distillation for Knowledge Injection

MixSD: 混合上下文自蒸馏用于知识注入

Jiarui Liu, Lechen Zhang, Yongjin Yang, Yinghui He, Yingheng Wang, Weihao Xuan, Zhijing Jin, Mona Diab

发表机构 * Carnegie Mellon University（卡内基梅隆大学）； Jinesis Lab, University of Toronto & Vector Institute（Jinesis实验室，多伦多大学及向量研究所）； University of Illinois Urbana-Champaign（伊利诺伊大学厄巴纳-香槟分校）； Princeton University（普林斯顿大学）； Cornell University（康奈尔大学）； The University of Tokyo（东京大学）； RIKEN AIP（日本理化学研究所AIP）； Max Planck Institute for Intelligent Systems, Tübingen, Germany（德国图宾根最大计划智能系统研究所）； EuroSafeAI

专题命中指令微调：混合上下文自蒸馏用于知识注入

AI总结本文提出MixSD方法，通过混合模型自身条件下的token来实现与模型生成分布对齐的知识注入，从而在保持预训练能力的同时提升事实记忆和推理能力。

详情

AI中文摘要

监督微调（SFT）被广泛用于将新知识注入语言模型，但通常会损害预训练能力，如推理和通用领域性能。我们认为这种遗忘是由于微调目标与模型的自回归分布不一致，迫使优化器模仿低概率token序列。为了解决这个问题，我们提出了MixSD，一种无需外部教师的简单方法，用于对齐分布的知识注入。与固定目标训练不同，MixSD通过混合基础模型自身两个条件下的token动态构建监督。所生成的监督序列保留了事实学习信号，同时更接近基础模型的分布。我们在两个合成语料库上评估了MixSD，研究事实回忆和算术功能学习，并结合已建立的开放领域事实问答和知识编辑基准。在多种模型规模和设置下，MixSD在记忆-保留权衡上优于SFT和在线自蒸馏基线，能够保留基础模型的100% held-out能力，同时保持接近完美的训练准确率，而标准SFT只能保留1%。我们进一步表明，MixSD在基础模型下生成的监督目标具有显著更低的NLL，并减少了有害的Fisher敏感参数方向运动。这些结果表明，将监督与模型的本征生成分布对齐是简单且有效的知识注入原则，可以缓解灾难性遗忘。

英文摘要

Supervised fine-tuning (SFT) is widely used to inject new knowledge into language models, but it often degrades pretrained capabilities such as reasoning and general-domain performance. We argue this forgetting arises because fine-tuning targets from humans or external systems diverge from the model's autoregressive distribution, forcing the optimizer to imitate low-probability token sequences. To address this problem, we propose MixSD, a simple external-teacher-free method for distribution-aligned knowledge injection. Instead of training on fixed targets, MixSD constructs supervision dynamically by mixing tokens from two conditionals of the base model itself: an expert conditional that observes the injected fact in context, and a naive conditional that reflects the model's original prior. The resulting supervision sequences preserve the factual learning signal while remaining substantially closer to the base model's distribution. We evaluate MixSD on two synthetic corpora that we construct to study factual recall and arithmetic function acquisition in a controlled setting, together with established benchmarks for open-domain factual question answering and knowledge editing. Across multiple model scales and settings, MixSD consistently achieves a better memorization-retention trade-off compared to SFT and on-policy self distillation baselines, retaining up to 100% of the base model's held-out capability while maintaining near-perfect training accuracy, whereas standard SFT retains as little as 1%. We further show that MixSD produces substantially lower-NLL supervision targets under the base model and reduces harmful movement along Fisher-sensitive parameter directions. These results suggest that aligning supervision with the model's native generation distribution is a simple and effective principle for knowledge injection that mitigates catastrophic forgetting.

URL PDF HTML ☆

赞 0 踩 0

2605.31393 2026-06-19 cs.CL cs.AI 版本更新 70%

Target-Side Paraphrase Augmentation for Sign Language Translation with Large Language Models

面向手语翻译的大语言模型目标端释义增强

Pedro Dal Bianco, Jean Paul Nunes Reinhold, Oscar Stanchi, Facundo Quiroga, Franco Ronchetti, Ulisses Brisolara Corrêa

发表机构 * III-LIDI Universidad Nacional de La Plata（III-LIDI国立拉普拉塔大学）； CDTEC, Federal University of Pelotas（CDTEC，联邦 Pelotas 大学）； CONICET III-LIDI ； Comision de Investigaciones Cientificas Universidad Nacional de La Plata（科学委员会国立拉普拉塔大学）； Universidade Federal de Pelotas（联邦 Pelotas 大学）

专题命中指令微调：使用GPT-4o生成释义增强手语翻译。

AI总结针对手语翻译中平行语料稀缺和目标词汇长尾分布的问题，提出利用GPT-4o生成参考句子的受控释义变体进行目标端增强，并在三种手语数据集上验证了方法的有效性。

Comments Accepted at GenSign @ CVPR 2026. Non-Proceedings Track (https://genai4sl.github.io/)

详情

AI中文摘要

手语翻译（SLT）仍然受到有限的配对手语视频/文本语料库和长尾目标词汇的限制。我们研究了目标端增强方法，其中GPT-4o生成参考句子的受控释义变体，而手语输入保持不变。采用基于Signformer姿态的Transformer，在两阶段调度下进行训练：先在增强语料库上预训练，然后在原始参考句子上微调。我们在三个具有互补挑战的数据集上进行了评估：PHOENIX14T（德国手语），具有适度的词汇多样性；GSL（希腊手语），具有高度受控、重复的录制；以及LSA-T（阿根廷手语），具有严重的长尾稀疏性。在PHOENIX14T上，增强将BLEU-4从9.56提高到10.33。接近饱和的GSL基线和极其稀疏的LSA-T设置揭示了该方法的局限性。据我们所知，这是第一项将LLM生成的目标端释义和LLM作为评估者应用于手语翻译的研究。语义评估揭示了词汇重叠指标低估的忠实度提升。

英文摘要

Sign language translation (SLT) remains constrained by the limited availability of paired sign-video/text corpora and by the heavy-tailed vocabularies typical of real-world datasets. We study a target-side augmentation strategy in which a large language model (LLM) generates controlled paraphrase variants of the reference spoken-language sentence while the sign input remains unchanged. Concretely, we use GPT-4o to produce semantically faithful variants of the training targets and train a Signformer-style pose-based Transformer under a two-stage schedule: pre-training on the augmented corpus followed by fine-tuning on the original references. We evaluate this strategy on three datasets that span complementary challenges: PHOENIX14T (German Sign Language), a real-world corpus with moderate lexical diversity; the Greek Sign Language Dataset with highly controlled, repetitive recordings; and LSA-T (Argentinian Sign Language), a naturalistic corpus with a large vocabulary and severe long-tail sparsity. This range allows us to characterize precisely when and why target-side augmentation is beneficial. On PHOENIX14T, augmentation improves BLEU-4 from 9.56 to 10.33, demonstrating that paraphrastic exposure helps the decoder generalize beyond memorized reference phrasing. The near-saturated GSL baseline and the extremely sparse LSA-T setting reveal the limits of the approach: in both cases, single-reference lexical overlap metrics are insufficient to capture the full picture, motivating a complementary semantic evaluation. To our knowledge, this is the first study to examine LLM-generated target-side paraphrases as an augmentation mechanism for SLT, and the first to apply an LLM-as-a-Judge evaluation protocol to SLT. This complementary evaluation reveals gains in semantic fidelity that lexical overlap metrics understate.

URL PDF HTML ☆

赞 0 踩 0

2510.06048 2026-06-19 cs.LG 版本更新 85%

BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining

BLISS: 一种用于语言模型预训练数据选择的轻量级双层影响评分方法

Jie Hao, Rui Yu, Wei Zhang, Huixia Wang, Jie Xu, Mingrui Liu

发表机构 * Department of Computer Science, George Mason University, USA（乔治·马歇尔大学计算机科学系）； IBM T.J. Watson Research Center, USA（IBM T.J. Watson研究部）； Department of Statistics, Rice University（里士大学统计系）； Department of System Engineering & Operations Research, George Mason University, USA（乔治·马歇尔大学系统工程与运营管理系）

专题命中预训练：提出数据选择方法用于语言模型预训练

AI总结提出一种无需外部预训练模型的轻量级数据选择方法BLISS，通过双层优化和代理模型估计训练样本的长期影响，实现高效数据筛选，在C4数据集上预训练多种规模模型，显著加速收敛并提升下游任务性能。

详情

AI中文摘要

有效的数据选择对于预训练大型语言模型（LLM）至关重要，可以提高效率并增强对下游任务的泛化能力。然而，现有方法通常需要利用外部预训练模型，使得难以将数据选择的效果与外部预训练模型的效果分开。此外，如果模型训练至收敛，它们通常忽略所选数据的长期影响，这主要是由于全规模LLM预训练的过高成本。在本文中，我们介绍了BLISS（用于数据选择的轻量级双层影响评分方法）：一种轻量级数据选择方法，完全从头开始操作，不依赖任何外部预训练预言模型，同时明确考虑所选数据的长期影响。BLISS利用一个小型代理模型作为LLM的替代，并采用一个评分模型来估计如果代理模型训练至收敛时训练样本的长期影响。我们将数据选择形式化为一个双层优化问题，其中上层目标优化评分模型以分配重要性权重给训练样本，确保最小化下层目标（即在加权训练损失上训练代理模型直至收敛）导致最佳验证性能。一旦优化完成，训练好的评分模型预测数据集的影响分数，从而能够高效选择高质量样本用于LLM预训练。我们通过在C4数据集的选择子集上预训练410M/1B/2.8B Pythia和LLaMA-0.5B模型来验证BLISS。值得注意的是，在1B模型设置下，BLISS在达到与最先进方法相同性能时实现了1.7倍的加速，展示了在多个下游任务上的优越性能。

英文摘要

Effective data selection is essential for pretraining large language models (LLMs), enhancing efficiency and improving generalization to downstream tasks. However, existing approaches often require leveraging external pretrained models, making it difficult to disentangle the effects of data selection from those of the external pretrained models. In addition, they often overlook the long-term impact of selected data if the model is trained to convergence, primarily due to the prohibitive cost of full-scale LLM pretraining. In this paper, we introduce BLISS (\textbf{B}ileve\textbf{L} \textbf{I}nfluence \textbf{S}coring method for data \textbf{S}election): a lightweight data selection method that operates entirely \emph{from scratch}, without relying on any external pretrained oracle models, while explicitly accounting for the long-term impact of selected data. BLISS leverages a small proxy model as a surrogate for the LLM and employs a score model to estimate the long-term influence of training samples if the proxy model is trained to convergence. We formulate data selection as a bilevel optimization problem, where the upper-level objective optimizes the score model to assign importance weights to training samples, ensuring that minimizing the lower-level objective (i.e., training the proxy model over the weighted training loss until convergence) leads to best validation performance. Once optimized, the trained score model predicts influence scores for the dataset, enabling efficient selection of high-quality samples for LLM pretraining. We validate BLISS by pretraining 410M/1B/2.8B Pythia and LLaMA-0.5B models on selected subsets of the C4 dataset. Notably, under the 1B model setting, BLISS achieves $1.7\times$ speedup in reaching the same performance as the state-of-the-art method, demonstrating superior performance across multiple downstream tasks.

URL PDF HTML ☆

赞 0 踩 0

2602.04396 2026-06-19 cs.LG cs.AI 版本更新 80%

LoRDO: Distributed Low-Rank Optimization with Infrequent Communication

LoRDO: 分布式低秩优化与低频通信

Andrej Jovanović, Alex Iacob, Mher Safaryan, Ionut-Vlad Modoranu, Lorenzo Sani, William F. Shen, Xinchi Qiu, Dan Alistarh, Nicholas D. Lane

发表机构 * University of Cambridge（剑桥大学）； Institute of Science and Technology Austria（奥地利科学与技术研究院）； Lancaster University（兰卡斯特大学）； Flower Labs（Flower实验室）

专题命中预训练：LoRDO框架实现分布式低秩优化与低频通信

AI总结提出LoRDO框架，统一低秩优化与低频同步，通过全秩准双曲更新恢复子空间探索，在125M-720M模型规模下实现与低秩DDP近似的性能，通信量减少约10倍。

Comments Accepted at ICML 2026

详情

AI中文摘要

通过$\ exttt{DDP}$进行基础模型的分布式训练受限于互连带宽。虽然低频通信策略减少了同步频率，但优化器状态的内存和通信需求仍然构成瓶颈。低秩优化器可以缓解这些限制；然而，在局部更新机制下，工作节点无法访问计算低秩投影所需的全批次梯度，这降低了性能。我们提出$\ exttt{LoRDO}$，一个统一低秩优化与低频同步的原则性框架。我们首先证明，虽然基于伪梯度的全局投影在理论上更优，但它们将优化轨迹永久限制在低秩子空间中。为了恢复子空间探索，我们引入了一个全秩准双曲更新。$\ exttt{LoRDO}$在125M-720M模型规模的语言建模和下游任务中实现了与低秩$\ exttt{DDP}$近乎相同的性能，同时将通信量减少了约10倍。最后，我们表明在具有小秩/小批次大小的极低内存设置中，$\ exttt{LoRDO}$的性能提升更为显著。

英文摘要

Distributed training of foundation models via $\texttt{DDP}$ is limited by interconnect bandwidth. While infrequent communication strategies reduce synchronization frequency, they remain bottlenecked by the memory and communication requirements of optimizer states. Low-rank optimizers can alleviate these constraints; however, in the local-update regime, workers lack access to the full-batch gradients required to compute low-rank projections, which degrades performance. We propose $\texttt{LoRDO}$, a principled framework unifying low-rank optimization with infrequent synchronization. We first demonstrate that, while global projections based on pseudo-gradients are theoretically superior, they permanently restrict the optimization trajectory to a low-rank subspace. To restore subspace exploration, we introduce a full-rank quasi-hyperbolic update. $\texttt{LoRDO}$ achieves near-parity with low-rank $\texttt{DDP}$ in language modeling and downstream tasks at model scales of $125$M--$720$M, while reducing communication by $\approx 10 \times$. Finally, we show that $\texttt{LoRDO}$ improves performance even more in very low-memory settings with small rank/batch size.

URL PDF HTML ☆

赞 0 踩 0

2512.06899 2026-06-19 cs.CR 版本更新 85%

Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models

Patronus: 识别和缓解预训练语言模型中的可迁移后门

Tianhang Zhao, Haodong Zhao, Wei Du, Pengzhou Cheng, Junxian Li, Sufeng Duan, Haojin Zhu, Gongshen Liu

专题命中其他LLM ：针对预训练语言模型后门攻击的防御框架，涉及LLM安全。

AI总结针对预训练语言模型供应链中可迁移后门的安全威胁，提出Patronus防御框架，通过输入侧不变性检测和双阶段缓解策略，在15个模型和9个任务上实现≥98.3%后门检测召回率。

Comments Work in progress

详情

AI中文摘要

“预训练，然后微调”范式彻底改变了自然语言处理（NLP）。在此背景下，可迁移后门对预训练语言模型（PLMs）供应链构成严重威胁，然而防御研究仍处于起步阶段，主要依赖于检测输出特征空间中的异常。我们发现一个关键缺陷：下游任务的微调不可避免地会修改模型参数，改变输出分布，使得预先计算的防御失效。为解决此问题，我们提出Patronus，一种新颖的防御框架，将防御焦点从输出特征转移到输入侧不变性，利用对抗性触发即使在模型权重变化时也保持恒定的特性。为了克服离散文本优化的收敛挑战，Patronus引入了一种多触发对比搜索算法，有效桥接了基于梯度的优化与对比学习目标。此外，我们采用了一种双阶段缓解策略，结合实时输入监控和通过对抗训练进行的模型净化。在15个PLMs和9个任务上的大量实验表明，Patronus实现了≥98.3%的后门检测召回率，并将攻击成功率降低到干净设置的水平，在所有设置中显著优于所有最先进的基线。代码可从此https URL获取。

英文摘要

The ``Pre-train, then fine-tune'' paradigm has revolutionized Natural Language Processing (NLP). In this context, transferable backdoors pose a severe threat to the Pre-trained Language Models (PLMs) supply chain, yet defensive research remains nascent, primarily relying on detecting anomalies in the output feature space. We identify a critical flaw that fine-tuning on downstream tasks inevitably modifies model parameters, shifting the output distribution and rendering pre-computed defense ineffective. To address this, we propose Patronus, a novel defense framework that shifts the defensive focus from output features to input-side invariance, exploiting the fact that adversarial triggers remain constant even as model weights change. To overcome the convergence challenges of discrete text optimization, Patronus introduces a multi-trigger contrastive search algorithm that effectively bridges gradient-based optimization with contrastive learning objectives. Furthermore, we employ a dual-stage mitigation strategy combining real-time input monitoring with model purification via adversarial training. Extensive experiments across 15 PLMs and nine tasks demonstrate that Patronus achieves $\geq98.3\%$ backdoor detection recall and reduces attack success rates to clean settings, significantly outperforming all state-of-the-art baselines in all settings. Code is available at https://github.com/zth855/Patronus.

URL PDF HTML ☆

赞 0 踩 0

2603.25702 2026-06-19 cs.CL 版本更新 80%

S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

S2D2：通过免训练自我推测实现扩散LLM的快速解码

Ligong Han, Hao Wang, Han Gao, Kai Xu, Akash Srivastava

发表机构 * Red Hat AI Innovation（红帽AI创新）； MIT-IBM Watson AI Lab（MIT-IBM沃森人工智能实验室）； Iowa State University（爱荷华州立大学）； Core AI, IBM（IBM核心AI）

专题命中其他LLM ：扩散LLM解码加速，属于语言模型方法

AI总结提出S2D2，一种免训练的自我推测解码框架，通过将块扩散模型在块大小为1时变为自回归模型，实现草稿与验证角色复用，在不增加训练或测试计算下提升解码速度与准确性。

Comments Code is available at https://github.com/phymhan/S2D2

详情

AI中文摘要

块扩散语言模型通过结合块级自回归解码与块内并行去噪，为超越自回归生成提供了一条有前景的路径。然而，在实际加速所需的少步数场景中，标准的置信度阈值解码往往脆弱：激进的阈值损害质量，而保守的阈值则需要不必要的去噪步骤。现有解决此问题的方法要么需要额外训练，要么增加测试时计算。我们提出S2D2，一种用于块扩散语言模型的免训练自我推测解码框架。我们的关键观察是，当块大小减小到1时，块扩散模型变为自回归模型，从而允许相同的预训练模型同时充当草稿模型和验证模型。S2D2在标准块扩散解码中插入一个推测验证步骤，并使用轻量级路由策略来决定何时验证值得其成本。这产生了一种混合解码轨迹，其中扩散并行提出令牌，而自回归模式充当局部序列级评判器。在三个主流块扩散家族中，S2D2在准确性-速度权衡上持续优于强置信度阈值基线。在SDAR上，我们观察到相比自回归解码高达4.7倍加速，相比调优的动态解码基线高达1.57倍加速，同时准确性提升高达4.5个点。在LLaDA2.1-Mini上，S2D2与内置自校正保持互补，包括在保守设置下比静态基线快4.4倍且准确性略高。

英文摘要

Block-diffusion language models offer a promising path toward faster-than-autoregressive generation by combining block-wise autoregressive decoding with within-block parallel denoising. However, in the few-step regime needed for practical acceleration, standard confidence-thresholded decoding is often brittle: aggressive thresholds hurt quality, while conservative thresholds require unnecessary denoising steps. Existing approaches that address this issue either require additional training or incur extra test-time compute. We present S2D2, a training-free self-speculative decoding framework for block-diffusion language models. Our key observation is that a block-diffusion model becomes autoregressive when the block size is reduced to one, allowing the same pretrained model to act as both drafter and verifier. S2D2 inserts a speculative verification step into standard block-diffusion decoding and uses lightweight routing policies to decide when verification is worth its cost. This yields a hybrid decoding trajectory in which diffusion proposes tokens in parallel, while the autoregressive mode acts as a local sequence-level critic. Across three mainstream block-diffusion families, S2D2 consistently improves the accuracy-speed tradeoff over strong confidence-thresholding baselines. On SDAR, we observe up to $4.7\times$ speedup over autoregressive decoding, and up to $1.57\times$ over a tuned dynamic decoding baseline while improving accuracy by up to $4.5$ points. On LLaDA2.1-Mini, S2D2 remains complementary to built-in self-correction, including a conservative setting where it is $4.4\times$ faster than the static baseline with slightly higher accuracy.

URL PDF HTML ☆

赞 0 踩 0

2603.16606 2026-06-19 cs.CL 版本更新 80%

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Omnilingual SONAR：跨语言与跨模态句子嵌入，连接大规模多语言文本与语音

Omnilingual SONAR Team, João Maria Janeiro, Pere-Lluís Huguet Cabot, Ioannis Tsiamas, Yen Meng, Vivek Iyer, Guillem Ramírez, Loic Barrault, Belen Alastruey, Xiang "Tony" Cao, Yu-An Chung, Marta R. Costa-Jussa, David Dale, Kevin Heffernan, Jaehyeong Jo, Artyom Kozhevnikov, Alexandre Mourachko, Christophe Ropers, Holger Schwenk, Paul-Ambroise Duquenne

发表机构 * FAIR at Meta（Meta的FAIR）

专题命中其他LLM ：跨语言跨模态句子嵌入模型

AI总结提出OmniSONAR模型，通过渐进式训练和教师-学生蒸馏，在数千种语言上实现文本、语音、代码和数学表达式的统一语义嵌入，在跨语言检索和翻译任务上显著降低错误率，并支持零样本语音翻译。

详情

AI中文摘要

跨语言句子编码器通常只覆盖几百种语言，并且常常为了更强的对齐而牺牲下游质量，限制了它们的采用。我们引入了OmniSONAR，一个新的全语言、跨语言和跨模态句子嵌入模型家族，它原生地将文本、语音、代码和数学表达式嵌入到单一语义空间中，同时在数千种语言（从高资源到极低资源变体）的规模上提供最先进的下游性能。为了在不发生表示崩溃的情况下达到这一规模，我们使用了渐进式训练。我们首先使用LLM初始化的编码器-解码器，结合token级解码、新颖的分裂softmax对比损失和合成硬负样本，为200种语言学习一个强大的基础空间。在此基础上，我们通过两阶段教师-学生编码器蒸馏框架扩展到数千种语言变体。最后，我们通过将177种口语无缝映射到该空间，展示了该空间的跨模态可扩展性。OmniSONAR将200种语言的FLORES数据集上的跨语言相似性搜索错误减半，并在1560种语言的BIBLE基准上将错误减少了15倍。它还实现了强大的翻译性能，在多语言基准上优于NLLB-3B，并在1560种语言到英语的BIBLE翻译上比先前模型（包括更大的LLM）高出15个chrF++点。OmniSONAR在MTEB和XLCoST上也表现强劲。对于语音，OmniSONAR实现了43%更低的相似性搜索错误，并达到了SeamlessM4T语音到文本质量的97%，尽管对于翻译是零样本（仅在ASR数据上训练）。最后，通过训练一个编码器-解码器LM Spectrum，仅使用英语文本处理OmniSONAR嵌入序列，我们为复杂的下游任务解锁了向数千种语言和语音的高性能迁移。

英文摘要

Cross-lingual sentence encoders typically cover only a few hundred languages and often trade downstream quality for stronger alignment, limiting their adoption. We introduce OmniSONAR, a new family of omnilingual, cross-lingual and cross-modal sentence embedding models that natively embed text, speech, code, and mathematical expressions in a single semantic space, while delivering state-of-the-art downstream performance at the scale of thousands of languages, from high-resource to extremely low-resource varieties. To reach this scale without representation collapse, we use progressive training. We first learn a strong foundational space for 200 languages with an LLM-initialized encoder-decoder, combining token-level decoding with a novel split-softmax contrastive loss and synthetic hard negatives. Building on this foundation, we expand to several thousands language varieties via a two-stage teacher-student encoder distillation framework. Finally, we demonstrate the cross-modal extensibility of this space by seamlessly mapping 177 spoken languages into it. OmniSONAR halves cross-lingual similarity search error on the 200-language FLORES dataset and reduces error by a factor of 15 on the 1,560-language BIBLE benchmark. It also enables strong translation, outperforming NLLB-3B on multilingual benchmarks and exceeding prior models (including much larger LLMs) by 15 chrF++ points on 1,560 languages into English BIBLE translation. OmniSONAR also performs strongly on MTEB and XLCoST. For speech, OmniSONAR achieves a 43% lower similarity-search error and reaches 97% of SeamlessM4T speech-to-text quality, despite being zero-shot for translation (trained only on ASR data). Finally, by training an encoder-decoder LM, Spectrum, exclusively on English text processing OmniSONAR embedding sequences, we unlock high-performance transfer to thousands of languages and speech for complex downstream tasks.

URL PDF HTML ☆

赞 0 踩 0

2512.03818 2026-06-19 cs.CL 版本更新 80%

Improving Alignment Between Human and Machine Codes: An Empirical Assessment of Prompt Engineering for Construct Identification in Psychology

改善人机编码对齐：心理学构念识别中提示工程的实证评估

Kylie L. Anglin, Stephanie Milan, Brittney Hernandez, Claudia Ventura

发表机构 * Department of Educational Psychology, Neag School of Education, University of Connecticut（教育心理学系，教育学院，康涅狄格大学）； Department of Psychological Sciences, College of Liberal Arts and Sciences, University of Connecticut（心理学系，文理学院，康涅狄格大学）

专题命中其他LLM ：优化LLM在心理学文本中识别构念的提示工程。

AI总结本研究提出一个实证框架，通过提示工程优化大语言模型在心理学文本中识别构念的性能。实验评估五种提示策略，发现构念定义和任务框架最关键，结合代码簿引导和自动提示工程的少样本方法最接近专家判断。

Comments 22 pages, 2 figures

详情

AI中文摘要

由于其架构和庞大的预训练数据，大语言模型（LLMs）表现出强大的文本分类性能。然而，LLM的输出——这里指分配给文本的类别——在很大程度上取决于提示的措辞。尽管关于提示工程的文献正在扩展，但很少有研究关注分类任务，更少有研究涉及心理学等领域，在这些领域中，构念具有精确的、理论驱动的定义，而这些定义可能未在预训练数据中得到充分体现。我们提出了一个实证框架，通过提示工程优化LLM在文本中识别构念的性能。我们实验评估了五种提示策略——代码簿引导的实证提示选择、自动提示工程、角色提示、思维链推理和解释性提示——采用零样本和少样本分类。我们发现，角色、思维链和解释并不能完全解决因措辞不当的提示而导致的性能损失。相反，提示中最有影响力的特征是构念定义、任务框架，以及在较小程度上提供的示例。在三个构念和两个模型中，与专家判断最一致的分类来自结合代码簿引导的实证提示选择和自动提示工程的少样本提示。基于我们的发现，我们建议研究人员生成并评估尽可能多的提示变体，无论是人工编写的、自动生成的，或者理想情况下两者兼有，并根据训练数据集中的实证性能选择提示和示例，在保留集中验证最终方法。该程序提供了一种实用、系统且理论驱动的方法，用于在需要与专家判断对齐的环境中优化LLM提示。

英文摘要

Due to their architecture and vast pre-training data, large language models (LLMs) demonstrate strong text classification performance. However, LLM output - here, the category assigned to a text - depends heavily on the wording of the prompt. While literature on prompt engineering is expanding, few studies focus on classification tasks, and even fewer address domains like psychology, where constructs have precise, theory-driven definitions that may not be well represented in pre-training data. We present an empirical framework for optimizing LLM performance for identifying constructs in texts via prompt engineering. We experimentally evaluate five prompting strategies -- codebook-guided empirical prompt selection, automatic prompt engineering, persona prompting, chain-of-thought reasoning, and explanatory prompting - with zero-shot and few-shot classification. We find that persona, chain-of-thought, and explanations do not fully address performance loss accompanying a badly worded prompt. Instead, the most influential features of a prompt are the construct definition, task framing, and, to a lesser extent, the examples provided. Across three constructs and two models, the classifications most aligned with expert judgments resulted from a few-shot prompt combining codebook-guided empirical prompt selection with automatic prompt engineering. Based on our findings, we recommend that researchers generate and evaluate as many prompt variants as feasible, whether human-crafted, automatically generated, or ideally both, and select prompts and examples based on empirical performance in a training dataset, validating the final approach in a holdout set. This procedure offers a practical, systematic, and theory-driven method for optimizing LLM prompts in settings where alignment with expert judgment is critical.

URL PDF HTML ☆

赞 0 踩 0

2606.06971 2026-06-19 cs.MA cs.SI 版本更新 70%

Modeling U.S. Attitudes Toward China via an Event-Steered Multi-Agent Simulator

通过事件驱动的多智能体模拟器建模美国对华态度

Chenxu Zhu, Hantao Yao, Wu Liu, Junbo Guo, Yongdong Zhang

专题命中其他LLM ：基于LLM的多智能体模拟，驱动舆论演化

AI总结提出事件驱动多智能体模拟器（ES-MAS），利用CURE数据集和双流数据集成引擎（DSDIE）及新闻驱动动态交互模块（NDDI），模拟美国对华舆论的动态演化，实验表明优于现有模型。

详情

AI中文摘要

理解舆论的动态演化，如美国公众对中国的态度，对于评估地缘政治风险至关重要。然而，现有的基于LLM的多智能体模拟器主要依赖静态规则和固定数据集，限制了其捕捉现实世界中宏观层面舆论转变的动态、事件驱动特性的能力。为解决这一限制，我们提出了一种事件驱动的多智能体模拟器（ES-MAS），其中重大事件和日常新闻通过智能体之间的动态交互持续驱动舆论演化。我们首先构建了中美关系演化（CURE）数据集，涵盖2021年至2025年的20个季度，包括258个重大事件和超过14,000篇日常新闻文章，为建模舆论动态提供了全面的时间基础。基于CURE数据集，我们提出了双流数据集成引擎（DSDIE），该引擎通过宏观层面事件将模拟与历史时间线对齐，同时基于个体智能体画像和上下文信号实现个性化信息暴露。此外，我们设计了新闻驱动的动态交互（NDDI）模块，该模块自适应地将具有共同新闻兴趣的智能体分组到局部交互上下文中，促进自下而上的共识形成，同时降低孤立信息茧房的风险。在CURE数据集上的实验结果表明，ES-MAS在复现真实世界历史趋势方面显著优于现有模拟器，为建模动态舆论演化提供了一个可扩展且有效的框架。

英文摘要

Understanding the dynamic evolution of opinions, such as U.S. public attitudes toward China, is essential for assessing geopolitical risks. However, existing LLM-based multiagent simulators predominantly rely on static rules and fixed datasets, limiting their ability to capture the dynamic, event-driven nature of macro-level opinion shifts in real-world settings. To address this limitation, we propose an Event-Steered Multi-Agent Simulator (ES-MAS), in which significant events and daily news continuously drive opinion evolution through dynamic interactions among agents. We first construct the China-U.S. Relation Evolution (CURE) dataset, covering 20 quarters from 2021 to 2025, including 258 major events and over 14,000 daily news articles, and providing a comprehensive temporal foundation for modeling opinion dynamics. Building upon the CURE dataset, we propose a Dual-Stream Data Integration Engine (DSDIE) that aligns simulations with historical timelines via macro-level events while enabling personalized information exposure based on individual agent profiles and contextual signals. Furthermore, we design a News-Driven Dynamic Interaction (NDDI) module, which adaptively groups agents with shared news interests into localized interaction contexts, facilitating bottom-up consensus formation while mitigating the risk of isolated information cocoons. Experimental results on the CURE dataset demonstrate that ES-MAS substantially outperforms existing simulators in reproducing real-world historical trends, offering a scalable and effective framework for modeling dynamic opinion evolution.

URL PDF HTML ☆

赞 0 踩 0

2604.07593 2026-06-19 cs.AI 版本更新 70%

Too long; didn't solve

太长；没解决

Lucía M. Cabrera, Isaac Saxton-Knight, Jocelyn D'Arcy

发表机构 * Instituto Balseiro（巴塞罗那研究所）； Poindexter Labs（波因迪克斯实验室）

专题命中其他LLM ：提示长度与数学推理性能关系研究

AI总结研究提示长度和解答长度与大型语言模型在数学问题上的性能关系，发现两者与模型失败率正相关。

2604.01955 2026-06-19 cs.CY 版本更新 70%

Teaching Students to Question the Machine: An AI Literacy Intervention Improves Students' Regulation of LLM Use in a Science Task

教导学生质疑机器：一项AI素养干预措施提升学生在科学任务中调节LLM使用的能力

O. Clerc, R. Abdelghani, C. Desvaux, E. Poisson, P. Y. Oudeyer, H. Sauzéon

专题命中其他LLM ：AI素养干预提升学生LLM使用能力

AI总结本研究通过两小时的AI素养工作坊，训练中学生（8-9年级）在科学问题解决中更有效地使用大语言模型，减少盲目依赖并提高答案质量。

Comments Workshop paper accepted at ALIT4ALL 2026: 2nd International Workshop on AI Literacy Education For All, co-located with AIED 2026

详情

AI中文摘要

生成式人工智能（GenAI）在学校中的快速普及引发了人们对学生不加批判地依赖其输出的担忧。有效使用大语言模型（LLM）不仅需要技术知识，还需要监控、评估和调节与系统交互的能力，这些过程与元认知调节密切相关。这些技能在中学阶段仍在发展中，使得学生特别容易过度信任和过早接受AI输出。由于课堂时间和教师培训资源有限，迫切需要开发和评估可在现实学校条件下实施的AI素养干预措施。我们报告了一项受控的课堂研究，考察两小时的AI素养工作坊是否能改善学生在LLM支持的科学问题解决中的交互策略和最终答案质量。共有116名学生（8-9年级；13-15岁）使用生成式AI系统完成了六项科学调查任务。两天前，干预组参加了工作坊，该工作坊结合了关于LLM如何工作及失败的信息，以及关于提示和响应评估的实用指导；对照组未接受培训。受过训练的学生表现出更少的盲目依赖：他们更频繁地重新表述查询、提出后续问题，并更准确地判断响应正确性，从而获得更好的表现。相比之下，GenAI和元认知自我报告分数不能预测表现，这表明有效使用生成式AI较少依赖于自我报告测量，而更多依赖于交互调节的明确训练。总体而言，结果表明，简短、可扩展的AI素养教学可以显著改善中学生在校本学习活动中使用生成式AI的方式。

英文摘要

The rapid adoption of generative artificial intelligence (GenAI) in schools raises concerns about students' uncritical reliance on its outputs. Effective use of large language models (LLMs) requires not only technical knowledge but also the ability to monitor, evaluate, and regulate one's interaction with the system, processes closely tied to metacognitive regulation. These skills are still developing in middle school, making students particularly vulnerable to over-trust and premature acceptance of AI outputs. Because classroom time and teacher training resources are constrained, there is a pressing need to develop and evaluate AI literacy interventions that can be implemented under realistic school conditions. We report a controlled classroom study examining whether a two-hour AI literacy workshop improves students' interaction strategies and quality of final answers in LLM-supported science problem solving. A total of 116 students (grades 8-9; ages 13-15) completed six science investigation tasks using a generative AI system. Two days prior, the intervention group attended the workshop, which combined information about how LLMs work and fail with practical guidance on prompting and response evaluation; the control group received no training. Trained students showed less uncritical reliance on the system: they more often reformulated queries, asked follow-up questions, and more accurately judged response correctness, leading to better performance. In contrast, GenAI and metacognitive self-report scores did not predict performance, suggesting that effective use of generative AI depends less on self-reported measures and more on explicit training in interaction regulation. Overall, the results show that brief, scalable AI literacy instruction can meaningfully improve how middle-school students use generative AI in school-like learning activities.

URL PDF HTML ☆

赞 0 踩 0

2603.16941 2026-06-19 eess.AS cs.CL cs.SD 版本更新 70%

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

言语背后的声音：量化语音大语言模型中的交叉偏见

Shree Harsha Bokkahalli Satish, Christoph Minixhofer, Maria Teleki, James Caverlee, Ondřej Klejch, Peter Bell, Gustav Eje Henter, Éva Székely

发表机构 * 1 Department of Speech, Music ； Hearing, KTH Royal Institute of Technology, Sweden 2 Centre for Speech Technology Research, University of Edinburgh, UK 3 Texas A\&M University, USA

专题命中其他LLM ：语音大语言模型中的交叉偏见量化

AI总结本研究通过2880次受控交互，评估三种语音大语言模型在六种英语口音和两种性别呈现中的口音与性别交叉偏见，发现东欧口音（尤其女性）获得更低有用性评分，且人类评估者比LLM评判更敏感。

Comments 5 pages, 3 figures, 1 table, Accepted to Interspeech 2026

详情

AI中文摘要

语音大语言模型直接处理语音输入，保留了之前级联管道中去除的口音和感知性别等线索，这导致了依赖于说话者身份的反应差异。我们使用2880次受控交互（涵盖六种英语口音和两种性别呈现，通过语音克隆保持语言内容不变），对三种语音大语言模型中的口音和性别偏见进行了大规模交叉评估。通过逐点LLM评判评分、成对比较以及经过人工验证的最佳-最差缩放，我们检测到反复出现的定向差异。东欧口音的语音获得较低的有用性评分，尤其是女性呈现的语音。反应保持礼貌但在有用性上存在差异。虽然LLM评判捕捉到了这些偏见的定向趋势，但人类评估者表现出显著更高的敏感性，显示出更强的口音级别对比。

英文摘要

Speech Large Language Models (SpeechLLMs) process spoken input directly, retaining cues such as accent and perceived gender that were previously removed in cascaded pipelines. This introduces speaker identity dependent variation in responses. We present a large-scale intersectional evaluation of accent and gender bias in three SpeechLLMs using 2,880 controlled interactions across six English accents and two gender presentations, keeping linguistic content constant through voice cloning. Using pointwise LLM-judge ratings, pairwise comparisons, and Best-Worst Scaling with human validation, we detect recurring directional disparities. Eastern European-accented speech receives lower helpfulness scores, particularly for female-presenting voices. Responses remain polite but differ in helpfulness. While LLM judges capture the directional trend of these biases, human evaluators exhibit significantly higher sensitivity, showing stronger accent-level contrasts.

URL PDF HTML ☆

赞 0 踩 0

2603.16357 2026-06-19 cs.CY cs.SE 版本更新 70%

Beyond Grading Accuracy: Exploring Alignment of TAs and LLMs

超越评分准确性：探索助教与LLMs的一致性

Matthijs Jansen op de Haar, Nacir Bouali, Faizan Ahmed

专题命中其他LLM ：开源LLM用于UML类图评分评估

AI总结本文提出一个评估管道，通过定量研究92个UML类图，比较助教与六个开源LLMs在单个评分标准上的表现，发现开源LLMs在评分准确性上接近助教，为混合主动评分系统提供了可能。

Comments 7 pages, 3 figures

详情

AI中文摘要

在本文中，我们研究了开源大型语言模型（LLMs）在评分统一建模语言（UML）类图方面的潜力。与现有主要评估专有LLMs的工作不同，我们专注于非专有模型，使得我们的方法适用于对透明度和成本敏感的大学。此外，现有研究评估的是完整图表而非单个标准的性能，对自动评分与人类评估的一致性提供的见解有限。为解决这些差距，我们提出一个评分管道，其中学生生成的UML类图由助教（TAs）和LLMs独立评估，然后在单个标准级别比较评分。我们通过一项对软件设计课程中92个UML类图的定量研究来评估该管道，将助教评分与六个开源LLMs产生的评估进行比较。性能在单个标准上测量，突出LLMs与人类评分者存在差异的领域。我们的结果显示，每个标准的准确率高达88.56%，皮尔逊相关系数高达0.78，仅使用开源模型就比先前工作有显著改进。这些模型的性能接近助教，表明了一条通往混合主动评分系统的可能路径，其中助教在评分中得到辅助。我们的发现表明，开源LLMs可以通过明确识别与评分标准的一致性来有效支持UML类图评分。所提出的管道提供了一种实用方法，以应对随着学生人数增长而增加的工作量。

英文摘要

In this paper, we investigate the potential of open-source Large Language Models (LLMs) for grading Unified Modeling Language (UML) class diagrams. In contrast to existing work, which primarily evaluates proprietary LLMs, we focus on non-proprietary models, making our approach suitable for universities where transparency and cost are critical. Additionally, existing studies assess performance over complete diagrams rather than individual criteria, offering limited insight into how automated grading aligns with human evaluation. To address these gaps, we propose a grading pipeline in which student-generated UML class diagrams are independently evaluated by both teaching assistants (TAs) and LLMs. Grades are then compared at the level of individual criteria. We evaluate this pipeline through a quantitative study of 92 UML class diagrams from a software design course, comparing TA grades against assessments produced by six open-source LLMs. Performance is measured across individual criteria, highlighting areas where LLMs diverge from human graders. Our results show per-criterion accuracy of up to 88.56\% and a Pearson correlation coefficient of up to 0.78, representing a substantial improvement over previous work while using only open-source models. The models achieve performance close to that of a TA, suggesting a possible path toward a mixed-initiative grading system, where TAs are aided in their grading. Our findings demonstrate that open-source LLMs can effectively support UML class diagram grading by explicitly identifying alignment with grading criteria. The proposed pipeline provides a practical approach to managing increasing workloads with growing student counts.

URL PDF HTML ☆

赞 0 踩 0

2502.19193 2026-06-19 cs.SI cs.AI cs.NE 版本更新 70%

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

受监管社交媒体平台下的语言演化模拟：大语言模型与遗传算法的协同方法

Jinyu Cai, Yusei Ishimizu, Mingyue Zhang, Munan Li, Jialong Li, Kenji Tei

专题命中其他LLM ：用LLM模拟语言演化，结合遗传算法

AI总结提出基于大语言模型的多智能体框架，结合遗传算法模拟用户语言策略在监管下的迭代演化，实验表明对话轮次增加可提升信息传递准确性和对话持续性。

Comments The manuscript has been accepted to IEEE Transactions on Computational Social Systems

详情

AI中文摘要

社交媒体平台经常实施限制性政策来调节用户内容，从而催生出创造性的规避语言策略。本文提出了一个基于大语言模型（LLMs）的多智能体框架，用于模拟在监管约束下语言策略的迭代演化。在该框架中，参与者智能体作为社交媒体用户，不断演化其语言表达，而监管智能体通过评估政策违规来模拟平台级别的监管。为了实现更逼真的模拟，我们采用了语言策略的双重设计（约束和表达）来区分冲突目标，并利用LLM驱动的遗传算法（GA）进行语言策略的选择、变异和交叉。该框架使用两种不同的场景进行评估：一个抽象的密码游戏和一个逼真的模拟非法宠物交易场景。实验结果表明，随着对话轮次的增加，不间断对话轮次的数量和信息传输的准确性都显著提高。此外，一项包含40名参与者的用户研究验证了生成对话和策略的现实相关性。消融研究也验证了GA的重要性，强调了其对长期适应性和整体结果改善的贡献。

英文摘要

Social media platforms frequently impose restrictive policies to moderate user content, prompting the emergence of creative evasion language strategies. This paper presents a multi-agent framework based on Large Language Models (LLMs) to simulate the iterative evolution of language strategies under regulatory constraints. In this framework, participant agents, as social media users, continuously evolve their language expression, while supervisory agents emulate platform-level regulation by assessing policy violations. To achieve a more faithful simulation, we employ a dual design of language strategies (constraint and expression) to differentiate conflicting goals and utilize an LLM-driven GA (Genetic Algorithm) for the selection, mutation, and crossover of language strategies. The framework is evaluated using two distinct scenarios: an abstract password game and a realistic simulated illegal pet trade scenario. Experimental results demonstrate that as the number of dialogue rounds increases, both the number of uninterrupted dialogue turns and the accuracy of information transmission improve significantly. Furthermore, a user study with 40 participants validates the real-world relevance of the generated dialogues and strategies. Moreover, ablation studies validate the importance of the GA, emphasizing its contribution to long-term adaptability and improved overall results.

URL PDF HTML ☆

赞 0 踩 0

2605.05481 2026-06-19 cs.LG 版本更新 60%

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

近似下一策略采样：替代深度强化学习中的保守目标策略更新

Dillon Sandhu, Ronald Parr

专题命中其他LLM ：提出近似下一策略采样方法，属于强化学习，非LLM核心内容

AI总结提出近似下一策略采样（ANPS）方法，通过修改训练分布而非约束策略更新来解决强化学习中的“鸡生蛋”问题，并基于此设计稳定值近似策略迭代（SV-API）算法，在Atari和连续控制任务上实现更大目标策略更新且性能匹配或提升。

详情

AI中文摘要

我们重新审视强化学习中一个经典的“鸡生蛋”问题：为了安全地改进策略，价值函数必须在更新策略的状态访问分布上准确。该状态分布是未知的，且无法为训练价值函数而采样。保守更新解决了这个问题，但代价是缩小策略更新。本文探索了一种替代方案，即近似下一策略采样（ANPS），它通过修改训练分布而非约束策略更新来解决问题。如果训练数据的分布近似于下一策略的分布，则ANPS成立。为了证明ANPS的可行性和有效性，我们引入了稳定值近似策略迭代（SV-API）。SV-API修改了标准的近似策略迭代循环，在迭代更新的行为策略收集相关经验的同时，保持目标策略固定。它仅在满足收敛准则后才承诺采用新策略。如果满足某些稳定性准则，则更新保证是安全的；否则，其安全性不低于标准近似策略迭代。将SV-API应用于PPO得到稳定值PPO（SV-PPO），在高维离散（Atari）和连续控制基准测试中，SV-PPO在执行显著更大的目标策略更新的同时，性能匹配或提升。这些结果证明了ANPS作为RL中这一经典挑战的新解决方案的可行性。

英文摘要

We revisit a classic "chicken-and-egg" problem in reinforcement learning: to safely improve a policy, the value function must be accurate on the state-visitation distribution of the updated policy. That distribution over states is unknown and cannot be sampled for the purposes of training the value function. Conservative updates solve this problem, but at the cost of shrinking the policy update. This paper explores an alternative solution, Approximate Next Policy Sampling (ANPS), which addresses the problem by modifying the training distribution rather than constraining the policy update. ANPS is satisfied if the distribution of the training data approximates that of the next policy. To demonstrate the feasibility and efficacy of ANPS, we introduce Stable Value Approximate Policy Iteration (SV-API). SV-API modifies the standard approximate policy iteration loop to hold the target policy fixed while an iteratively updated behavioral policy gathers relevant experience. It only commits to a new policy once a convergence criterion has been met. If certain stability criteria are met, the update is guaranteed to be safe; otherwise, it remains no less safe than standard approximate policy iteration. Applying SV-API to PPO yields Stable Value PPO (SV-PPO), which matches or improves performance on high-dimensional discrete (Atari) and continuous control benchmarks while executing substantially larger target policy updates. These results demonstrate the viability of ANPS as a new solution to this classic challenge in RL.

URL PDF HTML ☆

赞 0 踩 0

2604.07328 2026-06-19 cs.LG 版本更新 60%

How to sketch a learning algorithm

如何勾勒学习算法

Sam Gunn

发表机构 * UC Berkeley（伯克利大学）

专题命中其他LLM ：提出数据删除方案用于深度学习模型

AI总结提出一种数据删除方案，基于稳定性假设，通过随机复方向的高阶导数局部勾勒算术电路，实现深度学习模型输出预测的误差和失败概率可忽略，且预计算和推理仅慢对数因子。

Comments Improved presentation and simplified Algorithm 4

详情

AI中文摘要

训练数据的选择如何影响AI模型？这个广泛的问题对于可解释性、隐私和基础科学至关重要。其技术核心是数据删除问题：在合理的预计算量之后，快速预测如果从学习算法中排除给定训练数据子集，模型在给定情况下的行为。我们提出了一种数据删除方案，能够在深度学习设置中以可忽略的误差$\varepsilon$和失败概率$\delta$预测模型输出。我们的预计算和预测算法分别仅比常规训练和推理慢$\tilde{O}(\log(1/\delta)/\varepsilon^2)$因子。存储需求为$\tilde{O}(\log(1/\delta)/\varepsilon^2)$个模型。我们的证明基于一个称为稳定性的假设。与先前工作所做的假设相比，稳定性似乎与学习强大AI模型完全兼容。为支持这一点，我们展示了稳定性在microgpt的最小实验集中得到满足。我们的代码可在https://this URL获取。在技术层面，我们的工作基于一种新方法，通过计算随机复方向的高阶导数来局部勾勒算术电路。前向模式自动微分允许廉价计算这些导数。

英文摘要

How does the choice of training data influence an AI model? This broad question is of central importance to interpretability, privacy, and basic science. At its technical core is the data deletion problem: after a reasonable amount of precomputation, quickly predict how the model would behave in a given situation if a given subset of training data had been excluded from the learning algorithm. We present a data deletion scheme capable of predicting model outputs with vanishing error $\varepsilon$ and failure probability $δ$ in the deep learning setting. Our precomputation and prediction algorithms are only $\tilde{O}(\log(1/δ)/\varepsilon^2)$ factors slower than regular training and inference, respectively. The storage requirements are those of $\tilde{O}(\log(1/δ)/\varepsilon^2)$ models. Our proof is based on an assumption that we call stability. In contrast to the assumptions made by prior work, stability appears to be fully compatible with learning powerful AI models. In support of this, we show that stability is satisfied in a minimal set of experiments with microgpt. Our code is available at https://github.com/SamSpo1/microgpt-sketch. At a technical level, our work is based on a new method for locally sketching an arithmetic circuit by computing higher-order derivatives in random complex directions. Forward-mode automatic differentiation allows cheap computation of these derivatives.

URL PDF HTML ☆

赞 0 踩 0

2604.06464 2026-06-19 cs.LG physics.app-ph stat.ML 版本更新 60%

Weighted Bayesian Conformal Prediction

加权贝叶斯共形预测

Xiayin Lou, Peng Luo

发表机构 * Technical University of Munich（慕尼黑技术大学）； Massachusetts Institute of Technology（麻省理工学院）

专题命中其他LLM ：加权贝叶斯共形预测方法

AI总结提出加权贝叶斯共形预测（WBCP），通过加权Dirichlet先验推广贝叶斯共形预测到重要性加权设置，理论证明有效样本量决定后验方差，并提供更丰富的条件覆盖不确定性。

详情

AI中文摘要

共形预测提供具有有限样本覆盖保证的分布自由预测区间，Snell & Griffiths 最近的工作将其重新解释为贝叶斯求积（BQ-CP），通过阈值上的 Dirichlet 后验产生强大的数据条件保证。然而，BQ-CP 根本上要求 i.i.d. 假设。同时，加权共形预测通过重要性权重处理分布偏移，但仍然是频率学派方法，仅产生点估计阈值。我们提出 \textbf{加权贝叶斯共形预测（WBCP）}，它将 BQ-CP 推广到任意重要性加权设置，用加权 Dirichlet $\Dir(\neff \cdot \tilde{w}_1, \ldots, \neff \cdot \tilde{w}_n)$ 替换均匀 Dirichlet $\Dir(1,\ldots,1)$，其中 $\neff$ 是 Kish 有效样本量。我们证明了四个理论结果：(1)~$\neff$ 是匹配频率学派和贝叶斯方差的唯一集中参数；(2)~后验标准差以 $O(1/\sqrt{\neff})$ 衰减；(3)~BQ-CP 的随机占优保证扩展到每个权重轮廓的数据条件保证；(4)~HPD 阈值在条件覆盖上提供 $O(1/\sqrt{\neff})$ 的改进。我们将 WBCP 实例化为 \emph{地理贝叶斯共形预测}，其中基于核的空间权重产生每个位置的后验，并具有可解释的诊断。在合成和真实空间数据集上的实验表明，WBCP 在保持覆盖保证的同时提供了更丰富的不确定性信息。

英文摘要

Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell \& Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose \textbf{Weighted Bayesian Conformal Prediction (WBCP)}, which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\Dir(1,\ldots,1)$ with a weighted Dirichlet $\Dir(\neff \cdot \tilde{w}_1, \ldots, \neff \cdot \tilde{w}_n)$, where $\neff$ is Kish's effective sample size. We prove four theoretical results: (1)~$\neff$ is the unique concentration parameter matching frequentist and Bayesian variances; (2)~posterior standard deviation decays as $O(1/\sqrt{\neff})$; (3)~BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4)~the HPD threshold provides $O(1/\sqrt{\neff})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as \emph{Geographical BQ-CP}, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.

URL PDF HTML ☆

赞 0 踩 0

2605.17443 2026-06-19 cs.CL cs.SD eess.AS 版本更新 80%

Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

分析韩语语音问答中ASR-LLM级联中的误差传播

Donghyuk Jung, Youngwon Choi

发表机构 * Korea Culture Technology Institute, Republic of Korea（韩国文化科技研究所）； Maum AI Inc., Republic of Korea（马姆人工智能公司）

专题命中领域大模型：研究ASR-LLM级联在韩语语音问答中的误差传播

AI总结本文研究了韩语语音问答中ASR-LLM级联中误差传播的问题，通过分析下游语义失败，揭示了传统ASR指标无法完全捕捉的误差影响，发现不同性能的LLM在级联降级上的一致性，识别出单字符ASR错误作为语义失败通道，并通过辅助比较表明大音频语言模型在噪声韩语SQA中优于匹配语言模型的ASR-LLM流水线。

Comments Preprint. Submitted to APSIPA ASC 2026

2604.18105 2026-06-19 eess.AS cs.CL cs.SD 版本更新 80%

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

NIM4-ASR：迈向高效、鲁棒且可定制的实时基于LLM的语音识别

Yuan Xie, Jiaqi Song, Guang Qiu, Xianliang Wang, Kai Qiao, Junfeng Yuan, Shengqing Liu, Yi Zhang, Bowen Chen, Ming Lei, Jie Gao, Jie Wu

发表机构 * Advanced Intelligent Systems Group, NIO（蔚来智能系统集团）

专题命中领域大模型：提出基于LLM的语音识别框架NIM4-ASR

AI总结提出NIM4-ASR框架，通过重新设计多阶段训练范式（包括预训练架构优化、迭代异步SFT和ASR专用强化学习）以及生产优化（噪声鲁棒性、流式推理和RAG热词定制），在2.3B参数下实现SOTA性能。

详情

AI中文摘要

将大语言模型（LLM）集成到自动语音识别（ASR）中已成为近年来的主流范式。尽管现有的基于LLM的ASR模型在公共基准上表现出色，但其训练仍然主要依赖数据驱动，未能充分解决关键的实际挑战——特别是在资源受限部署中的有限向下可扩展性以及声学挑战条件下的幻觉问题。为了解决这些问题，我们提出了NIM4-ASR，一个面向生产的、基于LLM的ASR框架，针对效率和鲁棒性进行了优化。基于编码器和LLM之间功能角色的原则性划分，我们重新设计了多阶段训练范式，使每个模块与其预期的能力边界对齐。具体来说，我们重新制定了预训练架构和目标以缓解模态差距并提高参数效率；引入了迭代异步SFT阶段以保持声学保真度并约束表示漂移；设计了ASR专用的强化学习阶段以进一步提高识别质量和鲁棒性。我们还加入了一系列面向生产的优化，包括噪声和静音条件下的鲁棒性、实时流式推理以及通过检索增强生成（RAG）进行的热词定制。实验表明，NIM4-ASR仅用2.3B参数就在多个公共基准上达到了最先进的性能，同时在内部基准上显著优于更大规模的竞争对手——特别是在实体密集的真实场景中。NIM4-ASR进一步通过RAG支持百万级热词定制，检索延迟低于毫秒，从而能够高效适应新兴实体和个性化用户需求。

英文摘要

Integrating large language models (LLMs) into automatic speech recognition (ASR) has become a mainstream paradigm in recent years. Although existing LLM-based ASR models demonstrate impressive performance on public benchmarks, their training remains predominantly data-driven, leaving key practical challenges insufficiently addressed -- particularly limited downward scalability in resource-constrained deployments and hallucinations under acoustically challenging conditions. To address these issues, we present NIM4-ASR, a production-oriented LLM-based ASR framework optimized for both efficiency and robustness. Grounded in a principled delineation of functional roles between the encoder and the LLM, we redesign the multi-stage training paradigm to align each module with its intended capability boundary. Specifically, we reformulate the pre-training architecture and objective to mitigate the modality gap and improve parameter efficiency; introduce an iterative asynchronous SFT stage to preserve acoustic fidelity and constrain representation drift; and design an ASR-specialized reinforcement learning stage to further enhance recognition quality and robustness. We additionally incorporate a suite of production-oriented optimizations, including robustness under noisy and silent conditions, real-time streaming inference, and hotword customization via retrieval-augmented generation (RAG). Experiments show that NIM4-ASR achieves state-of-the-art performance on multiple public benchmarks with merely 2.3B parameters, while substantially outperforming larger-scale competitors on internal benchmarks -- particularly in entity-intensive real-world scenarios. NIM4-ASR further supports million-scale hotword customization via RAG with sub-millisecond retrieval latency, enabling efficient adaptation to emerging entities and personalized user requirements.

URL PDF HTML ☆

赞 0 踩 0

2507.00875 2026-06-19 cs.CL cs.HC cs.MA 版本更新 80%

TransLaw: A Large-Scale Dataset and Multi-Agent Benchmark Simulating Professional Translation of Hong Kong Case Law

TransLaw：模拟香港判例法专业翻译的大规模数据集与多智能体基准

Xi Xuan, Chunyu Kit

发表机构 * City University of Hong Kong, Hong Kong SAR, China（香港城市大学）

专题命中领域大模型：多智能体框架用于法律翻译

AI总结针对香港判例法英译中资源匮乏、法律术语和格式要求严格的问题，构建了首个大规模句对齐平行语料库HKCFA Judgment 97-22，并提出多智能体框架TransLaw，通过分解翻译任务、集成法律词汇库和检索增强生成，显著提升翻译质量，但仍未达到人类专家的风格自然度。

Comments Accepted at ICML 2026 - AI for Law

详情

AI中文摘要

根据《基本法》第8-9条，香港法院判决书需从英文翻译成繁体中文，但由于平行资源短缺以及对法律术语、引用格式和司法风格的严格要求，这一任务仍受到限制。我们引入了HKCFA Judgment 97-22，这是首个用于香港判例法的大规模句对齐平行语料库，包含344份专业翻译的判决书（11,099个句对；210万词元），涵盖1997年至2022年。基于这一资源，我们提出了TransLaw，一个多智能体框架，将翻译分解为词级表达、句级翻译和多维审查，集成了专门的香港法律词汇数据库、检索增强生成和迭代反馈，并包括涵盖语义对齐、术语、引用和风格的四维专家审查。通过对13个开源和商业大语言模型进行基准测试，我们证明TransLaw在所有评估模型上均显著优于单智能体基线，并在3次迭代内收敛。由10名持证法律翻译人员使用我们提出的Legal ACS指标进行的人工评估证实了法律语义准确性的提升，同时表明TransLaw在风格自然度上仍落后于人类专家。数据集和基准代码可在以下网址获取：https://xxx。

英文摘要

Translating Hong Kong Court Judgments from English to Traditional Chinese is mandated by Articles 8-9 of the Basic Law, yet remains constrained by a shortage of parallel resources and rigorous demands on legal terminology, citation format, and judicial style. We introduce HKCFA Judgment 97-22, the first large-scale sentence-aligned parallel corpus for HK case law, comprising 344 professionally translated judgments (11,099 sentence pairs; 2.1M tokens) spanning 1997-2022. Building on this resource, we propose TransLaw, a multi-agent framework that decomposes translation into word-level expression, sentence-level translation, and multidimensional review, integrating a specialized Hong Kong legal glossary database, Retrieval-Augmented Generation, and iterative feedback, with four-dimensional expert review covering semantic alignment, terminology, citation, and style. Benchmarking 13 open-source and commercial LLMs, we demonstrate that TransLaw significantly outperforms single-agent baselines across all evaluated models, with convergence within 3 iterations. Human evaluation by 10 certified legal translators using our proposed Legal ACS metric confirms gains in legal-semantic accuracy, while showing that TransLaw still trails human experts in stylistic naturalness. The dataset and benchmark code are available at https://github.com/xuanxixi/TransLaw.

URL PDF HTML ☆

赞 0 踩 0

2509.03391 2026-06-19 cs.DL cs.CY 版本更新 80%

More Parameters Than Populations: A Systematic Literature Review of Large Language Models within Survey Research

参数多于总体：调查研究中的大语言模型系统文献综述

Trent D. Buskirk, Florian Keusch, Leah von der Heyde, Adam Eck

专题命中领域大模型：系统综述LLM在调查研究中的应用，涵盖三个阶段。

AI总结通过系统文献综述，评估大语言模型在调查研究三个阶段（数据收集前、中、后）的应用，讨论其潜力与陷阱，并展望调查研究对LLM发展的贡献。

Comments This working paper is outdated as of June 2026 - please refer to the full version with substantive changes here: https://doi.org/10.31235/osf.io/eubj4_v1 This work was presented at NLPOR 2025 (non-archival): https://openreview.net/forum?id=0Hxhwa56Yg

详情

AI中文摘要

[工作论文]调查研究长期以来一直是人力驱动的领域，但也接纳了多种技术来收集、处理和分析各种行为、政治和社会结果。与此同时，大语言模型（LLM）带来了新的技术挑战和前提条件，以充分利用其潜力。在本文中，我们报告了一项基于多个大规模数据库关键词搜索和引文网络的系统文献综述的进展，评估LLM目前在调查研究过程中的应用情况。我们根据调查研究过程综合并组织我们的发现，包括LLM在三个广泛阶段的使用示例：数据收集前、数据收集和数据收集后。我们基于现有文献中的示例，讨论了LLM潜在用例的选定示例及其陷阱。考虑到调查研究在数据质量方面拥有丰富的经验和历史，我们讨论了一些机会，并描述了调查研究为LLM的持续发展和改进做出贡献的未来展望。

英文摘要

[Working Paper] Survey research has a long-standing history of being a human-powered field, but one that embraces various technologies for the collection, processing, and analysis of various behavioral, political, and social outcomes of interest, among others. At the same time, Large Language Models (LLMs) bring new technological challenges and prerequisites in order to fully harness their potential. In this paper, we report work-in-progress on a systematic literature review based on keyword searches from multiple large-scale databases as well as citation networks that assesses how LLMs are currently being applied within the survey research process. We synthesize and organize our findings according to the survey research process to include examples of LLM usage across three broad phases: pre-data collection, data collection, and post-data collection. We discuss selected examples of potential use cases for LLMs as well as its pitfalls based on examples from existing literature. Considering survey research has rich experience and history regarding data quality, we discuss some opportunities and describe future outlooks for survey research to contribute to the continued development and refinement of LLMs.

URL PDF HTML ☆

赞 0 踩 0

2512.18859 2026-06-19 cs.CL 版本更新 75%

Toward Human-Centered AI-Assisted Terminology Work

迈向以人为中心的AI辅助术语工作

Antonio San Martin

发表机构 * Universite du Quebec à Trois-Rivieres（魁北克大学三河分校）

专题命中领域大模型：讨论生成式AI在术语工作中的应用，属于领域大模型

AI总结本文提出以人为中心的人工智能框架，在利用生成式AI自动化术语工作的同时，通过增强术语学家能力、保持人类控制权来确保术语数据的准确性和可靠性。

Comments Accepted for publication in the journal Terminology

详情

AI中文摘要

生成式AI可能通过创造自动化新机会来改变术语工作。同时，它引发了对术语学家和术语资源未来的担忧，因为效率压力可能鼓励过度自动化，认为人类专业知识可被AI取代。然而，由于错误、幻觉和各种形式的偏见，大型语言模型在术语目的上仍然不可靠，使得术语学家在确保术语数据的准确性和可靠性方面不可或缺。本文认为，以人为中心的AI（强调AI的主要目标应是促进人类福祉的方法）提供了一个框架，可以在最大化生成式AI收益的同时减轻其风险。它主张高水平的自动化和有意义的人类控制是兼容且可取的，AI应增强术语学家的能力，同时保留他们的自主权和决策权。通过三个相互关联的维度——增强的术语学家、伦理AI和以人为中心的设计——审视了AI辅助术语工作的影响。特别是，本文探讨了AI整合如何重塑术语学家的角色，影响专业价值观和工作条件，要求管理AI产生的偏见，并呼吁围绕术语学家的需求设计AI工具。本文得出结论，以人为中心的方向是必要的，以确保AI加强而非削弱术语工作在支持专业交流以及跨语言和跨文化准确传播知识中的关键作用。

英文摘要

Generative AI is likely to transform terminology work by creating new opportunities for automation. At the same time, it raises concerns about the future of terminologists and terminological resources, as efficiency pressures may encourage excessive automation based on the perception that human expertise can be replaced by AI. However, large language models remain unreliable for terminological purposes due to errors, hallucinations, and various forms of bias, making terminologists indispensable for ensuring the accuracy and reliability of terminological data. This paper argues that human-centered AI, an approach that emphasizes that AI's primary goal should be to contribute to human well-being, provides a framework for maximizing the benefits of generative AI while mitigating its risks. It contends that high levels of automation and meaningful human control are compatible and desirable, and that AI should enhance terminologists' capabilities while preserving their agency and decision-making authority. The implications of AI-assisted terminology work are examined through three interrelated dimensions: the augmented terminologist, ethical AI, and human-centered design. In particular, the paper examines how AI integration reshapes the role of the terminologist, affects professional values and working conditions, requires the management of AI-generated bias, and calls for the design of AI tools around the terminologist's needs. The paper concludes that a human-centered orientation is necessary to ensure that AI strengthens, rather than undermines, the essential role of terminology work in supporting specialized communication and the accurate transmission of knowledge across languages and cultures.

URL PDF HTML ☆

赞 0 踩 0

2604.23938 2026-06-19 cs.CL 版本更新 70%

TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment

TSAssistant: 一种人在回路中的自动化靶点安全性评估智能体框架

Xiaochen Zheng, Zhiwen Jiang, David Tokar, Yexiang Cheng, Alvaro Serra, Melanie Guerard, Klas Hatje, Tatyana Doktorova

发表机构 * Computational Sciences Center of Excellence（计算科学卓越中心）

专题命中领域大模型：利用LLM进行生物医学文献检索与综合

AI总结提出TSAssistant多智能体框架，通过分层指令架构和交互式优化循环，将靶点安全性评估报告生成分解为专业子任务，实现高可重复性和证据溯源。

Comments Updated with quantitative and expert evaluations

详情

AI中文摘要

靶点安全性评估（TSA）需要系统整合遗传、转录组、靶点同源性、药理学和临床数据，以评估治疗靶点的潜在安全性风险。该过程劳动密集且依赖专家，在可扩展性和可重复性方面面临挑战。我们提出TSAssistant，一种人在回路中的多智能体框架，将TSA报告生成分解为专门子智能体的工作流：研究子智能体各自基于并引用单个TSA领域，合成子智能体整合跨领域发现。子智能体通过标准化工具接口从精选生物医学来源检索和综合证据，生成可单独引用、基于证据的章节，其行为由分层指令架构塑造，该架构将协调逻辑与领域专业知识和用户意图分离。为补充这些软约束，程序化执行钩子和持久记忆存储在整个工作流中强制执行硬约束，而交互式优化循环允许专家在完全保留跨迭代对话上下文的情况下审查和修订各个章节。我们不是进行单一的整体比较，而是将报告质量分解为可重复性、证据基础、任务级准确性和专家监督下的可控性，发现高可重复性和证据基础、与人类参考高度一致以及专家驱动的净正面改进。

英文摘要

Target Safety Assessment (TSA) requires systematic integration of genetic, transcriptomic, target homology, pharmacological, and clinical data to evaluate potential safety liabilities of therapeutic targets. This process is labor-intensive and expert-dependent, posing challenges in scalability and reproducibility. We present TSAssistant, a human-in-the-loop multi-agent framework that decomposes TSA report generation into a workflow of specialized subagents: Research Subagents that each ground and cite a single TSA domain, and Synthesis Subagents that integrate findings across domains. Subagents retrieve and synthesize evidence from curated biomedical sources through standardized tool interfaces and produce individually citable, evidence-grounded sections, with behavior shaped by a hierarchical instruction architecture that separates coordination logic from domain expertise and user intent. To complement these soft constraints, programmatic execution hooks and persistent memory stores enforce hard constraints across the workflow, while an interactive refinement loop allows experts to review and revise individual sections with full conversational context preserved across iterations. Rather than a single holistic comparison, we decompose report quality into reproducibility, evidential grounding, task-level accuracy, and controllability under expert oversight, finding high reproducibility and grounding, substantial agreement with the human reference, and net-positive expert-driven refinement.

URL PDF HTML ☆

赞 0 踩 0

2402.14035 2026-06-19 cs.LG cs.AI 版本更新 70%

Wisdom of Committee: Diverse Distillation from Large Foundation Models and Domain Experts

委员会智慧：来自大型基础模型和领域专家的多样化蒸馏

Zichang Liu, Qingyun Liu, Yuening Li, Liang Liu, Anshumali Shrivastava, Shuchao Bi, Lichan Hong, Ed H. Chi, Zhe Zhao

发表机构 * Rice University（Rice大学）； Google DeepMind（谷歌DeepMind）； Google Inc（谷歌公司）； University of California, Davis（加州大学戴维斯分校）

专题命中领域大模型：蒸馏基础模型到紧凑领域模型，涉及推荐和视觉

AI总结针对基础模型向紧凑领域模型蒸馏时能力、架构和模态差异大的问题，提出DiverseDistill框架，通过可学习的问答机制和对齐异构教师输出，在推荐和视觉任务上恢复73-114%的性能差距。

Comments Accepted at the 1st Workshop on Resource-Efficient Learning and Knowledge Discovery (RelKD), KDD 2026

Journal ref Proceedings of the RelKD Workshop at KDD 2026

详情

AI中文摘要

从基础模型向紧凑领域模型进行知识蒸馏因能力、架构和模态的巨大差异而具有挑战性。例如，在我们的实验中，从7600万参数的语言模型蒸馏到200万参数的推荐模型仅能弥补未蒸馏学生与教师之间不到40%的性能差距。我们表明，引入与基础模型共享学生架构特征的领域专家作为多样化教师委员会，能显著改善迁移效果。然而，标准的多教师方法未能利用这种多样性：简单组合异构教师可能使性能低于单教师蒸馏。为此，我们提出DiverseDistill，一种交互式蒸馏框架，采用可学习的问答机制生成教师条件查询，并将异构教师输出对齐到学生的表示空间。与需要基于梯度的协同优化或修改教师架构的方法不同，DiverseDistill在冻结教师的情况下仅通过其中间层的前向推理运行：无需参数更新、无需协同训练、无需架构修改。动态教师重要性机制通过过滤每个样本中低相关性的教师（例如，在推荐任务中减少约30%的前向传播且无质量损失）进一步降低训练成本，而整个蒸馏模块在训练后被丢弃，推理时零开销。在推荐（38倍压缩）和视觉（3.6倍压缩）任务上的评估表明，DiverseDistill恢复了73-114%的师生性能差距，持续优于所有单教师和多教师基线方法。

英文摘要

Knowledge distillation from foundation models to compact domain models is challenging due to substantial gaps in capacity, architecture, and modality. For example, in our experiments, distilling from a 76M-parameter language model to a 2M-parameter recommender closes less than 40% of the performance gap between the undistilled student and the teacher. We show that introducing domain-specific experts -- which share the student's architectural characteristics -- alongside the foundation model as a diverse teacher committee significantly improves transfer. However, standard multi-teacher methods fail to exploit this diversity: naively combining heterogeneous teachers can degrade performance below single-teacher distillation. To address this, we propose DiverseDistill, an interactive distillation framework that employs a learnable Question-Answer mechanism to generate teacher-conditioned queries and align heterogeneous teacher outputs into the student's representation space. Unlike methods requiring gradient-based co-optimization or architectural modification of teachers, DiverseDistill operates with frozen teachers using only forward-pass inference through their intermediate layers: no parameter updates, no co-training, and no architectural surgery. A dynamic teacher importance mechanism further reduces training cost by filtering low-relevance teachers per sample (e.g., ~30% fewer forward passes with no quality loss for recommendation tasks), while the entire Distillation Module is discarded after training, adding zero inference overhead. Evaluations on recommendation (38x compression) and vision (3.6x compression) tasks demonstrate that DiverseDistill recovers 73-114% of the teacher-student performance gap, consistently outperforming all single- and multi-teacher baselines.

URL PDF HTML ☆

赞 0 踩 0

1. 后训练 4 篇

A Survey of On-Policy Distillation for Large Language Models

Reinforcement-aware Knowledge Distillation for LLM Reasoning

AAPA: Adversarially Anchored Preference Alignment for Post-Training of Large Language Models

Model soups need only one ingredient

2. 指令微调 4 篇

A Critical Look at Targeted Instruction Selection: Disentangling What Matters (and What Doesn't)

DeFrame: Debiasing Large Language Models Against Framing Effects

MixSD: Mixed Contextual Self-Distillation for Knowledge Injection

Target-Side Paraphrase Augmentation for Sign Language Translation with Large Language Models

3. 预训练 2 篇

BLISS: A Lightweight Bilevel Influence Scoring Method for Data Selection in Language Model Pretraining

LoRDO: Distributed Low-Rank Optimization with Infrequent Communication

4. 其他LLM 13 篇

Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models

S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation

Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech

Improving Alignment Between Human and Machine Codes: An Empirical Assessment of Prompt Engineering for Construct Identification in Psychology

Modeling U.S. Attitudes Toward China via an Event-Steered Multi-Agent Simulator

Too long; didn't solve

Teaching Students to Question the Machine: An AI Literacy Intervention Improves Students' Regulation of LLM Use in a Science Task

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs

Beyond Grading Accuracy: Exploring Alignment of TAs and LLMs

Simulation of Language Evolution under Regulated Social Media Platforms: A Synergistic Approach of Large Language Models and Genetic Algorithms

Approximate Next Policy Sampling: Replacing Conservative Target Policy Updates in Deep RL

How to sketch a learning algorithm

Weighted Bayesian Conformal Prediction

5. 领域大模型 7 篇

Analyzing Error Propagation in Korean Spoken QA with ASR-LLM Cascades

NIM4-ASR: Towards Efficient, Robust, and Customizable Real-Time LLM-Based ASR

TransLaw: A Large-Scale Dataset and Multi-Agent Benchmark Simulating Professional Translation of Hong Kong Case Law

More Parameters Than Populations: A Systematic Literature Review of Large Language Models within Survey Research

Toward Human-Centered AI-Assisted Terminology Work

TSAssistant: A Human-in-the-Loop Agentic Framework for Automated Target Safety Assessment

Wisdom of Committee: Diverse Distillation from Large Foundation Models and Domain Experts