arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.17815 2026-06-17 cs.CR cs.CL New submission

Beyond Native Success: Auditing Deployment-Interface Exposure of CLIP Backdoors

Kunlan Xiang, Haomiao Yang, Wenbo Jiang

Affiliations: University of Electronic Science and Technology of China

AI summary: Proposes the DIFE framework to audit how CLIP backdoors are exposed across deployment interfaces, finds that native attack success does not characterize checkpoint-wide risk, and introduces BadTextTower to fill the gap of a missing text-encoder backdoor.

Abstract

Contrastive Language-Image Pre-training (CLIP) models are widely reused across downstream interfaces, including feature extraction, retrieval, reranking, and selection. Existing CLIP backdoors, however, usually validate attacks on a small attack-native task, leaving unclear whether the same poisoned checkpoint remains exposed, weakens, or becomes not applicable when reused through other interfaces. We introduce DIFE, a Deployment-Interface Footprint Evaluation framework that audits backdoored CLIP checkpoints across deployment interfaces. DIFE makes various evaluations comparable by specifying each interface's component readout, trigger channel, target event, reference condition, and metric. DIFE also introduces effective-footprint diagnosis to identify the reusable CLIP component or component combination that carries exposure and explains where risk transfers. Auditing reproduced CLIP backdoors with DIFE reveals a structured landscape: native success is not a checkpoint-level risk certificate, exposure follows component footprints, text-side poisoning does not yield textual-encoder control, and some coupled attacks remain mechanism-bound. This audit reveals an important gap in existing CLIP backdoors: a textual encoder that itself becomes a reusable carrier of adversarial behavior. We therefore introduce BadTextTower to fill this gap. BadTextTower produces strong text-conditioned retrieval, reranking, and selection exposure while leaving visual-only reuse nearly clean.

2606.17799 2026-06-17 cs.SE cs.AI cs.CL New submission

Position: Coding Benchmarks Are Misaligned with Agentic Software Engineering

Maria I. Gorinova, Macey Baker, Amy Heineike, Maksim Shaposhnikov, Rob Willoughby, Dru Knox

Affiliations: Tessl

AI summary: Argues that current coding benchmarks have three problems in the agent era: they conflate the model with the system harness, grading against a single reference solution penalises valid alternatives, and the lack of component-level signal makes iteration difficult. The paper calls for redesigning benchmarks to align with agentic software engineering.

Abstract

Coding agents have become a major mode of software engineering, but the benchmarks we use to compare them were designed in a pre-agent era: they collapse model, harness, and environment into a single end-to-end score, typically computed against one reference solution, with no component-level signal for iteration. We argue that current coding benchmarks are misaligned with agentic software engineering. A coding agent in practice is not a model: it is a system harness -- a composite of models, harnesses, contexts, environments, and feedback signals, any one of which can move the benchmark score by margins comparable to those between adjacent model generations. We discuss three symptoms: (i) benchmark scores conflate the model with the rest of the harness; (ii) grading against a single reference solution penalises equally valid alternatives; and (iii) the absence of signal at the level of individual harness components makes the end-to-end system score difficult to iterate on.

2606.17786 2026-06-17 cs.HC cs.CL New submission

Toward Accessible Psychotherapy Training Using AI-Driven Interactive Patient Avatars

Pascal Riachi, Sofie Kamber, Stella Brogna, Andrew Gloster, Rafael Wampfler

Affiliations: ETH Zurich; University of Lucerne

AI summary: Presents a system for training ACT psychotherapists through dialogue with an embodied virtual patient. Large language models simulate patient behavior, and turn-by-turn feedback is given against ACT fidelity criteria. Expert evaluation confirmed high realism and training value.

Journal ref: 2026 IEEE 14th International Conference on Healthcare Informatics (ICHI), Minneapolis, MN, June 1-3, 2026, pp. 990-995

Abstract

Training psychotherapists in evidence-based interventions such as Acceptance and Commitment Therapy (ACT) requires repeated practice with meaningful feedback, yet opportunities for safe, standardized training are limited by ethical, logistical, and resource constraints. We introduce a system designed to support ACT-oriented psychotherapy training through spoken dialogue with an embodied virtual patient. The system uses large language models to simulate patient behavior conditioned on profiles derived from real therapy sessions and configurable clinical scenarios, while a separate automated evaluator provides turn-by-turn feedback on therapist responses based on established ACT fidelity criteria. Rather than aiming to replace supervision, the system is intended to support deliberate practice by enabling experimentation, reflection, and immediate feedback in low-risk settings. Expert evaluation with practicing psychologists confirmed high realism in patient behavior and demonstrated that immediate turn-by-turn ACT feedback increased therapists' awareness of intervention choices and enabled effective experimentation with alternative responses. Quantitative evaluation across 49 therapy transcripts identified GPT-4o-mini as the optimal feedback model, achieving the lowest mean absolute error (MAE = 6.12) in replicating human supervisor ACT fidelity ratings with statistically significant agreement. This work demonstrates the potential of fidelity-aware simulated patients as a scalable complement to psychotherapy training.

2606.17781 2026-06-17 cs.AR cs.AI New submission

MIVE: A Minimalist Integer Vector Engine for Softmax, LayerNorm and RMSNorm Acceleration

Kosmas Alexandridis, Giorgos Dimitrakopoulos

Affiliations: Integrated Circuits Lab, Electrical and Computer Engineering, Democritus University of Thrace (DUTH), Greece

AI summary: Proposes MIVE, a programmable minimalist integer vector engine that executes Softmax, LayerNorm, and RMSNorm in a unified datapath, maximizing hardware sharing and improving area and hardware efficiency.

Abstract

The rapid growth of Large Language Models (LLMs) has intensified the need for specialized hardware accelerators that can satisfy stringent inference latency and power constraints. Although matrix multiplications dominate the overall computational workload, non-linear vector normalization operations, such as LayerNorm, RMSNorm, and Softmax, can become critical hardware bottlenecks. Existing accelerators typically implement these functions using dedicated hardware blocks, leading to duplicated resources and inefficient silicon utilization. To address this limitation, we propose a Minimalist Integer Vector Engine (MIVE), a programmable architecture capable of executing all three operations within a unified datapath. By exploiting common computational patterns across LayerNorm, RMSNorm, and Softmax, the proposed vector engine maximizes hardware sharing while reducing implementation overhead. Physical ASIC implementation results show that MIVE provides comprehensive multi-function support while achieving higher area and hardware efficiency than most state-of-the-art standalone accelerators.
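
The common computational pattern mentioned in the abstract can be illustrated in software: each of the three operations is a vector reduction followed by an element-wise scale. The sketch below is a floating-point Python illustration only. It is not MIVE's integer datapath, the helper names are hypothetical, and the learnable gain and bias of LayerNorm and RMSNorm are omitted.

```python
import math

def reduce_sum(values):
    """Shared reduction step used by all three operations."""
    return sum(values)

def softmax(x):
    # Subtract the maximum for numerical stability, exponentiate,
    # then scale every element by the reciprocal of the sum.
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = reduce_sum(e)
    return [v / s for v in e]

def rmsnorm(x, eps=1e-6):
    # Scale every element by the reciprocal root mean square.
    mean_sq = reduce_sum([v * v for v in x]) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]

def layernorm(x, eps=1e-6):
    # Center by the mean, then scale by the reciprocal standard deviation.
    mean = reduce_sum(x) / len(x)
    centered = [v - mean for v in x]
    var = reduce_sum([c * c for c in centered]) / len(x)
    scale = 1.0 / math.sqrt(var + eps)
    return [c * scale for c in centered]
```

All three functions call the same reduction and finish with a per-element multiply or divide, which is the kind of overlap a shared datapath could exploit.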

2606.17767 2026-06-17 cs.HC cs.AI New submission

Talking to Your Data: Exploring Embodied Conversation as an Interface for Personal Health Reflection

Nikola Kovacevic, Bastien Husler, Di Zhuang, Rafael Wampfler, Barbara Solenthaler

Affiliations: Department of Computer Science, ETH Zurich

AI summary: Proposes a new paradigm for interacting with wearable health data through an embodied conversational agent, using a dual-agent design (an Observer extracts statistical features, a Presenter communicates them as "spoken statistics"). A simulated-self user study (N=5) compares it with a traditional dashboard, evaluating perceived understanding, action specificity, and cognitive shift.

Journal ref: Joint Proceedings of the ACM Intelligent User Interfaces (IUI) Workshops 2026, Paphos, Cyprus, July 13-16, 2026

Abstract

Personal health data from wearables are typically presented through dashboards of charts and summary statistics, requiring users to actively interpret patterns and implications. We explore an alternative interaction paradigm: engaging with personal health data through an embodied conversational agent that facilitates objective data reflection in dialogue with the user. We present a system that combines lightweight preprocessing of wearable data with a Unity-based embodied character. Internally, the system follows a dual-agent design in which an Observer agent extracts descriptive statistics and temporal trends, and a Presenter agent communicates these findings through "spoken statistics," intentionally refraining from clinical advice to isolate the impact of the interaction modality. We evaluate this approach through a simulated-self user study (N=5) using a within-subject design. Participants adopted health personas and goals derived from the LifeSnaps dataset to compare traditional dashboard exploration with embodied conversational reflection. Our evaluation focuses on perceived understanding, the specificity of generated actions, and the cognitive shift from passive viewing to active sensemaking. The paper contributes a functional prototype, a design pattern for objective health data narrative generation, and early empirical insights into how embodiment affects the interpretation of personal health metrics.

2606.17666 2026-06-17 cs.SE cs.AI New submission

FacProcessTwin: An LLM-Based System for Process Twin Development

Yash Pulse, Yong-Bin Kang, Abhik Banerjee, Prem Prakash Jayaraman

Affiliations: Swinburne University of Technology

AI summary: Presents FacProcessTwin, a system that uses a large language model to generate process models automatically from plant documentation and operators' natural-language input and to bind them to live data, with human-in-the-loop governance through an interactive process diagram. In a food-manufacturing case study it reaches a mean F1 of 95.2% and cuts development time to roughly a sixth of the manual time.

Abstract

Process twins provide real-time representations of entire production processes. By capturing how process steps interact, rather than monitoring a single machine in isolation as an asset-based digital twin does, they have the potential to drive efficiency gains across the whole process. However, developing a process twin is costly. It requires accurately modelling the entire production process: its process steps, the equipment and product-specific settings each step uses, and its process variations. The resulting model must then be bound to live operational data. We present FacProcessTwin, a system that leverages a large language model (LLM) to reduce this development time, building a process twin from a plant's process documentation and natural-language input from an operator. FacProcessTwin generates this complete process model and then automatically binds its process steps to live operational data. The generated model and its data bindings are rendered as an interactive process diagram through which manufacturing personnel can monitor and correct the system's autonomous decisions, such as resolving uncertainty at safety-critical binding steps. We evaluate FacProcessTwin through a real-world case study of an Australian food manufacturer, covering 16 production process flows that span chilled, frozen, and aseptic shelf-stable product categories and include process variations within the same product. The results show that FacProcessTwin generates these process models accurately (a mean F1 of 95.2% against ground truth) and builds each twin in roughly a sixth of the manual time. Its human-in-the-loop governance then keeps the safety-critical bindings correct: at ambiguous tags where a single-pass baseline silently mis-binds 75.0% of the time, FacProcessTwin defers to the operator and mis-binds none.

2606.17664 2026-06-17 cs.IR cs.AI New submission

Temporal Preference Optimization for Unsupervised Retrieval

HyunJin Kim, Jaejun Shim, Young Jin Kim, JinYeong Bak

Affiliations: Microsoft, Redmond, USA; Sungkyunkwan University, Suwon, South Korea

AI summary: Proposes TPOUR, which uses Temporal Retrieval Preference Optimization (TRPO) and interpolation in a learned time embedding so that unsupervised dense retrievers capture temporal relevance, outperforming both supervised and unsupervised baselines on temporal information retrieval tasks.

Comments: Accepted to ICML 2026

Abstract

Unsupervised dense retrievers offer scalability by learning semantic similarity from unlabeled documents via contrastive learning, but they struggle to capture temporal relevance, retrieving semantically related but temporally misaligned documents. This matters when a document collection spans multiple time periods (e.g., retrieving documents from 2018-2025 for "Who is the president in 2019?" introduces temporal ambiguity). Existing methods rely on supervised training with explicit timestamps, which is not always feasible. We propose TPOUR (Temporal Preference Optimization for Unsupervised Retriever), which uses our novel training method Temporal Retrieval Preference Optimization (TRPO). TRPO reinterprets preference learning in the temporal dimension, guiding the retriever to favor temporally aligned documents. TPOUR further generalizes to unseen time periods via interpolation in a learned time embedding, enabling continuous temporal alignment. In experiments on temporal information retrieval (T-IR), TPOUR outperforms both unsupervised and supervised baselines. Compared to Qwen-Embedding-8B, despite being about 72.7x smaller, TPOUR Contriever improves average nDCG@5 by +4.04 (+12.15%) on explicit and +4.98 (+15.21%) on implicit queries. We provide our code at https://github.com/agwaBom/TPOUR.
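
For readers unfamiliar with the nDCG@5 metric reported above, the following is a minimal sketch of the standard definition with linear gain. It is not taken from the TPOUR code base. The function names are hypothetical, and the ideal ranking is computed from the retrieved list alone, which is a simplifying assumption (a full evaluation would use every relevant document in the collection).

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """DCG divided by the DCG of the same relevances in descending order."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical example: 1 marks a temporally aligned document, 0 a misaligned one.
ranking = [0, 1, 0, 0, 1]
score = ndcg_at_k(ranking, 5)
```

A ranking that places the temporally aligned documents first scores 1.0, and pushing them lower in the list reduces the score logarithmically.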

2606.17646 2026-06-17 cs.HC cs.AI New submission

SketchXplain: Intuitive Visual Explanations of Image Classifiers with Sketches

Wencan Zhang, Mario Michelessa, Xuejun Zhao, Brian Y. Lim

Affiliations: National University of Singapore

AI summary: Proposes SketchXplain, which combines saliency maps, concept-bottleneck models, and sketch optimization to generate intuitive sketch-based visual explanations that improve the interpretability of image classifiers.

Comments: 14 pages, 6 figures, 4 tables. Submitted to TVCG

Abstract

Saliency map visualizations explain image-based AI predictions by pointing to regions, but these are often unintuitive and semantically unclear, leaving an interpretability gap. We argue that AI explanations should be intuitive -- coherent to user knowledge, yet simple and selective to accelerate interpretation. Inspired by artistic drawings, we propose SketchXplain to generate sketch-based visual explanations for intuitive image-based explainable AI (XAI). Combining techniques in saliency maps, concept-bottleneck models, and sketch optimization, SketchXplain integrates saliency to select coherent observation artifacts, concepts for knowledge coherence, cues to represent them, and abstraction for simplicity. On facial expression recognition, modeling and user studies showed that SketchXplain supported quicker interpretation with more aligned visualizations than saliency maps or simple drawings. Further evaluation on skin lesion diagnosis found that SketchXplain more coherently visualized disease symptoms, better supporting lay diagnosis. Thus, this work illustrates the value of sketches for intuitive, simple, coherent, and quick image-based XAI visualizations.

2606.17588 2026-06-17 cs.SE cs.AI New submission

Understanding LLMs in Title-Abstract Screening: From Disagreements to Recommendations

Mika Mäntylä, Patricia Matsubara, Katia Romero Felizardo, Miikka Kuutila, Marco Gerosa, Savio de Sousa Sampaio, Tayana Conte, Igor Steinmacher

Affiliations: University of Helsinki, Finland; UFMS, Brazil; UTFPR – Federal University of Technology - Paraná, Brazil; LUT University, Finland; Northern Arizona University, United States; UFAM, Brazil

AI summary: Qualitatively analyzes the causes of disagreement between LLMs and humans in title-abstract screening for systematic reviews and proposes recommendations such as validating semantic understanding, using multiple LLMs, and focusing on borderline cases.

Comments: 14 pages + references. Accepted for publication in the 52nd Euromicro Conference on Software Engineering and Advanced Applications (SEAA 2026)

Abstract

Several studies have examined the use of large language models (LLMs) for title-abstract screening in systematic reviews (SRs), reporting mixed accuracy. However, questions of reliability remain largely unaddressed. In this study, we go beyond quantitative LLM-human agreement metrics and qualitatively investigate how and why LLMs fail. We also propose actionable recommendations. We analyzed disagreements between LLMs and researchers across six software engineering SRs and over 1,000 primary study papers. For each SR, papers were screened independently by human experts and LLMs in zero-shot mode, resulting in Kappa values ranging from 0.52 to 0.77. Qualitative analysis suggests that human-LLM disagreement results from recurring, identifiable causes, such as boundary ambiguity in key terms, keyword overemphasization, and incorrect topic inference. Based on these findings, we propose recommendations such as validating semantic understanding before deployment, running multiple LLMs, and focusing validation efforts on borderline cases. Future studies are needed to validate the impact of our recommendations, and community efforts are needed to develop normative guidelines on LLM usage in SRs.
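
The agreement figures above can be made concrete with a small sketch. It assumes the reported Kappa is Cohen's kappa for two raters, which the abstract does not state, and the screening decisions below are invented for illustration only.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical screening decisions for six papers.
human = ["include", "exclude", "exclude", "include", "exclude", "exclude"]
llm = ["include", "exclude", "include", "include", "exclude", "exclude"]
print(round(cohens_kappa(human, llm), 3))  # prints 0.667
```

Here the raters agree on five of six papers (0.833 observed) against 0.5 expected by chance, giving a kappa of about 0.667, which falls inside the 0.52 to 0.77 range the study reports.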

2606.17581 2026-06-17 cs.PL cs.AI New submission

Visored: A Controlled-Natural-Language Prover for LLM-Generated Mathematics

Xiyu Zhai, Xinyi Chen, Yiping Wang, Runlong Zhou, Liao Zhang, Simon S. Du

Affiliations: University of Washington; University of Innsbruck

AI summary: Presents a dependent-type-based prover whose surface imitates mathematical natural language and whose rule-driven automation layer fills in routine steps, so that LLMs can use it effectively on the miniF2F benchmark without prover-specific training data, with accepted proofs re-emitted as checked Lean files.

Abstract

We present a dependent-type-based prover designed around the way LLMs (and humans) tend to write mathematics, complementing existing systems such as Lean and Rocq. Its core design choices are a surface that imitates mathematical natural language and a rule-driven automation layer that closes the routine steps a textbook would omit, so that an accepted proof can be re-emitted as a checked Lean file. Early experiments suggest that, even without any prover-specific training data, LLMs can learn to use it effectively on the miniF2F benchmark. Lean output excerpts: https://github.com/xiyuzhai-husky-lang/visored/

2606.17566 2026-06-17 cs.DC cs.LG New submission

AoiZora: Topology-Aware Auto-Parallel Optimization for Inference of Diffusion Transformers

Kaijian Wang, Yuanyuan Xu, Fanjiang Ye, Ye Cao, Jingwei Zuo, T. S. Eugene Ng, Yarong Mu, Yuke Wang

Affiliations: Rice University; Independent Researcher; Google

AI summary: To meet the low-latency requirements of diffusion transformer inference, proposes AoiZora, a compiler-mediated planner that optimizes auto-parallel strategies through topology-aware physical placement and achieves up to a 1.42x speedup on TPU sub-slices.

Abstract

Video diffusion has quickly grown into a key generative serving workload, yet producing each clip demands many denoising iterations over large spatio-temporal latents, which puts low-latency inference out of reach on a single device. A denoising step is therefore typically distributed across multiple accelerators, and TPU sub-slices have become an attractive and practical fabric for doing so. Current auto-parallel systems, however, search almost exclusively over logical device meshes and disregard how a chosen sharding is actually laid out on the physical TPU interconnect -- an oversight that leaves large, topology-dependent performance on the table. We address this gap with AoiZora, a compiler-mediated topology planner built for low-latency video diffusion inference on TPU sub-slices. Its guiding principle is to reconnect logical sharding with physical placement by drawing on different points in the compilation flow: AoiZora first eliminates weak sharding candidates from inexpensive pre-compilation IRs, then compiles only the ones that survive and orders their physical placements using compiled HLO together with a topology-aware communication model. The winning plan is realized along the ordinary compiler path, leaving model code, compiler lowering, collective kernels, and network routing entirely intact. On TPU v5e sub-slices, AoiZora reduces Wan 2.1 one-step denoising latency by as much as 1.42x relative to existing solutions.

2606.17555 2026-06-17 cs.CR cs.AI cs.CE cs.ET New submission

An AI Security Agent for Banking: Multi-Vector Fraud and AML Detection Across Retail and Corporate Accounts

Joseph Walusimbi, Joshua Benjamin Ssentongo

Affiliations: Engineering, Soroti University, Uganda

AI summary: Proposes a three-component architecture fusing an LSTM sequence model, statistical velocity/threshold monitoring, and a graph network, processing transaction and session streams in parallel. On a synthetic dataset it reaches an F1 of 0.787 on the transaction stream and 0.867 on the session stream, and it integrates a customer verification chatbot (96.6% identity verification accuracy) and an analyst case-summary assistant (99.3% action-recommendation F1).

Comments: 7 pages, 1 figure, 5 tables

Abstract

Banks simultaneously face signature-based fraud (card-not-present attacks, account takeover, ATM cloning) and behavioural financial crime (structuring, layering, mule networks, business email compromise) -- two threat families with fundamentally different detection requirements. Static rule engines that reliably catch brute-force and high-velocity events are structurally blind to business-email-compromise (BEC) payment redirection, session hijacking, and money-laundering layering, which are engineered to appear indistinguishable from legitimate activity at the individual transaction or session level. This paper presents an AI security agent for retail and corporate banking that addresses this gap through a three-component fusion architecture operating on two parallel event streams: a transaction stream (card fraud, ACH/wire fraud, AML categories) and a session stream (account takeover, session hijacking, SIM-swap, insider abuse). Each stream combines an LSTM sequence model capturing per-account behavioural history, a statistical velocity/threshold monitor, and a graph/network module capturing account-counterparty relationship patterns (fan-in, fan-out, pass-through ratio) for money-laundering detection. Experiments on a synthetic event log of 237,669 transactions and 113,508 sessions across 13 threat categories and 3,470 simulated accounts demonstrate overall F1 of 0.787 (transaction stream) and 0.867 (session stream) for the proposed model, versus 0.562/0.733 for a rule-based baseline and 0.655/0.713 for an LSTM-only baseline. The agent includes a customer-facing transaction-verification chatbot (96.6% identity verification accuracy, 86.8% mass-reset attack detection) and an analyst case-summary assistant (99.3% action-recommendation F1), with Critical-tier automated response latency under 0.43 ms at the 95th percentile.
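
The account-counterparty features named in the abstract (fan-in, fan-out, pass-through ratio) can be sketched as follows. The abstract does not define them, so the definitions here are assumptions: fan-in and fan-out count distinct counterparties, and the pass-through ratio is the smaller of total inflow and total outflow divided by the larger. The function name and the sample transfers are hypothetical.

```python
from collections import defaultdict

def account_graph_features(transfers):
    """Compute per-account features from (sender, receiver, amount) tuples."""
    senders = defaultdict(set)    # account -> distinct counterparties paying in
    receivers = defaultdict(set)  # account -> distinct counterparties paid out to
    inflow = defaultdict(float)
    outflow = defaultdict(float)
    for src, dst, amount in transfers:
        receivers[src].add(dst)
        outflow[src] += amount
        senders[dst].add(src)
        inflow[dst] += amount
    features = {}
    for acct in set(inflow) | set(outflow):
        total_in, total_out = inflow[acct], outflow[acct]
        larger = max(total_in, total_out)
        features[acct] = {
            "fan_in": len(senders[acct]),
            "fan_out": len(receivers[acct]),
            # Assumed definition: close to 1.0 when nearly all incoming
            # money is forwarded onward.
            "pass_through": min(total_in, total_out) / larger if larger > 0 else 0.0,
        }
    return features

# Hypothetical mule-like pattern: three payers, one forwarder, one collector.
transfers = [("a", "m", 100.0), ("b", "m", 100.0), ("c", "m", 100.0), ("m", "x", 290.0)]
feats = account_graph_features(transfers)
```

In this toy log, account `m` has a fan-in of 3, a fan-out of 1, and a pass-through ratio near 0.97, the sort of pattern a graph module could flag even though each individual transfer looks ordinary.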

2606.17529 2026-06-17 cs.CE cs.LG New submission

Domain-Validity-Gated Metamorphic Testing of Scientific ML Surrogates

Meng Li, Xiaohua Yang, Jie Liu, Shiyu Yan

Affiliations: School of Computing, University of South China; Hunan Engineering Research Center of Software Evaluation and Testing for Intellectual Equipment; CNNC Key Laboratory on High Trusted Computing

AI summary: To address the lack of ground-truth outputs for scientific machine-learning surrogates, proposes a domain-validity screening method that turns candidate metamorphic relations into executable test assets, and validates it on several surrogate models.

Abstract

Scientific machine-learning (SciML) surrogates approximate expensive simulations, but exact expected outputs for arbitrary inputs are unavailable (the oracle problem). Metamorphic testing checks relations across executions, yet a candidate relation is not automatically valid: its preconditions, output mapping, and the numerical floor of the scoring operator determine whether a violation is meaningful. We study how candidate metamorphic relations (MRs) can be screened for domain validity and turned into executable, oracle-free test assets for SciML surrogates. We propose (i) a domain-validity rubric that admits a candidate only when its tolerance dominates the operator's numerical floor and its preconditions hold; (ii) an MR-card executable-asset format recording source cases, transformations, metrics, tolerances, and typed relation-level verdicts; and (iii) a case-study protocol on MeshGraphNets cylinder-flow surrogates, with a claim ledger binding every result to a tracked artifact. On a MeshGraphNets checkpoint, node permutation holds to machine precision, mirror-y is a bounded out-of-distribution stress finding rather than an exact symmetry, and absolute conservation stays deferred while a reference-relative guard passes. The same readings hold across held-out trajectories, a checkpoint roster, three further architectures, and PhysicsNeMo. On a second CFD task (compressible airfoil) the predicate instead rejects incompressible continuity on physical grounds, showing it reasons about domain validity rather than running a fixed checklist. On a second PDE family, FNO Burgers and heat surrogates run full admit/reject/execute verdicts. The evidence spans two CFD tasks and a second PDE family, supporting a validity-aware bridge from candidate MRs to auditable SciML test assets that separates model-level violations from out-of-domain applications.
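
The gating idea in item (i) can be sketched for the node-permutation relation. This is an illustrative toy under stated assumptions, not the paper's rubric or MR-card format: the surrogate is a stand-in function, the verdict strings are hypothetical, and the permutation is a fixed cyclic rotation so the check is deterministic.

```python
def toy_surrogate(nodes):
    """Stand-in for a learned surrogate: each node's value minus the global mean."""
    mean = sum(nodes) / len(nodes)
    return [v - mean for v in nodes]

def check_permutation_mr(model, nodes, tolerance, numerical_floor):
    """Return 'rejected', 'pass', or 'violation' for the node-permutation relation."""
    # Validity gate: the precondition must hold (at least two nodes) and the
    # tolerance must dominate the numerical floor of the comparison.
    if len(nodes) < 2 or tolerance <= numerical_floor:
        return "rejected"
    # Cyclic rotation: follow-up position j holds source node perm[j].
    perm = list(range(1, len(nodes))) + [0]
    source_out = model(nodes)
    followup_out = model([nodes[i] for i in perm])
    # Permuting the inputs should permute the outputs in the same way.
    worst = max(abs(followup_out[j] - source_out[i]) for j, i in enumerate(perm))
    return "pass" if worst <= tolerance else "violation"

nodes = [1.0, 2.0, 3.5, -4.0]
verdict = check_permutation_mr(toy_surrogate, nodes, tolerance=1e-9, numerical_floor=1e-12)
```

A relation whose tolerance falls below the numerical floor is rejected before it runs, so a later failure can be read as a model-level violation rather than floating-point noise.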

2606.17514 2026-06-17 cs.SE cs.AI New submission

Unlocking LLM Code Correction with Iterative Feedback Loops

Le Zhang, Suresh Kothari

Affiliations: Iowa State University

AI summary: Studies LLMs' ability to correct code iteratively through execution feedback, proposes evaluation metrics, and analyzes how reasoning and non-reasoning models differ in exploiting feedback. Reasoning models substantially outperform non-reasoning models, and syntactic and runtime errors are easier to fix than logical errors.

Comments: 22 pages, 14th Computing Conference 2026

Abstract

Large Language Models have shown remarkable capabilities in code generation. However, most existing evaluations focus only on single-attempt accuracy and overlook the iterative refinement process that is central to real-world programming. This study presents a systematic investigation of LLMs' ability to rectify their own code through execution feedback. Using real-world programming problems across four models and two major programming languages, this study evaluates performance using an iterative refinement framework where LLMs receive compiler error messages and test-case feedback after each attempt. This study introduces metrics to evaluate code failures, analyze rectification patterns, and compare the effectiveness of reasoning and non-reasoning models, offering actionable insights into both the understanding and practical application of feedback loops in LLM-driven code generation systems. Results show that reasoning models consistently improve over iterations, substantially outperforming non-reasoning models in leveraging feedback, while syntactic and runtime errors are far more tractable than logical or algorithmic failures.
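
The refinement loop described above can be sketched in a few lines. This is a minimal illustration, not the study's framework: `stub_model` is a hypothetical stand-in for an LLM call, the single test case is invented, and `exec` is used only for brevity (untrusted model output should run in a sandbox).

```python
def run_tests(code):
    """Execute candidate code and check one test case; return (passed, feedback)."""
    namespace = {}
    try:
        exec(code, namespace)            # surfaces syntax errors
        result = namespace["add"](2, 3)  # surfaces runtime errors
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, "all tests passed"

def stub_model(problem, feedback):
    """Hypothetical stand-in for an LLM: fixes its bug once it sees feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"

def refine(generate, evaluate, problem, max_attempts=3):
    """Regenerate code with the latest feedback until tests pass or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback, history = None, []
    for attempt in range(1, max_attempts + 1):
        code = generate(problem, feedback)
        passed, feedback = evaluate(code)
        history.append((attempt, passed))
        if passed:
            break
    return code, history

final_code, history = refine(stub_model, run_tests, "add two numbers")
```

With the stub, the first attempt fails on the test case (a logical error), the feedback is passed back, and the second attempt succeeds, so `history` records one failure followed by one pass.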

2606.17467 2026-06-17 cs.CR cs.CL New submission

PARSE: Provenance-Aware Retrieval Sanitization for Professional Domain LLM Agents

PARSE: 面向专业领域LLM Agent的溯源感知检索净化

Aaditya Pai

发表机构 * Data Science Institute(数据科学研究所)

AI总结 针对真实企业文档中提示注入攻击难以防御的问题,提出PARSE方法,通过分类句子注入可能性、提取结构化事实并验证一致性,将攻击成功率从25.4%降至15.6%,同时保持86.9%的实用性。

Comments 7 pages, 3 figures, 2 tables. Under submission at EMNLP 2026 Industry Track

详情
AI中文摘要

在合成基准上评估的提示注入防御无法泛化到真实企业文档,这些文档更长、更密集,并将合法权威语言与事实内容交织在一起。我们通过一个包含五个专业领域(金融、法律、医学、科学、DevOps)122个任务的真实文档基准,使用实际的SEC文件、联邦公报规则、PubMed摘要、arXiv论文和GitHub事后分析报告,展示了这一差距。在合成基准上最强的防御方法——释义,在真实文档上未显示出统计显著的攻击成功率降低(p=0.500),同时实用性从91.8%降至82.8%。我们引入了PARSE(溯源感知检索净化),一种领域感知、事实保留的净化流程,它按注入可能性对每个句子进行分类,在重写前提取结构化事实,并通过一致性检查循环验证事实保留。一个指导性门控将59%的真实企业文档路由到轻量级路径,将计算成本集中在高风险文档上。PARSE实现了15.6%的攻击成功率——相比25.4%的基线降低了38%——在86.9%的实用性下,这是唯一既统计显著(p=0.014,具有充分统计功效)又保持接近基线实用性的条件。从业者应在领域匹配的真实文档上评估防御,而不是合成代理。

English abstract

Prompt injection defenses evaluated on synthetic benchmarks do not generalize to real enterprise documents, which are longer, denser, and interleave legitimate authority language with factual content. We demonstrate this gap with a real-document benchmark of 122 tasks across five professional domains (financial, legal, medical, scientific, DevOps) using actual SEC filings, Federal Register rules, PubMed abstracts, arXiv papers, and GitHub postmortems. Paraphrasing, the strongest defense on synthetic benchmarks, shows no statistically significant attack success rate reduction on real documents (p=0.500) while degrading utility from 91.8% to 82.8%. We introduce PARSE (Provenance-Aware Retrieval Sanitization), a domain-aware, fact-preserving sanitization pipeline that classifies each sentence by injection likelihood, extracts structured facts before rewriting, and verifies fact preservation via a consistency-checking loop. A directiveness gate routes 59% of real enterprise documents to a lightweight path, concentrating computational cost on high-risk documents. PARSE achieves 15.6% attack success rate -- a 38% reduction versus the 25.4% baseline -- at 86.9% utility, the only condition that is both statistically significant (p=0.014, adequately powered) and maintains near-baseline utility. Practitioners should evaluate defenses on domain-matched real documents, not synthetic proxies.
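
The "38% reduction" in the abstract above is a relative figure, which is easy to confuse with the absolute drop in percentage points. A minimal check, using only the two attack success rates quoted in the abstract (the function and variable names are illustrative, not from the paper):

```python
def relative_reduction(baseline: float, treated: float) -> float:
    """Fractional drop of `treated` relative to `baseline`."""
    return (baseline - treated) / baseline

baseline_asr = 25.4  # % attack success rate at baseline (from the abstract)
parse_asr = 15.6     # % attack success rate with PARSE (from the abstract)

absolute_drop = baseline_asr - parse_asr                     # percentage points
relative_drop = relative_reduction(baseline_asr, parse_asr)  # fraction of baseline

print(f"absolute drop: {absolute_drop:.1f} percentage points")
print(f"relative drop: {relative_drop:.1%}")
```

The relative drop works out to roughly 38.6%, consistent with the abstract's "38%", while the absolute drop is 9.8 percentage points.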

2606.17461 2026-06-17 cs.AR cs.AI cs.LG new submission

AUTOGATE: Automated Clock Gating via Toggling-Aware LLM-based RTL Rewriting

AUTOGATE:基于翻转感知的LLM驱动RTL重写的自动时钟门控

Yiting Wang, Chenhui Deng, Chia-Tung Ho, Yanqing Zhang, Zhuo Feng, Cunxi Yu, Ang Li, Gang Qu, Brucek Khailany

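Affiliations: University of Maryland, College Park; NVIDIA

AI summary: Proposes the AUTOGATE framework, whose ML-LLM co-design converts waveform toggling traces into compact representations that guide LLM-based RTL rewriting, enabling clock-gating optimization in hierarchical codebases and reducing dynamic power by 49.31% on average across the small-design suite.

Comments 9 pages, 6 figures, 7 tables

Details
AI Chinese abstract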

细粒度时钟门控(FGCG)是降低动态功耗最有效的技术之一,但当前的FGCG优化流程仍主要依赖手动操作。近期基于LLM的RTL优化方法受限于两个关键缺陷:(1)无法处理跨越数百万周期的长波形迹线,(2)难以在保持正确性的同时将优化扩展到大型层次化代码库。在本工作中,我们提出了AUTOGATE,这是首个面向工业级RTL功耗优化的智能体框架,支持在大型层次化代码库中进行工作负载感知的时钟门控优化。AUTOGATE引入了机器学习(ML)与LLM的协同设计,桥接了波形级分析与RTL重写。具体而言,我们设计了一种基于ML的聚类算法,将原始翻转迹线提炼为紧凑的结构化表示,以指导基于LLM的RTL重写。这使得无需LLM直接处理原始波形数据即可准确识别和应用时钟门控机会。为增强可扩展性,AUTOGATE采用层次化多智能体架构,将大型设计分解为可独立优化的模块,从而在深层设计层次中实现协调优化。我们在从小型RTL设计到大型工业级代码库的多样化设计集上评估了AUTOGATE。实验结果表明,与基线相比,AUTOGATE持续降低动态功耗。在小型设计套件上,AUTOGATE平均降低动态功耗49.31%。在工业级设计上,它在NVDLA和BlackParrot上分别实现了19.34%和7.96%的动态功耗降低,在高度优化的专有生产设计上最高降低6.86%。

English abstract

Fine-grain clock gating (FGCG) is among the most effective techniques for reducing dynamic power, yet current FGCG optimization flows remain largely manual. Recent LLM-based RTL optimization approaches remain limited by two key drawbacks: (1) the inability to process long waveform traces spanning millions of cycles, and (2) the difficulty of scaling optimization to large hierarchical codebases while preserving correctness. In this work, we present AUTOGATE, the first agentic framework for industry-grade RTL power optimization, enabling workload-aware clock-gating optimization across large hierarchical codebases. AUTOGATE introduces a Machine Learning (ML)-LLM co-design that bridges waveform-level analysis and RTL rewriting. Specifically, we design an ML-based clustering algorithm that distills raw toggling traces into compact, structured representations that guide LLM-based RTL rewriting. This enables accurate identification and application of clock-gating opportunities without requiring LLMs to directly process raw waveform data. To enhance scalability, AUTOGATE employs a hierarchical multi-agent architecture that decomposes large designs into independently optimizable modules, enabling coordinated optimization across deep design hierarchies. We evaluate AUTOGATE on a diverse set of designs ranging from small RTL designs to large industrial-grade codebases. Experimental results show that AUTOGATE consistently reduces dynamic power relative to baselines. Across the small-design suite, AUTOGATE reduces dynamic power by 49.31% on average. On industry-scale designs, it achieves 19.34% and 7.96% dynamic power reductions on NVDLA and BlackParrot, respectively, and up to 6.86% on highly optimized proprietary production designs.

2606.17441 2026-06-17 cs.HC cs.AI cs.CY new submission

Patients With Personality: Realistic Patient Simulation through Controlled Diversity and Selective Disclosure

具有个性的患者:通过受控多样性与选择性披露实现逼真的患者模拟

Moritz Schlager, Friederike Jungmann, Samuel Schmidgall, Philipp Raffler, Franziska Hartl, Eva Wende, Paula Roßmüller, Conrad Ketzer, Avinatan Hassidim, Dale R. Webster, Yossi Matias, Yun Liu, Daniel Rueckert, Mike Schaekermann, Paul Hager

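Affiliations: Technical University of Munich; Munich Center for Machine Learning; TUM University Hospital; Google DeepMind; Google Research; Imperial College London

AI summary: Proposes the PatientsWithPersonality framework, which uses HEXACO personality parametrization to control patients' conversational style, cooperativeness, and information disclosure, generating realistic and diverse virtual patients that approach recorded human actors in clinician evaluation.

Comments 22 pages, 11 figures

Details
AI Chinese abstract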

模拟逼真的患者交互是在没有耗时且昂贵的用户研究的情况下大规模测试LLMs临床应用的关键要求。然而,现有方法通常缺乏真实性和可控性,常常在未提示的情况下过度分享信息,并且未能捕捉患者行为的广泛变异性。在这里,我们引入了PatientsWithPersonality (PWP),一个患者模拟框架,通过在潜在患者状态上显式的人格参数化生成逼真且多样化的虚拟患者响应。基于HEXACO(一个用于量化和参数化人类行为特征的六维人格空间),我们的方法能够在统一框架内对对话风格、合作性和信息披露进行细粒度控制。在临床医生评估中,PWP被认为几乎与记录的人类演员一样逼真,并且明显优于先前的模拟器,同时被标记为“信息过多”的频率远低于前者。基于HEXACO轴的条件化产生的人格特质可由临床医生和自动评估者恢复,其行为足迹比最接近的基线宽得多,并防止过度分享。总之,我们的框架通过逼真且可操控的患者模拟器,为更准确且信息丰富的LLM基准测试铺平了道路。

English abstract

Simulating realistic patient interactions is a key requirement for testing clinical applications of LLMs at scale without time-consuming and expensive user studies. However, existing approaches often lack realism and controllability, frequently overshare information unprompted, and fail to capture the wide variability of patient behavior. Here, we introduce PatientsWithPersonality (PWP), a patient simulation framework that generates realistic yet diverse virtual patient responses through explicit personality parametrization over a latent patient state. Grounded in HEXACO, a six-dimensional personality space used to quantify and parameterize human behavioral traits, our approach enables fine-grained control over conversational style, cooperativeness, and information disclosure within a unified framework. In a clinician evaluation, PWP is judged nearly as realistic as recorded human actors and clearly ahead of prior simulators, while being flagged as "too informative" far less often. Conditioning on HEXACO axes yields personas whose configured traits are recoverable by both clinicians and an autorater, span a substantially wider behavioral footprint than the closest baseline, and prevent oversharing. Altogether, our framework paves the way for more accurate and informative LLM benchmarking through a realistic and steerable patient simulator.

2606.17432 2026-06-17 cs.GR cs.CV new submission

Edit3DGS: Unified Framework for Dynamic Head Editing via 2D Instruction-Guided Diffusion and 3D Gaussian Splatting

Edit3DGS:通过2D指令引导扩散与3D高斯泼溅的动态头部编辑统一框架

Duy-Dat Tran, Trung-Nghia Le

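Affiliations: University of Science, VNU-HCM, Ho Chi Minh, Vietnam; Vietnam National University, Ho Chi Minh, Vietnam

AI summary: Proposes Edit3DGS, a unified framework that combines 2D instruction-guided diffusion with 3D Gaussian splatting for controllable editing of dynamic 3D heads, supporting operations such as expression transformation and attribute modification while preserving identity and motion dynamics.

Comments SOICT 2025

Details
AI Chinese abstract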

我们提出Edit3DGS,一个用于动态3D头部编辑的统一框架,它将2D指令引导扩散与3D高斯泼溅相结合。与先前分别处理基于帧的编辑或静态3D重建的方法不同,我们的方法将图像域中的语义可控性与逼真、时间一致的3D表示结合起来。给定输入视频,可编辑的面部区域被掩码并使用文本条件扩散模型进行修改,以支持细粒度操作,如表情变换、属性修改和外观细化。然后,编辑后的帧通过3D高斯泼溅聚合,生成一个连贯、高保真的化身,同时保留身份和运动动态。为了强制一致性,Edit3DGS采用了多视图批量编辑和轻量级修复策略,以恢复跨时间步丢失的表情。实验结果表明,我们的框架能够实现可控、无伪影的头部编辑,并具有平滑的时间过渡,在虚拟化身、沉浸式通信、电影制作和交互媒体中具有实际应用。

English abstract

We present Edit3DGS, a unified framework for dynamic 3D head editing that integrates 2D instruction-guided diffusion with 3D Gaussian splatting. Unlike prior approaches that separately address frame-based edits or static 3D reconstruction, our method couples semantic controllability in the image domain with photorealistic, temporally consistent 3D representations. Given an input video, editable facial regions are masked and modified using a text-conditioned diffusion model to support fine-grained operations such as expression transformation, attribute modification, and appearance refinement. The edited frames are then aggregated through 3D Gaussian splatting to produce a coherent, high-fidelity avatar that preserves both identity and motion dynamics. To enforce consistency, Edit3DGS incorporates multi-view batch editing and lightweight inpainting strategies that recover lost expressions across timesteps. Experimental results demonstrate that our framework enables controllable, artifact-free head editing with smooth temporal transitions, offering practical applications in virtual avatars, immersive communication, film production, and interactive media.

2606.17398 2026-06-17 cs.CR cs.AI cs.SE new submission

SoK: AI-Augmented Binary Reversing

SoK: AI增强的二进制逆向工程

Yujeong Kwon, Yiyue Zhang, Shakhzod Yuldoshkhujaev, Kexin Pei, Dokyung Song, Hyungjoon Koo

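Affiliations: Sungkyunkwan University; The University of Chicago; Yonsei University

AI summary: Systematizes the field of AI-augmented binary reversing, proposes a unified taxonomy covering conventional and AI-based approaches, clarifies the emerging roles of LLMs and agentic AI, and identifies technical challenges and evaluation gaps.

Comments 20 pages, 7 tables, 3 figures

Details
AI Chinese abstract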

二进制逆向工程是软件理解、漏洞发现、恶意软件调查和固件审计的基础。然而,由于编译过程中语义信息的不可逆丢失,它仍然具有固有的挑战性。机器学习、大型语言模型(LLM)和智能体AI系统的最新进展加速了AI增强二进制逆向工程的采用。然而,由此产生的工作在逆向领域、工件表示、学习方法和评估实践方面变得越来越分散。本文首次对AI增强二进制逆向工程的知识进行了全面的系统化。我们分析了自2015年以来发表的144篇研究论文,并根据推理任务将其组织成22个二进制逆向领域。我们进一步引入了一个统一的分类法,涵盖传统和AI增强的逆向流程。我们的分类法连接了传统分析技术、二进制衍生工件、表示策略、学习范式和下游推理任务,同时阐明了LLM和智能体AI系统的新兴角色。通过建立通用词汇和结构化框架,我们提供了该领域过去十年演变的整体视图。我们的研究揭示了看似不同方法背后的共同结构,突出了持续存在的技术挑战和评估缺口,并确定了未来研究的有希望的机会。总的来说,这些见解阐明了该领域的当前状态,并为下一代可靠且可扩展的AI增强二进制逆向系统奠定了基础。

English abstract

Binary reversing is fundamental to software understanding, vulnerability discovery, malware investigation, and firmware auditing. However, it remains inherently challenging due to the irreversible loss of semantic information during compilation. Recent advances in machine learning, large language models (LLMs), and agentic AI systems have accelerated the adoption of AI-augmented binary reversing. Yet, the resulting body of work has become increasingly fragmented across reversing domains, artifact representations, learning approaches, and evaluation practices. This paper presents the first comprehensive systematization of knowledge on AI-augmented binary reversing. We analyze 144 research papers published since 2015, and organize them into 22 binary reversing domains according to the inference tasks. We further introduce a unified taxonomy spanning conventional and AI-augmented reversing pipelines. Our taxonomy connects traditional analysis techniques, binary-derived artifacts, representation strategies, learning paradigms, and downstream inference tasks, while clarifying the emerging roles of LLMs and agentic AI systems. By establishing a common vocabulary and structured framework, we provide a holistic view of the field's evolution over the past decade. Our study reveals common structures underlying seemingly disparate approaches, highlights persistent technical challenges and evaluation gaps, and identifies promising opportunities for future research. Collectively, these insights clarify the current state of the field and provide a foundation for the next generation of reliable and scalable AI-augmented binary reversing systems.

2606.17286 2026-06-17 cs.CY cs.AI new submission

From Democracies to Autocracies: How AI Systems Enable Authoritarianism by Design

从民主到专制:AI系统如何通过设计实现威权主义

Jeba Sania, Marta Ziosi, Fazl Barez

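Affiliations: Harvard Kennedy School; University of Oxford

AI summary: By comparing the lifecycles of six AI systems deployed in regimes ranging from the US to China, the paper identifies key enabling features (centralized administrative data, regulatory gaps, weak user compliance, and the encoding of protected group traits) and shows how AI systems enable authoritarianism across regime types.

Details
AI Chinese abstract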

AI驱动的威权主义并不仅限于专制国家。本文通过调查和映射从美国到中国不同政治体制下部署的六种AI系统的生命周期,提供了更高的透明度。基于广泛来源(学术出版物、调查研究报告、第三方评估、媒体采访、政府采购公告),我们进行了系统性的定性比较,以识别在其各自政治背景下促成威权主义的关键技术和操作特征。我们发现,促成特征包括:集中和挪用行政数据用于执法和政治惩罚、未能阻止滥用的监管漏洞、使人类监督机制失效的弱用户合规性,以及识别弱势群体成员的受保护群体特征编码。我们发现这些特征存在于专制和民主政体部署的系统中,尽管配置不同。我们还发现,集中式和碎片化的AI系统都可以通过利用治理漏洞来助长威权主义:由行政当局(特别是安全和军事机构)指导的集中式系统通常不受正式监督机制的约束,而碎片化系统则在利益相关者之间分散责任,为根深蒂固铺平道路。这些发现表明,AI驱动的威权主义是分布式的,源于开发者、管理者和用户的设计和操作选择。最后,我们为开发者和政策制定者提供了缓解这些风险的建议。

English abstract

AI-enabled authoritarianism is not confined to autocracies. In this paper, we provide greater transparency by investigating and mapping the lifecycles of six AI systems deployed in different political regimes, ranging from the US to China. By drawing on an extensive range of sources (academic publications, investigative research reports, third-party evaluations, media interviews, government procurement notices), we conduct a systematic, qualitative comparison across systems to identify the critical technical and operational features that enable authoritarianism within their respective political contexts. We find that enabling features include the centralization and co-optation of administrative data for law enforcement and political punishment, regulatory gaps that fail to deter misuse, weak user compliance that nullifies human oversight mechanisms, and the encoding of protected group traits that identify members of vulnerable populations. We find that these features are present across systems deployed in autocratic and democratic regimes, albeit in varying configurations. We also find that both centralized and fragmented AI systems can contribute to authoritarianism by exploiting governance gaps: centralized systems directed by executive authorities, particularly within security and military institutions, are often not subjected to formal oversight mechanisms, while fragmented systems diffuse accountability between stakeholders, paving the way for entrenchment. These findings reveal that AI-enabled authoritarianism is distributed, resulting from design and operational choices made by developers, administrators, and users alike. We conclude with recommendations for developers and policymakers to mitigate these risks.

2606.17283 2026-06-17 cs.CR cs.AI cs.LG new submission

ARVO: Atlas of Reproducible Vulnerabilities for Open-Source Software

ARVO:开源软件可复现漏洞图谱

Xiang Mei, Jordi Del Castillo, Pulkit Singh Singaria, Haoran Xi, Abdelouahab Benchikh, Tiffany Bao, Ruoyu Wang, Yan Shoshitaishvili, Adam Doupé, Hammond Pearce, Brendan Dolan-Gavitt

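Affiliations: National Vulnerability Database; Google

AI summary: Proposes a method for building reproducible vulnerability datasets at scale and, based on OSS-Fuzz, constructs the ARVO dataset of over 6,100 real-world vulnerabilities, achieving an 81% reproduction rate and 89.4% patch-location accuracy and addressing the trade-off among reproducibility, quantity, and diversity.

Comments Accepted at IEEE European Symposium on Security and Privacy (EuroS&P) 2026

Details
AI Chinese abstract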

长期以来,在漏洞数据集中实现可复现性、数量和多样性被视为固有的三方权衡,改进一个维度往往以牺牲其他维度为代价。在实践中,可复现性是最常被忽视的维度。这限制了从历史错误数据集中自动提取的内容,并降低了它们对下游安全研究的实用性。在这项工作中,我们提出了一种方法,通过识别大规模错误复现的关键障碍并用通用解决方案加以解决,从而生成一个新的安全数据集,确保大规模多样化漏洞的可复现性。使用这种方法,我们为最大的开源软件漏洞数据集(OSS-Fuzz)引入了完全可复现性,并构建了ARVO数据集(开源软件可复现漏洞图谱)。ARVO是一个大规模数据集,包含311个项目中的6100多个真实世界漏洞。专注于可复现性,ARVO与现有数据集的不同之处在于,它以可以跨版本一致重建、触发和分析的形式提供每个漏洞。可复现性还使得能够自动识别每个漏洞的相应补丁,并支持代码更改后直接与漏洞交互,这是现有大规模数据集所不具备的能力。在我们的评估中,ARVO成功复现了81%的漏洞,并在定位的补丁上达到了89.4%的准确率。我们还讨论了ARVO对上游实践和下游安全研究的影响。

English abstract

Achieving reproducibility, quantity, and diversity in vulnerability datasets has long been viewed as an inherent three-way trade-off, where improving one dimension often comes at the cost of the others. In practice, reproducibility has been the dimension most often neglected. This has limited what can be automatically extracted from historical bug datasets, and has reduced their utility for downstream security research. In this work, we propose a method to produce a new security dataset which ensures reproducibility for diverse vulnerabilities at scale by identifying the key obstacles to large-scale bug reproduction and addressing them with general solutions. Using this method, we introduce full reproducibility to the largest open source software vulnerability dataset (OSS-Fuzz) and construct the ARVO dataset (an Atlas of Reproducible Vulnerabilities in Open-source software). ARVO is a large-scale dataset consisting of over 6,100 real-world vulnerabilities across 311 projects. Focusing on reproducibility, ARVO differs from existing datasets by providing each vulnerability in a form that can be consistently rebuilt, triggered, and analyzed across versions. Reproducibility also enables automatic identification of the corresponding patch for each vulnerability and supports direct interaction with vulnerabilities after code changes, capabilities that existing large-scale datasets do not provide. In our evaluation, ARVO successfully reproduces 81% of vulnerabilities and achieves 89.4% accuracy on the located patches. We also discuss ARVO's influence on both upstream practices and downstream security research.

2606.17249 2026-06-17 cs.AR cs.LG cs.NE eess.SP new submission

From Compression to Deployment: Real-Time and Energy-Efficient FastGRNN on Ultra-Constrained Microcontrollers

从压缩到部署:超受限微控制器上的实时节能FastGRNN

Emre Can Kizilates

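Affiliations: Electronics Engineer, Independent Researcher, Izmir, Turkey

AI summary: For ultra-constrained microcontrollers, presents an end-to-end open-source FastGRNN compression and deployment pipeline combining low-rank factorization, sparsification, and quantization. It achieves real-time 50 Hz inference on 8-bit and 16-bit MCUs with only 566 bytes of weights and a macro F1 of 0.918 (seed 0), and contributes cross-platform deterministic inference, a characterization of recurrent warm-up latency, a look-up-table recipe for multiplier-less targets, and a hardware energy analysis.

Comments 14 pages, 8 figures. Code: https://github.com/emre1998/fastgrnn-har

Details
AI Chinese abstract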

现代机器学习的主导轨迹一直是规模化:更大的模型、更大的加速器、更大的内存预算。然而,多年的全球半导体供应限制以及始终在线推理日益增长的能源和碳成本暴露了这一轨迹的脆弱性,并推动了相反的方向:重构AI和ML算法,使其适应已经在可穿戴设备、传感器和边缘设备中大规模生产的小型、无处不在的微控制器。我们提出了FastGRNN(一种紧凑的门控循环单元)的端到端开源复现,部署在两个裸机目标上:8位Arduino(ATmega328P)和16位MSP430(无硬件乘法器;16 KB闪存;512 B SRAM)。我们的压缩流水线结合了低秩权重分解、迭代硬阈值稀疏性和基于张量的Q15训练后量化,并带有显式激活校准。部署的模型占用566字节权重,在HAPT测试集上达到宏F1=0.918(种子0;五个种子的Q15平均值为0.853±0.107)。它在3399个测试窗口上与PyTorch参考实现达到100%预测一致(MCU种子0;五个种子上99.91-100% C等效)。两个平台都支持实时50Hz流式推理(Arduino上每个样本9.21 ms;MSP430上13 ms),其中256条目sigmoid/tanh查找表在无乘法器的MSP430上实现了30.5倍加速。四个贡献扩展了原始FastGRNN论文:(i)跨平台位等效确定性推理;(ii)循环预热延迟的表征(中位数74个样本,1.48秒;最坏情况125个样本,2.50秒,超过100个测试窗口);(iii)针对无乘法器嵌入式目标的可部署查找表方案;(iv)硬件能耗表征,显示17.7 mW主动推理功率,<0.09 mW空闲功率,以及使用LUT实现96.7%的能耗降低。

English abstract

The dominant trajectory of modern machine learning has been to scale up: larger models, larger accelerators, larger memory budgets. Yet a multi-year global semiconductor supply constraint and the growing energy and carbon cost of always-online inference expose the fragility of this trajectory and motivate the opposite direction: refactoring AI and ML algorithms to fit the small, ubiquitous microcontrollers already in mass production in wearables, sensors, and edge appliances. We present an end-to-end open-source reproduction of FastGRNN, a compact gated recurrent cell, deployed on two bare-metal targets: the 8-bit Arduino (ATmega328P) and the 16-bit MSP430 (no hardware multiplier; 16 KB Flash; 512 B SRAM). Our compression pipeline combines low-rank weight factorization, iterative hard-thresholding sparsity, and per-tensor Q15 post-training quantization with explicit activation calibration. The deployed model occupies 566 bytes of weights and achieves macro F1 = 0.918 (seed 0; five-seed Q15 mean 0.853 ± 0.107) on the HAPT test set. It matches a PyTorch reference at 100% prediction agreement across 3,399 test windows (MCU seed 0; 99.91-100% C-equivalent across five seeds). Both platforms sustain real-time 50 Hz streaming inference (9.21 ms per sample on Arduino; 13 ms on MSP430), where a 256-entry sigmoid/tanh look-up table delivers a 30.5x speedup on the multiplier-less MSP430. Four contributions extend the original FastGRNN paper: (i) cross-platform bit-equivalent deterministic inference; (ii) characterization of recurrent warm-up latency (median 74 samples, 1.48 s; worst-case 125 samples, 2.50 s over 100 test windows); (iii) a deployable look-up-table recipe for multiplier-less embedded targets; and (iv) hardware energy characterization showing 17.7 mW active inference power, <0.09 mW idle power, and 96.7% energy reduction with the LUT.
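
To make the Q15 and look-up-table ideas in the abstract above concrete, here is a minimal sketch in Python rather than the paper's C firmware. The table range of [-8, 8), the midpoint sampling, and all names are assumptions chosen for illustration; the abstract only states that the table has 256 entries. The last lines check the timing figures quoted in the abstract against the 50 Hz sample rate.

```python
import math

Q15_ONE = 1 << 15  # 32768: Q15 stores x as round(x * 2**15) in a 16-bit integer

def to_q15(x: float) -> int:
    """Quantize a float in [-1, 1) to a saturated Q15 integer."""
    return max(-Q15_ONE, min(Q15_ONE - 1, round(x * Q15_ONE)))

def from_q15(q: int) -> float:
    """Convert a Q15 integer back to a float."""
    return q / Q15_ONE

# Assumed table layout: 256 buckets over [-8, 8), sampled at bucket midpoints.
LUT_SIZE, X_MIN, X_MAX = 256, -8.0, 8.0
STEP = (X_MAX - X_MIN) / LUT_SIZE
SIGMOID_LUT = [
    to_q15(1.0 / (1.0 + math.exp(-(X_MIN + (i + 0.5) * STEP))))
    for i in range(LUT_SIZE)
]

def sigmoid_lut(x: float) -> float:
    """Approximate sigmoid with one table read instead of exp() and a division."""
    i = int((x - X_MIN) / STEP)
    i = max(0, min(LUT_SIZE - 1, i))  # clamp out-of-range inputs to the table edges
    return from_q15(SIGMOID_LUT[i])

# Worst-case approximation error over a grid of inputs in [-10, 10].
max_err = max(
    abs(sigmoid_lut(x / 100) - 1.0 / (1.0 + math.exp(-x / 100)))
    for x in range(-1000, 1001)
)

# Timing figures from the abstract, expressed against the 50 Hz sample rate.
SAMPLE_RATE_HZ = 50
budget_ms = 1000 / SAMPLE_RATE_HZ        # time available per sample
warmup_median_s = 74 / SAMPLE_RATE_HZ    # median warm-up of 74 samples
warmup_worst_s = 125 / SAMPLE_RATE_HZ    # worst-case warm-up of 125 samples
```

At 50 Hz each sample has a 20 ms budget, so the reported 9.21 ms and 13 ms per-sample latencies both fit, and the warm-up counts convert to the 1.48 s and 2.50 s quoted above.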

2606.17216 2026-06-17 cs.MA cs.GT cs.RO new submission

Intermittent Strategic Cooperation of Two Selfish Agents on Graphs

两个自私智能体在图上的间歇性战略合作

Itay Shedlezki, Noa Agmon

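Affiliations: Bar-Ilan University

AI summary: Studies strategic cooperation between two selfish agents under temporal and spatial constraints, characterizes the structure of pure Nash equilibria in the IC2PP model, proves equilibrium existence, and presents a polynomial-time enumeration algorithm.

Details
AI Chinese abstract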

我们通过间歇性战略合作双智能体路径规划(IC2PP)问题研究两个自私智能体在空间和时间约束下的战略合作,这是一个图上的最短路径博弈,其中智能体向各自目标导航,同时可选地在特定节点合作以减少自身旅行时间。尽管这种合作对双方都有严格利益,但战略上脆弱:智能体可能在其路径的任何点偏离。建模为双人博弈,我们刻画了IC2PP中纯纳什均衡(PNE)联合策略的结构,并表明稳定合作必须遵循高度受限的形式。我们进一步证明每个IC2PP实例中至少存在一个PNE,并提出一个多项式时间算法来枚举所有相关PNE。当出现多个均衡时,我们研究基于议价理论选择概念的协调机制,并根据个体旅行时间和社会福利经验性地比较均衡结果。

English abstract

We study strategic space- and time-constrained cooperation between two self-interested agents through the Intermittent Strategic Cooperation-Based Two-Agent Path Planning (IC2PP) problem, a shortest-path game on graphs in which agents navigate toward individual targets while optionally cooperating at specific nodes to reduce their own travel times. Although such cooperation can strictly benefit both agents, it is strategically fragile: agents may deviate at any point along their paths. Modeling the problem as a two-player game, we characterize the structure of Pure Nash Equilibrium (PNE) joint strategies in IC2PP, and show that stable cooperation must follow a highly constrained form. We further prove that at least one PNE exists in every instance of IC2PP, and present a polynomial-time algorithm for enumerating all relevant PNEs. When multiple equilibria arise, we study coordination mechanisms based on bargaining-theoretic selection concepts and empirically compare equilibrium outcomes in terms of individual travel times and social welfare.

2606.17203 2026-06-17 cs.SE cs.AI new submission

Trust-Aware Multi-Agent Traceability: Confidence-Calibrated Knowledge Graphs for Consistent Software Artifact Management

信任感知的多智能体可追溯性:用于一致软件工件管理的置信度校准知识图谱

Mohamed Essam, Kareem Wael, Azza Hassan, Ahmed Haitham, Mahmoud Soliman, Samer Saber, Ibrahim Habib

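Affiliations: CairoMotive, Cairo, Egypt

AI summary: Proposes a trust-aware coordination framework that uses a shared knowledge graph and calibrated confidence scores, together with a two-stage traceability link prediction pipeline combining embedding-based retrieval and LLM-based multi-criteria analysis, to address error propagation in multi-agent systems.

Details
AI Chinese abstract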

多智能体AI系统越来越多地用于自动化软件工程任务,包括需求分析、架构设计、测试生成和可追溯性链接。当这些智能体作为顺序管道在共享软件工件上运行时,上游智能体做出的错误和低置信度决策会传播到下游阶段,产生孤立的需求、矛盾的链接和合规性差距,这在安全关键领域构成重大风险。我们提出一个信任感知协调框架,其中共享知识图谱既作为集中式语义记忆,又作为协调表面,智能体通过该表面使用校准的置信度分数评估并基于彼此的贡献进行构建。我们的方法引入了一个两阶段可追溯性链接预测管道,结合了基于嵌入的检索与基于LLM的多准则分析,一种可追溯性种子机制,能够比较推导时间和验证时间的置信度,以及一个一致性协议,通过置信度阈值门控、置信度发散检测和冲突解决来管理管道交互。我们在一个汽车软件工程案例研究上进行了评估,测量了链接预测校准、协议有效性、阈值敏感性和可追溯性种子的影响。消融研究证实,置信度校准对于有效的管道协调至关重要。

English abstract

Multi-agent AI systems are increasingly used to automate software engineering tasks including requirements analysis, architecture design, test generation, and traceability linking. When these agents operate as a sequential pipeline over shared software artifacts, errors and low-confidence decisions made by upstream agents propagate to downstream stages, producing orphaned requirements, contradictory links, and compliance gaps that pose significant risks in safety-critical domains. We propose a trust-aware coordination framework where a shared knowledge graph serves as both centralized semantic memory and a coordination surface through which agents assess and build upon each other's contributions using calibrated confidence scores. Our approach introduces a two-stage traceability link prediction pipeline combining embedding-based retrieval with LLM-based multi-criteria analysis, a traceability seeding mechanism that enables comparison between derivation-time and validation-time confidence, and a consistency protocol governing pipeline interactions through confidence threshold gating, confidence divergence detection, and conflict resolution. We evaluate on an automotive software engineering case study measuring link prediction calibration, protocol effectiveness, threshold sensitivity, and the impact of traceability seeding. Ablation studies confirm that confidence calibration is essential for effective pipeline coordination.
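
The consistency protocol in the abstract above names three mechanisms: confidence threshold gating, confidence divergence detection, and conflict resolution. The sketch below shows one way the first two could fit together. The `TraceLink` type, the threshold values, and the three outcome labels are all hypothetical; the abstract does not give the paper's actual rules.

```python
from dataclasses import dataclass

@dataclass
class TraceLink:
    source: str             # e.g. a requirement ID (hypothetical naming)
    target: str             # e.g. a test or design artifact ID
    derivation_conf: float  # confidence recorded when the link was derived
    validation_conf: float  # confidence recorded when the link was re-checked

ACCEPT_THRESHOLD = 0.7  # assumed gate: below this, a human reviews the link
DIVERGENCE_LIMIT = 0.3  # assumed limit on derivation/validation disagreement

def triage(link: TraceLink) -> str:
    """Classify a link as 'accept', 'review', or 'conflict'."""
    # Divergence detection: the two confidence estimates disagree too much.
    if abs(link.derivation_conf - link.validation_conf) > DIVERGENCE_LIMIT:
        return "conflict"
    # Threshold gating: both estimates agree, but one of them is too low.
    if min(link.derivation_conf, link.validation_conf) < ACCEPT_THRESHOLD:
        return "review"
    return "accept"

links = [
    TraceLink("REQ-1", "TEST-7", 0.90, 0.85),
    TraceLink("REQ-2", "TEST-3", 0.90, 0.40),
    TraceLink("REQ-3", "TEST-9", 0.60, 0.55),
]
decisions = [triage(link) for link in links]
```

Checking divergence before the threshold means a link whose two scores disagree sharply is never silently accepted or merely queued for review; it is routed to conflict resolution first. Such a gate is only meaningful if the scores are calibrated, which is the point the abstract's ablation makes.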

2606.17197 2026-06-17 cs.SE cs.AI new submission

Cluster-Aware Dual-Level Test Specification Generation for Large-Scale Automotive Software Requirements

面向大规模汽车软件需求的集群感知双层测试规格生成

Hazem Ayman, Menna Sedik, Kareem Mostafa, Mahmoud Soliman, Samer Saber, Ibrahim Habib

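Affiliations: CairoMotive, Cairo, Egypt

AI summary: Proposes a "Cluster-then-Summarize" pipeline (embedding, dimensionality reduction, clustering, multi-level summarization, and dual-level test generation) that addresses lost dependencies and context-window limits when LLMs process large requirement sets, improving integration test coverage and scaling efficiently.

Details
AI Chinese abstract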

生成满足Automotive SPICE SWE.6要求的测试规格随着项目扩展到数千个需求而变得越来越具有挑战性和耗时。由于手动过程通常需要数周的工程努力,自动化成为关键需求。然而,标准的大语言模型方法在大规模下难以应对:单独处理需求会丢失重要的需求间依赖关系,而一次性输入整个语料库则超出上下文窗口限制,导致集成覆盖不完整和测试用例冗余。本文提出一种新颖的“先聚类后总结”流水线,通过三个阶段解决这些限制。需求使用句子变换器嵌入,并通过UMAP降维和HDBSCAN密度聚类进行分组。该分组利用自动最小聚类大小选择,该选择由结合归一化轮廓系数和Calinski-Harabasz分数的质量准则驱动。然后,多级map-reduce摘要算法将每个聚类提炼为简洁、符合领域的描述,同时保留定量阈值和安全完整性等级。该流水线利用派生的聚类拓扑在两级生成测试规格:单个需求验证和验证跨需求特征行为的聚类级集成测试。邻近聚类上下文机制在每个LLM调用期间提供有限的跨特征感知,检索增强生成将所有输出基于ISO 26262和ASPICE标准。在不同规模的汽车需求数据集上的评估表明,与基线方法相比,集群感知方法提高了集成测试覆盖率并保持了摘要保真度,同时高效扩展到数千个需求。

English abstract

Generating test specifications that satisfy Automotive SPICE SWE.6 requirements becomes increasingly challenging and time-consuming as projects scale to thousands of requirements. Because the manual process often consumes weeks of engineering effort, automation becomes a critical necessity. However, standard Large Language Model (LLM) approaches struggle at scale: processing requirements individually discards vital inter-requirement dependencies, while feeding entire corpora at once exceeds context-window limits, leading to incomplete integration coverage and redundant test cases. This paper presents a novel "Cluster-then-Summarize" pipeline that addresses these limitations through three stages. Requirements are embedded using sentence transformers and grouped using UMAP dimensionality reduction followed by HDBSCAN density-based clustering. This grouping uses an automatic minimum cluster size selection driven by a quality criterion combining normalized Silhouette and Calinski-Harabasz scores. A multi-level map-reduce summarization algorithm then distills each cluster into concise, domain-conformant descriptions while preserving quantitative thresholds and safety integrity levels. The pipeline exploits the derived cluster topology to generate test specifications at two levels: individual requirement verification and cluster-level integration tests that verify cross-requirement feature behavior. A nearby-cluster context mechanism provides bounded cross-feature awareness during each LLM call, and Retrieval-Augmented Generation grounds all outputs in ISO 26262 and ASPICE standards. Evaluation on automotive requirement datasets of varying scale demonstrates that the cluster-aware approach improves integration test coverage and maintains summarization fidelity compared to baseline methods while scaling efficiently to thousands of requirements.
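
The abstract above selects the minimum cluster size with a criterion that combines normalized Silhouette and Calinski-Harabasz scores. The two need normalizing because Silhouette lies in [-1, 1] while Calinski-Harabasz is unbounded. The sketch below assumes min-max normalization across candidates and equal weights, neither of which is confirmed by the abstract, and the toy scores are made-up placeholders, not results from the paper.

```python
def min_max(values):
    """Rescale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def select_min_cluster_size(candidates, w_sil=0.5, w_ch=0.5):
    """Pick the candidate with the best weighted sum of normalized scores.

    candidates maps min_cluster_size -> (silhouette, calinski_harabasz).
    """
    sizes = sorted(candidates)
    sil = min_max([candidates[s][0] for s in sizes])
    ch = min_max([candidates[s][1] for s in sizes])
    combined = {s: w_sil * a + w_ch * b for s, a, b in zip(sizes, sil, ch)}
    return max(combined, key=combined.get), combined

# Placeholder scores for four candidate sizes (illustrative only).
toy_scores = {
    5: (0.21, 310.0),
    10: (0.34, 420.0),
    20: (0.30, 505.0),
    40: (0.18, 390.0),
}
best_size, combined = select_min_cluster_size(toy_scores)
print(best_size, round(combined[best_size], 3))
```

In a real run the two scores per candidate would come from clustering the embedded requirements once per candidate size; here the winner is size 20, which is not the best on Silhouette alone but is strongest once both criteria are weighed together.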

2606.17123 2026-06-17 cs.CR cs.AI new submission

LineageMark: Multi-user White-box Watermarking for Contribution Tracing in Model Derivation Chains

LineageMark:模型衍生链中用于贡献追踪的多用户白盒水印

Bingxue Zhang, Xiaofeng Xu, Feida Zhu

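Affiliations: University of Shanghai for Science and Technology; Singapore Management University

AI summary: Proposes the LineageMark framework, which embeds watermarks in model parameters via a projection-based method, supports contribution tracing across multi-user, multi-stage derivation chains, and is robust to perturbations such as re-watermarking and fine-tuning.

Comments 14 pages, 2 figures

Details
AI Chinese abstract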

在开放的大语言模型生态系统中,模型经常跨多个领域和应用进行适配,形成多阶段衍生链。因此,追踪和验证历史贡献对于模型溯源和知识产权保护至关重要。然而,现有的水印方法主要针对单用户一次性嵌入设计,在重复模型衍生和增量更新下常常失效。为解决此问题,我们提出LineageMark,一种用于模型衍生链的多用户白盒水印框架。该框架使用基于投影的方法在模型参数中编码水印。首先选择稳定载体以减少对模型变化的敏感性,然后将每个水印位表示为这些载体上的投影统计量。额外的水印插入仅在投影空间中引入有界扰动,并使用边界约束来保持信号完整性。我们在多阶段模型衍生链中评估了LineageMark的有效性。实验结果表明,LineageMark在多阶段衍生中保留了贡献者水印,并支持增量多用户水印插入。此外,它对重水印、微调、量化和剪枝等扰动表现出鲁棒性。

English abstract

In open large language model (LLM) ecosystems, models are frequently adapted across multiple domains and applications, forming multi-stage derivation chains. Consequently, tracking and verifying historical contributions is essential for model provenance and intellectual property protection. However, existing watermarking methods are mainly designed for single-user, one-time embedding and often fail under repeated model derivation and incremental updates. To address this problem, we propose LineageMark, a multi-user white-box watermarking framework for model derivation chains. The framework encodes watermarks in model parameters using a projection-based approach. Stable carriers are first selected to reduce sensitivity to model changes; each watermark bit is then represented as a projection statistic over these carriers. Additional watermark insertions introduce only bounded perturbations in the projection space, and margin constraints are used to maintain signal integrity. We evaluate the effectiveness of LineageMark in multi-stage model derivation chains. Experimental results show that LineageMark preserves contributor watermarks across multi-stage derivation and supports incremental multi-user watermark insertion. Furthermore, it exhibits robustness against perturbations such as re-watermarking, fine-tuning, quantization, and pruning.
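
As a rough illustration of the projection idea in the abstract above, the sketch below encodes each bit as the sign of a dot product between a set of carrier weights and a secret key of ±1 values, and nudges the weights only as far as needed to clear a margin. This is a simplified stand-in, not LineageMark itself: carriers are fixed disjoint index blocks rather than selected for stability, and the key scheme, margin, and noise level are assumptions.

```python
import random

def project(weights, carrier_idx, key):
    """Projection statistic: dot product of carrier weights with a +/-1 key."""
    return sum(weights[i] * k for i, k in zip(carrier_idx, key))

def embed_bit(weights, carrier_idx, key, bit, margin=0.5):
    """Shift carrier weights so the projection has the bit's sign and clears the margin."""
    target_sign = 1.0 if bit else -1.0
    shortfall = margin - target_sign * project(weights, carrier_idx, key)
    if shortfall > 0:  # leave weights untouched if the margin is already met
        delta = target_sign * shortfall / len(carrier_idx)
        for i, k in zip(carrier_idx, key):
            weights[i] += delta * k  # key entries are +/-1, so each adds |delta|

def extract_bit(weights, carrier_idx, key):
    """Read a bit back as the sign of the projection."""
    return 1 if project(weights, carrier_idx, key) > 0 else 0

rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(256)]  # stand-in for model parameters
message = [1, 0, 1, 1]
# Assumed carrier choice: one disjoint block of 64 indices per bit.
carriers = [list(range(b * 64, (b + 1) * 64)) for b in range(len(message))]
keys = [[rng.choice((-1.0, 1.0)) for _ in range(64)] for _ in message]

for bit, idx, key in zip(message, carriers, keys):
    embed_bit(weights, idx, key, bit)

# Small Gaussian noise as a crude proxy for later fine-tuning.
noisy = [w + rng.gauss(0.0, 0.001) for w in weights]
recovered = [extract_bit(noisy, idx, key) for idx, key in zip(carriers, keys)]
```

The margin is what buys robustness: a perturbation flips a bit only if it moves the projection by more than the margin, which loosely mirrors the abstract's margin constraints. Using disjoint carriers per bit sidesteps interference between insertions, whereas the paper bounds that interference in projection space.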

2606.17122 2026-06-17 cs.CR cs.AI cs.LG new submission

TrustErase: Auditable Instant Machine Unlearning with Passport-Embedded Representations

TrustErase:基于护照嵌入表示的可审计即时机器遗忘

Rutger Hendrix, Leonardo G. Russo, Concetto Spampinato, Matteo Pennisi, Giovanni Bellitto

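Affiliations: University of Catania

AI summary: Proposes the TrustErase framework, which uses passport-embedded representations for data-free, verifiable, instant unlearning. Passports in parameter-efficient adaptation layers act as keys, so specific classes or datasets are removed by deactivation alone, without retraining or fine-tuning.

Details
AI Chinese abstract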

隐私合规AI的需求放大了对机器遗忘的需求;然而,现有的基于重训练或蒸馏的方法仍然不可验证且计算成本高。我们引入了TrustErase,一个可验证、无数据的遗忘框架,利用护照嵌入表示实现即时、模块化和可审计的遗忘。通过将护照视为参数高效适配层中的加密密钥,TrustErase能够通过简单的停用操作移除特定类别或数据集,无需重训练、微调或访问原始数据。基于奇异值分解将护照隐藏在模型权重中,确保遗忘操作保持透明且可证明合规。在MNIST、CIFAR10和CIFAR100上的评估表明,TrustErase在严格无数据模式下运行,匹配或超越了DELETE、L2UL和Boundary Shrink等最先进基准。最终,TrustErase为可信、负责且可即时遗忘的AI系统建立了新范式。

English abstract

The demand for privacy-compliant AI has amplified the need for machine unlearning; yet, existing retraining or distillation-based methods remain unverifiable and computationally costly. We introduce TrustErase, a verifiable, data-free unlearning framework leveraging passport-embedded representations for instant, modular, and auditable forgetting. By treating passports as cryptographic keys within parameter-efficient adaptation layers, TrustErase enables the removal of specific classes or datasets through simple deactivation, without retraining, fine-tuning, or access to the original data. A singular value based decomposition conceals passports within model weights, ensuring that unlearning actions remain transparent and provably compliant. Evaluations on MNIST, CIFAR10 and CIFAR100 show that TrustErase matches or exceeds state-of-the-art benchmarks such as DELETE, L2UL, and Boundary Shrink, while operating in a strictly data-free regime. Ultimately, TrustErase establishes a new paradigm for trustworthy, accountable, and instantly forgettable AI systems.

2606.17119 2026-06-17 cs.CR cs.AI new submission

Graph neural networks at war: integrating cybersecurity and drone intelligence in the Israeli-Iranian conflict

战争中的图神经网络:整合网络安全与无人机智能于以色列-伊朗冲突

Sozan Sulaiman Maghdid, Tarik Ahmed Rashid, Shavan Askar

Affiliations: Department of Information Technology, Khabat Technical Institute; Erbil Polytechnic University; Computer Science and Engineering Department; AIIC, University of Kurdistan Hewler; Department of Information Systems Engineering, Technical College of Computer and Informatics Engineering

AI summary: Studies the use of graph neural networks (GNNs) to enhance network intrusion detection and drone response, validating through a case study their effectiveness for high detection rates, fast response, and situational awareness.

Details
AI Chinese abstract

物理网络系统在检测和即时响应方面带来了新的威胁和挑战。本研究探讨了图神经网络(GNN)如何用于辅助包含网络入侵和无人机(UAV)的物理网络系统中的网络安全和无人机管理。通过在图形神经网络的结构理解之间架起桥梁,本工作提供了一种集成程序,使入侵检测系统能够学习底层网络结构,识别恶意活动,并促进无人机响应措施。基于仿真的案例研究,创建了网络攻击模型以引发无人机响应,证明基于图的学习有助于态势感知、群体协调和自适应机动。根据性能评估,该方法的检测率为94.2,接收者操作特征(ROC)曲线下平均面积为0.955,平均响应时间为1.4秒。对比实验表明,所提出的GraphSAGE网络在相同情况下比图卷积网络(GCN)和图注意力网络(GAT)更有效。这些发现证明,图神经网络可用于预防动态网络物理系统中的入侵和响应。

English abstract

Cyber-physical systems have introduced new threats and new challenges for detection and immediate response. This study examines how Graph Neural Networks (GNNs) can support cybersecurity and drone management in a cyber-physical system comprising cyber intrusions and unmanned aerial vehicles (UAVs). Building on the structural understanding that graph neural networks provide, this work offers an integrated procedure that allows intrusion detection systems to learn underlying network structures, identify malicious activity, and facilitate drone response measures. In an emulation-based case study, cyberattack models were created to provoke drone responses, showing that graph-based learning can assist situational awareness, swarm coordination, and adaptive maneuvering. In the performance evaluation, the method achieves a detection rate of 94.2, an average area under the receiver operating characteristic (ROC) curve of 0.955, and an average response time of 1.4 seconds. Comparative experiments show that the proposed GraphSAGE network is more effective than Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) under identical conditions. These findings indicate that graph neural networks can be used for intrusion prevention and response in dynamic cyber-physical systems.

2606.17114 2026-06-17 cs.CR cs.AI 新提交

An Evaluation of Data Leakage Risks in Tool-Using LLM Agents in Realistic Scenarios

现实场景中工具使用LLM代理的数据泄露风险评估

Hankyul Baek, Jaewon Noh, Sang Seo, Yongsu Kim, Gabriel Waikin Loh Matienzo, Young Il Kim, Ee Wei Seah, Akriti Vij

发表机构 * Korea AI Safety Institute(韩国人工智能安全研究所) Singapore AI Safety Institute(新加坡人工智能安全研究所)

AI总结 评估了12个非对抗性任务中AI代理的数据泄露风险,发现所有代理均存在数据安全意识不足、信息过度访问等问题,表明操作数据泄露是独立于对抗性窃取的一阶安全风险。

AI中文摘要

AI代理越来越多地被用于企业和个人场景,可以访问电子邮件、数据库、文档和其他工具,从而读取、更新和传播敏感信息。先前关于代理数据泄露风险的研究大多集中在通过提示注入和越狱进行的对抗性数据窃取。然而,敏感信息也可能在非对抗性使用中暴露,即使在用户发出良性请求时也会产生泄露风险。我们报告了新加坡AI安全研究所和韩国AI安全研究所的联合评估,检查了12个现实、非对抗性任务中的代理数据泄露,涵盖客户支持、DevOps、网络自动化以及企业和个人生产力。评估涵盖了五种风险类型:缺乏数据意识、受众意识、政策合规性、数据最小化和访问边界意识。两个研究所使用独立的测试环境和特定任务的LLM评判标准,测试了一组反映真实部署的常见场景。在测试的三个代理中,没有一个在所有场景中实现完全正确且完全安全的执行。成功的任务完成往往伴随着数据处理失败,例如访问不必要的信息或向不适当的接收者披露信息,表明能力和数据处理安全性应分开评估。定性审查还揭示了声明-行动不匹配、模拟感知行为、用户-模拟器角色反转以及自动评判中的解释差距。总体而言,结果表明操作数据泄露是独立于对抗性窃取的一阶代理安全问题,并为未来代理数据处理安全评估提供了方法论。

英文摘要

AI agents are increasingly being adopted in enterprise and personal settings with access to emails, databases, documents, and other tools where they can read, update, and disseminate sensitive information. Much of prior research on data leakage risks in agents has focused on adversarial data exfiltration through prompt injections and jailbreaks. However, sensitive information may also be exposed during non-adversarial use, creating leakage risks even when users issue benign requests. We report a joint evaluation by the Singapore AI Safety Institute and the Korea AI Safety Institute examining agent data leakage in 12 realistic, non-adversarial tasks spanning customer support, DevOps, web automation, and enterprise and personal productivity. The evaluation covers five risk types: lack of data awareness, audience awareness, policy compliance, data minimization, and access-boundary awareness. Both institutes tested a common set of scenarios mirroring real-world deployments using independent testing environments and task-specific LLM-judge rubrics. Across the three tested agents, none achieved fully correct and fully safe execution across all scenarios. Successful task completion often coincided with data-handling failures such as accessing unnecessary information or disclosing information to inappropriate recipients, indicating that capability and data-handling safety should be evaluated separately. Qualitative review also revealed claim-action mismatches, simulation-aware behavior, user-simulator role reversal, and interpretation gaps in automated judging. Overall, the results indicate that operational data leakage is a first-order agent-safety concern distinct from adversarial exfiltration and provide a methodology for future evaluations of agent data-handling safety.

2606.17110 2026-06-17 cs.CR cs.LG 新提交

Loss Landscape Poisoning: Targeted Extraction of Unseen Training Data from LLMs

损失景观投毒:从大语言模型中定向提取未见训练数据

Md Abdullah Al Mamun, Ngoc Phu Doan, Pedram Zaree, Ihsen Alouani, Nael Abu-Ghazaleh

发表机构 * Queen's University Belfast(贝尔法斯特女王大学) CSIT, Queen's University Belfast(贝尔法斯特女王大学安全信息技术中心)

AI总结 提出一种通过投毒重塑模型损失景观,迫使模型记忆目标数据并实现高成功率提取的攻击方法,在语言和视觉-语言模型上验证有效性,并发现差分隐私可防御但存在绕过攻击。

AI中文摘要

大型语言模型越来越多地在专有或敏感数据上进行训练,从私人医疗和财务记录到包含秘密的用户对话。确保此类数据免受提取攻击的隐私性已成为一个核心问题。在本文中,我们探讨了一个攻击者能否通过投毒部分训练数据,来促进其无法访问的单独目标记录的泄露。我们给出了肯定的答案,并表明这种泄露可以通过一种重塑模型在目标补全周围的局部损失景观的投毒机制来诱导。我们的关键洞察是,通过投毒在目标处创建一个尖锐的损失最小值,并在附近替代方案上提升损失,迫使模型将目标记忆为其邻域中唯一的低损失解。该攻击不需要架构更改,并且可以推广到集中式和联邦学习设置。我们证明该攻击在语言模型(高达100%的成功提取)和视觉-语言模型(高达90%的成功提取)上放大了隐私泄露。我们表明,当模型以差分隐私方式训练时,该攻击被阻止。然而,我们引入了一种新的攻击,直接探测损失景观,甚至绕过了差分隐私防御。

英文摘要

Large Language Models are increasingly trained on proprietary or sensitive data, from private healthcare and financial records to user conversations containing secrets. Ensuring the privacy of such data against extraction attacks has become a central concern. In this paper, we ask whether an attacker who can poison a portion of the training data can facilitate the leakage of a separate target record they have no access to. We answer in the affirmative and show that such leakage can be induced by a poisoning mechanism that reshapes the model's local loss landscape around the target completion. Our key insight is that poisoning to create a sharp loss minimum at the target, surrounded by elevated loss on nearby alternatives, forces the model to memorize the target as the unique low-loss solution in its neighborhood. The attack requires no architectural changes and generalizes across centralized and federated learning settings. We demonstrate that the attack amplifies privacy leakage across language models (up to 100% successful extraction) and vision-language models (up to 90% successful extraction). We show that the attack is thwarted when the model is trained to be differentially private. However, we introduce a new attack that directly probes the loss landscape, bypassing even differential privacy defenses.