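arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday

AI Large Models

Code LLMs / AI Programming

Code generation, software engineering agents, program repair, test generation, and developer tools.

Included for today/current date: 9. Signal sources: cs.SE, cs.CL, cs.AI, cs.LG, cs.PL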
2606.18733 2026-06-18 cs.SE cs.AI New submission 90%

SWE-Future: Forecast-Conditioned Data Synthesis for Future-Oriented Software Engineering Agents

Qiao Zhao, JianYing Qu, Jun Zhang, Yehua Yang, Hanwen Du, Zhongkai Sun

Affiliation: Baidu Inc

Topic match (Software agents): Data synthesis for future-oriented software engineering agents.

AI summary: Proposes SWE-Future, a method that uses repository history to forecast future task families (feature implementation/enhancement, bugfix, refactor) and synthesizes a 200-task coding-agent dataset conditioned on those forecasts, reducing reliance on replaying historical pull requests. In an 80-repository study, the forecaster achieves 58.1% future-work relevance.

Details
Abstract

Realistic coding-agent benchmarks often replay public GitHub issues and pull requests, making them vulnerable to overlap with model pretraining, fine-tuning, synthetic-data generation, or benchmark-driven model selection. Fully synthetic tasks avoid direct historical replay, but can drift away from real repository needs. We propose SWE-Future, a forecast-conditioned data synthesis method for future-oriented coding tasks. Given a forecast snapshot at time $T_0$, the method uses only pre-$T_0$ repository evidence to forecast future feature implementation/enhancement, bugfix, and refactor task families. We first validate this forecasting step retrospectively: after forecasts are fixed, later pull requests are used only to measure whether the predicted task families match future repository work. In an 80-repository study, the forecaster achieves 58.1% future-work relevance under the main semantic matching metric. We then use validated forecast families as conditioning signals to synthesize a 200-task coding-agent dataset across 61 repositories from a task-generation snapshot, rather than replaying the later pull requests used for validation. SWE-Future shows that repository-evolution forecasts can guide realistic, future-oriented coding-task synthesis while reducing direct dependence on historical pull-request replay.
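The retrospective validation described in the abstract (forecast from pre-$T_0$ evidence only, then score against later pull requests) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `PullRequest` record, the frequency-based `forecast_families` stand-in, and the exact-label `future_work_relevance` score are hypothetical and much simpler than the paper's forecaster and semantic matching metric.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical record: merge date plus a coarse task-family label."""
    merged_on: date
    family: str  # e.g. "feature", "bugfix", "refactor"


def split_at(prs, t0):
    """Separate evidence available before T0 from later pull requests."""
    before = [p for p in prs if p.merged_on < t0]
    after = [p for p in prs if p.merged_on >= t0]
    return before, after


def forecast_families(history, top_k=2):
    """Toy forecaster (assumption): predict the most frequent past families."""
    counts = Counter(p.family for p in history)
    return [family for family, _ in counts.most_common(top_k)]


def future_work_relevance(forecast, future):
    """Fraction of forecast families that occur in post-T0 pull requests.

    Exact label equality stands in for semantic matching here.
    """
    if not forecast:
        return 0.0
    seen = {p.family for p in future}
    return sum(f in seen for f in forecast) / len(forecast)


prs = [
    PullRequest(date(2025, 10, 1), "bugfix"),
    PullRequest(date(2025, 11, 5), "bugfix"),
    PullRequest(date(2025, 12, 9), "feature"),
    PullRequest(date(2026, 1, 15), "feature"),
    PullRequest(date(2026, 2, 3), "refactor"),
]
history, future = split_at(prs, date(2026, 1, 1))
forecast = forecast_families(history)  # fixed before looking at `future`
relevance = future_work_relevance(forecast, future)
print(forecast, relevance)
```

The point of the sketch is the ordering: the forecast is computed from `history` alone and frozen, and `future` is consulted only for scoring, which mirrors the leakage discipline the abstract describes.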

2606.15828 2026-06-18 cs.SE New submission 90%

Configuration Smells in AGENTS.md Files: Common Mistakes in Configuring Coding Agents

Helio Victor F. dos Santos, Vitor Costa, Joao Eduardo Montandon, Luciana Lourdes Silva, Marco Tulio Valente

Topic match (Software agents): Smell analysis of coding-agent configuration files; software engineering.

AI summary: Presents the first systematic catalog of smells in coding-agent configuration files (AGENTS.md/CLAUDE.md). A grey literature review and repository mining identify six smells, whose prevalence is evaluated on 100 open-source repositories; Lint Leakage is the most common (62% of files).

Details
Abstract

Coding agents are increasingly used to automate software engineering tasks. To guide their behavior, these agents commonly rely on configuration files, typically named AGENTS.md or CLAUDE.md, which provide instructions about architecture, workflows, coding conventions, and testing practices. Despite their growing importance, little is known about common problems affecting the definition and maintenance of these files. In this paper, we present the first catalog of smells for coding-agent configuration files. To identify such smells, we first conducted a grey literature review and a repository mining analysis. As a result, we identified six configuration smells and proposed automated heuristics to detect them. To evaluate the prevalence of the proposed smells, we analyzed 100 popular open-source repositories containing either an AGENTS.md or a CLAUDE.md file. Our results show that configuration smells are widespread. Lint Leakage was the most common smell, affecting 62% of the files, followed by Context Bloat (42%) and Skill Leakage (35%). We further show that several smells frequently co-occur, particularly Context Bloat, Skill Leakage, and Conflicting Instructions.

2606.19216 2026-06-18 cs.SE cs.HC New submission 85%

No Two Developers Think Alike: How Problem-Solving Styles and Experience Shape Needs in Conversational Interaction with Copilot

Jonan Richards, Bruno Alves de Oliveira, Iury Oliveira, Igor Wiese, Mairieli Wessel

Topic match (Software agents): Studies how developers interact with Copilot; falls under AI programming.

AI summary: Through a mixed-methods think-aloud study, identifies 5 interaction modes and 10 underlying needs and builds a conceptual model showing how cognitive diversity shapes developers' interactions with GitHub Copilot.

Comments Accepted at the International Conference on Software Maintenance and Evolution (ICSME), 2026

Details
Abstract

Conversational LLM-based "programming assistants" provide a range of benefits to developers. However, recent studies demonstrate the variety in individual developers' needs regarding programming assistants, and challenges encountered by only specific groups of developers. In this study, we explore the role of cognitive diversity in shaping interactions with GitHub Copilot chat. Through a mixed-methods think aloud study with 27 professional developers and students, we characterize 5 distinct "interaction modes" and 10 underlying needs in developers' interactions, forming a conceptual model. We characterize links between these modes, needs, and developers' problem-solving styles and experience profiles, showing how cognitive diversity may shape developers' interactions. We provide insights and recommendations for researchers and practitioners on how to design, research, and employ programming assistants to better account for diverse developer needs.

2606.19167 2026-06-18 cs.SE New submission 85%

Teaching Software Engineering with LLM and MCP Integration: From Classroom to Industry Practice

Kehui Chen, Jacky Keung, Weining Li, Xiangbing Shao, Yishu Li, Xiaoxue Ma

Topic match (Software agents): Integrates LLMs and MCP into software engineering teaching to improve programming and tool-use skills.

AI summary: Integrates LLMs and MCP into a collaborative teaching model for software engineering. By embedding LLM- and MCP-driven tools into daily teaching, code assistance, and engineering simulations, the model bridges the gap between traditional instruction and industrial workflows and improves students' programming, problem-solving, and intelligent-tool skills.

Comments Accepted by the International Symposium on Educational Technology (ISET) 2026

Details
Abstract

The rapid integration of Large Language Models (LLMs) and the Model Context Protocol (MCP) into industrial software engineering has created a pressing need to update software engineering education to align with emerging technologies and evolving industry demands. This study investigates an innovative approach that integrates LLMs and MCP into a collaborative teaching model for software engineering education, aiming to build a practical learning framework closely connected to real-world engineering practices. By embedding LLM and MCP driven tools into daily teaching, code assistance, and engineering simulations, the model effectively bridges the gap between traditional instruction and industrial workflows. This integration enhances students' programming competence, practical problem-solving abilities, and proficiency in using intelligent engineering tools. Furthermore, through partnerships with industry internships, students can apply these technologies in real-world settings, further strengthening the connection between academic preparation and professional practice. Overall, this research offers a practical pathway for reforming and innovating software engineering education in the era of artificial intelligence.

2606.19191 2026-06-18 cs.CR New submission 80%

PhantomSkill: Malicious Code Injection in Agent Skill Ecosystems

Yu-Ting Lin, Chia-Mu Yu

Topic match (Software agents): Malicious code injection attacks against LLM-based coding agents.

AI summary: Proposes PhantomSkill, an attack framework that hides malicious behavior in a skill's auxiliary resources. Its VulMask technique rewrites overt malicious scripts into vulnerability-shaped implementations to evade detection, preserving benign utility while reducing warnings and malware-level detections.

Details
Abstract

Agent skills allow LLM-based coding agents to acquire domain-specific capabilities from third-party packages, but they also introduce a new supply-chain attack surface. We present PhantomSkill, an attack framework that hides malicious behavior in a skill's auxiliary resources rather than in its textual description. Its core technique, VulMask, rewrites overt malicious scripts into vulnerability-shaped implementations whose malicious behavior is activated only under attacker-controlled trigger conditions. This design shifts the visible signal from explicit malicious intent to ordinary-looking insecure code. Across representative host skills, attack goals, coding agents, generation models, and automated reviewers, VulMask preserves benign utility while reducing warning and malware-level detection compared with overt malicious scripts. Our results show that skill ecosystems require resource-level vetting, execution-time containment, and security policies that treat exploitable vulnerabilities in agent skills as potential malicious payloads.

2602.04341 2026-06-18 cs.SE 80%

Model-Driven Legacy System Modernization at Scale

Tobias Böhm, Jens Guan Su Tien, Mohini Nonnenmann, Tom Schoonbaert, Bart Carpels, Andreas Biesdorf

Topic match (Software agents): Model-driven legacy system modernization.

AI summary: Presents a model-driven approach to legacy system modernization that inserts an enriched intermediate model between the legacy codebase and the modern target platform, enabling semi-automatic migration of core UI components and page structures and improving maintainability and developer experience.

Comments Accepted for publication at the 1st Workshop on Code Translation, Transformation, and Modernization (ReCode'26), co-located with ICSE 2026

Journal ref Proc. ReCode '26, ACM, New York, NY, USA (2026) 13-18

Details
Abstract

This experience report presents a model-driven approach to legacy system modernization that inserts an enriched, technology-agnostic intermediate model between the legacy codebase and the modern target platform, and reports on its application and evaluation. The four-stage process of analysis, enrichment, synthesis, and transition systematically extracts, abstracts, and transforms system artifacts. We apply our approach to a large industrial application built on legacy versions of the .NET Framework and ASP.NET MVC and show that core user interface components and page structures can be migrated semi-automatically to a modern web stack while preserving functional behavior and essential non-functional qualities. By consolidating architectural knowledge into explicit model representations, the resulting codebase exhibits higher maintainability and extensibility, thereby improving developer experience. Although automation is effective for standard patterns, migration of bespoke layout composites remains challenging and requires targeted manual adaptation. Our contributions are: (i) an end-to-end model-driven process, (ii) an enriched intermediate model that captures structure, dependencies, and semantic metadata, (iii) transformation rules that preserve functional behavior and essential non-functional qualities, and (iv) application and evaluation of the approach in an industrial setting. Overall, model-based abstractions reduce risk and effort while supporting scalable, traceable modernization of legacy applications. Our approach generalizes to comparable modernization contexts and promotes reuse of migration patterns.

2606.19121 2026-06-18 cs.SE cs.CL cs.HC New submission 75%

Written by AI, Managed by AI: Semantic Space Control and Index Sickness Elimination Across 391 Consecutive Sessions

Hui Zhang, Shuren Song

Affiliations: Shenzhen Yunxi Technology Co., Ltd.; Information Technology Center, Tsinghua University

Topic match (Software agents): The Index Sickness problem in long-horizon LLM collaboration, as it arises in code engineering.

AI summary: Through action research on a real software project, finds that adding formal constraints in long-horizon LLM collaboration can instead induce "Index Sickness", and proposes a "Baseline-Log Physical Separation" mechanism, after which no recurrence was observed across the subsequent ~150 sessions.

Comments 22 pages, 2 tables, 1 figure. Action research. Bilingual submission (Chinese companion version included as supplementary). Submitted to ICSE 2027 IOR track

Details
Abstract

The prevailing engineering intuition for addressing conceptual drift in long-horizon LLM collaboration is to trade more formal constraints for more reliable outputs -- designing symbolic identifier systems, accumulating defensive rules in System Prompts, expanding context windows. Our engineering record shows that in long-horizon settings, this direction may produce effects contrary to design intent. Using action research methods in a real software project (Bang-v3) spanning approximately one month and 391 collaborative sessions, we document and analyze the failure process of these strategies. When the symbolic system exceeds a complexity threshold, LLMs do not become more accurate -- instead, they abandon genuine understanding of business semantics, retreat to self-referential reasoning within the symbolic layer, and generate outputs that appear internally consistent but are physically disconnected from reality. We name this failure pattern "Index Sickness," and its canonical manifestation "Phantom Legislation." We name the underlying principle the "Pang Principle (Semantic Vitality Law)": natural language carrying explicit purpose conveys far greater information quality than symbolic expression. From this, we design and validate its physical engineering mechanism: "Baseline-Log Physical Separation." In the same project, this mechanism reduced AI Instructions volume by ~75%, and across the subsequent ~150 sessions, no recurrence of Index Sickness was observed. A bilingual companion version (Chinese) is included as supplementary material.

2606.18855 2026-06-18 cs.SE New submission 70%

Toward Semantically-Seeded, Graph-Propagated Impact Analysis Across Software Artifacts: A Vision

Momil Seedat

Topic match (Software agents): Impact analysis across software artifacts, fusing semantic and structural signals.

AI summary: Proposes a training-free, interpretable fusion of semantic similarity and structural dependencies. A heterogeneous artifact graph with decayed propagation covers the blind spots of both approaches, enabling impact analysis across the Requirement-Config-Service-Test chain.

Details
Abstract

When a single software artifact changes - a requirement, a configuration value, or a function - engineers must determine what else is impacted. Existing change-impact-analysis (CIA) tooling tends to rely on one of two signals in isolation: semantic similarity recovered from text (information-retrieval traceability, code search, embeddings), or structural dependency following (call graphs, IDE "find usages", test-impact selection). Each has a characteristic blind spot. A semantically driven tool misses an impacted artifact whose text shares no vocabulary with the change; a structurally driven tool misses artifacts related in meaning but not joined by an edge, and most operate only over code rather than the Requirement-Config-Service-Test chain. We argue for a training-free and interpretable analyzer that fuses both signals over the same embeddings. We model the system as a heterogeneous artifact graph with typed edges recovered by static analysis, compute a semantic prior by cosine similarity to the changed artifact, propagate impact multi-hop with decay over a row-normalized propagation matrix, and blend the two with a single tunable weight lambda. A small but complete proof-of-concept on a payment subsystem (5 labelled change scenarios) shows the mechanism we care about: artifacts with zero textual overlap with the change are still recovered through propagation, and helper functions that propagation alone cannot reach are recovered through the semantic layer. The fusion is the only configuration that covers both blind spots, and lambda acts as an explicit precision/recall control. Drawing on four publicly documented production failures, we argue that the same formulation extends to operational artifacts (images, metrics, dashboards, data schemas) that code-only analysis cannot reach.
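The pipeline in this abstract (a cosine-similarity prior, decayed multi-hop propagation over a row-normalized matrix, and a blend controlled by lambda) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the author's implementation: the convex blend `lam * semantic + (1 - lam) * structural`, the hop count, the decay value, and the four-artifact toy graph with 2-dimensional embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def row_normalize(adj):
    """Scale each row to sum to 1; rows with no outgoing edges stay zero."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def propagate(P, changed, hops=3, decay=0.5):
    """Push unit impact from `changed` along edges, damping hop k by decay**k."""
    n = len(P)
    frontier = [0.0] * n
    frontier[changed] = 1.0
    score = [0.0] * n
    for k in range(1, hops + 1):
        frontier = [sum(frontier[i] * P[i][j] for i in range(n)) for j in range(n)]
        for j in range(n):
            score[j] += (decay ** k) * frontier[j]
    return score


def fused_impact(embeddings, adj, changed, lam=0.5, hops=3, decay=0.5):
    """Blend the semantic prior and structural propagation (assumed convex mix)."""
    semantic = [cosine(embeddings[changed], e) for e in embeddings]
    structural = propagate(row_normalize(adj), changed, hops, decay)
    return [lam * s + (1 - lam) * t for s, t in zip(semantic, structural)]


# Toy system: 0 = requirement (changed), 1 = config, 2 = test, 3 = helper.
# adj[i][j] = 1 means a change to artifact i can impact artifact j.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# The helper shares vocabulary with the requirement; config and test do not.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

for lam in (1.0, 0.0, 0.5):
    print(lam, fused_impact(embeddings, adj, changed=0, lam=lam))
```

In this toy setup, `lam=1.0` (semantic only) gives the test a score of zero, `lam=0.0` (structural only) gives the helper a score of zero, and only an intermediate `lam` assigns both a nonzero score, which is the two-blind-spot behavior the abstract attributes to the fusion.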

2606.17510 2026-06-18 cs.SE cs.SY eess.SY New submission 70%

OmniDroneX: An LLM-Assisted Holistic Drone-as-a-Service Ecosystem

I-Ling Yen, Akeem Mohammed, Farokh Bastani, San-Yih Hwang

Topic match (Software agents): LLMs for service composition and code generation.

AI summary: Proposes OmniDroneX, a unified Drone-as-a-Service ecosystem that bridges low-level physical primitives and high-level mission intent through the libUAV interface and the PT-SOA abstraction model. LLMs assist with function identification, service composition, and natural-language mission specification, and multiple composition techniques support scalable, self-evolving UAV systems.

Comments This manuscript is a full version of a paper accepted in shortened form by IEEE International Conference on Joint Cloud Computing

Details
Abstract

Despite rapid advances in UAV technologies, current deployments remain limited due to several gaps in UAV systems research. To address these challenges, we propose OmniDroneX, a unified Drone-as-a-Service ecosystem, in which drones are transitioned from fixed function platforms into dynamically composable entities that can be integrated with external infrastructures to offer omni-capabilities. OmniDroneX bridges low-level physical primitives with high-level mission intent through a unified vendor-agnostic interface (libUAV) and a formal physical-service abstraction model (PT-SOA). A core innovation is the diverse application of large language models (LLMs) across multiple layers of the OmniDroneX architecture. LLMs are used to assist in identifying and formalizing primitive device functions and abstract service definitions, supporting automated service composition and workflow generation, and enabling interactive, natural-language mission specification and refinement. OmniDroneX also incorporates important categories of composition techniques that are essential in dynamic UAV systems, including physical layer composition for drone capability augmentation, as well as spatiotemporal, functional, collaborative, exception-aware, and QoS-based service compositions. Collectively, these features allow OmniDroneX to serve as a foundation for scalable, resilient, and self-evolving UAV ecosystems operating in complex and dynamic environments.