2605.21404 2026-05-21 cs.LG

What Twelve LLM Agent Benchmark Papers Disclose About Themselves: A Pilot Audit and an Open Scoring Schema

Mahdi Naser Moghadasi, Faezeh Ghaderi

Affiliations: Research Division, BrightMind AI; Texas Tech University; University of Texas at Arlington

AI summary: This paper analyzes twelve well-known LLM agent benchmark papers, identifies gaps in how they disclose their evaluation methodology, and designs an open scoring schema to improve transparency and reproducibility.

Comments: Pilot audit of 12 LLM agent benchmark papers; schema, codebook, and per-paper scoring sheet released. Submission to IEEE Big Data 2026

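On the Regularity and Generalization of One-Step Wasserstein-guided Generative Models for PDE-Induced Measures

Likun Lin, Zhongjian Wang, Jack Xin, Zhiwen Zhang

Affiliations: Department of Mathematics, The University of Hong Kong; Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University; Department of Mathematics, University of California at Irvine

AI summary: This paper studies the regularity and generalization of one-step Wasserstein-guided generative models for PDE-induced probability measures. It develops a theoretical framework that establishes the regularity of the transport map and the generalization properties of the generative model, and validates the theoretical results experimentally.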

Abstract

Despite the remarkable empirical success of generative models, the available theory on their statistical accuracy in scientific computing remains largely pessimistic. This paper develops a theoretical framework for understanding the regularity of transport maps and the generalization properties of one-step Wasserstein-guided generative models for PDE-induced probability measures. We consider normalized target densities associated with linear elliptic and parabolic equations on bounded domains, as well as diffusion and Fokker-Planck equations on the torus. Under standard structural assumptions, we prove that these target measures satisfy doubling conditions. By combining this fact with regularity theory for optimal transport between doubling measures, we show that the optimal transport map from a uniform source measure to the target measure is Hölder continuous. This regularity yields an approximation-theoretic justification for one-step generative models that learn PDE-induced distributions via a single pushforward map. As a representative instance, we study DeepParticle and derive excess-risk bounds characterizing the discrepancy between the learned map and the population-optimal map. We also establish a robustness estimate under target shift and illustrate the theory with experiments which support the derived rates.

2605.21381 2026-05-21 cs.CV cs.LG

Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration

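Yi Liu, Jia Ma, Wengen Li, Jihong Guan, Shuigeng Zhou, Yichao Zhang

Affiliations: Tongji University; Fudan University

AI summary: This paper proposes DiSI, a framework that disentangles the generation and regression components of the stochastic interpolant process, enabling a continuous and controllable transition from pure regression to full generation and improving the efficiency and accuracy of image restoration.

Comments: 44 pages, 16 figures, 16 tables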

Abstract

Recent advances in Image Restoration (IR) have been largely driven by generative methods such as Diffusion Models and Flow Matching, which excel in synthesizing realistic textures while suffering from slow multi-step inference and compromised pixel fidelity. In contrast, classical regression-based IR methods excel precisely in these aspects, offering single-step efficiency and high pixel-level reconstruction fidelity. To bridge this gap, we propose DiSI, a unified framework that Disentangles the underlying Stochastic Interpolant process into independent generation and regression components. This decoupling endows DiSI with remarkable versatility, enabling a continuous and controllable transition from a pure regression process to a fully generative one. Technically, we instantiate this framework with two specific sampling trajectories, accompanied by a unified sampler for high-quality, few-step inference on arbitrary trajectories. Furthermore, we design a dual-branch U-Net style transformer network in pixel space, using a dedicated branch to enhance conditional guidance while ensuring high throughput. Extensive experiments demonstrate that DiSI efficiently achieves competitive results on various IR tasks, while uniquely offering the inference-time flexibility to control the distortion-perception trade-off within a single model.

2605.21372 2026-05-21 cs.CV cs.AI cs.LG cs.RO

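LASH: Adaptive Semantic Hybridization for Black-Box Jailbreaking of Large Language Models

Abdullah Al Nomaan Nafi, Fnu Suya, Swarup Bhunia, Prabuddha Chakraborty

Affiliations: University of Maine; University of Tennessee, Knoxville; University of Florida

AI summary: This paper proposes LASH, a framework that uses adaptive semantic hybridization to treat the outputs of multiple base attacks as reusable seed prompts and to compose them adaptively across target models and harm categories, achieving higher attack success rates in black-box red-teaming.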

Abstract

Jailbreak attacks expose a persistent gap between the intended safety behavior of aligned large language models and their behavior under adversarial prompting. Existing automated methods are increasingly effective but each commits to a single attack family (e.g., one refinement loop, one tree search, one mutation space, or one strategy library) and no single family dominates: the best-performing method shifts across target models and harm categories, suggesting complementary strengths that per-prompt composition could exploit. We introduce LASH (LLM Adaptive Semantic Hybridization), a black-box framework that treats outputs from multiple base attacks as reusable seed prompts and adaptively composes them for each target request. Given a seed pool, LASH searches over seed subsets and softmax-normalized mixture weights; a composition module synthesizes a single candidate prompt, and a derivative-free genetic optimizer updates the weights using black-box target feedback and a two-stage fitness function combining keyword-based refusal detection with LLM-judge scoring. On JailbreakBench, which contains 100 harmful prompts across 10 categories, we evaluate LASH on six common target models. LASH achieves an average attack success rate of 84.5% under keyword-based evaluation and 74.5% under two-stage evaluation, where responses are first filtered for refusals and then scored by an LLM judge for whether they substantively fulfill the original harmful request. LASH outperforms five state-of-the-art baselines on both metrics with only 30 mean target queries. LASH also remains competitive under three defense mechanisms and induces more success-like internal representations. These results suggest that adaptive composition across heterogeneous jailbreak strategies is a promising direction for black-box red-teaming.

2605.21348 2026-05-21 cs.LG cs.AI cs.NA math.NA physics.comp-ph

Data-Efficient Neural Operator Training via Physics-Based Active Learning

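Alicja Polanska, Lorenzo Zanisi, Vignesh Gopakumar, Stanislas Pamela

Affiliations: University College London; Atomic Energy Authority

AI summary: This paper proposes a physics-based active learning method that improves the data efficiency of neural operator training by using the PDE residual to guide data selection. In numerical experiments on the 1D Burgers equation and the 2D compressible Navier-Stokes equations, the method outperforms random acquisition and matches the state of the art in data efficiency.

Comments: Presented at the ICLR 2026 Workshop on Artificial Intelligence and Partial Differential Equations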

Abstract

Solving partial differential equations with neural operators significantly reduces computational costs but remains bottlenecked by high training data requirements. Active learning offers a natural framework to mitigate this by selectively acquiring the most informative samples in an iterative manner. We introduce physics-based acquisition - a novel physics-informed active learning algorithm that leverages the partial differential equation residual to guide data selection. We validate the method by presenting numerical experiments for the 1D Burgers equation and the 2D compressible Navier-Stokes equations. We show that, in our experiments, physics-based acquisition consistently outperforms random acquisition and matches the state of the art in data efficiency. At the same time, it has the unique advantage of injecting a physics inductive bias into the training process, ensuring that simulation cost is spent where the model's physical understanding is weakest.
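
One way to read the residual-guided selection described above, as a minimal sketch and not the authors' implementation: score each candidate by the mean squared residual of the Burgers equation evaluated on a surrogate's predicted field, then acquire the highest-scoring candidates. The viscous form u_t + u u_x = nu u_xx, the central-difference stencil, the periodic boundary, the top-k rule, and the names `burgers_residual` and `select_by_residual` are all assumptions made for illustration.

```python
import numpy as np


def burgers_residual(u, dx, dt, nu):
    """Mean squared residual of u_t + u*u_x - nu*u_xx (assumed viscous form).

    u has shape (nt, nx) with nt >= 3 and is treated as periodic in x.
    Only interior time steps are scored, since time uses central differences.
    """
    u_t = (u[2:] - u[:-2]) / (2 * dt)
    mid = u[1:-1]
    u_x = (np.roll(mid, -1, axis=1) - np.roll(mid, 1, axis=1)) / (2 * dx)
    u_xx = (np.roll(mid, -1, axis=1) - 2 * mid + np.roll(mid, 1, axis=1)) / dx**2
    residual = u_t + mid * u_x - nu * u_xx
    return float(np.mean(residual**2))


def select_by_residual(predictions, k, dx, dt, nu):
    """Return the indices of the k predicted fields with the largest residual.

    `predictions` is a list of (nt, nx) arrays, one per candidate input,
    produced by whatever surrogate model is being trained (hypothetical here).
    """
    scores = np.array([burgers_residual(u, dx, dt, nu) for u in predictions])
    order = np.argsort(-scores)  # descending: worst physics first
    return order[:k].tolist(), scores
```

The selected candidates would then be sent to the numerical solver for labelling, so simulation cost goes to the inputs where the surrogate violates the equation most.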

2605.21343 2026-05-21 cs.CV

OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation

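Ziye Li, Henghui Ding

Affiliations: Institute of Big Data, College of Computer Science and Artificial Intelligence, Fudan University, China

AI summary: This paper proposes OcclusionFormer, an occlusion-aware Diffusion Transformer framework that models Z-order explicitly by decoupling instances and compositing them via volume rendering, in order to resolve inter-object occlusion in layout-to-image models. A queried alignment loss improves spatial precision and semantic consistency.

Comments: ICML 2026, Project Page: https://henghuiding.com/OcclusionFormer/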

Abstract

Recent layout-to-image models have achieved remarkable progress in spatial controllability. However, they still struggle with inter-object occlusion. When bounding boxes overlap, most existing methods lack explicit occlusion information, which makes the generation in intersection regions inherently ambiguous and hinders the determination of complex occlusion relationships. As a result, they often produce entangled textures or physically inconsistent layering in the overlapped areas. To address this issue, we first construct SA-Z, a large-scale dataset enriched with explicit occlusion ordering and pixel-level annotations. Building upon our proposed dataset, we introduce OcclusionFormer, a novel occlusion-aware Diffusion Transformer framework that explicitly models Z-order priority by decoupling instances and compositing them via volume rendering. Furthermore, to ensure fine-grained spatial precision, we introduce a queried alignment loss that explicitly supervises individual instances and enhances semantic consistency. The proposed method effectively reduces ambiguity in overlapping regions, enforces correct occlusion dependencies, and preserves structural integrity, leading to substantial accuracy gains across diverse scenes.
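
The abstract does not spell out the compositing rule. As a hedged illustration of how a Z-order can resolve overlaps, the sketch below uses the standard front-to-back alpha compositing rule from volume rendering, in which each instance contributes its opacity times the transmittance left over by the instances in front of it. The function name, the array shapes, and the use of per-instance opacity maps are assumptions for illustration, not the paper's architecture.

```python
import numpy as np


def composite_front_to_back(colors, alphas, z_order):
    """Composite per-instance maps according to an explicit Z-order.

    colors:  (N, H, W, C) per-instance color or feature maps
    alphas:  (N, H, W) per-instance opacity in [0, 1]
    z_order: instance indices listed from front-most to back-most
    """
    h, w, c = colors.shape[1:]
    out = np.zeros((h, w, c))
    transmittance = np.ones((h, w))  # fraction of light not yet absorbed
    for i in z_order:
        weight = transmittance * alphas[i]
        out += weight[..., None] * colors[i]
        transmittance = transmittance * (1.0 - alphas[i])
    return out, transmittance
```

With two fully opaque instances covering the same pixel, the output takes the value of whichever instance comes first in `z_order`, which is the behaviour an explicit occlusion ordering is meant to guarantee in overlapping boxes.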

2605.21338 2026-05-21 cs.CL

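Optimized Federated Knowledge Distillation with Distributed Neural Architecture Search

Chaimaa Medjadji, Sylvain Kubler, Yves Le Traon, Guilain Leduc, Sadi Alawadi, Feras M. Awaysheh

Affiliations: Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg; Blekinge Institute of Technology; ADSLabs, Umea University

AI summary: This paper proposes FedKDNAS, a framework that combines client-side neural architecture selection with server-coordinated knowledge distillation to address data heterogeneity, system heterogeneity, and communication efficiency in federated learning, improving the Pareto efficiency of the accuracy-efficiency trade-off.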

Abstract

Federated Learning (FL) enables collaborative model training without centralizing data. However, real-world deployments must simultaneously address statistical heterogeneity across client data (non-IID), system heterogeneity in device capabilities, and communication efficiency. Existing FL approaches mitigate these challenges through improved aggregation, personalization, or knowledge distillation, but they almost universally assume a fixed client architecture, limiting adaptability to heterogeneous data complexity and hardware constraints. This architectural constraint often leads to suboptimal trade-offs between accuracy and efficiency in real-world FL systems. This work introduces FedKDNAS, a distillation-driven FL framework that combines client-side neural architecture selection with server-coordinated knowledge distillation. Each client autonomously selects a lightweight model under accuracy-resource constraints, trains it locally using a hybrid objective combining supervised learning and knowledge distillation, and shares only its predictions on a public reference set. The server then aggregates and smooths these predictions, optionally combining them with a teacher model, to produce stable distillation targets for the next round. Extensive evaluation on six datasets against six representative FL baselines (FedAvg, Ditto, FedMD, FedDF, FedDistill, Local-KD) demonstrates that FedKDNAS consistently achieves superior Pareto efficiency, improving accuracy by up to 15% under non-IID conditions, reducing client CPU usage by approximately 28%, and decreasing communication overhead by up to 44 times while maintaining lightweight logit-based communication.
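
The round structure described above can be sketched as follows. This is an illustration under stated assumptions, not the FedKDNAS implementation: the abstract does not say how predictions are smoothed or how the two loss terms are weighted, so the exponential moving average in `aggregate_predictions`, the `momentum` and `lam` parameters, and all function names are hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_predictions(client_probs, prev_targets=None, momentum=0.5):
    """Server side: average client class probabilities on the reference set.

    client_probs: (num_clients, num_samples, num_classes)
    Smoothing is assumed here to be an exponential moving average with the
    previous round's targets; the source does not specify the rule.
    """
    targets = np.mean(client_probs, axis=0)
    if prev_targets is not None:
        targets = momentum * prev_targets + (1.0 - momentum) * targets
    return targets / targets.sum(axis=-1, keepdims=True)


def hybrid_loss(logits, labels, targets, lam=0.5):
    """Client side: (1 - lam) * cross-entropy + lam * KL(targets || model)."""
    log_p = np.log(softmax(logits) + 1e-12)
    ce = -log_p[np.arange(len(labels)), labels].mean()
    kd = np.sum(targets * (np.log(targets + 1e-12) - log_p), axis=-1).mean()
    return (1.0 - lam) * ce + lam * kd
```

Because only per-sample class probabilities on the reference set are exchanged, clients are free to use different architectures, which is what makes per-client architecture selection compatible with this kind of distillation.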

2605.21318 2026-05-21 cs.CL cs.AI cs.LG

TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization

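Lucheng Fu, Ye Yu, Yiyang Wang, Yiqiao Jin, Haibo Jin, B. Aditya Prakash, Haohan Wang

Affiliations: Georgia Institute of Technology; University of Illinois Urbana-Champaign

AI summary: This paper studies prompt distributional overfitting and proposes TextReg, a framework that realizes a soft-penalty objective through regularized textual gradients, combining Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update to improve out-of-distribution (OOD) generalization.

Comments: Code: https://github.com/luchengfu6/TextReg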

Abstract

Large language models (LLMs) are highly sensitive to the prompts used to specify task objectives and behavioral constraints. Many recent prompt optimization methods iteratively rewrite prompts using LLM-generated feedback, but the resulting prompts often become longer, accumulate narrow sample-specific rules, and generalize poorly beyond the training distribution. We study this failure mode as prompt distributional overfitting and argue that it reflects a lack of representation control in discrete text-space optimization. We formalize this view through representational inefficiency, a dual-factor measure that decomposes prompt inefficiency into capacity cost and scope narrowness, attributing distributional prompt overfitting to their coupled growth during optimization. We propose TextReg, a regularization framework that realizes a soft-penalty objective through regularized textual gradients, combining Dual-Evidence Gradient Purification, Semantic Edit Regularization, and Regularization-Guided Prompt Update. Across multiple reasoning benchmarks, TextReg substantially improves out-of-distribution (OOD) generalization, with accuracy gains of up to +11.8% over TextGrad and +16.5% over REVOLVE.

2605.21317 2026-05-21 cs.LG

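Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens

Meng Shen, Minghao Wu, Deepu Rajan

Affiliations: Nanyang Technological University; Monash University

AI summary: This paper reduces object hallucination in LVLMs by emphasizing image-negative tokens. It proposes adjusting the training weights of different tokens and a data filtering strategy to control hallucination.

Comments: 20 pages, 10 figures, 10 tables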

Abstract

Object hallucination is a significant challenge that hinders the application of large vision-language models (LVLMs) in practice. We hypothesize that one possible origin of hallucination is the model's tendency to prioritize text generation over meaningful interaction with images. To explore this, we examine the generation process and categorize text tokens into three groups: image-positive, invariant, and negative, based on their visual dependence on input image tokens. Our analysis reveals that most generated tokens are minimally influenced by the image information. This suggests that during the model's training stage, more emphasis is placed on learning how to follow textual instructions, rather than extracting information from images. Based on this finding, we propose adjusting the training weights of different tokens depending on their visual dependence to control hallucination. Additionally, we remove a portion of the training data that potentially contains more hallucinations as a data filtering strategy. Both methods achieve a reduction in hallucination without compromising response length or introducing additional computational costs during inference. We validate our methods across three LVLM variants, demonstrating the effectiveness and general applicability.

PointACT: Vision-Language-Action Models with Multi-Scale Point-Action Interaction

RoadTones: Tone Controllable Text Generation from Road Event Videos

MC-Risk: Multi-Component Risk Fields for Risk Identification and Motion Planning

What Twelve LLM Agent Benchmark Papers Disclose About Themselves: A Pilot Audit and an Open Scoring Schema

Quantifying the cross-linguistic effects of syncretism on agreement attraction

From swept contact to pose: Probe-aware registration via complementary-shape docking

Towards Resilient and Autonomous Networks: A BlueSky Vision on AI-Native 6G

Post-Hoc Understanding of Metaphor Processing in Decoder-Only Language Models via Conditional Scale Entropy

On the Regularity and Generalization of One-Step Wasserstein-guided Generative Models for PDE-Induced Measures

Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration

Closed Loop Dynamic Driving Data Mixture for Real-Synthetic Co-Training

A Non-Reference Diffusion-Based Restoration Framework for Landsat 7 ETM+ SLC-off Imagery in Antarctica

Findings of the Fifth Shared Task on Multilingual Coreference Resolution: Expanding Datasets for Long-Range Entities

LASH: Adaptive Semantic Hybridization for Black-Box Jailbreaking of Large Language Models

Data-Efficient Neural Operator Training via Physics-Based Active Learning

OcclusionFormer: Arranging Z-Order for Layout-Grounded Image Generation

Text Analytics Evaluation Framework: A Case Study on LLMs and Social Media

SymbolicLight V1: Spike-Gated Dual-Path Language Modeling with High Activation Sparsity and Sub-Billion-Scale Pre-Training Evidence

Learning Robust Dexterous In-Hand Manipulation from Joint Sensors with Proprioceptive Transformer

Fast and Stable Triangular Inversion for Delta-Rule Linear Transformers

Optimized Federated Knowledge Distillation with Distributed Neural Architecture Search

TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization

CRAFT: Conflict-Resolved Aggregation for Federated Training

A New Framework to Analyse the Distributional Robustness of Deep Neural Networks

DeCoR: Design and Control Co-Optimization for Urban Streets Using Reinforcement Learning

Hyper-V2X: Hypernetworks for Estimating Epistemic and Aleatoric Uncertainty in Cooperative Bird's-Eye-View Semantic Segmentation

Deformba: Vision State Space Model with Adaptive State Fusion

From Circuit Evidence to Mechanistic Theory: An Inductive Logic Approach

Automatic Discovery of Disease Subgroups by Contrasting with Healthy Controls

Reducing Object Hallucination in LVLMs via Emphasizing Image-negative Tokens