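arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.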


2604.17398 2026-04-21 cs.CL

Contrastive Analysis of Linguistic Representations in Large Language Model Outputs through Structured Synthetic Data Generation and Abstracted N-gram Associations

S. A. Desimone, L. Alonso Alemany

2604.17397 2026-04-21 cs.CV cs.AI

Speculative Decoding for Autoregressive Video Generation

Yuezhou Hu, Jintao Zhang

2604.17396 2026-04-21 cs.CL

Representation-Guided Parameter-Efficient LLM Unlearning

Zeguan Xiao, Lang Mo, Yun Chen, Lei Yang, Jiehui Zhao, Lili Yang, Guanhua Chen

Comments Findings of ACL 2026

2604.17389 2026-04-21 cs.CV

Deep learning based Non-Rigid Volume-to-Surface Registration for Brain Shift compensation Using Point Cloud

Eashrat Jahan Muniya, Gernot Kronreif, Ander Biguri, Wolfgang Birkfellner, Sepideh Hatamikia

2604.17385 2026-04-21 cs.CV

SpatialImaginer: Towards Adaptive Visual Imagination for Spatial Reasoning

Yian Li, Yang Jiao, Bin Zhu, Tianwen Qian, Shaoxiang Chen, Jingjing Chen, Yu-Gang Jiang

2604.17384 2026-04-21 cs.LG

Towards a Data-Parameter Correspondence for LLMs: A Preliminary Discussion

Ou Wu

Comments 25 pages

2604.17377 2026-04-21 cs.CL

AnchorMem: Anchored Facts with Associative Contexts for Building Memory in Large Language Models

Zhanyu Shen, Sijie Cheng, Zhicheng Guo, Weiqin Wang, Yile Wang, Hui Huang

Comments ACL 2026 Findings

2604.17376 2026-04-21 cs.CV cs.AI cs.LG eess.IV

Towards Generalizable Deepfake Image Detection with Vision Transformers

Kaliki V Srinanda, M Manvith Prabhu, Hemanth K Mogilipalem, Jayavarapu S Abhinai, Vaibhav Santhosh, Aryan Herur, Deepu Vijayasenan

Comments 5 pages, 9 figures, SP Cup - ICASSP 2025

2604.17375 2026-04-21 cs.CV cs.AI

When Text Hijacks Vision: Benchmarking and Mitigating Text Overlay-Induced Hallucination in Vision Language Models

Cui Yakun, Xingqun Qi, TianTian Geng, Yuyao Zhang, Sirui Han, Yike Guo

Abstract

Recent advances in Vision-Language Models (VLMs) have substantially enhanced their ability across multimodal video understanding benchmarks spanning temporal, action, object, and spatial understanding. However, we identify a critical yet overlooked issue: when embedded on-screen text contradicts the visual scene, existing VLMs systematically hallucinate, prioritizing overlay textual semantics over the actual visual content. We define this phenomenon as Text Overlay-Induced Hallucination (TOIH). In this work, we propose VisualTextTrap, the first comprehensive benchmark, including large-scale human-validated samples with specifically designed evaluation metrics. In particular, we construct VisualTextTrap from widely-used public datasets using a scalable hybrid pipeline of VLM-assisted text generation and rigorous manual verification. The benchmark features 6,057 samples annotated across 88 fine-grained attributes within four dimensions, with hallucination intensity quantified on a five-level scale (L1 to L5) that reflects the semantic contradiction between overlay text and visual reality. Moreover, we propose Visual Text Hallucination Mitigation Mixture-of-Experts (VTHM-MoE), a novel Vision-Text Disentanglement framework that employs a dual-encoder architecture. Concretely, four dimension-specialized expert modules spanning Temporal, Action, Object, and Spatial reasoning are first pre-trained to identify and leverage cross-modal discrepancies between textual semantics and actual video content. We develop an Adaptive Token Routing Strategy to enable dynamic expert allocation, conferring robust resistance to TOIH while preserving performance on uncontaminated videos. Extensive experiments conducted on our VisualTextTrap benchmark verify the effectiveness of VTHM-MoE, which outperforms state-of-the-art counterparts on diverse video question answering tasks.


2604.17366 2026-04-21 cs.CL cs.AI

ArgBench: Benchmarking LLMs on Computational Argumentation Tasks

Yamen Ajjour, Carlotta Quensel, Nedim Lipka, Henning Wachsmuth

2604.17364 2026-04-21 cs.AI cs.MA cs.PL

LLM-Guided Strategy Synthesis for Scalable Equality Saturation

Chenyun Yin, Youwei Xiao, Yuze Luo, Yuyang Zou, Yun Liang

2604.17360 2026-04-21 cs.AI

T-DuMpRa: Teacher-guided Dual-path Multi-prototype Retrieval Augmented framework for fine-grained medical image classification

Zixuan Tang, Shen Zhao

2604.17358 2026-04-21 cs.CL cs.AI cs.SD

Still Between Us? Evaluating and Improving Voice Assistant Robustness to Third-Party Interruptions

Dongwook Lee, Eunwoo Song, Che Hyun Lee, Heeseung Kim, Sungroh Yoon

Comments ACL 2026 main conference

2604.17354 2026-04-21 cs.CL cs.CV

More Than Meets the Eye: Measuring the Semiotic Gap in Vision-Language Models via Semantic Anchorage

Wei He

Comments 16 pages, 4 figures. Accepted to the Main Conference of ACL 2026

2604.17353 2026-04-21 cs.AI cs.DC

Hive: A Multi-Agent Infrastructure for Algorithm- and Task-Level Scaling

Zizhang Luo, Yuhao Luo, Youwei Xiao, Yansong Xu, Runlin Guo, Yun Liang

2604.17351 2026-04-21 cs.AI

SOCIA-EVO: Automated Simulator Construction via Dual-Anchored Bi-Level Optimization

Yuncheng Hua, Sion Weatherhead, Mehdi Jafari, Hao Xue, Flora D. Salim

Comments This paper has been accepted to the ACL 2026 Main Conference

2604.17347 2026-04-21 cs.AI

Formal Foundations of Agentic Business Process Management

Giuseppe De Giacomo, Timotheus Kampik, Lukas Kirchdorfer, Marco Montali, Christoph Weinhuber

2604.17346 2026-04-21 cs.CL

Logical Computational Linguistics

Glyn V. Morrill, Oriol Valentín

2604.17344 2026-04-21 cs.LG cs.CL

FLARE: Task-agnostic embedding model evaluation through a normalization process

Jingzhou Jiang, Yixuan Tang, Yi Yang, Kar Yan Tam

Comments Accepted to Findings of ACL 2026

2604.17341 2026-04-21 cs.CV cs.AI

Robust Diabetic Retinopathy Grading Using Dual-Resolution Attention-Based Deep Learning with Ordinal Regression

Afshan Hashmi

2604.17340 2026-04-21 cs.CL

Neuro-Symbolic Resolution of Recommendation Conflicts in Multimorbidity Clinical Guidelines

Shiyao Xie, Jian Du

Comments Accepted by Proceedings of the 40th Annual AAAI Conference on Artificial Intelligence (Bridge Program on Logic & AI: Logical and Symbolic Reasoning in Language Models)

2604.17337 2026-04-21 cs.AI

AutoSearch: Adaptive Search Depth for Efficient Agentic RAG via Reinforcement Learning

Jingbo Sun, Wenyue Chong, Songjun Tu, Qichao Zhang, Yaocheng Zhang, Jiajun Chai, Xiaohan Wang, Wei Lin, Guojun Yin, Dongbin Zhao

2604.17335 2026-04-21 cs.RO

Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking

Zewei Zhang, Kehan Wen, Michael Xu, Junzhe He, Chenhao Li, Takahiro Miki, Clemens Schwarke, Chong Zhang, Xue Bin Peng, Marco Hutter

2604.17325 2026-04-21 cs.CL

Align Documents to Questions: Question-Oriented Document Rewriting for Retrieval-Augmented Generation

Jiaang Li, Zhendong Mao, Quan Wang, Yuning Wan, Yongdong Zhang

Comments ACL'26 Findings

2604.17323 2026-04-21 cs.CL cs.LG

A Universal Avoidance Method for Diverse Multi-branch Generation

Kyeongman Park, Minha Jhang, Kyomin Jung

2604.17321 2026-04-21 cs.CV

R-FLoRA: Residual-Statistic-Gated Low-Rank Adaptation for Single-Image Face Morphing Attack Detection

Raghavendra Ramachandra

Comments Pre-Print; Accepted in IEEE Transactions on Information Forensics and Security (TIFS), 2026

2604.17320 2026-04-21 cs.CV

Towards Joint Quantization and Token Pruning of Vision-Language Models

Xinqing Li, Xin He, Xindong Zhang, Ming-Ming Cheng, Lei Zhang, Yun Liu

2604.17319 2026-04-21 cs.CV cs.CL

E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition

Meng Zhang, Jinzhong Ning, Xiaolong Wu, Hongfei Lin, Yijia Zhang