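arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.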

2604.19047 2026-04-22 cs.CL cs.AI cs.IR

RARE: Redundancy-Aware Retrieval Evaluation Framework for High-Similarity Corpora

Hanjun Cho, Jay-Yoon Lee

Comments Accepted to ACL 2026 (Main Conference)

English abstract

Existing QA benchmarks typically assume distinct documents with minimal overlap, yet real-world retrieval-augmented generation (RAG) systems operate on corpora such as financial reports, legal codes, and patents, where information is highly redundant and documents exhibit strong inter-document similarity. This mismatch undermines evaluation validity: retrievers can be unfairly undervalued even when they retrieve documents that provide sufficient evidence, because redundancy across documents is not accounted for in evaluation. On the other hand, retrievers that perform well on standard benchmarks often generalize poorly to real-world corpora with highly similar and redundant documents. We present RARE (Redundancy-Aware Retrieval Evaluation), a framework for constructing realistic benchmarks by (i) decomposing documents into atomic facts to enable precise redundancy tracking and (ii) enhancing LLM-based data generation with CRRF. RAG benchmark data usually requires multiple quality criteria, but LLMs often yield trivial outputs. CRRF scores criteria separately and fuses decisions by rank, improving the reliability of generated data. Applying RARE to Finance, Legal, and Patent corpora, we introduce RedQA, where a strong retriever baseline drops from 66.4% PerfRecall@10 on 4-hop General-Wiki to 5.0-27.9% PerfRecall@10 at 4-hop depth, revealing robustness gaps that current benchmarks fail to capture. RARE enables practitioners to build domain-specific RAG evaluations that faithfully reflect real-world deployment conditions.
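The undervaluation problem described in the abstract can be illustrated with a small sketch. It compares a document-ID recall with a recall computed over atomic facts, so that a near-duplicate document carrying the same facts still counts as sufficient evidence. The function names, the toy corpus, and the fact labels are all hypothetical, and this is not the paper's PerfRecall@10 metric, whose definition the abstract does not give.

```python
from typing import Dict, List, Set


def doc_recall_at_k(ranked_docs: List[str], gold_docs: Set[str], k: int) -> float:
    """Fraction of annotated gold documents found in the top-k results."""
    top_k = set(ranked_docs[:k])
    return len(top_k & gold_docs) / len(gold_docs)


def fact_recall_at_k(
    ranked_docs: List[str],
    doc_facts: Dict[str, Set[str]],
    required_facts: Set[str],
    k: int,
) -> float:
    """Fraction of required atomic facts covered by the top-k results,
    regardless of which document supplies each fact."""
    covered: Set[str] = set()
    for doc in ranked_docs[:k]:
        covered |= doc_facts.get(doc, set())
    return len(covered & required_facts) / len(required_facts)


# Hypothetical toy corpus: the amended filing repeats the facts of the original.
doc_facts = {
    "10K_2023": {"f1", "f2"},
    "10K_2023_amended": {"f1", "f2"},
    "press_release": {"f3"},
}
gold_docs = {"10K_2023", "press_release"}   # documents the annotator happened to pick
required_facts = {"f1", "f2", "f3"}         # facts needed to answer the question
ranked = ["10K_2023_amended", "press_release", "10K_2023"]

doc_score = doc_recall_at_k(ranked, gold_docs, k=2)
fact_score = fact_recall_at_k(ranked, doc_facts, required_facts, k=2)
print(doc_score, fact_score)
```

In this toy case the retriever returns the amended filing instead of the annotated one, so the document-ID score penalizes it even though every required fact is present in the top two results.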

2604.19039 2026-04-22 cs.CV

Generative Texture Filtering

Rongjia Zheng, Shangwei Huang, Lei Zhu, Wei-Shi Zheng, Qing Zhang

Comments Accepted to SIGGRAPH 2026 conference track

2604.19036 2026-04-22 cs.AI cs.LO

Plausible Reasoning and First-Order Plausible Logic

David Billington

Comments 28 pages. arXiv admin note: text overlap with arXiv:1703.01697

2604.19034 2026-04-22 cs.CV

Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents

Xu Chen, Shichao Xie, Zhining Gu, Lu Jia, Minghua Luo, Fei Liu, Zedong Chu, Yanfen Shen, Xiaolong Wu, Mu Xu

2604.19033 2026-04-22 cs.LG cs.AI

Intentional Updates for Streaming Reinforcement Learning

Arsalan Sharifnassab, Mohamed Elsayed, Kris De Asis, A. Rupam Mahmood, Richard S. Sutton

2604.19028 2026-04-22 cs.LG

Learning Posterior Predictive Distributions for Node Classification from Synthetic Graph Priors

Jeongwhan Choi, Jongwoo Kim, Woosung Kang, Noseong Park

Comments Accepted to ICLR 2026. OpenReview: https://openreview.net/forum?id=FmxRzlu0rT

2604.19025 2026-04-22 cs.RO

RoomRecon: High-Quality Textured Room Layout Reconstruction on Mobile Devices

Seok Joon Kim, Dinh Duc Cao, Federica Spinola, Se Jin Lee, Kyu Sung Cho

Comments 23 pages, including supplementary material. Accepted to the 2024 IEEE International Symposium on Mixed and Augmented Reality (ISMAR). Best Paper Nominee

2604.19024 2026-04-22 cs.LG

Policy Gradient Primal-Dual Method for Safe Reinforcement Learning from Human Feedback

Qiang Liu, Adrienne Kline, Ermin Wei

2604.19022 2026-04-22 cs.AI

On Accelerating Grounded Code Development for Research

Santosh Ganji

2604.19021 2026-04-22 cs.LG

FG$^2$-GDN: Enhancing Long-Context Gated Delta Networks with Doubly Fine-Grained Control

Pingwei Sun, Yuxuan Hu, Jianchao Tan, Xue Wang, Jiaqi Zhang, Yifan Lu, Yerui Sun, Yuchen Xie, Xunliang Cai

2604.19018 2026-04-22 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML

Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

Julian Skifstad, Xinyue Annie Yang, Glen Chou

Comments Under review

2604.19016 2026-04-22 cs.CL

AlignCultura: Towards Culturally Aligned Large Language Models?

Gautam Siddharth Kashyap, Mark Dras, Usman Naseem

Comments Accepted to ACL 2026 (Main Conference)

2604.19015 2026-04-22 cs.LG cs.AI

FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion

Tao Fan, Guoqiang Ma, Yuanfeng Song, Lixin Fan, Kai Chen, Qiang Yang

2604.19009 2026-04-22 cs.LG cs.CV

Guiding Distribution Matching Distillation with Gradient-Based Reinforcement Learning

Linwei Dong, Ruoyu Guo, Ge Bai, Zehuan Yuan, Yawei Luo, Changqing Zou

2604.19001 2026-04-22 cs.CL

When Safety Fails Before the Answer: Benchmarking Harmful Behavior Detection in Reasoning Chains

Ishita Kakkar, Enze Zhang, Rheeya Uppaal, Junjie Hu

2604.18993 2026-04-22 cs.CV cs.AI cs.MM

AutoAWG: Adverse Weather Generation with Adaptive Multi-Controls for Automotive Videos

Jiagao Hu, Daiguo Zhou, Danzhen Fu, Fuhao Li, Zepeng Wang, Fei Wang, Wenhua Liao, Jiayi Xie, Haiyang Sun

Comments Accepted by ICMR 2026

2604.18988 2026-04-22 cs.CV

A Multi-Agent Framework with Structured Reasoning and Reflective Refinement for Multimodal Empathetic Response Generation

Liping Wang, Cheng Ye, Weidong Chen, Peipei Song, Bo Hu, Zhendong Mao

Comments Submitted to ACM Multimedia 2026

English abstract

Multimodal empathetic response generation (MERG) aims to generate emotionally engaging and empathetic responses based on users' multimodal contexts. Existing approaches usually rely on an implicit one-pass generation paradigm from multimodal context to the final response, which overlooks two intrinsic characteristics of MERG: (1) Human perception of emotional cues is inherently structured rather than a direct mapping. The conventional paradigm neglects the hierarchical progression of emotion perception, leading to distorted emotional judgments. (2) Given the inherent complexity and ambiguity of human emotions, the conventional paradigm is prone to significant emotional biases, ultimately resulting in suboptimal empathy. In this paper, we propose a multi-agent framework for MERG, which enhances empathy through structured reasoning and reflective refinement. Specifically, we first introduce a structured empathetic reasoning-to-generation module that explicitly decomposes response generation via multimodal perception, consistency-aware emotion forecasting, pragmatic strategy planning, and strategy-guided response generation, providing a clearer intermediate path from multimodal evidence to response realization. Besides, we develop a global reflection and refinement module, in which a global reflection agent performs step-wise auditing over intermediate states and the generated response, eliminating existing emotional biases and empathy errors, and triggering targeted regeneration. Overall, such a closed-loop framework enables our model to gradually improve the accuracy of emotion perception and eliminate emotion biases during the iteration process. Experiments on several benchmarks, e.g., IEMOCAP and MELD, demonstrate that our model has superior empathic response generation capabilities compared to state-of-the-art methods.

2604.18982 2026-04-22 cs.AI

SAVOIR: Learning Social Savoir-Faire via Shapley-based Reward Attribution

Xiachong Feng, Yi Jiang, Xiaocheng Feng, Deyi Yin, Libo Qin, Yangfan Ye, Lei Huang, Weitao Ma, Yuxuan Gu, Chonghan Qin, Bing Qin, Lingpeng Kong

Comments ACL 2026 Findings

2604.18980 2026-04-22 cs.CV

AdaGScale: Viewpoint-Adaptive Gaussian Scaling in 3D Gaussian Splatting to Reduce Gaussian-Tile Pairs

Joongho Jo, Hyerin Lim, Hanjun Choi, Jongsun Park

Comments DAC 2026

2604.18976 2026-04-22 cs.CL

STAR-Teaming: A Strategy-Response Multiplex Network Approach to Automated LLM Red Teaming

MinJae Jung, YongTaek Lim, Chaeyun Kim, Junghwan Kim, Kihyun Kim, Minwoo Kim

Comments Accepted at ACL 2026 Findings

2604.18967 2026-04-22 cs.CV

Toward Clinically Acceptable Chest X-ray Report Generation: A Qualitative Retrospective Pilot Study of CXRMate-2

Aaron Nicolson, Elizabeth J. Cooper, Hwan-Jin Yoon, Claire McCafferty, Ramya Krishnan, Michelle Craigie, Nivene Saad, Jason Dowling, Ian A. Scott, Bevan Koopman

2604.18964 2026-04-22 cs.AI cs.DB

DW-Bench: Benchmarking LLMs on Data Warehouse Graph Topology Reasoning

Ahmed G. A. H Ahmed, C. Okan Sakar

Comments 24 pages, 6 figures. Datasets and evaluation code available at GitHub

2604.18963 2026-04-22 cs.LG cs.AI

Distillation Traps and Guards: A Calibration Knob for LLM Distillability

Weixiao Zhan, Yongcheng Jing, Leszek Rutkowski, Dacheng Tao

2604.18961 2026-04-22 cs.RO cs.CV

AI-Enabled Image-Based Hybrid Vision/Force Control of Tendon-Driven Aerial Continuum Manipulators

Shayan Sepahvand, Farrokh Janabi-Sharifi, Farhad Aghili

2604.18957 2026-04-22 cs.CV

Bridging Foundation Models and ASTM Metallurgical Standards for Automated Grain Size Estimation from Microscopy Images

Abdul Mueez, Shruti Vyas

Comments Accepted at the 11th IEEE Workshop on Computer Vision for Multimodal Microscopy Image Analysis (CVMI), CVPR Workshops 2026

2604.18955 2026-04-22 cs.CL cs.AI cs.SI

Assessing Capabilities of Large Language Models in Social Media Analytics: A Multi-task Quest

Ramtin Davoudi, Kartik Thakkar, Nazanin Donyapour, Tyler Derr, Hamid Karimi

2604.18946 2026-04-22 cs.AI

Reasoning Structure Matters for Safety Alignment of Reasoning Models

Yeonjun In, Wonjoong Kim, Sangwu Park, Chanyoung Park

Comments ACL 2026

2604.18944 2026-04-22 cs.CL

A Mechanism and Optimization Study on the Impact of Information Density on User-Generated Content Named Entity Recognition

Jiang Xiaobo, Dinghong Lai, Song Qiu, Yadong Deng, Xinkai Zhan

2604.18943 2026-04-22 cs.AI cs.CL cs.HC cs.IR cs.LG

Personalized Benchmarking: Evaluating LLMs by Individual Preferences

Cristina Garbacea, Heran Wang, Chenhao Tan

Comments Accepted to Findings of ACL 2026

2604.18942 2026-04-22 cs.CL

Disparities In Negation Understanding Across Languages In Vision-Language Models

Charikleia Moraitaki, Sarah Pan, Skyler Pulling, Gwendolyn Flusche, Kumail Alhamoud, Marzyeh Ghassemi