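arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and other fields.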

2604.20246 2026-04-23 cs.RO cs.AI

Cortex 2.0: Grounding World Models in Real-World Industrial Deployment

Adriana Aida, Walida Amer, Katarina Bankovic, Dhruv Behl, Fabian Busch, Annie Bhalla, Minh Duong, Florian Gienger, Rohan Godse, Denis Grachev, Ralf Gulde, Elisa Hagensieker, Junpeng Hu, Shivam Joshi, Tobias Knoblauch, Likith Kumar, Damien LaRocque, Keerthana Lokesh, Omar Moured, Khiem Nguyen, Christian Preyss, Ranjith Sriganesan, Vikram Singh, Carsten Sponner, Anh Tong, Dominik Tuscher, Marc Tuscher, Pavan Upputuri

Comments 20 pages, 13 figures

2604.20244 2026-04-23 cs.CL cs.AI

Hybrid Policy Distillation for LLMs

Wenhong Zhu, Ruobing Xie, Rui Wang, Pengfei Liu

Comments WIP

2604.20243 2026-04-23 cs.CV

Bio-inspired Color Constancy: From Gray Anchoring Theory to Gray Pixel Methods

Kai-Fu Yang, Fu-Ya Luo, Yong-Jie Li

Comments 13 pages, 5 figures

2604.20241 2026-04-23 cs.CL physics.comp-ph

Construction of a Battery Research Knowledge Graph using a Global Open Catalog

Luca Foppiano, Sae Dieb, Malik Zain, Kazuki Kasama, Keitaro Sodeyama, Mikiko Tanifuji

2604.20231 2026-04-23 cs.RO

Toward Cooperative Driving in Mixed Traffic: An Adaptive Potential Game-Based Approach with Field Test Verification

Shiyu Fang, Xiaocong Zhao, Xuekai Liu, Peng Hang, Jianqiang Wang, Yunpeng Wang, Jian Sun

2604.20226 2026-04-23 cs.CV

Learning Spatial-Temporal Coherent Correlations for Speech-Preserving Facial Expression Manipulation

Tianshui Chen, Jianman Lin, Zhijing Yang, Chunmei Qing, Guangrun Wang, Liang Lin

English abstract

Speech-preserving facial expression manipulation (SPFEM) aims to modify facial emotions while meticulously maintaining the mouth animation associated with spoken content. Current works depend on inaccessible paired training samples for the person, where two aligned frames exhibit the same speech content yet differ in emotional expression, limiting the SPFEM applications in real-world scenarios. In this work, we discover that speakers who convey the same content with different emotions exhibit highly correlated local facial animations in both spatial and temporal spaces, providing valuable supervision for SPFEM. To capitalize on this insight, we propose a novel spatial-temporal coherent correlation learning (STCCL) algorithm, which models the aforementioned correlations as explicit metrics and integrates the metrics to supervise manipulating facial expression and meanwhile better preserving the facial animation of spoken content. To this end, it first learns a spatial coherent correlation metric, ensuring that the visual correlations of adjacent local regions within an image linked to a specific emotion closely resemble those of corresponding regions in an image linked to a different emotion. Simultaneously, it develops a temporal coherent correlation metric, ensuring that the visual correlations of specific regions across adjacent image frames associated with one emotion are similar to those in the corresponding regions of frames associated with another emotion. Recognizing that visual correlations are not uniform across all regions, we have also crafted a correlation-aware adaptive strategy that prioritizes regions that present greater challenges. During SPFEM model training, we construct the spatial-temporal coherent correlation metric between corresponding local regions of the input and output image frames as an additional loss to supervise the generation process.

2604.20225 2026-04-23 cs.CL

The GaoYao Benchmark: A Comprehensive Framework for Evaluating Multilingual and Multicultural Abilities of Large Language Models

Yilun Liu, Chunguang Zhao, Mengyao Piao, Lingqi Miao, Shimin Tao, Minggui He, Chenxin Liu, Li Zhang, Hongxia Ma, Jiaxin Guo, Chen Liu, Liqun Deng, Jiansheng Wei, Xiaojun Meng, Fanyi Du, Daimeng Wei, Yanghua Xiao

Comments Accepted by ACL 2026 main

2604.20221 2026-04-23 cs.CL

Markov reads Pushkin, again: A statistical journey into the poetic world of Evgenij Onegin

Angelo Maria Sabatini

Comments 21 pages, 7 figures, 3 supplementary files; revised version submitted to PLOS ONE

2604.20219 2026-04-23 cs.LG cs.NA math.NA stat.ML

Geometric Layer-wise Approximation Rates for Deep Networks

Shijun Zhang, Zuowei Shen, Yuesheng Xu

2604.20216 2026-04-23 cs.CL

Text-to-Distribution Prediction with Quantile Tokens and Neighbor Context

Yilun Zhu, Yuan Zhuang, Nikhita Vedula, Dushyanta Dhyani, Shaoyuan Xu, Moyan Li, Mohsen Bayati, Bryan Wang, Shervin Malmasi

Comments Accepted to ACL 2026 main conference

2604.20213 2026-04-23 cs.CV

Weighted Knowledge Distillation for Semi-Supervised Segmentation of Maxillary Sinus in Panoramic X-ray Images

Juha Park, Jiho Choi, Jong Pil Yun, Yong Chan Park, Han-Gyeol Yeom, Byung Do Lee, Sang Jun Lee

Comments 14 pages, 6 figures. Under review

2604.20209 2026-04-23 cs.LG

Scaling Self-Play with Self-Guidance

Luke Bailey, Kaiyue Wen, Kefan Dong, Tatsunori Hashimoto, Tengyu Ma

2604.20208 2026-04-23 cs.RO math.PR

Stochastic Barrier Certificates in the Presence of Dynamic Obstacles

Rayan Mazouz, Luca Laurenti, Morteza Lahijanian

2604.20204 2026-04-23 cs.LG

ACT: Anti-Crosstalk Learning for Cross-Sectional Stock Ranking via Temporal Disentanglement and Structural Purification

Juntao Li, Liang Zhang

Comments 15 pages

2604.20200 2026-04-23 cs.CL

Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows

Hardy Chen, Nancy Lau, Haoqin Tu, Shuo Yan, Xiangyan Liu, Zijun Wang, Juncheng Wu, Michael Qizhe Shieh, Alvaro A. Cardenas, Cihang Xie, Yuyin Zhou

Comments 25 pages

2604.20199 2026-04-23 cs.CL

All Languages Matter: Understanding and Mitigating Language Bias in Multilingual RAG

Dan Wang, Guozhao Mo, Yafei Shi, Cheng Zhang, Bo Zheng, Boxi Cao, Xuanang Chen, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Le Sun

Comments ACL 2026 main conference

2604.20193 2026-04-23 cs.RO

LLM-Guided Safety Agent for Edge Robotics with an ISO-Compliant Perception-Compute-Control Architecture

Xu Huang, Ruofan Zhang, Lu Cheng, Yuefeng Song, Xu Huang, Huayu Zhang, Sheng Yin, Anyang Liang, Chen Qian, Yin Zhou, Xiaoyun Yuan, Yuan Cheng

2604.20190 2026-04-23 cs.CV cs.LG

WildFireVQA: A Large-Scale Radiometric Thermal VQA Benchmark for Aerial Wildfire Monitoring

Mobin Habibpour, Niloufar Alipour Talemi, John Spodnik, Camren J. Khoury, Fatemeh Afghah

2604.20188 2026-04-23 cs.LG math.DS

Structure-Aware Variational Learning of a Class of Generalized Diffusions

Yubin Lu, Xiaofan Li, Chun Liu, Qi Tang, Yiwei Wang

2604.20168 2026-04-23 cs.CL

Duluth at SemEval-2026 Task 6: DeBERTa with LLM-Augmented Data for Unmasking Political Question Evasions

Shujauddin Syed, Ted Pedersen

2604.20166 2026-04-23 cs.CL cs.HC

Aligning Human-AI-Interaction Trust for Mental Health Support: Survey and Position for Multi-Stakeholders

Xin Sun, Yue Su, Yifan Mo, Qingyu Meng, Yuxuan Li, Saku Sugawara, Mengyuan Zhang, Charlotte Gerritsen, Sander L. Koole, Koen Hindriks, Jiahuan Pei

2604.20161 2026-04-23 cs.LG stat.ME stat.ML

SMART: A Spectral Transfer Approach to Multi-Task Learning

Boxin Zhao, Mladen Kolar, Jinchi Lv

Comments 53 pages, 4 figures, 1 table

2604.20158 2026-04-23 cs.AI

Stateless Decision Memory for Enterprise AI Agents

Vasundra Srinivasan

Comments 16 pages, 4 figures, 4 tables. Companion paper to "Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents" (arXiv:TBD). Code and reproducibility artifacts at https://github.com/vasundras/stateless-decision-memory-enterprise-ai-agents

English abstract

Enterprise deployment of long-horizon decision agents in regulated domains (underwriting, claims adjudication, tax examination) is dominated by retrieval-augmented pipelines despite a decade of increasingly sophisticated stateful memory architectures. We argue this reflects a hidden requirement: regulated deployment is load-bearing on four systems properties (deterministic replay, auditable rationale, multi-tenant isolation, statelessness for horizontal scale), and stateful architectures violate them by construction. We propose Deterministic Projection Memory (DPM): an append-only event log plus one task-conditioned projection at decision time. On ten regulated decisioning cases at three memory budgets, DPM matches summarization-based memory at generous budgets and substantially outperforms it when the budget binds: at a 20x compression ratio, DPM improves factual precision by +0.52 (Cohen's h=1.17, p=0.0014) and reasoning coherence by +0.53 (h=1.13, p=0.0034), paired permutation, n=10. DPM is additionally 7-15x faster at binding budgets, making one LLM call at decision time instead of N. A determinism study of 10 replays per case at temperature zero shows both architectures inherit residual API-level nondeterminism, but the asymmetry is structural: DPM exposes one nondeterministic call; summarization exposes N compounding calls. The audit surface follows the same one-versus-N pattern: DPM logs two LLM calls per decision while summarization logs 83-97 on LongHorizon-Bench. We conclude with TAMS, a practitioner heuristic for architecture selection, and a failure analysis of stateful memory under enterprise operating conditions. The contribution is the argument that statelessness is the load-bearing property explaining enterprise's preference for weaker but replayable retrieval pipelines, and that DPM demonstrates this property is attainable without the decisioning penalty retrieval pays.
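The core idea in the abstract above, an append-only event log plus a single task-conditioned projection at decision time, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Event` schema, the topic-matching rule, the character budget, and the `model` callable are all hypothetical stand-ins, and the abstract does not say how the projection itself is computed (here it is a plain deterministic filter).

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """One immutable log record (hypothetical schema, not from the paper)."""
    seq: int
    topic: str
    text: str


class EventLog:
    """Append-only log: events can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, topic: str, text: str) -> Event:
        event = Event(seq=len(self._events), topic=topic, text=text)
        self._events.append(event)
        return event

    def snapshot(self) -> Tuple[Event, ...]:
        # Return an immutable copy so callers cannot mutate the log.
        return tuple(self._events)


def project(events: Tuple[Event, ...], task_topic: str, budget_chars: int) -> str:
    """Deterministic, task-conditioned view of the log under a size budget.

    Assumed rule: keep events whose topic matches the task, newest first,
    until the character budget would be exceeded, then restore log order.
    """
    kept: List[Event] = []
    used = 0
    for event in reversed(events):
        if event.topic != task_topic:
            continue
        if used + len(event.text) > budget_chars:
            break
        kept.append(event)
        used += len(event.text)
    return "\n".join(e.text for e in sorted(kept, key=lambda e: e.seq))


def decide(log: EventLog, task_topic: str, budget_chars: int,
           model: Callable[[str], str]) -> str:
    """Build one projection, then make exactly one call to the model stand-in."""
    context = project(log.snapshot(), task_topic, budget_chars)
    return model(context)
```

Because the log is never rewritten and the projection is a pure function of the log and the task, replaying a decision only requires re-running `decide` on the same snapshot; the single `model` call is the only place nondeterminism could enter in this sketch.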

2604.20157 2026-04-23 cs.CV

HumanScore: Benchmarking Human Motions in Generated Videos

Yusu Fang, Tiange Xiang, Tian Tan, Narayan Schuetz, Scott Delp, Li Fei-Fei, Ehsan Adeli

2604.20156 2026-04-23 cs.LG

Temporally Extended Mixture-of-Experts Models

Zeyu Shen, Peter Henderson

2604.20151 2026-04-23 cs.RO cs.LG

Toward Safe Autonomous Robotic Endovascular Interventions using World Models

Harry Robertshaw, Nikola Fischer, Han-Ru Wu, Andrea Walker Perez, Weiyuan Deng, Benjamin Jackson, Christos Bergeles, Alejandro Granados, Thomas C Booth

Comments This manuscript is a preprint and has been submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2026

2604.20148 2026-04-23 cs.CL cs.AI cs.LG

Meta-Tool: Efficient Few-Shot Tool Adaptation for Small Language Models

Sachin Kumar

Comments Accepted to Findings of ACL 2026

2604.20141 2026-04-23 cs.LG math.DS

Fourier Weak SINDy: Spectral Test Function Selection for Robust Model Identification

Zhiheng Chen, Urban Fasel, Anastasia Bizyaeva

Comments Accepted to the 8th Annual Learning for Dynamics & Control Conference (L4DC 2026)

2604.20140 2026-04-23 cs.AI cs.LG

HiPO: Hierarchical Preference Optimization for Adaptive Reasoning in LLMs

Darsh Kachroo, Adriana Caraeni, Arjun Prasaath Anbazhagan, Brennan Lagasse, Kevin Zhu

Comments 12 pages, 4 figures, 6 tables. Includes ablation study across Qwen2.5-7B-Instruct and Llama-3.1-8B-Instruct on 5 math reasoning benchmarks (GSM8K, MATH500, Minerva, AIME24, Gaokao2023). GPT-4.1 used for structured evaluation of reasoning quality

2604.20136 2026-04-23 cs.CV cs.AI

IMPACT-CYCLE: A Contract-Based Multi-Agent System for Claim-Level Supervisory Correction of Long-Video Semantic Memory

Weitong Kong, Di Wen, Kunyu Peng, David Schneider, Zeyun Zhong, Alexander Jaus, Zdravko Marinov, Jiale Wei, Ruiping Liu, Junwei Zheng, Yufan Chen, Lei Qi, Rainer Stiefelhagen

Comments 7 pages, 2 figures, code is available at https://github.com/MKong17/IMPACT_CYCLE