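arXivDaily daily academic digest: synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.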

2604.18215 2026-04-22 cs.CV

Memorize When Needed: Decoupled Memory Control for Spatially Consistent Long-Horizon Video Generation

Yanjun Guo, Zhengqiang Zhang, Pengfei Wang, Xinyue Liang, Zhiyuan Ma, Lei Zhang

Comments 24 pages, with supplementary material

Abstract

Spatially consistent long-horizon video generation aims to maintain temporal and spatial consistency along predefined camera trajectories. Existing methods mostly entangle memory modeling with video generation, leading to inconsistent content during scene revisits and diminished generative capacity when exploring novel regions, even when trained on extensive annotated data. To address these limitations, we propose a decoupled framework that separates memory conditioning from generation. Our approach significantly reduces training costs while simultaneously enhancing spatial consistency and preserving the generative capacity for novel scene exploration. Specifically, we employ a lightweight, independent memory branch to learn precise spatial consistency from historical observations. We first introduce a hybrid memory representation to capture complementary temporal and spatial cues from generated frames, then leverage a per-frame cross-attention mechanism to ensure each frame is conditioned exclusively on the most spatially relevant historical information, which is injected into the generative model to ensure spatial consistency. When generating new scenes, a camera-aware gating mechanism is proposed to mediate the interaction between memory and generation modules, enabling memory conditioning only when meaningful historical references exist. Compared with existing methods, our method is highly data-efficient, and experiments demonstrate that it achieves state-of-the-art performance in terms of both visual quality and spatial consistency.

2604.18164 2026-04-22 cs.CL cs.AI cs.CV

MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge

Sua Lee, Sanghee Park, Jinbae Im

Comments ACL 2026 Main

2604.17857 2026-04-22 cs.CL cs.AI cs.LG

On the Emergence of Syntax by Means of Local Interaction

Zichao Wei

2604.17393 2026-04-22 cs.CL

Who Watches the Watchmen? Humans Disagree With Translation Metrics on Unseen Domains

Finn Schmidt, Jan Philip Wahle, Terry Ruas, Bela Gipp

Comments Accepted at ACL 2026 (Findings)

2604.17390 2026-04-22 cs.CV cs.AI cs.GR

MESA: A Training-Free Multi-Exemplar Deep Framework for Restoring Ancient Inscription Textures

Vasileios Toulatzis, Sofia Theodoridou, Ioannis Fudos

2604.16755 2026-04-22 cs.AI

Machine individuality: Separating genuine idiosyncrasy from response bias in large language models

Valentin Kriegmair, Dirk U. Wulff

Comments 18 pages, 1 figure. Supporting information included; v2: minor formatting fixes

2604.16654 2026-04-22 cs.CL

IYKYK (But AI Doesn't): Automated Content Moderation Does Not Capture Communities' Heterogeneous Attitudes Towards Reclaimed Language

Christina Chance, Rebecca Pattichis, Arjun Subramonian, James He, Shruti Narayanan, Saadia Gabriel, Kai-Wei Chang

2604.16368 2026-04-22 cs.CL

Cross-Family Speculative Decoding for Polish Language Models on Apple Silicon: An Empirical Evaluation of Bielik 11B with UAG-Extended MLX-LM

Krzysztof Fonal

2604.14165 2026-04-22 cs.CL

EviSearch: A Human in the Loop System for Extracting and Auditing Clinical Evidence for Systematic Reviews

Naman Ahuja, Saniya Mulla, Muhammad Ali Khan, Zaryab Bin Riaz, Kaneez Zahra Rubab Khakwani, Mohamad Bassam Sonbol, Irbaz Bin Riaz, Vivek Gupta

2604.13571 2026-04-22 cs.CV

Radar-Informed 3D Multi-Object Tracking under Adverse Conditions

Bingxue Xu, Emil Hedemalm, Ajinkya Khoche, Patric Jensfelt

Comments 7 pages, 5 figures

2604.12942 2026-04-22 cs.RO

RMGS-SLAM: Real-time Multi-sensor Gaussian Splatting SLAM

Dongen Li, Yi Liu, Junqi Liu, Zewen Sun, Zefan Huang, Shuo Sun, Jiahui Liu, Chengran Yuan, Hongliang Guo, Francis E. H. Tay, Marcelo H. Ang

Comments The manuscript has been improved, with refined content and updated and corrected experimental results

2604.08475 2026-04-22 cs.CV

LAMP: Lift Image-Editing as General 3D Priors for Open-world Manipulation

Jingjing Wang, Zhengdong Hong, Chong Bao, Yuke Zhu, Junhan Sun, Guofeng Zhang

2604.05301 2026-04-22 cs.CV

SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration

Xueming Fu, Lixia Han

Comments Lab Report for NTIRE 2026 3DRR Track 2

2604.03540 2026-04-22 cs.RO

Drift-Based Policy Optimization: Native One-Step Policy Learning for Online Robot Control

Yuxuan Gao, Yedong Shen, Shiqi Zhang, Wenhao Yu, Yifan Duan, Jia Pan, Jiajia Wu, Jiajun Deng, Yanyong Zhang

2604.00688 2026-04-22 cs.CL eess.AS

OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models

Han Zhu, Lingxuan Ye, Wei Kang, Zengwei Yao, Liyong Guo, Fangjun Kuang, Zhifeng Han, Weiji Zhuang, Long Lin, Daniel Povey

2603.22650 2026-04-22 cs.CV cs.RO

MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping

Shiyao Li, Antoine Guédon, Shizhe Chen, Vincent Lepetit

Comments Accepted at CVPR 2026 (Oral). Project webpage: https://shiyao-li.github.io/magician/

2603.16497 2026-04-22 cs.LG cs.AI

Bridging the High-Frequency Data Gap: A Millisecond-Resolution Network Dataset for Advancing Time Series Foundation Models

Subina Khanal, Seshu Tirupathi, Merim Dzaferagic, Marco Ruffini, Torben Bach Pedersen

2603.15471 2026-04-22 cs.RO eess.SP

On the Derivation of Tightly-Coupled LiDAR-Inertial Odometry with VoxelMap

Zhihao Zhan

2603.11665 2026-04-22 cs.CL

Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge

Junjie Wu, Xuan Kan, Zihao He, Shunwen Tan, Bo Pan, Kaitai Zhang

Comments ACL 2026 Industry Track

2603.01420 2026-04-22 cs.LG

Tackling multiphysics problems via finite element-guided physics-informed operator learning

Yusuke Yamazaki, Reza Najian Asl, Markus Apel, Mayu Muramatsu, Shahed Rezaei

Abstract

This work presents a finite element-guided physics-informed operator learning framework for multiphysics problems with coupled partial differential equations (PDEs) on arbitrary domains. The proposed framework learns an operator from the input space to the solution space with a weighted residual formulation based on the finite element method, enabling discretization-independent prediction beyond the training resolution without relying on labeled simulation data. The present framework for multiphysics problems is implemented in Folax, a JAX-based operator learning platform, and is verified on nonlinear coupled thermo-mechanical problems. Two- and three-dimensional representative volume elements with varying heterogeneous microstructures, and a close-to-reality industrial casting example under varying boundary conditions are investigated as the example problems. We investigate the potential of several neural operators combined with the proposed finite element-guided approach, including Fourier neural operators (FNOs), deep operator networks (DeepONets), and a newly proposed implicit finite operator learning (iFOL) approach based on conditional neural fields. The results demonstrate that FNOs yield highly accurate solution operators on regular domains, where the global features can be efficiently learned in the spectral domain, and iFOL offers efficient parametric operator learning capabilities for complex and irregular geometries. Furthermore, studies on training strategies, network decomposition, and training sample quality reveal that a monolithic training strategy using a single network is sufficient for accurate predictions, while training sample quality strongly influences performance. Overall, the present approach highlights the potential of physics-informed operator learning with a finite element-based loss as a unified and scalable approach for coupled multiphysics simulations.

2602.20409 2026-04-22 cs.CV cs.LG

CLIPoint3D: Language-Grounded Few-Shot Unsupervised 3D Point Cloud Domain Adaptation

Mainak Singha, Sarthak Mehrotra, Paolo Casari, Subhasis Chaudhuri, Elisa Ricci, Biplab Banerjee

Comments Accepted in CVPR 2026

2602.13351 2026-04-22 cs.AI cs.FL cs.LO

A Formal Framework for the Explanation of Finite Automata Decisions

Jaime Cuartas Granada, Alexey Ignatiev, Peter J. Stuckey

2602.11623 2026-04-22 cs.LG

TreeGrad-Ranker: Feature Ranking via $O(L)$-Time Gradients for Decision Trees

Weida Li, Yaoliang Yu, Bryan Kian Hsiang Low

2602.09370 2026-04-22 cs.RO

Phase-Aware Policy Learning for Skateboard Riding of Quadruped Robots via Feature-wise Linear Modulation

Minsung Yoon, Jeil Jeong, Sung-Eui Yoon

Comments ICRA 2026 | Project Page: https://minsungyoon.github.io/projects/papl/ | M. Yoon and J. Jeong contributed equally

2601.18891 2026-04-22 cs.CV

Weakly supervised framework for wildlife detection and counting in challenging Arctic environments: a case study on caribou (Rangifer tarandus)

Ghazaleh Serati, Samuel Foucher, Jerome Theau

Comments 30 pages, 8 figures, published in Frontiers in Ecology and Evolution

DOI: 10.3389/fevo.2026.1727514

Abstract

Caribou populations across the Arctic have declined in recent decades, motivating scalable and accurate monitoring approaches to guide evidence-based conservation actions and policy decisions. Manual interpretation of aerial imagery is labor-intensive and error-prone, underscoring the need for automatic and reliable detection across varying scenes. Yet such automatic detection is challenging due to severe background heterogeneity, dominant empty terrain (class imbalance), small or occluded targets, and wide variation in density and scale. To make the detection model (HerdNet) more robust to these challenges, a weakly supervised patch-level pretraining scheme based on a detection network's architecture is proposed. The detection dataset includes five caribou herds distributed across Alaska. By learning from empty vs. non-empty labels in this dataset, the approach produces early weakly supervised knowledge for enhanced detection compared to HerdNet initialized from generic weights. Accordingly, the patch-based pretrained network attained high accuracy on the multi-herd imagery (2017) and independent-year (2019) test sets (F1: 93.7%/92.6%, respectively), enabling reliable mapping of regions containing animals to facilitate manual counting on large aerial imagery. Transferred to detection, initialization from weakly supervised pretraining yielded consistent gains over ImageNet weights on both positive patches (F1: 92.6%/93.5% vs. 89.3%/88.6%) and full-image counting (F1: 95.5%/93.3% vs. 91.5%/90.4%). Remaining limitations are false positives from animal-like background clutter and false negatives related to low animal density occlusions. Overall, pretraining on coarse labels prior to detection makes it possible to rely on weakly-supervised pretrained weights even when labeled data are limited, achieving results comparable to generic-weight initialization.

2601.14152 2026-04-22 cs.CL cs.AI cs.LG

Lost in the Prompt Order: Revealing the Limitations of Causal Attention in Language Models

Hyunjong Ok, Jaeho Lee

Comments ACL 2026 Findings

2601.12499 2026-04-22 cs.AI cs.LG

Failure Modes in Multi-Hop QA: The Weakest Link Effect and the Recognition Bottleneck

Meiru Zhang, Zaiqiao Meng, Nigel Collier

Comments Accepted at ACL 2026

2601.11913 2026-04-22 cs.CL cs.AI

LSTM-MAS: A Long Short-Term Memory Inspired Multi-Agent System for Long-Context Understanding

Yichen Jiang, Jiakang Yuan, Chongjun Tu, Peng Ye, Tao Chen

Comments 12 pages, 5 figures

2601.11037 2026-04-22 cs.AI

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

Shiyu Liu, Yongjing Yin, Jianhao Yan, Yunbo Tang, Qinggang Zhang, Bei Li, Xin Chen, Jingang Wang, Xunliang Cai, Jinsong Su

Comments ACL 2026 Conference. Code is available at https://github.com/Liushiyu-0709/BAPO-Reliable-Search

2601.08620 2026-04-22 cs.AI cs.CV

ViDoRe V3: A Comprehensive Evaluation of Retrieval Augmented Generation in Complex Real-World Scenarios

António Loison, Quentin Macé, Antoine Edy, Victor Xing, Tom Balough, Gabriel Moreira, Bo Liu, Manuel Faysse, Céline Hudelot, Gautier Viaud