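arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.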

2604.19411 2026-04-22 cs.CV cs.AI

GOLD-BEV: GrOund and aeriaL Data for Dense Semantic BEV Mapping of Dynamic Scenes

Joshua Niemeijer, Alaa Eddine Ben Zekri, Reza Bahmanyar, Philipp M. Schmälzle, Houda Chaabouni-Chouayakh, Franz Kurz

Abstract

Understanding road scenes in a geometrically consistent, scene-centric representation is crucial for planning and mapping. We present GOLD-BEV, a framework that learns dense bird's-eye-view (BEV) semantic environment maps, including dynamic agents, from ego-centric sensors, using time-synchronized aerial imagery as supervision only during training. BEV-aligned aerial crops provide an intuitive target space, enabling dense semantic annotation with minimal manual effort and avoiding the ambiguity of ego-only BEV labeling. Crucially, strict aerial-ground synchronization allows overhead observations to supervise moving traffic participants and mitigates the temporal inconsistencies inherent to non-synchronized overhead sources. To obtain scalable dense targets, we generate BEV pseudo-labels using domain-adapted aerial teachers, and jointly train BEV segmentation with optional pseudo-aerial BEV reconstruction for interpretability. Finally, we extend beyond aerial coverage by learning to synthesize pseudo-aerial BEV images from ego sensors, which support lightweight human annotation and uncertainty-aware pseudo-labeling on unlabeled drives.

2604.19406 2026-04-22 cs.CV cs.AI

HP-Edit: A Human-Preference Post-Training Framework for Image Editing

Fan Li, Chonghuinan Wang, Lina Lei, Yuping Qiu, Jiaqi Xu, Jiaxiu Jiang, Xinran Qin, Zhikai Chen, Fenglong Song, Zhixin Wang, Renjing Pei, Wangmeng Zuo

Comments Accepted by CVPR2026

2604.19405 2026-04-22 cs.CL

Lost in Translation: Do LVLM Judges Generalize Across Languages?

Md Tahmid Rahman Laskar, Mohammed Saidul Islam, Mir Tafseer Nayeem, Amran Bhuiyan, Mizanur Rahman, Shafiq Joty, Enamul Hoque, Jimmy Huang

Comments Accepted at ACL 2026 Findings

2604.19404 2026-04-22 cs.RO cs.AI

M$^{2}$GRPO: Mamba-based Multi-Agent Group Relative Policy Optimization for Biomimetic Underwater Robots Pursuit

Yukai Feng, Zhiheng Wu, Zhengxing Wu, Junwen Gu, Junzhi Yu

2604.19403 2026-04-22 cs.CV

VecHeart: Holistic Four-Chamber Cardiac Anatomy Modeling via Hybrid VecSets

Yihong Chen, Pascal Fua

2604.19401 2026-04-22 cs.LG cs.AI

Revisiting Catastrophic Forgetting in Continual Knowledge Graph Embedding

Gerard Pons, Carlos Escolano, Besim Bilalli, Anna Queralt

Comments Pre-print submitted

2604.19399 2026-04-22 cs.LG cs.DC

Optimal Routing for Federated Learning over Dynamic Satellite Networks: Tractable or Not?

Yi Zhao, Di Yuan, Tao Deng, Suzhi Cao, Ying Dong

2604.19398 2026-04-22 cs.AI

GRASPrune: Global Gating for Budgeted Structured Pruning of Large Language Models

Ziyang Wang, Jiangfeng Xiao, Chuan Xiao, Ruoxiang Li, Rui Mao, Jianbin Qin

Comments Accepted to ACL 2026 Main Conference

2604.19395 2026-04-22 cs.CL

Does Self-Consistency Improve the Recall of Encyclopedic Knowledge?

Sho Hoshino, Ukyo Honda, Peinan Zhang

Comments ACL 2026

2604.19394 2026-04-22 cs.CL

Can Continual Pre-training Bridge the Performance Gap between General-purpose and Specialized Language Models in the Medical Domain?

Niclas Doll, Jasper Schulze Buschhoff, Shalaka Satheesh, Hammam Abdelwahab, Héctor Allende-Cid, Katrin Klug

Comments Accepted to the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026, San Diego, California, July 2 - 7, 2026) as a main conference paper

2604.19392 2026-04-22 cs.CV

HarmoniDiff-RS: Training-Free Diffusion Harmonization for Satellite Image Composition

Xiaoqi Zhuang, Jefersson A. Dos Santos, Jungong Han

Comments 8 pages, 6 figures, CVPR 2026 findings. Code is available at https://github.com/XiaoqiZhuang/HarmoniDiff-RS

2604.19379 2026-04-22 cs.CV

PanDA: Unsupervised Domain Adaptation for Multimodal 3D Panoptic Segmentation in Autonomous Driving

Yining Pan, Shijie Li, Yuchen Wu, Xulei Yang, Na Zhao

Comments Accepted at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2026

2604.19377 2026-04-22 cs.AI

Towards Energy Impact on AI-Powered 6G IoT Networks: Centralized vs. Decentralized

Anjie Qiu, Donglin Wang, Sanket Partani, Andreas Weinand, Hans D. Schotten

Comments 6 pages, 4 figures. Accepted for presentation at the IEEE GLOBECOM 2025 Workshop on Green Learning for Wireless Communications

2604.19374 2026-04-22 cs.RO

Achieving Interaction Fluidity in a Wizard-of-Oz Robotic System: A Prototype for Fluid Error-Correction

Carlos Baptista De Lima, Julian Hough, Frank Förster, Patrick Holthaus, Yongjun Zheng

Comments 5 pages, 1 figure, Workshop on Errors, Mistakes, and Failures in Humans and Robots at 2026 ACM/IEEE International Conference on Human-Robot Interaction

2604.19372 2026-04-22 cs.LG cs.AI

TACENR: Task-Agnostic Contrastive Explanations for Node Representations

Vasiliki Papanikou, Evaggelia Pitoura

Comments Accepted at the XAI 2026 Conference. 24 pages, 10 figures

2604.19369 2026-04-22 cs.CV

IonMorphNet: Generalizable Learning of Ion Image Morphologies for Peak Picking in Mass Spectrometry Imaging

Philipp Weigand, Niels Nawrot, Nikolas Ebert, Carsten Hopf, Oliver Wasenmüller

Comments This paper has been accepted at IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2026

2604.19368 2026-04-22 cs.CV cs.HC cs.LG cs.RO

Mind2Drive: Predicting Driver Intentions from EEG in Real-world On-Road Driving

Ghadah Alosaimi, Hanadi Alhamdan, Wenke E, Stamos Katsigiannis, Amir Atapour-Abarghouei, Toby P. Breckon

Comments 8 pages, 4 figures, 6 tables, conference

2604.19365 2026-04-22 cs.CV

Detection of T-shirt Presentation Attacks in Face Recognition Systems

Mathias Ibsen, Loris Tim Ide, Christian Rathgeb, Christoph Busch

2604.19357 2026-04-22 cs.LG

FairTree: Subgroup Fairness Auditing of Machine Learning Models with Bias-Variance Decomposition

Rudolf Debelak

Comments Accepted at ACM FAccT 2026

2604.19354 2026-04-22 cs.AI cs.CR cs.SE

Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges

Ali Al-Kaswan, Maksim Plotnikov, Maxim Hájek, Roland Vízner, Arie van Deursen, Maliheh Izadi

Comments Accepted to AIWare'26 Benchmark and Dataset Track

2604.19350 2026-04-22 cs.CV

Attend what matters: Leveraging vision foundational models for breast cancer classification using mammograms

Samyak Sanghvi, Piyush Miglani, Sarvesh Shashikumar, Kaustubh R Borgavi, Veenu Singla, Chetan Arora

2604.19349 2026-04-22 cs.CV

RAFT-MSF++: Temporal Geometry-Motion Feature Fusion for Self-Supervised Monocular Scene Flow

Xunpei Sun, Zuoxun Hou, Yi Chang, Gang Chen, Wei-Shi Zheng

Comments This work has been submitted to the IEEE for possible publication

2604.19345 2026-04-22 cs.CV

Geometry-Guided Self-Supervision for Ultra-Fine-Grained Recognition with Limited Data

Shijie Wang, Yadan Luo, Zijian Wang, Haojie Li, Zi Huang, Mahsa Baktashmotlagh

2604.19344 2026-04-22 cs.RO

Quadruped Parkour Learning: Sparsely Gated Mixture of Experts with Visual Input

Michael Ziegltrum, Jianhao Jiao, Tianhu Peng, Chengxu Zhou, Dimitrios Kanoulas

Comments 8 pages, 5 figures

2604.19342 2026-04-22 cs.CL

Are Large Language Models Economically Viable for Industry Deployment?

Abdullah Mohammad, Sushant Kumar Ray, Pushkar Arora, Rafiq Ali, Ebad Shabbir, Gautam Siddharth Kashyap, Jiechao Gao, Usman Naseem

Comments Accepted at ACL 2026 (Industry Track)

2604.19341 2026-04-22 cs.LG cs.AI

Evaluation-driven Scaling for Scientific Discovery

Haotian Ye, Haowei Lin, Jingyi Tang, Yizhen Luo, Caiyin Yang, Chang Su, Rahul Thapa, Rui Yang, Ruihua Liu, Zeyu Li, Chong Gao, Dachao Ding, Guangrong He, Miaolei Zhang, Lina Sun, Wenyang Wang, Yuchen Zhong, Zhuohao Shen, Di He, Jianzhu Ma, Stefano Ermon, Tongyang Li, Xiaowen Chu, James Zou, Yuzhi Xu

Abstract

Language models are increasingly used in scientific discovery to generate hypotheses, propose candidate solutions, implement systems, and iteratively refine them. At the core of these trial-and-error loops lies evaluation: the process of obtaining feedback on candidate solutions via verifiers, simulators, or task-specific scoring functions. While prior work has highlighted the importance of evaluation, it has not explicitly formulated the problem of how evaluation-driven discovery loops can be scaled up in a principled and effective manner to push the boundaries of scientific discovery, a problem this paper seeks to address. We introduce Simple Test-time Evaluation-driven Scaling (SimpleTES), a general framework that strategically combines parallel exploration, feedback-driven refinement, and local selection, revealing substantial gains unlocked by scaling evaluation-driven discovery loops along the right dimensions. Across 21 scientific problems spanning six domains, SimpleTES discovers state-of-the-art solutions using gpt-oss models, consistently outperforming both frontier-model baselines and sophisticated optimization pipelines. Particularly, we sped up the widely used LASSO algorithm by over 2x, designed quantum circuit routing policies that reduce gate overhead by 24.5%, and discovered new Erdos minimum overlap constructions that surpass the best-known results. Beyond novel discoveries, SimpleTES produces trajectory-level histories that naturally supervise feedback-driven learning. When post-trained on successful trajectories, models not only improve efficiency on seen problems but also generalize to unseen problems, discovering solutions that base models fail to uncover. Together, our results establish effective evaluation-driven loop scaling as a central axis for advancing LLM-driven scientific discovery, and provide a simple yet practical framework for realizing these gains.

2604.19339 2026-04-22 cs.CV

Divide-and-Conquer Approach to Holistic Cognition in High-Similarity Contexts with Limited Data

Shijie Wang, Zijian Wang, Yadan Luo, Haojie Li, Zi Huang, Mahsa Baktashmotlagh

2604.18580 2026-04-22 cs.LG cs.AI cs.CL

Sessa: Selective State Space Attention

Liubomyr Horbatko

Comments v2: revised abstract for clarity; main results unchanged. Code available at: https://github.com/LibratioAI/sessa

2604.18576 2026-04-22 cs.AI

Agentic Forecasting using Sequential Bayesian Updating of Linguistic Beliefs

Kevin Murphy

Comments v2 fixes a critical error in v1 related to calculation of Brier Index, and makes several important changes to the presentation

2604.18557 2026-04-22 cs.CV cs.GR cs.RO

SynAgent: Generalizable Cooperative Humanoid Manipulation via Solo-to-Cooperative Agent Synergy

Wei Yao, Haohan Ma, Hongwen Zhang, Yunlian Sun, Liangjun Xing, Zhile Yang, Yuanjun Guo, Yebin Liu, Jinhui Tang