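arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.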

2605.01039 2026-05-05 cs.LG

Finite-Sample Analysis of Elimination in Active Hypothesis Testing

Ziyuan Lin, Hoang Ngoc Nguyen, Jie Xu, Ivan Ruchkin

Comments Submitted to IEEE Conference on Decision and Control (CDC) 2026. 18 pages, 4 figures

Abstract

A fixed-confidence, finite-sample problem of active hypothesis testing arises in many safety-critical applications. Situated in the context of sequential hypothesis testing, this paper studies the effect of hypothesis elimination on the stopping time. We introduce an elimination-augmented Track-and-Stop algorithm, in which champion-specific active-opponent sets are progressively pruned, and sensing effort is reallocated toward the surviving alternatives. Our analysis derives a non-asymptotic upper bound on the expected stopping time. The gain in finite-sample from elimination appears on the scale of the non-leading term, resulting from tighter tracking and concentration constants on the reduced hypothesis set. Furthermore, we introduce an aggressiveness parameter to modulate the trade-off between faster elimination and weaker confidence guarantee. An experimental study on synthetic Gaussian instances confirms the theoretical predictions.
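
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single sensing action, unit-variance Gaussian hypotheses, and a fixed log-likelihood-ratio threshold. The names `log_likelihood`, `prune_opponents`, and `threshold` are hypothetical and chosen only for illustration.

```python
def log_likelihood(mean, observations):
    """Unit-variance Gaussian log-likelihood, up to an additive constant."""
    return -0.5 * sum((x - mean) ** 2 for x in observations)


def prune_opponents(means, observations, threshold):
    """Return the champion index and the opponents that survive pruning.

    Assumption for illustration: an opponent is eliminated once the
    log-likelihood ratio of the champion against it reaches `threshold`.
    """
    scores = [log_likelihood(m, observations) for m in means]
    # Champion: the hypothesis with the highest likelihood so far.
    champion = max(range(len(means)), key=lambda i: scores[i])
    survivors = [
        j
        for j in range(len(means))
        if j != champion and scores[champion] - scores[j] < threshold
    ]
    return champion, survivors


means = [0.0, 1.0, 3.0]                # candidate hypotheses (hypothetical)
observations = [0.0, 0.0, 1.0, 0.0]    # toy data, not from the paper
champion, survivors = prune_opponents(means, observations, threshold=5.0)
print(champion, survivors)
```

In this toy instance the distant hypothesis (mean 3.0) is pruned while the close one (mean 1.0) remains active, so later sensing effort would only need to separate the champion from that surviving alternative.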

2605.01036 2026-05-05 cs.CV

InterPhys: Physics-aware Human Motion Synthesis in a Dynamic Scene

Chaoyue Xing, Wei Mao, Miaomiao Liu

Comments Accepted to CVPR2026

2605.01034 2026-05-05 cs.CL

A Theoretical Game of Attacks via Compositional Skills

Xinbo Wu, Huan Zhang, Abhishek Umrawal, Lav R. Varshney

Comments arXiv admin note: text overlap with arXiv:2505.20841

2605.01024 2026-05-05 cs.CV cs.AI

EmoMM: Benchmarking and Steering MLLM for Multimodal Emotion Recognition under Conflict and Missingness

Yueru Sun, Yimeng Zhang, Haoyu Gu, Nuo Chen, Dong She, Xianrong Yao, Yang Gao, Zhanpeng Jin

2605.01020 2026-05-05 cs.LG

Continual Learning of Feedback-based Molecular Communication

Siddhant Setia, Junichi Suzuki, Tadashi Nakano

Comments 16 pages, 5 figures. To be published in Proceedings of International Conference on Bio-inspired Information and Communications Technologies 2025

2605.00994 2026-05-05 cs.CL cs.AI

Model Organisms Are Leaky: Perplexity Differencing Often Reveals Finetuning Objectives

Mohammed Abu Baker, Luca Baroni, Dan Wilhelm

2605.00977 2026-05-05 cs.CV cs.AI cs.CL

Democratizing the medieval English legal tradition

Michael Zhang, Elise Wang, Charlotte Whatley, Seth Strickland, Dylan Bannon

Comments Submitted to International Conference on Document Analysis and Recognition (ICDAR) 2026

2605.00973 2026-05-05 cs.LG cs.AI eess.SP

Physiology-Aware Masked Cross-Modal Reconstruction for Biosignal Representation Learning

Hao Zhou, Simon A. Lee, Cyrus Tanade, Keum San Chun, Juhyeon Lee, Migyeong Gwak, Megha Thukral, Justin Sung, Eugene Hwang, Mehrab Bin Morshed, Li Zhu, Viswam Nathan, Md Mahbubur Rahman, Subramaniam Venkatraman, Sharanya Arcot Desai

Comments Proceedings of the 43rd International Conference on Machine Learning

2605.00966 2026-05-05 cs.LG cs.NE q-bio.NC stat.ML

Robust volatility updates for Hierarchical Gaussian Filtering

Christoph Mathys, Nicolas Legrand, Peter Thestrup Waade, Nace Mikus, Lilian Aline Weber

2605.00963 2026-05-05 cs.RO cs.AI

Ablation Study of Multimodal Perception, Language Grounding, and Control for Human-Robot Interaction in an Object Detection and Grasping Task

Zi Tian, Guanting Shen

Comments 10 pages

2605.00960 2026-05-05 cs.CV cs.CL

Energy-Based Constraint Networks: Learning Structural Coherence Across Modalities

Chirag Shinde

Comments 16 pages, 3 figures, 11 tables. Code: https://github.com/cs-cmyk/energy-constraint-networks Weights: https://huggingface.co/cs-cmyk/energy-constraint-networks

2605.00951 2026-05-05 cs.LG cs.AI

Graph Rewiring in GNNs to Mitigate Over-Squashing and Over-Smoothing: A Survey

Hugo Attali, Nathalie Pernelle, Davide Buscaldi, Fragkiskos D. Malliaros

Comments Accepted at the International Joint Conference on Artificial Intelligence (IJCAI 2026), Survey Track

2605.00943 2026-05-05 cs.RO

ARIS: Agentic and Relationship Intelligence System for Social Robots

Stavya Datta, Fucai Ke, Leimin Tian, Hamid Rezatofighi

2605.00940 2026-05-05 cs.LG cs.AI

Interpretable experiential learning based on state history and global feedback

Anton Kolonin

Comments 5 figures

2605.00938 2026-05-05 cs.LG cs.AI

Fusing Urban Structure and Semantics: A Conditional Diffusion Model for Cross-City OD Matrix Generation

Bin Chen, Zhuoya Meng, Fang Yang, Runkang Guo, Jingtao Ding, Yin Zhang, Chuan Ai, Zhengqiu Zhu

Abstract

Accurate modeling of commuting flows is important for urban governance, traffic planning, and resource allocation. However, the combined influence of individual intentions, geographic constraints, and social dynamics leads to considerable heterogeneity in commuting patterns, making it difficult to develop generation models that generalize across cities. To address this issue, we propose SEDAN, a Structure-Enhanced Diffusion model conditioned on Attributed Nodes for generalizable OD matrix generation. SEDAN models a city as an attributed graph. Each region is treated as a node with demographic and point-of-interest features, and commuting flows are modeled as weighted edges. Adjacency and distance matrices are incorporated to characterize spatial structure. Based on this representation, we design a fusion mechanism within SEDAN to jointly model semantic information and spatial information. Regional semantic attributes are used to model latent travel demand through graph-transformer-based node interactions, while spatial structure is injected into the generation process as explicit constraints. The adjacency matrix guides attention weights to strengthen interactions between neighboring regions. Meanwhile, the distance matrix serves as a diffusion condition to capture spatial proximity and travel impedance. The fusion of urban semantics and spatial constraints enables SEDAN to generate OD matrices that are both behaviorally plausible and geographically coherent. Experiments on real-world OD datasets from U.S. cities show that SEDAN achieves a 7.38% improvement in RMSE over the state-of-the-art baseline, WEDAN. It also remains robust across heterogeneous urban scenarios and varying structural patterns. Our work provides an effective and generalizable solution for commuting OD matrix generation. The code is available at https://anonymous.4open.science/r/SEDAN.
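
The abstract does not say how the adjacency matrix guides attention. One common way to do this, assumed here purely for illustration, is to add a bias to the attention logits of adjacent region pairs before the softmax. The function names and the `bias` parameter are hypothetical and are not taken from the SEDAN code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adjacency_biased_attention(scores, adjacency, bias=1.0):
    """Add `bias` to the logits of adjacent region pairs, then normalize each row.

    Assumption: an additive logit bias; the paper's actual mechanism may differ.
    """
    weights = []
    for score_row, adj_row in zip(scores, adjacency):
        logits = [s + bias * a for s, a in zip(score_row, adj_row)]
        weights.append(softmax(logits))
    return weights


# Three regions on a line: 0 - 1 - 2 (toy example).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
scores = [[0.0, 0.0, 0.0] for _ in range(3)]  # uniform raw attention scores
weights = adjacency_biased_attention(scores, adjacency)
print(weights[0])
```

With uniform raw scores, region 0 places more attention on its neighbor (region 1) than on itself or on the non-adjacent region 2, which is the qualitative effect the abstract describes.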

2605.00936 2026-05-05 cs.LG cs.AI

EventADL: Open-Box Anomaly Detection and Localization Framework for Events in Cloud-Based Service Systems

Luan Pham, Victor Nicolet, Joey Dodds, Hui Guan, Daniel Kroening

Comments This paper has been accepted to the FSE'26 Conference - Research Track

2605.00935 2026-05-05 cs.LG cs.CV

Watch Your Step: Information Injection in Diffusion Models via Shadow Timestep Embedding

An Huang, Junggab Son, Zuobin Xiong

Comments 14 pages, accepted to ICML 2026

2605.00933 2026-05-05 cs.LG cs.AI

CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining

Hada Melino Muhammad, Zechen Li, Flora Salim, Ahmed A. Metwally

2605.00931 2026-05-05 cs.LG cs.DC cs.IT math.IT

Hierarchical Federated Learning for Networked AI: From Communication Saving to Architecture-Aware Design

Seyed Mohammad Azimi-Abarghouyi, Mehdi Bennis, Leandros Tassiulas

2605.00929 2026-05-05 cs.LG cs.AI

PhaseNet++: Phase-Aware Frequency-Domain Anomaly Detection for Industrial Control Systems via Phase Coherence Graphs

Raviteja Bommireddy, Varshith Bandaru, Lohith Pakala, Pradeep Kumar B

Comments 9 pages, 1 figure

2605.00926 2026-05-05 cs.LG math.PR

A Review of the Receiver Operating Characteristic Curve and a Proof About the Area Beneath It

Steven Redolfi

2605.00925 2026-05-05 cs.LG cs.CV q-bio.QM

Linking spatial biology and clinical histology via Haiku

Yan Cui, Jacob S. Leiby, Wenhui Lei, Dokyoon Kim, Yanxiang Deng, Aaron T. Mayer, Zhenqin Wu, Alexandro E. Trevino, Zhi Huang

Abstract

Integrating molecular, morphological, and clinical data is essential for basic and translational biomedical research, yet systematic frameworks for jointly modeling these modalities remain limited. Here we present Haiku, a tri-modal contrastive learning model trained on multiplexed immunofluorescence (mIF). It comprises 26.7 million spatial proteomics patches from 3,218 tissue sections across 1,606 patients spanning 11 organ types, with matched hematoxylin and eosin (H&E) histology and clinical metadata aligned in a shared embedding space. Haiku enables three-way cross-modal retrieval, improves downstream classification and clinical prediction tasks over unimodal baselines, and supports zero-shot biomarker inference through fusion retrieval conditioned on clinical metadata-only text descriptions. Across tasks, Haiku outperforms competing approaches, achieving cross-modal retrieval (Recall@50 up to 0.611 versus near-zero baseline), survival prediction (C-index 0.737, +7.91% relative improvement), and zero-shot biomarker inference (mean Pearson correlation 0.718 across 52 biomarkers). Furthermore, we introduce a counterfactual prediction framework in which modifying only clinical metadata while fixing tissue morphology surfaces niche-specific molecular shifts associated with breast cancer stage progression and lung cancer survival outcomes. In a lung adenocarcinoma case study, the counterfactual analysis recovers niche-specific shifts characterized by increased CD8 and granzyme B, reduced PD-L1, and decreased Ki67, broadly consistent with patterns reported for favorable outcomes. We present these counterfactual results as exploratory, hypothesis-generating signals rather than mechanistic claims. These capabilities demonstrate that tri-modal alignment via Haiku enables integrative analysis of spatial biology, bridging molecular measurements with clinical context for biological exploration.
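
The Recall@50 figure quoted above is a standard retrieval metric. The sketch below shows how Recall@K is usually computed from a query-by-candidate similarity matrix, assuming query i's true match is candidate i. The similarity values are toy numbers, not results from the paper, and `recall_at_k` is a hypothetical helper name.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose matched item (same index) ranks in the top k."""
    hits = 0
    for i, row in enumerate(similarity):
        # Rank candidate indices by descending similarity to query i.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Rows: queries in one modality; columns: candidates in another (toy values).
similarity = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.2, 0.6, 0.5],
]
print(recall_at_k(similarity, 1), recall_at_k(similarity, 2))
```

Here only the first query retrieves its match at rank 1, while all three matches appear within the top 2, so the metric rises from one third to 1.0 as K grows.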

2605.00916 2026-05-05 cs.CV

SAMamba3D: adapting Segment Anything for generalizable 3D segmentation of multiphase pore-scale images

Rui Zhang, Xianzhi Song, Linqi Zhu, Branko Bijeljic, Gensheng Li, Martin J. Blunt

Comments Code available at https://github.com/ImperialCollegeLondon/SAMamba-3D

2605.00915 2026-05-05 cs.CV cs.AI

Rethink MAE with Linear Time-Invariant Dynamics

Zice Wang

Abstract

Standard representation probing for visual models relies on mathematically permutation-invariant operations like Global Average Pooling (GAP) or CLS tokens, treating patch representations as an unstructured bag-of-words. We challenge this paradigm by demonstrating that token order is a critical, exploitable dimension in frozen visual representations (e.g., MAE, BEiT, DINOv2, and ViT as CLS-ablation extreme). We propose SSMProbe, a probing framework driven by a State Space Model (SSM). Operating as discrete Linear Time-Invariant (LTI) dynamical systems, SSMs act as permutation-sensitive probes where sequence order strictly dictates the final state due to inherent memory decay. Formulating token ordering as an information scheduling problem, we compare fixed scan heuristics against a differentiable soft permutation (Sinkhorn-based) learned from downstream supervision. Evaluations on standard and fine-grained classification benchmarks reveal a striking order gap: while fixed scans fail dramatically on highly localized patch features, our learned soft permutation successfully extracts highly competitive performance from otherwise heavily localized patch sequences. We find that pre-training objectives fundamentally shape token structure: DINOv2 concentrates global semantics in optimized CLS tokens leaving patches hyperspecialized, pure MAE preserves distributed representations with heterogeneous patch informativeness, and ViT represents a supervised CLS-dominated extreme. BEiT occupies middle ground. This heterogeneity is order-dependent -- meaning the SSM probe's performance depends critically on which tokens are placed at which temporal positions -- and is not merely a topological property of the spatial grid. SSMProbe's learned routing effectively discovers and exploits this heterogeneity, offering a powerful new diagnostic lens for visual representation analysis.
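
The contrast between permutation-invariant pooling and an order-sensitive LTI readout can be seen in a scalar toy case. This is a minimal sketch, not SSMProbe itself: it assumes the one-dimensional recurrence h_t = a * h_{t-1} + b * x_t with h_0 = 0, and the values of `a`, `b`, and the tokens are chosen only for illustration.

```python
def mean_pool(tokens):
    """Permutation-invariant readout: the average of the token values."""
    return sum(tokens) / len(tokens)


def lti_final_state(tokens, a=0.5, b=1.0):
    """Final state of the scalar LTI recurrence h_t = a * h_{t-1} + b * x_t."""
    h = 0.0
    for x in tokens:
        # Earlier inputs are multiplied by `a` once per later step, so they decay.
        h = a * h + b * x
    return h


tokens = [4.0, 0.0, 0.0, 0.0]       # the informative token comes first
reordered = list(reversed(tokens))  # the same tokens, informative one last

print(mean_pool(tokens), mean_pool(reordered))
print(lti_final_state(tokens), lti_final_state(reordered))
```

Mean pooling returns the same value for both orderings, whereas the recurrence yields a much smaller final state when the informative token is placed first, because its contribution decays by a factor of `a` at every subsequent step. This is the sense in which token order acts as an information schedule for such a probe.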

2605.00913 2026-05-05 cs.CV cs.AI

Leveraging Imperfect Medical Data: A Manifold-Consistent Spatio-Temporal Network for Sensor-based Human Activity Recognition

Jiangtao Fan, Anish Jindal, Amir Atapour-Abarghouei

2605.00912 2026-05-05 cs.CV

Object-Level Explanations for Image Geolocation Models: a GeoGuessr use-case

Emilie Durrieu, Christophe Hurter, Philippe Muller, Victor Boutin

2605.00911 2026-05-05 cs.CV

When Good OCR Is Not Enough: Benchmarking OCR Robustness for Retrieval-Augmented Generation

Lin Sun, Wang Dexian, Jingang Huang, Linglin Zhang, Change Jia, Zhengwei Cheng, Xiangzheng Zhang

2605.00909 2026-05-05 cs.AI cond-mat.mtrl-sci cs.LG

Accelerating battery research with an AI interface between FINALES and Kadi4Mat

Giovanna Tosato, Leon Merker, Monika Vogler, Michael Selzer, Arnd Koeppe

Comments Main manuscript: 21 pages, 9 figures. Supporting material: 3 pages, 5 figures. Submitted to "Batteries & Supercaps", currently under revision

2605.00907 2026-05-05 cs.CV cs.AI cs.LG

TRIP-Evaluate: An Open Multimodal Benchmark for Evaluating Large Models in Transportation

Han Gong, Zhen Zhou, Yunyang Shi, Yan Tan, Jinbiao Huo, Qi Hong, Zhiyuan Liu

Comments 19 pages, 12 figures

2605.00906 2026-05-05 cs.CV cs.AI cs.LG

Generalized Category Discovery under Domain Shifts: From Vision to Vision-Language Models

Hongjun Wang, Po Hu, Kai Han

Comments Submission to TPAMI