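arXivDaily is a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations. It covers artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.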

2603.18940 2026-03-30 cs.CL cs.LG

Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought

Xinghao Zhao

2603.18739 2026-03-30 cs.CV

EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation

Longfei Liu, Yongjie Hou, Yang Li, Qirui Wang, Youyang Sha, Yongjun Yu, Yinzhi Wang, Peizhe Ru, Xuanlong Yu, Xi Shen

Comments Code is available at: https://intellindust-ai-lab.github.io/projects/EdgeCrafter/

2603.17576 2026-03-30 cs.CV

LoGSAM: Parameter-Efficient Cross-Modal Grounding for MRI Segmentation

Mohammad Robaitul Islam Bhuiyan, Sheethal Bhat, Melika Qahqaie, Tri-Thien Nguyen, Paula Andrea Perez-Toro, Tomas Arias-Vergara, Andreas Maier

Comments 10 pages, 3 figures

2603.17528 2026-03-30 cs.CV

MM-OVSeg: Multimodal Optical-SAR Fusion for Open-Vocabulary Segmentation in Remote Sensing

Yimin Wei, Aoran Xiao, Hongruixuan Chen, Junshi Xia, Naoto Yokoya

Comments CVPR2026

2603.15636 2026-03-30 cs.AI

AIDABench: AI Data Analytics Benchmark

Yibo Yang, Fei Lei, Yixuan Sun, Yantao Zeng, Chengguang Lv, Jiancao Hong, Jiaojiao Tian, Tianyu Qiu, Xin Wang, Yanbing Chen, Yanjie Li, Zheng Pan, Xiaochen Zhou, Guanzhou Chen, Haoran Lv, Yuning Xu, Yue Ou, Haodong Liu, Shiqi He, Anya Jia, Yulei Xin, Huan Wu, Liang Liu, Jiaye Ge, Jianxin Dong, Dahua Lin, Wenxiu Sun

Comments 22 pages (including appendix), 9 figures, 4 tables. Code: https://github.com/MichaelYang-lyx/AIDABench. Dataset: https://huggingface.co/datasets/MichaelYang-lyx/AIDA

2603.15304 2026-03-30 cs.CV

UE5-Forest: A Photorealistic Synthetic Stereo Dataset for UAV Forestry Depth Estimation

Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

2603.15195 2026-03-30 cs.LG

Massive Redundancy in Gradient Transport Enables Sparse Online Learning

Aur Shalev Merin

Comments 26 pages, 5 figures, 14 tables

Abstract

Real-time recurrent learning (RTRL) computes exact online gradients by propagating a Jacobian tensor forward through recurrent dynamics, but at O(n^4) cost per step. Prior work has sought structured approximations (rank-1 compression, graph-based sparsity, Kronecker factorization). We show that, in the continuous error signal regime, the recurrent Jacobian is massively redundant: propagating through a random 6% of paths (k=4 of n=64) recovers 84 +/- 6% of full RTRL's adaptation ability across five seeds, and the absolute count k=4 remains effective from n=64 to n=256 (6% to 1.6%, recovery 84 to 78%), meaning sparse RTRL becomes relatively cheaper as networks grow. In RNNs, the recovery is selection-invariant (even adversarial path selection works) and exhibits a step-function transition from zero to any nonzero propagation. Spectral analysis reveals the mechanism: the Jacobian is full-rank but near-isotropic (condition numbers 2.6-6.5), so any random subset provides a directionally representative gradient estimate. On chaotic dynamics (Lorenz attractor), sparse propagation is more numerically stable than full RTRL (CV 13% vs. 88%), as subsampling avoids amplifying pathological spectral modes. The redundancy extends to LSTMs (k=4 matches full RTRL) and to transformers via sparse gradient transport (50% head sparsity outperforms the dense reference; 33% is borderline), with higher thresholds reflecting head specialization rather than isotropy. On real primate neural data, sparse RTRL (k=4) adapts online to cross-session electrode drift (80 +/- 11% recovery, 5 seeds), where sparse propagation is again more stable than full RTRL. Without continuous error signal, Jacobian propagation accumulates numerical drift and degrades all RTRL variants, a scope condition for all forward-mode methods. Results hold with SGD (92 +/- 1% recovery), suggesting independence from optimizer choice.
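The path percentages quoted in the abstract are consistent with reading "k of n" as the ratio k/n. The sketch below only reproduces that arithmetic; the function name `sparse_fraction` is a hypothetical helper introduced for illustration, and the k/n reading is an assumption, not something taken from the paper's code.

```python
def sparse_fraction(k: int, n: int) -> float:
    """Fraction of paths kept when k of n are propagated (assumed to be k / n)."""
    if not 0 < k <= n:
        raise ValueError("expected 0 < k <= n")
    return k / n


# Fixed absolute count k=4 at the two network sizes named in the abstract.
for n in (64, 256):
    kept = sparse_fraction(4, n)
    print(f"n={n}: k=4 keeps {kept * 100:.4f}% of paths")
```

With k held at 4, the kept fraction falls from 6.25% at n=64 to 1.5625% at n=256, which matches the abstract's rounded "6% to 1.6%".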

2603.14688 2026-03-30 cs.LG cs.AI cs.SE

AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems

Zhaohui Geoffrey Wang

Comments 11 pages, 1 figure, 19 tables. Published at ICLR 2026 Workshop on Agents in the Wild. Camera-ready version with revised layout and framework overview figure

2603.14375 2026-03-30 cs.CV cs.AI

The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics

Xiangbo Gao, Mingyang Wu, Siyuan Yang, Jiongze Yu, Pardis Taghavi, Fangzhou Lin, Zhengzhong Tu

2603.13352 2026-03-30 cs.CV

Local Precise Refinement: A Dual-Gated Mixture-of-Experts for Enhancing Foundation Model Generalization against Spectral Shifts

Xi Chen, Maojun Zhang, Yu Liu, Shen Yan

2603.12760 2026-03-30 cs.CV

HIFICL: High-Fidelity In-Context Learning for Multimodal Tasks

Xiaoyu Li, Yuhang Liu, Xuanshuo Kang, Zheng Luo, Fangqi Lou, Xiaohua Wu, Zihan Xiong

Comments Accepted to CVPR 2026. Code available at https://github.com/bbbandari/HiFICL

2603.12206 2026-03-30 cs.CL

CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

Alexandre Le Mercier, Thomas Demeester, Chris Develder

Comments 22 pages, 6 figures

2603.11601 2026-03-30 cs.AI

See, Symbolize, Act: Grounding VLMs with Spatial Representations for Better Gameplay

Ashish Baghel, Paras Chopra

Comments 11 pages, 13 figures. Accepted to LMReasoning Workshop at AAAI 2026

2603.06178 2026-03-30 cs.CV

Making Training-Free Diffusion Segmentors Scale with the Generative Power

Benyuan Meng, Qianqian Xu, Zitai Wang, Xiaochun Cao, Longtao Huang, Qingming Huang

Comments Accepted to CVPR 2026

2603.00717 2026-03-30 cs.CV

Leveraging Arbitrary Data Sources for AI-Generated Image Detection Without Sacrificing Generalization

Qinghui He, Haifeng Zhang, Xiuli Bi, Bo Liu, Chi-Man Pun, Bin Xiao

Comments Accepted to CVPR Findings 2026

2602.22949 2026-03-30 cs.CV

OpenFS: Multi-Hand-Capable Fingerspelling Recognition with Implicit Signing-Hand Detection and Frame-Wise Letter-Conditioned Synthesis

Junuk Cha, Jihyeon Kim, Han-Mu Park

Comments Accepted to CVPR 2026, camera-ready version

2602.22025 2026-03-30 cs.CV

Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments

Shuang Song, Debao Huang, Deyan Deng, Haolin Xiong, Yang Tang, Yajie Zhao, Rongjun Qin

Comments CVPR 2026

2602.21100 2026-03-30 cs.CV cs.GR

Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction

Noé Artru, Rukhshanda Hussain, Emeline Got, Alexandre Messier, David B. Lindell, Abdallah Dib

Comments For our project page, see https://ubisoft-laforge.github.io/character/skullptor/

2602.20396 2026-03-30 cs.LG stat.ME

cc-Shapley: Measuring Multivariate Feature Importance Needs Causal Context

Jörg Martin, Stefan Haufe

2602.19623 2026-03-30 cs.CV cs.AI cs.HC

PedaCo-Gen: Scaffolding Pedagogical Agency in Human-AI Collaborative Video Authoring

Injun Baek, Yearim Kim, Nojun Kwak

2602.19530 2026-03-30 cs.CV

ORION: ORthonormal Text Encoding for Universal VLM AdaptatION

Omprakash Chakraborty, Jose Dolz, Ismail Ben Ayed

Journal ref Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

2602.18846 2026-03-30 cs.CV cs.AI

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference

Aditya Kumar Singh, Hitesh Kandala, Pratik Prabhanjan Brahma, Zicheng Liu, Emad Barsoum

Comments 15 Pages, 8 figures, 15 tables, CVPR 2026; Code: https://github.com/AMD-AGI/DUET-VLM

2602.18709 2026-03-30 cs.CV cs.RO

IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping

Tingyang Xiao, Liu Liu, Wei Feng, Zhengyu Zou, Xiaolin Zhou, Wei Sui, Hao Li, Dingwen Zhang, Zhizhong Su

2602.11391 2026-03-30 cs.CL

Advancing AI Trustworthiness Through Patient Simulation: Risk Assessment of Conversational Agents for Antidepressant Selection

Md Tanvir Rouf Shawon, Mohammad Sabik Irbaz, Hadeel R. A. Elyazori, Keerti Reddy Resapu, Yili Lin, Vladimir Franzuela Cardenas, Farrokh Alemi, Kevin Lybarger

2602.08277 2026-03-30 cs.CV cs.AI

PISCO: Precise Video Instance Insertion with Sparse Control

Xiangbo Gao, Renjie Li, Xinghao Chen, Yuheng Wu, Suofei Feng, Qing Yin, Zhengzhong Tu

Abstract

The landscape of AI video generation is undergoing a pivotal shift: moving beyond general generation, which relies on exhaustive prompt-engineering and "cherry-picking", towards fine-grained, controllable generation and high-fidelity post-processing. In professional AI-assisted filmmaking, it is crucial to perform precise, targeted modifications. A cornerstone of this transition is video instance insertion, which requires inserting a specific instance into existing footage while maintaining scene integrity. Unlike traditional video editing, this task imposes several requirements: precise spatial-temporal placement, physically consistent scene interaction, and the faithful preservation of original dynamics, all achieved with minimal user effort. In this paper, we propose PISCO, a video diffusion model for precise video instance insertion with arbitrary sparse keyframe control. PISCO allows users to specify a single keyframe, start-and-end keyframes, or sparse keyframes at arbitrary timestamps, and automatically propagates object appearance, motion, and interaction. To address the severe distribution shift induced by sparse conditioning in pretrained video diffusion models, we introduce Variable-Information Guidance for robust conditioning and Distribution-Preserving Temporal Masking to stabilize temporal generation, together with geometry-aware conditioning for realistic scene adaptation. We further construct PISCO-Bench, a benchmark with verified instance annotations and paired clean background videos, and evaluate performance using both reference-based and reference-free perceptual metrics. Experiments demonstrate that PISCO consistently outperforms strong inpainting and video editing baselines under sparse control, and exhibits clear, monotonic performance improvements as additional control signals are provided. Project page: xiangbogaobarry.github.io/PISCO.

2602.07374 2026-03-30 cs.CL cs.AI

TernaryLM: Memory-Efficient Language Modeling via Native 1.5-Bit Quantization with Adaptive Layer-wise Scaling

Nisharg Nargund, Priyesh Shukla

2602.03220 2026-03-30 cs.CV

PokeFusion Attention: A Lightweight Cross-Attention Mechanism for Style-Conditioned Image Generation

Jingbang Tang

Comments 12 pages, 5 figures. Revised version with improved method description and corrected references

2601.21419 2026-03-30 cs.LG cs.CV

Revisiting Diffusion Model Predictions Through Dimensionality

Qing Jin, Chaoyang Wang

Comments 19 pages, 5 figures

2601.19933 2026-03-30 cs.CL cs.AI cs.LG

NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

Kei Saito

Comments 25 pages, 5 figures, 7 tables. Replacement synced to repository snapshot v39. Series hub link: https://github.com/kei-saito-research/nrr-series-hub

2601.17468 2026-03-30 cs.CV cs.LG

ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation

Chia-Ming Lee, Yu-Fan Lin, Jin-Hui Jiang, Yu-Jou Hsiao, Chih-Chung Hsu, Yu-Lun Liu

Comments CVPR 2026 Camera Ready; Project page: https://wuw2135.github.io/ReflexSplit-ProjectPage/
