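arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.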

2604.22215 2026-04-27 cs.CL cs.AI

Verbal Confidence Saturation in 3-9B Open-Weight Instruction-Tuned LLMs: A Pre-Registered Psychometric Validity Screen

Jon-Paul Cacioli

Comments 10 pages, 3 figures, 4 tables, 1 appendix. Pre-registered: osf.io/azbvx. Code and data: github.com/synthiumjp/koriat

Abstract

Verbal confidence elicitation is widely used to extract uncertainty estimates from LLMs. We tested whether seven instruction-tuned open-weight models (3-9B parameters, four families) produce verbalised confidence that meets minimal validity criteria for item-level Type-2 discrimination under minimal numeric elicitation with greedy decoding. In a pre-registered study (OSF: osf.io/azbvx), 524 TriviaQA items were administered under numeric (0-100) and categorical (10-class) elicitation to eight models at Q5_K_M quantisation on consumer hardware, yielding 8,384 deterministic trials. A psychometric validity screen was applied to each model-format cell. All seven instruct models were classified Invalid on numeric confidence (H2 confirmed, 7/7 vs. predicted >=4/7), with a mean ceiling rate of 91.7% (H1 confirmed). Categorical elicitation did not rescue validity. Instead, it disrupted task performance in six of seven models, producing accuracy below 5% (H4 not confirmed). Token-level log-probability did not usefully predict verbalised confidence under the observed variance regime (H5 confirmed, mean cross-validated R^2 < 0.01). Within the reasoning-distilled model, reasoning-trace length showed a strong negative partial correlation with confidence (rho = -0.36, p < .001), consistent with the Reasoning Contamination Effect. These results do not imply that internal uncertainty representations are absent. They show that minimal verbal elicitation fails to preserve internal signals at the output interface in this model-size regime. Psychometric screening should precede any downstream use of such signals.
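
The two quantities the abstract leans on, ceiling rate and Type-2 discrimination, can be illustrated with a minimal sketch. This is not the paper's validity screen: the function names, the choice of a rank-based AUROC as the Type-2 measure, and the sample data are all assumptions made for illustration. The only figure taken from the abstract is the trial count, which follows from 524 items, 2 elicitation formats, and 8 models.

```python
from typing import Sequence


def ceiling_rate(confidences: Sequence[float], ceiling: float = 100.0) -> float:
    """Fraction of trials whose stated confidence sits at the scale maximum."""
    return sum(c >= ceiling for c in confidences) / len(confidences)


def type2_auroc(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Probability that a correct trial receives higher confidence than an
    incorrect one, with ties counted as one half (0.5 means no discrimination)."""
    hits = [c for c, ok in zip(confidences, correct) if ok]
    misses = [c for c, ok in zip(confidences, correct) if not ok]
    if not hits or not misses:
        raise ValueError("need at least one correct and one incorrect trial")
    wins = sum((h > m) + 0.5 * (h == m) for h in hits for m in misses)
    return wins / (len(hits) * len(misses))


# Trial count implied by the abstract: 524 items x 2 formats x 8 models.
n_trials = 524 * 2 * 8

# Hypothetical, made-up trials: confidence is almost always at the ceiling.
saturated = [100, 100, 100, 100, 100, 95]
outcomes = [True, False, True, False, True, False]

print(n_trials)
print(ceiling_rate(saturated))
print(type2_auroc(saturated, outcomes))
```

When nearly every response sits at the maximum, most correct/incorrect pairs are ties, so the discrimination score is pulled toward 0.5 regardless of accuracy. That is the intuition behind screening for saturation before using verbal confidence downstream.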

2604.22202 2026-04-27 cs.CV

ArchSym: Detecting 3D-Grounded Architectural Symmetries in the Wild

Hanyu Chen, Ruojin Cai, Steve Marschner, Noah Snavely

Comments project page: https://hanyuc.com/archsym/

2604.22199 2026-04-27 cs.RO cs.AI

An LLM-Driven Closed-Loop Autonomous Learning Framework for Robots Facing Uncovered Tasks in Open Environments

Hong Su

2604.22196 2026-04-27 cs.RO cs.MA

V-STC: A Time-Efficient Multi-Vehicle Coordinated Trajectory Planning Approach

Pengfei Liu, Jialing Zhou, Yuezu Lv, Guanghui Wen, Tingwen Huang

Comments 12 pages, 23 figures

2604.22193 2026-04-27 cs.CL

How Large Language Models Balance Internal Knowledge with User and Document Assertions

Shuowei Li, Haoxin Li, Wenda Chu, Yi Fang

Comments Findings of ACL 2026

2604.22190 2026-04-27 cs.CV cs.AI

From Global to Local: Rethinking CLIP Feature Aggregation for Person Re-Identification

Aotian Zheng, Winston Sun, Bahaa Alattar, Vitaly Ablavsky, Jenq-Neng Hwang

Comments 14 pages, 7 figures

2604.22183 2026-04-27 cs.CV

EvFlow-GS: Event Enhanced Motion Deblurring with Optical Flow for 3D Gaussian Splatting

Feiyu An, Yufei Deng, Zihui Zhang, Rong Xiao

Comments Accepted by ICME 2026

2604.22177 2026-04-27 cs.CV

Uni-Encoder Meets Multi-Encoders: Representation Before Fusion for Brain Tumor Segmentation with Missing Modalities

Peibo Song, Xiaotian Xue, Jinshuo Zhang, Zihao Wang, Jinhua Liu, Shujun Fu, Fangxun Bao, Si Yong Yeo

Comments CVPR 2026 Poster

2604.22174 2026-04-27 cs.CV

Unlocking Optical Prior: Spectrum-Guided Knowledge Transfer for SAR Generalized Category Discovery

Jingyuan Xia, Ruikang Hu, Ye Li, Zhixiong Yang, Xu Lan, Zhejun Lu

2604.22170 2026-04-27 cs.LG cs.IR

Sharpness-Aware Poisoning: Enhancing Transferability of Injective Attacks on Recommender Systems

Junsong Xie, Yonghui Yang, Pengyang Shao, Le Wu

DOI: 10.1109/TKDE.2026.3682785

Abstract

Recommender Systems (RS) have been shown to be vulnerable to injective attacks, where attackers inject limited fake user profiles to promote the exposure of target items to real users for unethical gains (e.g., economic or political advantages). Since attackers typically lack knowledge of the victim model deployed in the target RS, existing methods resort to using a fixed surrogate model to mimic the potential victim model. Despite considerable progress, we argue that the assumption that "poisoned data generated for the surrogate model can be used to attack other victim models" is wishful. When there are significant structural discrepancies between the surrogate and victim models, the attack transferability inevitably suffers. Intuitively, if we can identify the worst-case victim model and iteratively optimize the poisoning effect specifically against it, then the generated poisoned data would be better transferred to other victim models. However, exactly identifying the worst-case victim model during the attack process is challenging due to the large space of victim models. To this end, in this work, we propose a novel attack method called Sharpness-Aware Poisoning (SharpAP). Specifically, it employs the sharpness-aware minimization principle to seek the approximately worst-case victim model and optimizes the poisoned data specifically for this worst-case model. The poisoning attack with SharpAP is formulated as a min-max-min tri-level optimization problem. By integrating SharpAP into the iterative process for attacks, our method can generate more robust poisoned data which is less sensitive to the shift of model structure, mitigating the overfitting to the surrogate model. Comprehensive experimental comparisons on three real-world datasets demonstrate that SharpAP can significantly enhance the attack transferability.
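
The sharpness-aware minimization principle named in the abstract can be illustrated in isolation. The sketch below shows only the generic first-order SAM step on a toy quadratic loss: perturb the parameters along the normalised gradient to an approximately worst-case nearby point, then descend using the gradient taken there. It is not the SharpAP tri-level procedure, and `toy_grad`, `rho`, and `lr` are hypothetical names and values chosen for illustration.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_perturbation(grad: Vector, rho: float) -> Vector:
    """First-order SAM ascent direction: epsilon = rho * g / ||g||_2."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [rho * g / norm for g in grad]


def sam_step(w: Vector, grad_fn: Callable[[Vector], Vector],
             rho: float, lr: float) -> Vector:
    """One SAM update: move to the approximate worst-case neighbour of w,
    then apply that point's gradient to the original parameters."""
    eps = sam_perturbation(grad_fn(w), rho)
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def toy_grad(w: Vector) -> Vector:
    """Gradient of the toy loss w0^2 + 10 * w1^2 (hypothetical stand-in)."""
    return [2.0 * w[0], 20.0 * w[1]]


w = [1.0, 0.0]
for _ in range(3):
    w = sam_step(w, toy_grad, rho=0.1, lr=0.1)
print(w)
```

In the abstract's framing, the inner maximisation plays the role of finding an approximately worst-case victim model; how that step is nested inside the full min-max-min problem is specified in the paper, not here.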

2604.22169 2026-04-27 cs.LG cs.AI cs.IR

ReCast: Recasting Learning Signals for Reinforcement Learning in Generative Recommendation

Peiyan Zhang, Hanmo Liu, Chengxuan Tong, Yuxia Wu, Wei Guo, Yong Liu

2604.22168 2026-04-27 cs.LG cs.SY eess.SY

Optimal sequential decision-making for error propagation mitigation in digital twins

Annice Najafi, Shokoufeh Mirzaei

2604.22166 2026-04-27 cs.CL

Fine-Grained Analysis of Shared Syntactic Mechanisms in Language Models

Ryoma Kumon, Hitomi Yanaka

Comments Accepted to ACL 2026 Main

2604.22164 2026-04-27 cs.CV

Learning Reactive Human Motion Generation from Paired Interaction Data Using Transformer-Based Models

Masato Soga, Ryuki Takebayashi

Comments 24 pages

2604.22162 2026-04-27 cs.CV

SAMIDARE: Advanced Tracking-by-Segmentation for Dense Scenarios

Shozaburo Hirano, Norimichi Ukita

2604.22160 2026-04-27 cs.CV cs.AI

GenMatter: Perceiving Physical Objects with Generative Matter Models

Eric Li, Arijit Dasgupta, Yoni Friedman, Mathieu Huot, Vikash Mansinghka, Thomas O'Connell, William T. Freeman, Joshua B. Tenenbaum

Comments 25 pages, 12 figures, CVPR 2026

2604.22156 2026-04-27 cs.LG cs.CV

Sum-of-Checks: Structured Reasoning for Surgical Safety with Large Vision-Language Models

Weiqiu You, Cassandra Goldberg, Amin Madani, Daniel A. Hashimoto, Eric Wong

Comments IPCAI 2026 short communication

2604.22154 2026-04-27 cs.LG cs.AI

Reliable Self-Harm Risk Screening via Adaptive Multi-Agent LLM Systems

Meghana Karnam, Ananya Joshi

2604.22153 2026-04-27 cs.CL cs.AI cs.CY

When AI Speaks, Whose Values Does It Express? A Cross-Cultural Audit of Individualism-Collectivism Bias in Large Language Models

Pruthvinath Jeripity Venkata

Comments 13 pages, 7 figures, 9 tables. Data and code: https://github.com/pruthvinathJV/ai-values-misalignment-study

2604.22152 2026-04-27 cs.RO

dWorldEval: Scalable Robotic Policy Evaluation via Discrete Diffusion World Model

Yaxuan Li, Zhongyi Zhou, Yefei Chen, Yaokai Xue, Yichen Zhu

2604.22142 2026-04-27 cs.CL cs.CY

Voice Under Revision: Large Language Models and the Normalization of Personal Narrative

Tom van Nuenen

2604.22139 2026-04-27 cs.CV cs.LG

Anatomy-Aware Unsupervised Detection and Localization of Retinal Abnormalities in Optical Coherence Tomography

Tania Haghighi, Sina Gholami, Hamed Tabkhi, Minhaj Nur Alam

Comments 11 pages, 3 figures, accepted in CVPR-CV4Clinical

2604.22129 2026-04-27 cs.CV cs.RO

PAGaS: Pixel-Aligned 1DoF Gaussian Splatting for Depth Refinement

David Recasens, Robert Maier, Aljaz Bozic, Stephane Grabli, Javier Civera, Tony Tung, Edmond Boyer

2604.22127 2026-04-27 cs.CL cs.LG

Where Should LoRA Go? Component-Type Placement in Hybrid Language Models

Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó

Comments 21 pages, 5 figures, 7 tables. Code and data: https://github.com/hecboar/lora-placement-hybrid

2604.22118 2026-04-27 cs.CV

Robust Camera-to-Mocap Calibration and Verification for Large-Scale Multi-Camera Data Capture

Tianyi Liu, Christopher Twigg, Patrick Grady, Kevin Harris, Shangchen Han, Kun He

2604.22110 2026-04-27 cs.LG

Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement

Mahdi Kallel, Johannes Tölle, Ahmed Hendawy, Carlo D'Eramo

2604.22104 2026-04-27 cs.RO math.DG

Dynamic Coupling and Indirect Control of Jointed Robots Rolling Atop A Moving Platform

Hamidreza Moradi, Scott David Kelly

2604.22102 2026-04-27 cs.RO cs.AI cs.LG

Wiggle and Go! System Identification for Zero-Shot Dynamic Rope Manipulation

Arthur Jakobsson, Abhinav Mahajan, Karthik Pullalarevu, Krishna Suresh, Yunchao Yao, Yuemin Mao, Bardienus Duisterhof, Shahram Najam Syed, Jeffrey Ichnowski

2604.22098 2026-04-27 cs.CL

Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation

Weisi Liu, Guangzeng Han, Xiaolei Huang

Comments Accepted at ACL 2026

2604.22095 2026-04-27 cs.CL

An End-to-End Ukrainian RAG for Local Deployment. Optimized Hybrid Search and Lightweight Generation

Mykola Trokhymovych, Yana Oliinyk, Nazarii Nyzhnyk

Comments To appear at UNLP'26