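arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.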

2511.09363 2026-04-17 cs.AI cs.SY eess.SY

BarrierBench: Evaluating Large Language Models for Safety Verification in Dynamical Systems

Ali Taheri, Alireza Taban, Sadegh Soudjani, Ashutosh Trivedi

Comments 8th Annual Learning for Dynamics & Control Conference

Details

English abstract

Safety verification of dynamical systems via barrier certificates is essential for ensuring correctness in autonomous applications. Synthesizing these certificates involves discovering mathematical functions, and current methods suffer from poor scalability, dependence on carefully designed templates, and exhaustive or incremental function-space searches. They also demand substantial manual expertise (selecting templates, solvers, and hyperparameters, and designing sampling strategies), requiring both theoretical and practical knowledge traditionally shared through linguistic reasoning rather than formalized methods. This motivates a key question: can such expert reasoning be captured and operationalized by language models? We address this by introducing an LLM-based agentic framework for barrier certificate synthesis. The framework uses natural language reasoning to propose, refine, and validate candidate certificates, integrating LLM-driven template discovery with SMT-based verification, and supporting barrier-controller co-synthesis to ensure consistency between safety certificates and controllers. To evaluate this capability, we introduce BarrierBench, a benchmark of 100 dynamical systems spanning linear, nonlinear, discrete-time, and continuous-time settings. Our experiments assess not only the effectiveness of LLM-guided barrier synthesis but also the utility of retrieval-augmented generation and agentic coordination strategies in improving its reliability and performance. Across these tasks, the framework achieves more than 90% success in generating valid certificates. By releasing BarrierBench and the accompanying toolchain, we aim to establish a community testbed for advancing the integration of language-based reasoning with formal verification in dynamical systems. The benchmark is publicly available at https://hycodev.com/dataset/barrierbench

2511.09149 2026-04-17 cs.LG cs.AI cs.MA

Enabling Agents to Communicate Entirely in Latent Space

Zhuoyun Du, Runze Wang, Huiyu Bai, Zouying Cao, Xiaoyong Zhu, Yu Cheng, Bo Zheng, Wei Chen, Haochao Ying

Comments Accepted to ACL 2026

2511.07412 2026-04-17 cs.CV cs.RO

TwinOR: Photorealistic Digital Twins of Dynamic Operating Rooms for Embodied AI Research

Han Zhang, Yiqing Shen, Roger D. Soberanis-Mukul, Ankita Ghosh, Hao Ding, Lalithkumar Seenivasan, Jose L. Porras, Zhekai Mao, Chenjia Li, Wenjie Xiao, Lonny Yarmus, Angela Christine Argento, Masaru Ishii, Mathias Unberath

Details

DOI: 10.1007/s11548-026-03644-w
Journal ref: International Journal of Computer Assisted Radiology and Surgery, 2026

English abstract

Developing embodied AI for intelligent surgical systems requires safe, controllable environments for continual learning and evaluation. However, safety regulations and operational constraints in operating rooms (ORs) limit agents from freely perceiving and interacting in realistic settings. Digital twins provide high-fidelity, risk-free environments for exploration and training. How we may create dynamic digital representations of ORs that capture relevant spatial, visual, and behavioral complexity remains an open challenge. We introduce TwinOR, a real-to-sim infrastructure for constructing photorealistic and dynamic digital twins of ORs. The system reconstructs static geometry and continuously models human and equipment motion. The static and dynamic components are fused into an immersive 3D environment that supports controllable simulation and facilitates future embodied exploration. The proposed framework reconstructs complete OR geometry with centimeter-level accuracy while preserving dynamic interaction across surgical workflows. In our experiments, TwinOR synthesizes stereo and monocular RGB streams as well as depth observations for geometry understanding and visual localization tasks. Models such as FoundationStereo and ORB-SLAM3 evaluated on TwinOR-synthesized data achieve performance within their reported accuracy ranges on real-world indoor datasets, demonstrating that TwinOR provides sensor-level realism sufficient for emulating real-world perception and localization challenges. By establishing a perception-grounded real-to-sim pipeline, TwinOR enables the automatic construction of dynamic, photorealistic digital twins of ORs. As a safe and scalable environment for experimentation, TwinOR opens new opportunities for translating embodied intelligence from simulation to real-world clinical environments.

2510.24538 2026-04-17 cs.CL

Dark & Stormy: Modeling Humor in Sentences from the Bulwer-Lytton Fiction Contest

Venkata S Govindarajan, Laura Biester

2510.10649 2026-04-17 cs.AI

Unlocking Exploration in RLVR: Uncertainty-aware Advantage Shaping for Deeper Reasoning

Can Xie, Ruotong Pan, Xiangyu Wu, Yunfei Zhang, Jiayi Fu, Tingting Gao, Guorui Zhou

Comments 20 pages, 4 figures, ACL 2026

2510.08055 2026-04-17 cs.LG cs.DC

From Tokens to Layers: Redefining Stall-Free Scheduling for MoE Serving with Layered Prefill

Gunjun Lee, Jiwon Kim, Jaiyoung Park, Younjoo Lee, Jung Ho Ahn

Comments 24 pages, 5 figures, 12 tables, accepted at MLSys 2026

2510.07890 2026-04-17 cs.CL

Standard-to-Dialect Transfer Trends Differ across Text and Speech: A Case Study on Intent and Topic Classification in German Dialects

Verena Blaschke, Miriam Winkler, Barbara Plank

Comments ACL 2026 (main)

2510.04116 2026-04-17 cs.AI

Searching Meta Reasoning Skeleton to Guide LLM Reasoning

Ziying Zhang, Yaqing Wang, Quanming Yao

2509.26007 2026-04-17 cs.SD cs.AI cs.LG

MARS: Sound Generation via Multi-Channel Autoregression on Spectrograms

Eleonora Ristori, Luca Bindini, Paolo Frasconi

Comments Accepted at IJCNN 2026 (to appear in IEEE/IJCNN proceedings). This arXiv submission corresponds to the camera-ready version

2509.24886 2026-04-17 cs.LG

Adaptive Canonicalization with Application to Invariant Anisotropic Geometric Networks

Ya-Wei Eileen Lin, Ron Levie

2509.15946 2026-04-17 cs.SD eess.AS eess.SP

Differentiable Acoustic Radiance Transfer

Sungho Lee, Matteo Scerbo, Seungu Han, Min Jun Choi, Kyogu Lee, Enzo De Sena

Comments Accepted to TASLPRO

2509.01728 2026-04-17 cs.RO cs.LG cs.LO

Constrained Decoding for Safe Robot Navigation Foundation Models

Parv Kapoor, Akila Ganlath, Michael Clifford, Changliu Liu, Sebastian Scherer, Eunsuk Kang

2508.15396 2026-04-17 cs.CL

Attribution, Citation, and Quotation: A Survey of Evidence-based Text Generation with Large Language Models

Tobias Schreieder, Tim Schopf, Michael Färber

Comments Accepted at ACL 2026

2507.23121 2026-04-17 cs.CL cs.AI

Uncovering the Fragility of Trustworthy LLMs through Chinese Textual Ambiguity

Xinwei Wu, Haojie Li, Hongyu Liu, Xinyu Ji, Ruohan Li, Yule Chen, Yigeng Zhang

Comments Accepted at KDD workshop on Evaluation and Trustworthiness of Agentic and Generative AI Models (Agentic & GenAI Evaluation Workshop KDD '25)

2507.15066 2026-04-17 cs.LG cs.AI cs.MM

Time-RA: Towards Time Series Reasoning for Anomaly Diagnosis with LLM Feedback

Yiyuan Yang, Zichuan Liu, Lei Song, Kai Ying, Zhiguang Wang, Tom Bamford, Svitlana Vyetrenko, Jiang Bian, Qingsong Wen

Comments ACL 2026 Findings. 27 pages, 11 figures, 15 tables. Code and dataset are publicly available

2506.19807 2026-04-17 cs.AI cs.CL cs.CV cs.LG cs.MA

KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

Baochang Ren, Shuofei Qiao, Da Zheng, Huajun Chen, Ningyu Zhang

Comments ACL 2026

2505.20214 2026-04-17 cs.AI

When Slower Isn't Truer: Inverse Scaling Law of Truthfulness in Multimodal Reasoning

Sitong Fang, Wenjing Cao, Jiahao Li, Xuyao Wang, Juntao Dai, Chi-Min Chan, Sirui Han, Yike Guo, Yaodong Yang, Jiaming Ji

Comments Accepted at ACL 2026

2505.10846 2026-04-17 cs.LG cs.CR

AutoRAN: Automated Hijacking of Safety Reasoning in Large Reasoning Models

Jiacheng Liang, Tanqiu Jiang, Yuhui Wang, Rongyi Zhu, Fenglong Ma, Ting Wang

Comments 10 pages, ACL 2026 Main

2504.16455 2026-04-17 cs.CV

Cross Paradigm Representation and Alignment Transformer for Image Deraining

Shun Zou, Yi Zou, Juncheng Li, Guangwei Gao, Guojun Qi

Comments ACM MM2025. Code: https://github.com/zs1314/CPRAformer

2502.21274 2026-04-17 cs.LG cs.AI q-bio.BM

BAnG: Bidirectional Anchored Generation for Conditional RNA Design

Roman Klypa, Alberto Bietti, Sergei Grudinin

2502.21029 2026-04-17 cs.RO cs.LG

Sixth-Sense: Self-Supervised Learning of Spatial Awareness of Humans from a Planar Lidar

Simone Arreghini, Nicholas Carlotti, Mirko Nava, Antonio Paolillo, Alessandro Giusti

2502.12222 2026-04-17 cs.LG cs.AI

IMPACTX: improving model performance by appropriately constraining the training with teacher explanations

Andrea Apicella, Salvatore Giugliano, Francesco Isgrò, Andrea Pollastro, Roberto Prevete

Comments Published on Artificial Intelligence Review

2502.04689 2026-04-17 cs.CL cs.AI cs.LG

Improving Language Models with Intentional Analysis

Yuwei Yin, Giuseppe Carenini

Comments Code at https://github.com/YuweiYin/IA

2410.17448 2026-04-17 cs.CL cs.AI

In Context Learning and Reasoning for Symbolic Regression with Large Language Models

Samiha Sharlin, Tyler R. Josephson

2410.01540 2026-04-17 cs.CV cs.AI cs.GR cs.LG

Edge-preserving noise for diffusion models

Jente Vandersanden, Sascha Holl, Xingchang Huang, Gurprit Singh

2310.17591 2026-04-17 cs.CL

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways

Venkata S Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, Kyle Mahowald

Comments Proceedings of the BabyLM Challenge

2309.11452 2026-04-17 cs.AI math.OC

Using deep learning to construct stochastic local search SAT solvers with performance bounds

Maximilian J. Kramer, Paul Boes, Jens Eisert

Comments 24 pages, significantly updated version with new datasets and experiments. Code available at https://github.com/porscheofficial/sls_sat_solving_with_deep_learning. Accepted for publication in Machine Learning: Science and Technology 7 (2026) 025057

2305.16409 2026-04-17 cs.CL cs.CY

Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias

Venkata S Govindarajan, Kyle Mahowald, David I. Beaver, Junyi Jessy Li

Comments To appear in Findings of ACL 2023

2210.09817 2026-04-17 cs.LG

Universal hidden monotonic trend estimation with contrastive learning

Edouard Pineau, Sébastien Razakarivony

2209.14774 2026-04-17 cs.CV cs.AI

RECALL: Rehearsal-free Continual Learning for Object Classification

Markus Knauer, Maximilian Denninger, Rudolph Triebel

Comments Accepted as contributed paper at the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)