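arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and other fields.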

2605.01131 2026-05-05 cs.LG cs.AI

Forager: a lightweight testbed for continual learning with partial observability in RL

Steven Tang, Xinze Xiong, Anna Hakhverdyan, Andrew Patterson, Jacob Adkins, Jiamin He, Esraa Elelimy, Parham Mohammad Panahi, Martha White, Adam White

Comments 24 pages, 11 figures

Abstract

In continual reinforcement learning (CRL), good performance requires never-ending learning, acting, and exploration in a big, partially observable world. Most CRL experiments have focused on loss of plasticity -- the inability to keep learning -- in one-off experiments where some unobservable non-stationarity is added to classic fully observable MDPs. Further, these experiments rarely consider the role of partial observability and the importance of CRL agents that use memory or recurrence. One potential reason for this focus on mitigating loss of plasticity without considering partial observability is that many partially observable CRL environments are prohibitively expensive. In this paper, we introduce Forager, a lightweight, partially observable CRL environment with a constant memory footprint. We provide a set of experiments and sample tasks demonstrating that Forager is challenging for current CRL agents and yet also allows for in-depth study of those agents. We demonstrate that agents exhibit loss of plasticity and that proposed mitigations can help, but that leveraging state construction is most useful. We conclude with a variant of Forager that generates an unending stream of new tasks to learn, which clearly highlights the limitations of current CRL agents.

2605.01130 2026-05-05 cs.AI

Iterative Finetuning is Mostly Idempotent

Zephaniah Roe, Jack Sanderson, Dang Nguyen, Julian Huang, Todd Nief, Aryan Shrivastava, Chenhao Tan, Ari Holtzman

2605.01126 2026-05-05 cs.LG

Extreme Weather Bench: A framework and benchmark for evaluation of high-impact weather

Amy McGovern, Taylor Mandelbaum, Daniel Rothenberg, Nicholas Loveday, Corey Potvin, Montgomery Flora, Linus Magnusson, Eric Gilleland, John Allen

2605.01123 2026-05-05 cs.AI

PERSA: Reinforcement Learning for Professor-Style Personalized Feedback with LLMs

Ravi Ranjan, Utkarsh Grover, Xiaomin Lin, Agoritsa Polyzou

Comments 18 pages, 6 figures, 7 tables, accepted to conference ACL-2026, BEA

2605.01122 2026-05-05 cs.LG physics.optics

Machine Learning-Augmented Acceleration of Iterative Ptychographic Reconstruction

Bowen Zheng, Katayun Kamdin, David Shapiro, Alexander Ditter, Dayne Sasaki, Emma Bernard, Roopali Kukreja, Petrus H. Zwart, Slavomír Nemšák, Apurva Mehta, Nicholas Schwarz, Alexander Hexemer, Tanny Chavez

2605.01113 2026-05-05 cs.CV

Disciplined Diffusion: Text-to-Image Diffusion Model against NSFW Generation

Chi Zhang, Changjia Zhu, Xiaowen Li, Yao Liu, Zhuo Lu

2605.01111 2026-05-05 cs.LG cs.AI cs.CL

When Less is Enough: Efficient Inference via Collaborative Reasoning

Yilei Chen, Sharut Gupta, Yannis Paschalidis, Ayush Sekhari, Aldo Pacchiano

2605.01110 2026-05-05 cs.LG cs.SI math.AT stat.ML

Topological Neural Tangent Kernel

Sanjukta Krishnagopal

Comments 9 pages, 4 figures

2605.01107 2026-05-05 cs.LG cond-mat.dis-nn stat.ML

Diffusion Operator Geometry of Feedforward Representations

Kanishka Reddy

2605.01106 2026-05-05 cs.CL cs.AI

Component-Aware Self-Speculative Decoding in Hybrid Language Models

Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó

Comments 29 pages, 1 figure, 9 tables. Code: https://github.com/hecboar/hybrid-speculative-decoding

2605.01102 2026-05-05 cs.AI physics.ao-ph

Towards Multi-Agent Autonomous Reasoning in Hydrodynamics

Jinpai Zhao, Albert Cerrone, Joannes Westerink, Clint Dawson

2605.01100 2026-05-05 cs.AI

A Knowledge-Driven LLM-Based Decision-Support System for Explainable Defect Analysis and Mitigation Guidance in Laser Powder Bed Fusion

Basit Mahmud Shahriar, Md Habibor Rahman

Comments 28 pages, 15 figures

2605.01098 2026-05-05 cs.LG cs.CV

Almost for Free: Crafting Adversarial Examples with Convolutional Image Filters

Alexander Warnecke, Konrad Rieck

2605.01097 2026-05-05 cs.CL cs.AI

Interpretable Difficulty-Aware Knowledge Tracing in Tutor-Student Dialogues

Shuyan Huang, Alexander Scarlatos, Jaewook Lee, Andrew Lan

Comments 11 pages, 5 figures

2605.01096 2026-05-05 cs.LG cs.RO

Learning to Race in Minutes: Infoprop Dyna on the Mini Wheelbot

Devdutt Subhasish, Henrik Hose, Sebastian Trimpe

Comments Originally submitted to the German Robotics Conference, 2026

2605.01089 2026-05-05 cs.LG math.PR stat.CO

Learning Discriminators for Resampling in the Ensemble Gaussian Mixture Filter through a Normalizing Flow Approach

Zain Jabbar, Andrey A. Popov

2605.01084 2026-05-05 cs.CV

Patient-Specific Optimization for Mandibular Reconstruction Planning with Enhanced Bone Union

Hamidreza Aftabi, John E. Lloyd, Amanda Ding, Benedikt Sagl, Eitan Prisman, Antony Hodgson, Sidney Fels

Abstract

Mandibular reconstruction with vascularized bone grafts is complicated by donor-host nonunion, and current virtual surgical planning produces a geometric plan rather than a configuration that explicitly promotes bone union. We present OsteoOpt++, an image-to-decision planning loop for patient-specific mandibular reconstruction. A pre-operative computed tomography (CT) is converted into a personalized digital twin through template-to-patient registration and CT-derived updates of the muscle and temporomandibular-joint parameters. Bayesian optimization with an expected-improvement-plus acquisition rule then searches six clinically controllable cut-plane and donor-positioning variables under an apposition-driven objective and a safety-factor-regularized variant. The workflow was evaluated on three generic defects (body, symphysis, and ramus-body) and a total of 3+1 patient-specific cases, with 3 used for optimization and 1 for validation. In the generic cases, against a common surgical approach, cycle-averaged donor-mandible apposition increased by up to 29 percentage points (329% relative); in the patient-specific cases, against the surgeon-implemented day-5 post-operative configuration, by up to 26 percentage points. A 10% sensitivity analysis over eleven modeling parameters capped the change in the apposition-driven objective at 3% for generic cases and 4% for patient-specific cases, and the longitudinal case showed Dice overlap of 0.70 and 0.76 between predicted apposition and year-1 bone formation. Clinically, this provides surgeons with a pre-operative, image-driven recommendation for cut-plane orientation and donor placement that is predicted to improve union conditions over the configurations currently delivered in the operating room. The optimization and patient-specific modeling code is open source at https://github.com/hamidreza-aftabi/OsteoOpt.
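The Dice overlap quoted in the abstract above is a standard measure of agreement between two binary regions, 2|A∩B| / (|A| + |B|). A minimal sketch follows, assuming the predicted-apposition and bone-formation regions have been flattened into equal-length binary masks. The function name `dice_coefficient` and the toy masks are hypothetical illustrations, not taken from the OsteoOpt code.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    if size_a + size_b == 0:
        # Convention chosen here: two empty masks count as a perfect match.
        return 1.0
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2 * intersection / (size_a + size_b)


# Hypothetical toy masks (1 = voxel inside the region), for illustration only.
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
observed = [1, 1, 0, 0, 1, 1, 0, 0]
print(dice_coefficient(predicted, observed))
```

A value of 1 means the two regions coincide and 0 means they are disjoint, so the reported 0.70 and 0.76 indicate substantial but incomplete agreement.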

2605.01082 2026-05-05 cs.LG cs.GT econ.TH

Networked Information Aggregation for Binary Classification

MohammadHossein Bateni, Zahra Hadizadeh, MohammadTaghi Hajiaghayi, Mahdi JafariRaviz, Shayan Taherijam

Comments Accepted to the 43rd International Conference on Machine Learning (ICML 2026)

2605.01081 2026-05-05 cs.CV

WILD SAM: A Simulated-and-Real Data Augmentation for Autonomous Driving Perception under Challenging Weather

Hamed Khatounabadi, Xiaohu Lu, Hayder Radha

2605.01077 2026-05-05 cs.CL

Teaching LLMs Brazilian Healthcare: Injecting Knowledge from Official Clinical Guidelines

Hugo Abonizio, Filipe Rocha Lopes, Roberto Lotufo, Rodrigo Nogueira

2605.01075 2026-05-05 cs.CV

Neighbor2Inverse: Self-Supervised Denoising for Low-Dose Region-of-Interest Phase Contrast CT

Johannes B. Thalhammer, Lorenzo D'Amico, Lucy Costello, Sebastian Peterhansl, Daniel Frey, Tina Dorosti, Florian Schaff, Jannis Ahlers, Ronan Smith, Marcus Kitchen, Franz Pfeiffer, Martin Donnelley, Daniela Pfeiffer, Kaye S. Morgan

2605.01073 2026-05-05 cs.CL

Controlled Paraphrase Geometry in Sentence Embedding Space: Local Manifold Modeling and Latent Probing

Leonid Bedratyuk

Comments 45 pages

Abstract

The paper studies the local geometry of embedding clouds induced by controlled local classes of semantically close sentences. The central question is how controlled paraphrase-like semantic variation is organized in sentence embedding space and whether this local structure can be explicitly modeled by low-degree fitted carriers. We introduce a local geometric modeling scheme based on affine, quadratic, and cubic fitted models. We also use a surface-based latent probing procedure that constructs synthetic latent points in a reduced local PCA space with respect to the fitted carrier. The procedure is intended as an offline method for representation-space analysis, local manifold modeling, and geometry-aware latent probing. Generated latent points are evaluated using criteria that measure consistency with the fitted surface, preservation of neighborhood structure, agreement with the empirical distribution, stability of Hessian-based second-order shape descriptors, and stability of fitted-model coefficients. Experiments on controlled sets of semantically close sentences show that nonlinear local models describe embedding clouds more accurately than affine models. Surface-based generation provides strong fitted-geometry fidelity, including surface consistency, Hessian-based shape consistency, and coefficient consistency. Downstream experiments show that geometric validity of synthetic latent points does not automatically translate into improved classification performance. The results support explicit local geometric modeling of sentence embedding space and highlight the need to distinguish geometric validity from discriminative utility. As a resource contribution, we introduce CoPaGE-300K, a controlled template-based dataset of semantically close sentence variants with slot-level annotations and precomputed sentence embeddings.

2605.01069 2026-05-05 cs.RO

Online Safety Filter for Deformable Object Manipulation with Horizon Agnostic Neural Operators

Jiaxing Li, Hanjiang Hu, Zhuoyuan Wang, Yorie Nakahira, Changliu Liu

2605.01067 2026-05-05 cs.LG

Deep Variational Inference Symbolic Regression

James Butterworth, Gevik Grigorian, Alejandro DiazDelaO

Comments Code: https://github.com/jamesbut/DVISR-ICLR2026

2605.01066 2026-05-05 cs.LG

A dimensional R2 regression metric

Jaesung Yoo, Stefan Lemke, Jian Zhong Guo, Kanaka Rajan, Adam Hantman

2605.01065 2026-05-05 cs.CL

A Systematic Exploration of Text Decomposition and Budget Distribution in Differentially Private Text Obfuscation

Stephen Meisenbacher, Angelo Kleinert, Florian Matthes

Comments 22 pages, 5 figures, 12 tables. Accepted to PrivateNLP 2026

2605.01063 2026-05-05 cs.LG cs.CV

GEODE: Angle-Adaptive OOD Detection with Universal Scorer Compatibility

Bruno Abrahao

2605.01058 2026-05-05 cs.LG cs.AI cs.CL

LEAP: Layer-wise Exit-Aware Pretraining for Efficient Transformer Inference

Shashank Kapadia, Deep Naryan Mishra, Sujal Reddy Alugubelli, Haoan Wang, Saipraveen Vabbilisetty, Rishi Bhatia, Anupriya Sharma

Comments Accepted at ACL 2026 (Industry Track). 14 pages, 5 figures

2605.01051 2026-05-05 cs.RO cs.AI cs.LG cs.LO math.OC

Value Functions for Temporal Logic: Optimal Policies and Safety Filters

Oswin So, William Sharpless, Sylvia Herbert, Chuchu Fan

2605.01048 2026-05-05 cs.CL cs.LG

Compared to What? Baselines and Metrics for Counterfactual Prompting

Zihao Yang, Mosh Levy, Yoav Goldberg, Byron C. Wallace

Comments 24 pages, 10 figures. Under review

Abstract

Counterfactual prompting (i.e., perturbing a single factor and measuring output change) is widely used to evaluate things like LLM bias and CoT faithfulness. But in this work we argue that observed effects cannot be attributed to the targeted factor without accounting for baseline "meaning-preserving" modifications to text that establish general model sensitivity. This is because every counterfactual edit is a compound treatment that bundles the variable of interest with incidental surface-form variation; this violates treatment variation irrelevance. We observe prediction flip rates on MedQA of 14.9% when we surgically change patient gender. However, this is statistically indistinguishable from the flip rates induced by simply paraphrasing inputs (14.1%). In this case, it would therefore be unwarranted to conclude that the LLM is especially sensitive to patient gender. To account for this and robustly measure the effects of targeted interventions, we propose a framework in which we compare (via statistical testing) differences observed under target interventions to those induced by paraphrasing inputs. We then use this framework to revisit an analysis done on the MedPerturb dataset, which reported evidence of model sensitivity to patient demographics and stylistic cues. We find that these effects largely dissipate when we account for general model sensitivity, with only 5 of 120 tests reaching statistical significance. Applying the same framework to occupational biography classification, we detect clearly significant directional gender bias, showing that the framework identifies real directional effects even when they are small. We evaluate a range of metrics -- aggregate, per-sample distributional, and regression -- and find that per-sample metrics are dramatically more powerful than aggregate metrics, and that regression powerfully and uniquely characterizes effect direction and magnitude.
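The comparison of a targeted-intervention flip rate against a paraphrase baseline can be illustrated with a simple significance test. The sketch below uses an unpaired two-proportion z-test as an assumption; the abstract does not say which test the authors use, and a paired or per-sample test may be closer to their framework. The function name `flip_rate_z_test` and the sample size of 1000 per condition are hypothetical; only the 14.9% and 14.1% rates come from the abstract.

```python
import math


def flip_rate_z_test(flips_a, n_a, flips_b, n_b):
    """Two-sided two-proportion z-test. Returns (z, p_value)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = flips_a / n_a
    p_b = flips_b / n_b
    # Pooled flip rate under the null hypothesis that both rates are equal.
    pooled = (flips_a + flips_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every prediction flipped, or none did: nothing to distinguish.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 1000 questions per condition (an assumed sample size),
# matching the 14.9% (gender edit) and 14.1% (paraphrase) flip rates.
z, p = flip_rate_z_test(149, 1000, 141, 1000)
print(f"z = {z:.2f}, p = {p:.2f}")
```

Under these assumed counts the p-value is far above 0.05, which is the sense in which a targeted edit can be indistinguishable from plain paraphrasing.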
