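arXivDaily: a daily academic digest synchronized with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.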

2606.17050 2026-06-16 eess.SY cs.SY math.OC New submission

Optimal Bounded Thrust Powered Descent with Analytical Ground-Collision Avoidance

Or Nataf, Vitaly Shaferman

AI summary: Proposes a new approach to the bounded-thrust powered-descent problem that approximates the mass with a time-dependent polynomial and separates the thrust allocation hierarchically, which yields analytical ground-collision avoidance and a saturation-aware guidance law.

Comments: This work has been submitted for journal publication. 32 pages and 15 figures

Abstract

The paper proposes a new approach to address the bounded-thrust powered-descent problem while ensuring ground-collision avoidance. A time-dependent polynomial approximation of the mass is employed to formulate a bounded linear-quadratic optimal-control problem that minimizes the thrust-acceleration control effort, terminal miss, and terminal velocity error. The resulting approximation is used to impose a hard constraint on the horizontal thrust profile while keeping the vertical thrust profile unconstrained. The key idea is a hierarchical separation of the thrust allocation, which enables analytical ground-collision avoidance under bounded thrust. Unlike existing bounded-thrust powered-descent approaches based on numerical optimization and trajectory-shaping constraints, the proposed method provides explicit analytical collision-avoidance conditions. Building on this formulation, the guidance law predicts the switching times between saturated and unsaturated arcs and shapes the thrust-acceleration profile to achieve a soft landing, even when the controller remains saturated over extended portions of the trajectory. Owing to its analytical nature, the guidance law is computationally efficient, and its continuous thrust profile facilitates real-time implementation. The proposed method was evaluated over a grid of perturbed initial conditions in realistic simulations, demonstrating accurate collision-free soft-landing performance. The results highlight the importance of combining saturation-aware guidance with ground-collision avoidance under bounded thrust.

2606.17006 2026-06-16 cs.SD cs.AI cs.LG cs.MM eess.AS New submission

TuneJury: An Open Metric for Improving Music Generation Preference Alignment

Yonghyun Kim, Junwon Lee, Haiwen Xia, Yinghao Ma, Junghyun Koo, Koichi Saito, Yuki Mitsufuji, Chris Donahue

Affiliations: Carnegie Mellon University; Sony AI; Georgia Tech; KAIST; Peking University; QMUL

AI summary: Proposes TuneJury, an open, instance-level pairwise reward model for text-to-music generation that predicts preference scores, supporting data filtering and post-hoc calibration and improving alignment at inference, optimization, and training time.

Comments: 32 pages, 9 figures

Abstract

We introduce TuneJury, an open, instance-level pairwise reward model for text-to-music that predicts a music preference score from a text prompt and an audio clip. The released checkpoint is trained on publicly available human-preference labels covering arena-style (A vs. B) votes, metric-alignment preference pairs, crowdsourced pairwise comparisons, and expert aesthetic ratings. The predicted score margin between two clips is well calibrated on our held-out test split, supporting data filtering via a simple score threshold. TuneJury generalizes to both held-out test pairs and out-of-distribution benchmarks, remaining competitive with prior baselines on the latter. For generators released after training, we introduce anchor calibration, a post-hoc, per-system Bradley-Terry calibration that recovers agreement at substantially better data efficiency than from-scratch retraining. The same frozen reward drives consistent reward-axis gains across three downstream applications: inference-time best-of-N selection, DITTO-style latent optimization, and expert-iteration post-training. TuneJury is available at https://github.com/yonghyunk1m/TuneJury.
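
As a rough illustration of two ideas named in this abstract, the sketch below shows the textbook Bradley-Terry mapping from a score margin to a preference probability, followed by best-of-N selection with a frozen reward. It is not TuneJury's code: the abstract does not say how the margin is converted to a probability, and `bradley_terry_prob`, `best_of_n`, and the scores in `toy_scores` are hypothetical names and values chosen for the example.

```python
import math
from typing import Callable, Sequence


def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """Probability that clip A is preferred over clip B under a
    standard Bradley-Terry model: a logistic function of the score margin."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def best_of_n(candidates: Sequence[str], reward: Callable[[str], float]) -> str:
    """Return the candidate with the highest reward.
    Python's max() keeps the earliest candidate when rewards tie."""
    return max(candidates, key=reward)


# Hypothetical scores standing in for a reward model's outputs.
toy_scores = {"clip_a": 0.2, "clip_b": 1.5, "clip_c": -0.4}

# Pick the best of the three candidate clips using the frozen scores.
chosen = best_of_n(list(toy_scores), lambda name: toy_scores[name])

# Preference probability implied by the margin between two clips.
p_b_over_a = bradley_terry_prob(toy_scores["clip_b"], toy_scores["clip_a"])
```

Thresholding the margin, as the abstract suggests for data filtering, would amount to keeping only pairs whose implied probability is far enough from 0.5.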

2606.17004 2026-06-16 eess.SY cs.SY New submission

Data-Driven Personalization of Automated Insulin Delivery

Ali Kashani, Ali Tavasoli, Heman Shakeri

AI summary: Proposes a real-time method that adapts controller parameters from daily glycemic data, using projected gradient descent on a daily risk metric and contraction theory to verify closed-loop convergence. In simulations with 100 adult patients under variability in meal timing, meal size, and insulin sensitivity, time-in-range (70-180 mg/dL) increases by 2%, 3%, and 4% after 4, 8, and 17 weeks, respectively.

Abstract

Automated insulin delivery (AID) systems are often tuned for the population and offer limited online adaptation to the inter- and intrapatient variability in insulin needs caused by meal patterns, physical activity, and fluctuations in insulin sensitivity. We present a real-time, data-driven personalization approach that adapts controller parameters using the subject's daily glycemic data. The adaptation is formulated as projected gradient descent on a daily risk metric, where the gradient estimation is designed to attenuate noise and metabolic variability. We use contraction theory to validate the optimization framework and convergence of the closed-loop system under adaptation. In silico experiments on the 100-adult cohort of the FDA-accepted UVA/Padova T1D simulator show that our method improves glycemic risk and increases time-in-range (TIR, 70-180 mg/dL) by 2%, 3%, and 4% after 4, 8, and 17 weeks, respectively, under variability in meal timing, meal size, and insulin sensitivity.
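
For readers unfamiliar with the optimizer named in this abstract, here is a generic sketch of projected gradient descent with box constraints. It is only the textbook technique on a toy quadratic, not the authors' adaptation law: the risk metric, the gradient estimator, the bounds, and the step size are all assumptions made for the example, and `project_to_box`, `projected_gradient_descent`, and `toy_grad` are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def project_to_box(theta: Sequence[float], bounds: Bounds) -> List[float]:
    """Clip each parameter to its allowed [low, high] interval."""
    return [min(max(t, lo), hi) for t, (lo, hi) in zip(theta, bounds)]


def projected_gradient_descent(
    grad: Callable[[Sequence[float]], Sequence[float]],
    theta0: Sequence[float],
    bounds: Bounds,
    step: float = 0.1,
    iters: int = 200,
) -> List[float]:
    """Take a gradient step, then project back onto the feasible box."""
    theta = project_to_box(theta0, bounds)
    for _ in range(iters):
        g = grad(theta)
        stepped = [t - step * gi for t, gi in zip(theta, g)]
        theta = project_to_box(stepped, bounds)
    return theta


def toy_grad(theta: Sequence[float]) -> List[float]:
    """Gradient of a toy risk (x - 2)^2 + (y + 1)^2, standing in for a
    daily risk metric; its unconstrained minimum (2, -1) lies outside the box."""
    return [2.0 * (theta[0] - 2.0), 2.0 * (theta[1] + 1.0)]


# Assumed parameter bounds; the iterate ends on the box boundary.
theta_star = projected_gradient_descent(
    toy_grad, [0.0, 0.0], [(0.0, 1.0), (-0.5, 0.5)]
)
```

In this toy problem the iterate settles at the corner of the box closest to the unconstrained minimum, which is the behavior the projection step is there to enforce.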

2606.17001 2026-06-16 eess.SY cs.SY New submission

Sandbox-Enabled Digital Twin for Cyber-Physical Systems

Meet Udeshi, Md Raz, Prashanth Krishnamurthy, Ramesh Karri, Farshad Khorrami

AI summary: Proposes a closed-loop digital twin framework that hosts the controller binary in a sandbox coupled to an external plant simulator, capturing controller side-channels and plant state in sync for online testing and anomaly detection of CPS controllers.

Comments: 5 Pages, 4 Figures

Abstract

Cyber-physical system (CPS) controllers are vulnerable to faults and malicious attacks, including failures triggered only under complex plant conditions, yet pre-deployment validation typically relies on plant models or digital twins that exercise the controller's I/O as a black box. Side-channels, used to detect those run-time behavioral anomalies, are complementary but also open-loop, detached from I/O instrumentation, and driven by synthetic inputs rather than realistic plant feedback. We present a closed-loop digital twin framework that bridges this gap by hosting an unmodified controller binary within the SaMOSA Linux sandbox with its I/O rerouted to an external plant simulator, allowing coupled capture of simulated plant conditions and events alongside the controller's behavioral side-channels. The framework captures four time-synchronized controller side-channels (hardware performance counters, system calls, disk activity, and network activity) alongside plant state, and uses orchestration hooks for repeatable, parameterized runs. We demonstrate the framework on an OpenPLC runtime executing a Structured Text control program against a Modbus-connected IEEE 14-bus power-system model, and also discuss the application to robotics systems. The captured side-channels correlate controller behavior with simulated plant events, establishing an observability foundation for online testing, coverage analysis, and anomaly detection in CPS controllers.

2606.16985 2026-06-16 stat.ML cs.LG eess.SP nlin.CD stat.ME New submission

Dynestyx: A Probabilistic Programming Library for Dynamical Systems

Daniel Waxman, Dmitry Batenkov, John Feser, Andy Zane, Eli Bingham, Youssef Marzouk, Matthew E. Levine

AI summary: Presents the dynestyx library, which offers a unified interface for specifying priors on state-space models, mixed-effects inference, and state and parameter estimation, enabling Bayesian analysis of dynamical systems.

Comments: 7 pages

2606.16978 2026-06-16 cs.RO cs.LG cs.SY eess.SY New submission

Task-Error Residual Learning for Real-Robot Five-Ball Juggling

Kai Ploeger, Jan Peters

Affiliations: Technical University of Darmstadt; German Research Center for AI (DFKI); Hessian Center for Artificial Intelligence (hessian.AI)

AI summary: Proposes residual learning with directional task-error supervision and error-model-driven sample selection, achieving stable three-, four-, and five-ball juggling on Barrett WAM arms; after the first attempt fails, task error decreases monotonically with no further failures.

Comments: Submitted to the 2026 International Symposium on Robotics Research (ISRR)

Abstract

For residual learning that refines existing behavior, sample efficiency depends on two things: how much information each rollout returns, and how efficiently the learner uses that information. Reinforcement learning's standard scalar reward carries far less information than the directional task error that defines the task. Random exploration further discards whatever information each rollout returns. Through residual learning with directional task-error supervision and a task error model that drives sample selection, we achieve stable three-, four-, and five-ball juggling on anthropomorphic Barrett WAM arms. Despite planning and controlling through a simple, idealized stack, the system converges from the second attempt. The first attempt drops, after which task error decreases monotonically without further failures. In comparison, five-ball juggling typically takes humans years of practice. We compare residual learners across two ternary axes, the directional information in the learning feedback and the commitment of the analytic prior, spanning Newton-style Jacobian updates, Composite Bayesian Optimization, and stochastic search methods. Both axes prove necessary: neither directional feedback nor an informative prior suffices alone, and the simplest method that combines them, a fixed-Jacobian Newton update, is the most reliable. The learned residual tolerates substantial prior misalignment and degraded joint tracking, affecting mainly convergence speed. The bottleneck for residual learning on real robots is therefore the information content of the supervision signal and how the learner uses it, not the accuracy of the surrounding stack. Video documentation of all experiments is available at https://kai-ploeger.com/residual-juggling.

2606.16972 2026-06-16 cs.RO cs.SY eess.SY New submission

When Should a Robot Replan? Regret-Guided Update Scheduling in Time-Varying MDPs

Negin Musavi, Gokul Puthumanaillam, Ruben Hernandez, William Schafer, Melkior Ornik

Affiliations: University of Illinois Urbana–Champaign

AI summary: Addresses robots in time-varying environments that cannot replan continuously under budget limits, proposing an online update-scheduling rule based on dynamic regret that outperforms budgeted baselines in simulation and hardware experiments.

Abstract

Robots operating in non-stationary environments must continually adapt their policies as the dynamics drift, but onboard energy and compute budgets cap how often a full state estimation and re-planning step can be performed. This raises a question: when, along a horizon, should a robot spend its limited budget? We formulate this problem in time-varying Markov decision processes (TVMDPs) with a known bound on the rate of transition drift. We model execution as a skip-update scheme in which, at chosen update times, the agent estimates the transition kernel by maximum likelihood and computes a finite-horizon policy, and between updates reuses this policy under a propagated state estimate. We analyze the dynamic regret of this scheme and show how it grows during skip intervals in terms of the properties of the TVMDP and the skip lengths; the resulting bound answers the opening question via an online, regret-guided update rule that allocates the budget adaptively. We evaluate the rule in a simulated Mars-rover navigation task with time-varying slip dynamics and on a Crazyflie quadrotor in indoor obstacle fields. Adaptive allocation outperforms other budgeted baselines.

2606.16969 2026-06-16 cs.SD cs.AI eess.AS New submission

Probing Low Frame Rate Degradation in Neural Audio Codecs

Alex Gichamba, Moise Busogi

Affiliations: Carnegie Mellon University Africa

AI summary: Using controlled frame-rate ablations, finds that the low-frame-rate quality cliff stems from a flawed training configuration rather than a fundamental barrier; once corrected, frame rates can be lowered to 3.1 Hz and 1.6 Hz.

Comments: Accepted at Interspeech 2026

Abstract

Low frame rates in neural audio codecs are attractive for autoregressive speech synthesis, where the generation cost scales linearly with the sequence length. Recent work has demonstrated that codecs can operate at 12.5 Hz and below, but the mechanisms underlying low frame rate degradation remain insufficiently understood. We investigate these mechanisms through a controlled frame rate ablation. We reproduce a quality cliff at 6.25 Hz reported in previous works and evaluate candidate explanations: phonemic collisions and codebook saturation, neither of which shows evidence of a fundamental barrier. The cliff is instead caused by suboptimal training configuration: fixed clip duration during training yields too few tokens at low frame rates, starving the decoder of inter-token context. Once corrected, WER degrades smoothly with phonemic load down to 3.1 Hz and 1.6 Hz, suggesting the inference-time efficiency gains of low frame rate codecs are more accessible than previously assumed.
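
The mechanism described in this abstract, a fixed clip duration yielding too few tokens at low frame rates, comes down to simple arithmetic, sketched below. The clip length `CLIP_SECONDS` is a hypothetical value (the abstract does not state the one used), and `tokens_per_clip` is an illustrative helper, not code from the paper.

```python
def tokens_per_clip(clip_seconds: float, frame_rate_hz: float) -> int:
    """Number of whole codec frames (tokens per codebook) in one training clip."""
    return int(clip_seconds * frame_rate_hz)


# Hypothetical fixed training clip length in seconds; not taken from the paper.
CLIP_SECONDS = 4.0

# Frame rates mentioned in the abstract, in Hz.
FRAME_RATES_HZ = (12.5, 6.25, 3.1, 1.6)

# Token count the decoder sees per clip at each frame rate.
tokens_by_rate = {
    rate: tokens_per_clip(CLIP_SECONDS, rate) for rate in FRAME_RATES_HZ
}
```

Under this assumed clip length, each halving of the frame rate roughly halves the context the decoder sees per clip, so keeping the token count constant would require proportionally longer clips.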

2606.16951 2026-06-16 cs.CV eess.IV New submission

Simulation-Based Multi-Fillet Evaluation of Woody Breast Poultry Fillets

Chirantan Sen Mukherjee, Seung-Chul Yoon, William J. Beksi

Affiliations: Department of Computer Science and Engineering, The University of Texas at Arlington; Quality and Safety Assessment Research Unit, U.S. National Poultry Research Center, USDA Agricultural Research Service

AI summary: To remove the throughput bottleneck of single-fillet inspection, proposes a top-down multi-fillet detection architecture that generates a dataset through physics-based simulation and extracts a 2D shape deformation score, allowing multiple fillets to be evaluated simultaneously.

Comments: To be published in the 2026 International Conference on Automation Science and Engineering (CASE)

Abstract

Woody breast (WB) is a myopathy in modern broiler chickens that causes the breast muscle to become unusually stiff and fibrous, leading to decreased meat quality and significant economic losses. State-of-the-art automated WB detection relies on a side-view imaging system to analyze the bending behavior of a single fillet as it falls off a conveyor belt. While highly accurate, this approach is constrained by its single-fillet field of view, creating throughput bottlenecks on commercial processing lines. In this paper, we address this limitation via a novel multi-fillet detection architecture utilizing a top-down camera configuration. To validate our approach, we first develop a high-fidelity digital twin of an industrial conveyor system. Next, we synthesize a diverse dataset of 3D fillet meshes and model their viscoelastic bending dynamics using a physics-based simulation engine. Lastly, a continuous 2D shape deformation score is extracted from the top-down perspective as the simulated fillets traverse the roller precipice. Experimental results demonstrate that the top-down shape score effectively captures the contour changes of the fillets as they bend, providing a robust and scalable alternative to a side-view imaging system for simultaneous multi-fillet WB evaluation.

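2606.16930 2026-06-16 eess.SY cs.SY New submission

Learning practically stabilizing output-feedback nonlinear controllers

Kui Xie, Pablo Krupa, Alberto Bemporad

AI summary: Proposes a method for learning an output-feedback surrogate controller offline: a recurrent dynamical system imitates the closed-loop trajectories of a given controller-observer pair, and a candidate Lyapunov function is learned from estimated state trajectories to promote input-to-state practical stability. Constraint satisfaction and practical stability are validated on a nonlinear continuous stirred-tank reactor.

Comments: 8 pages, 5 figures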

2606.16927 2026-06-16 eess.SP New submission

Communication Channel Modelling of Unmanned Aerial Vehicles

Necati Kagan Erkek, Emre Balcı, Berkin Halay, Hakan Ali Çırpan

AI summary: For UAV communications, uses a measurement-based approach to extract parameters such as path loss, channel impulse response, and power delay profile across ground-to-ground, air-to-ground, and air-to-air scenarios, and concludes that realistic channel models need both large-scale and small-scale statistics.

Comments: 78 pages

Abstract

Wireless communication channel characterization for unmanned aerial vehicles (UAVs) is essential for reliable control, data transmission, and mission performance in civil, industrial, and defence applications. Channel behaviour is examined using a measurement-based approach that captures both large-scale propagation effects, represented by path loss, and small-scale characteristics, represented by the channel impulse response (CIR) and power delay profile (PDP). An SDR-based channel sounding system is employed to collect and process in-phase and quadrature (IQ) data, enabling the extraction of key channel parameters. Following system verification, measurements are conducted in ground-to-ground (G2G), air-to-ground (A2G), and air-to-air (A2A) scenarios. The results demonstrate that path loss alone is insufficient to describe UAV communication channels, as CIR and PDP provide additional insight into multipath propagation and delay-domain behaviour. The findings indicate that realistic UAV channel models should incorporate both large-scale and small-scale channel statistics. Further improvements may be achieved through increased sounding bandwidth, enhanced synchronization, measurements in a wider range of environments, and more detailed analysis of Doppler effects.

2606.16835 2026-06-16 eess.IV physics.med-ph New submission

Conditioning Deep Anatomical Prior Knowledge for Reconstruction of Multispectral Optoacoustic Tomography Images

Sarah Franceschin, Lukas Imanuel Scheel-Platz, Philipp Haim, Guillaume Zahnd, Vasilis Ntziachristos, Dominik Jüstel

AI summary: Proposes APRECOT, which uses probabilistic anatomical models to segment tissues and reconstruct chromophore compositions simultaneously, substantially improving concentration estimates on simulated data.

Abstract

Accurately delineating tissues and reconstructing their chromophore compositions from Multispectral Optoacoustic Tomography (MSOT) images is a key challenge in optoacoustic imaging. The difficulty arises because light fluence distributions within tissue intrinsically depend on spectral optical properties, making the inverse problem inherently ill-posed. Currently, there is a lack of studies leveraging a priori probabilistic anatomical knowledge to guide tissue segmentation and infer chromophore composition. Moreover, most current studies address these two tasks sequentially, which can result in errors accumulating through the process. To address these issues, we present Anatomical Priors for Reconstruction of Optoacoustic Tomography (APRECOT), a method that leverages probabilistic models of anatomical structures and tissue properties to enable simultaneous segmentation of tissues and reconstruction of their bulk chromophore compositions. In this proof-of-concept using in-silico data, we show that incorporating probabilistic anatomical context strongly improves the accuracy of bulk chromophore concentration estimation compared to reference methods that do not use any anatomical context or use sequential strategies. This work represents an essential step towards an MSOT imaging mode that directly provides clinically relevant information, such as imaging tissue oxygenation dynamics or disease-related changes in tissue composition.

2606.16815 2026-06-16 eess.SP cs.AI cs.LG New submission

A Perception vs. Distortion Perspective on Score-Based Generative Channel Estimation

Marco Skocaj, Lukas Eller, Mate Boban

AI summary: Analyzes the strengths and limits of score-based generative models for channel estimation through the perception-distortion tradeoff, showing that they can approach Bayesian-optimal performance under high predictive uncertainty while discriminative methods are preferable under low uncertainty.

Comments: 13 pages

Abstract

Driven by their remarkable success in computer vision and inverse problem solving, score-based models are increasingly applied to wireless communications, where they show promise across a range of physical-layer tasks. However, despite this growing interest, the current literature often lacks a rigorous analysis of when score-matching offers a tangible advantage over traditional discriminative learning. This paper aims to address this gap through the use-case of channel estimation, a fundamental inverse problem in wireless systems. We present a theoretically grounded interpretation of score-based channel estimation through the lens of the perception-distortion tradeoff, identifying the conditions where score matching excels as well as its key limitations. In particular, by modeling downstream wireless tasks (e.g., capacity maximization) as functionals of the channel estimation process, we quantify the excess risk incurred by standard distortion-minimization approaches. Extensive numerical results show that under high predictive uncertainty, the large excess risk gap can be offset by score-based estimation, enabling near Bayesian-optimal precoding via the learned posterior, whereas in the low predictive uncertainty regime, discriminative distortion-minimization approaches are preferable due to lower complexity and more efficient use of model capacity.

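2606.16750 2026-06-16 eess.SP New submission

Data-Aided Channel and Doppler Estimation for mMIMO LEO SatComs with Uncompensated Doppler

Abdollah Masoud Darya, Saeed Abdallah

AI summary: For mMIMO LEO satellite channels with uncompensated Doppler, proposes a framework that follows pilot-based MMSE estimation with Doppler estimation and data-aided refinement (DD-MMSE or EM), improving estimation accuracy over existing methods; DD-MMSE has lower complexity, while EM is more accurate.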

2606.16717 2026-06-16 eess.SP New submission

Sensing-Assisted Predictive Beamforming for UAV-Enabled Ocean Monitoring Networks

Bohan Li, Guangfei Gao, Jinpeng Zhang, Min Ye, Qian Li, Huaming Yu, Jingjing Wang, Pei Xiao, Sheng Chen

AI summary: To handle wave-induced buoy dynamics and sea clutter in UAV-buoy ocean monitoring, proposes a sensing-assisted predictive beamforming framework built on a correlated-acceleration state-space model and joint optimization of sensing power allocation and UAV position, achieving robust prediction and communication under dense buoy deployments and harsh sea conditions.

Abstract

This paper investigates a sensing-assisted predictive beamforming framework for UAV-buoy maritime monitoring by explicitly accounting for wave-induced buoy dynamics and residual sea clutter. A frame-based UAV mission workflow is first established, where the UAV transmits integrated sensing and communication signals to acquire buoy echoes and to support subsequent uplink beam alignment. To characterize short-horizon buoy motion, a correlated-acceleration state-space model is developed by combining a Singer process for wave-driven excitation with a slowly varying current-drift term. Given the resulting nonlinear reflection, Doppler, and delay measurements, the posterior Fisher information matrix and the corresponding posterior Cramér-Rao bound (PCRB) are derived, and the predicted horizontal-position PCRB is adopted as the sensing metric. A per-frame worst-buoy design is then formulated to jointly optimize sensing power allocation and UAV position under uplink-rate, UAV-power, and mobility constraints. By exploiting a Schur-complement reformulation and a lagged successive convex approximation, the resulting subproblem is converted into a convex conic program with tractable complexity. Simulation results show that the proposed scheme maintains robust prediction and communication performance under denser buoy deployments and harsher sea conditions, and outperforms several baseline designs. In particular, the pronounced root mean square error (RMSE) degradation of the communication-only benchmark confirms that sensing-assisted state refinement is essential for accurate predictive beamforming in dynamic maritime environments. Compared with a full first-order Taylor expansion method, it achieves a more attractive performance-complexity tradeoff for online deployment.

2606.16668 2026-06-16 eess.AS cs.SD New submission

CraBERT: Efficient Phoneme Encoder Pre-Training via Cascade Fusion of Subword Representations for Text-to-Speech

Dong Yang, Yuki Saito, Wataru Nakata, Hiroshi Saruwatari

AI summary: Proposes CraBERT, a phoneme encoder that uses a cascade-fusion architecture and a subword-phoneme alignment algorithm to exploit a pre-trained subword BERT and reduce phoneme-encoder pre-training; one epoch reaches the MOS that the baseline attains after ten epochs.

2606.16657 2026-06-16 eess.SP cs.RO New submission

Towards mm-Level Accurate UWB Radar: High-Accuracy Phase-Based Obstacle Detection through Multi-Channel Fusion

Jelle De Moerloose, Adnan Shahid, Eli De Poorter

AI summary: Proposes a framework that uses phase information for distance estimation in passive UWB radar, reaching centimeter-level accuracy through multi-channel fusion with a median error of 1.69 cm, a marked improvement over amplitude-only methods.

Comments: 13 pages, Submitted to IEEE Transactions On Wireless Communications

Abstract

Accurate, tag-free distance estimation with ultrawideband (UWB) radar is essential for applications such as autonomous guided vehicles, robotics, and environment characterization. For tag-based localization systems, phase-based UWB signal processing techniques have demonstrated sub-wavelength ranging precision, but these approaches are not applicable to passive (tagless) radar setups with weak reflections, mixed multipath conditions, and the absence of a known time-of-flight (ToF) first-path reference. This paper demonstrates for the first time that phase information can be effectively exploited in a fully passive UWB radar setting. We introduce a signal processing framework that extracts reliable distance information by combining coarse amplitude-based estimates with high-resolution phase changes across multiple frequency channels. By referencing phase measurements with the line-of-sight component, the method compensates for hardware-induced phase drift, while the use of multichannel frequency diversity enables disambiguation of periodic phase information and improves robustness against frequency-specific channel degradation such as Fresnel zones. The proposed approach is validated on a robot equipped with a bistatic UWB radar using DW3000 devices and evaluated in a realistic metallic industrial environment. Experimental results show that our work consistently achieves centimeter-level accuracy even at high speeds, with a median error of 1.69 cm, significantly outperforming existing UWB radar approaches that rely only on amplitude information and reach about 10 cm accuracy. We further show how multi-channel fusion exploits uncorrelated channel degradation to reduce the error by more than 40% compared to single-channel operation, and outline how phase modeling and fusion can be pushed toward sub-centimeter accuracy.
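
To make the "periodic phase information" in this abstract concrete, the sketch below applies the textbook relation between a carrier phase change and a change in total propagation path length. It is not the authors' processing chain: the abstract gives no formulas, `ILLUSTRATIVE_FREQ_HZ` is an assumed carrier frequency rather than a value from the paper, and `wavelength_m` and `path_length_change_m` are hypothetical helper names.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength of a carrier at the given frequency."""
    return SPEED_OF_LIGHT_M_S / freq_hz


def path_length_change_m(delta_phase_rad: float, freq_hz: float) -> float:
    """Change in total propagation path length implied by a carrier phase
    change. A full 2*pi turn equals one wavelength, so the result is only
    unambiguous within a single wavelength."""
    return delta_phase_rad / (2.0 * math.pi) * wavelength_m(freq_hz)


# Assumed carrier frequency for illustration; not a value from the paper.
ILLUSTRATIVE_FREQ_HZ = 6.5e9

# A quarter-turn of phase corresponds to a quarter wavelength of path change.
quarter_turn_m = path_length_change_m(math.pi / 2.0, ILLUSTRATIVE_FREQ_HZ)
```

At the assumed 6.5 GHz the wavelength is about 4.6 cm, so a phase reading alone repeats every few centimeters of path length. That periodicity is consistent with the abstract's pairing of phase with a coarse amplitude-based estimate and several frequency channels.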

2606.16644 2026-06-16 cs.IT cs.SY eess.SY math.IT New submission

Enhancing Secret Key Generation for UAV Communications via Codeword Reconstruction

Yizhuo Wang, Qinghe Du, Ning Shen, Shijiao Zhang, Lei Zhao, Yang Hu

AI summary: To counter key disagreement caused by channel estimation errors in UAV communications, proposes a physical-layer key generation scheme based on channel codeword reconstruction that uses a polarization characteristic to separate out reliable keys, lowering the key disagreement rate for legitimate users and increasing the number of generated keys while keeping the eavesdropper's key consistency rate low.

Comments: Accepted and presented at an IEEE Wireless Communications and Networking Conference (WCNC) 2026 Workshop. 6 pages, 7 figures

Abstract

With the rapid advancement of unmanned aerial vehicles (UAVs), ensuring the security of communication links among UAVs has become crucial. In this paper, we propose a novel physical layer key generation scheme based on channel codeword reconstruction. In UAV communications, the high mobility of aerial nodes leads to short channel coherence time, which together with noise causes inevitable channel estimation errors. These errors significantly degrade the performance of wireless channel-based key generation. Therefore, we propose a codeword construction algorithm that achieves a polarization characteristic, which effectively segregates reliable keys from unreliable ones. Compared to existing quantization-based key generation schemes, our approach maximizes the utilization of raw channel information and employs soft-decision decoding to generate keys. Simulation results demonstrate that the proposed scheme reduces the key disagreement rate for legitimate users and increases the number of consistently generated keys. Furthermore, our method ensures a lower key consistency rate for the eavesdropper, which guarantees system security.

2606.16635 2026-06-16 eess.SY cs.SY 新提交

Closed-loop Optimal Fault Detection for Uncertain Systems

不确定系统的闭环最优故障检测

Koen Classens, Tjeerd Ickenroth, Jeroen van de Wijdeven, Tom Oomen

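AI summary: For continuous-time linear time-invariant uncertain systems in open-loop or closed-loop configurations, the paper proposes a unified design method for robust fault detection filters based on a worst-case disturbance and uncertainty model, handling parametric and dynamic uncertainties by solving a single Riccati equation.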

AI中文摘要

故障会损害复杂工程系统的可靠性和安全性。本文旨在解决开环或闭环配置下连续时间线性时不变不确定系统的鲁棒故障检测滤波器设计问题。所提出的方法通过求解单个Riccati方程，基于最坏情况扰动和不确定性模型，提供了一种处理参数和动态不确定性的统一方法。该最坏情况模型通过非线性优化和边界Nevanlinna-Pick方法获得。使用光刻工业中实验性掩模台的不确定模型展示了所提出方法的有效性。结果表明，在故障敏感性与建模不确定性和扰动抑制之间实现了最优折衷。这种能力使得残差中故障与不良效应能够清晰区分，从而提高了故障检测的可靠性，最终有助于提升机器的安全性和性能。

英文摘要

Faults compromise the reliability and safety of complex engineering systems. The aim of this article is to address the problem of robust fault detection filter design for continuous-time linear time-invariant uncertain systems in open-loop or closed-loop configurations. The developed method offers a unified approach to handle parametric and dynamic uncertainties by solving a single Riccati equation, based on a worst-case disturbance and uncertainty model. This worst-case model is obtained by nonlinear optimization and application of the boundary Nevanlinna-Pick method. The efficacy of the proposed approach is demonstrated using an uncertain model of an experimental reticle stage used in the lithography industry. The results illustrate that an optimal compromise is achieved between sensitivity to faults on the one hand and rejection of modelling uncertainties and disturbances on the other. This capability enables clear differentiation between faults and undesired effects in residuals, thereby enhancing fault detection reliability and ultimately contributing to improved safety and performance of machines.

2606.16631 2026-06-16 eess.SY cs.SY 新提交

Exponential Weighting Model Predictive Control with Observer for Modular Multilevel Converters

模块化多电平换流器的指数加权模型预测控制与观测器

Sunny Singh, Saurabh Mishra, Dušan M Stipanović, Aleksandra Lekić

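AI summary: Proposes a model predictive control scheme with an exponential cost function and an observer that improves the dynamic performance of modular multilevel converters, resolves the numerical ill-conditioning caused by long prediction horizons, and guarantees closed-loop stability.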

Comments 6 pages

2606.16628 2026-06-16 eess.SP 新提交

XL-ChannelDiff: An Efficient Diffusion-Based Multi-Domain Near-Field Channel Extrapolation Framework for XL-MIMO Systems

XL-ChannelDiff: 一种面向XL-MIMO系统的高效基于扩散的多域近场信道外推框架

Mengyuan Li, Yu Han, Hao Xu, Yongxu Zhu, Chao-Kai Wen, Shi Jin

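AI summary: Proposes a multi-domain near-field channel extrapolation framework based on a conditional denoising diffusion implicit model (CDDIM), which achieves high-accuracy extrapolation through a physics-aware backbone and WGAN adversarial supervision and performs well across a variety of configurations.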

AI中文摘要

准确的信道状态信息（CSI）获取对于发挥超大规模多输入多输出（XL-MIMO）系统的性能增益至关重要。然而，在近场区域，由于高维信道表示和球面波前传播，CSI获取比远场更具挑战性。为此，本文提出了一种高效的XL-MIMO系统多域近场信道外推框架。利用条件去噪扩散隐式模型（CDDIM），我们的方法能够在天线、频率和空间域上实现精确的信道外推。具体而言，我们设计了一个物理感知的CDDIM骨干网络，结合位置嵌入的块标记化和掩码引导的多头注意力机制，使模型能够利用近场球面波传播引起的位置相关信道相关性。为确保高保真外推，我们引入了Wasserstein GAN（WGAN）判别器，在训练和反向采样阶段为CDDIM提供对抗性监督。此外，引入了一种RePaint风格的细化方案来优化采样轨迹，进一步提高外推精度。大量实验证明了所提框架的优越性，在跨不同域、不同配置和严重掩码条件下实现了卓越的外推精度和鲁棒泛化能力。

英文摘要

Accurate channel state information (CSI) acquisition is essential for unleashing the performance gains of extremely large-scale multiple-input multiple-output (XL-MIMO) systems. However, in near-field regions, CSI acquisition is much more challenging than in the far field due to the high-dimensional channel representation and spherical wavefront propagation. To address this, we propose an efficient multi-domain near-field channel extrapolation framework for XL-MIMO systems. Leveraging the conditional denoising diffusion implicit model (CDDIM), our approach enables accurate channel extrapolation across the antenna, frequency, and spatial domains. Specifically, we design a physics-aware CDDIM backbone that incorporates position-embedded patch tokenization and a mask-guided multi-head attention mechanism, enabling the model to exploit position-dependent channel correlations induced by near-field spherical-wave propagation. To ensure high-fidelity extrapolation, we incorporate a Wasserstein GAN (WGAN) discriminator that provides adversarial supervision to the CDDIM during both the training and reverse sampling phases. Additionally, a RePaint-style refinement scheme is introduced to optimize the sampling trajectory, further boosting extrapolation accuracy. Extensive experiments demonstrate the superiority of the proposed framework, which achieves high extrapolation accuracy and robust generalization across diverse domains, varied configurations, and severe masking conditions.

2606.16618 2026-06-16 eess.SP 新提交

Acoustic, VOC, and Multimodal Stress Source Localization in the Internet of Plants

植物物联网中的声学、VOC和多模态胁迫源定位

Ahmet B. Kilic, Ozgur B. Akan

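AI summary: Proposes a two-stage coarse-to-fine localization pipeline that combines acoustic time-difference-of-arrival multilateration with a Green's function model of VOC dispersion to spatially localize stress sources in plant networks.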

AI中文摘要

植物物联网将分布式植物网络视为环境监测的生物传感基础设施，但胁迫源在该网络中的空间定位问题尚未解决。植物胁迫信号具有根本不同的空间动态：声发射全向传播且不受风影响，而挥发性有机化合物羽流狭窄且受平流主导。我们提出了一种针对嵌入冠层的“代理植物”网络（生物混合传感节点）的两阶段粗到细定位流程。第一阶段通过声学到达时间读数上的到达时间差多边定位定位源；第二阶段利用闭式稳态格林函数模型（VOC弥散）细化该估计。VOC信息门和逆方差融合规则根据跨试验可靠性结合两个估计，当未检测到信息性VOC信号时，优雅降级为仅TDOA估计。我们在一个新的开源数据集上评估了仅TDOA、仅VOC和融合方法，该数据集包含通过有限体积平流-扩散求解器和基于射线的声衰减模型生成的52个场景。在1到50个代理植物的网络密度下，一旦三个或更多代理处于声学范围内，TDOA多边定位实现亚米级平均绝对误差，远优于仅VOC定位（所有密度下MAE > 3米）。在大多数情况下，融合与仅TDOA估计的差异很小，且与噪声在统计上不可区分。该流程对物理参数扰动、到达时间噪声、VOC门阈值和边界半径具有鲁棒性。TDOA定位可部署于当前声学硬件，而VOC定位仍是前瞻性能力，有待紧凑型生化传感器的进展。

英文摘要

The Internet of Plants (IoP) treats distributed plant networks as bio-sensing infrastructure for environmental monitoring, but spatial localization of stress sources within such networks remains unaddressed. Plant stress signals have fundamentally different spatial dynamics: acoustic emissions propagate omnidirectionally and independently of wind, while volatile organic compound (VOC) plumes are narrow and advection-dominated. We propose a two-stage, coarse-to-fine localization pipeline for a network of "agent plants", bio-hybrid sensing nodes embedded in the canopy. Stage 1 localizes the source via time-difference-of-arrival (TDOA) multilateration on acoustic time-of-arrival (ToA) readings; Stage 2 refines this estimate using a closed-form, steady-state Green's function model of VOC dispersion. A VOC informativeness gate and an inverse-variance fusion rule combine the two estimates according to their across-trial reliability, with graceful degradation to the TDOA-only estimate when no informative VOC signal is detected. We evaluate TDOA-only, VOC-only, and fused approaches on a new open-source dataset of 52 scenarios generated via a finite-volume advection-diffusion solver and a ray-based acoustic attenuation model. Across network densities of 1 to 50 agent plants, TDOA multilateration achieves sub-meter mean absolute error (MAE) once three or more agents are within acoustic range, far outperforming VOC-only localization (MAE > 3 m at all densities). The fused estimate differs only slightly from the TDOA-only estimate, and the difference is statistically indistinguishable from noise in most cases. The pipeline is robust to physical parameter perturbations, ToA noise, the VOC gate threshold, and the bounding radius. TDOA localization is deployable with current acoustic hardware, whereas VOC localization remains a forward-looking capability pending advances in compact biochemical sensors.
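
The gate-then-fuse step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each estimate carries a single scalar variance, that the informativeness gate has already produced a boolean, and that positions are 2-D tuples. All names are hypothetical.

```python
def fuse_estimates(tdoa_xy, tdoa_var, voc_xy=None, voc_var=None, voc_informative=False):
    """Inverse-variance fusion of two position estimates with a VOC gate.

    Falls back to the TDOA-only estimate when the gate is closed or no VOC
    estimate is available.
    """
    if not voc_informative or voc_xy is None or voc_var is None:
        return tuple(tdoa_xy)
    if tdoa_var <= 0 or voc_var <= 0:
        raise ValueError("variances must be positive")
    w_tdoa = 1.0 / tdoa_var  # lower variance gives a higher weight
    w_voc = 1.0 / voc_var
    total = w_tdoa + w_voc
    return tuple((w_tdoa * t + w_voc * v) / total for t, v in zip(tdoa_xy, voc_xy))


# Illustrative numbers only: a tight TDOA estimate and a much looser VOC estimate.
fused = fuse_estimates((1.0, 2.0), 0.25, (3.0, 4.0), 4.0, voc_informative=True)
fallback = fuse_estimates((1.0, 2.0), 0.25)
```

Because the weights are inversely proportional to variance, a VOC estimate that is far less reliable than the TDOA estimate moves the fused position only slightly, which is consistent with the small fusion differences reported in the abstract.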

2606.16607 2026-06-16 eess.SP cs.IT cs.LG math.IT 新提交

Context-Aware Markov VAE for CSI Compression in Wireless Systems

面向无线系统中CSI压缩的上下文感知马尔可夫VAE

Efstathios Chatziloizos, Konstantinos Vandikas, Aneta Vulgarakis Feljan, Zheng Chen, Nikolaos Pappas

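AI summary: Proposes a context-aware compression framework based on a k-memory Markov variational autoencoder that uses a finite temporal window to capture the evolution of CSI in the latent space, significantly improving reconstruction performance at low and moderate compression rates.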

Comments 5 pages, 3 figures, 2 tables

AI中文摘要

本文研究了在频分双工（FDD）系统中，针对时变大规模多输入多输出（MIMO）信道，在有限反馈资源下的神经信道状态信息（CSI）压缩问题。主要挑战在于，由于CSI在连续快照间表现出强时间相关性，需要获得紧凑且高效的CSI表示。现有的无记忆压缩模型未利用这一特性，而简单的时间扩展方法通常合并多个观测值，但未显式建模潜在动态。我们提出了一种基于k-记忆马尔可夫变分自编码器（k-MMVAE）的上下文感知压缩框架，该框架使用有限时间窗口在潜在空间中捕捉CSI的演化。该模型引入了具有有限记忆的马尔可夫结构潜在动态，从而能够有效利用时间依赖性进行压缩。仿真结果表明，与无记忆和弱顺序基线相比，所提方法改善了目标CSI重构性能，尤其是在低和中压缩率下。这些结果表明，显式的潜在时间建模可以在有限反馈约束下为CSI压缩提供有效机制。

英文摘要

This paper considers neural channel state information (CSI) compression for time-varying massive multiple-input multiple-output (MIMO) channels in frequency division duplex (FDD) systems with limited feedback resources. The main challenge lies in obtaining a compact and efficient representation of the CSI given that it exhibits strong temporal correlation across successive snapshots. Existing memoryless compression models do not exploit this property, while simple temporal extensions often incorporate multiple observations without explicitly modeling the latent dynamics. We propose a context-aware compression framework based on a k-memory Markov variational autoencoder (k-MMVAE), which uses a finite temporal window to capture the evolution of CSI in the latent space. The model introduces Markov-structured latent dynamics with finite memory, enabling efficient use of temporal dependencies for compression. Simulation results show that the proposed approach improves target CSI reconstruction performance compared to memoryless and weakly sequential baselines, particularly at low and moderate compression rates. These results suggest that explicit latent temporal modeling can provide an effective mechanism for CSI compression under limited feedback constraints.

2606.16581 2026-06-16 eess.IV 新提交

Optimizing Multiple Feature Types for Image Inpainting in the Linear and Nonlinear Setting

线性和非线性设置下优化多种特征类型的图像修复

Vassillen Chizhov, Ferdinand Jost, Joachim Weickert

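AI summary: Proposes a general theory and framework that allows arbitrary features (such as derivatives and local integrals) to be optimized in linear or nonlinear inpainting and automatically selects feature locations and types; experiments show that increasing the number of feature types significantly improves the reconstruction quality of the compression.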

AI中文摘要

基于修复的压缩存储完整图像数据的一个精心优化的子集，并通过修复重建缺失数据。这些有损编解码器的质量关键取决于存储的数据。到目前为止，这些数据几乎完全由像素位置及其灰度或颜色值组成。在本文中，我们提出了一种通用理论和实用框架，允许纳入可由线性或非线性方程描述的任意特征。这包括例如任意阶导数或局部积分。我们的特征可以与线性或非线性修复算子结合。此外，我们提出了一种算法，自动优化所选特征的位置和类型。允许不同类型的优化特征的方法将基于修复的压缩转变为更通用、更灵活、更强大的范式。我们的实验报告了当特征类型数量从1增加到5时一致的质量提升。在相同存储数据量下，对于调和（均匀扩散）修复，平均峰值信噪比提升2.76 dB；对于边缘增强扩散修复，提升1.82 dB。

英文摘要

Inpainting-based compression stores a carefully optimized subset of the full image data and reconstructs the missing data by inpainting. The quality of these lossy codecs depends decisively on the stored data. So far, these data consist almost exclusively of pixel locations along with their grayscale or color values. In the present paper, we present a general theory and a practical framework that allows incorporating arbitrary features which can be described by linear or nonlinear equations. This includes, for example, derivatives of arbitrary order or local integrals. Our features can be combined with linear or nonlinear inpainting operators. Moreover, we present an algorithm that automatically optimizes the location and the type of the selected features. The approach of allowing different types of optimized features turns inpainting-based compression into a more general, versatile and powerful paradigm. Our experiments report a consistent quality gain when increasing the number of feature types from 1 to 5. With the same amount of stored data, the average peak signal-to-noise ratio improvement is 2.76 dB for harmonic (homogeneous diffusion) inpainting, and 1.82 dB for edge-enhancing diffusion inpainting.
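
For readers unfamiliar with harmonic (homogeneous diffusion) inpainting, the classical pixel-value case can be sketched in one dimension: stored samples are kept fixed and every missing sample is repeatedly replaced by the average of its neighbours. This is only the baseline setting with a single feature type, not the paper's multi-feature framework; the function name, the Gauss-Seidel iteration, and the mirrored boundary handling are assumptions made for illustration.

```python
def harmonic_inpaint_1d(values, known, iterations=2000):
    """Fill unknown samples of a 1-D signal by homogeneous diffusion.

    values: list of floats (entries at unknown positions are ignored)
    known:  list of bools, True where the sample was stored
    Assumes at least two samples and at least one known sample.
    """
    u = [v if k else 0.0 for v, k in zip(values, known)]
    n = len(u)
    for _ in range(iterations):
        for i in range(n):
            if known[i]:
                continue  # stored data stay fixed (Dirichlet condition)
            left = u[i - 1] if i > 0 else u[i + 1]       # mirror at the left border
            right = u[i + 1] if i < n - 1 else u[i - 1]  # mirror at the right border
            u[i] = 0.5 * (left + right)  # discrete Laplace equation u'' = 0
    return u


# Only the two end samples are stored; the interior is reconstructed.
signal = [0.0, 9.0, 9.0, 9.0, 4.0]
mask = [True, False, False, False, True]
reconstruction = harmonic_inpaint_1d(signal, mask)
```

In one dimension the result is a linear ramp between the stored samples, which shows why the choice of stored data, and in the paper's setting the choice of stored feature type, determines reconstruction quality.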

2606.16567 2026-06-16 cs.AI cs.LG cs.SY eess.SY math.DS 新提交

TNODEV: Toolbox for Neural ODE Verification

TNODEV: 神经ODE验证工具箱

Abdelrahman Sayed Sayed, Pierre-Jean Meyer, Mohamed Ghazel

Affiliations: Univ Gustave Eiffel, COSYS-ESTAS

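AI summary: Presents TNODEV, the first formal verifier for neural ODEs to integrate a falsification checker, interval reachability, a verification loop, and a parallel scheduler, supporting verification of safe-set inclusion and classification robustness.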

Comments 29 pages, 7 figures, Under review in TMLR

AI中文摘要

神经常微分方程（神经ODE）已开始出现在安全关键场景中，例如网络物理系统的连续时间控制器和集成到自动化决策流水线中的分类器，这引发了对其行为能否被形式化验证的问题。现有的专门用于神经ODE的工具仅提供单次可达性调用，没有迭代输入集细化，将其判定的精度限制在单次可达性调用所能提供的范围内。我们提出了TNODEV，这是首个用于神经ODE的可靠形式验证器，它集成了伪造检查器、基于连续时间混合单调性的快速区间可达性后端、具有三种输入集分裂启发式的验证与细化循环以及并行调度器，构成一个端到端流水线。TNODEV支持纯神经ODE、与神经网络控制器闭环的神经ODE以及通用神经ODE（GNODE）上的安全集包含验证，安全集可指定为区间或由目标分类标签诱导的半空间交集。我们在安全集包含和分类鲁棒性属性的一系列基准上评估了TNODEV，包括与NNV 2.0和CORA的直接可达性比较，以及在MNIST通用神经ODE分类器上与NNV2.0的验证比较。

英文摘要

Neural ordinary differential equations (neural ODEs) have started to appear in safety-critical settings such as continuous-time controllers for cyber-physical systems and classifiers integrated into automated decision pipelines, raising the question of whether their behavior can be formally verified. Existing tools dedicated to neural ODEs provide only a single reachability call without iterative input-set refinement, limiting the precision of their verdicts to whatever one reachability call can deliver. We present TNODEV, the first sound formal verifier for neural ODEs that integrates a falsification checker, a fast interval-based reachability backend based on continuous-time mixed monotonicity, a verification and refinement loop with three input-set splitting heuristics, and a parallel scheduler in a single end-to-end pipeline. TNODEV supports safe-set inclusion verification on pure neural ODEs, neural ODEs in closed loop with a neural network controller, and general neural ODEs (GNODE), with the safe set specified either as an interval or as the half-space intersection induced by a target classification label. We evaluate TNODEV on a range of benchmarks across safe-set inclusion and classification-robustness properties, including a direct reachability comparison against NNV 2.0 and CORA and a verification comparison against NNV 2.0 on MNIST general neural ODE classifiers.

2606.16558 2026-06-16 cs.AI cs.RO cs.SY eess.SY 新提交

ROSA-RL: Uncertainty-Aware Roundabout Optimized Speed Advisory with Reinforcement Learning

ROSA-RL：基于强化学习的不确定性感知环岛优化速度建议

Anna-Lena Schlamp, Jeremias Gerner, Klaus Bogenberger, Werner Huber, Stefanie Schmidtner

Affiliations: Universität der Bundeswehr München; Hochschule für angewandte Wissenschaften Landshut

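AI summary: To handle uncertainty in roundabout scenarios in mixed traffic, the paper proposes the ROSA-RL framework, which combines a Transformer that predicts conflict-zone occupancy probabilities with reinforcement learning to achieve safe and efficient speed coordination at roundabout entries.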

Comments 8 pages, 2 figures, 2 tables. Copyright 2026 IEEE. This is the accepted manuscript for 2026 IEEE International Conference on Intelligent Transportation Systems (ITSC), not the final published version

AI中文摘要

环岛在混合交通中对自动驾驶构成挑战，因为异质且非确定性的人类行为、未知的驾驶意图以及高交互复杂性使得在进入时刻冲突区域是被阻塞还是可用存在不确定性。我们提出ROSA-RL——基于强化学习的不确定性感知环岛优化速度建议。它通过概率冲突预测，实现混合交通中自动驾驶和人类驾驶车辆的安全高效环岛进入。一个基于Transformer的模型预测未来五秒内的冲突区域占用情况，捕捉多智能体交互以预测即将发生的冲突和可用间隙。预测输出编码了未来运动和意图的不确定性，并增强经典强化学习框架的状态，实现不确定性感知的速度协调。在基于真实世界数据的仿真评估中，ROSA-RL能有效处理不确定性，并优于基于模型的基线方法，缩小了与假设完全已知占用的理想设置之间的差距，同时提高了交通效率和安全性。本工作的源代码可在github.com/urbanAIthi/ROSA-RL获取。

英文摘要

Roundabouts challenge automated driving in mixed traffic, as heterogeneous and non-deterministic human behavior, unknown driving intentions, and high interaction complexity create uncertainty about whether the conflict zone will be blocked or available at the moment of entry. We present ROSA-RL, an uncertainty-aware Roundabout Optimized Speed Advisory with Reinforcement Learning. It enables safe and efficient roundabout entry for automated and human-driven vehicles in mixed traffic through probabilistic conflict forecasting. A Transformer-based model predicts conflict zone occupancy over a five-second horizon, capturing multi-agent interactions to anticipate upcoming conflicts and available gaps. The prediction outputs encode uncertainty in future motion and intent, and augment the state of a classical RL framework, enabling uncertainty-aware speed coordination. Evaluated in simulations grounded in real-world data, ROSA-RL effectively handles uncertainty and outperforms a comparable model-based baseline, closing the gap to an ideal setting that assumes fully known occupancy while improving traffic efficiency and safety. The source code of this work is available at github.com/urbanAIthi/ROSA-RL.

2606.16551 2026-06-16 eess.AS 新提交

Learning Input-Channel Permutation Equivariance for Multi-Channel Source Separation: Reducing Bleeding in Small Music Ensembles

学习输入通道置换等变性用于多通道源分离：减少小型音乐合奏中的串音

Ruchi Pandey, Jaime Garcia-Martinez, Pablo Cabanas-Molero, David Diaz Guerra, Ricardo Falcon Perez, Tuomas Virtanen, Julio J. Carabias-Orti, Pedro Vera-Candeas

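AI summary: To address microphone bleed in small-ensemble recordings, the paper makes channel-permutation equivariance a core learning principle: during training, the input channels and their corresponding reference targets are randomly permuted, which improves robustness to changes in the recording setup. Experiments show that the method consistently improves SDR and reduces bleeding under unseen conditions.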

AI中文摘要

麦克风串音是小合奏和管弦乐录音中持续存在的挑战，其中用于单个乐器的近距离麦克风也会捕获来自附近声源的泄漏。这种重叠降低了音轨隔离度并使混音复杂化。本文通过将通道置换等变性作为核心学习原则来解决串音问题。在训练期间，我们对输入麦克风通道及其对应的参考目标应用相同的随机置换。这阻止了对固定通道-乐器关联的依赖，并提高了对录音设置甚至录制乐器变化的鲁棒性。所提出的模型在具有多样化模拟室内声学和麦克风位置的合成合奏上进行训练，并在未见过的模拟条件和真实URMP录音上进行评估。结果表明，与非置换基线相比，置换感知训练在未见条件下持续改善SDR并减少串音。研究结果强调了置换等变性作为一种简单、以数据为中心的策略，用于音乐制作工作流程中的鲁棒去串音和实际多通道源分离。

英文摘要

Microphone bleed is a persistent challenge in small-ensemble and orchestral recordings, where close microphones intended for individual instruments also capture leakage from nearby sources. This overlap degrades track isolation and complicates mixing. This paper addresses the bleeding problem by making channel-permutation equivariance a core learning principle. During training, we apply the same random permutation to the input microphone channels and their corresponding reference targets. This discourages reliance on fixed channel-instrument associations and improves robustness to changes in the recording setup and even in the recorded instruments. The proposed model is trained on synthetic ensembles with diverse simulated room acoustics and microphone placements, and evaluated on unseen simulated conditions and real URMP recordings. The results show that permutation-aware training consistently improves SDR and reduces bleeding under unseen conditions compared with non-permutation baselines. The findings highlight permutation equivariance as a simple, data-centric strategy for robust debleeding and practical multi-channel source separation in music production workflows.
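
The training-time augmentation described above, applying one random permutation to both the microphone channels and their reference targets, can be sketched as follows. This is a minimal illustration with hypothetical names; channels are represented as plain list entries, whereas a real pipeline would permute the channel axis of audio tensors.

```python
import random


def permute_pair(mixture_channels, reference_targets, rng=random):
    """Apply the same random channel order to inputs and targets.

    Returns the permuted inputs, the permuted targets, and the order used.
    """
    if len(mixture_channels) != len(reference_targets):
        raise ValueError("inputs and targets must have the same number of channels")
    order = list(range(len(mixture_channels)))
    rng.shuffle(order)  # one permutation, shared by inputs and targets
    permuted_inputs = [mixture_channels[i] for i in order]
    permuted_targets = [reference_targets[i] for i in order]
    return permuted_inputs, permuted_targets, order


# Placeholder channel labels stand in for audio arrays.
mics = ["mic_violin", "mic_cello", "mic_flute"]
stems = ["violin", "cello", "flute"]
inputs, targets, order = permute_pair(mics, stems, rng=random.Random(0))
```

Because the pairing between each microphone and its target is preserved while the channel index changes from sample to sample, a model trained this way cannot rely on a fixed mapping from channel position to instrument.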

2606.16546 2026-06-16 eess.AS cs.SD 新提交

Confidence Score Guided Incremental and Speaker Adaptive Pseudo-Labeling for Semi-Supervised Elderly Speech Recognition

置信度评分引导的增量式和说话人自适应伪标注用于半监督老年人语音识别

Chengxi Deng, Xurong Xie, Shujie Hu, Jiajun Deng, Mengzhe Geng, Youjun Chen, Huimeng Wang, Haoning Xu, Guinan Li, Xunying Liu

Affiliations: The Chinese University of Hong Kong, Hong Kong SAR, China; Institute of Software, Chinese Academy of Sciences, China; National Research Council Canada, Canada

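AI summary: Proposes a confidence-score-guided incremental and speaker-adaptive pseudo-labeling method which, through progressive selection of high-quality pseudo-labels and speaker adaptive training, reduces the word error rate by 1.45% absolute and the character error rate by 2.27% absolute in elderly speech recognition.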

Comments Accepted by Interspeech 2026

AI中文摘要

本文提出了一种新颖的置信度评分引导的增量式和说话人自适应伪标注方法，用于半监督老年人语音识别。该方法促进了更高质量的伪标签选择和渐进式优化，同时减轻了说话人异质性。设计了一个置信度估计模块来对未转录数据的可靠性进行排序，从而实现从高置信度到低置信度逐步引入未标记数据子集的课程学习轨迹。通过带有可学习提示的说话人自适应训练来捕获说话人特定特征。在英语DementiaBank Pitt和粤语JCCOCC MoCA老年人语音数据集上的实验表明，所提出的方法相比不使用置信度评分引导的增量式或说话人自适应伪标注的半监督基线，在词错误率（WER）或字符错误率（CER）上取得了统计显著的降低，绝对降低分别为1.45%和2.27%（相对降低6.21%和6.98%）。

英文摘要

This paper proposes a novel confidence-score-guided incremental and speaker-adaptive pseudo-labeling approach for semi-supervised elderly speech recognition. It facilitates higher-quality pseudo-label selection and progressive refinement, while also mitigating speaker heterogeneity. A confidence estimation module is designed to rank the reliability of untranscribed data, enabling a curriculum learning trajectory that progressively folds in unlabeled data subsets from high to low confidence. Speaker-specific characteristics are captured through speaker adaptive training with learnable prompts. Experiments on the English DementiaBank Pitt and Cantonese JCCOCC MoCA elderly speech datasets suggest that the proposed method outperforms a semi-supervised baseline that uses neither confidence-score-guided incremental nor speaker-adaptive pseudo-labeling, with statistically significant word error rate (WER) and character error rate (CER) reductions of 1.45% and 2.27% absolute (6.21% and 6.98% relative), respectively.
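
The high-to-low confidence curriculum described above can be sketched as a ranking followed by cumulative staging. This is an illustrative assumption about the scheduling, not the paper's procedure: the equal-sized stages, the function name, and the (utterance id, confidence) tuples are all hypothetical.

```python
def confidence_curriculum(utterances, n_stages):
    """Build cumulative training pools ordered from high to low confidence.

    utterances: list of (utterance_id, confidence) pairs for pseudo-labelled data
    Returns a list of pools; pool s contains every utterance admitted up to stage s.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    ranked = sorted(utterances, key=lambda item: item[1], reverse=True)
    stage_size = -(-len(ranked) // n_stages)  # ceiling division
    return [ranked[: min(len(ranked), (s + 1) * stage_size)] for s in range(n_stages)]


# Toy confidence scores for illustration only.
pseudo_labelled = [("a", 0.9), ("b", 0.2), ("c", 0.6), ("d", 0.4)]
stages = confidence_curriculum(pseudo_labelled, n_stages=2)
```

Each later pool is a superset of the previous one, so the model first trains on the most reliable pseudo-labels and only later sees the less reliable ones.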

2606.16539 2026-06-16 eess.AS cs.SD 新提交

Decoding while Adapting: Zero-Shot Online Speaker Adaptation via Audio-Textual Prompts for Elderly Speech Recognition

解码与自适应：基于音频-文本提示的零样本在线说话人自适应用于老年人语音识别

Chengxi Deng, Xurong Xie, Shujie Hu, Mengzhe Geng, Tianzi Wang, Youjun Chen, Huimeng Wang, Haoning Xu, Jiajun Deng, Xunying Liu

Affiliations: The Chinese University of Hong Kong, Hong Kong SAR, China; Institute of Software, Chinese Academy of Sciences, China; National Research Council Canada, Canada

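AI summary: Proposes a speaker adaptation method based on cross-utterance audio-textual prompts that adapts to unseen speakers zero-shot and in real time, significantly reducing the word/character error rate on elderly speech datasets and substantially improving the real-time factor.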

Comments Accepted by Interspeech 2026

2606.16492 2026-06-16 eess.SY cs.SY math.RA 新提交

On the Lyapunov equation with the state matrix in companion form

关于状态矩阵为伴随矩阵形式的Lyapunov方程

Augusto Ferrante

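AI summary: Studies nonnegativity of the solution of the continuous-time Lyapunov equation when the state matrix is a Hurwitz companion matrix, proves that the entries of the solution matrix are nonnegative when the companion matrix has only real eigenvalues, and discusses conditions for total nonnegativity.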

Comments 13 pages, no figures