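arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday

Science and Medicine

Medical AI

Medical intelligence, clinical AI, medical imaging, pathology, diagnostics, and healthcare large models.

Indexed for today/current date: 43 Sources: cs.CV, cs.LG, q-bio, eess.IV, eess.SP

1. Medical Imaging (7 papers)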

2606.19908 2026-06-19 cs.CV New submission 85%

Gaussian Process Prior Variational Autoencoder for Endoscopic Videos

Ivan De Boi, Xinxing Shi, Xiaoyu Jiang, Tim J. M. Jaspers, Francisco Caetano, Mauricio A. Alvarez, Fons van der Sommen, Sam Van der Jeught

Affiliations * Department of Electromechanics, InViLab, University of Antwerp; Department of Computer Science, University of Manchester; Department of Electrical Engineering, Eindhoven University of Technology

Topic match Medical imaging: interpolation and restoration of missing frames in endoscopic videos.

AI summary Proposes a Gaussian Process Prior Variational Autoencoder (GPVAE) that replaces the factorized prior with a temporal Gaussian process prior and combines two scalable GP approximations with specular-reflection masking to interpolate and restore missing frames in endoscopic videos, reducing RMSE by 21.9% on average on the C3VDv2 dataset.

Details
Abstract

Endoscopic video analysis is essential for gastrointestinal diagnosis and computer-assisted interventions, but video sequences are routinely degraded by specular reflections, motion artifacts, and missing frames. These transient corruptions can distract clinicians, reduce image interpretability, and disrupt downstream tasks such as 3D reconstruction and navigation. Effective restoration therefore requires methods that exploit temporal continuity rather than treating frames in isolation. We introduce a Gaussian Process Prior Variational Autoencoder (GPVAE) framework for endoscopic video restoration that replaces the standard factorized latent prior with a temporal Gaussian process prior, enabling interpolation of missing frames with uncertainty-aware reconstruction. The framework combines endoscopy-specific encoders, including a convolutional EndoVAE backbone and pretrained Vision Transformer encoders from GastroNet-5M, with two scalable GP approximations: Hierarchical Prior Approximation (HPA) and Sparse Precision Approximation (SPA). Specular reflections are handled using a DUCKNet-based masking pipeline that excludes corrupted pixels from the reconstruction objective. On the C3VDv2 colonoscopy dataset, the best GPVAE variants reduced image reconstruction RMSE by 21.9% on average, and by up to 26.1%, relative to matched VAE baselines. Downstream trajectory RMSE was reduced by 12.7% on average across classical visual odometry and a pretrained PoseNet, at an average increase of 27.3% in training time per epoch. Finally, the GP posterior provides per-frame uncertainty estimates that reflect temporal support and offer a confidence signal for restored frames.
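Illustrative sketch (not from the paper): the idea of a temporal GP prior can be shown on a single scalar latent dimension, where the latent value at a missing frame is inferred from the observed frames through a kernel. The RBF kernel, the lengthscale, the noise level, and the function names below are all assumptions made for illustration; the HPA and SPA approximations named in the abstract are not reproduced here.

```python
import math

def rbf(t1, t2, lengthscale=2.0, variance=1.0):
    """Squared-exponential kernel over frame times (assumed kernel choice)."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def gp_posterior(t_obs, z_obs, t_new, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at time t_new."""
    k = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(t_obs)]
         for i, a in enumerate(t_obs)]
    k_star = [rbf(t_new, a) for a in t_obs]
    alpha = solve(k, z_obs)    # K^{-1} z
    beta = solve(k, k_star)    # K^{-1} k_*
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    var = rbf(t_new, t_new) - sum(ks * be for ks, be in zip(k_star, beta))
    return mean, var

# Hypothetical latent values for frames 0, 1, 2, 4, 5; frame 3 is missing.
t_obs = [0.0, 1.0, 2.0, 4.0, 5.0]
z_obs = [0.0, 0.5, 1.0, 2.0, 2.5]
print(gp_posterior(t_obs, z_obs, 3.0))    # missing frame inside the sequence
print(gp_posterior(t_obs, z_obs, 100.0))  # query far from any observed frame
```

In this toy setting the posterior variance is small for the gap at frame 3 and returns to the prior variance far from any observation, which is one way to read the "temporal support" confidence signal mentioned in the abstract.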

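2606.19889 2026-06-19 cs.CV New submission 85%

SurgVista: Long-Horizon Surgical World Modeling with Plausible Instrument-Tissue Dynamics

Wentao Pan, Wuyang Li, Shengyuan Liu, Xinyu Liu, Hengyu Liu, Yixuan Yuan

Affiliations * The Chinese University of Hong Kong; EPFL; Imperial College London

Topic match Medical imaging: a surgical world model for robotic surgery policy learning.

AI summary Proposes SurgVista, a surgical world model that uses Deformation Consistency Regularization and Drift Adaptation Training to address spatial interaction incoherence and temporal fidelity collapse, substantially outperforming existing methods on long-horizon prediction.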

Details
Abstract

Scaling robot policy learning for autonomous surgery is challenging, as expert demonstrations are expensive and in vivo exploration poses substantial safety risks. Surgical world models address this by generating realistic, action-conditioned future frames from an initial observation, but existing methods exhibit two persistent failure modes: spatial interaction incoherence, where visible instrument contact fails to induce spatially consistent tissue deformation, and temporal fidelity collapse, where prediction errors compound across autoregressive rollouts and progressively corrupt visual quality. We present SurgVista, a surgical world model that mitigates both failures through two training recipes. Deformation Consistency Regularization extracts scene-point trajectories from training videos and enforces cross-frame coherence through latent contrastive learning, strengthening physically consistent instrument-tissue dynamics. Drift Adaptation Training mitigates long-horizon drift by perturbing conditioning frames with online prediction residuals and photometric augmentations calibrated to long-horizon drift statistics, sustaining visual fidelity over extended rollouts. To enable rigorous evaluation, we further introduce SurgWorld-Bench, featuring diverse procedure types, long-range rollouts, and decoupled metrics for instrument-motion accuracy and tissue-response fidelity. Extensive experiments show that SurgVista consistently outperforms state-of-the-art methods across visual quality, temporal consistency, and interaction fidelity, with gains widening as the prediction horizon grows.

2606.19867 2026-06-19 cs.CV cs.AI New submission 85%

PSCT-Net: Geometry-Aware Pediatric Skull CT Reconstruction via Differentiable Back-Projection and Attention-Guided Refinement

Dong Yeong Kim, Jaewon Choi, Youmin Shin, Jungyu Lee, Myeongseop Kim, Jinwook Choi, Joo Whan Kim, Young-Gon Kim

Affiliations * Interdisciplinary Program in Bioengineering, Seoul National University; Department of Transdisciplinary Medicine, Seoul National University Hospital; Department of Artificial Intelligence, Yonsei University; Department of Medicine, Seoul National University College of Medicine; Healthcare AI Research Institute, Seoul National University Hospital

Topic match Medical imaging: pediatric skull CT reconstruction as a low-dose alternative.

AI summary Proposes PSCT-Net, which uses differentiable back-projection to establish a spatial prior and combines it with attention-guided projection and a bidirectional Mamba module to reconstruct 3D CT from sparse bi-planar X-rays, alleviating depth ambiguity and improving bone boundaries.

Comments 11 pages, 5 figures

Details
Abstract

Computed Tomography (CT) is essential for diagnosing pediatric craniofacial abnormalities, yet poses radiation risks to developing anatomies. Reconstructing 3D CT from sparse bi-planar X-rays offers a low-dose alternative but is severely ill-posed. Existing methods employ geometry-agnostic feature lifting, naively projecting 2D features into 3D without explicit spatial modeling, causing depth ambiguity and degraded osseous boundaries. We present PSCT-Net, a geometry-aware framework with differentiable back-projection. Differentiable back-projection establishes a spatially faithful volumetric prior, alleviating depth ambiguity. An Attention-Guided Projection (AGP-3D) module then learns non-linear voxel-wise correspondences between 2D regions and 3D locations. A Bidirectional Mamba (BiM-3D) module captures long-range volumetric dependencies with linear complexity. We further curate a private institutional pediatric skull CT cohort, PedSkull-CT, comprising normal and pathological cases for internal evaluation, addressing the gap in adult-centric, trunk-focused datasets.
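Illustrative sketch (not from the paper): back-projection from two orthogonal views can be pictured as smearing each 2D projection back along its rays and averaging the results. The version below assumes an idealized parallel-beam geometry with exactly two axis-aligned views, which is a simplification chosen for illustration; the function name and array conventions are hypothetical, and the abstract does not describe the actual geometry used by PSCT-Net.

```python
def back_project_biplanar(frontal, lateral):
    """Unfiltered back-projection of two orthogonal parallel-beam views.

    frontal[z][x] holds line integrals taken along the y axis.
    lateral[z][y] holds line integrals taken along the x axis.
    Returns volume[z][y][x], where each projection value is spread
    uniformly along its ray and the two views are averaged.
    """
    depth = len(frontal)
    width = len(frontal[0])
    height = len(lateral[0])
    return [
        [
            [
                0.5 * (frontal[z][x] / height + lateral[z][y] / width)
                for x in range(width)
            ]
            for y in range(height)
        ]
        for z in range(depth)
    ]

# Toy phantom: a single bright voxel at (z=0, y=1, x=0) in a 2x2x2 volume.
frontal = [[1.0, 0.0], [0.0, 0.0]]  # sums over y
lateral = [[0.0, 1.0], [0.0, 0.0]]  # sums over x
volume = back_project_biplanar(frontal, lateral)
print(volume[0][1][0], volume[0][0][0], volume[0][1][1], volume[0][0][1])
```

Each output voxel is a linear function of the input pixels, which is why an operator of this kind is straightforward to differentiate through. The toy output also shows the depth ambiguity the abstract refers to: the true voxel receives the largest value, but voxels along both rays receive non-zero values as well.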

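2606.19767 2026-06-19 eess.IV cs.CV physics.med-ph New submission 85%

Contour-Constrained Deformable Registration with Parameter Characterization for Head and Neck Surgical Guidance

Qingyun Yang, Jon S. Heiselman, Ayberk Acar, Morgan J. Ringel, Michael I. Miga, Matthieu Chabanas, Michael C. Topf, Jie Ying Wu

Affiliations * Vanderbilt University; Vanderbilt University Medical Center

Topic match Medical imaging: deformable registration for head and neck surgical guidance.

AI summary Proposes a deformable registration framework based on regularized Kelvinlet basis functions that corrects post-resection tissue deformation using surface point clouds, fiducial landmarks, and contour constraints, reducing registration error on nine head and neck specimens from 11.11 mm with rigid registration to 5.62 mm, a 49.41% reduction.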

Details
Abstract

With 890,000 annual new cases globally, head and neck squamous cell carcinoma has one of the highest recurrence rates among solid malignancies. Although frozen section analysis is the standard of care for intraoperative margin assessment, accurately relocating detected positive margins on the resection bed remains challenging due to imprecise alignment between resected specimens and their resection bed, compounded by post-resection mucosal tissue shrinkage. We present a biomechanics-driven deformable registration framework that corrects post-resection tissue deformation to provide intraoperative guidance. Our approach registers 3D specimen meshes to intraoperative resection bed point clouds using a deformable registration approach based on regularized Kelvinlet basis functions. The registration matches surface point clouds, fiducial landmarks, and boundary contour constraints that directly penalize perpendicular distance-to-agreement between specimen and resection bed boundaries. Across nine specimens from skin, buccal mucosa, and tongue sites, the overall mean target registration error was $11.11 \pm 4.07$ mm using rigid registration, which decreased to $8.20 \pm 2.68$ mm (26.19% reduction) using deformable registration without contour constraint. The proposed contour-constrained deformable registration further reduced the error to $5.62 \pm 2.28$ mm, a 49.41% reduction relative to rigid registration. We observed the largest reduction in the most clinically challenging tongue specimens. We also performed a systematic two-stage parameter search to characterize the relative importance of surface alignment, fiducial correspondences, contour constraint, and strain energy regularization. This search revealed that contour weighting dominates registration accuracy for tissue types with large lateral deformation, while the algorithm operates over a broad range of parameter combinations.
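Worked check (an editorial sketch, not part of the paper): the percentage reductions quoted in the abstract follow directly from the reported mean target registration errors. The helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return 100.0 * (baseline - value) / baseline

# Mean target registration errors in millimetres, as reported in the abstract.
rigid = 11.11
deformable_no_contour = 8.20
contour_constrained = 5.62

print(round(relative_reduction(rigid, deformable_no_contour), 2))  # 26.19
print(round(relative_reduction(rigid, contour_constrained), 2))    # 49.41
```

Both values match the 26.19% and 49.41% figures given relative to rigid registration.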

2512.02748 2026-06-19 physics.med-ph 85%

BART Streams: Real-time Reconstruction Using a Modular Framework for Pipeline Processing

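Philip Schaten, Moritz Blumenthal, Bernhard Rapp, Christina Unterberg-Buchwald, Martin Uecker

Topic match Medical imaging: real-time MRI reconstruction, a medical image processing task.

AI summary Presents a modular BART-based framework for interactive reconstruction in real-time MRI that streams multidimensional arrays for efficient reconstruction, and demonstrates cardiac real-time MRI that combines iterative reconstruction with advanced features such as dynamic coil compression.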

Comments Submitted to Magnetic Resonance in Medicine

Details
Abstract

Purpose: To create modular solutions for interactive real-time MRI using reconstruction algorithms implemented in BART. Methods: A new protocol for streaming of multidimensional arrays is presented and integrated into BART. The new functionality is demonstrated using examples for cardiac interactive real-time MRI based on radial FLASH, where iterative reconstruction is combined with advanced features such as dynamic coil compression and gradient-delay correction. We analyze the latency of the reconstruction and measure end-to-end latency of the full imaging process. Results: Reconstruction pipelines with iterative reconstruction and advanced functionality were built in a modular way using scripting. Latency measurements demonstrate latency sufficient for interactive real-time MRI, on the order of 30 ms for BART processing and network transfer time, or 200 ms for end-to-end latency including acquisition, vendor processing, and display. Conclusion: With the new streaming capabilities, real-time reconstruction pipelines can be assembled using BART in a flexible way, enabling rapid prototyping of advanced applications such as interactive real-time MRI.

2606.19365 2026-06-19 cs.LG New submission 80%

Performance Analysis and Optimization of 3D Generative Diffusion Models across GPU Architectures

Jeeho Ryoo, Yongchan Jung, Muhammad Ali Khaliq, Weidong Zhang, Jiatong Han, Byeong Kil Lee

Affiliations * Fairleigh Dickinson University; The University of Colorado at Colorado Springs; Northeastern University

Topic match Medical imaging: performance optimization of Med-DDPM, a 3D MRI diffusion model.

AI summary Analyzes kernel-level performance bottlenecks of the 3D MRI diffusion model Med-DDPM across three generations of NVIDIA architectures and evaluates TF32 Tensor Core activation and a 3D channels-last layout, reducing SM cycles and dynamic instructions by up to 100x, raising Tensor Core utilization to 9.98x, and increasing IPC by 7%.

Details
Abstract

Diffusion models have become essential for high-fidelity 3D MRI synthesis, yet their deployment remains constrained by substantial GPU resource demands arising from hundreds of U-Net evaluations per sample and highly heterogeneous kernel behavior. This paper performs a comprehensive performance analysis of the state-of-the-art medical diffusion model, Med-DDPM, across three generations of NVIDIA architectures to study kernel-level runtime breakdowns, instruction-mix characteristics, memory system utilization, warp-level activities, and profiler priority-score estimates. We show that training is overwhelmingly dominated by cuDNN convolution and implicit-GEMM kernels, with inefficiencies arising from memory-access patterns, tensor-layout conversions, and limited Tensor Core utilization. Guided by these insights, we evaluate two architecture-aware optimizations, TF32 Tensor Core activation and a 3D channels-last layout, and demonstrate that they reduce SM cycles by up to 100x, cut dynamic instructions by 100x, raise Tensor Core utilization from 1.45 to 9.98x, and increase IPC by 7% on A100, all without degrading synthesis quality.
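Illustrative sketch (not from the paper): a channels-last layout changes which dimension is contiguous in memory. The snippet below computes element strides for a 5D activation tensor under the default N, C, D, H, W ordering and under a channels-last ordering; the tensor shape and the helper name `strides` are hypothetical. In PyTorch, for example, this layout is exposed as `torch.channels_last_3d`, and TF32 is toggled through `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32`; the abstract itself does not say how the authors enabled either optimization.

```python
def strides(shape, order):
    """Element strides for `shape` when dimensions are stored in `order`,
    listed from slowest-varying to fastest-varying."""
    result = {}
    step = 1
    for dim in reversed(order):
        result[dim] = step
        step *= shape[dim]
    return result

# Hypothetical activation shape: batch 2, 4 channels, 8 x 16 x 16 volume.
shape = {"N": 2, "C": 4, "D": 8, "H": 16, "W": 16}

contiguous = strides(shape, ["N", "C", "D", "H", "W"])
channels_last = strides(shape, ["N", "D", "H", "W", "C"])

print(contiguous)     # {'W': 1, 'H': 16, 'D': 256, 'C': 2048, 'N': 8192}
print(channels_last)  # {'C': 1, 'W': 4, 'H': 64, 'D': 1024, 'N': 8192}
```

With the channels-last ordering, all channels of one voxel sit next to each other (channel stride 1), whereas the default ordering places them 2048 elements apart in this example.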

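2606.18970 2026-06-19 cs.LG cs.AI cs.CV New submission 80%

A Controlled Benchmark of Quantum-Latent GAN Augmentation for Brain MRI

Syed Mujtaba Haider, Silvia Figini

Affiliations * Department of Mathematics; Department of Political and Social Sciences

Topic match Medical imaging: quantum GAN augmentation of brain MRI data.

AI summary Uses a controlled benchmark to compare quantum and classical generators for brain MRI data augmentation, finding that neither significantly outperforms training on real data alone and that the quantum generator offers no additional advantage.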

Details
Abstract

Medical image classification is often constrained by limited labeled data, motivating generative augmentation; recently, quantum generative models have been proposed for this purpose, frequently reporting accuracy gains. However, such claims are typically based on single training runs, do not match the parameter budgets of the quantum and classical generators, and do not characterize the data regime in which any benefit appears. We present a controlled benchmark that isolates the contribution of a quantum generator to brain-MRI augmentation. Images are encoded into a KL-regularized latent space in which a conditional Wasserstein GAN with gradient penalty is trained using either a variational quantum generator or a classical generator of near-identical parameter count (1648 vs. 1632). Synthetic samples are decoded and used to augment a pretrained classifier across labeled data fractions from 5% to 100%, evaluated over eight random seeds with paired significance testing (with multiple-comparison correction) and with intraset diversity and latent-distribution analyses. Across all fractions, no augmentation variant significantly outperforms real-data-only training, and the quantum and classical generators are statistically indistinguishable. Any low-data benefit behaves as regularization rather than faithful data expansion: synthetic samples are off-distribution and severely mode-collapsed precisely where data is scarce, and the quantum generator is no more diverse than its classical counterpart. We release the protocol as a testbed for rigorous evaluation of quantum generative augmentation in medical imaging.
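Illustrative sketch (not from the paper): the abstract mentions paired significance testing with multiple-comparison correction but does not name the correction. The Holm step-down procedure below is one common choice and is used here purely as an assumption; the p-values are hypothetical and do not come from the paper.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values, e.g. one per labeled-data fraction.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

A comparison that looks significant in isolation (here the raw values 0.04 and 0.03) can stop being significant at the 0.05 level once the number of comparisons is taken into account, which is the kind of effect a controlled benchmark has to guard against.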

2. Clinical LLMs (2 papers)

2606.19852 2026-06-19 cs.CL cs.LG New submission 85%

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

Aman Pathak, Cheng Peng, Mengxian Lyu, Ziyi Chen, Reema Solan, Sankalp Talankar, Yasir Khan, Hiren Mehta, Aokun Chen, Yi Guo, Yonghui Wu

Affiliations * Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida; Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, College of Medicine, University of Florida; College of Nursing, Florida State University

Topic match Clinical LLMs: a zero-shot LLM workflow for extracting lung pathology information.

AI summary Proposes a zero-shot agentic workflow that uses open-source large language models to extract 13 College of American Pathologists (CAP) fields from lung resection pathology reports, reaching a Micro-F1 of 0.893 without task-specific training, close to the supervised approach.

Comments 7 pages, 2 figures, 3 tables. Affiliations: (1) Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA; (2) Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, College of Medicine, University of Florida, Gainesville, FL, USA; (3) College of Nursing, Florida State University, Tallahassee, FL, USA

Details
Abstract

Information extraction from pathology reports is essential for cancer staging and tumor registry population. Yet key data remains embedded in narrative reports, making manual extraction labor-intensive and error-prone. Traditional supervised Natural Language Processing pipelines address this through fully supervised Named Entity Recognition and Relation Extraction, but require expensive manual annotation and suffer cascading failures when upstream entities are missed. In this study, we developed a zero-shot, agentic workflow, and evaluated five open-source generative Large Language Models (LLMs) to populate 13 College of American Pathologists synoptic fields from lung resection pathology reports. We compared them against a state-of-the-art supervised GatorTron NER-RE baseline using a novel, registry-aligned evaluation framework. The baseline achieved Micro-F1 of 0.960, while the best zero-shot model (GPT-OSS-20B) achieved Micro-F1 of 0.893 (recall: 0.949), accurately extracting complex relations like Pathologic Stage without task-specific training. These results suggest that open-source, zero-shot agentic LLMs are a low-cost solution for extracting lung pathology information.
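Illustrative sketch (not from the paper): Micro-F1 pools true positives, false positives, and false negatives across all fields before computing precision and recall, so frequent fields weigh more than rare ones. The field names and counts below are hypothetical, and the paper's registry-aligned evaluation framework may define matches differently.

```python
def micro_f1(counts):
    """Micro-averaged F1, precision, and recall.

    counts maps each field name to a (tp, fp, fn) tuple.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0, precision, recall
    f1 = 2 * precision * recall / (precision + recall)
    return f1, precision, recall

# Hypothetical per-field counts for two synoptic fields.
counts = {
    "histologic_type": (90, 10, 5),
    "pathologic_stage": (80, 20, 10),
}
f1, precision, recall = micro_f1(counts)
print(round(f1, 4), round(precision, 4), round(recall, 4))
```

Because F1 is the harmonic mean of precision and recall, a reported F1 together with a reported recall pins down precision. If the recall of 0.949 quoted in the abstract is also micro-averaged (an assumption), the Micro-F1 of 0.893 would correspond to a precision of roughly 0.84.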

2606.18613 2026-06-19 cs.CL cs.AI New submission 85%

Are LLMs Ready to Assist Physicians? PhysAssistBench for Interactive Doctor-Patient-EHR Assistance

Tianming Du, Peijie Yu, Sihan Shang, Danli Shi, My Linh Nguyen, Shengbo Gao, Guangyuan Li, Yinghong Yu, Yan Jiang, Qianlong Zhao, Behzad Bozorgtabar, Shaoxiong Ji, Jiazhen Pan, Daniel Rueckert, Jiancheng Yang

Affiliations * Aalto University; Tencent; Harbin Institute of Technology, Shenzhen; Hong Kong Polytechnic University; Aarhus University; Technical University of Munich

Topic match Clinical LLMs: a benchmark for LLM assistance in physician interactions.

AI summary Proposes the PhysAssistBench benchmark, which builds interactive patient agents to evaluate how well LLMs coordinate doctor-patient-EHR interactions, and finds that current models are unreliable, with the bottleneck being coordination across multiple capabilities rather than any single capability.

Comments 34 pages with 8 figures

Details
Abstract

The most plausible near-term role of medical LLMs is to assist rather than replace physicians, yet current evaluations often test isolated capabilities: clinical knowledge, EHR system interaction, or patient communication. Physician assistance instead requires coordinating these capabilities within the same interaction, where physicians issue underspecified requests, patients describe symptoms ambiguously, and EHR systems demand precise tool use. We introduce PhysAssistBench, a benchmark for interactive doctor-patient-EHR assistance. Built from real MIMIC-IV cases, PhysAssistBench uses a scalable pipeline to construct agentic patients: interactive, record-grounded agents that turn static EHR records into multi-turn clinical scenarios while preserving clinical factuality. PhysAssistBench provides a curated bilingual evaluation set of 1,296 manually reviewed and physician-validated turns. Experiments with leading LLMs show that current models remain unreliable in this setting, which exposes a key bottleneck for clinical LLMs: reliable assistance requires coordination across knowledge, communication, and systems, not isolated gains in any of them.

3. Health Monitoring (3 papers)

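2606.20074 2026-06-19 eess.SP cs.AI cs.LG New submission 80%

Evaluation of EEG Foundation Models for Event-Based Burst-Suppression Detection in ICU

Elisa Vasta, Thorir Mar Ingolfsson, Andrea Cossettini, Luca Benini, Tilman Beck, Emanuela Keller, Una Pale

Affiliations * DEI, University of Bologna, Bologna, Italy

Topic match Health monitoring: EEG monitoring in the ICU to support clinical decisions, within medical AI.

AI summary Presents the first evaluation of EEG foundation models for burst detection in the ICU without patient-specific calibration; REVE-base reaches an event-based F1-score of 0.868 and reduces burst-per-minute error by 52.1% and 36.2% relative to EEGNet and adaptive thresholding, respectively.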

Comments 4 pages, 1 figure. Code available upon publication

Details
Abstract

Burst suppression (BS) is a clinically relevant electroencephalographic (EEG) pattern used to monitor sedation depth and brain activity in critically ill patients, particularly during induced coma in Intensive Care Units (ICUs). Automatic burst detection remains challenging because BS patterns vary substantially between patients and annotated datasets are scarce. Recently, EEG Foundation Models (FMs) have shown promise across several downstream EEG applications, but their usefulness for BS detection remains unexplored. We present the first study to evaluate EEG FMs for burst detection in reduced-montage ICU EEG without patient-specific calibration. We compare REVE-base, LUNA-large and LuMamba-Tiny with an adaptive thresholding baseline and a task-specific EEGNet baseline. Additionally, we complement conventional EEG window-based classification with event-based burst detection evaluation. This helps assess clinically whether burst episodes are correctly detected, reducing the impact of expected annotation variability. The best model, REVE-base, achieved the highest event-based F1-score ($0.868 \pm 0.167$) and reduced burst-per-minute error by 52.1% and 36.2% compared to EEGNet and adaptive thresholding respectively, supporting FMs for scalable EEG monitoring in the ICU. Ablation experiments showed that full fine-tuning was the most effective adaptation strategy compared with frozen-backbone training, two-step fine-tuning, and LoRA-based adaptation, improving event-based F1-score over frozen-backbone training by up to $+0.102$ for LUNA-large. With reduced labeled datasets, pretrained REVE-base outperformed random initialization by $+0.723$ event-based F1 points at 25% of the cohort, demonstrating the benefit of pretraining FM representations when adapted to burst detection with limited labeled data.
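Illustrative sketch (not from the paper): event-based evaluation scores whole burst episodes rather than fixed EEG windows. The version below counts an annotated burst as detected if any predicted burst overlaps it in time; this any-overlap matching rule, the function names, and the example intervals are assumptions for illustration, and the paper may use a stricter criterion.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b share any time."""
    return a[0] < b[1] and b[0] < a[1]

def event_f1(true_events, pred_events):
    """Event-level F1 under an any-overlap matching rule (assumed)."""
    tp = sum(any(overlaps(t, p) for p in pred_events) for t in true_events)
    fn = len(true_events) - tp
    fp = sum(not any(overlaps(p, t) for t in true_events) for p in pred_events)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bursts_per_minute_error(true_events, pred_events, duration_s):
    """Absolute difference in burst rate, in bursts per minute."""
    minutes = duration_s / 60.0
    return abs(len(pred_events) - len(true_events)) / minutes

# Hypothetical annotations and predictions, in seconds, over a 60 s recording.
true_events = [(0.0, 2.0), (5.0, 7.0), (10.0, 12.0)]
pred_events = [(0.5, 1.5), (5.5, 8.0), (20.0, 21.0)]
print(round(event_f1(true_events, pred_events), 4))
print(bursts_per_minute_error(true_events, pred_events, 60.0))
```

Under this rule, a prediction whose boundaries differ slightly from the annotation still counts as a hit, which is consistent with the abstract's stated aim of reducing the impact of annotation variability.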

2606.19888 2026-06-19 cs.LG cs.AI New submission 80%

SL-S4Wave: Self-Supervised Learning of Physiological Waveforms with Structured State Space Models

Feng Wu, Harsh Deep, Eric Lehman, Sanyam Kapoor, Guoshuai Zhao, Rahul Krishnan, Gari Clifford, Li-wei H Lehman

Affiliations * Massachusetts Institute of Technology; OpenEvidence, USA; New York University; Xi’an Jiaotong University; University of Toronto; Emory University

Topic match Health monitoring: self-supervised learning of physiological waveforms for arrhythmia detection.

AI summary Proposes SL-S4Wave, a framework that combines contrastive learning with an encoder built on structured state space models and uses global convolution with multiscale subkernels to capture local and long-range dependencies in multichannel physiological waveforms, outperforming existing methods on tasks such as arrhythmia detection.

Details
Abstract

Modeling long-sequence medical time series data, such as electrocardiograms (ECG), poses significant challenges due to high sampling rates, multichannel signal complexity, inherent noise, and limited labeled data. While recent self-supervised learning (SSL) methods, based on various encoder architectures such as convolutional neural networks, have been proposed to learn representations from unlabeled data, they often fall short in capturing long-range dependencies and noise-invariant features. Structured state space models (S4) excel at long-sequence modeling, but existing S4 architectures fail to capture the unique characteristics of multichannel physiological waveforms. In this work, we propose SL-S4Wave, a self-supervised learning framework that combines contrastive learning with a tailored encoder built on structured state space models. The encoder incorporates multi-layer global convolution using multiscale subkernels, enabling the capture of both fine-grained local patterns and long-range temporal dependencies in noisy, high-resolution multichannel waveforms. Extensive experiments on real-world datasets demonstrate that SL-S4Wave (1) consistently outperforms state-of-the-art supervised and self-supervised baselines in a challenging arrhythmia detection task, (2) achieves high performance with significantly fewer labeled examples, showcasing strong label efficiency, (3) maintains robust performance on long waveform segments, highlighting its capacity to model complex temporal dynamics in long sequences that most existing approaches fail to model efficiently, and (4) transfers effectively to unseen arrhythmia types, underscoring its robust cross-domain generalization. We additionally evaluate SL-S4Wave on multiple EEG tasks, achieving superior performance over strong baselines and demonstrating the generalizability of our approach beyond cardiac waveforms.

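2606.19405 2026-06-19 q-bio.QM math.DS q-bio.PE New submission 70%

Multi-type branching inference on contact trees with application to COVID-19

Augustine Okolie, Johannes Müller, Eno Akarawakc, Isaac Ajiboye

Topic match Health monitoring: applied to inferring COVID-19 epidemiological parameters.

AI summary Proposes a likelihood framework that operates directly on transmission trees over contact trees, accounts for contact-degree heterogeneity through a multi-type branching process, infers epidemiological parameters from partially resolved transmission trees, and is demonstrated on COVID-19 contact-tracing data.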

Comments 26 pages, 8 Figures

Details
Abstract

Inferring epidemiological parameters from transmission trees is essential for understanding infectious disease dynamics. Existing tree-based likelihood methods, including the multi-type birth-death models originally applied in phylodynamic settings, provide powerful tools, but most assume homogeneous mixing and rarely capture how transmission potential changes as an individual infects more of their contacts. In this work, we develop a likelihood framework that operates directly on transmission trees, in which nodes are individuals and edges are reported transmission events, with no sequence data involved. We derive a likelihood for a stochastic SIR process on a rooted contact tree in which each infected individual is characterised by the total number of effective contacts and the number of already infected downstream contacts. We obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood can be evaluated for a rooted contact tree with known tip states, and we extend it to partially resolved trees by treating internal branching times as latent variables. Validation on simulated outbreaks confirms accurate parameter recovery and well-calibrated uncertainty. Application to empirical COVID-19 contact-tracing data from Karnataka, India, demonstrates the framework's utility for real epidemiological settings. By incorporating contact-degree heterogeneity in a multi-type branching likelihood, our work provides a principled baseline for inferring both transmission dynamics and contact structure from fully or partially resolved transmission trees, complementing rather than relying on sequence-based phylodynamic inference.

4. Other Medical AI (1 paper)

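2606.19827 2026-06-19 cs.LG cs.AI New submission 80%

When, Where, and How: Adaptive Binning for Tabular Self-Supervised Learning

Daehwan Kim, Haejun Chung, Ikbeom Jang

Affiliations * Hanyang University; Hankuk University of Foreign Studies

Topic match Other medical AI: adaptive binning for self-supervised learning on medical tabular data.

AI summary Proposes Adaptive Binning, which dynamically refines discretization through a feature-wise coarse-to-fine curriculum and combines categorical reconstruction with ordinal supervision, improving self-supervised learning performance on medical tabular data.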

Comments Accepted to MICCAI 2026

Details
Abstract

Medical tabular data are ubiquitous in clinical research, but deep learning for tables remains underexplored because reliable labels often require costly expert adjudication, even though structured clinical variables are routinely available in tabular form. Self-supervised learning can leverage these unlabeled tables, and recent binning-based pretexts offer a promising inductive bias, but existing objectives fix a single global quantile discretization and apply feature-agnostic supervision. We propose Adaptive Binning, a training-adaptive discretization pretext for tabular SSL that couples discretization to learning through a feature-wise coarse-to-fine curriculum. Motivated by the spectral bias of neural networks and the principles of curriculum learning, our method progressively refines discretization per feature upon plateau detection and selects representation-aware splits to jointly improve value-space concentration and representation-space coherence. A heterogeneity-aware objective unifies categorical reconstruction with ordinal supervision for numerical features, and experiments on public medical tabular datasets under unified evaluation protocols show consistent gains for linear probing and fine-tuning without dataset-specific discretization tuning. We further introduce a medical tabular SSL benchmark with standardized protocols to support reproducible progress in this underexplored domain. Our code is available at https://github.com/labhai/Adaptive-Binning.
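Illustrative sketch (not the authors' code): a binning pretext task turns each numerical feature into a discrete target by cutting it at quantiles. The snippet below shows the fixed quantile discretization that the abstract describes as the existing baseline, evaluated at two resolutions to suggest what a coarse-to-fine schedule might look like. The function names, the doubling of the bin count, and the example values are assumptions; the paper's plateau detection and representation-aware split selection are not reproduced.

```python
import bisect

def quantile_edges(values, n_bins):
    """Interior bin edges at evenly spaced empirical quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    edges = []
    for k in range(1, n_bins):
        idx = min(n - 1, (k * n) // n_bins)
        edges.append(ordered[idx])
    # Duplicate edges can occur with tied values; keep each edge once.
    return sorted(set(edges))

def assign_bins(values, edges):
    """Bin index for each value; values equal to an edge go to the upper bin."""
    return [bisect.bisect_right(edges, v) for v in values]

# Hypothetical numerical feature, e.g. a lab value for eight patients.
feature = [1, 2, 3, 4, 5, 6, 7, 8]

coarse = assign_bins(feature, quantile_edges(feature, 2))
fine = assign_bins(feature, quantile_edges(feature, 4))
print(coarse)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(fine)    # [0, 0, 1, 1, 2, 2, 3, 3]
```

A model trained to predict the coarse targets first and the fine targets later faces an easier task early on, which is the general intuition behind a coarse-to-fine curriculum.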