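arXivDaily: arXiv daily academic digest, updated Monday to Friday

Science and Medicine

Medical AI

Medical intelligence, clinical AI, medical imaging, pathology, diagnosis, and large models for healthcare.

Indexed today (current date): 41. Sources: cs.CV, cs.LG, q-bio, eess.IV, eess.SP

2606.19300 2026-06-18 cs.CV cs.LG New submission 95%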

Confidence is Not Reliability: Rethinking MC Dropout in Brain Tumour Segmentation

Xin Ci Wong, Duygu Sarikaya, Kieran Zucker, Marc De Kamps, Nishant Ravikumar

Affiliations: Centre for Doctoral Training in AI for Medical Diagnosis and Care; School of Computing, University of Leeds; School of Computer Science, University of Leeds; Leeds Cancer Centre, St James’s University Hospital, Leeds, UK

Topic match: Medical imaging. MC Dropout uncertainty estimation in brain tumour segmentation, with a focus on clinical safety.

AI summary: Using MC Dropout uncertainty estimation, the study finds that global uncertainty-error alignment (AUROC ≈ 0.97) can mask severe miscalibration (ECE = 0.915) in a critical sub-region such as the enhancing tumour, indicating that sub-region calibration assessment is essential for clinical safety.

Comments Accepted for MIUA2016

Abstract

Glioma segmentation in multiparametric MRI is a critical component of treatment planning. A segmentation model that fails silently on treatment-critical sub-regions represents a patient safety risk that overlap-based metrics such as Dice scores cannot expose. We ask whether voxel-level uncertainty estimation via Monte Carlo (MC) Dropout can reliably identify segmentation errors in clinically critical sub-regions, and whether calibration failure modes are detectable from standard reporting metrics alone. In an empirical two-model case study on 126 BraTS21 patients, we evaluate a high-performance pretrained SegResNet and a locally trained UNet with residual units (UNet-Res). MC Dropout preserved segmentation accuracy ($|\Delta \text{Dice}| < 0.01$) while achieving strong uncertainty-error alignment (AUROC for entropy (H) $\approx 0.97$), indicating uncertainty correctly ranks erroneous voxels above correct ones. Entropy-based patient stratification identified a high-uncertainty subgroup with substantially lower segmentation performance (median whole-tumour Dice $0.835$ vs. $0.925$), supporting uncertainty as a practical triage signal. However, global alignment can mask important region-specific differences. Despite similar AUROC, UNet-Res exhibited near-zero enhancing tumour entropy ($0.054$) and Expected Calibration Error (ECE) of $0.915$, with a Dice of only $0.714$, indicating severely miscalibrated confidence on the most clinically critical sub-region, a failure mode invisible to standard Dice and AUROC reporting. These findings demonstrate that strong uncertainty-error alignment is necessary but insufficient for clinical safety: sub-region-specific calibration assessment must accompany AUROC evaluation when selecting models for clinical deployment.
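
The two quantities this abstract contrasts, predictive entropy and Expected Calibration Error, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the equal-width 10-bin scheme, and the use of the mean over stochastic passes are assumptions, since the abstract does not specify them.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy of the mean class distribution for one voxel.

    mc_probs: list of T probability vectors, one per stochastic
    (dropout-enabled) forward pass. Averaging over passes is an assumption.
    """
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / len(mc_probs) for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0.0)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    n = len(confidences)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        # Bins are (lo, hi]; a confidence of exactly 0.0 goes in the first bin.
        idx = [i for i, c in enumerate(confidences)
               if (lo < c <= hi) or (b == 0 and c == 0.0)]
        if not idx:
            continue
        acc = sum(correct[i] for i in idx) / len(idx)
        conf = sum(confidences[i] for i in idx) / len(idx)
        ece += len(idx) / n * abs(acc - conf)
    return ece
```

A voxel whose passes all predict the same class with probability near 1 has entropy near zero. If such voxels are frequently wrong, the ECE approaches 1, which is the pattern the abstract reports for the enhancing tumour sub-region.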

2606.18707 2026-06-18 cs.CV New submission 95%

PEFT-MedSAM: Efficient Fine-Tuning of Medical Foundation Models for Explainable Skin Lesion Segmentation

Asad Channa, Abdullah Khan, Asghar Ali Chandio, Aamir Akbar, Shahzad Memon, Aqib Hussain, Ameer Hamza

Affiliations: Department of Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Artificial Intelligence, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Computer Science, Sindh Madressatul Islam University, City Campus, Karachi; Department of Computer Science and Digital Technologies, School of Architecture, Computing and Engineering, University of East London

Topic match: Medical imaging. Proposes a fine-tuning method for medical image segmentation, applied to skin lesion segmentation.

AI summary: Proposes PEFT-MedSAM, a parameter-efficient fine-tuning method that freezes the pre-trained encoders and trains only the lightweight mask decoder, reaching a Dice coefficient of 0.9411 on ISIC 2018, and uses Grad-CAM explainability to strengthen clinical trustworthiness.

Abstract

Automated segmentation of skin lesions in dermoscopic images with deep learning models can help detect melanomas earlier than they would otherwise be found. However, most available deep learning methods do not perform well. This paper presents a parameter-efficient fine-tuning method, PEFT-MedSAM, for adapting the Medical Segment Anything Model (MedSAM) to automatically segment dermoscopic skin lesions. PEFT-MedSAM trains only the lightweight mask decoder while keeping the pre-trained image encoder and prompt encoder frozen. Experiments on the ISIC 2018 benchmark dataset show that PEFT-MedSAM obtains a Dice coefficient of 0.9411 and an intersection over union of 0.8918, compared with a fully trained U-Net baseline (0.8715 Dice coefficient) and zero-shot MedSAM inference (0.8997 Dice coefficient). External validation on the PH2 dataset yields a Dice coefficient of 0.9467 with a standard deviation of ±0.0310. Supporting evidence for these claims includes a p-value below 0.0001 for Wilcoxon signed-rank tests comparing the two datasets and a bootstrap-estimated 95% confidence interval of [0.9364, 0.9447], the estimated range of the average Dice coefficient under repeated testing. To increase clinical trustworthiness, we used Grad-CAM explainability together with a pointing-game-based evaluation methodology to evaluate the CNN baseline model on the validation set. The results showed an accuracy of 98.27% on the validation set of 519 images and confirmed that the model classified regions containing skin lesions.
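
For readers comparing the Dice and intersection-over-union figures above, the following sketch shows how both are computed for a single pair of binary masks. The function name and the flat 0/1 list representation are illustrative assumptions. Returning 1.0 when both masks are empty is one common convention, not something the abstract states.

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two binary masks given as flat 0/1 sequences."""
    inter = sum(p * t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks empty: treated as a perfect match (assumed convention).
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


# Hypothetical 4-pixel masks, only to exercise the function.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

For a single mask pair the two metrics are tied by IoU = Dice / (2 - Dice). That identity does not hold exactly for dataset averages, so a mean IoU of 0.8918 next to a mean Dice of 0.9411 is not a contradiction.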

2606.18682 2026-06-18 cs.CV New submission 95%

Multi-Class Brain Tumor Classification Using Advanced Deep Learning Models: A Comparative Study

Asad Channa, Asghar Ali Chandio, Akhtar Hussain Jalbani, Mehwish Leghari, Shahzad Memon

Affiliations: Department of Computer Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Artificial Intelligence, Quaid-e-Awam University of Engineering, Sciences & Technology; The Faculty of Artificial Intelligence and Cyber Security, Universiti Teknikal Malaysia Melaka; Department of Data Science, Quaid-e-Awam University of Engineering, Sciences & Technology; Department of Computer Science and Digital Technologies, School of Architecture, Computing and Engineering, University of East London

Topic match: Medical imaging. Brain tumor classification from MRI, comparing CNN architectures.

AI summary: Compares five CNN architectures (a customized model and four pre-trained models) for multi-class brain tumor classification on about 10,000 MRI images; EfficientNetB0 performs best at 95% accuracy and, in particular, raises meningioma recall substantially (to 89%).

Abstract

Despite recent advances in deep learning, accurately classifying brain tumors from MRI images remains challenging. In this research, we present a comprehensive evaluation of five convolutional neural network (CNN) architectures, a customized baseline model and four pre-trained models, for multi-class brain tumor classification on a clinically sourced dataset of approximately 10,000 MRI images. The four pre-trained architectures (VGG16, VGG19, DenseNet121, and EfficientNetB0) and the customized CNN were all trained and tested within an identical experimental framework. Performance was measured by both overall accuracy and tumor-wise recall to capture the clinically relevant performance of each architecture. EfficientNetB0 had the best overall classification accuracy at 95%, compared with VGG16 (94.37%), VGG19 (92.29%), DenseNet121 (90.91%), and the customized CNN (78.00%). An especially important finding was the considerable improvement in detecting meningiomas: while simple CNNs detected meningiomas with a recall of approximately 20%, EfficientNetB0 reached a recall of 89%. Meningiomas are often difficult to detect because they can appear very subtly on MRI images. Additionally, the deeper VGG19 performed worse than the shallower VGG16, which indicates that in many cases the architectural efficiency of a CNN model may matter more than its depth when working with medical images. Overall, EfficientNetB0 appears to provide the best trade-off between classification accuracy, number of model parameters, and clinically meaningful performance.
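
The gap between overall accuracy and tumor-wise recall that this abstract highlights can be illustrated with a small sketch. The function name and the toy labels ("other", "meningioma") are hypothetical and are not taken from the paper's dataset.

```python
def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples of a class / all samples of that class."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (1 if t == p else 0)
    return {c: hits[c] / totals[c] for c in totals}


# Toy labels (hypothetical): one of the two meningioma cases is missed.
y_true = ["other"] * 8 + ["meningioma"] * 2
y_pred = ["other"] * 9 + ["meningioma"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
```

In this toy case overall accuracy stays high while recall for the minority class is much lower, which is why the abstract reports both.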

2606.18675 2026-06-18 cs.CV New submission 95%

BrainFusionNet: a deep learning and XAI model to understand local, global, and sequential features of MRI images for improved brain tumour detection

Md Taimur Ahad, Bo Song, Yan Li

Affiliations: School of Mathematics, Physics and Computing, University of Southern Queensland; School of Engineering, University of Southern Queensland

Topic match: Medical imaging. Hybrid model for brain tumour detection combining CNN, ViT, and GRU.

AI summary: Proposes BrainFusionNet, a hybrid model that combines CNN, ViT, and GRU to extract spatial, contextual, and sequential features from MRI, and integrates SHAP, LIME, and GradCAM for explainability analysis; it reaches 98% accuracy on public datasets, outperforming SOTA CNNs.

Journal ref Brain Inf. 13, 21 (2026)

Abstract

The noise of Magnetic Resonance Imaging (MRI) poses challenges for Deep Learning (DL) when tumor boundaries are obscured and tumor location and appearance are complex. Therefore, we develop BrainFusionNet, which combines Convolutional Neural Networks (CNNs), Vision Transformers (ViT), and Gated Recurrent Units (GRUs) to extract spatial, contextual, and sequential features from MRI images for improved brain tumor classification. Furthermore, explainable AI methods such as SHAP, LIME, and GradCAM are integrated to visualise and highlight image regions that contribute to BrainFusionNet's decision-making process. The proposed BrainFusionNet model is evaluated on two publicly available MRI datasets. K-fold validation suggests 98% accuracy on both datasets. The model was compared with six state-of-the-art (SOTA) CNNs and transfer learning. Among the SOTA CNNs, DenseNet121 and VGG16 achieved the highest accuracy of 96%. The novelty of BrainFusionNet is that the hybrid model effectively extracts local and global features from MRI images, even in small-scale tumor regions and small tumor sizes. The model has a balanced sequential CNN architecture to capture low-level and deeper-layer features, and a customized ViT that captures local features, stabilizes gradient flow, and reduces the risk of vanishing gradients during MRI image training. The CNN and ViT outputs are fed into a GRU for final classification. Furthermore, we analyze pixel intensities to determine whether MRI image quality affects image classification. Our findings are very novel in image interpretation, as we found that the distribution of pixel intensities in MRI images affects DL performance.

2606.18609 2026-06-18 cs.CV New submission 95%

Hallucination Detection and Correction in Medical VLMs via Counter-Evidence Verification

Nan Zhou, Ke Zou, Meng Liu, Linchao He, Jiaqi Zhu, Yi Zhang, Hu Chen, Huazhu Fu

Affiliations: College of Computer Science, Sichuan University; Yong Loo Lin School of Medicine, National University of Singapore; Key Laboratory of Data Protection and Intelligent Management, Ministry of Education, Sichuan University; National Key Laboratory of Autonomous Intelligent Unmanned Systems, Beijing Institute of Technology; Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR)

Topic match: Medical imaging. Proposes the CoEV framework for detecting and correcting hallucinations in medical VLMs, with a focus on medical diagnosis.

AI summary: Proposes CoEV, a framework that detects and corrects hallucinations in medical VLMs through bidirectional verification between textual and visual evidence, without retraining, and markedly improves detection and correction performance on four datasets.

Comments MICCAI 2026 Accept. Submission Version

Abstract

The reliability of vision-language models (VLMs) in medical diagnosis is challenged by trust-undermining hallucinations. Existing hallucination detection approaches mainly focus on identifying factual inconsistencies between generated text and reference data. While some studies analyze where models attend in images, they seldom verify whether such attention truly reflects the visual evidence supporting the generated text. To address this gap, we propose Counter-Evidence Verification (CoEV), a training-free plug-and-play framework that detects and corrects hallucinations through evidence-based factual consistency verification. CoEV performs bidirectional verification between textual assertions and visual evidence, testing whether each statement is supported by its corresponding evidence region, and assigns each statement to a four-quadrant diagnostic map capturing combinations of text factuality and visual grounding. CoEV detects hallucinated content and serves as a post hoc refinement tool, correcting hallucinations without retraining. Extensive experiments on four medical datasets show that CoEV combats hallucinations in VLMs. For hallucination detection, CoEV consistently outperforms existing methods, improving average PR-AUC and ROC-AUC by 3.0 and 3.9 absolute percentage points respectively, with notable gains of up to 18.5% in specific VQA scenarios. For hallucination correction, it improves Micro-F1 by up to 12.5%, reduces hallucination rates by over 11.9% on medical report generation, and also boosts medical VQA accuracy. These results show that CoEV enables reliable detection and correction of hallucinations, providing clinicians with dependable, evidence-based cues for diagnosis. Code will be released upon acceptance.

2604.14837 2026-06-18 cs.CV 95%

Improved Multiscale Structural Mapping with Supervertex Vision Transformer for the Detection of Alzheimer's Disease Neurodegeneration

Geonwoo Baek, David H. Salat, Ikbeom Jang

Affiliations: Department of Computer Science & Engineering, Hankuk University of Foreign Studies, Seoul, Republic of Korea; Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA; Department of Radiology, Harvard Medical School, Boston, MA, USA; Neuroimaging Research for Veterans (NeRVe) Center, VA Boston Healthcare System, Boston, MA, USA

Topic match: Medical imaging. Detects Alzheimer's disease from MRI.

AI summary: Proposes MSSM+ combined with SSVM and SV-ViT, improving the accuracy of early Alzheimer's disease detection through multiscale structural mapping and supervertex mapping, with more pronounced group differences and better classification performance.

Comments Submitted to Human Brain Mapping

Journal ref Human Brain Mapping 47(8), e70548 (2026)

Abstract

Alzheimer's disease (AD) confirmation often relies on positron emission tomography (PET) or cerebrospinal fluid (CSF) analysis, which are costly and invasive. Consequently, structural MRI biomarkers such as cortical thickness (CT) are widely used for non-invasive AD screening. Multiscale structural mapping (MSSM) was recently proposed to integrate gray-white matter contrasts (GWCs) with CT from a single T1-weighted MRI (T1w) scan. Building on this framework, we propose MSSM+, together with surface supervertex mapping (SSVM) and a Supervertex Vision Transformer (SV-ViT). 3D T1w images from individuals with AD and cognitively normal (CN) controls were analyzed. MSSM+ extends MSSM by incorporating sulcal depth and cortical curvature at the vertex level. SSVM partitions the cortical surface into supervertices (surface patches) that effectively represent inter- and intra-regional spatial relationships. SV-ViT is a Vision Transformer architecture operating on these supervertices, enabling anatomically informed learning from surface mesh representations. Compared with MSSM, MSSM+ identified more spatially extensive and statistically significant group differences between AD and CN. In AD vs. CN classification, MSSM+ achieved a 3%p higher area under the precision-recall curve than MSSM. Vendor-specific analyses further demonstrated reduced signal variability and consistently improved classification performance across MR manufacturers relative to CT, GWCs, and MSSM. These findings suggest that MSSM+ combined with SV-ViT is a promising MRI-based imaging marker for AD detection prior to CSF/PET confirmation.

2602.02370 2026-06-18 cs.CV 95%

Uncertainty-Aware Image Classification In Biomedical Imaging Using Spectral-normalized Neural Gaussian Processes

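Uma Meleti, Jeffrey J. Nirschl

Affiliations: Department of Pathology & Lab Medicine, University of Wisconsin-Madison

Topic match: Medical imaging. Biomedical image classification.

AI summary: Implements SNGP, which applies spectral normalization and a Gaussian process output layer to improve single-model uncertainty estimation and out-of-distribution detection, with strong results across three biomedical classification tasks.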

Comments Published at the IEEE International Symposium on Biomedical Imaging (ISBI) 2026

Journal ref Proc. 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI),London, United Kingdom, Apr. 8-11, 2026, pp. [1-4], 2026

Abstract

Accurate histopathologic interpretation is key for clinical decision-making; however, current deep learning models for digital pathology are often overconfident and poorly calibrated in out-of-distribution (OOD) settings, which limits trust and clinical adoption. Safety-critical medical imaging workflows benefit from intrinsic uncertainty-aware properties that can accurately reject OOD input. We implement the Spectral-normalized Neural Gaussian Process (SNGP), a set of lightweight modifications that apply spectral normalization and replace the final dense layer with a Gaussian process layer to improve single-model uncertainty estimation and OOD detection. We evaluate SNGP against deterministic and Monte Carlo dropout baselines on six datasets across three biomedical classification tasks: white blood cells, amyloid plaques, and colorectal histopathology. SNGP has comparable in-distribution performance while significantly improving uncertainty estimation and OOD detection. Thus, SNGP or related models offer a useful framework for uncertainty-aware classification in digital pathology, supporting safe deployment and building trust with pathologists.
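
The spectral normalization step mentioned in this abstract can be sketched with plain power iteration. This is a minimal illustration, assuming a dense weight matrix stored as a list of rows and a soft bound that rescales only when the estimated spectral norm exceeds `bound`. The function names and defaults are hypothetical, and the Gaussian process output layer is not covered.

```python
import math


def spectral_norm(W, n_iter=50):
    """Estimate the largest singular value of W (list of rows) by power iteration."""
    n_rows, n_cols = len(W), len(W[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    sigma = 0.0
    for _ in range(n_iter):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]  # u = W v
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            return 0.0
        u = [x / sigma for x in u]
        v = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # v = W^T u
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
    return sigma


def spectrally_normalize(W, bound=1.0):
    """Rescale W so its spectral norm is at most `bound`; leave it unchanged otherwise."""
    sigma = spectral_norm(W)
    if sigma <= bound:
        return [row[:] for row in W]
    scale = bound / sigma
    return [[scale * w for w in row] for row in W]
```

Bounding the spectral norm of each layer limits how much hidden representations can stretch or collapse distances between inputs, which is the property a distance-aware output layer relies on.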

2508.11211 2026-06-18 eess.IV cs.CV Updated version 95%

Efficient Image-to-Image Schrödinger Bridge for CT Field of View Extension

Zhenhao Li, Song Ni, Long Yang, Xiaojie Yin, Haijun Yu, Jiazhou Wang, Hongbin Han, Weigang Hu, Yixing Huang

Affiliations: Institute of Medical Technology, Peking University Health Science Center; Shanghai Cancer Center, Fudan University; Department of Electrical and Computer Engineering, University of Massachusetts Lowell; Beijing Key Laboratory of Intelligent Neuromodulation and Brain Disorder Treatment

Topic match: Medical imaging. CT field-of-view extension.

AI summary: Proposes a CT field-of-view extension framework based on the image-to-image Schrödinger Bridge (I²SB) diffusion model; by directly learning the stochastic mapping between limited-FOV and extended-FOV images, it enables fast one-step inference and surpasses existing diffusion models in both accuracy and speed.

Comments 12 pages

Journal ref IEEE Transactions on Radiation and Plasma Medical Sciences 2026

Abstract

Computed tomography (CT) is a cornerstone imaging modality for non-invasive, high-resolution visualization of internal anatomical structures. However, when the scanned object exceeds the scanner's field of view (FOV), projection data are truncated, resulting in incomplete reconstructions and pronounced artifacts near FOV boundaries. Conventional reconstruction algorithms struggle to recover accurate anatomy from such data, limiting clinical reliability. Deep learning approaches have been explored for FOV extension, with diffusion generative models representing the latest advances in image synthesis. Yet, conventional diffusion models are computationally demanding and slow at inference due to their iterative sampling process. To address these limitations, we propose an efficient CT FOV extension framework based on the image-to-image Schrödinger Bridge (I$^2$SB) diffusion model. Unlike traditional diffusion models that synthesize images from pure Gaussian noise, I$^2$SB learns a direct stochastic mapping between paired limited-FOV and extended-FOV images. This direct correspondence yields a more interpretable and traceable generative process, enhancing anatomical consistency and structural fidelity in reconstructions. I$^2$SB achieves superior quantitative performance, with root-mean-square error (RMSE) values of 49.8 HU on simulated noisy data and 152.0 HU on real data, outperforming state-of-the-art diffusion models such as conditional denoising diffusion probabilistic models (cDDPM) and patch-based diffusion methods. Moreover, its one-step inference enables reconstruction in just 0.19 s per 2D slice, representing over a 700-fold speedup compared to cDDPM (135 s) and surpassing DiffusionGAN (0.58 s), the second fastest. This combination of accuracy and efficiency indicates that I$^2$SB has potential for real-time or clinical deployment.
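
The speedup figures quoted in this abstract follow directly from the per-slice timings it reports. The short calculation below is only an arithmetic check; the variable names are illustrative.

```python
# Per-slice inference times in seconds, as reported in the abstract.
cddpm_s = 135.0
diffusiongan_s = 0.58
i2sb_s = 0.19

speedup_vs_cddpm = cddpm_s / i2sb_s                # a little over 700x
speedup_vs_diffusiongan = diffusiongan_s / i2sb_s  # roughly 3x
slices_per_second = 1.0 / i2sb_s                   # roughly 5 slices per second
```

The first ratio is consistent with the "over a 700-fold speedup" claim. The throughput of a few slices per second gives a rough sense of what real-time use would mean for a multi-slice volume.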

2512.09185 2026-06-18 cs.CV cs.AI Updated version 95%

Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation

Hao Chen, Rui Yin, Yifan Chen, Qi Chen, Chao Li

Affiliations: University of Cambridge; Nanjing First Hospital; Nanjing Medical University; Johns Hopkins University; University of Dundee

Topic match: Medical imaging. Proposes a longitudinal MRI generation framework that models disease progression.

AI summary: Proposes the Δ-LFM framework, which uses flow matching to align patients' latent trajectories and models monotonic disease progression through patient-specific latent alignment; interpretability and performance are validated on three longitudinal MRI benchmarks.

Comments ICLR 2026 accepted

Abstract

Understanding disease progression is a central clinical challenge with direct implications for early diagnosis and personalized treatment. While recent generative approaches have attempted to model progression, key mismatches remain: disease dynamics are inherently continuous and monotonic, yet latent representations are often scattered and lack semantic structure, and diffusion-based models disrupt continuity with a random denoising process. In this work, we propose to treat the disease dynamic as a velocity field and leverage Flow Matching (FM) to align the temporal evolution of patient data. Unlike prior methods, it captures the intrinsic dynamic of disease, making the progression more interpretable. However, a key challenge remains: in latent space, Auto-Encoders (AEs) do not guarantee alignment across patients or correlation with clinical-severity indicators (e.g., age and disease conditions). To address this, we propose to learn patient-specific latent alignment, which constrains patient trajectories to lie along a specific axis, with magnitude increasing monotonically with disease severity. This leads to a consistent and semantically meaningful latent space. Together, we present $\Delta$-LFM, a framework for modeling patient-specific latent progression with flow matching. Across three longitudinal MRI benchmarks, $\Delta$-LFM demonstrates strong empirical performance and, more importantly, offers a new framework for interpreting and visualizing disease dynamics.
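
To make the "velocity field" idea concrete, here is a minimal sketch of a flow matching training target between two latent codes of the same patient at an earlier and a later visit. It assumes a straight-line interpolation path, a common choice in flow matching. The abstract does not state which path Δ-LFM uses, and all names here (`interpolate`, `target_velocity`, `flow_matching_loss`, `predict_velocity`) are hypothetical.

```python
def interpolate(z0, z1, t):
    """Point on the straight line from latent z0 (earlier visit) to z1 (later visit)."""
    return [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]


def target_velocity(z0, z1):
    """Velocity of the straight-line path; it is constant in t."""
    return [b - a for a, b in zip(z0, z1)]


def flow_matching_loss(predict_velocity, z0, z1, t):
    """Mean squared error between a model's velocity at (z_t, t) and the target."""
    z_t = interpolate(z0, z1, t)
    v_pred = predict_velocity(z_t, t)
    v_true = target_velocity(z0, z1)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(v_true)


# Toy 2-D latents; a model that always predicts zero velocity incurs a positive loss.
loss_zero_model = flow_matching_loss(lambda z, t: [0.0, 0.0], [0.0, 0.0], [2.0, 4.0], 0.5)
```

In a trained model `predict_velocity` would be a neural network. Integrating it forward from a patient's current latent code would then yield a predicted future state to decode into an image.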

2510.10779 2026-06-18 cs.CV Updated version 95%

Structured Spectral Graph Representation Learning for Multi-label Abnormality Analysis from 3D CT Scans

Theo Di Piazza, Carole Lazarus, Olivier Nempont, Loic Boussel

Affiliations: INSA Lyon, University of Lyon, CNRS, INSERM, CREATIS UMR 5220, U1294

Topic match: Medical imaging. Abnormality analysis of 3D CT with multi-label classification.

AI summary: Proposes a 2.5D framework based on spectral graph convolution that represents a 3D CT volume as a structured graph, with axial slice triplets as nodes modelling inter-slice dependencies, for multi-label abnormality classification with strong cross-dataset generalization.

Comments Accepted at MELBA Journal 2026

Abstract

With the growing volume of CT examinations, there is an increasing demand for automated tools such as organ segmentation, abnormality detection, and report generation to support radiologists in managing their clinical workload. Multi-label classification of 3D Chest CT scans remains a critical yet challenging problem due to the complex spatial relationships inherent in volumetric data and the wide variability of abnormalities. Existing methods based on 3D convolutional neural networks struggle to capture long-range dependencies, while Vision Transformers often require extensive pre-training on large-scale, domain-specific datasets to perform competitively. In this work, we propose a 2.5D alternative by introducing a new graph-based framework that represents 3D CT volumes as structured graphs, where axial slice triplets serve as nodes processed through spectral graph convolution, enabling the model to reason over inter-slice dependencies while maintaining complexity compatible with clinical deployment. Our method, trained and evaluated on 3 datasets from independent institutions, achieves strong cross-dataset generalization, and shows competitive performance compared to state-of-the-art visual encoders. We further conduct comprehensive ablation studies to evaluate the impact of various aggregation strategies, edge-weighting schemes, and graph connectivity patterns. Additionally, we demonstrate the broader applicability of our approach through transfer experiments on automated radiology report generation and abdominal CT data.
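
The graph construction described in this abstract can be sketched as follows. The sketch assumes non-overlapping triplets of consecutive axial slices, a chain graph linking neighbouring triplets, and the widely used first-order approximation of spectral graph convolution with symmetric normalization. The paper's actual triplet scheme, connectivity, and edge weights are not given in the abstract, so every name and choice here is hypothetical.

```python
import math


def slice_triplets(n_slices):
    """Group consecutive axial slice indices into non-overlapping triplets (one graph node each).

    Trailing slices that do not fill a triplet are dropped in this sketch.
    """
    return [tuple(range(i, i + 3)) for i in range(0, n_slices - 2, 3)]


def normalized_adjacency(n_nodes):
    """Chain graph with self-loops, normalized as D^-1/2 (A + I) D^-1/2."""
    A = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n_nodes)] for i in range(n_nodes)]
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n_nodes)]
            for i in range(n_nodes)]


def propagate(A_hat, X):
    """One propagation step A_hat @ X; learned weights and nonlinearity are omitted."""
    n_feat = len(X[0])
    return [[sum(A_hat[i][k] * X[k][f] for k in range(len(X))) for f in range(n_feat)]
            for i in range(len(A_hat))]
```

Each propagation step mixes a node's features with those of adjacent triplets, so stacking a few steps lets information travel across slices without a full 3D convolution.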

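2606.18354 2026-06-18 eess.IV cs.LG New submission 90%

Structural MRI Synthesis for Alzheimer's Disease via Conditional Diffusion on Anatomical Masks

Muge Zhang, Muhammad Ali Khaliq, Jamal Alsakran, Byeong Kil Lee, Jeeho Ryoo

Affiliations: Fairleigh Dickinson University; University of Colorado at Colorado Springs

Topic match: Medical imaging. Synthesizes structural MRI for Alzheimer's disease with a conditional diffusion model.

AI summary: To address the difficulty of capturing subtle anatomical changes when synthesizing structural MRI for Alzheimer's disease, the paper extends the Med-DDPM conditional diffusion model to generate 3D structural MRI conditioned on anatomical segmentation masks; models trained on synthetic data reach Dice scores comparable to those trained on real data, and training on hybrid data improves performance markedly.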

Journal ref 2025 IEEE 8th International Conference on Multimedia Information Processing and Retrieval (MIPR)

Abstract

Recent advances in generative machine learning models have significantly improved medical imaging, offering promising solutions for data augmentation, privacy preservation, and improved model generalization. However, synthesizing high-quality structural MRI data for Alzheimer's Disease (AD) remains challenging due to the subtle, region-specific, and progressive anatomical changes associated with neurodegeneration. In this paper, we extend the Med-DDPM conditional diffusion model -- originally designed for brain tumor synthesis -- to generate 3D structural MRIs specifically tailored to AD. We adopted Med-DDPM due to its established stability and structural fidelity compared to other generative models, which makes it particularly suitable for capturing the subtle anatomical changes characteristic of AD. Our approach conditions the diffusion process on anatomical segmentation masks derived from the ADNI dataset, incorporating key AD-relevant brain structures into the generation process. We systematically evaluate the quality and utility of the synthetic images by training segmentation models on real, synthetic, and hybrid (mixed) datasets. Experimental results demonstrate that segmentation models trained exclusively on synthetic data achieve comparable Dice scores (0.6532) to those trained on real data (0.6513), while exhibiting significantly enhanced recall. Notably, models trained on hybrid datasets (mixing real and synthetic images) outperform both real and synthetic-only baselines, achieving a Dice score of 0.7244. These findings underscore the successful use of conditional diffusion models for generating anatomically accurate, AD-specific synthetic MRIs, and highlight their potential for enhancing training data availability, improving diagnostic accuracy, and promoting research reproducibility in neuroimaging studies.

2606.19215 2026-06-18 cs.CV 新提交 90%

GUMP-Net: An interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation

GUMP-Net: 一种用于多类盆腔分割的可解释模型-数据驱动智能算法

Liheng Wang, Yinghui Zhang, Licheng Zhang, Hailin Xu, Qiyong Cao, Chong Chen

发表机构 * State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences(中国科学院数学与系统科学研究院数学科学国家重点实验室) University of Chinese Academy of Sciences(中国科学院大学) Department of Orthopedics, The Fourth Medical Center of Chinese PLA General Hospital(中国人民解放军总医院第四医学中心骨科) National Clinical Research Center for Orthopedics, Sports Medicine and Rehabilitation(国家骨科、运动医学与康复临床研究中心) Department of Trauma and Orthopedics, People’s Hospital Peking University(北京大学人民医院创伤骨科) Department of Orthopedics and Traumatology, Beijing Jishuitan Hospital, Capital Medical University(首都医科大学附属北京积水潭医院骨科与创伤科)

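Topic match (Medical imaging): pelvic segmentation, a medical image analysis task.

AI summary: Proposes GUMP-Net, which combines an improved geodesic active contour model with deep neural networks for multi-class pelvic segmentation; it performs better when training data are scarce and offers an interpretable geometric perspective.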

Comments 26 pages, 8 figures, 3 tables

详情
AI中文摘要

盆腔分割是盆腔骨折精准智能诊疗及手术规划导航中最重要和基础的研究问题之一。通过将改进的测地线活动轮廓模型与深度神经网络相结合,我们提出了GUMP-Net,一种用于多类盆腔分割的可解释模型-数据驱动智能算法,其中设计了三个网络模块共同构成整体分割框架:用于自动水平集初始化的目标检测模块、用于学习解剖感知边缘检测函数的边缘检测器模块以及用于深度水平集演化的迭代模块。利用水平集表示和深度学习的优势,GUMP-Net在分割性能上比最先进的方法更准确、鲁棒和一致,尤其是在小训练数据情况下。在盆腔数据集上的大量实验证明了所提算法的合理性和有效性。扩展到踝关节数据集的进一步实验表明其对其他解剖结构具有更广泛的应用。所提算法不仅为复杂骨折复位提供了高效的分割方法,而且为理解深度学习分割提供了可解释的几何视角。

英文摘要

Pelvic segmentation is one of the most important and fundamental research problems in precise and intelligent diagnosis and treatment, as well as surgical planning and navigation, for pelvic fractures. By combining an improved geodesic active contour model with deep neural networks, we propose GUMP-Net, an interpretable model-data-driven intelligent algorithm for multi-class pelvic segmentation. Three network modules together constitute the overall segmentation framework: the object detection module for automatic level set initialization, the edge detector module for learning an anatomy-aware edge detector function, and the iteration module for deep level set evolution. Leveraging the advantages of level set representation and deep learning, GUMP-Net shows more accurate, robust and consistent segmentation performance than state-of-the-art methods, especially when training data are scarce. Extensive experiments on pelvic datasets demonstrate the rationality and effectiveness of the proposed algorithm. Further experiments on an ankle dataset indicate broader applicability to other anatomies. The proposed algorithm not only provides an efficient segmentation method for complex fracture reduction, but also gives an interpretable geometric perspective for understanding deep learning segmentation.
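
For context on the edge detector module, here is a minimal sketch of the hand-crafted edge-stopping function used in classical geodesic active contours, g = 1 / (1 + |∇I|²). This is the textbook form, not GUMP-Net's learned anatomy-aware function; the function name and the toy images are assumptions made for illustration.

```python
def edge_indicator(image):
    """Classical edge-stopping function g = 1 / (1 + |grad I|^2) on a 2D grid.

    Gradients use central differences inside the grid and one-sided
    differences at the borders.
    """
    rows, cols = len(image), len(image[0])
    g = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            # Divide by the actual sample spacing (1 at borders, 2 elsewhere).
            dy = (image[r1][c] - image[r0][c]) / max(r1 - r0, 1)
            dx = (image[r][c1] - image[r][c0]) / max(c1 - c0, 1)
            g[r][c] = 1.0 / (1.0 + dx * dx + dy * dy)
    return g


# Toy inputs (hypothetical): a flat image and a vertical step edge.
flat = [[5.0] * 4 for _ in range(3)]
step = [[0.0, 0.0, 10.0, 10.0] for _ in range(3)]
g_flat = edge_indicator(flat)
g_step = edge_indicator(step)
```

The level set evolves freely where g is close to 1 and slows where g approaches 0, which is why a contour tends to settle on strong edges. A learned edge function replaces this fixed formula with one fitted to anatomy.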

2606.18886 2026-06-18 cs.CV 新提交 90%

DINO-Med3D: Bridging Dimension and Domain Gaps in Volumetric Segmentation via Progressive Adaptation

DINO-Med3D:通过渐进式适应弥合体分割中的维度与领域差距

Haoyu Hu, Xiyao Ma, Shiqi Liu, Linsen Zhang, Xiaoliang Xie, Xiaohu Zhou, Zeng-Guang Hou

发表机构 * University of Chinese Academy of Sciences(中国科学院大学) Institute of Automation, Chinese Academy of Sciences(中国科学院自动化研究所)

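Topic match (Medical imaging): adapting DINOv3 to 3D medical segmentation.

AI summary: Proposes DINO-Med3D, a two-stage progressive framework that adapts DINOv3 to 3D medical segmentation through a multi-slice embedding module, 3D adapters, and a parallel detail recovery stream, outperforming existing methods on five datasets.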

Comments Accepted at MICCAI 2026. The camera-ready version and link will be made publicly available upon publication

详情
AI中文摘要

尽管DINOv3在自然图像中展现了显著的语义判别能力,但其直接应用于体医学分割受到固有的维度和领域差异的阻碍。为解决这些问题,我们提出DINO-Med3D,一个两阶段渐进框架,将预训练的DINOv3编码器重新用于3D医学任务。在第一阶段,我们通过引入融合伪3D上下文的多切片嵌入模块来弥合维度差距,同时采用分割代理任务将从自然场景学到的表示适应到医学领域。随后,我们通过在冻结的主干中添加轻量级3D适配器来增强体理解,以强制执行全局切片间连续性。最后,为补偿嵌入过程中固有的空间信息损失,我们设计了一个并行细节恢复流,以显式保留高频边界线索。在五个公共数据集上的大量实验表明,我们的方法成功地将DINOv3适应到医学领域,并显著优于最先进的基线方法。

英文摘要

Although DINOv3 has demonstrated remarkable semantic discrimination in natural imagery, its direct application to volumetric medical segmentation is hindered by inherent dimension and domain disparities. To resolve these issues, we propose DINO-Med3D, a two-stage progressive framework that repurposes the pre-trained DINOv3 encoder for 3D medical tasks. In the first stage, we mitigate the dimension gap by introducing a multi-slice embedding module that incorporates pseudo-3D context, while simultaneously employing a segmentation proxy task to adapt representations learned from natural scenes to the medical domain. Subsequently, we further enhance volumetric understanding by adding lightweight 3D adapters into the frozen backbone to enforce global inter-slice continuity. Finally, to compensate for the spatial information loss inherent in the embedding process, we design a parallel detail recovery stream to explicitly preserve high-frequency boundary cues. Extensive experiments on five public datasets demonstrate that our approach successfully adapts DINOv3 to the medical domain and significantly outperforms state-of-the-art baselines.

2606.18876 2026-06-18 cs.CV cs.LG 新提交 90%

Test-Time Adaptation in Optical Coherence Tomography Using Trajectory-Aligned Time-Independent Flow

光学相干断层扫描中基于轨迹对齐的时间无关流的测试时自适应

Veit Hucke, Thomas Pinetz, Gregor Reiter, Ursula Schmidt-Erfurth, Hrvoje Bogunović

发表机构 * Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria(人工智能研究所、医学数据科学中心、维也纳医学大学,奥地利) Comprehensive Center for Artificial Intelligence in Medicine, Medical University of Vienna, Austria(医学人工智能综合中心、维也纳医学大学,奥地利) Department of Ophthalmology and Optometry, Medical University of Vienna, Austria(眼科与视光学部、维也纳医学大学,奥地利) Laboratory for Ophthalmic Image Analysis, Medical University of Vienna, Austria(眼科图像分析实验室、维也纳医学大学,奥地利)

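Topic match (Medical imaging): OCT image quality adaptation for AMD segmentation, a core medical imaging topic.

AI summary: Proposes a flow-matching-based test-time adaptation method that uses histogram matching and removes time conditioning to generate high-quality surrogate images, reaching state-of-the-art performance in AMD segmentation.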

Comments Accepted in MICCAI

详情
AI中文摘要

光学相干断层扫描(OCT)在眼科中至关重要,但图像质量不一致,尤其是在低成本设备中,阻碍了自动化分析。为了解决这个问题,我们引入了一种基于流匹配的测试时自适应方法,从噪声输入生成高质量替代图像。通常,测试数据和训练数据之间的域差距会导致去噪过程中像素分布不匹配。我们通过将测试图像的直方图与合成参考轨迹匹配来克服这一问题,成功地将输入与预期分布对齐。此外,我们移除了网络的时间条件,以考虑真实世界噪声分布的轻微偏差。我们的方法在分割年龄相关性黄斑变性(AMD)两个阶段的关键生物标志物方面达到了最先进的性能。代码地址:https://github.com/Veit21/tta-flow

英文摘要

Optical coherence tomography (OCT) is essential in ophthalmology, but inconsistent image quality, especially in low-cost devices, hinders automated analysis. To address this, we introduce a flow-matching-based test-time adaptation method that generates high-quality surrogate images from noisy inputs. Typically, domain gaps between test and training data cause pixel distribution mismatches during the denoising process. We overcome this by matching the test image's histogram to synthetic reference trajectories, successfully aligning the input with expected distributions. Additionally, we remove the network's time conditioning to account for slight deviations in real-world noise distributions. Our approach achieves state-of-the-art performance in segmenting critical biomarkers for two stages of Age-related Macular Degeneration (AMD). Code is available at https://github.com/Veit21/tta-flow.
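
The histogram-matching step mentioned in the abstract can be illustrated with a generic quantile-mapping sketch. It shows only the basic operation on flat intensity lists; how the authors build their synthetic reference trajectories is not described here, so the reference values and the function name are assumptions.

```python
def match_histogram(source, reference):
    """Map each source intensity to the reference value at the same empirical quantile.

    Simplification: tied source values are ranked in input order, so they may
    receive different output values.
    """
    ref_sorted = sorted(reference)
    order = sorted(range(len(source)), key=lambda i: source[i])
    n, m = len(source), len(ref_sorted)
    matched = [0.0] * n
    for rank, idx in enumerate(order):
        # Quantile of this pixel within the source image.
        q = rank / (n - 1) if n > 1 else 0.0
        # Nearest sample of the sorted reference at that quantile.
        matched[idx] = ref_sorted[round(q * (m - 1))]
    return matched


# Toy example (hypothetical values): three pixels mapped onto a [0, 1] reference.
source = [10.0, 30.0, 20.0]
reference = [0.0, 0.5, 1.0]
matched = match_histogram(source, reference)
```

After matching, the output keeps the rank order of the input pixels but takes on the intensity distribution of the reference, which is the alignment effect the abstract relies on.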

2606.18872 2026-06-18 cs.CV 新提交 90%

Bridging Single Distortion Artifacts and Multifactorial Clinical Quality: Few-shot Biparametric MRI Quality Assessment via Distortion-trained Prototypical Networks

桥接单一失真伪影与多因素临床质量:基于失真训练的原型网络的少样本双参数MRI质量评估

Yuheng Tang, Alexander Ng, Wen Yan, Natasha Thorley, Pawel Rajwa, Yipei Wang, Aqua Asif, Clare Allen, Louise Dickinson, Francesco Giganti, Shonit Punwani, Daniel Alexander, Veeru Kasivisvanathan, Yipeng Hu

发表机构 * UCL Hawkes Institute(UCL Hawkes研究所) Department of Medical Physics and Biomedical Engineering(医学物理与生物医学工程系) University College London(伦敦大学学院) Division of Surgery and Interventional Science(外科与介入科学分会) Centre for Medical Imaging(医学成像中心) British Urology Researchers in Surgical Training (BURST)(英国泌尿外科手术培训研究人员(BURST)) Department of Radiology(放射科) University College London Hospitals NHS Foundation Trust(伦敦大学学院医院国家健康服务信托基金) Centre of Medical Imaging, Division of Medicine(医学成像中心,医学分会) Centre for Medical Image Computing(医学图像计算中心) Department of Computer Science(计算机科学系) Department of Urology(泌尿科)

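Topic match (Medical imaging): prostate MRI quality assessment with a few-shot prototypical network.

AI summary: Proposes a few-shot biparametric prototypical network that is meta-trained on distortion labels and uses feature fusion and domain alignment to predict PI-QUAL clinical quality scores from only five samples, addressing the scarcity of clinical data.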

详情
AI中文摘要

临床前列腺多参数MRI高度依赖高质量扩散加权成像(DWI),但DWI读图常因几何失真(通常由直肠气体引起)而受损。通过PI-QUAL评分系统评估质量是新兴的临床标准,但该方法主观、耗时,且存在类别不平衡问题,其中低质量病例多样且相对稀少。以PRIME临床试验为例,6%的图像PI-QUAL评分低于4,87%的DWI问题源于失真,许多其他临床质量问题代表性不足。为解决这种标注临床数据的双重稀缺性,我们提出了一种用于自动图像质量评估(IQA)的少样本双参数原型网络。我们的框架利用双分支3D ResNet融合T2加权和DWI特征,提供解剖背景以区分真实形态与失真。为处理现实异质性,我们引入特征级线性调制(FiLM)和梯度反转层(GRL),以对齐基于不同b值的特征分布,同时抑制采集相关偏差。我们证明,仅基于相对客观、易于获取的失真标签进行元训练的模型,能够仅使用五个代表性样本有效适应预测复杂的多因素临床质量评分(如PI-QUAL)。在两个数据集上的实验结果表明,我们的方法在此具有挑战性的IQA任务中显著优于少样本学习基线,为临床工作流程中标准化前列腺MRI质量控制提供了实际可行且数据高效的解决方案。

英文摘要

Clinical prostate multi-parametric MRI relies heavily on high-quality diffusion-weighted imaging (DWI), yet reading DWI is frequently compromised by geometric distortion, often caused by rectal air. Assessing quality via the PI-QUAL scoring system is an emerging clinical standard, but it is subjective, time-consuming and suffers from a class imbalance where low-quality cases are diverse and relatively scarce. Using the PRIME clinical trial as an example, $6\%$ of images have PI-QUAL scores lower than 4 and $87\%$ of DWI issues are due to distortion, while many of the other clinical quality issues are under-represented. To address this common dual-scarcity of annotated clinical data, we propose a few-shot biparametric prototypical network for automated image quality assessment (IQA). Our framework utilizes a dual-branch 3D ResNet to fuse T2-weighted and DWI features, providing anatomical context to distinguish true morphology from distortion. To handle real-world heterogeneity, we introduce feature-wise linear modulation (FiLM) and a gradient reversal layer (GRL) to align feature distributions conditioned on varying b-values while suppressing acquisition-related biases. We demonstrate that a model meta-trained solely on comparatively objective, readily obtainable distortion labels can effectively adapt to predicting complex, multi-factorial clinical quality scores such as PI-QUAL using only five representative samples. Experimental results on two datasets show that our method significantly outperforms few-shot learning baselines for this challenging IQA task, offering a practically feasible and data-efficient solution for standardizing prostate MRI quality control in clinical workflows.
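
The few-shot step of a prototypical network can be sketched as follows: each class prototype is the mean of its support embeddings, and a query takes the label of the nearest prototype. This is the generic formulation only. The 2-D embeddings, the class names and the five-per-class support set are hypothetical, and the dual-branch encoder, FiLM and GRL components are not modelled.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def build_prototypes(support):
    """support maps a class label to its list of support embeddings."""
    return {label: mean_vector(vectors) for label, vectors in support.items()}


def classify(query, prototypes):
    """Return the label whose prototype is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


# Hypothetical embeddings: five support samples per quality class.
support = {
    "acceptable": [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2], [1.1, 0.1], [0.9, 0.0]],
    "inadequate": [[0.1, 0.9], [0.0, 1.0], [0.2, 0.8], [0.1, 1.1], [0.0, 0.9]],
}
prototypes = build_prototypes(support)
label = classify([0.7, 0.3], prototypes)
```

Because only the prototypes change when new classes arrive, an encoder trained on distortion labels can in principle be reused for a different label set such as PI-QUAL by supplying a handful of new support samples.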

2606.18869 2026-06-18 cs.CV 新提交 90%

Learning to Distort: Weakly-Supervised Image Quality Transfer for Prostate DWI Correction

学习扭曲:用于前列腺DWI校正的弱监督图像质量迁移

YuCheng Tang, Wen Yan, Alexander Ng, Natasha Thorley, Pawel Rajwa, Yipei Wang, Aqua Asif, Clare Allen, Louise Dickinson, Francesco Giganti, David Atkinson, Shonit Punwani, Daniel Alexander, Shaheer Ullah Saeed, Veeru Kasivisvanathan, Yipeng Hu

发表机构 * UCL Hawkes Institute(UCL Hawkes研究所) Department of Medical Physics and Biomedical Engineering(医学物理与生物医学工程系) University College London(伦敦大学学院) Division of Surgery and Interventional Science(外科与介入科学分会) Centre for Medical Imaging(医学成像中心) British Urology Researchers in Surgical Training (BURST)(英国泌尿外科手术培训研究人员(BURST)) Department of Radiology(放射科) University College London Hospitals NHS Foundation Trust(伦敦大学学院医院国家健康服务信托基金) Centre for Medical Image Computing(医学图像计算中心) Department of Computer Science(计算机科学系) Department of Urology(泌尿科)

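Topic match (Medical imaging): prostate DWI distortion correction via weakly-supervised image quality transfer.

AI summary: Proposes a weakly-supervised image quality transfer framework that uses image quality assessment signals to learn to generate realistic distortions from undistorted images and then trains a correction model; it is compared against existing unpaired methods on PI-RADS and Gleason score classification.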

详情
AI中文摘要

单次激发平面回波前列腺弥散加权成像(DWI)常因几何失真而复杂化,影响从这些图像中获得可靠诊断的能力。开发自动化校正方法面临缺乏配对的失真和未失真临床扫描的挑战。本文首先提出一种新颖的弱监督图像质量迁移(IQT)框架,从无失真图像到失真图像,利用图像质量评估(IQA)信号监督迁移过程。与传统方法需要昂贵的体素级配对数据或采用无配对算法不同,我们的方法利用图像级质量标签(此处为失真与无失真)在预训练特征空间中建立潜在质量原型。认识到模拟真实失真比直接无配对校正更可靠,我们描述了一种弱监督原型流匹配算法,显式正则化生成轨迹朝向失真原型,产生模拟临床退化的真实磁敏感伪影。通过合成这些真实配对,我们能够训练第二个IQT模型进行正向失真校正。实验结果表明,我们生成的图像成功模拟了真实伪影的诊断干扰,从而产生更强大的失真校正IQT模型。除定性比较外,我们还通过评估临床下游任务性能(PI-RADS和Gleason评分分类),使用分布内和外部数据集,将我们的方法与现有无配对方法(如CycleGAN、UNIT-DDPM和OT-FM)作为正向或反向替代方案进行详尽的定量评估。

英文摘要

Single-shot echo-planar prostate diffusion-weighted imaging (DWI) is frequently complicated by geometric distortions, which impact the ability to derive reliable diagnoses from such images. Developing automated correction methods is challenged by the absence of paired distorted and undistorted clinical scans. In this paper, we first propose a novel weakly-supervised image quality transfer (IQT) framework from undistorted to distorted images that utilizes image quality assessment (IQA) signals to supervise the transfer process. Unlike traditional methods that require expensive, voxel-wise paired data or resort to developing unpaired algorithms, our approach utilizes image-level quality labels (here, distorted vs. undistorted) to establish latent quality prototypes within a pre-trained feature space. Recognizing that simulating realistic distortions is more reliable than direct unpaired correction, we describe a weakly-supervised prototype flow matching algorithm to explicitly regularize generative trajectories towards distorted prototypes, producing realistic susceptibility artifacts that mimic clinical degradations. By synthesizing these realistic pairs, we enable a second IQT model to be trained in the forward direction for distortion correction. Experimental results demonstrate that our generated images successfully mimic the diagnostic interference of real-world artifacts, which leads to more capable distortion correction IQT models. In addition to qualitative comparisons, we also conduct exhaustive quantitative evaluations that compare our approach with existing unpaired approaches (e.g., CycleGAN, UNIT-DDPM, and OT-FM) - as either forward or reverse alternatives - by assessing clinical downstream task performance in PI-RADS and Gleason score classification, using both in-distribution and external data sets.

2606.18860 2026-06-18 cs.CV cs.LG 新提交 90%

Quantification of Uncertainty with Adversarial Models in Medical Image Segmentation

医学图像分割中对抗模型的不确定性量化

Hana Jebril, Thomas Pinetz, Günter Klambauer, Hrvoje Bogunović

发表机构 * Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria(人工智能研究所、医学数据科学中心、维也纳医学大学,奥地利) Comprehensive Center for AI in Medicine, Medical University of Vienna, Austria(医学人工智能综合中心、维也纳医学大学,奥地利) ELLIS Unit Linz, LIT AI Lab and Institute for Machine Learning, Johannes Kepler University Linz, Austria(林茨ELLIS单位、LIT人工智能实验室和机器学习研究所、林茨约翰内斯·开普勒大学,奥地利) Institute for Machine Learning, Johannes Kepler University Linz, Austria(机器学习研究所、林茨约翰内斯·开普勒大学,奥地利) Clinical Research Center for Medical AI, Johannes Kepler University Linz, Austria(医学人工智能临床研究中心、林茨约翰内斯·开普勒大学,奥地利)

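Topic match (Medical imaging): uncertainty quantification for medical image segmentation with a post-hoc framework.

AI summary: Proposes QUAM-SM, a post-hoc framework that uses targeted adversarial search to identify fragile pixels, quantifies uncertainty, and separates epistemic from aleatoric uncertainty, outperforming existing methods on public datasets.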

Comments Accepted at MICCAI 2026

详情
AI中文摘要

可靠的像素级不确定性量化具有通过实现高保真纵向监测和区分真实病理变化与伪影来改变临床工作流程的潜力。理想情况下,这些模型提供关键治疗计划和手术干预所需的稳定性。然而,标准深度学习模型常常遭受校准不良,产生过度自信的预测,掩盖了微妙病理边界处的潜在脆弱性。为了解决这个问题,我们提出了QUAM-SM,一种使用针对性对抗搜索来识别“对抗脆弱”像素的后处理框架。通过主动寻找暴露预测不稳定性的扰动,我们的方法突出了决策最容易被翻转的区域。重要的是,该框架将认知不确定性与偶然不确定性分离。在两个具有多个专家标注的公开数据集上的实验表明,QUAM-SM在可靠性和边界敏感性方面优于标准和最新的不确定性估计方法。代码可在以下网址获取:https://github.com/HanaJebril/quam_sm

英文摘要

Reliable pixel-level uncertainty quantification holds the potential to transform clinical workflows by enabling high-fidelity longitudinal monitoring and distinguishing true pathological changes from artifacts. Ideally, these models provide the stability required for critical treatment planning and surgical intervention. However, standard deep learning models often suffer from miscalibration, yielding overconfident predictions that mask underlying vulnerabilities at subtle pathological boundaries. To address this, we propose QUAM-SM, a post-hoc framework using targeted adversarial search to identify "adversarially fragile" pixels. By actively seeking perturbations that expose predictive instability, our method highlights regions where decisions are most vulnerable to being flipped. Importantly, the framework disentangles epistemic uncertainty from aleatoric uncertainty. Experiments on two public datasets with multiple expert annotations demonstrate that QUAM-SM outperforms both standard and recent uncertainty estimation approaches in terms of reliability and boundary sensitivity. Code is available at https://github.com/HanaJebril/quam_sm
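
To make the epistemic/aleatoric split concrete, the sketch below uses the standard entropy decomposition over several stochastic predictions for one pixel: total uncertainty is the entropy of the mean prediction, aleatoric uncertainty is the mean entropy of the individual predictions, and their difference (the mutual information) is taken as epistemic. This is a common textbook decomposition offered as an assumption; it is not QUAM-SM's adversarial search procedure, and the sample predictions are made up.

```python
import math


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(samples):
    """samples: one class-probability vector per model or stochastic pass."""
    n_classes = len(samples[0])
    mean_probs = [sum(s[k] for s in samples) / len(samples) for k in range(n_classes)]
    total = entropy(mean_probs)                                   # predictive entropy
    aleatoric = sum(entropy(s) for s in samples) / len(samples)   # expected entropy
    epistemic = total - aleatoric                                 # mutual information
    return total, aleatoric, epistemic


# Hypothetical pixels: models that disagree confidently vs. models that agree on 50/50.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
ambiguous = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
```

Both pixels have the same total entropy, but the first is entirely epistemic (the models disagree) and the second entirely aleatoric (every model sees the same ambiguity). That distinction is what a disentangled estimate is meant to expose.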

2606.18825 2026-06-18 cs.CV 新提交 90%

DreamReg: Belief-Driven World Model for 2D-3D Ultrasound Registration

DreamReg:基于信念驱动的世界模型用于2D-3D超声配准

Luoyao Kang, Yuelin Zhang, Jiwei Shan, Haifan Gong, Qingpeng Ding, Shing Shin Cheng

发表机构 * T Stone Robotics Institute, The Chinese University of Hong Kong(香港中文大学T Stone机器人研究所) Multi-scale Medical Robotics Center(多尺度医疗机器人中心) Perelman School of Medicine, University of Pennsylvania(宾夕法尼亚大学佩雷尔曼医学院)

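Topic match (Medical imaging): 2D-3D ultrasound registration for surgical navigation.

AI summary: Proposes DreamReg, which models 2D-3D ultrasound registration as belief updating, simulates probe motion with a world model, and integrates the imagined outcomes, achieving improved robustness and competitive accuracy for real-time registration on the CAMUS and u-RegPro datasets.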

详情
AI中文摘要

超声(US)广泛应用于手术导航,但由于部分可观测性、散斑噪声以及依赖于动作的US采集,术中2D切片与术前3D体积之间的实时配准仍然具有挑战性。现有方法是一次性的或短视的,难以随时间收集证据或捕捉外科医生如何根据屏幕反馈调整探头运动。我们提出DreamReg,一个基于信念驱动的世界模型框架,将2D-3D配准形式化为对刚性变换的信念更新。DreamReg维护一个潜在信念状态,总结过去的观测和位姿信息,并在新切片到达时通过学习到的动态不断细化变换。在训练期间,DreamReg暴露于模拟临床扫描行为的探头运动轨迹,并通过将位姿细化条件于当前US观测来学习更新其信念。在推理期间,DreamReg通过内部想象来细化配准:它展开学习到的世界模型以模拟候选探头运动及其预测的观测,并整合这些想象的结果以收敛到准确的刚性变换。在CAMUS和u-RegPro数据集上的实验表明,与最先进方法相比,DreamReg在实时引导中具有改进的鲁棒性和有竞争力的配准精度。

英文摘要

Ultrasound (US) is widely used for surgical navigation, yet real-time registration between intraoperative 2D slices and preoperative 3D volumes remains challenging due to partial observability, speckle noise, and action-dependent US acquisition. Existing methods are one-shot or short-horizon, making it hard for them to gather evidence over time or capture how surgeons adjust probe motion based on on-screen feedback. We propose DreamReg, a belief-driven world-model framework that formulates 2D-3D registration as belief updating over rigid transformations. DreamReg maintains a latent belief state that summarizes past observations and pose information, and continuously refines the transformation through learned dynamics as new slices arrive. During training, DreamReg is exposed to probe-motion trajectories that mimic clinical scanning behavior and learns to update its belief by conditioning pose refinement on the current US observation. During inference, DreamReg refines registration via internal imagination: it rolls out the learned world model to simulate candidate probe motions and their predicted observations, and integrates these imagined outcomes to converge to an accurate rigid transformation. Experiments on CAMUS and u-RegPro datasets demonstrate improved robustness and competitive registration accuracy for real-time guidance compared with state-of-the-art methods.

2606.18753 2026-06-18 cs.CV 新提交 90%

SMART: A Flexible, Interpretable, and Scalable Spatio-temporal Brain Atlas from High-Resolution Imaging Data

SMART:一种灵活、可解释且可扩展的高分辨率成像数据时空脑图谱

John Kalkhof, Boris Gutman, Emile d'Angremont, Daniel C. Alexander, Marco Lorenzi

发表机构 * Illinois Institute of Technology(伊利诺伊理工学院) Amsterdam University Medical Center(阿姆斯特丹大学医学中心) University College London(伦敦大学学院)

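Topic match (Medical imaging): spatio-temporal brain atlas for modeling high-resolution 3D medical images.

AI summary: Proposes the SMART framework, which learns a continuous disease-time atlas by decoupling global disease dynamics from patient-specific anatomical manifestation, enabling flexible, interpretable, and scalable modeling of spatio-temporal change in high-resolution 3D medical images.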

详情
AI中文摘要

我们介绍了SMART,一个从纵向高分辨率3D医学图像中学习灵活、可解释且可扩展的时空脑图谱的框架。现有的时空图谱构建方法依赖于黑盒生成模型,缺乏灵活性、限制可解释性,并且难以扩展到高维数据。SMART通过学习一个连续的疾病时间图谱来解决这些挑战,该图谱将全局群体级疾病动态与患者特定的解剖表现解耦。在解剖学启发先验的指导下,SMART通过区域特异性微分方程,沿着共享的疾病时间线建模可解释的全局区域进展轨迹。全局轨迹进一步通过由灵活且可扩展的多尺度神经细胞自动机参数化的密集微分同胚位移,个性化到个体解剖结构。在阿尔茨海默病的五个纵向MRI数据集(ADNI-1/GO/2、OASIS-3、AIBL;>1300名受试者)上评估,SMART产生了解剖学上有意义的疾病进展预测,并实现了最先进的预测准确性和比对抗性和扩散基线更好的时间一致性。我们的方法为高维医学图像时间序列中时空变化的灵活、可解释和可扩展建模建立了一个新范式。

英文摘要

We introduce SMART, a framework for learning a flexible, interpretable, and scalable spatio-temporal brain atlas from longitudinal high-resolution 3D medical images. Existing approaches to spatio-temporal atlas construction rely on black-box generative models that lack flexibility, limit interpretability, and struggle to scale to high-dimensional data. SMART addresses these challenges by learning a continuous disease-time atlas that decouples global group-wise disease dynamics from their patient-specific anatomical manifestation. Guided by anatomically inspired priors, SMART models interpretable global trajectories of regional progression along a shared disease timeline through region-specific differential equations. Global trajectories are further personalized to individual anatomies via dense diffeomorphic displacements parameterized by a flexible and scalable multi-scale Neural Cellular Automata. Evaluated on five longitudinal MRI datasets in Alzheimer's disease (ADNI-1/GO/2, OASIS-3, AIBL; > 1,300 subjects), SMART produces anatomically meaningful predictions of disease progression and achieves state-of-the-art forecasting accuracy and improved temporal consistency over adversarial and diffusion baselines. Our approach establishes a new paradigm for flexible, interpretable, and scalable modeling of spatio-temporal change in high-dimensional medical image time-series.
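
As a rough illustration of region-specific differential equations along a shared disease timeline, the sketch below integrates one logistic equation per region with forward Euler. The logistic form, the rates, the starting values and the region names are all assumptions chosen for illustration; the abstract does not state which equations SMART uses.

```python
def simulate_region(rate, x0, t_end=20.0, dt=0.01):
    """Forward-Euler integration of dx/dt = rate * x * (1 - x) on a shared timeline.

    x is a normalised abnormality level in (0, 1); `rate` is the assumed
    region-specific progression speed.
    """
    steps = int(round(t_end / dt))
    trajectory = [x0]
    x = x0
    for _ in range(steps):
        x = x + dt * rate * x * (1.0 - x)
        trajectory.append(x)
    return trajectory


# Hypothetical regions that share the timeline but progress at different speeds.
regions = {"region_a": 0.6, "region_b": 0.3}
trajectories = {name: simulate_region(rate, x0=0.05) for name, rate in regions.items()}
```

Each region keeps a small, readable parameter set (here a single rate), which is one way such a model can stay interpretable while still producing region-specific trajectories.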

2606.18723 2026-06-18 cs.CV cs.LG 新提交 90%

Clinically Aligned Geometry Constraints for Robust IVUS Vessel Boundary Segmentation

临床对齐的几何约束用于鲁棒的IVUS血管边界分割

Yunshu Chen, Litao Yang, Giuseppe Di Giovanni, Jordan Tan, Deval Mehta, Andrew Lin, Derek Chew, Masasi Fujino, Julie Butters, Stephen Nicholls, Zongyuan Ge, Kyung Hoon Cho

发表机构 * AIM For Health Lab, Monash University(莫纳什大学AIM健康实验室) Department of Data Science and Artificial Intelligence, Faculty of IT, Monash University(莫纳什大学信息技术学院数据科学与人工智能系) Monash University Victorian Heart Institute(莫纳什大学维多利亚心脏研究所) School of Computing Technologies, RMIT University(皇家墨尔本理工大学计算技术学院) National Cerebral and Cardiovascular Center(国立循环器病研究中心) Department of Cardiology, Chonnam National University Hospital and Medical School(全南大学医院和医学院心脏病学系)

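Topic match (Medical imaging): IVUS vessel boundary segmentation with a geometry-constrained network.

AI summary: Proposes the GeoCat network, which uses dual encoders and a differentiable geometry consistency loss to reduce boundary drift and topology errors in IVUS segmentation and to improve the accuracy of clinical geometric measurements.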

Comments MICCAI2026 Accepted

详情
AI中文摘要

血管内超声(IVUS)管腔和外弹性膜(EEM)分割对于定量评估冠状动脉斑块负荷至关重要。管腔或EEM勾画的误差会直接传播到斑块面积、斑块负荷和几何测量中。然而,优先考虑重叠分数的标准方法常常遭受边界漂移和拓扑错误,导致临床测量不准确。我们提出GeoCat,一个几何一致性网络,使用双笛卡尔-极坐标编码器,结合跨域注意力和时间融合,处理5帧IVUS片段。可微的几何一致性损失直接监督临床相关描述符,包括直径、方向和横截面积。该模型在来自146名患者的12,242张标注帧上训练,这些帧使用两种商用IVUS系统采集。我们使用分割准确性和斑块相关临床指标评估性能,包括Dice/IoU、边界测量(95HD(mm)、ASSD)、拓扑违规率和临床几何误差(dmax/dmin、角度和面积)。在我们的数据集上,GeoCat实现了0.93的Dice,将95HD降低到0.14 mm,并将拓扑违规率降低到1.0%。重要的是,它显著提高了几何保真度,产生0.13-0.16 mm的直径误差和约8度的角度误差,支持可靠的斑块负荷量化。

英文摘要

Intravascular ultrasound (IVUS) lumen and external elastic membrane (EEM) segmentation is important for quantitative coronary plaque burden assessment. Errors in lumen or EEM delineation directly propagate to plaque area, plaque burden and geometric measurements. However, standard methods prioritising overlap scores often suffer from boundary drift and topology errors, leading to inaccurate clinical measurements. We present GeoCat, a geometry-consistent network that processes 5-frame IVUS clips using dual Cartesian-polar encoders with cross-domain attention and temporal fusion. A differentiable geometry consistency loss directly supervises clinically relevant descriptors including diameters, orientations, and cross-sectional areas. The model is trained on 12,242 annotated frames from 146 patients acquired with two commercial IVUS systems. We evaluate performance using both segmentation accuracy and plaque-relevant clinical metrics, including Dice/IoU, boundary measures (95HD (mm), ASSD), topology violation rate, and clinical geometry errors (dmax/dmin, angles, and areas). On our dataset, GeoCat achieves a Dice of 0.93, reduces 95HD to 0.14 mm, and lowers topology violations to 1.0%. Importantly, it significantly improves geometric fidelity, yielding diameter errors of 0.13-0.16 mm and angular errors of ~8 degrees, supporting reliable plaque burden quantification.
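
The way contour errors propagate into clinical measurements can be seen in a small sketch that derives plaque area and plaque burden from lumen and EEM contours. It assumes the commonly used definitions (plaque area = EEM area minus lumen area, plaque burden = plaque area / EEM area × 100) and polygon contours in millimetres; the function names and the square contours are illustrative only.

```python
def polygon_area(points):
    """Shoelace formula for a closed contour given as (x, y) vertices in mm."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def plaque_burden(lumen, eem):
    """Return (plaque area in mm^2, plaque burden in percent)."""
    lumen_area = polygon_area(lumen)
    eem_area = polygon_area(eem)
    plaque_area = eem_area - lumen_area
    return plaque_area, 100.0 * plaque_area / eem_area


# Toy contours (hypothetical): a 2 mm square lumen inside a 4 mm square EEM.
lumen = [(1, 1), (3, 1), (3, 3), (1, 3)]
eem = [(0, 0), (4, 0), (4, 4), (0, 4)]
plaque_area, burden = plaque_burden(lumen, eem)
```

Because both areas enter the ratio, a small boundary drift on either contour shifts the burden value directly, which is the motivation for supervising geometric descriptors rather than overlap alone.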

2606.15604 2026-06-18 eess.IV cs.CV 新提交 90%

Parameter-Efficient Adaptation of SAM 3 for Automated ITV Generation from 4DCT Images

基于参数高效微调SAM 3从4DCT图像自动生成内靶区

Changwoo Song

发表机构 * Oncosoft Inc.(Oncosoft公司) Department of Computer Science & Engineering, Chungnam National University(忠南大学计算机科学与工程系)

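Topic match (Medical imaging): 4DCT image segmentation for internal target volume generation.

AI summary: Proposes a lightweight framework that fine-tunes SAM 3 parameter-efficiently via LoRA and combines hard negative mining with phase-coherent filtering, generating internal target volumes automatically from only seven annotated volumes, with a median Dice of 0.968 on pulmonary structures.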

Comments 10 pages, 4 figures, 2 tables

详情
AI中文摘要

四维计算机断层扫描(4DCT)捕获了胸部解剖结构的完整呼吸周期,然而当前的内靶区勾画流程孤立处理每个相位,丢弃了时间相干性,使轮廓易受相位特定伪影影响。我们提出一个轻量框架,通过低秩适应(LoRA)对Segment Anything Model 3(SAM 3)进行参数高效微调,仅使用七个标注的3D CT体数据,将其文本提示分割与医学领域对齐。此外,该框架结合了硬负样本挖掘策略,以改善低对比度胸部区域的边界判别。在推理时,通过相位相干时间滤波和空间连通性分析细化逐相位预测。由于呼吸运动是连续且周期性的,真实解剖结构出现在连续的相位块中,而瞬态伪影零星出现,因此被有效抑制。在肺部和心脏结构上的实验分别产生中位Dice分数0.968和0.910,95百分位Hausdorff距离分别为0.998 mm和2.931 mm。所提框架有效消除了未适应SAM 3零样本推理中固有的严重假阳性预测。仅用七个标注体数据,框架保留了超过95%的全数据准确率,且整个流水线可在单个消费级GPU上训练,展示了自适应放疗中可扩展、数据高效的解决方案。

英文摘要

Four-dimensional computed tomography (4DCT) captures the full respiratory cycle of thoracic anatomy, yet current Internal Target Volume contouring workflows process each phase in isolation, discarding temporal coherence and leaving contours vulnerable to phase-specific artifacts. We present a lightweight framework that applies parameter-efficient fine-tuning to the Segment Anything Model 3 (SAM 3) via low-rank adaptation (LoRA) to align its text-prompted segmentation with the medical domain using only seven annotated 3D CT volumes. Furthermore, the framework incorporates a hard negative mining strategy to improve boundary discrimination in low-contrast thoracic regions. At inference, phase-wise predictions are refined through phase-coherent temporal filtering and spatial connectivity analysis. Since respiratory motion is continuous and periodic, genuine anatomy appears in contiguous blocks of phases, whereas transient artifacts appear sporadically and are thus effectively suppressed. Experiments on pulmonary and cardiac structures yield median Dice scores of 0.968 and 0.910 with 95th-percentile Hausdorff distances of 0.998 mm and 2.931 mm, respectively. The proposed framework effectively eliminates the severe false-positive predictions inherent in the zero-shot inference of the unadapted SAM 3. With only seven annotated volumes, the framework retains over 95% of full-data accuracy, and the entire pipeline is trainable on a single consumer-grade GPU, demonstrating a scalable, data-efficient solution for adaptive radiotherapy.
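
The phase-coherence idea, that genuine anatomy appears in contiguous blocks of phases while artifacts appear sporadically, can be sketched per voxel as a run-length filter over the cyclic phase sequence. The minimum run length, the function names and the ten-phase examples are assumptions; the paper's actual filter may differ. Taking the union of the surviving detections is the usual way an internal target volume is assembled from phase-wise contours.

```python
def filter_phase_runs(present, min_run=3):
    """Keep detections lying in a circular run of at least `min_run` consecutive phases.

    `present` holds one boolean per respiratory phase for a single voxel.
    """
    n = len(present)
    if all(present):
        return [True] * n
    kept = [False] * n
    # Start just after a gap so that runs wrapping around the cycle stay whole.
    start = present.index(False)
    run = []
    for step in range(1, n + 1):
        idx = (start + step) % n
        if present[idx]:
            run.append(idx)
        else:
            if len(run) >= min_run:
                for j in run:
                    kept[j] = True
            run = []
    return kept


def in_itv(present, min_run=3):
    """A voxel joins the ITV if any filtered phase still contains it (union over phases)."""
    return any(filter_phase_runs(present, min_run))


# Hypothetical 10-phase voxels: a run that wraps around the cycle, and an isolated hit.
anatomy = [True, True, False, False, False, False, False, False, False, True]
artifact = [False, False, False, False, True, False, False, False, False, False]
kept_anatomy = filter_phase_runs(anatomy)
```

Treating the phase axis as circular matters because respiration is periodic: a structure visible in the last and first phases forms one block, not two short ones.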

2606.00491 2026-06-18 cs.CV cs.AI 版本更新 90%

Pre-Deployment Robustness Stress Testing for CT Segmentation Systems Using Clinically Motivated Multi-Corruption Augmentation

CT分割系统的部署前鲁棒性压力测试:使用临床驱动的多损坏增强

CholMin Kanga, Jonghyun Chung, Amanpreet Kaur, Nagesh Gulkotwar, Aarthi Sivasankaran

发表机构 * Seoul National University(首尔国立大学) Google Inc.(谷歌公司)

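Topic match (Medical imaging): robustness stress testing of CT segmentation systems with medical image augmentation.

AI summary: Proposes the RAMP framework, which uses multi-corruption augmentation to improve the robustness of CT segmentation models under heterogeneous clinical imaging conditions, substantially narrowing the performance gap between clean and corrupted images.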

详情
AI中文摘要

基于深度学习的CT分割系统在干净基准图像上通常能达到高精度,但在噪声、分辨率损失、对比度变化、强度偏移和伪影等异质临床成像条件下,其性能可能会下降。这种不稳定性可能限制其在真实医疗成像工作流程中的可靠部署。 我们提出鲁棒性增强多损坏流水线(RAMP),这是一个面向鲁棒性的CT分割增强框架。RAMP结合了解剖约束的空间扰动、CT强度变换和随机多损坏组合,使模型在训练过程中暴露于临床可行的图像退化。 在两个CT分割评估设置中,RAMP实现了最强的损坏图像性能和最小的干净到损坏鲁棒性差距。在五器官噪声评估基准中,与nnU-Net基线相比,RAMP将平均损坏Dice从0.610提高到0.753,并将鲁棒性差距从0.264降低到0.064。在Abdomen1K中,RAMP将平均损坏Dice从0.633提高到0.789,并将鲁棒性差距从0.290降低到0.070。尽管RAMP未达到最高的干净图像Dice,但它显著减轻了严重图像退化下的最坏情况分割崩溃。 这些结果表明,多损坏增强可以作为提高CT分割系统在异质临床环境中可靠性的实用部署前策略。

英文摘要

Deep learning-based CT segmentation systems often achieve high accuracy on clean benchmark images, but their performance may degrade under heterogeneous clinical imaging conditions such as noise, resolution loss, contrast variation, intensity shift, and artifacts. This instability can limit reliable deployment in real-world medical imaging workflows. We propose Robustness via Augmented Multi-corruption Pipeline (RAMP), a robustness-oriented augmentation framework for CT segmentation. RAMP combines anatomically constrained spatial perturbations, CT intensity transformations, and stochastic multi-corruption composition to expose models to clinically plausible image degradation during training. Across two CT segmentation evaluation settings, RAMP achieved the strongest corrupted-image performance and the smallest clean-to-corrupted robustness gap. In the five-organ noisy evaluation benchmark, RAMP improved mean corrupted Dice from 0.610 to 0.753 and reduced the robustness gap from 0.264 to 0.064 compared with the nnU-Net baseline. In Abdomen1K, RAMP improved mean corrupted Dice from 0.633 to 0.789 and reduced the robustness gap from 0.290 to 0.070. Although RAMP did not achieve the highest clean-image Dice, it substantially mitigated worst-case segmentation collapse under severe image degradation. These results suggest that multi-corruption augmentation can serve as a practical pre-deployment strategy for improving the reliability of CT segmentation systems in heterogeneous clinical environments.
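
Stochastic multi-corruption composition can be sketched as drawing a random subset of corruption functions and applying them in sequence to each training image. The three corruptions, their parameter ranges and the definition of the robustness gap as clean Dice minus corrupted Dice are assumptions for illustration; RAMP's actual transformations and settings are not given in the abstract.

```python
import random


def add_noise(pixels, rng):
    """Additive Gaussian noise (assumed standard deviation of 20 HU)."""
    return [p + rng.gauss(0.0, 20.0) for p in pixels]


def shift_intensity(pixels, rng):
    """Global intensity offset (assumed range of +/- 50 HU)."""
    offset = rng.uniform(-50.0, 50.0)
    return [p + offset for p in pixels]


def scale_contrast(pixels, rng):
    """Contrast change about the mean (assumed factor between 0.7 and 1.3)."""
    factor = rng.uniform(0.7, 1.3)
    mean = sum(pixels) / len(pixels)
    return [mean + factor * (p - mean) for p in pixels]


CORRUPTIONS = [add_noise, shift_intensity, scale_contrast]


def corrupt(pixels, rng, max_ops=3):
    """Apply between 1 and `max_ops` distinct corruptions in random order."""
    ops = rng.sample(CORRUPTIONS, rng.randint(1, max_ops))
    out = list(pixels)
    for op in ops:
        out = op(out, rng)
    return out, [op.__name__ for op in ops]


def robustness_gap(clean_dice, corrupted_dice):
    """Assumed definition: drop in Dice from clean to corrupted images."""
    return clean_dice - corrupted_dice


rng = random.Random(0)  # fixed seed so the sketch is repeatable
scan = [-1000.0, -200.0, 40.0, 60.0, 300.0]  # hypothetical HU values
corrupted, applied = corrupt(scan, rng)
```

Composing several degradations per image, rather than one at a time, exposes the model to combinations that single-corruption augmentation would never generate.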

2605.12567 2026-06-18 cs.CV cs.AI 版本更新 90%

Pyramid Self-Contrastive Learning for Single-shot Test-time Ultrasound Image Denoising

金字塔自对比学习用于单次测试时超声图像去噪

Jiajing Zhang, Bingze Dai, Xi Zhang, Yue Xu, Wei-Ning Lee

发表机构 * Department of Electrical and Computer Engineering, The University of Hong Kong(香港大学电子与计算机工程系) Department of Biomedical Engineering, Duke University(杜克大学生物医学工程系)

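Topic match (Medical imaging): a test-time ultrasound image denoising framework that improves structural detail.

AI summary: Proposes a purely test-time training framework for single-shot ultrasound image denoising, applied to synthetic aperture ultrasound, that uses self-contrastive learning to separate anatomical similarity from noise randomness, improving denoising quality and structural detail.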

详情
AI中文摘要

内在的电子噪声和斑点噪声使超声图像的临床解释复杂化。传统去噪方法依赖显式噪声假设,其有效性在复合噪声条件下减弱。基于学习的方法需要大量标注数据和模型参数。这些预定义和预训练的方法在复杂体内环境中不可避免地导致领域偏移,因此局限于特定噪声类型并常模糊结构细节。本文提出了一种纯测试时训练框架用于单次超声图像去噪,并应用于合成孔径超声(SAU),该方法通过自对比学习在金字塔潜在空间中分离解剖相似性和噪声随机性。干净图像随后从解剖空间解码,而丢弃噪声空间。A2A在测试时仅使用一个噪声样本的SAU信号进行训练,从而从根本上消除了领域偏移和预训练成本。模拟实验,包括电子噪声水平0至30 dB和不同包含几何形状,证明了A2A在SNR和CNR上的改进分别为69.3%和34.4%。体内结果表明,仅使用心脏六个超声切面、肝脏和肾脏的两个孔径数据,SNR和CNR分别提高了84.8%和25.7%。A2A在多种成像目标和配置中产生清晰的图像/信号,为更可靠的超声解剖可视化和功能评估铺平了道路。

英文摘要

The inherent electronic and speckle noise complicates clinical interpretation of ultrasound images. Conventional denoising methods rely on explicit noise assumptions whose validity diminishes under composite noise conditions. Learning-based methods are usually pretrained in a limited image domain using a labeled dataset, which implies inevitable domain shift in complex in vivo environments. This study proposes a Pyramid Self-Contrastive Learning (PSCL) framework for test-time ultrasound image denoising without pretraining. Given multiple noisy samples from only one-shot imaging, PSCL disentangles anatomical similarity and noise randomness into separate pyramid latent spaces. The clean image is then decoded from the anatomy space while discarding the noise space. We first apply PSCL to synthetic aperture ultrasound (SAU), where an Aperture-to-Aperture loop serves as a self-supervised proxy task to ensure denoising fidelity. Simulation experiments, including noise levels from 0 to 30 dB and inclusion geometries from simple to complex, demonstrated improvements of 69.3% in SNR and 34.4% in CNR. The in vivo results showed 84.8% SNR and 25.7% CNR gains using only two aperture data of the heart in six echocardiographic views, liver, and kidney. PSCL delivers clear images across diverse imaging targets and configurations, paving the way for more reliable anatomical visualization without domain shift and pretraining costs.

2602.11467 2026-06-18 cs.LG 版本更新 90%

PRISM: A 3D Probabilistic Neural Representation for Interpretable Shape Modeling

PRISM:一种用于可解释形状建模的三维概率神经表示

Yining Jiao, Sreekalyani Bhamidi, Carlton Jude Zdanski, Julia S Kimbell, Andrew Prince, Cameron P Worden, Samuel Kirse, Christopher Rutter, Benjamin H Shields, Jisan Mahmud, Marc Niethammer

发表机构 * Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, USA(北卡罗来纳大学教堂山分校计算机科学系) Department of Computer Science, University of California San Diego, La Jolla, USA(加州大学圣地亚哥分校计算机科学系) School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, USA(北卡罗来纳大学教堂山分校医学院)

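Topic match (Medical imaging): interpretable shape modeling and uncertainty of anatomical structures.

AI summary: Proposes the PRISM framework, which combines implicit neural representations with uncertainty-aware statistical shape analysis and uses a closed-form Fisher Information metric for efficient local temporal uncertainty quantification, performing strongly on shape evolution modeling, personalized prediction, and anomaly detection.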

Comments ICML 2026, camera-ready version, 24 pages

Abstract

Understanding how anatomical shapes evolve in response to developmental covariates - and quantifying their spatially varying uncertainties - is critical in healthcare research. Existing approaches typically rely on global time-warping formulations that ignore spatially heterogeneous dynamics. We introduce PRISM, a novel framework that bridges implicit neural representations with uncertainty-aware statistical shape analysis. PRISM models the conditional distribution of shapes given covariates, providing spatially continuous estimates of both the population mean and covariate-dependent uncertainty at arbitrary locations. A key theoretical contribution is a closed-form Fisher Information metric that enables efficient, analytically tractable local temporal uncertainty quantification via automatic differentiation. Experiments on three synthetic datasets and one clinical dataset demonstrate PRISM's strong performance across diverse tasks - from modeling shape evolution to personalized shape prediction and anomaly detection - within a unified framework, while providing interpretable and clinically meaningful uncertainty estimates.
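
To make the idea of a Fisher Information metric over a covariate concrete, here is a minimal sketch for the simplest case: a univariate Gaussian whose mean and standard deviation both depend on a scalar covariate t. For that model the Fisher information with respect to t is (mu'(t)^2 + 2 sigma'(t)^2) / sigma(t)^2. The Gaussian assumption, the function name `gaussian_fisher_information`, and the use of central finite differences in place of automatic differentiation are all assumptions made for illustration; the abstract does not state PRISM's actual formulation.

```python
import numpy as np

def gaussian_fisher_information(mean_fn, std_fn, t, eps=1e-5):
    """Fisher information of N(mean_fn(t), std_fn(t)^2) with respect to t.

    Hypothetical helper: derivatives are approximated with central
    differences rather than automatic differentiation.
    """
    d_mean = (mean_fn(t + eps) - mean_fn(t - eps)) / (2.0 * eps)
    d_std = (std_fn(t + eps) - std_fn(t - eps)) / (2.0 * eps)
    std = std_fn(t)
    # I(t) = (mu'(t)^2 + 2 * sigma'(t)^2) / sigma(t)^2
    return (d_mean ** 2 + 2.0 * d_std ** 2) / std ** 2

# Toy covariate-dependent model (illustrative values only)
info = gaussian_fisher_information(lambda t: 3.0 * t, lambda t: 2.0, t=1.0)
```

A large value means the distribution changes quickly as the covariate changes, so observations at that point are informative about t.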

2512.10353 2026-06-18 cs.CV Updated version 90%

Hybrid Transformer-Mamba for Weakly Supervised Volumetric Medical Segmentation

Yiheng Lyu, Lian Xu, Coen Arrow, Mohammed Bennamoun, Farid Boussaid, Girish Dwivedi

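Affiliations University of Western Australia; Harry Perkins Institute of Medical Research; National Imaging Facility; Fiona Stanley Hospital; Victor Chang Cardiac Research Institute

Topic match Medical imaging: hybrid Transformer-Mamba for weakly supervised volumetric medical segmentation

AI summary Proposes TranSamba, a hybrid architecture that captures 3D context through cross-plane modeling for efficient weakly supervised volumetric segmentation, achieving state-of-the-art performance on three datasets.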

Abstract

Weakly supervised segmentation enables model training from plane-level labels. Existing methods often rely on 2D encoders, neglecting the volumetric nature of medical data. We propose TranSamba, a hybrid Transformer-Mamba architecture designed to capture 3D context via cross-plane modeling. TranSamba augments a Vision Transformer backbone with Cross-Plane Mamba blocks, leveraging linear-time modeling for efficient information exchange across neighboring planes. This exchange improves in-plane self-attention and subsequent attention maps for object localization. TranSamba maintains linear time complexity and constant space complexity with respect to the input volume depth. Extensive experiments on three datasets covering diverse modalities and pathologies show that TranSamba achieves state-of-the-art performance, demonstrating the generalizable efficacy of cross-plane modeling. Code is available at: https://github.com/YihengLyu/TranSamba.

2511.12126 2026-06-18 eess.IV 90%

Volumetric Ultrasound via 3D Null Subtraction Imaging with Circular and Spiral Apertures

Bingze Dai, Xi Zhang, Wei-Ning Lee

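Topic match Medical imaging: proposes 3D Null Subtraction Imaging for volumetric ultrasound

AI summary Proposes 3D Null Subtraction Imaging, which pairs an efficient null-subtraction process with sparse aperture designs to better balance image quality, frame rate, and hardware complexity in volumetric ultrasound; experiments show better azimuthal and elevational resolution and contrast than conventional DAS beamforming.

Comments 10 pages, 12 figures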

Journal ref Ultrasonics, 2026: 108179

Abstract

Volumetric ultrasound imaging faces a fundamental trade-off among image quality, frame rate, and hardware complexity. This study introduces three-dimensional Null Subtraction Imaging (3D NSI), a nonlinear beamforming framework that addresses this trade-off by combining a computationally efficient null-subtraction process with multiplexing-aware sparse aperture designs on matrix arrays. We evaluate three apodization configurations: a fully addressed circular aperture and two Fermat's spiral sparse apertures. To overcome channel-sharing constraints common in matrix arrays multiplexed with low-channel-count ultrasound systems, we propose a spiral "no-reuse" apodization that enforces non-overlapping element sets across transmit-receive events. This design resolves multiplexing conflicts and enables up to a 16-fold increase in acquisition volume rate using only 240 active elements on a 1024-element probe. In computer simulations and tissue-mimicking phantom experiments, 3D NSI achieved an average improvement of 36% in azimuthal and elevational resolutions, along with an approximately 20% higher contrast ratio, compared to the conventional Delay-and-Sum (DAS) beamformer under matched transmit/receive configurations. When implemented with the spiral no-reuse aperture, the 3D NSI framework achieved over 1000 volumes per second with a computational load less than three times that of DAS, making it a practical solution for real-time 4D imaging.
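
As a rough illustration of a Fermat's spiral sparse aperture with non-overlapping element sets, the sketch below places points on a Fermat's spiral (radius proportional to the square root of the index, angle advancing by the golden angle), snaps them to the nearest element of a square matrix array, and splits the selection into two disjoint groups. The 32 x 32 grid, the function names, and the alternating split are assumptions for illustration only; the abstract does not describe how the paper assigns elements to transmit-receive events.

```python
import numpy as np

GOLDEN_ANGLE = np.pi * (3.0 - np.sqrt(5.0))  # about 137.5 degrees

def fermat_spiral_elements(n_points, grid_size):
    """Flat indices of matrix-array elements nearest to Fermat's spiral points."""
    n = np.arange(n_points)
    center = (grid_size - 1) / 2.0
    # r_n grows like sqrt(n), which keeps the sampling density roughly uniform
    radius = center * np.sqrt((n + 0.5) / n_points)
    angle = n * GOLDEN_ANGLE
    rows = np.rint(center + radius * np.sin(angle)).astype(int)
    cols = np.rint(center + radius * np.cos(angle)).astype(int)
    flat = rows * grid_size + cols
    # Snapping can map two spiral points to one element; keep first occurrences
    _, first = np.unique(flat, return_index=True)
    return flat[np.sort(first)]

def split_no_reuse(elements):
    """Hypothetical split into two disjoint element sets (alternating order)."""
    return elements[0::2], elements[1::2]

elements = fermat_spiral_elements(n_points=240, grid_size=32)
set_a, set_b = split_no_reuse(elements)
```

Because duplicates are dropped after snapping, the function may return fewer than `n_points` elements; a real design would need to top up or adjust the spiral to hit an exact channel count.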

2510.13562 2026-06-18 physics.med-ph cs.CV cs.NA math.NA 90%

An efficient approach with theoretical guarantees to simultaneously reconstruct activity and attenuation sinogram for TOF-PET

Liyang Hu, Chong Chen

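Affiliations State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100190, China

Topic match Medical imaging: PET reconstruction, a core medical imaging method

AI summary Proposes a new maximum-likelihood method for simultaneously reconstructing the activity and attenuation sinogram in TOF-PET; it exploits the exponential form of the attenuation correction factors and a total-activity constraint, proves well-posedness, and demonstrates advantages in accuracy and efficiency experimentally.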

Comments 32 pages, 11 figures, 4 tables

Journal ref IEEE Transactions on Computational Imaging 2026

Abstract

In positron emission tomography (PET), attenuation correction is indispensable for obtaining a quantitatively accurate activity map (tracer distribution) in the body. Generally, this is carried out based on an estimated attenuation map obtained from computed tomography or magnetic resonance imaging. However, apart from errors in the resulting attenuation correction factors, the additional scan not only adds radiation dose and/or increases the scanning time but also leads to severe misalignment induced by various motions during and between the two sequential scans. To address these issues, based on maximum likelihood estimation, we propose a new mathematical model for simultaneously reconstructing the activity and attenuation sinogram from the time-of-flight (TOF)-PET emission data only. In particular, we make full use of the exclusively exponential form of the attenuation correction factors, and impose a constraint on the total amount of activity in some mask region. Furthermore, we prove the well-posedness of the model, including the existence, uniqueness and stability of the solution. We propose an alternating update algorithm to solve the model and analyze its convergence. Finally, numerical experiments with various TOF-PET emission data demonstrate that the proposed method converges numerically, is robust to noise, outperforms some state-of-the-art methods in terms of accuracy and efficiency, and is capable of autonomous attenuation correction.
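
The "exclusively exponential form" can be made concrete with the standard TOF-PET measurement model. The notation below (activity $\lambda$, attenuation map $\mu$, attenuation sinogram $s_i$, correction factor $a_i$, system matrix $c_{itj}$) is assumed for illustration and is not taken from the paper; scatter and random coincidences are ignored, and the paper's exact objective and constraint may differ.

```latex
% Attenuation factor along line of response L_i
a_i = \exp\!\left(-\int_{L_i} \mu(x)\,\mathrm{d}x\right) = \exp(-s_i),
\qquad s_i \ge 0

% Expected TOF emission data in line of response i and TOF bin t
\bar{y}_{it} = a_i \sum_{j} c_{itj}\,\lambda_j

% Poisson negative log-likelihood (up to constants) in the unknowns (\lambda, s)
\ell(\lambda, s) = \sum_{i,t} \left( \bar{y}_{it} - y_{it}\,\log \bar{y}_{it} \right)

% Total-activity constraint on a mask region M with known amount C
\sum_{j \in M} \lambda_j = C
```

Scaling $\lambda$ by a constant while dividing every $a_i$ by the same constant leaves $\bar{y}_{it}$ unchanged, which is one way to see why a constraint such as the total activity in a mask region is useful for pinning down a unique solution.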

2507.05647 2026-06-18 eess.IV cs.CV 90%

Diffusion-Based Limited-Angle CT Reconstruction under Noisy Conditions

Jiaqi Guo, Santiago López-Tapia

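Affiliations Dept. of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA

Topic match Medical imaging: CT reconstruction method, directly applicable to medical imaging

AI summary Proposes a diffusion-based limited-angle CT reconstruction method that completes missing angular views with a Mean-Reverting Stochastic Differential Equation and adds a noise-aware rectification mechanism for robustness; experiments show strong performance across noise intensities and acquisition scenarios.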

Comments Accepted at the 2025 IEEE International Conference on Image Processing (ICIP), Workshop

Abstract

Limited-Angle Computed Tomography (LACT) is a challenging inverse problem where missing angular projections lead to incomplete sinograms and severe artifacts in the reconstructed images. While recent learning-based methods have demonstrated effectiveness, most of them assume ideal, noise-free measurements and fail to address the impact of measurement noise. To overcome this limitation, we treat LACT as a sinogram inpainting task and propose a diffusion-based framework that completes missing angular views using a Mean-Reverting Stochastic Differential Equation (MR-SDE) formulation. To improve robustness under realistic noise, we propose RNSD$^+$, a novel noise-aware rectification mechanism that explicitly models inference-time uncertainty, enabling reliable and robust reconstruction. Extensive experiments demonstrate that our method consistently surpasses baseline models in data consistency and perceptual quality, and generalizes well across varying noise intensity and acquisition scenarios.
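
For readers unfamiliar with mean-reverting SDEs, the sketch below simulates the generic forward process dx = theta * (mu - x) dt + sigma dW with the Euler-Maruyama scheme, where the state drifts toward a target mean mu while noise is injected. Constant `theta` and `sigma`, the function name, and the toy arrays are assumptions for illustration; the abstract does not give the paper's schedules, its reverse-time sampler, or the RNSD$^+$ rectification step, none of which are reproduced here.

```python
import numpy as np

def simulate_mean_reverting_sde(x0, mu, theta, sigma, n_steps, dt, rng):
    """Euler-Maruyama simulation of dx = theta * (mu - x) dt + sigma dW."""
    x = np.array(x0, dtype=float)  # copy so the caller's array is untouched
    mu = np.asarray(mu, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * noise
    return x

# Toy example: a "clean" sinogram patch drifting toward a degraded mean
rng = np.random.default_rng(0)
clean = np.ones((4, 4))
degraded_mean = np.zeros((4, 4))
x_T = simulate_mean_reverting_sde(clean, degraded_mean, theta=1.0, sigma=0.1,
                                  n_steps=500, dt=0.01, rng=rng)
```

In diffusion-based restoration, a network is typically trained to reverse a process of this kind, starting from the degraded end and recovering the clean signal.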

2606.19182 2026-06-18 eess.IV New submission 85%

Optimized Multi-Contrast Self-Supervised MRI Reconstruction using Learned k-space Partitioning

Brenden Kadota, Charles Millard, Mark Chiew

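Topic match Medical imaging: proposes a multi-contrast self-supervised MRI reconstruction method

AI summary Proposes a multi-contrast self-supervised learning framework that learns the optimal k-space data partitioning end to end, improving MRI reconstruction quality without fully sampled data.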

Abstract

Objective: Deep learning has shown promise in accelerating MRI by reconstructing high-quality images from under-sampled data. While recent work has leveraged multi-contrast information to improve reconstruction performance, these methods rely on supervised learning, which requires fully sampled k-space for training. One method, self-supervised learning via data undersampling (SSDU), enables direct training on under-sampled k-space by partitioning it into two sets, with a network mapping between the two. In this work, we improve self-supervised MRI reconstruction with two modifications. Methods: We propose a multi-contrast self-supervised learning framework that jointly trains on multiple under-sampled contrasts without requiring fully sampled k-space data as a reference. Moreover, we learn an optimal self-supervised data partitioning for each contrast in an end-to-end manner, further enhancing reconstruction quality. Specifically, we learn an optimal partitioning probability distribution, which is sampled to generate a mask for partitioning. Results: Experiments on two publicly available multi-contrast MRI datasets demonstrate the improved reconstruction quality of our proposed self-supervised multi-contrast learned partitioning method compared to the current single-contrast self-supervised learning methods. We also demonstrate that learning the partitioning of k-space data further enhances the fidelity of reconstructions. Conclusion: Multi-contrast reconstruction combined with learned partitioning improves reconstruction fidelity over single-contrast self-supervised MRI reconstructions. Significance: Our method can facilitate higher image fidelity and/or accelerated MRI protocol times compared to previous self-supervised methods, and without requiring fully sampled k-space for training.
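
The partitioning step can be sketched as follows: given the mask of acquired k-space samples and a per-location probability of assigning a sample to the loss set, a Bernoulli draw splits the acquired samples into two disjoint sets, one fed to the network and one held out for the loss. In the paper the probability distribution is learned end to end; here it is a fixed array, and the function name and toy sizes are assumptions for illustration only.

```python
import numpy as np

def partition_kspace_mask(acquired, loss_prob, rng):
    """Split acquired k-space locations into disjoint input and loss masks.

    acquired  : boolean array, True where k-space was sampled
    loss_prob : probability (scalar or array) that an acquired sample
                goes to the loss set; fixed here, learned in the paper
    """
    acquired = np.asarray(acquired, dtype=bool)
    loss_prob = np.broadcast_to(np.asarray(loss_prob, dtype=float), acquired.shape)
    draw = rng.random(acquired.shape) < loss_prob
    loss_mask = acquired & draw       # held out, used only in the training loss
    input_mask = acquired & ~draw     # given to the reconstruction network
    return input_mask, loss_mask

rng = np.random.default_rng(0)
acquired = rng.random((8, 8)) < 0.4   # toy under-sampling pattern
input_mask, loss_mask = partition_kspace_mask(acquired, loss_prob=0.3, rng=rng)
```

Sampling a hard mask is not differentiable, so learning the probabilities in practice requires a relaxation or gradient estimator, which this sketch does not attempt.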

2606.18489 2026-06-18 eess.IV New submission 85%

GHOST-CAT: An Efficient and Practical Network for Mesh Generation from 3D Echocardiography

Edward Ferdian, Debbie Zhao, Alistair A. Young, Martyn P. Nash

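Topic match Medical imaging: generates left-ventricle meshes from 3D echocardiography, a medical image processing task

AI summary Proposes GHOST-CAT, a two-stage network combining CNNs, graph convolutions, and Transformers to generate topologically consistent, temporally coherent left-ventricle meshes from 3D echocardiography; on a 100-case test set it reaches Dice coefficients of 0.87 (cavity) and 0.75 (myocardium), outperforming existing methods.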

Abstract

Recent advances in deep learning have significantly accelerated cardiac imaging workflows, from segmentation to the generation of meshes for computational modelling. Nevertheless, analysis of 3D echocardiograms presents unique challenges due to their low contrast-to-noise ratio, conical field of view, and susceptibility to acoustic shadowing. Here, we present an efficient and practical network tailored for 3D echocardiograms. Our method consists of a two-stage network that combines convolutional neural networks, graph convolutional networks, and transformers, to create accurate time-varying 3D meshes of the left ventricle that are topologically consistent and temporally coherent throughout the cardiac cycle. Our model achieved superior mesh reconstruction accuracy compared to current state-of-the-art methods on a held-out test dataset of 100 3D echo images, with a Dice coefficient of 0.87 +/- 0.05 (cavity) and 0.75 +/- 0.07 (myocardium), and mean +/- SD surface distances of 3.3 +/- 0.6 mm (endocardium) and 3.5 +/- 0.5 mm (epicardium), against reference segmentations derived from cardiac magnetic resonance imaging. The reconstructed mesh enables automated calculation of routine clinical indices, such as volume, mass, and strain, and enables advanced applications with biophysical digital twins. Source code is openly shared at https://github.com/EdwardFerdian/ghost-cat.
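
The Dice coefficient reported above measures overlap between a predicted and a reference segmentation: twice the size of their intersection divided by the sum of their sizes. A minimal sketch for binary masks follows; the function name and the toy masks are illustrative and not taken from the GHOST-CAT code, and how the paper converts meshes to masks before scoring is not described in the abstract.

```python
import numpy as np

def dice_coefficient(pred, ref):
    """Dice = 2 * |pred AND ref| / (|pred| + |ref|) for equal-shape binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, ref).sum() / total

pred = np.array([[1, 1, 0], [0, 0, 0]])
ref = np.array([[1, 0, 0], [0, 0, 0]])
score = dice_coefficient(pred, ref)  # intersection 1, sizes 2 and 1
```

A score of 1 means perfect overlap and 0 means none, so 0.87 for the cavity indicates closer agreement with the reference than 0.75 for the thinner myocardium.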