arXivDaily: arXiv daily academic digest, updated Monday through Friday

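Vision and Robotics

Image Generation

Image generation, text-to-image, image editing, diffusion models, and controllable generation.

Indexed for today/current date: 33. Signal sources: cs.CV, cs.GR, cs.MM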

1. Image restoration: 1 paper

2606.20095 2026-06-19 cs.CV New submission 60%

Stitching and dimensionality effects on large artificially generated volume datasets


Lucas von Chamier, Jan Philipp Albrecht, Dagmar Kainmüller

Affiliations*: GFZ Helmholtz-Zentrum für Geoforschung; Max Delbrück Center for Molecular Medicine in the Helmholtz Association; Helmholtz Imaging; Humboldt-Universität zu Berlin; University of Potsdam

Topic match: Image restoration: stitching artifacts affect generation quality

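AI summary: Studies how stitching artifacts that arise when deep learning generates large images affect style transfer, comparing 2D and 3D models. Finds that FID cannot detect subtle artifacts that affect downstream tasks, and that 3D models are slightly better but computationally expensive.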


Abstract

Generating large images via deep learning requires patching input data to accommodate hardware memory limitations, then assembling output patches, a process that can introduce stitching artifacts when neighboring patches do not align at borders. While these artifacts are known to affect segmentation tasks, their impact on generative models for style-transfer remains poorly understood. We investigated three stitching approaches and two patch dimensionalities (2D vs 3D) using cycleGAN models trained on cryo-electron microscopy datasets. We evaluated both perceptual quality and performance on downstream mitochondria segmentation. Our key findings reveal that: (1) FID scores fail to detect subtle stitching artifacts that significantly impact downstream segmentation performance, (2) 3D models with artifact-free stitching marginally outperform 2D models on downstream tasks, though the improvement barely justifies the computational cost, and (3) 2D models train more stably due to larger batch sizes. Additionally, we demonstrate that ensembling predictions from three orthogonal directions can improve low-quality volumes but provides no benefit for high-quality outputs. These results demonstrate that maximizing generative model performance on large scientific datasets requires careful consideration and mitigation of stitching artifacts, and that perceptual metrics alone are insufficient for evaluating domain adaptation quality in biomedical imaging.
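The patch-and-stitch process described in the abstract can be illustrated with a small sketch. It is not one of the three stitching approaches studied in the paper, which the abstract does not specify. It assumes a 1D signal in place of a volume, a hypothetical `toy_model` that adds a different offset to each patch (a stand-in for a generator whose outputs disagree at patch borders), and triangular blending weights chosen for illustration only.

```python
def split_patches(n, size, stride):
    """Start indices of patches of length `size` covering range(n); assumes n >= size."""
    starts = list(range(0, n - size + 1, stride))
    if starts[-1] + size < n:  # add a final patch so the tail is covered
        starts.append(n - size)
    return starts


def triangular_weights(size):
    """Weights that peak at the patch centre and fall off towards the borders."""
    return [min(i + 1, size - i) for i in range(size)]


def stitch(signal, size, stride, model, weights):
    """Run `model` on each patch and combine overlapping outputs by weighted average."""
    n = len(signal)
    acc = [0.0] * n   # weighted sum of patch outputs per position
    norm = [0.0] * n  # sum of weights per position
    for k, start in enumerate(split_patches(n, size, stride)):
        out = model(signal[start:start + size], k)
        for i, (value, w) in enumerate(zip(out, weights)):
            acc[start + i] += w * value
            norm[start + i] += w
    return [a / m for a, m in zip(acc, norm)]


def toy_model(patch, k):
    """Hypothetical generator stand-in: identity plus a patch-dependent offset."""
    offset = 0.5 if k % 2 else -0.5
    return [x + offset for x in patch]


def max_jump(signal):
    """Largest difference between neighbouring samples, a crude seam indicator."""
    return max(abs(b - a) for a, b in zip(signal, signal[1:]))


signal = [0.0] * 64
# Non-overlapping tiles with uniform weights: plain concatenation of patch outputs.
naive = stitch(signal, size=16, stride=16, model=toy_model, weights=[1.0] * 16)
# 50% overlap with triangular weights: neighbouring patches are cross-faded.
blended = stitch(signal, size=16, stride=8, model=toy_model,
                 weights=triangular_weights(16))

print("max jump, naive tiling:   ", max_jump(naive))
print("max jump, overlap blending:", max_jump(blended))
```

In this toy setting the cross-fade spreads each disagreement over the overlap region, so the seams are much less abrupt, but the underlying inconsistency between patches is smoothed rather than removed. A global score can miss this kind of local seam while a downstream task still reacts to it.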

2. Other image generation: 2 papers

2606.19957 2026-06-19 cs.CY New submission 60%

Modest, artistic, and radical solutions to the environmental impact of image-generating machine learning


Laura U. Marks, Jess MacCormack, Kehui Li

Topic match: Other image generation: discusses the environmental impact of image-generating ML and possible solutions

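AI summary: Addresses the high energy consumption of image-generating ML, exploring solutions such as inexact computing, small models, and low-precision hardware from the perspectives of computer engineering, media studies, and art, and proposes a true-cost accounting.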

Comments: Paper in Proceedings of LIMITS 2026: 12th Workshop on Computing within Limits, 2026-06-23-25, Online


Abstract

Machine learning is often touted to improve the efficiency of ICT, but that small gain is overwhelmed by the enormous carbon, water, and land footprints of data centers and ML-ready devices. We survey the electricity consumption of ML applications in training and inference, focusing on electricity-intensive image generation. Our team of a computer engineer, a media scholar, and an artist explores solutions including inexact computing; tiny language models; low-precision hardware architectures; hardware with limited capacity; and anticipating and mitigating energy demands at the design phase. We will sketch our work in progress on an ethical and aesthetically sophisticated tiny image generator using non-scraped data. Looking to the economic context, we will propose a true-cost accounting for the environmental impact of machine learning and suggest that the criterion of efficiency is driven by the shareholder-capitalist framing of ICT.

2606.19701 2026-06-19 astro-ph.HE New submission 55%

On the Contribution of Local Sources to the Galactic Cosmic-Ray Spectrum: An Exact Series Solution for Two-Zone Diffusion


Zi-Hang Liu, Yiwei Bao, Ruo-Yu Liu

Topic match: Other image generation: diffusion model of local-source contributions to the cosmic-ray spectrum

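AI summary: Derives a series Green's function for a two-zone diffusion model. Monte Carlo simulations show that slow near-source diffusion raises the probability of a local-source contribution from 0.4% to 1.7-2.2%, but the statistical difficulty remains and the local-source interpretation is highly model dependent.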

Comments: Submitted to PRD. The code accompanying this paper will be released soon.


Abstract

Measurements of cosmic-ray proton and helium spectra below the knee show deviations from simple power laws, including multi-TeV structures. A possible explanation is that one or a few nearby sources contribute an additional component to the local spectrum. However, previous work shows that a dominant local contribution is statistically unlikely under a homogeneous diffusion model. In this work, we investigate how this probability changes if cosmic rays experience inefficient transport near their sources, motivated by observations of extended gamma-ray emission around Galactic accelerators. We derive a series Green's function that enables fast calculation of the particle distribution in this scenario, making Monte Carlo calculations for Galactic source populations feasible. The inner slow-diffusion region delays escape and redistributes the arriving particles in time and energy. In Monte Carlo realizations, the probability that the strongest local source becomes comparable to the background at $10\,\rm{TeV}$ increases from about $0.4\%$ in homogeneous diffusion to $1.7$--$2.2\%$ in the two-zone models. Thus, inhibited near-source transport weakens, but does not remove, the statistical difficulty. We then examine cataloged nearby candidate supernova remnants and show that a $10\,\rm{TeV}$ feature can be reproduced only with additional assumptions, especially a harder local injection spectrum and a favorable diffusion coefficient. The predicted contribution of a given source changes strongly among different particle transport models. Therefore, the local source interpretations are plausible but highly model dependent, and require independent constraints on source injection history, particle transport mechanisms, and local interstellar turbulence.
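For orientation, the homogeneous baseline that the abstract contrasts with the two-zone case has a well-known closed form. The block below is the textbook point-source solution for a single burst with a constant diffusion coefficient, no energy losses, and no boundaries. The symbols $N$, $D$, and $Q_0$ are notation assumed here for illustration, and the paper's two-zone series solution is not reproduced.

```latex
% Burst-like point source at the origin, constant diffusion coefficient D:
\frac{\partial N}{\partial t} = D\,\nabla^{2} N + Q_0\,\delta^{3}(\mathbf{r})\,\delta(t)

% Green's function solution for t > 0:
N(r, t) = \frac{Q_0}{(4\pi D t)^{3/2}}\,\exp\!\left(-\frac{r^{2}}{4 D t}\right)

% Characteristic diffusion length:
r_{\mathrm{d}} = 2\sqrt{D t}
```

In this simplified picture a source contributes appreciably at distance $r$ only once $r_{\mathrm{d}}$ becomes comparable to $r$, which is why a slower diffusion coefficient near the source delays the arrival of particles.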