arXivDaily: arXiv daily academic digest, updated Monday through Friday

Vision and Robotics

Multimodal Information Fusion

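Information fusion for image, video, multi-sensor, and cross-modal perception, including image fusion, infrared-visible fusion, remote sensing, medical imaging, LiDAR/radar/camera fusion, and audio-visual fusion.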

Entries for today/current date: 2. Sources: cs.CV, eess.IV, eess.SP, cs.RO, cs.MM
2606.05368 2026-06-18 cs.CV Version update 80%

Biomazon: A Multimodal Dataset for 3D Forest Structure and Biomass Modeling in the Amazon Basin

Sayan Mandal, Rocco Sedona, Simon Besnard, Mikhail Urbazaev, Morris Riedel, Ehsan Zandi, Gabriele Cavallaro

Affiliations: Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich; School of Engineering and Natural Sciences (SENS), University of Iceland; Global Land Monitoring Group, GFZ Helmholtz Centre for Geosciences

Topic match: Remote sensing fusion and pansharpening: multi-sensor predictor fusion for forest structure modeling

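AI summary: Existing methods do not learn forest vertical structure as an ordered profile. To address this, the paper introduces Biomazon, a multimodal benchmark dataset that pairs GEDI RH and AGBD targets with multi-sensor predictors. Ablation studies with a shared encoder-decoder framework establish a reference benchmark for structurally consistent RH-profile prediction and structure-biomass modeling in tropical forests.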

Comments: 32 pages, 21 figures, 8 tables

Abstract

Accurate, spatially explicit characterization of tropical forest structure is essential for carbon accounting and ecosystem monitoring, yet most ML pipelines predict canopy-top height proxies (e.g., RH95/RH98) or AGBD as separate scalar targets, rather than learning the forest vertical structure as an ordered profile. The community lacks a ML-ready multimodal benchmark for predicting the entire GEDI RH profile jointly with AGBD, or for evaluating methods that enforce physically consistent ordering across RH percentiles. We address this with Biomazon, a 20 m multimodal benchmark dataset over the Amazon Basin that pairs GEDI RH and AGBD targets with multi-sensor predictors (Sentinel-1/2, ALOS-2 PALSAR-2, Copernicus DEM, Dynamic World LULC, and AlphaEarth embeddings) under standardized spatial splits and evaluation protocols. Using a shared encoder-decoder with task-specific heads as a baseline framework, we conduct a comprehensive ablation study of (i) backbone/model scale, (ii) modality contributions, and (iii) the use of auxiliary embeddings under standalone and fusion settings, and we report both single-target and joint-target results to quantify tradeoffs under a unified training protocol. Finally, we contextualize baseline performance through regionally aligned comparisons against existing gridded products, including GEDI L4D RH10-RH98 and AGBD, at matching temporal scale. Biomazon, together with the accompanying protocols and baseline results, establishes a reference benchmark for future work on structurally consistent RH-profile prediction and structure-biomass modeling in tropical forests.
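
The abstract refers to methods that enforce a physically consistent ordering across RH percentiles but does not say how. One common way to obtain an ordered profile is to predict a base value plus positive increments and accumulate them. The sketch below illustrates that idea with NumPy. It is not the Biomazon baseline, and the function names `ordered_profile` and `ordering_violation_rate` are hypothetical.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)), always positive.
    return np.logaddexp(0.0, x)

def ordered_profile(raw):
    """Map unconstrained outputs of shape (n, k) to increasing profiles.

    Column 0 is used as the base value and is left unconstrained.
    The remaining columns become positive increments, so every row
    is ordered by construction.
    """
    raw = np.asarray(raw, dtype=float)
    base = raw[:, :1]
    increments = softplus(raw[:, 1:])
    return np.concatenate([base, base + np.cumsum(increments, axis=1)], axis=1)

def ordering_violation_rate(profiles):
    """Fraction of adjacent percentile pairs where the higher one is smaller."""
    diffs = np.diff(np.asarray(profiles, dtype=float), axis=1)
    return float(np.mean(diffs < 0))

# Hypothetical raw model outputs for two footprints and four percentiles.
raw = np.array([[1.0, -2.0, 0.5, -3.0],
                [0.0, 0.0, 0.0, 0.0]])
profiles = ordered_profile(raw)
```

A metric such as `ordering_violation_rate` could also be used to check how often an unconstrained per-percentile regressor produces out-of-order profiles, which is one way to compare it against a constrained head like the one above.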

2511.20302 2026-06-18 cs.CV Version update 80%

CrossEarth-Gate: Fisher-Guided Adaptive Tuning Engine for Efficient Adaptation of Cross-Domain Remote Sensing Semantic Segmentation

Shilei Cao, Ziyang Gong, Hehai Lin, Yang Liu, Jiashun Cheng, Xiaoxing Hu, Haoyuan Liang, Guowen Li, Chengwei Qin, Hong Cheng, Xue Yang, Juepeng Zheng, Haohuan Fu

Affiliations: Sun Yat-sen University; The Chinese University of Hong Kong; Shanghai Jiao Tong University; National Supercomputing Center in Shenzhen; The Hong Kong University of Science and Technology; Beijing Institute of Technology; The Hong Kong University of Science and Technology (Guangzhou); Tsinghua University

Topic match: Remote sensing fusion and pansharpening: adaptive tuning for cross-domain remote sensing semantic segmentation

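AI summary: The paper proposes CrossEarth-Gate, which uses a Fisher-information-guided adaptive module selection mechanism to dynamically activate the most critical cross-domain modules, achieving state-of-the-art performance on 16 of 18 cross-domain benchmarks.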

Abstract

In Remote Sensing (RS), Parameter-Efficient Fine-Tuning (PEFT) has emerged as a key approach to activate the generalizable representation ability of foundation models for downstream tasks. However, existing specialized PEFT methods often fail when applied to large-scale Earth observation tasks, as they are unable to fully handle the multifaceted and unpredictable domain gaps (e.g., spatial, semantic, and frequency shifts) inherent in RS data. To overcome this, we propose CrossEarth-Gate, which introduces two primary contributions. First, we establish a comprehensive RS module toolbox to address multifaceted domain gaps, comprising spatial, semantic, and frequency modules. Second, we develop a Fisher-guided adaptive selection mechanism that operates on this toolbox. This selection is guided by Fisher Information to quantify each module's importance by measuring its contribution to the task-specific gradient flow. It dynamically activates only the most critical modules at the appropriate layers, guiding the gradient flow to maximize adaptation effectiveness and efficiency. Comprehensive experiments validate the efficacy and generalizability of our method, where CrossEarth-Gate achieves state-of-the-art performance on 16 out of 18 cross-domain benchmarks for RS semantic segmentation.
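
The abstract describes ranking modules by Fisher Information and activating only the most important ones, without giving the exact estimator. A minimal sketch of one plausible reading follows: score each module by the empirical diagonal Fisher information (the mean squared per-sample gradient) and keep the top k. This is an assumption for illustration, not the CrossEarth-Gate implementation. The function names and the module names in the example are hypothetical.

```python
import numpy as np

def module_fisher_scores(per_sample_grads):
    """Score each module by its empirical diagonal Fisher information.

    per_sample_grads maps a module name to an array of shape
    (n_samples, n_params) holding per-sample loss gradients.
    The score is the squared gradient averaged over samples and parameters.
    """
    return {name: float(np.mean(np.square(g)))
            for name, g in per_sample_grads.items()}

def select_top_modules(scores, k):
    """Return the names of the k highest-scoring modules, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical gradients for three candidate modules (2 samples, 2 params each).
grads = {
    "spatial": np.array([[1.0, 1.0], [1.0, 1.0]]),
    "semantic": np.array([[0.1, 0.1], [0.1, 0.1]]),
    "frequency": np.array([[2.0, 0.0], [0.0, 2.0]]),
}
scores = module_fisher_scores(grads)
active = select_top_modules(scores, k=2)
```

Averaging over parameters rather than summing keeps modules of different sizes comparable. Whether the paper normalizes this way is not stated in the abstract.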