arXivDaily: arXiv daily academic digest, updated Monday through Friday

AI Large Models

Multimodal Large Models

Large models and learning methods spanning text, image, video, audio, and other modalities.

Collected today (current date): 1. Signal sources: cs.CV, cs.CL, cs.AI, cs.MM, eess.AS
2603.10791 2026-06-19 eess.IV Version update 80%

Semantic Satellite Communications for Synchronized Audiovisual Reconstruction


Fangyu Liu, Peiwen Jiang, Wenjin Wang, Xiao Li, Shi Jin

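Topic match, audio-video multimodal: proposes a multimodal semantic transmission system for synchronized audiovisual reconstruction.

AI summary: Proposes an adaptive multimodal semantic transmission system that uses a dual-stream generative architecture and a dynamic keyframe update mechanism to achieve high-quality synchronized audiovisual reconstruction in bandwidth-limited satellite scenarios, significantly reducing bandwidth consumption and improving robustness.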


Abstract

Satellite communications face severe bottlenecks in supporting high-fidelity synchronized audiovisual services, as conventional schemes struggle with cross-modal coherence under fluctuating channel conditions, limited bandwidth, and long propagation delays. To address these limitations, this paper proposes an adaptive multimodal semantic transmission system tailored for satellite scenarios, aiming for high-quality synchronized audiovisual reconstruction under bandwidth constraints. Unlike static schemes with fixed modal priorities, our framework features a dual-stream generative architecture that flexibly switches between video-driven audio generation and audio-driven video generation. This allows the system to dynamically decouple semantics, transmitting only the most important modality while employing cross-modal generation to recover the other. To balance reconstruction quality and transmission overhead, a dynamic keyframe update mechanism adaptively maintains the shared knowledge base according to wireless scenarios and user requirements. Furthermore, a large language model based decision module is introduced to enhance system adaptability. By integrating satellite-specific knowledge, this module jointly considers task requirements and channel factors such as weather-induced fading to proactively adjust transmission paths and generation workflows. Simulation results demonstrate that the proposed system significantly reduces bandwidth consumption while achieving high-fidelity audiovisual synchronization, improving transmission efficiency and robustness in challenging satellite scenarios.
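The switch between video-driven audio generation and audio-driven video generation can be pictured with a small toy rule. This is only an illustrative sketch, not the paper's method: the abstract describes an LLM-based decision module, which is replaced here by a hand-written threshold rule. The names `ChannelState` and `choose_transmitted_modality`, the units, and every threshold value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ChannelState:
    """Hypothetical snapshot of the satellite link (fields are assumptions)."""
    bandwidth_kbps: float  # usable link rate
    rain_fade_db: float    # weather-induced attenuation


def choose_transmitted_modality(
    channel: ChannelState,
    task_priority: str,
    video_kbps: float = 200.0,   # assumed cost of sending video semantics
    fade_limit_db: float = 6.0,  # assumed fade level that degrades the link
) -> str:
    """Return the modality to transmit; the receiver regenerates the other.

    Toy stand-in for the decision module: a fixed rule, not an LLM.
    """
    if task_priority not in ("audio", "video"):
        raise ValueError("task_priority must be 'audio' or 'video'")

    # Toy penalty: halve the usable rate under heavy weather fading.
    effective_kbps = channel.bandwidth_kbps
    if channel.rain_fade_db > fade_limit_db:
        effective_kbps /= 2.0

    # Send video only when the task favors it and the link can carry it;
    # the receiver would then run video-driven audio generation.
    if task_priority == "video" and effective_kbps >= video_kbps:
        return "video"

    # Otherwise send audio; the receiver would run audio-driven video
    # generation from keyframes cached in the shared knowledge base.
    return "audio"


clear_sky = choose_transmitted_modality(ChannelState(500.0, 0.0), "video")
heavy_rain = choose_transmitted_modality(ChannelState(300.0, 10.0), "video")
print(clear_sky, heavy_rain)
```

With these assumed thresholds, the same video-priority task falls back to transmitting audio once fading halves the usable rate below the video cost, which is the kind of modality switch the abstract attributes to its decision module.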