On Aligning Hierarchical Standardized Embedding for Audio-visual Generalized Zero-shot Learning
Institutions: Southeast University; The University of Hong Kong; Beijing Institute of Technology; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education; School of Computer Science and Engineering, Southeast University
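AI summary: Proposes AHSE, a method that combines Z-score standardization with a hierarchical alignment strategy at three levels (semantic, category, and batch) to address the distributional and structural differences between the audio-visual and text modalities. It achieves competitive performance on three benchmark datasets.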