Chain-of-Glimpse: Search-Guided Progressive Object-Grounded Reasoning for Video Understanding
Affiliations: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China; Institute of Big Data, College of Computer Science and Artificial Intelligence, Fudan University, China; ARC Lab, Tencent PCG, Shenzhen, China; School of Artificial Intelligence, Beijing University of Technology, Beijing, China.
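AI summary: This paper proposes the Chain-of-Glimpse framework, which uses search-guided progressive reasoning to handle objects that change over the course of a video, improving the accuracy and interpretability of multi-step decisions.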