arXivDaily: arXiv daily academic digest, updated Monday through Friday

Vision and Robotics

3D Vision

3D reconstruction, NeRF, Gaussian Splatting, point clouds, and spatial intelligence.

Entries collected for today/current date: 1; Sources: cs.CV, cs.GR, cs.RO
2606.19586 2026-06-19 cs.RO New submission 80%

One Demo is Worth a Thousand Trajectories: Action-View Augmentation for Visuomotor Policies

Chuer Pan, Litian Liang, Dominik Bauer, Eric Cousineau, Benjamin Burchfiel, Siyuan Feng, Shuran Song

Affiliations: Stanford University, Columbia University, Toyota Research Institute

Topic match: Gaussian Splatting (reconstructs 3D scenes with Gaussian Splatting for data augmentation)

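AI summary: Proposes a data augmentation framework that uses Gaussian Splatting and trajectory optimization to generate realistic fisheye image sequences and physically feasible action trajectories, improving the success rate of manipulation policies under scene changes and obstacles.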

Comments Project website: https://chuerpan.com/1001-demos.github.io/. Published at CoRL 2025

Journal ref Proceedings of The 9th Conference on Robot Learning, PMLR 305:3902-3914, 2025

English abstract

Visuomotor policies for manipulation have demonstrated remarkable potential in modeling complex robotic behaviors, yet minor alterations in the robot's initial configuration and unseen obstacles easily lead to out-of-distribution observations. Without extensive data collection effort, these result in catastrophic execution failures. In this work, we introduce an effective data augmentation framework that generates visually realistic fisheye image sequences and corresponding physically feasible action trajectories from real-world eye-in-hand demonstrations, captured with a portable parallel gripper with a single fisheye camera. We introduce a novel Gaussian Splatting formulation, adapted to wide FoV fisheye cameras, to reconstruct and edit the 3D scene with unseen objects. We utilize trajectory optimization to generate smooth, collision-free, view-rendering-friendly action trajectories and render visual observations from corresponding novel views. Comprehensive experiments in simulation and the real world show that our augmentation framework improves the success rate for various manipulation tasks in both the same scene and the augmented scene with obstacles requiring collision avoidance.
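The abstract says the Gaussian Splatting formulation is adapted to wide-FoV fisheye cameras but does not state which camera model is used. As a rough illustration of why a pinhole projection does not carry over to such lenses, the sketch below assumes the common equidistant fisheye model (r = f · θ). The function names and the intrinsics (`focal`, `cx`, `cy`) are hypothetical and are not taken from the paper.

```python
import math


def project_equidistant(point, focal, cx, cy):
    """Project a camera-frame 3D point with the equidistant fisheye model.

    Assumption: r = focal * theta, where theta is the angle between the
    viewing ray and the optical axis (+z). This is an illustrative model,
    not the formulation from the paper.
    """
    x, y, z = point
    rho = math.hypot(x, y)  # distance of the point from the optical axis
    if rho == 0.0:
        if z <= 0.0:
            raise ValueError("point on the optical axis must be in front of the camera")
        return (cx, cy)  # on-axis point maps to the principal point
    theta = math.atan2(rho, z)  # stays well defined at and beyond 90 degrees
    r = focal * theta  # radial distance in pixels from the principal point
    return (cx + r * x / rho, cy + r * y / rho)


def project_pinhole(point, focal, cx, cy):
    """Standard pinhole projection, shown only for comparison."""
    x, y, z = point
    if z <= 0.0:
        raise ValueError("pinhole projection needs z > 0")
    return (cx + focal * x / z, cy + focal * y / z)


if __name__ == "__main__":
    # A point 90 degrees off-axis: the fisheye model still gives a finite pixel.
    print(project_equidistant((1.0, 0.0, 0.0), 300.0, 320.0, 240.0))
    # The same point has no pinhole projection (z == 0).
    try:
        project_pinhole((1.0, 0.0, 0.0), 300.0, 320.0, 240.0)
    except ValueError as err:
        print("pinhole:", err)
```

Under this assumed model, a point 90 degrees off the optical axis still lands at a finite pixel radius, whereas the pinhole projection is undefined there. That gap is the kind of issue a fisheye-aware splatting formulation has to address.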