Learning in Low-Dimensional Subspaces: Orthogonal Bottlenecks for Reinforcement Learning
Aleksandar Todorov, Matthia Sabatelli
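AI summary: Proposes a simple prior that inserts a fixed orthogonal projection into the encoder features of a reinforcement learning agent to constrain them to a low-dimensional subspace, proves that the projection preserves expressivity under a linear realizability assumption, and shows experimentally that value representations can be compressed to very low dimensions without loss of performance.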
Deep reinforcement learning (RL) agents commonly rely on high-dimensional neural representations, despite growing evidence that task-relevant value and policy structure may be intrinsically low-dimensional. In this work, we present a simple yet effective representation-level prior that inserts a fixed orthonormal projection to constrain encoder features to a low-dimensional subspace, requiring no auxiliary objectives, pretraining, or changes to the underlying RL algorithm. Under a linear realizability assumption, we prove that when the bottleneck dimension exceeds the intrinsic rank of the optimal value function in feature space, the bottleneck preserves expressivity and leaves the induced gradient dynamics unchanged up to an equivalent low-dimensional parameterization. Empirically, we find that across both single-task and multi-task benchmarks, baseline performance is either matched or improved once the bottleneck dimension exceeds a small task-dependent threshold; in many cases, value representations can be compressed to extremely low dimensions without loss, and the minimal sufficient dimension depends far more on environment complexity than on encoder width. In addition, we analyze representation geometry and find that orthogonal bottlenecks stabilize feature norms and are associated with higher effective rank. Together, these results support a representation-space interpretation of the manifold hypothesis in reinforcement learning and position orthogonal bottlenecks as a lightweight, architecture-agnostic mechanism for shaping RL representations.
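As a rough illustration of what a fixed orthonormal bottleneck on encoder features could look like, the NumPy sketch below builds a matrix with orthonormal columns once and projects a batch of features onto the subspace it spans. The abstract does not say how the projection is constructed or where exactly it sits in the network, so the QR-of-a-Gaussian construction, the dimensions, and the names `make_orthonormal_basis` and `bottleneck` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def make_orthonormal_basis(feature_dim, bottleneck_dim, seed=0):
    """Return a fixed (feature_dim, bottleneck_dim) matrix with orthonormal columns.

    Assumption: the basis is drawn by QR-factorizing a Gaussian matrix.
    The source only states that the projection is fixed and orthonormal.
    """
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((feature_dim, bottleneck_dim))
    q, _ = np.linalg.qr(gaussian)  # reduced QR: q has orthonormal columns
    return q


def bottleneck(features, basis):
    """Project a batch of features onto the subspace spanned by `basis`.

    Returns the low-dimensional coordinates and the projected features
    expressed back in the original feature space.
    """
    coords = features @ basis      # shape (batch, bottleneck_dim)
    projected = coords @ basis.T   # shape (batch, feature_dim)
    return coords, projected


# Hypothetical sizes: 64-dimensional encoder output, 8-dimensional bottleneck.
feature_dim, bottleneck_dim = 64, 8
basis = make_orthonormal_basis(feature_dim, bottleneck_dim)

# Stand-in for encoder outputs; a real agent would produce these from observations.
features = np.random.default_rng(1).standard_normal((32, feature_dim))
coords, projected = bottleneck(features, basis)
```

Because the basis is never trained, it adds no learnable parameters; in this sketch a value or policy head could consume either `coords` or `projected`, since the two carry the same information and have equal per-sample norms.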