R2P2: A Reparameterized Pushforward Policy for Diverse, Precise Generative Path Forecasting
We propose a method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both diverse (produces most paths likely under the data) and precise (mostly produces paths likely under the data). This balance is achieved by minimizing a symmetrized cross-entropy between the distribution and demonstration data. By viewing the simulated-outcome distribution as the pushforward of a simple distribution under a simulation operator, we obtain expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization. We propose concrete policy architectures for this model, discuss our evaluation metrics relative to previously used metrics, and show that our method outperforms state-of-the-art methods on both the KITTI dataset and a similar but new and larger real-world dataset explicitly designed for the vehicle forecasting domain.
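To make the pushforward view concrete, the following is a minimal sketch and not the architecture proposed here. It assumes a hypothetical toy policy (`policy`, with a fixed drift and fixed per-axis scale `SIGMA`) and a standard normal base distribution. Each step maps base noise to the next position through an invertible affine map, so the log-density of a simulated path follows from the change-of-variables formula: the base log-density of the recovered noise minus the log-scale of each step.

```python
import math
import random

# Hypothetical fixed per-axis scale; a learned policy would predict this from context.
SIGMA = (0.5, 0.2)

def policy(prev):
    """Toy stand-in for a learned policy: mean and scale of the next 2-D position."""
    mu = (prev[0] + 1.0, prev[1])  # assumed drift: one unit forward per step
    return mu, SIGMA

def simulate(z, x0):
    """Push base noise z (list of 2-tuples) forward through the policy to get a path."""
    path, prev = [], x0
    for z_t in z:
        mu, sigma = policy(prev)
        prev = tuple(m + s * e for m, s, e in zip(mu, sigma, z_t))
        path.append(prev)
    return path

def log_prob(path, x0):
    """Log-density of a path under the pushforward, via change of variables."""
    total, prev = 0.0, x0
    for x_t in path:
        mu, sigma = policy(prev)
        for x, m, s in zip(x_t, mu, sigma):
            e = (x - m) / s  # invert one simulation step to recover the base noise
            # standard normal log-density of e, minus the log-Jacobian log(s)
            total += -0.5 * e * e - 0.5 * math.log(2 * math.pi) - math.log(s)
        prev = x_t
    return total

random.seed(0)
x0 = (0.0, 0.0)
z = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(5)]
path = simulate(z, x0)     # a sample from the pushforward distribution
lp = log_prob(path, x0)    # its exact log-density
```

Because both sampling (`simulate`) and density evaluation (`log_prob`) are cheap in this form, cross-entropy terms in either direction can be estimated from samples: one averages `log_prob` over demonstration paths, the other evaluates an approximate data density on simulated paths.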