CVPR 2020
Distilled Semantics for Comprehensive Scene Understanding from Videos
Fabio Tosi, Filippo Aleotti, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Luigi Di Stefano, Stefano Mattoccia
Abstract
Figure 1. Given an input monocular video (a), our network can provide the following outputs in real-time: depth (b), optical flow (c), semantic labels (d), per-pixel motion probabilities (e), motion mask (f).