CVPR 2023

FLAG3D: A 3D Fitness Activity Dataset with Language Instruction

Yansong Tang, Jinpeng Liu, Aoyang Liu, Bin Yang, Wenxun Dai, Yongming Rao, Jiwen Lu, Jie Zhou, Xiu Li

Abstract

Figure 1. An overview of the proposed FLAG3D dataset, which contains 180K videos of 60 daily fitness activities. The dataset comprises (a) 3D activity sequences captured with an advanced MoCap system, (b) rendered videos of different people with their SMPL parameters, and (c) real-world videos recorded with cost-effective phones in both indoor and outdoor natural environments. FLAG3D also provides detailed, professional sentence-level language instructions for each fitness activity. All figures are best viewed in color.