ICLR 2025

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Jiabo Ye, Haiyang Xu, Haowei Liu, Anwen Hu, Ming Yan, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou

Abstract

Question: At the beginning of the movie, what do the policemen wear on their faces?
Answer: At the beginning of the movie, the policemen wear masks on their faces.

Question: In the post-production segment of the film, what color is the car lifted by the bulldozer?
Answer: The car lifted by the bulldozer is red.