NeurIPS2021

WebQA: A Multimodal Multihop NeurIPS Challenge

Yingshan Chang, Yonatan Bisk

被引用 4 次

摘要

Scaling the current QA formulation to the open-domain and multi-hop nature of web searches requires fundamental advances in visual representation learning, multimodal reasoning and language generation. To facilitate research at this intersection, we propose WebQA challenge that mirrors the way humans use the web: 1) Ask a question, 2) Choose sources to aggregate, and 3) Produce a fluent language response. Our challenge for the community is to create unified multimodal reasoning models that can answer questions regardless of the source modality, moving us closer to digital assistants that search through not only text-based knowledge, but also the richer visual trove of information.