CVPR2024

Making Visual Sense of Oracle Bones for You and Me

Runqi Qiao, Lan Yang, Kaiyue Pang, Honggang Zhang

Abstract

Visual perception evolves over time. This is particularly true of oracle bone scripts, where glyphs that seemed intuitive to people in the distant past prove difficult for contemporary eyes to understand. While the semantic correspondence of an oracle can be found via a dictionary lookup, this is not enough for public viewers to connect the dots, i.e., why does this oracle mean that? The common solution relies on a laborious curation process to collect a visual guide for each oracle (Fig. 1), which hinges on the case-by-case effort and taste of curators. This paper delves into one natural follow-up question: can AI take over? Beginning with a comprehensive human study, we show that participants can indeed make better sense of an oracle glyph when given a proper visual guide, and that the guide's efficacy can be approximated via a novel metric termed TransOV (Transferable Oracle Visuals). We then define a new conditional visual generation task based on an oracle glyph and its semantic meaning and, importantly, approach it without any form of model training, given the severe lack of oracle data. At its heart is leveraging a foundation model such as GPT-4V to reason about the visual cues hidden inside an oracle, and then using an existing text-to-image model for the final visual guide generation. Extensive empirical evidence shows that our AI-enabled visual guides achieve TransOV performance comparable to that of guides collected through manual effort. Finally, we demonstrate the versatility of our system in a more complex setting, where it must work alongside an AI image denoiser to cope with raw oracle scan images as input (cf. processed, clean oracle glyphs). Code is available at https://github.com/RQ-Lab/OBS-Visual.