ACL 2024

Set the Clock: Temporal Alignment of Pretrained Language Models

Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith

Abstract

Language models (LMs) are trained on web text originating from many points in time and, in general, without any explicit temporal grounding. This work investigates the temporal chaos of pretrained LMs and explores various methods to align their internal knowledge to a target time, which we call "temporal alignment." To do this, we first automatically construct a dataset containing 20K time-sensitive questions and their answers for each year from 2000 to 2023. Based on this dataset, we empirically show that pretrained LMs (e.g., LLaMa2), despite having a recent pretraining cutoff (e.g., 2022), mostly answer questions using earlier knowledge (e.g., in 2019). We then develop several methods, from prompting to finetuning, to align LMs to use their most recent knowledge when answering questions, and investigate various factors in this alignment. Our experiments show that aligning LLaMa2 to the year 2022 can boost its performance by up to 62% relatively as measured by that year, even without mentioning time information explicitly, indicating the possibility of aligning models' internal sense of time after pretraining. Finally, we find that alignment to a historical time is also possible, with up to 2.8× the performance of the unaligned LM in 2010 if finetuning models to that year. These findings hint at the sophistication of LMs' internal knowledge organization and the necessity of tuning them properly.