EMNLP 2022

BERT in Plutarch's Shadows

Ivan P. Yamshchikov, Alexey Tikhonov, Yorgos Pantis, Charlotte Schubert, Jürgen Jost


Abstract

The extensive surviving corpus of the ancient scholar Plutarch of Chaeronea (ca. 45-120 CE) also contains several texts which, according to current scholarly opinion, did not originate with him and are therefore attributed to an anonymous author, Pseudo-Plutarch. These include, in particular, the Placita Philosophorum (Quotations and Opinions of the Ancient Philosophers), a work that is extremely important for the history of ancient philosophy. Little is known about the identity of that anonymous author or their relation to other authors of the same period. This paper presents a BERT language model for Ancient Greek. The model discovers previously unknown statistical properties relevant to these literary, philosophical, and historical problems and can shed new light on this authorship question. In particular, the Placita Philosophorum, together with one of the other Pseudo-Plutarch texts, shows similarities with texts written by authors from an Alexandrian context (2nd/3rd century CE).

"I do not need a friend who changes when I change and who nods when I nod; my shadow does that much better." (Plutarch, Quomodo adulator ab amico internoscatur 53b 10)

1 https://github.com/brennannicholson/ancient-greek-char-bert

[...] formed into a digital representation under professional supervision. The data can be obtained from the following free repositories: Perseus Digital Library and First Thousand Years of Greek, as part of Open Greek and Latin. The resulting representation was stored in XML format (following the TEI guidelines) and enriched with metadata. We then removed the XML structure and metadata, converted the strings to lowercase, and removed the diacritics. We did not touch hyphenation, punctuation, or any multilingual remains, nor did we apply any special language-related transformations. The resulting data set consists of 1,244 documents with 199,809 paragraphs, or 14,373,311 words.
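
The preprocessing steps described above (removing the XML structure and metadata, lowercasing, and removing diacritics) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the metadata is assumed to live in a `teiHeader` element, diacritics are assumed to correspond to Unicode combining marks, and the whitespace collapsing is a convenience added here rather than a step named in the text.

```python
import unicodedata
import xml.etree.ElementTree as ET


def tei_to_plain_text(xml_string):
    """Return the text content of a TEI document without markup or header.

    Assumption: the metadata sits in a <teiHeader> element.
    """
    root = ET.fromstring(xml_string)
    # Snapshot the elements first so that removing children is safe.
    for parent in list(root.iter()):
        for child in list(parent):
            # Tags may carry a namespace prefix such as "{...}teiHeader".
            if child.tag.endswith("teiHeader"):
                parent.remove(child)
    text = "".join(root.itertext())
    # Convenience step (an assumption): collapse runs of whitespace.
    return " ".join(text.split())


def strip_diacritics(text):
    """Remove accents, breathings, and iota subscripts.

    Decompose to NFD, drop combining marks, then recompose to NFC.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def preprocess(xml_string):
    """Apply the three steps: strip XML, lowercase, strip diacritics."""
    return strip_diacritics(tei_to_plain_text(xml_string).lower())


if __name__ == "__main__":
    # Hypothetical miniature TEI document, for illustration only.
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
        "<teiHeader><title>Ilias</title></teiHeader>"
        "<text><body><p>Μῆνιν ἄειδε θεά</p></body></text>"
        "</TEI>"
    )
    print(preprocess(sample))  # μηνιν αειδε θεα
```

Punctuation and hyphenation pass through unchanged, which matches the statement that these were not touched.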