ACL 2024
Towards the TopMost: A Topic Modeling System Toolkit
Xiaobao Wu, Fengjun Pan, Anh Tuan Luu
Abstract
Topic models have a rich history with various applications and have recently been reinvigorated by neural topic modeling. However, these numerous topic models adopt entirely different datasets, implementations, and evaluations. This impedes quick utilization and fair comparisons, and thereby hinders their research progress and applications. To tackle this challenge, in this paper we propose a Topic Modeling System Toolkit (TOPMOST). Compared to existing toolkits, TOPMOST stands out by supporting more extensive features. It covers a broader spectrum of topic modeling scenarios with their complete lifecycles, including datasets, preprocessing, models, training, and evaluations. Thanks to its highly cohesive and decoupled modular design, TOPMOST enables rapid utilization, fair comparisons, and flexible extensions of diverse cutting-edge topic models. These improvements position TOPMOST as a valuable resource to accelerate the research and applications of topic models. Our code, tutorials, and documentation are available at https://github.com/bobxwu/topmost. Our demo video is at https://youtu.be/9bN-rs4Gu3E?si=LunquCRhBZwyd1Xg.