AAAI 2025

Kolmogorov-Arnold Networks Still Catastrophically Forget but Differently from MLP

Anton Lee, Heitor Murilo Gomes, Yaqian Zhang, W. Bastiaan Kleijn

2 citations

Abstract

As innovation and technology evolve rapidly, effective intellectual property (IP) management has become critical, which makes patent classification increasingly important. Recent advances in natural language processing (NLP) have improved patent categorization. However, multilayer perceptron (MLP) layers in NLP models often increase memory requirements, particularly as network size grows. We propose replacing MLP layers with the Kolmogorov-Arnold Network (KAN) to address this issue. Using a dataset from the European Patent Office (EPO), we classify patents into three groups. We experimented with several KAN configurations and found that decreasing the hidden dimension sizes considerably reduced the number of parameters while maintaining good accuracy. The [32, 16, 8] configuration achieved an accuracy of 74.84%, which rose to 75.12% after tuning key hyperparameters such as spline_order and grid_size. We compare KAN against other machine learning models, namely MLP (75.83%), Random Forest, and XGBoost, and report that KAN consistently surpassed them in accuracy and efficiency. Our findings extend the use of KAN to patent classification and open new avenues for its use in other text-based classification tasks. KAN's efficiency and performance make it a promising alternative to existing machine learning models in this area and highlight its potential for further application in patent-related tasks.
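The link between hidden dimension sizes, grid_size, spline_order, and parameter count can be illustrated with a rough counting sketch. It assumes a common B-spline KAN layer layout in which each input-output edge carries grid_size + spline_order spline coefficients plus one base weight; the abstract does not state the exact layer layout, so this formula is an assumption. The 768-dimensional input width and the default grid_size=5 and spline_order=3 are hypothetical placeholders, and the function names are illustrative only.

```python
def kan_param_count(widths, grid_size=5, spline_order=3):
    """Approximate learnable parameters of a B-spline KAN.

    Assumption: every edge (input i -> output j) holds
    grid_size + spline_order spline coefficients plus one base weight.
    """
    coeffs_per_edge = grid_size + spline_order  # number of B-spline basis functions
    total = 0
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        total += fan_in * fan_out * (coeffs_per_edge + 1)
    return total


def mlp_param_count(widths):
    """Parameters of a plain MLP: one weight matrix and one bias vector per layer."""
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(widths[:-1], widths[1:]))


if __name__ == "__main__":
    input_dim = 768   # hypothetical text-embedding size, not taken from the source
    num_classes = 3   # the abstract mentions three patent groups
    for hidden in ([128, 64, 32], [32, 16, 8]):
        widths = [input_dim, *hidden, num_classes]
        print(hidden,
              "KAN:", kan_param_count(widths),
              "MLP:", mlp_param_count(widths))
```

Under this counting, a KAN layer holds several times more parameters than an MLP layer of the same width, so shrinking the hidden dimensions and tuning grid_size and spline_order are the main levers for controlling model size.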