KDD 2025
Boost the Performance of Tabular Data Models with GPU Accelerated Feature Engineering
Chris Deotte, Ronay Ak
Abstract
Feature engineering remains a crucial technique for improving the performance of models trained on tabular data. Unlike computer vision and natural language processing, where deep learning models automatically extract hierarchical features from raw data, the most accurate tabular models, such as gradient boosted decision trees, still benefit significantly from manually crafted features. Team NVIDIA's many first-place finishes in data science competitions demonstrate this benefit.