KDD 2022

Deep Learning for Network Traffic Data

Manish Marwah, Martin F. Arlitt

Cited by 9

Abstract

Network traffic data is key to addressing several important cybersecurity problems, such as intrusion and malware detection, and network management problems, such as application and device identification. However, it poses several challenges for building machine learning models. Two main challenges are the need for manual feature engineering and the scarcity of training data due to privacy and security concerns. In this tutorial, we provide a comprehensive review of recent advances that address these challenges through the use of deep learning. Network traffic data can be cast as multivariate time-series (sequential) data, attributed graph data, or image data to leverage the representation learning architectures available in deep learning. To preserve data privacy, generative methods, such as GANs and autoregressive neural architectures, can be used to synthesize realistic network traffic data. Our tutorial is organized into three parts: 1) we describe network traffic data, its applications to security and network management, and the associated challenges; 2) we present different deep learning architectures used for representation learning in place of manual feature engineering of network traffic data; and 3) we describe the use of generative neural models for synthetic generation of network traffic data.
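As a rough illustration of the three representations named in the abstract, the sketch below casts a handful of packets as a time series, an attributed graph, and an image-like grid. It is not taken from the tutorial: the packet tuple layout, the toy values, the function names, and the 4x4 grid size are all assumptions chosen to keep the example small.

```python
from collections import defaultdict

# Hypothetical toy packets: (timestamp_s, src, dst, size_bytes, payload).
# This tuple layout is an assumption for illustration only.
packets = [
    (0.00, "10.0.0.1", "10.0.0.2", 60, b"\x16\x03\x01"),
    (0.02, "10.0.0.2", "10.0.0.1", 1500, b"\x16\x03\x03\x00"),
    (0.05, "10.0.0.1", "10.0.0.3", 40, b"\x00"),
    (0.09, "10.0.0.1", "10.0.0.2", 200, b"\x17\x03\x03"),
]


def to_time_series(pkts):
    """Multivariate sequence: one row per packet, [inter-arrival time, size]."""
    rows, prev_t = [], None
    for t, _src, _dst, size, _payload in pkts:
        gap = 0.0 if prev_t is None else t - prev_t
        rows.append([round(gap, 6), size])
        prev_t = t
    return rows


def to_attributed_graph(pkts):
    """Directed edges keyed by (src, dst), attributed with packet and byte counts."""
    edges = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for _t, src, dst, size, _payload in pkts:
        edges[(src, dst)]["packets"] += 1
        edges[(src, dst)]["bytes"] += size
    return dict(edges)


def to_image(pkts, side=4):
    """First side*side payload bytes, zero-padded, as a side x side grid of ints."""
    raw = b"".join(p[4] for p in pkts)[: side * side]
    raw = raw.ljust(side * side, b"\x00")
    return [list(raw[r * side:(r + 1) * side]) for r in range(side)]


series = to_time_series(packets)
graph = to_attributed_graph(packets)
image = to_image(packets)
```

In practice, each representation would be fed to a matching architecture family (for example, sequence models, graph neural networks, or convolutional networks); which features to keep in each view is a design choice this sketch does not settle.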