Within the Pledger EU project, we developed a machine learning approach for detecting certain types of cyber security attacks. The dataset we worked with is the Intrusion Detection Evaluation Dataset (CIC-IDS2017), created by the Canadian Institute for Cybersecurity (CIC) and the University of New Brunswick (UNB). CIC-IDS2017 consists of labeled network flows, including payloads in pcap format, the corresponding profiles, and CSV files of the labeled flows for machine and deep learning purposes. The dataset includes different types of cyber security attacks, among them DDoS, PortScan, Web and Infiltration attacks. We chose to work mainly with DDoS and PortScan attacks, and with their combination, in order to build a machine learning model able to detect them simultaneously. Infiltration and Web attacks are heavily underrepresented in their datasets, and for that reason the experimental results on them are not very promising with the current data.
Our three experiments consist of ML models that detect DDoS attacks, PortScan attacks and their combination. In all of our experiments we followed the same methodology to build our AI pipelines, which consist of three modules, as shown in the figure below.
The first is the data analysis module, which covers data exploration and data visualization (if the data can be visualized). The second is the data engineering module, in which data cleansing, normalization and splitting take place. The last is the AI modelling module, which comprises model selection, training, validation and, finally, evaluation.
In our experiments the following pipeline was followed. We first measured the correlation between features by constructing a correlation matrix based on the Pearson correlation coefficient. This step serves two goals. The first is to reduce the number of features, and with it the model’s training time. The second is to exclude features that are correlated with one another, which add redundant information to the dataset and can additionally hurt the model’s accuracy. After removing the redundant features, we encoded the labels and split the main dataset into training, validation and test sets. We then normalized the resulting datasets using the mean and variance of the training set. The model is an artificial neural network, more specifically a Multilayer Perceptron (MLP) with a single hidden layer of 128 neurons.
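The correlation-based feature reduction described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the function name `drop_correlated_features`, the greedy keep-first strategy and the 0.9 threshold are all assumptions, since the text does not state which threshold or selection rule was used.

```python
import numpy as np

def drop_correlated_features(X, threshold=0.9):
    """Return the indices of the columns kept after a greedy correlation filter.

    The threshold and the keep-first rule are illustrative assumptions.
    """
    # Absolute Pearson correlation between every pair of columns
    corr = np.abs(np.corrcoef(X, rowvar=False))
    # Constant columns produce NaN correlations; treat them as uncorrelated here
    corr = np.nan_to_num(corr, nan=0.0)
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not strongly correlated
        # with any column that has already been kept
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep

# Synthetic example: column 1 is almost a linear copy of column 0
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=500), b])

kept = drop_correlated_features(X, threshold=0.9)
X_reduced = X[:, kept]
print(kept, X_reduced.shape)
```

In this toy example the second column is dropped because it carries nearly the same information as the first, which is the same reasoning that reduces the feature count in the experiments below.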
DDoS Attacks Experiment
The initial dataset consisted of 78 features, which were reduced to 30 after computing the correlation matrix, as shown in the image below. There are almost 98k Benign samples and 128k DDoS samples. Both classes have sufficient samples in the dataset, so we can be optimistic that class imbalance will not pose a problem for our experiment.
After splitting, the test set comprised 20% of the initial dataset, and the remaining 80% was split further, 80-20%, into training and validation sets. The labels were encoded with a label encoder, which made the task a binary classification problem. All the resulting datasets were then normalized with sklearn’s StandardScaler, using the mean and variance of the training set.
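The splitting and normalization steps might look roughly like the sketch below. It runs on synthetic data standing in for the 30 selected features. The label strings, the `random_state` values and the use of `train_test_split` are assumptions made for illustration; only `StandardScaler` is named in the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the reduced dataset (30 features)
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 30))
labels = np.where(rng.random(1000) < 0.45, "BENIGN", "DDoS")

# Encode the string labels as integers (classes are sorted alphabetically)
y = LabelEncoder().fit_transform(labels)

# 20% held out for testing, then the remaining 80% split 80-20 again
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then apply it to all three sets,
# so that no statistics leak from the validation or test data
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)

print(X_train_s.shape, X_val_s.shape, X_test_s.shape)
```

Fitting the scaler on the training set alone is what "normalized according to the training set" amounts to in practice: the validation and test sets are transformed with statistics they did not contribute to.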
Below, the training and validation accuracy and loss during the model’s training phase can be observed. The curves show very high accuracy, and the same holds in the evaluation phase. As depicted in the classification report below, all the metrics are perfect, which shows that our model has achieved its task and suggests that the dataset at hand may be somewhat easier than we expected.
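For readers who want to reproduce a comparable setup, the following is a minimal sketch of a one-hidden-layer MLP with 128 neurons, followed by a classification report. The text does not say which framework was used, so scikit-learn's `MLPClassifier` is substituted here as an assumption, and the data comes from `make_classification` rather than CIC-IDS2017. The class names and all hyperparameters other than the hidden layer size are likewise illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic binary problem with 30 features (not the real dataset)
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=10,
    class_sep=2.0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardize using training-set statistics only
scaler = StandardScaler().fit(X_train)

# MLP with a single hidden layer of 128 neurons
model = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=0)
model.fit(scaler.transform(X_train), y_train)

# Evaluate on the held-out test set
y_pred = model.predict(scaler.transform(X_test))
accuracy = model.score(scaler.transform(X_test), y_test)
report = classification_report(y_test, y_pred, target_names=["BENIGN", "DDoS"])
print(report)
```

The report lists precision, recall and F1-score per class, which are the metrics referred to above.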
PortScan Attacks Experiment
In the next experiment the PortScan attacks were analyzed, and we created another artificial neural network to predict them. The experiment’s flow is the same as in the DDoS attacks experiment, and we applied the same preprocessing to the dataset. Again this is a binary classification problem; the dataset consists of 128k Benign samples and 159k PortScan samples. Once again, class imbalance does not play a part in our experiment thanks to the balanced ratio of the classes. In the images below we can observe the accuracy and the loss during the model’s training phase. The model is able to fully learn its training dataset and to reach high performance. Finally, the classification report shows once more that the test dataset was predicted perfectly and that our model separates the test instances perfectly.
DDoS and PortScan Attacks Combined Experiment
In the last experiment the two previous datasets were combined in order to create a unified model that detects both attacks without having to use the two previous models. The same AI pipeline logic was used. Because there are now more than two classes, the problem is no longer binary but a multi-class classification problem with three classes. Thus, the label encoder used in the previous experiments is replaced by a label binarizer. The datasets are again normalized using the training set’s mean and variance before being fed to the artificial neural network. Once again, the accuracy and loss graphs from the training phase show that the model reaches high performance. Finally, the classification report shows that the model is able to generalize what it has learned to the test dataset as well.
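The switch from a label encoder to a label binarizer can be illustrated with a short sketch. It assumes scikit-learn's `LabelBinarizer` and uses a handful of example label strings; the exact label values in the combined dataset may differ.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Illustrative labels for the three-class problem
labels = np.array(["BENIGN", "DDoS", "PortScan", "BENIGN", "PortScan"])

# Each label becomes a one-hot row, one column per class
binarizer = LabelBinarizer()
Y = binarizer.fit_transform(labels)

print(binarizer.classes_)  # column order (sorted class names)
print(Y)

# The one-hot rows can be mapped back to the original labels
decoded = binarizer.inverse_transform(Y)
```

One-hot targets of this form are what a three-output network trained with a categorical loss typically expects, whereas a single integer column is sufficient for the binary experiments.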
Overall, our deep learning models were able to learn from the datasets used and produced very good results. Infiltration and Web attacks, if used, would yield a weaker model with much poorer results that would not be reliable enough to reuse. Access to real-world attack data with the same features used in our experiments would be a good way to test how well our model generalizes to new, unseen data.
Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani, “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization”, 4th International Conference on Information Systems Security and Privacy (ICISSP), Portugal, January 2018