Table 1 Network intrusion detection techniques that have been developed utilizing cloud-computing technology

From: A survey of cloud-based network intrusion detection analysis

| Work | Goal | Dataset(s) | Major approaches | Cloud environment | ML algorithm(s) | Advantages | Challenges |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Lee et al. [21] | Monitor internet traffic flow | Simulated NetFlow packets | (1) Packet sampling, (2) Flow aggregation, and (3) MapReduce programming model | Apache Hadoop | None | Flow computation time improved by 72% over legacy tools | Batch-processing jobs and text input file formats are difficult to handle; flow analysis tools are not adequately developed for the MapReduce interface |
| Singh et al. [65] | P2P botnet detection | Simulated and CAIDA sample datasets | (1) Information gain measurement and (2) Clustering (random forest) in Mahout | Apache Hadoop | Random forest | Processes high-bandwidth traffic in quasi-real time; effectively classifies malicious traffic on a cluster | High packet drop rates; detection times still somewhat too high; cannot respond to newer, more sophisticated threats |
| Bhat et al. [67] | Anomaly intrusion detection | NSL-KDD 99 | (1) Naïve Bayes (NB) tree and (2) A hybrid approach of NB tree and random forest | Amazon EC2 | NB tree and random forest | Good performance, high accuracy, and low false positive rate for the NB tree/random forest hybrid implementation | High false positive rate for non-hybrid implementations |
| Chen et al. [20] | Phishing attack detection | Simulated dataset | Apache Hadoop | Eucalyptus, Apache Hadoop, and Amazon EC2 | Collaborative algorithm based on distributed hash tables (DHT) | Practical scheme; can be generalized to other attacks | Not tested on a variety of datasets |
| Chen et al. [57] | Intrusion detection | KDD 99, CMDC 2012 | (1) Feature reduction, (2) Vertical compression, and (3) Intrusion detection | Apache Hadoop | OneR algorithm, affinity propagation, KNN, and SVM | Faster than traditional models | No incremental clustering ability; feature reduction and training steps can add significant overhead |
| Marnerides et al. [22] | Malware detection | Simulated dataset | (1) Energy estimation, (2) Feature selection, and (3) Covariance analysis | Unknown | Choi-Williams distribution | Effective for identifying Kelihos injection | Not tested on a variety of datasets |
| Muthurajkumar et al. [51] | Intrusion detection | Simulated dataset | (1) Feature selection and (2) Fuzzy SVM | Unknown | Rough set based feature selection algorithm (RSFSA), fuzzy SVM | Reduces the number of decision attributes and the size of log data; faster than traditional models | Not tested on a variety of datasets |
| Vieira et al. [64] | Intrusion detection technique | Simulated dataset | Utilization of grid and cloud computing | Unknown | Feed-forward neural network | Successfully explores communication events to mark intrusion | Requires a large sample period of data; training cannot adapt to new threats |
| Wang et al. [68] | Network traffic passive measurement | CAIDA dataset (anonymized traffic data collected from equinix-chicago and equinix-sanjose) | IP trace analysis system (IPTAS) | Unknown | None | Useful prototype of a passive traffic analysis tool | Does not provide fine-grained traffic analysis |
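
As an illustration of the packet sampling, flow aggregation, and MapReduce steps listed for Lee et al. [21], the following is a minimal single-process sketch in Python. It is not the authors' Hadoop implementation: the packet record layout, the 5-tuple flow key, the 1-in-n sampling rule, and all function names are assumptions made here for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet record (an assumption, not the paper's format):
# (src_ip, dst_ip, src_port, dst_port, protocol, n_bytes)
Packet = Tuple[str, str, int, int, str, int]
FlowKey = Tuple[str, str, int, int, str]
Counts = Tuple[int, int]  # (packet count, byte count)


def sample_every(packets: List[Packet], n: int) -> List[Packet]:
    """Systematic 1-in-n packet sampling: keep packets 0, n, 2n, ..."""
    return [p for i, p in enumerate(packets) if i % n == 0]


def map_packet(packet: Packet) -> Tuple[FlowKey, Counts]:
    """Map step: emit (flow key, (1 packet, its byte count))."""
    src, dst, sport, dport, proto, n_bytes = packet
    return (src, dst, sport, dport, proto), (1, n_bytes)


def shuffle(pairs: Iterable[Tuple[FlowKey, Counts]]) -> Dict[FlowKey, List[Counts]]:
    """Shuffle step: group all emitted values by flow key."""
    grouped: Dict[FlowKey, List[Counts]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_flow(values: List[Counts]) -> Counts:
    """Reduce step: sum packet and byte counts for one flow."""
    return sum(v[0] for v in values), sum(v[1] for v in values)


def aggregate_flows(packets: Iterable[Packet]) -> Dict[FlowKey, Counts]:
    """Run map, shuffle, and reduce in sequence over a packet stream."""
    grouped = shuffle(map_packet(p) for p in packets)
    return {key: reduce_flow(vals) for key, vals in grouped.items()}
```

In a real Hadoop job the map and reduce functions would run on separate nodes and the framework would perform the shuffle; here all three stages run in one process purely to show the data flow from packets to per-flow totals.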