First, a clarification: AI is a broad domain, and while the term is technically a correct label for the algorithms used in the security industry, there is no ‘real’ AI in use today. An artificial intelligence is a system that can sense its environment, reason about it, and adapt to perceived changes, allowing it to act autonomously in situations it has never seen before.
The AI in cybersecurity today is more precisely described as Machine Learning. Machine Learning is a subset of AI that consists of algorithms whose performance improves as they are exposed to more data over time. A specific subset of Machine Learning algorithms, based on multilayered neural networks, is called Deep Learning.
To better appreciate the state of machine learning algorithms in use today, I like to refer to the graphic below. On one end of the machine learning spectrum are the deterministic algorithms that are modeled closely on the problem they need to solve. The behavior of such an algorithm is deterministic and transparent because it was implemented according to a model that corresponds to the data measured from the world it is attempting to approximate.
Data collected from the environment is used to seed the algorithms of the model and to provide the baselines the model needs to perform at its best. As the environment changes, new data samples yield slightly modified baselines, allowing the model to adapt and remain accurate in its detection. Since the algorithms are deterministic and transparent, false positives are either absent or occur at a predictable rate.
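As a rough illustration of such a baseline-driven algorithm, the sketch below seeds a baseline from collected samples and nudges it as new normal measurements arrive. The class name, the `tolerance` multiplier and the `alpha` adaptation rate are assumptions made for this example, not taken from any particular product.

```python
class BaselineDetector:
    """Deterministic detector: flags values above baseline * tolerance."""

    def __init__(self, seed_samples, tolerance=3.0, alpha=0.1):
        # Seed the baseline from data collected in the environment.
        self.baseline = sum(seed_samples) / len(seed_samples)
        self.tolerance = tolerance  # hypothetical multiplier for "too high"
        self.alpha = alpha          # hypothetical adaptation rate

    def observe(self, value):
        """Return True if value is anomalous; otherwise adapt the baseline."""
        if value > self.baseline * self.tolerance:
            return True
        # Normal sample: move the baseline slightly toward the new measurement.
        self.baseline = (1 - self.alpha) * self.baseline + self.alpha * value
        return False


detector = BaselineDetector(seed_samples=[100, 110, 90])
print(detector.observe(120))   # within 3x the baseline, so it is absorbed
print(detector.observe(1000))  # far above the baseline, so it is flagged
```

Every decision can be traced back to one comparison against a known baseline, which is what makes this end of the spectrum transparent.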
As the complexity of the environment increases, it becomes harder to create a model that approximates the real world. For higher-complexity problems (feature vectors of high dimensionality), it becomes necessary to use a generic model that can be shaped by sample measurements and can make predictions with a certain degree of generalization.
On the opposite side of the spectrum one finds the Deep Learning neural networks. These networks can be considered generic models that are tuned (trained) by example (data). The coding of the model, so to speak, happens based on existing data samples.
Think of a neural network as a complex polynomial that can fit a surface through samples in a multi-dimensional space or, simplified even further, a curve through sample points in a 2-dimensional space. To provide an accurate approximation, the polynomial (the neural network in this case) needs to be of a high enough order to fit the diversity of the data. At the same time, the order should not be so high that it over-fits the data: the network must keep a degree of generalization that allows it to make ‘fuzzy’ decisions on new samples that are similar to, but do not exactly match, the training samples used to seed the model.
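The curve-fitting analogy can be made concrete with a small sketch. It compares a degree-1 least-squares fit with a polynomial of the highest possible order (Lagrange interpolation, which passes through every training point) on hypothetical samples that roughly follow y = 2x + 1. The data, the fixed "noise" values and the function names are all assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of a degree-1 polynomial; returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_interpolating_polynomial(xs, ys):
    """Lagrange interpolation: degree n-1 polynomial through all n points."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mean_squared_error(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training samples: y = 2x + 1 plus small fixed deviations.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
noise = [0.3, -0.4, 0.5, -0.2, 0.4, -0.5, 0.2, -0.3]
train_y = [2 * x + 1 + e for x, e in zip(train_x, noise)]

# New samples that are similar to, but not identical to, the training data.
new_x = [0.5, 3.5, 6.5]
new_y = [2 * x + 1 for x in new_x]

line = fit_line(train_x, train_y)
high_order = fit_interpolating_polynomial(train_x, train_y)

for name, model in [("degree 1", line), ("degree 7", high_order)]:
    print(name,
          "training error:", mean_squared_error(model, train_x, train_y),
          "new-sample error:", mean_squared_error(model, new_x, new_y))
```

The high-order polynomial reproduces the training samples exactly yet swings away from the underlying trend between them, while the low-order fit tolerates a small training error in exchange for better behavior on new samples.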
This leads us directly to the main challenges in applying deep learning for cybersecurity.
The #1 Challenge to AI Adoption in Cybersecurity
Given the high degree of complexity of deep neural nets (high-degree polynomials), a vast amount of data is needed to define the model.
Since designing deep learning networks is more of an engineering task than an exact science, a testing set is needed in addition to a training set. Deep learning experts rely on their experience to design an initial deep learning model and then adapt the model through multiple iterations to improve its performance. The testing set is the data that allows them to evaluate the model: after seeding the neural net with the training set, they measure how well it approximates and predicts the test set, and they keep improving the model until that measure is adequate. Lack of data will lead to under-performing models, while bad data will produce wrong results: garbage in, garbage out…
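The training/testing workflow can be sketched in a few lines. A deliberately simple stand-in model is used here (a one-dimensional threshold placed midway between the class means) rather than a neural network, and the labelled samples are hypothetical; only the split-train-evaluate loop is the point.

```python
import random


def split_dataset(samples, test_fraction=0.25, seed=0):
    """Shuffle the samples and hold out a fraction of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_threshold(training_set):
    """'Train' the stand-in model: midpoint between the two class means."""
    benign = [x for x, label in training_set if label == 0]
    malicious = [x for x, label in training_set if label == 1]
    return (sum(benign) / len(benign) + sum(malicious) / len(malicious)) / 2


def accuracy(threshold, labelled):
    """Fraction of samples where 'x above threshold' matches the label."""
    correct = sum((x > threshold) == bool(label) for x, label in labelled)
    return correct / len(labelled)


# Hypothetical labelled samples: (feature value, label), 1 = malicious.
samples = [(i / 10, 0) for i in range(40)] + [(i / 10, 1) for i in range(60, 100)]

training_set, testing_set = split_dataset(samples)
threshold = train_threshold(training_set)  # only the training set is used here
print("accuracy on held-out test set:", accuracy(threshold, testing_set))
```

In practice this loop is repeated: the model design is adjusted, retrained on the training set, and re-scored on the testing set until the score is adequate.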
The need for large amounts of high-quality data is one of the main challenges for the general applicability of AI in cybersecurity. Synthetically generated data can be misleading because the generated data points carry structure and correlation of their own; consequently, real-world classified samples are the only way to feed a well-performing deep learning system. The use of deep learning in malware detection and classification, for example, provides good results because a vast amount of historic data is available to train and test deep neural nets.
Related to the need for vast amounts of good data is the challenge of learning in adversarial contexts. Placing a sensor that collects data and having a generic model learn from that data in a real-world environment is susceptible to poisoning. During its training, the system cannot tell the difference between good and bad samples, and as such its decisions can be influenced by feeding it bad data, also known as poisoning (cf. Microsoft’s AI chatbot experiment ‘Tay’).
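A toy example shows how poisoning can work against a model that learns online. It assumes a simple adaptive baseline (not a neural network) and an attacker who can observe or estimate that baseline; the class, the parameters and the traffic values are hypothetical.

```python
class OnlineBaseline:
    """Adaptive baseline that learns from every sample it considers normal."""

    def __init__(self, start, alpha=0.2, tolerance=2.0):
        self.baseline = start
        self.alpha = alpha          # hypothetical learning rate
        self.tolerance = tolerance  # hypothetical anomaly multiplier

    def is_anomalous(self, value):
        return value > self.baseline * self.tolerance

    def learn(self, value):
        # The model cannot tell good from bad: anything below the
        # threshold is treated as normal and folded into the baseline.
        if not self.is_anomalous(value):
            self.baseline = (1 - self.alpha) * self.baseline + self.alpha * value


def poison(model, steps):
    """Attacker feeds samples that stay just under the anomaly threshold."""
    for _ in range(steps):
        model.learn(1.9 * model.baseline)


clean = OnlineBaseline(start=100.0)
poisoned = OnlineBaseline(start=100.0)
poison(poisoned, steps=20)

attack_value = 1000.0
print("clean model flags attack:   ", clean.is_anomalous(attack_value))
print("poisoned model flags attack:", poisoned.is_anomalous(attack_value))
```

No single poisoned sample looks anomalous, yet together they drag the baseline far enough that the real attack no longer stands out.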
Another challenge in engineering and tuning the model is learning in changing environments. As the diversity in the data changes, the complexity of the model needs to be adapted to ensure consistent performance. For example, adding a new application to a network will result in new network flows and data packets. An adequately tuned deep neural net has no spare degrees of freedom to adapt to new patterns, so the model needs an increase in complexity in order to learn the new patterns and fit them consistently. Operating a deep neural net requires continuous testing and tuning of the model to maintain its performance over time. It is not a one-off or hands-off experience.
While deep learning can be trained and tuned in less sensitive environments, in the domain of cybersecurity protection the margin for error is zero! Attackers can fail many times in their attempts to compromise the security of an organization before discovering a weakness; defenses, however, need only fail once to create a disaster. Shortcuts do not work and are not allowed in modeling defense; the Cylance whitelisting is a case in point.
Today’s AI Trends in Cybersecurity
The defining trend for AI is the optimization and application of different techniques and technologies to improve the performance of deep neural networks in different applications within the domain of cybersecurity. New and slightly adapted deep neural network architectures, together with growing experience in using them, provide incremental improvements and increase their applicability to cybersecurity.
Another trend is the use of multi-layered systems. High-performance, low-latency, low-false-positive models based on traditional machine learning run closer to the edge, while the larger, more complex and higher-resolution analytics are centralized in the cloud, where deep learning operates on big data and crowd-sourced data.
The application of real-time threat intelligence and signaling allows upstream analytics and detection to be enforced downstream, close to the edge. This provides additional levels of security and a multi-layered approach that still protects if one or more of the earlier layers in the system fail.
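A minimal sketch of this layering follows. The central analytics step is a placeholder (it flags sources reported by several sensors) standing in for the cloud-based deep learning described above, and the edge check is a plain rate limit; all names, thresholds and addresses are hypothetical.

```python
from collections import defaultdict


def cloud_analytics(reports, min_sensors=2):
    """Placeholder for centralized analytics on crowd-sourced reports:
    flag any source reported by at least `min_sensors` distinct sensors."""
    seen_by = defaultdict(set)
    for sensor_id, source in reports:
        seen_by[source].add(sensor_id)
    return {src for src, sensors in seen_by.items() if len(sensors) >= min_sensors}


class EdgeSensor:
    """Fast, low-latency layer that also enforces upstream signals."""

    def __init__(self, rate_limit=100):
        self.rate_limit = rate_limit  # hypothetical requests-per-second limit
        self.blocklist = set()

    def apply_signal(self, indicators):
        # Threat intelligence signaled from upstream, enforced at the edge.
        self.blocklist |= set(indicators)

    def inspect(self, source, requests_per_second):
        if source in self.blocklist:
            return "block"  # layer 1: upstream intelligence
        if requests_per_second > self.rate_limit:
            return "block"  # layer 2: local deterministic check
        return "allow"


# Documentation-range addresses used as example sources.
reports = [("sensor-a", "203.0.113.7"),
           ("sensor-b", "203.0.113.7"),
           ("sensor-a", "198.51.100.4")]

edge = EdgeSensor()
print(edge.inspect("203.0.113.7", requests_per_second=5))  # before signaling
edge.apply_signal(cloud_analytics(reports))
print(edge.inspect("203.0.113.7", requests_per_second=5))  # after signaling
```

The low-rate source slips past the local check on its own, and is only stopped once the upstream verdict has been pushed down to the edge.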
The Future of AI in Cybersecurity
The future of AI in cybersecurity is in automated defense systems that can learn in dynamic and adversarial contexts: systems that can learn from limited amounts of data and autonomously differentiate the good from the bad, so they can adapt their defenses to an ever-changing and increasingly sophisticated threat landscape.
Whether these future systems will be based on current deep or machine learning technology remains a question, but one thing is sure: artificial intelligence is a requirement to keep ahead of the threats today and even more so in the future.