Tag: machine learning

Is Artificial Intelligence the solution for cybersecurity – Part 2?

So, let me go back to cybersecurity. Sorry for the long, boring mathematical-based explanation in the last part, but if one wants to use a given tool, he/she must understand how it works and, more essentially, its limits. And I shall give you my list of concerns why believing in the current cybersecurity hype can be dangerous for you and your organization:

  • We must not compare a Machine Learning model to the human brain: We have no idea how the human brain works, and more especially how ideas creation and generalization work. Additionally, the pure power consumption of a machine learning model is times bigger than a human brain. Sure it is faster but much more expensive. The average power consumption of a typical adult is 100 Watts, and the brain consumes 20% of this, making the brain’s power consumption around 20 W. For comparison, Google’s DeepMind project uses a whole data center to achieve the same result, which a two-year-old kid does with 20 W.
On the diagram, you can see what kind of problems Machine Learning algorithms can solve in the cybersecurity field. All of the activities listed in the last row are some form of categorization used for detection. No prevention is mentioned.
  • Machine learning is weak in generalization: The primary purpose of polynomial generation is to solve the so-named categorization problem. We have a set of objects with characteristics, and we want to put them in different categories. Machine learning is good at that. However, if we add a new category or dramatically change the set of objects, it fails miserably. In comparison, the human brain is excellent in generalization or, in social words – improvisation. If we transfer this to cybersecurity – ML is good in detection, but weak in prevention.
  • Machine Learning offers nothing new in Cybersecurity: For a long time, antivirus and anti-spam software have used rule engines to categorize whether the incoming file or email is malicious or not. Essentially, this method is just a simple categorization, where we mark the incoming data as harmful or not. All of the currently advertised AI-based cybersecurity platforms do that – instead of making the rule engine manually, they use Machine Learning to train their detection abilities. 

In conclusion, cybersecurity Machine Learning models are good in detection but not in prevention. Marketing them as the panacea for all your cybersecurity problems could be harmful for organizations. A much better presentation of these methods is to call them another tool in the cybersecurity suite and use them appropriately. A good cybersecurity awareness course will undoubtedly increase your chances of prevention rather than the current level of Artificial Intelligence systems.

Is Artificial Intelligence the solution for cybersecurity – Part 1?

Lately, we see a trend in cybersecurity solutions advertising themselves as Artificial Intelligence systems, and they claim to detect and prevent your organization from cyber threats. Many people do not understand what stands behind modern Machine Learning methods and problems they can solve. And more importantly, that these methods do not provide a full range of tools to achieve the wet dream of every Machine Learning specialist – aka general Artificial Intelligence or, in other words – a perfect copy of the human brain.

But how Machine Learning works? Essentially, almost every algorithm in Machine Learning uses the following paradigms. First, we have a set of data, which we call training data. We divide this data into input and output data. The input data is the data our Machine Learning model will use to generate an output. We compare this generated output with the output of the training data and decide whether this result is good. During my learnings in Machine Learning, I was amazed how many training materials could not explain how we create these models. Many authors just started giving the readers mathematical formulas and even highly complex explanations comparing the models to the human brain. In this article, I am trying to provide a simple high-level description of how they work.

On the diagram you can see a standard Deep Learning Machine Learning model. The Conv/Relu and SoftMax parts are actually polynomials sending data from one algebraic space to another

So how do these Machine Learning algorithms do their job? A powerful mathematics branch is learning the properties of algebraic structures (in my university, it was called Abstract Algebra). The whole idea behind this branch is to define algebraic vector spaces, study their properties, and define operations over the created algebraic space. But let’s go back to Machine Learning, essentially our training data represents an algebraic space, and our input and output data is a set of vectors in that space. Luckily, some algebraic spaces can perform the standard polynomial operations, and we even can generate a polynomial rendering the input and output data. And voila, a Machine Learning model is, in many cases, a generated polynomial, which by given input data produces an output data similar to what we expect.

The modern Deep Learning approach is using a heavily modified version of this idea. In addition, it tries to train its models using mathematical analysis over the generated polynomial and, more essentially, its first derivative. And this method is not new. The only reason it has raised its usage lately is that NVidia managed to expose its GPU API to the host system via CUDA and make matrices calculation way faster than on the standard CPUs. And yeah, the definition of a matrix is a set of vectors. Surprisingly, the list of operations supported by a modern GPU is the same set used in Abstract Algebra.

In the next part, we shall discuss how these methods are used in Cybersecurity.