Myths of machine learning
Machine learning has never been more prevalent or accessible than it is today. It is transforming how we interact with businesses, devices, and each other: retailers and advertisers recommend relevant products, email providers detect spam, cars drive themselves, and phones recognize their owner’s face. Yet artificial intelligence and machine learning may be the two most misunderstood concepts in tech today. In this post I will tackle some of the most common machine learning misconceptions.
Myth: Machine learning can’t predict previously unseen events
The argument goes that if something has never happened before, its predicted probability must be zero. What else could it be? That is false, because events are composed of many smaller components, all of which have relationships and similarities. The power of machine learning lies in identifying these relationships and using them to predict rare events accurately, including events that have never been observed in exactly that form.
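To make the idea concrete, here is a minimal sketch that scores an event whose exact combination of components never appears in the history. The toy data, the labels, and the choice of naive Bayes with Laplace smoothing are all assumptions made for illustration; the post does not name a specific algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical toy history: (time_of_day, destination, label).
# The combination ("night", "usb") never occurs here.
history = [
    ("night", "external", "suspicious"),
    ("night", "external", "suspicious"),
    ("night", "internal", "normal"),
    ("day", "internal", "normal"),
    ("day", "internal", "normal"),
    ("day", "external", "normal"),
    ("day", "usb", "suspicious"),
    ("day", "usb", "normal"),
]


def naive_bayes_posterior(history, event, target):
    """Estimate P(target | event) from per-component counts (naive Bayes)."""
    labels = Counter(label for *_, label in history)
    # Distinct values seen for each feature position, used for Laplace smoothing.
    vocab = [{row[i] for row in history} for i in range(len(event))]
    counts = defaultdict(int)
    for *features, label in history:
        for i, value in enumerate(features):
            counts[(label, i, value)] += 1

    scores = {}
    for label, n_label in labels.items():
        score = n_label / len(history)  # prior P(label)
        for i, value in enumerate(event):
            # Smoothed P(component value | label)
            score *= (counts[(label, i, value)] + 1) / (n_label + len(vocab[i]))
        scores[label] = score
    return scores[target] / sum(scores.values())


# The whole event is new, but each component has been seen before.
p_suspicious = naive_bayes_posterior(history, ("night", "usb"), "suspicious")
print(f"P(suspicious | night, usb) = {p_suspicious:.2f}")
```

Although the event as a whole is unseen, its estimated probability is neither zero nor undefined, because the model reasons from the components it has seen.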
Myth: Machine learning ignores preexisting knowledge
Experts in different fields have invested significant human effort to develop domain knowledge. It’s easy to think that this knowledge is lost when a machine learning system is used. In practice, the opposite is true. A vital part of machine learning is feature engineering, the process of extracting patterns or features from raw data, and domain experts are often better than machines at suggesting features that hold predictive power. As such, domain experts play a key part in defining the input to a machine learning system, from which preexisting knowledge can be extracted, extended, and refined.
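As a rough sketch of what feature engineering can look like, the function below turns a raw event record into a few features that a domain expert might suggest. The field names, the office hours, and the features themselves are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def engineer_features(event):
    """Turn a raw event into expert-suggested features (all hypothetical)."""
    ts = datetime.fromisoformat(event["timestamp"])
    return {
        # An expert may know that weekend activity is unusual for this team.
        "is_weekend": ts.weekday() >= 5,
        # Assumed office hours of 09:00 to 18:00.
        "outside_office_hours": not (9 <= ts.hour < 18),
        # Rescale raw bytes to a unit that is easier to reason about.
        "megabytes_sent": event["bytes_sent"] / 1_000_000,
    }


raw_event = {"timestamp": "2024-03-09T23:15:00", "bytes_sent": 250_000_000}
features = engineer_features(raw_event)
print(features)
```

The raw timestamp on its own carries little signal for a model; the expert's knowledge of what counts as unusual working time is what makes it predictive.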
Myth: There’s no such thing as unsupervised learning
In supervised learning an algorithm is given an input and a desired output, and the aim is to learn the mapping from input to output. Think of it as a school test, where there are questions and answers, and you are graded by how close your answers are to the actual ones.
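The school-test analogy can be sketched in a few lines. Here the "student" is a 1-nearest-neighbour classifier, picked purely for brevity, and the data is a made-up toy set: it learns from questions with answers, then is graded on questions it has not seen.

```python
def predict(train, x):
    """1-nearest neighbour: answer with the label of the closest training input."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


# Questions with answers (inputs with desired outputs).
train = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]

# Held-out questions, used only for grading.
exam = [(1.2, "small"), (8.7, "large"), (3.0, "small")]

correct = sum(predict(train, x) == answer for x, answer in exam)
accuracy = correct / len(exam)
print(f"accuracy = {accuracy:.2f}")
```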
Now imagine there are no answers. What can you learn if you only have the questions? In unsupervised learning the aim is to uncover the underlying structure or distribution of the data in order to learn more about it, for example by finding groups of users that behave in the same way, or by identifying events that are anomalous. Unsupervised learning can even generate completely new data. Given enough images of faces, a computer can be trained to create realistic images of people who don’t even exist.
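Identifying anomalous events without any labels can be sketched with a simple z-score rule. The data, the threshold of 2.0, and the z-score method itself are assumptions for illustration; real systems typically use more robust techniques.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily login counts for one user; no labels are provided.
daily_logins = [10, 12, 11, 9, 10, 11, 95, 10]
anomalies = find_anomalies(daily_logins)
print(anomalies)
```

No one told the algorithm which day was unusual; it inferred that from the distribution of the data alone.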
Myth: Machine learning just summarizes data
Machine learning can identify redundant and duplicate data, and for that reason it can represent most of the information in a dataset with only a fraction of the content. However, it can do a lot more: its main purpose is usually to make predictions. Summarizing the products you purchased in the past is just a means to predict which ones you might like to buy in the future. Knowing how product sales have fluctuated in the past is a guide to how they will behave over the coming weeks and months.
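The difference between summarizing and predicting can be shown with a small sketch. The weekly sales figures are invented, and a least-squares straight line is used only as the simplest possible forecasting model.

```python
def fit_line(ys):
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys)) / sum(
        (x - x_mean) ** 2 for x in range(n)
    )
    return slope, y_mean - slope * x_mean


# Hypothetical weekly sales for one product.
weekly_sales = [100, 110, 120, 130]

# A summary describes the past...
average = sum(weekly_sales) / len(weekly_sales)

# ...while a model extrapolates to a week that has not happened yet.
slope, intercept = fit_line(weekly_sales)
forecast_next_week = slope * len(weekly_sales) + intercept

print(f"average so far: {average:.1f}, forecast: {forecast_next_week:.1f}")
```

The average describes what has already happened, while the fitted trend gives an estimate for next week that lies outside the range of anything observed so far.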
Jazz Networks’ machine learning
The volume of data and events generated in corporate networks is beyond the capacity of human experts, making it impossible for them to shoulder the burden of cyber threats alone. The Jazz Platform collects billions of events every day and uses a broad range of cutting-edge algorithms to identify when something abnormal is happening. Jazz Networks provides a system that lets experts focus on the small number of events that really matter, and investigate an incident from start to finish in more detail than ever before.
Mark holds a first-class master’s degree in Computer Science from Bristol University, specializing in Machine Learning. He has over 7 years’ experience developing predictive models and network solutions at leading technology companies including Cisco and Playtech, and has a passion for extracting knowledge from unstructured data.