Last Updated on June 7, 2016
It is important to know why machine learning matters so that you understand the intrinsic value of the field, its methods, and its open questions.
Like knowing your why, knowing the value of the field can be used as a powerful filter of information and help you focus on those methods that actually deliver on the promise that the field makes.
In this post you will learn that machine learning matters because it provides methods that can create solutions to complex problems. You will discover that there are problems for which it is not feasible to manually specify how a program solves a problem.
The promise that machine learning makes is that it provides tools to generate solutions to complex problems faster, more accurately, and at greater scale than we could by programming a solution manually.
Writing programs for a computer can be summarized as automating procedures on input data to create output artifacts. Almost always, these programs are linear, procedural and logical. A traditional program is written in a programming language to some specification, and it has properties like:
- You know or can control the inputs to the program
- You can specify how the program will achieve its goal
- You can map out what decisions the program will make and under what conditions it makes them
- You can test your program and be confident that, because the inputs and outputs are known and all conditions have been exercised, the program will achieve its goal
There are some problems that you can represent in a computer that you cannot write a traditional program to solve. They resist a procedural and logical solution. They have properties such as:
- The scope of all possible inputs is not known beforehand
- You cannot specify how to achieve the goal of the program, only what that goal is
- You cannot map out all the decisions the program will need to make to achieve its goal
- You can collect sample input data for the program
Problems like this resist traditional programmed solutions because manually specifying a solution would require a disproportionate amount of resources.
You are probably a programmer, and you might be an experienced programmer. This might sound very odd, even unbelievable. As programmers, we believe that as long as we can define what a program needs to do, we can define how it will achieve that goal. This is not always the case.
Spam Filter Example
An example of an everyday decision problem that resists a manually defined solution is the discrimination of spam email from non-spam email.
How would you write a program to filter emails as they come into your email account and decide whether to put them in the spam folder or the inbox folder?
Some of my thoughts on how to do this are:
- I’d collect examples of emails I knew to be spam or not-spam
- I’d read the emails I had collected and write down any patterns I saw in either group
- I’d think about abstracting those patterns into more general rules I could program
- I’d look for emails that I could safely and quickly categorize as either spam or non-spam
- I’d write tests for my program to ensure it was making accurate decisions
- I’d monitor the deployed system and keep an eye on the decisions it was making
I could write a program to do this, and so could you. It would take a lot of time. A lot of emails would have to be read. The problem would need to be thought about very deeply. It would take a lot of development and testing time before the system could be trusted enough to be put into operation. Once in operation, there would be so many hard-coded rules specific to the emails I had read that it would be a maintenance nightmare.
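To give a feel for what such a hand-written filter might look like, here is a minimal sketch in Python. The phrases, the trusted sender and the upper-case threshold are all made up for illustration; they are not taken from any real spam corpus, and a real filter would need many more rules than this.

```python
# Hypothetical hand-written rules, invented for illustration only.
SPAM_PHRASES = ("free money", "act now", "you have won")
TRUSTED_SENDERS = {"boss@example.com"}


def is_spam(sender: str, subject: str, body: str) -> bool:
    """Classify one email using fixed, manually chosen rules."""
    text = f"{subject} {body}".lower()
    # Rule 1: never flag mail from a sender we explicitly trust.
    if sender.lower() in TRUSTED_SENDERS:
        return False
    # Rule 2: flag any email that contains a known spam phrase.
    if any(phrase in text for phrase in SPAM_PHRASES):
        return True
    # Rule 3: flag subjects whose letters are mostly upper case.
    letters = [c for c in subject if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.8:
        return True
    return False


print(is_spam("x@example.net", "Hello", "Claim your FREE MONEY today"))
print(is_spam("boss@example.com", "Act now", "Budget review at 3pm"))
```

Every new kind of spam means another rule, and the order of the rules starts to matter (here the trusted-sender rule wins over the phrase rule). That is the maintenance burden in miniature.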
The process above also describes a machine learning solution to the problem of discriminating spam email from non-spam email. The punch line is that machine learning methods can automate the process for you.
Pro Tip: Approaching complex problems in this way is an incredibly valuable skill that will serve you well later on in preparing data and selecting the right machine learning method. Thinking through the process of “how would I manually write a program to solve this” is a master skill that is often overlooked and forgotten by professionals.
Machine Learning Matters
The field of machine learning provides tools to automatically make decisions from data in order to achieve some goal or requirement. The research questions focus on how to do this better and what the results mean.
Let us focus on the practical problem-solving capabilities of the tools and practices of machine learning. These tools and practices of machine learning matter to the world. Four reasons that they matter are:
- Automatically: Machine learning methods are automated processes (algorithms) that create algorithms. The methods run on data and produce a model that specifies how to achieve the program’s goal.
- Fast: Machine learning methods save you time. The methods can analyze sample input data and deliver a program faster than you could manually write one.
- Accurate: Machine learning methods can do a better job than you. As automated methods, they can run longer on more data than you in order to make more accurate decisions.
- Scale: Machine learning methods can provide solutions to problems that you cannot solve. The methods can scale and be interconnected to achieve solutions to problems that previously could not be considered or even conceived.
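As a small illustration of "algorithms that create algorithms", the sketch below learns a spam filter from labeled examples instead of hand-written rules. It uses a simple naive Bayes word-count model, which is one possible method chosen here for brevity and not a method the post prescribes. The four training emails and the function names `train` and `predict` are made up for the example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Learn per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (n_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Tiny, invented training set; a real filter needs far more data.
examples = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not-spam"),
    ("lunch on monday", "not-spam"),
]
model = train(examples)
print(predict(model, "claim your free money"))
print(predict(model, "agenda for lunch"))
```

Nothing in `predict` mentions a specific spam phrase. The decision rules live in the counts produced by `train`, so updating the filter means collecting more examples and retraining, not rewriting code.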
In this post you learned that machine learning matters because it provides methods that can create solutions to complex problems. Specifically, these are problems that resist a manually specified solution.
You learned that the promise of machine learning is that it can solve these types of problems automatically, faster and more accurately than a manually specified solution and at a larger scale.
What are some complex problems that you think resist a manually programmed solution? Leave a comment.
I have a question.
“As automated methods, they can run longer on more data than you in order to make more accurate decisions.”
When would the algorithm know to stop? After all, these are problems with no clear yes/no solutions. So how would you say, “This is good enough”? Your algorithm could run for years….
A really perceptive question, Shantnu.
You define stopping criteria or success criteria for an algorithm run. For example, you could say, run until performance on a test dataset has an accuracy of x. Alternatively, you could run until there is no performance increase for y seconds or z iterations.
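A rough sketch of these stopping criteria in Python is shown below. The function name `run_until_stopped`, its parameters, and the accuracy values in the usage example are hypothetical; `train_step` and `evaluate` stand in for whatever one training iteration and one evaluation on held-out data look like in your setup.

```python
def run_until_stopped(train_step, evaluate, target_accuracy=0.95,
                      patience=3, max_iterations=1000):
    """Train until a target accuracy, a stall, or an iteration limit."""
    best = float("-inf")
    stale = 0  # consecutive iterations without improvement
    for iteration in range(1, max_iterations + 1):
        train_step()
        accuracy = evaluate()
        # Criterion 1: performance is good enough.
        if accuracy >= target_accuracy:
            return iteration, accuracy, "target reached"
        if accuracy > best:
            best, stale = accuracy, 0
        else:
            stale += 1
            # Criterion 2: no improvement for `patience` iterations.
            if stale >= patience:
                return iteration, best, "no improvement"
    # Criterion 3: hard cap so the run can never go on for years.
    return max_iterations, best, "iteration limit"


# Usage with a made-up sequence of accuracies in place of a real model.
fake_scores = iter([0.60, 0.70, 0.74, 0.74, 0.73, 0.74, 0.80])
result = run_until_stopped(lambda: None, lambda: next(fake_scores))
print(result)
```

With these invented scores the run stops at iteration 6 because accuracy has not improved on 0.74 for three iterations in a row, even though a later iteration would have been better. Choosing `patience` is a trade-off between compute time and the chance of stopping too early.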
It might be a good idea if you also explained the phenomena of
1. Overfitting (which may be a result of overtraining)
2. False positives (part of how we check accuracy)
Check out this post on overfitting:
A Simple Intuition for Overfitting, or Why Testing on Training Data is a Bad Idea
Check out this post on accuracy measures:
Classification Accuracy is Not Enough: More Performance Measures You Can Use
If we consider a manually programmed solution consisting of a bunch of if-else statements, then I think the order of the if-else statements will matter (e.g. which condition should be given preference), so how should one go about choosing those conditions?
Mirroring the spam example with emails, could predicting the positive and negative habits of twins be a machine learning problem?
Medical diagnosis, predicting whether a patient is suffering from a disease or not, is a good area for machine learning.
I totally agree at a high level. Specific cases may have merit, e.g. cases where we do not have a reliable physical model for what is going on and ML is demonstrated to be the best forecasting/prediction solution we have (fog forecasting? some bioinformatics problems, perhaps…).
Jason, I have a question.
How can machine learning be used to classify different types of attacks, malware, and viruses? As ransomware is creating lots of problems nowadays, is it possible to have machine learning solutions for identification and classification of the same? Are there any good research papers in this regard to give some idea?
Yes, this would be the application of machine learning to computer security.
This domain is not my speciality, but try search terms like “machine learning computer security” and “machine learning malware”.
I also believe that there are some datasets from the KDDCup related to computer security that you may be able to use as practice.
Jason, thanks for sharing your views.
How about “DNA-based medical machine learning”? Usually DNA-based data is high-dimensional data. As deep learning is an active area for research, can “dimensionality reduction of DNA-based high-dimensional data” be visualized as a deep learning application?
I want to pursue research in the machine learning area (deep learning) and am currently looking for a suitable application for the same.
Another thought in my mind is a recommendation system for call center employees to recommend the right kind of suggestions to customers, with special emphasis on Indian languages / native languages.
As suggested by you, machine learning based malware classification is another area.
Please suggest your views and other suitable / upcoming research areas / applications of machine learning.
Thanks for sharing this blog.
I have a question. You mentioned the example of detecting spam emails. What could be the stopping criteria for an ML algorithm developed for detecting spam emails? For example, do we have to specify a certain number of keywords as spam?
Another point I would like to mention here is an example of ML.
I guess identifying personality traits through handwriting could be an example for ML.
I request you to share your views on the same.
I would suggest that the stopping criteria would be model accuracy or an acceptable spam false positive rate.
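As a sketch of how these two measures could be computed, assuming you have the true labels and the filter's predictions for a held-out set of emails (the function name `spam_metrics` and the six example labels are made up):

```python
def spam_metrics(actual, predicted, positive="spam"):
    """Return (accuracy, false positive rate) for a spam filter."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # spam correctly sent to the spam folder
        elif p == positive and a != positive:
            fp += 1  # good email wrongly sent to the spam folder
        elif p != positive and a != positive:
            tn += 1  # good email correctly left in the inbox
        else:
            fn += 1  # spam that slipped into the inbox
    accuracy = (tp + tn) / len(actual)
    # Share of genuine (non-spam) emails that were flagged as spam.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_positive_rate


# Invented labels for six emails.
actual = ["spam", "spam", "ham", "ham", "ham", "ham"]
predicted = ["spam", "ham", "ham", "ham", "spam", "ham"]
accuracy, fpr = spam_metrics(actual, predicted)
print(f"accuracy={accuracy:.2f} false_positive_rate={fpr:.2f}")
```

For a spam filter the false positive rate often matters more than raw accuracy, because a genuine email lost to the spam folder is usually costlier than one spam message reaching the inbox.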
Estimating personality traits from handwriting sounds great, but the difficulty may be the need to collect a high-quality dataset for training.
I can see machine learning isn’t the stuff for genius, as I thought earlier. I have downloaded and implemented predictions with R packages like kernlab, ksvm, and VIM, plus randomForest… it’s just super exciting.
The built-in iris data is used in R packages to demo how machine learning works.
I am not sure whether my R proficiency is the driving advantage here, but I find machine learning kernels sexy/romantic!
I would now love to share this sweetness with others!
Thanks prof Jason for demystifying Machine learning.
You’re welcome Meshack, I’m glad to hear that you’re making progress.
Yes, Meshack introduced me to machine learning and I am quite into it, especially R forest. It is amazing and I can’t wait to have good mastery of it.
Stick with it!
Programming a Self-Driving car is really a daunting task. Here more novel machine learning algorithms such as convolutional and recurrent neural networks shine and perform better than a hand-written solution.
Another area I think has benefited A LOT from the rapid progress in the ML field is online marketing. Some algorithms are so accurate that sometimes it feels like they read our minds!
Hello. As said above, we can use machine learning to solve complex problems that humans could not solve manually, and the algorithms are fed with data. My question is: what if no data was given beforehand and no model development was done? If we directly apply an algorithm to real inputs (like email spam), how is the prediction going to happen? There may be cases where even a non-spam email is taken as spam.
Is it possible to execute an algorithm without feeding it data to develop a model beforehand? Or should we always provide data and thus build a predictive model?
Supervised learning needs some data to learn from.
Mr. Professor, may I know the present problem statements in machine learning? I’m interested in doing my research work in machine learning using Python and R.
This is a common question that I answer here:
I am interested in, and passionate about, working on research problems in (automated) algorithmic trading. Can somebody please share some resources and papers on this topic?
Sorry, I don’t know about algorithmic trading.
Thank you very much for this post. I would like to see how to use machine learning to optimize solutions in automated analysis of big data.