The first step in any project is defining your problem. You can use the most powerful and shiniest algorithms available, but the results will be meaningless if you are solving the wrong problem.
In this post you will learn the process for thinking deeply about your problem before you get started. This is arguably the most important aspect of applying machine learning.
Problem Definition Framework
I use a simple framework when defining a new problem to address with machine learning. The framework helps me to quickly understand the elements and motivation for the problem and whether machine learning is suitable or not.
The framework involves answering three questions to varying degrees of thoroughness:
- Step 1: What is the problem?
- Step 2: Why does the problem need to be solved?
- Step 3: How would I solve the problem?
Step 1: What is the problem?
The first step is defining the problem. I use a number of tactics to collect this information.
Informal description
Describe the problem as though you were describing it to a friend or colleague. This can provide a great starting point for highlighting areas that you might need to fill. It also provides the basis for a one sentence description you can use to share your understanding of the problem.
For example: I need a program that will tell me which tweets will get retweets.
Formalism
In a previous blog post defining machine learning you learned about Tom Mitchell’s machine learning formalism. Here it is again to refresh your memory.
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Use this formalism to define the T, P, and E for your problem.
For example:
- Task (T): Classify a tweet that has not been published as going to get retweets or not.
- Experience (E): A corpus of tweets for an account where some have retweets and some do not.
- Performance (P): Classification accuracy, the number of tweets predicted correctly out of all tweets considered as a percentage.
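As a minimal sketch, the performance measure P above can be computed in a few lines of Python. The function name `classification_accuracy` and the toy labels are illustrative assumptions, not part of the original example.

```python
def classification_accuracy(actual, predicted):
    """Return the percentage of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one example is required")
    # Count the positions where the prediction equals the true label.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Hypothetical labels: 1 = the tweet was retweeted, 0 = it was not.
actual = [1, 0, 0, 1, 0]
predicted = [1, 0, 1, 1, 0]

accuracy = classification_accuracy(actual, predicted)
print(f"Accuracy: {accuracy:.1f}%")
```

Writing P down this precisely early on makes it easier to notice when accuracy is the wrong measure, for instance when very few tweets in the corpus get retweets.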
Assumptions
Create a list of assumptions about the problem and its phrasing. These may be rules of thumb and domain specific information that you think will get you to a viable solution faster.
It can be useful to highlight questions that can be tested against real data because breakthroughs and innovation occur when assumptions and best practice are demonstrated to be wrong in the face of real data. It can also be useful to highlight areas of the problem specification that may need to be challenged, relaxed or tightened.
For example:
- The specific words used in the tweet matter to the model.
- The specific user that retweets does not matter to the model.
- The number of retweets may matter to the model.
- Older tweets are less predictive than more recent tweets.
Similar problems
What other problems have you seen or can you think of that are like the problem you are trying to solve? Other problems can inform the problem you are trying to solve by highlighting limitations in your phrasing of the problem such as time dimensions and concept drift (where the concept being modeled changes over time). Other problems can also point to algorithms and data transformations that could be adopted to spot check performance.
For example: A related problem is email spam discrimination, which uses text messages as input data and requires a binary classification decision.
Step 2: Why does the problem need to be solved?
The second step is to think deeply about why you want or need the problem solved.
Motivation
Consider your motivation for solving the problem. What need will be fulfilled when the problem is solved?
For example, you may be solving the problem as a learning exercise. This is useful to clarify as you can decide that you don’t want to use the most suitable method to solve the problem, but instead you want to explore methods that you are not familiar with in order to learn new skills.
Alternatively, you may need to solve the problem as part of a duty at work, ultimately to keep your job.
Solution Benefits
Consider the benefits of having the problem solved. What capabilities does it enable?
It is important to be clear on the benefits of the problem being solved to ensure that you capitalize on them. These benefits can be used to sell the project to colleagues and management to get buy in and additional time or budget resources.
If it benefits you personally, then be clear on what those benefits are and how you will know when you have got them. For example, if it’s a tool or utility, then what will you be able to do with that utility that you can’t do now and why is that meaningful to you?
Solution Use
Consider how the solution to the problem will be used and what type of lifetime you expect the solution to have. As programmers we often think the work is done as soon as the program is written, but really the project is just beginning its maintenance lifetime.
The way the solution will be used will influence the nature and requirements of the solution you adopt.
Consider whether you are looking to write a report to present results or you want to operationalize the solution. If you want to operationalize the solution, consider the functional and nonfunctional requirements you have for a solution, just like a software project.
Step 3: How would I solve the problem?
In this third and final step of the problem definition, explore how you would solve the problem manually.
List out step-by-step what data you would collect, how you would prepare it and how you would design a program to solve the problem. This may include prototypes and experiments you would need to perform. These are a gold mine because they highlight questions and uncertainties you have about the domain that could be explored.
This is a powerful tool. It can highlight problems that actually can be solved satisfactorily using a manually implemented solution. It also flushes out important domain knowledge that has been trapped up until now like where the data is actually stored, what types of features would be useful and many other details.
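To make this concrete, here is a minimal sketch of what a manually specified solution to the tweet example might look like. The three signals and the "any two of three" threshold are hypothetical rules of thumb invented for illustration; they are not derived from real data.

```python
def predict_retweet(tweet):
    """Hand-written rule: guess whether a tweet will be retweeted."""
    text = tweet.lower()
    # Hypothetical signals a person might check by hand.
    has_link = "http://" in text or "https://" in text
    has_hashtag = "#" in text
    asks_for_retweet = "please rt" in text or "retweet" in text
    # Hypothetical rule of thumb: any two of the three signals predict a retweet.
    signals = sum([has_link, has_hashtag, asks_for_retweet])
    return signals >= 2


examples = [
    "New post on #machinelearning https://example.com/post",
    "Having coffee",
]
predictions = [predict_retweet(tweet) for tweet in examples]
print(predictions)
```

Even a crude program like this surfaces useful questions: which signals actually matter, where the tweet text is stored, and whether hand-written rules are already good enough.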
Collect all of these details as they occur to you and update the previous sections of the problem definition, especially the assumptions and rules of thumb.
We have considered a manually specified solution before when describing complex problems in why machine learning matters.
Summary
In this post you learned the value of being clear on the problem you are solving. You discovered a three-step framework for defining your problem with practical tactics at each step:
- Step 1: What is the problem? Describe the problem informally and formally and list assumptions and similar problems.
- Step 2: Why does the problem need to be solved? List your motivation for solving the problem, the benefits a solution provides and how the solution will be used.
- Step 3: How would I solve the problem? Describe how the problem would be solved manually to flush domain knowledge.
How do you define your problem for machine learning? Have you used any of the above tactics and if so, what were your experiences? Leave a comment.
This is so useful!.
Thanks Gusseppe.
Hi Jason,
I would like to discuss a project with you. My email is nosa@alum.mit.edu
You can contact me any time via the contact page:
https://machinelearningmastery.com/contact/
Very well thought out way of the very first step – purpose. Aka problem definition.
Thanks Jason. Always a joy to read your blog. Keep it coming so you can continue to educate the world. Many thanks to you !!
Thanks Joe, I’m glad you found it useful.
Thank you so much for this post. It has given me a direction on how to kick_start my new task. Please is there another post that extends the steps discussed here?
Glad to hear it Kola. This might be the only place where I discuss these ideas.
This post is very useful, thank you. It helps a lot in keeping the focus on the essential. I am currently writing my first paper and I do really find this very very helpful!
I’m really glad to hear that Federica.
I am still trying to understand mine 🙁 (Detection of malicious Office documents using machine learning algorithm) please help
Under Step 3, do you mean decide what algorithm are we going to use?
Not quite Prakriti.
It suggests thinking of the problem as a programming exercise and think about what you might have to do to solve it, what data you would need, what structures, etc.
This can help to force you to think deeply about the problem upfront and think about what other data or data transforms and maybe even what techniques you might need to use later on.
See the full process here:
https://machinelearningmastery.com/start-here/#process
Jason, thank you for these thoughts.
Great way to start and understand what I want/can.
Thanks Andrey.
The questions to steps 1 & 2 were very helpful for writing the introduction to my thesis!
Glad to hear it Ivan. Nice work! Best of luck with your thesis (been there…).
Chronic Kidney Disease Detection..which Kind of data sets needed
I don’t know, perhaps you should contact a domain expert?
Good Evening,
I’ve been looking for info regarding k-means output and what it means.
Would you have information regarding same?
Thank you,
Norm
Sorry, I don’t have material on k-means clustering.
Sir, I want to create a performance analysis tool. In which I will keep track of the amount of work that I do and the amount of rest time I take in between. And I will assign intensity level to my each work. I will multiply that intensity with hour I did that work. And at the end of the day I will sum those which will give my performance. Every day I will try to check my performance with taking different combinations of the intensity of the work and the time that I give. So at the end will ML be able to help me plan my works.So that I will be able to complete my work with least amount of time. Or will ML be able to give me a prediction such that if I enter my total day plan which consists of how much amount i will work and rest I take it will give me %age of possibility of its happening.
Sounds great. Collect the data.
That’s perfect, long live your hand
Thanks.
Great collection of material for this newbie. Thank you! This particular model reminds me of a (once) famous book by George Pólya: How To Solve It, a heuristic approach to mathematics. He has a single-page summary of questions that elaborate on your three questions. You might find it interesting to compare. Full PDF is at https://notendur.hi.is/hei2/teaching/Polya_HowToSolveIt.pdf
In particular see questions on p. xvi — xvii
Wikipedia summary: https://en.wikipedia.org/wiki/How_to_Solve_It
Thanks, yes I remember reading that book as a student.
Hi Jason and fellow ML lovers, wonder if somebody can point me to some examples of defining the problem?
Thank you very much for this post. As you point out in the post, the third point is key (and arguably the most important when we are defining the context of the problem) and could greatly improve the solution development speed. As in any other area of software engineering, jumping immediately into code is a great recipe for a subpar solution (and bugs!)
Right on!
Hi Jason,
I like the ‘TEP’ approach of problem definition basis the famous definition of Machine Learning by Tom Mitchell . Also I have always thought of ‘why should this be solved’ but I feel that documenting in terms of Solution Benefits or Solution Use and thinking of the Lifetime Maintenance of the project makes complete sense
Appreciate all your content and learning by the day !
Also I wanted to ask if investing in a formal distance learning course from the University of Chicago Graham School (for example) will be helpful in terms of validating my abilities in the long run, or whether I should continue with self learning?
Thanks,
Hussain
Thanks.
If you like the course, then go for it. I cannot give advice on what would be a good fit for you.
what is difference between q-learning and deep q-learning in terms of high dimensionality of data.
I hope to cover reinforcement learning in more detail soon.
First, many thanks about the contents.
Second: What are the RL algorithm types? Or do you know a reference in which I can find all possible types? (I need it for my thesis)
This might be a good place to start:
https://en.wikipedia.org/wiki/Reinforcement_learning
Hi Jason, I have a question about data: is it possible to conduct fraud detection research without datasets about fraud? If not, is there an alternative?
Not with machine learning. Developing a predictive model with machine learning requires a dataset.
It is instructive yet little helpful to my existing problem while training a model with a data-set of stl10. I am confused about why loss always be fluctuating within a tiny range, meaning that it has no distinct descend over large number of epoch. I am a fresh bird in this field, how could I fine-tune my model parameters? It might be caused by what reason ?
Perhaps the model is underspecified (e.g. a larger model is required).
This post might give you some ideas on how to lift model skill:
https://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/
I am just starting out with Machine Learning and this is very helpful!
Thanks, I’m glad to hear that.
It is very helpful…Good one..Keep posting.!!!
Thanks!
Good, I am starting machine learning using Matlab. Is it better than python?
My job is related to forecasting the electricity demand. We have many predefined algos in matlab, do you think they will work well?
I would recommend Python, here’s why:
https://machinelearningmastery.com/python-growing-platform-applied-machine-learning/
Thnks, Jason it really help me to understand the basic concepts of ML.
I’m happy to hear that.
Thanks a lot, this is very useful
You’re welcome, I’m happy to hear that.
It’s amazing..!! thanks Mr Jason
I’m happy it helped.
I am interested ! i appreciate all the explanation!
i am wondered if you could share with us the codes of building algorithm step by step!
I have many such examples, I also have a book on the topic here:
https://machinelearningmastery.com/machine-learning-algorithms-from-scratch/
Really useful. Sometimes (as in my case), we have a bunch of problems but want to solve them in one shot.
Thanks!
Thanks.
Thank you for this valuable information.
You’re welcome.
Hi, great resource, I like step 3, but not step 2! Why? Most people in tech nowadays seem to work in a matrix organisation, therefore collaboration and pushing for finding business value through collab is more important than anything else. So step 2, “motivation” section, might need to include others besides yourself, as real beneficiaries. So you need to be close to them, and understand their real need. Solving ML probl for yourself will benefit no-one else, mostly (unless you assume the role of someone else in your organisation, and solve a problem they have not asked for/thought of. Cheers!
Great point, thanks Dan!
always a helpful writing!
Thanks.
Thanks Jason. This is a very good help to start with machine learning.
I’m glad it was helpful.
Great article, many thanks!
Thanks.
Nice information to start Machine Learning.I have read all the steps but can you explain more on Step 3: How would I solve the problem?
Here I am not getting “Manually” means what ?
Can you please explain with example…
Thanks,
In that case, I’m prompting you to think about how you might approach the problem if you had to code a solution from scratch.
This cannot be done with a real predictive modeling problem, but can with other problems. It helps sort out that question.
Hello,
I am a beginner in deep learning, and I would like to classify an MRI image into three classes: white matter, gray matter and cerebrospinal fluid...?
Sounds like a fascinating problem.
Perhaps start with a CNN, and try some transfer learning to get good results very quickly.
sir,could you give your email id
You can contact me any time here:
https://machinelearningmastery.com/contact/
Sir, thanks for this post.
I want to remove music from the song but the human voice.
How can I do this, any clue?
Please answer.
Sounds like an amazing project!
I don’t have any tutorials related to this, but perhaps search on arXiv for related projects and discover what types of methods they are using?
Sorry Sir, I don’t want to remove the human voice.
hi can you please send me a book about machine learning
You can access my best free tutorials here:
https://machinelearningmastery.com/start-here/
Thank you, Jason!
You’re welcome.
Thanks, about the data collection! what if i have an issue with detecting it from videos?
do you have any tutorials on here?
I don’t have tutorials on working with video data, I hope to cover the topic in the future.
Thank you for this article. I found it concise and well written. It will help with my primary research. Having worked with a colleague software dev I was wondering how much ‘data’ you would need to be able to use machine learning algorithms. I was under the impression you need at least a couple thousand per label to get anywhere close to it being indicative. Does it make sense to develop an algorithm on a data set that is 2,500 items spread over 25 different labels?
Great question, I answer it here:
https://machinelearningmastery.com/much-training-data-required-machine-learning/
Hi, it’s me again.
I suggest my machine learning problem: today African farmers practice agriculture without knowing the potential of the soil, with my program, depending on the richness of the soil, I predict an agricultural crop that the farmer can grow without the need for fertilizer.
Sounds great.
What data would you use?
I would like to know asking yourself those questions, do we need to write down all feature and answer to those question because touching the data that we gonna use ?
Perhaps use the parts of the framework that best help you work through your problem?
I would like to introduce these projects to each other and teach the whole project with Python code. That is, project-based education.
https://morioh.com/p/b56ae6b04ffc
Sorry, I don’t understand. Can you elaborate?
hello and thanks
how to reference your contents in my thesis?
Good question, this will help:
https://machinelearningmastery.com/faq/single-faq/how-do-i-reference-or-cite-a-book-or-blog-post
Wow, this was a great read. One quick question though, Where does actual algorithm development happen. I assume post step 3. Step 3 is collecting data setting models. I think I m a bit unclear on how exactly algorithm will learn over time, or given near to accurate the answer. This could be due to my s/w code development background. Is there any reference page you could guide me to? Thanks much. This page has been super helpful to me.
Thanks!
Good question, see this broader process:
https://machinelearningmastery.com/start-here/#process
Came across your article through my MIT Sloan CSAIL: AI course. This structured approach to problem solving in this space is spot on. As an engineer and data geek I prefer structure that aids in consistently tackling problems. Very helpful.
Thanks!
I think this is really useful even I am very beginning in understanding machine learning.
I would say this could be good processing any problem solving!
Thanks!
Hi Jason .
How would I know that my ML solution is kind of good and there would not be a better way to do it ? Yes , one check point is the model metric , but I feel stuck up when I can not improve on the metric. Is this normal or is it that it’s better to work on a project as a group of 2 at least so that there can be a peer review ?? Or do we look for a mentor who could do the peer review ?
You can know if it is good by comparing the result to a naive model:
https://machinelearningmastery.com/faq/single-faq/how-to-know-if-a-model-has-good-performance
You cannot know you have the best performance, all you can do is collect evidence that your model performs well and is robust, and that other models do not perform better.
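A minimal sketch of that comparison, assuming a classification problem and using a majority-class predictor as the naive model. The labels and the model predictions below are made-up placeholders, not results from any real project.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return the most frequent label in the training data."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return most_common_label


def accuracy(actual, predicted):
    """Percentage of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Hypothetical data: training labels, held-out labels, and model output.
train_labels = [0, 0, 0, 1, 0, 1]
test_labels = [0, 1, 0, 0, 1]
model_predictions = [0, 1, 0, 0, 0]

# The naive model always predicts the majority class from training.
baseline_label = majority_class_baseline(train_labels)
baseline_predictions = [baseline_label] * len(test_labels)

baseline_score = accuracy(test_labels, baseline_predictions)
model_score = accuracy(test_labels, model_predictions)
print(f"Naive baseline: {baseline_score:.1f}%  Model: {model_score:.1f}%")
```

A model that cannot beat the naive score has no skill on the problem; one that does beat it is evidence, not proof, that the approach is working.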
Peer review is a great idea.
Hey Jason
Many thanks for this article!
Could please explain this phrase more?
“ If you want to operationalize the solution, consider the functional and nonfunctional requirements you have for a solution”
These are terms from software engineering.
Functional requirement
https://en.wikipedia.org/wiki/Functional_requirement
Non-functional requirement
https://en.wikipedia.org/wiki/Non-functional_requirement
This is the most useful article/tutorial I’ve come across as a beginner. Thank you for the invaluable lesson.
Thanks!
Many thanks for the article, Jason. This really helped me a lot in writing a blog as a project assignment. Do you have any links written by you focusing on the below subhead 2,3,4,5 & 6.
1. Problem solving
2. Data Analysis
3. EDA Concluding Remarks
4. Pre-processing Pipeline
5. Building Machine Learning Models
6. Concluding Remarks
Please if you could post links. However, this truly helped a lot.
Yes, you can use the blog search to locate them.
thank you so much
I think in this way you return more time before starting writing the code
You’re welcome.
Very helpful.
Thanks!
My first time reading this and I am an ML novice, but I followed it quite well. Very clearly and succinctly put. Thank you.
Thanks, I’m happy that it helped.
Great work!!
Thanks!
This is a great article. It’s clear, well written. Thank you!
Thanks!
Thanks for the information. The article helped me framed some challenges as well as opportunities as we look to implement AI in our organization.
You’re welcome!
Thank You Jason, I am making a new data science project for my portfolio and this has helped me immensely. I had an interview some time ago and I had a similar question being asked, I have an understanding of this process, but was not able to put it forward properly. This is very helpful.
Thanks, I’m happy it helped!
Machine learning is not all about data and programming. It takes human to support the project too. Thanks
It sure does.
can we apply machine learning algorithms in supermarkets to reduce queue wait time at bill paying counter.
Perhaps.
thanks a lot, it was very helpful for me.
You’re welcome.
Thank you for this article, I am really happy to read you.
I am currently working on a project to create a sculpin that can offer services (restaurant, pharmacy, etc.) to users based on the data they have entered through the messaging system. Now, thanks to this post, I have questioned some things. First I would like to know if I should train the model with a database containing information related to each partner company, which will allow the bot to be able to offer a service or a company to the customer based on what he has entered. Or rather, I first start to enter the model on databases containing conversational data in order to bring the bot to understand and exchange with the user. Also, I would like you to propose me databases related to my problem because the ones I found do not contain enough data.
Please help me sir.
This is specific to your project, I don’t know. Perhaps you can discuss the issue with project stakeholders or experts in your domain.
Beautiful. Pls continue the good work is only God that will reward you
1. Learning from experience has limitations. Some event under some environment, a result and most importantly an interpretation of the result is what we call experience. No two individuals have the same experience from the same event or same individual under different frame of mind on the same event carries different experiences. Human beings are prone to errors in judgement. Machines don’t learn by themselves, it is fundamentally guided by human beings in some way or the other. How can one accept or trust learning of machine as a Holy Grail
2. Some of the greatest inventions have started with solutions, identifying the problems came later.
You addressed the all questions which arises in mind that why we are doing this, thanks a lot
Thank you for the great feedback Muhammad!
thank you so much, it’s a great job
Thanks a lot for this, this has been so helpful to me as a beginner in ML
Hi Ndidi…You are very welcome! Let us know if we can ever answer any questions regarding our content.