So, You are Working on a Machine Learning Problem…

So, you’re working on a machine learning problem.

I want to really nail down where you’re at right now.

Let me make some guesses…

Photo by David Mulder, some rights reserved.

1) You Have a Problem

So you have a problem that you need to solve.

Maybe it’s your problem, an idea you have, a question, or something you want to address.

Or maybe it is a problem that was provided to you by someone else, such as a supervisor or boss.

This problem involves some historical data you have or can access. It also involves some predictions required from new or related data in the future.

Let’s dig deeper.

2) More on Your Problem

Let’s look at your problem in more detail.

You have historical data.

You have observations about something, such as customers, voltages, or prices, collected over time.

You also have some outcome related to each observation, maybe a label like “good” or “bad” or maybe a quantity like 50.1.

The problem you want to solve is, given new observations in the future, what is the most likely related outcome?

So far so good?

3) The Solution to Your Problem

You need a program. A piece of software.

You need a thing that will take observational data as input and give you the most likely outcome as output.

The outcomes provided by the program need to be right, or really close to right. The program needs to be skillful at providing good outcomes for observations.

With such a piece of software, you could run it once for each observation you have.

You could integrate it into some other software, like an app or webpage, and make use of it.

Am I right?
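To make this concrete, here is a minimal sketch of such a program in plain Python: a one-nearest-neighbour predictor. The observations and labels are invented purely for illustration; a real project would have far more data and a properly learned model.

```python
# A toy "program" that maps an observation to its most likely outcome.
# One-nearest-neighbour: predict the label of the closest historical
# observation. The data below is invented purely for illustration.

historical = [
    ((5.0, 120.0), "good"),
    ((1.0, 300.0), "bad"),
    ((4.5, 140.0), "good"),
    ((0.5, 280.0), "bad"),
]

def distance(a, b):
    # Euclidean distance between two observations.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(observation):
    # Return the label of the nearest historical observation.
    _, label = min(historical, key=lambda pair: distance(pair[0], observation))
    return label

print(predict((4.0, 130.0)))
```

The important point is the shape of the thing: data in, most likely outcome out, callable from any other piece of software.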

4) Solve with Machine Learning

You want to solve this problem with machine learning or artificial intelligence, or something.

Someone told you to use machine learning or you just think it is the right tool for this job.

But, it’s confusing.

  • How do you use machine learning on problems like this?
  • Where do you start?
  • What math do you need to know before solving this problem?

Does this describe you?

Or maybe you’ve started working on your problem, but you’re stuck.

  • What data transforms should you use?
  • What algorithm should you use?
  • What algorithm configurations should you use?

Is this a better fit for where you’re at?

I Am Here to Help

I am working on a step-by-step playbook that will walk you through the process of defining your problem, preparing your data, selecting algorithms, and ultimately developing a final model that you can use to make predictions for your problem.

But to make this playbook as useful as possible, I need to know where you are having trouble in this process.

Please, describe where you’re stuck in the comments below.

Share your story. Or even just a small piece.

I promise to read every single one, and even offer advice where possible.

Update:

If you are struggling, I strongly recommend following this process when working through a predictive modeling problem.

101 Responses to So, You are Working on a Machine Learning Problem…

  1. ML rookie April 4, 2018 at 5:44 am #

    Thank you for your blog. So many great posts here, and I have not yet gone through all of them. Let me describe where I am right now:

    1- I have an idea: To create a DL model that generates code.

    2- More: Actually my model aims to generate some templates. Those templates need some additional data from the user before they can be rendered into complete code.

    3- I have been reading about it, and I guess I need to use RNN (LSTM) models in order to generate code (or templates). My problems:
    A- If I want my DL to generate templates for a linear regression program, for example. My training data should be linear regression programs, right? How can I also input the performance of these programs to be considered as training data as well?
    B- Most linear regression programs have a lot in common. So for example, how can I teach my DL model to be proficient in generating linear regression programs, without necessarily going through predicting the next character or word?

    Just to summarize, I want to create a DL model that generates ML programs based on some user input 😀 What are the logical steps I can take to do so?

    • Jason Brownlee April 4, 2018 at 6:23 am #

      Interesting problem.

      I have some examples of LSTMs learning to compute that might help:
      https://machinelearningmastery.com/learn-add-numbers-seq2seq-recurrent-neural-networks/

      This is a general process to work through for a new predictive model:
      https://machinelearningmastery.com/start-here/#process

      I’d recommend spending a lot of time on defining the problem (first step) and working up a large dataset of input/output pairs, i.e. the data used to train the model.

    • Bart April 5, 2018 at 12:26 am #

      Hate to spoil the daydreaming, but it is not possible to have DL write code for you:

      “…, you could not train a deep-learning model to read a product description and generate the appropriate codebase. That’s just one example among many. In general, anything that requires reasoning—like programming or applying the scientific method—long-term planning, and algorithmic data manipulation is out of reach for deep-learning models, no matter how much data you throw at them. Even learning a sorting algorithm with a deep neural network is tremendously difficult.
      This is because a deep-learning model is just a chain of simple, continuous geometric transformations mapping one vector space into another. All it can do is map one data manifold X into another manifold Y, assuming the existence of a learnable continuous transform from X to Y. A deep-learning model can be interpreted as a kind of program; but, inversely, most programs can’t be expressed as deep-learning models—for most tasks, either there exists no corresponding deep-neural network that solves the task or, even if one exists, it may not be learnable: the corresponding geometric transform may be far too complex, or there may not be appropriate data available to learn it.”

      – François Chollet, in the book “Deep learning with Keras”
      (Keras creator)

      • Bart April 5, 2018 at 12:30 am #

        Correction: the name of the book I quoted from is “Deep Learning with Python” by F. Chollet
        https://www.manning.com/books/deep-learning-with-python

      • Jason Brownlee April 5, 2018 at 6:05 am #

        I’m hesitant to count anything out. I have seen some of the learning to compute and code generating LSTM papers and it’s impressive stuff.

        Agreed, there is a long way to go.

        François does have a pessimistic outlook in his writing and tweeting. It was not long ago that object identification and automatic image captioning were a pipe dream, and now they are practically a hello world for beginners. Neural nets could “never learn a language model”, until LSTMs started working at scale.

        For a kid who used to learn XOR with backprop back in the day (me), the progress is incredible.

        • Bart April 5, 2018 at 7:48 am #

          Hello Jason, I strongly advise against over-hyping DL. In fact, phrases such as “deep learning” and “neural networks” are misleading to most of the public.

          “… a deep-learning model is just a chain of simple, continuous geometric transformations mapping one vector space into another. All it can do is map one data manifold X into another manifold Y, assuming the existence of a learnable continuous transform from X to Y….”

          Hence, if we start raising expectations that our chain of simple, continuous geometric transformations will do magic and start writing programming code on its own, we are likely to end up like pets.com in the internet bubble mania of 2001.
          Machine learning is equipped with powerful tools, DL included, but DL is no magic box.

      • Kleyn Guerreiro April 6, 2018 at 5:26 am #

        I do agree with you. But I guess one of the tasks DL can do, however inexactly, since it can be mapped to data to feed the model, is:

        Function point analysis, a method to estimate the cost of an app or system.

        As it is based on past projects, you can load previous projects’ features (word2vec from the code, screenshots, type of language, number of entities, their attributes and types, and so on) in order to quantify numeric outcomes like the number of function points for future projects or for improvements to existing ones.

  2. Hissashi April 4, 2018 at 6:07 am #

    I have tons of data at my disposal and I want to find insights for the business but I am not sure how to find the questions in the first place.

    • Jason Brownlee April 4, 2018 at 6:24 am #

      Perhaps talk to the business and ask what types of insights would really interest them, what areas, what structure, etc.

      Ideally, you want information that is actionable. E.g. where you can then devise an intervention.

  3. Emeka Farrier April 4, 2018 at 7:32 am #

    I have over 1000 hours of audio and transcribed text for the said audio. I’m also embarking on studying TensorFlow.

    My issue at present is how I should prepare this data to train a model (and which model should I use?). I don’t want to wait until I get a handle on TensorFlow and ML concepts before preparing my data appropriately for training, because I’m aware that data preparation can be 80% of the work in AI.

    • Matt April 4, 2018 at 1:04 pm #

      What are you trying to predict from the transcripts?

    • Jason Brownlee April 5, 2018 at 5:44 am #

      That does sound like a great project.

      I would recommend reading some papers on deep learning audio transcription (speech to text) type projects to see what type of representations are common.

      The material I have on preparing text data might help for one half of the project:
      https://machinelearningmastery.com/start-here/#nlp

      • Emeka Farrier April 6, 2018 at 10:04 am #

        Thanks a million. I would like to get in touch with you to keep you posted on my progress.

  4. Dan April 4, 2018 at 11:11 am #

    Thanks Jason – you are the first educator I have known in ML to define the problem facing users in order to develop a solution!! My main problem is that I have variables with high variance (sometimes maybe outliers, but they can’t be excluded for convenience). I struggle with finding the best model to extract feature importance.

  5. Puja April 4, 2018 at 1:24 pm #

    I have a dataset (the x-API dataset) that is related to educational data mining, and I want to use association rule mining and clustering on it using the WEKA tool. Previously I worked on this dataset using classification to predict students’ academic performance. So this time, what can I do that is new using the above-mentioned techniques?
    Could you help me with this problem? I also want to know how training, testing and validation can be done in the WEKA tool.
    Thank you.

  6. Samarth Barthwal April 4, 2018 at 7:50 pm #

    How do I choose a paper to solve a problem? And if using transfer learning, which existing architecture should I begin with?

    • Jason Brownlee April 5, 2018 at 5:56 am #

      Start with a strong description of your problem:
      http://machinelearningmastery.com/how-to-define-your-machine-learning-problem/

      Then find papers on and related to the definition of your problem. There will be 1000 ways to solve a given problem; find methods that look skilful and simple, and that you understand. Too much complexity in learning models is often a smell (e.g. a code smell).
      https://en.wikipedia.org/wiki/Code_smell

      For transfer learning, again use a strong definition of your problem and project goals to guide you. Maybe using the most skilful model matters most, maybe it doesn’t. Or maybe you must test a suite of methods to see what works best for your specific problem.

  7. Muhammad Younas April 4, 2018 at 8:22 pm #

    I have a task to predict student retention by modelling student behavior from observable states (such as interaction log data that contains accessed lectures, discussions, problems and so on) using a hidden Markov model. I have data and also some research papers related to my problem, but no idea how to implement an HMM for this type of task. Can you please refer me to any link about this type of objective implemented with an HMM, or anything through which I can get an idea of how and where to start? Thanks.

    • Jason Brownlee April 5, 2018 at 5:58 am #

      Perhaps google search and find some existing open source code that you can play around with to see if the approach will be viable. It will save days/weeks.

      Also, I have worked on similar problems and found very simple stats and quantile models to be very effective. Basically, users that use the software more stay longer. Sometimes obvious works.
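      For example, the quantile idea can be sketched in a few lines of plain Python. The usage counts and retention flags below are invented purely for illustration:

```python
# The "simple stats" idea: bucket users by a usage quantile and predict
# the historical retention rate of their bucket. The usage counts and
# retention flags are invented for illustration.

usage =    [2, 5, 40, 55, 7, 60, 3, 45, 50, 1]   # e.g. logins per month
retained = [0, 0,  1,  1, 0,  1, 0,  1,  1, 0]   # 1 = still active later

# Split at the median usage into "low" and "high" buckets.
median = sorted(usage)[len(usage) // 2]

def bucket(u):
    return "high" if u >= median else "low"

# Historical retention rate per bucket.
rates = {}
for name in ("low", "high"):
    flags = [r for u, r in zip(usage, retained) if bucket(u) == name]
    rates[name] = sum(flags) / len(flags)

def predict_retention(u):
    # Predict the retention rate of a new user from their usage.
    return rates[bucket(u)]

print(rates)
```

      Binning by more quantiles (quartiles, deciles) is the natural next step; often a model this simple is a strong baseline.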

  8. Ruhin shaikh April 4, 2018 at 11:16 pm #

    Hello sir,
    Sir,my problem statement which I have choosen for my project is “NETWORK ANOMALY DETECTION OF DDOS ATTACKS “.
    We have planned of using DEEP LEARNING TECHNIQUES “to solve the problem.
    We have planned to use a two stage classifiers in our model.In first stage we classify using STACKED AUTOENCODERS and in the second stage using RNN(LSTM).
    The first stage classifier will mark anamolous that will be fed to second stage classifier.In the first stage most of the known attacks will be classified properly whereas in the second stage the novel attacks will be classified.This is done to reduce the false positives.
    The dataset which we will be using is NSL KDD dataset.
    The dataset contains 42 features.
    And we will be using python as platform.
    Sir,my questions are:
    1)The feasibility of this model .
    2)can you please help me with a sample code which can help to detect DDOS attack using STACKED AUTOENCODER and LSTM.
    Sir,I’m finding difficulty in implementing this model.I would be really grateful to you if you could help me.

    • Jason Brownlee April 5, 2018 at 6:02 am #

      Sounds fine. Opinions on feasibility don’t matter though. Only the skill of the model matters.

      Sorry, I cannot write code for you. I hope to cover autoencoders in the future. Until then, perhaps you can find some open source code to use as a starting point?

  9. Komal April 5, 2018 at 1:27 am #

    Hi Jason,

    I am a newbie in ML. I am in the process of preparing an approach to an ML problem. I came up with the following:

    Filling missing values -> Scatter Plot -> Transformations(Log/Pow etc..) -> Normalization -> Train Model->Evaluate Model with the metrics.

    I have a couple of questions. Your inputs are highly appreciated.

    The first is around the process of choosing a transformation/normalization for any given data set. I looked for this on the internet, and most blogs suggest it’s specific to the data set.

    I would like to understand if there is a way to at least narrow down to a few transformation/normalization algorithms for a given data set.

    The other important understanding I lack is the statistical importance of any metric, and how to choose the right metric over another for a given data set.

    Thanks

    • Jason Brownlee April 5, 2018 at 6:12 am #

      Most methods that use a linear/nonlinear sum of the inputs benefit from scaling: think neural nets and logistic regression. The same goes for methods that use distance calculations: think SVM and KNN.

      If unsure, try it and see, use model skill/hard results to guide you.

      Metric, as in for evaluating model skill? In that case, think about what matters for a given model: how do you and the stakeholders know whether it is good or not? Pick a metric or metrics that make answering this question crystal clear for everyone involved.
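      As a rough sketch of what scaling does, here is min-max normalization in plain Python; the rows are invented, with one small-range and one large-range feature:

```python
# Min-max scale each column to [0, 1] so no single feature dominates
# distance calculations. The rows are invented for illustration: the
# second column would otherwise swamp the first in any distance measure.

rows = [
    [0.2, 1500.0],
    [0.9, 1510.0],
    [0.3, 3000.0],
]

def minmax(column):
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

# Scale column by column, then reassemble the rows.
columns = list(zip(*rows))
scaled_columns = [minmax(col) for col in columns]
scaled_rows = [list(r) for r in zip(*scaled_columns)]

print(scaled_rows)
```

      In practice you would use a library scaler fit on training data only, but the effect is the same: both features end up on a comparable scale.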

  10. Mahesh Nirmal April 5, 2018 at 4:02 pm #

    The article gave incredible insights on Machine learning and its importance in the present day. Loved the content. Thanks.

  11. Sangeeta Industries April 5, 2018 at 6:05 pm #

    Hey Jason!
    Thanks for sharing a marvelous article your content is amazing.

  12. Alberto April 5, 2018 at 8:06 pm #

    Hi Jason!

    I will try to describe our problem as easy as I can:

    We need to classify some economic registers. We have about 20 different categories. There are attributes whose type is easily interpretable: price is a continuous variable, product type is a discrete variable, iva type is an ordinal variable, …

    Our first attempt has consisted of finding the best binary classifier for each category. I’m not sure if it is the best approach, but with it we can check what kinds of algorithms work better.

    Our main problem is how to manage attributes such as the NIF (the NIF number is the tax code allowing you to have fiscal presence in Spain). We believe that our dataset will grow, and then that “discrete variable” will have a huge variety of values… And we think that this variable can be decisive for classifying a new register… How do we have to treat this variable?

    The problem we see is that we need to encode this variable’s values because some machine learning algorithms only work with numbers. Using the label and count encoder strategy, we are generating a lot of columns (one per NIF code), and this can underestimate the rest of the columns…

    What do you think? Does a machine learning algorithm exist that works better with this kind of variable?

    Thanks a lot for your job, your blog is very useful for us!

    • Jason Brownlee April 6, 2018 at 6:30 am #

      Thanks for sharing, good problem!

      I think you’re describing a situation where you have a categorical input (a factor) with a large number of categories (levels). E.g. a categorical variable with high and perhaps growing cardinality.

      If so, some ideas off the cuff to handle this might be:

      – Confirm that the variable adds value to the model, prove with experiments with/without it.
      – Scope all possible values and one hot encode them.
      – Integer encode labels, perhaps set an upper limit and normalize the integer (1/n)
      – Group labels into higher order categories then integer or one hot encode.
      – Analyse labels and perhaps pull out flags/binary indicators of properties of interest as new variables.

      Get creative and brainstorm domain specific feature engineering like the last suggestion.

      Does that help or did I misunderstand the question?
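      A sketch combining two of the ideas above: group the labels into higher-order categories, then one hot encode the groups. The NIF-like values and the grouping rule (by leading character) are hypothetical, purely for illustration:

```python
# Group a high-cardinality code into a few higher-order categories, then
# one hot encode the groups. The NIF-like values and the grouping rule
# (leading character) are hypothetical, purely for illustration.

nifs = ["A58818501", "B12345678", "A00000000", "X9999999R", "B87654321"]

def group(nif):
    # Hypothetical rule: group codes by their leading character.
    first = nif[0]
    return first if first in ("A", "B") else "OTHER"

# Fixed, scoped vocabulary of groups.
groups = sorted({group(n) for n in nifs})

def one_hot(nif):
    g = group(nif)
    return [1 if g == name else 0 for name in groups]

print(groups)
print(one_hot("A58818501"))
```

      The “OTHER” bucket is what keeps the encoding stable as new, unseen codes arrive; whether the grouping rule carries signal is exactly the kind of thing to prove with experiments.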

  13. Srihari Katti April 6, 2018 at 12:10 am #

    I live in the metropolitan city of Bengaluru and want to regulate the water supply to different parts of the city using machine learning, i.e. classify different places by whether usage is heavy or light, and distribute water accordingly from a central water supply. How do I formulate this?

  14. John R April 6, 2018 at 5:47 am #

    Hi Jason,

    Thanks so much for your blog! It’s been very helpful with getting my feet wet in applying machine learning.

    I have data on many different parameters from health sensors (heart rate, skin temperature, breathing rate, air temperature, humidity, etc.) and want to try and predict the next reading of one of them (heart rate) based on the current readings of the others.

    Any thoughts on how this could be accomplished?

    I have this data for many different people, and eventually want to model how each individual responds to changes in their measurements. For example, fit people’s heart rates may not change as much with humidity.

    Cheers!

  15. Charles Brauer April 6, 2018 at 6:12 am #

    Hello Dr. Brownlee,

    I am doing research into patterns that occur in financial data. In particular, trade data from the major exchanges.

    To see what I am working on, please visit: http://blog.cypresspoint.com/.

    The main problem I am trying to solve is modeling a dataset that is out-of-balance. Packages like SKLearn and H2O address this problem with an API argument like class_weight = ‘balanced’. This helps, but I feel that this is not enough.

    It looks like Google is ignoring the unbalanced dataset problem. That’s understandable. Their business model is based on totally balanced datasets that contain text, images and audio data.

    Any comment or suggestions will be greatly appreciated.

    Charles Brauer

  16. khaldoon April 6, 2018 at 2:38 pm #

    hello Jason, I am working on text classification research, for which we first need to extract features, as you know. I am confused about whether to select machine learning or deep learning for my research, and how to choose one and why… thanks

    • Jason Brownlee April 6, 2018 at 3:53 pm #

      Perhaps take a quick survey of the literature for similar problems and see what is common.

      Perhaps start with a technique that you are familiar with, then expand from there.

      Perhaps find a tutorial on a similar problem and adapt it for your needs.

      There will not be a single best approach, try a suite of methods to see what works and incrementally improve your code and understanding until you are happy with the skill of the system. It may take a while.

  17. sai sowmya grandhi April 6, 2018 at 4:37 pm #

    I lack the consistency to keep up the work of modeling using ANNs. The results are not good enough and I lose interest. I don’t have the skills to code on my own, but take code from here and there and customize it for my work. If I have to change some parameter to improve the results, I seriously get overwhelmed by the amount of material I have to search through to finally get what I need. I am short of time and getting frustrated about why I took up this challenge (project) in the first place. I use RStudio for modeling and have large amounts of data to deal with (daily climate data of 40-50 years). Please suggest, Jason, what I can do to speed up my learning process.

  18. Maria April 6, 2018 at 6:25 pm #

    Hi Jason,
    thank you for all your posts about the different problems in ML and DL. They are always very detailed and therefore very helpful.

    I have to solve a multi-label classification problem with blog posts. For me as a student in Digital Humanities it is very difficult to understand all the different parameters and statistics. I am using doc2vec to get the vector representation of the text as input to the Keras model. I tried to find out which model is best for my problem and came to the conclusion that an LSTM should fit.

    But I still have many questions:
    – How can I get something like accuracy for the multi-label multi-class prediction? How can I evaluate the multi-label model?
    – How many and which kind of layers would be good? Several LSTM cells in a row?
    – Does it make sense to use an autoencoder between doc2vec and the LSTM to improve my accuracy?
    – How big is the impact of the doc2vec parameters on the LSTM output?
    – How can I find the best combination of all these different parameters?

    I think it is even harder for a newbie to solve a multi-label classification problem with text than a multi-class classification problem with images, because there are a lot more useful papers, tutorials and examples about the latter.

    Thanks.

    • Jason Brownlee April 7, 2018 at 6:18 am #

      Great comment!

      Yes, multi-label classification is underserved. For example, I have nothing on it and I should. I hope to change that in the future.

      Every project will have lots of questions, lots of unknowns. You must get comfortable with the idea that there are no great truths out there, that all “results” and “findings” are provisional until you learn or discover something to overturn them. That is the scientific mindset required in applied machine learning.

      I would recommend writing out each question as you have done. Tackle each in turn. Survey literature on google scholar, look at code and related projects, ask experts, get provisional answers to each question and move on. Circle back as needed. Iterate, but always continue to progress the project forward.

      Many questions we have, like how many layers or how many neurons or which algorithm is best for your problem have no answer. No expert in the world can tell you. They might have ideas of things to try, but the answer is unknown and must be discovered through experimentation on your specific dataset.

      I do see this a lot and I think the remedy is a change in mindset:

      – From: “I am working on a closed problem with a single solution that I just don’t know yet”
      – To: “I am working on an open problem with no best answer and many good enough answers and must use experiments and empirical results to discover what works”.

      Does that help?

      I write about this here:
      https://machinelearningmastery.com/applied-machine-learning-is-hard/

      And here:
      https://machinelearningmastery.com/applied-machine-learning-as-a-search-problem/

  19. Patrick April 6, 2018 at 7:35 pm #

    Hi Jason,

    Thanks for your blog! I have data of all previous shipping vessel positions in the world. I am looking at a specific market, and I am trying to forecast the freight price based on features extracted (variables created) from the data (AIS shipping data).

    To this point I have mainly focused on data exploration and feature extraction, looking at counting the number of vessels in specific areas over time, distances to the loading port, etc. So I have lots of variables that can be used. What do you recommend for finding the “right” variables, and the “right number” of variables, to use in my model? I have read that random forests might do that.

    I’ve done a lot of reading, and think that an RNN (LSTM) or an MLP might be the way to go, using diagnostics and grid search to find the epochs, number of neurons, etc. I’ve also read that other types of neural networks might be used, and that this kind of problem might also be solved by a support vector regressor, or that the SVR might be used as a fitting technique, since in the multivariate case, when the number of variables is high and the data becomes complex, traditional fitting techniques, such as ordinary least squares (OLS), lose efficiency. Lastly, I’ve been told that multivariate adaptive regression splines (MARS) can give some results in forecasting.

    So to summarise, I want to do a multivariate forecast for the price data based on several variables found from global vessel positions, using LSTM, MLP, SVR, MARS or other ANN algorithms.

    What do you recommend / what are your thoughts?

    One last question, do you have any good resources on stream learning? To my understanding, it might be inefficient to re-train the model every time a new observation is made.

    Thank you!

  20. iulia April 6, 2018 at 7:47 pm #

    Currently I’m working on a multiclass classification problem with RF.
    My biggest challenge for this particular problem is heavily imbalanced classes: I have one class that contains only one sample. I cannot ignore this class, I cannot collect more samples for it, and I don’t know how to generate more samples for it.

    If somebody has faced the same problem, please help 🙂

  21. Carlos Augusto April 7, 2018 at 5:35 am #

    Hi Jason, congratulations on your posts and your availability to help people with ML problems.
    We are working on an ML project that involves health features to predict infant mortality in a specific region.
    We are using regression modeling on this project, and the difficulty we are dealing with now is that none of the regression models we tested (linear, generalized linear model and SVM) provided good statistical measures as results, such as a low p-value, and the residuals are not normal.
    The features in the dataset have outliers that cannot be removed, and none of the predictor features has a linear relationship with the target feature.
    The predictors present low values in the correlation matrix, which is good. The features in the dataset were selected by people experienced in the domain.
    We also tested some standardizations of the data, such as normalization, z-score and log, but they didn’t solve the problem.
    Do you have any guess as to where we are failing? Could you suggest some ideas for us, please?
    Thanks, and sorry about any errors in my English; I’m from Brazil and don’t have mastery of the language.

    • Jason Brownlee April 7, 2018 at 6:40 am #

      It could be a million things, for example:

      – Perhaps you need more data.
      – Perhaps you need to relax your ideas about removing outliers (inputs won’t be Gaussian, invalidating most linear methods).
      – Perhaps you need to try a suite of nonlinear methods
      – Perhaps you need to use an alternate error metric to evaluate model skill.
      – …

      No one can give you the silver bullet answer, you are going to have to work to figure it out. Gather evidence to support your findings. Be prepared to change your mind on everything in order to get the most out of the problem.

      I list a ton of ideas to think about here:
      http://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/
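      To illustrate the alternate-metric idea when outliers cannot be removed: RMSE squares the errors, so a single outlier can dominate it, while MAE is far more forgiving. The numbers below are invented for illustration:

```python
# MAE vs RMSE on the same predictions: RMSE squares the errors, so the
# single outlier dominates it. The numbers are invented for illustration.

actual    = [10.0, 12.0, 11.0, 10.5, 50.0]   # last value is an outlier
predicted = [10.2, 11.8, 11.1, 10.4, 12.0]

def mae(y, yhat):
    # Mean absolute error.
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    # Root mean squared error.
    return (sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)) ** 0.5

print(mae(actual, predicted))
print(rmse(actual, predicted))
```

      A model that looks poor under RMSE may look quite good under MAE; choosing the metric that reflects what matters to the stakeholders is part of defining the problem.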

  22. Gautam Karmakar April 8, 2018 at 4:12 pm #

    I have a problem that I need help or guidance with: finding similar questions given a new question. I have around 1500 unlabeled questions to start with. What is the best approach?

    • Jason Brownlee April 9, 2018 at 6:06 am #

      Good question.

      There are many excellent classical text similarity metrics. Perhaps experiment with them? I would recommend starting by surveying the field to see what your options are.
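      As one hypothetical starting point, here is Jaccard similarity over word sets in plain Python, used to rank stored questions against a new one (the questions are invented for illustration):

```python
# Jaccard similarity over word sets: one classical, very simple metric
# for ranking stored questions against a new question. The questions
# are invented for illustration.

questions = [
    "how do i reset my password",
    "how to train a neural network",
    "what is the best learning rate",
]

def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def most_similar(new_question):
    # Return the stored question with the highest Jaccard similarity.
    return max(questions, key=lambda q: jaccard(q, new_question))

print(most_similar("how can i reset a password"))
```

      TF-IDF cosine similarity is the usual next rung up the ladder, but a baseline this simple is often enough to tell whether the idea is viable.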

  23. khaldoon April 10, 2018 at 2:50 am #

    How can I use information gain in text classification? What will the output be like? For example, if we have X_train.shape = (10, 50), what is the output shape for information gain? And can we use it with classifiers like NB or SVM?

  24. Jordan April 10, 2018 at 3:03 am #

    Hello,
    I’m just starting to learn machine learning, so I don’t have much idea about this field. I don’t even know if the problem I’m going to take up is even a machine learning problem, but I am asking it here anyway.

    There is an exam that is conducted once every year, and the rank list (result) is publicly available. If I have the rank lists of the last 5 years (that is, 5 rank lists, where each rank list has the ranks of the aspirants and the scores they gained), is it possible for me to predict the rank of a new user based on the score of a mock test he attended?

    Is this even a machine learning problem?

  25. Anthony Weche April 13, 2018 at 9:34 am #

    Hello Jason, what is the best algorithm to use for preventive maintenance? For example, you want to do predictive maintenance early enough, before the machine breaks down. Or you want to see that the machine has worked sufficiently before changing parts or oil.
    Please advise. Thank you.

    • Jason Brownlee April 13, 2018 at 3:30 pm #

      Great question, perhaps look into survival analysis methods?
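      To give a flavour of what survival analysis looks like, here is a minimal Kaplan-Meier estimator in plain Python. The run hours and failure flags are invented; a library such as lifelines provides a proper implementation:

```python
# A minimal Kaplan-Meier estimator: a workhorse of survival analysis,
# here applied to made-up machine time-to-failure data.

# (hours_run, failed): failed=0 means the machine was still running when
# observation stopped (a censored observation).
data = [(100, 1), (150, 0), (200, 1), (200, 1), (250, 0), (300, 1)]

def kaplan_meier(observations):
    """Return [(time, survival_probability)] at each observed failure time."""
    survival, curve = 1.0, []
    for t in sorted({t for t, failed in observations if failed}):
        at_risk = sum(1 for ti, _ in observations if ti >= t)
        failures = sum(1 for ti, failed in observations if ti == t and failed)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve

print(kaplan_meier(data))
```

      The resulting curve estimates the probability a machine survives past each failure time, while still making use of the censored (still running) machines.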

  26. Novoszáth András May 12, 2018 at 12:07 am #

    My actual issue: how do I recognize, as quickly as possible, what type of model I should start to use and tinker with? After that, the main problem becomes understanding the model, because of the notation with which it is described.

    Nonetheless, these things can be learned, and I am happy to do so in the long term. A big constraint on that, however, is the lack of accessible, worked-out and solid examples providing practice opportunities. There are some really good ones out there, but it always takes a lot of time and trial and error to find them.

  27. Riad June 2, 2018 at 3:41 am #

    hello jason
    I want to do a sentiment analysis on a dataset of movie reviews, but I don’t know where to start, so please can you help me?

  28. Amandeep June 14, 2018 at 5:38 pm #

    Excellent article,
    Thanks for sharing this good information on machine learning…. 🙂

  29. Sevval June 29, 2018 at 5:02 pm #

    Thank you for your help and this helpful website. I sent an email to you.
    Thanks

  30. German July 5, 2018 at 1:55 am #

    Hi Jason,
    We are building a logger program to track user interaction with a core business system. The data set will have the system name, object type, description, order step, and a process tag.

    What we would like to predict, for example, is:
    – In a new logger instance, what is the process tag?
    – What is the next step?

    Thanks,
    German.

  31. Mohamed Bennouf July 22, 2018 at 10:14 am #

    THANK YOU JASON!

    I am amazed I found your blog, where I can actually ask the question I have had for so long but did not know where to begin or who to ask!

    Anyway, I am a service engineer and I am trying to figure out the feasibility of a troubleshooting system to assist me (and other engineers) in solving tool failures (of a mass spectrometer instrument). Over the last 12 years we have accumulated 20K+ failure cases in the form of a problem/solution database (filled in by our engineers in the field). I would like to use machine learning (maybe deep learning) to help our engineers find the most likely case they can apply. For now they use the regular search function in the database software we are using. Of course, this means they have to use the correct words, since the search is keyword-based. It would be great if the AI system could find the solution to a problem stated as a sentence rather than as keywords. If the system could actually learn the domain, even better. In the past, something like case-based reasoning would have been a solution.

    My issue is how to approach this learning problem, given that it is a text-based domain (rather than just numbers). I can export the cases (problem type, problem, and solution) into an Excel file if needed. The goal is to state the new problem and then have the AI show all the cases (solution part) that could apply to it. That way the engineers do not have to reinvent the wheel. Of course the AI may or may not find the correct answer, but it could at least give the engineers solutions that they can check.

    I hope I am making sense; if not, please do not hesitate to ask me questions. To help, I pasted below what a typical case looks like from our 20K+ database.

    Thank you again for allowing us to pick your brain 🙂

    Mo

    CASE#1

    TYPE: MAGNET ISSUES

    PROBLEM:
    The Hall probe is stuck at 7000. Resetting the chassis and the real time did not work.

    SOLUTION:
    Found a dead power transistor along with a bunch of dead 0.1 power resistors. One transistor support was also damaged. Replaced those parts and the magnet worked correctly. Working with Cs2 (Hall probe around 8000) is very hard on the magnet, and we can expect more failures if using that mode too long.

    CASE#2

    TYPE: VACUUM ISSUES

    PROBLEM:
    The projection turbo pump failed again. The power went up and the vacuum degraded in that area.

    SOLUTION:
    I was going to replace the pump again, but noticed that the backing valve of the pump was closed (?) while the synoptic showed the valve open. Closed/opened the valve from the synoptic; the vacuum got better and the pump power went down to normal. I also noticed that the compressed air was set to the minimum value. We increased it, and so far it is working fine. The customer will return the pump to Madison (never used).

    CASE#3
    TYPE: EGUN ISSUES

    PROBLEM:
    The egun current is lower than normal (6-7 uA instead of 30-70 uA). I also found that the emission current is very low (0.5 mA at a 3000-bit filament setting).

    SOLUTION:
    Replaced the egun filament. At first I adjusted the filament height by unscrewing the part 3/5 of a turn (3 notches back), but then I saw that the manual now says to unscrew 1/4 to 1/3 of a turn, so I opened it up and did that. Still I could not get a normal emission current (got 0.5 mA with the filament set to 3000 bits, instead of 2 mA). Finally I found that resistors R1 and R2 had bad solder connections. Now we can get 2 mA at around 3000 bits.

  32. Mohamed Bennouf July 25, 2018 at 4:11 pm #

    Thank you so much Jason! I will look into the link. So you think it will be more like a search problem rather than a machine learning issue per se? I was hoping that the system would learn about the domain (from the 20K cases) and then come up with a solution for a novel problem description.

    In any event, thank you for an amazing blog and for taking the time to help us learn how to solve real problems with AI.

    Mo

    • Jason Brownlee July 26, 2018 at 7:36 am #

      Try many different framings of the problem. You know more about it than I do: try it as supervised learning, try it as search, and go with what works best.
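      As a concrete illustration of the search framing, here is a toy bag-of-words retrieval sketch (the case texts below are abbreviated stand-ins for the real 20K-case database): given a new problem stated as a sentence, rank the stored cases by cosine similarity and show the best matches.

```python
import numpy as np
from collections import Counter

# Toy stand-ins for the real problem/solution case texts
cases = [
    "hall probe stuck magnet power transistor dead",
    "turbo pump failed vacuum degraded backing valve",
    "egun emission current low filament replaced",
]
query = "vacuum got worse after the pump power went up"

def tokenize(text):
    return text.lower().split()

# Build a shared vocabulary over the cases and the query
vocab = sorted({w for doc in cases + [query] for w in tokenize(doc)})
index = {w: i for i, w in enumerate(vocab)}

def vectorize(text):
    # Bag-of-words count vector over the shared vocabulary
    v = np.zeros(len(vocab))
    for w, c in Counter(tokenize(text)).items():
        v[index[w]] = c
    return v

M = np.array([vectorize(doc) for doc in cases])
q = vectorize(query)

# Cosine similarity between the query and every stored case
sims = (M @ q) / (np.linalg.norm(M, axis=1) * np.linalg.norm(q) + 1e-12)
best = int(np.argmax(sims))
print(best)  # -> 1, the vacuum/pump case
```

      In practice one would use TF-IDF weighting (e.g. scikit-learn's `TfidfVectorizer`) rather than raw counts, and return the top-k cases instead of a single best match.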

  33. Rashmiranjan Sharma September 4, 2018 at 7:16 pm #

    I want to predict the min, max, and modal price of agricultural commodities for the next 30 days.
    Can you help me solve this problem? Yes, I already have a dataset; I don’t know how to select the best algorithm.
    Thank you in advance

  34. Johann September 27, 2018 at 10:39 am #

    Hello,

    First, thanks for your awesome website: so many great articles inside!

    My problem: I have technical issues (expressed in common English / natural words like “my application is not working and the error message is ‘cannot access blabla.com'”) and I would like to make a bot that automatically answers with the most probable reason (“did you try to open the firewall? Here is the link to the how-to”). Where should I start? How should I train it? Any information / guidance will be super appreciated.

    Thanks a lot,
    Johann

    • Jason Brownlee September 27, 2018 at 2:46 pm #

      Thanks.

      Perhaps there is a natural relationship between problems and things to try. One approach would be to structure it like a recommender system: people who ask questions like this find it helpful to get suggestions like that.

      If you don’t have such data, you may have to collect it using an existing dumber system with added randomness.

      If you do have such data, you can use it to seed the system.

      This won’t be a new problem; I’d encourage you to browse the literature on scholar.google.com to get an idea of how others have approached it. Perhaps try a few different framings and see what makes sense based on the time and resources you have.

  35. jagjeet kaur October 25, 2018 at 4:36 am #

    hey Jason,

    I am new to machine learning and I want to start by solving a problem, which will boost my confidence in this field. Please help me out by suggesting some beginner-level problems!

  36. kavya October 28, 2018 at 11:53 pm #

    Hi Jason
    Great article! I really liked it.
    Thanks for sharing such good, informative insights… 🙂

  37. Björn Ludin October 31, 2018 at 8:54 pm #

    Hi!
    My problem is this:
    I have loads of data: the odds on a bet exchange for horse races, recorded during each race, for about 2 years.
    That is, for every race during these two years I have data like the following, where r1–rn are the runners (horses):

    t1 r1 r2 r3 …
    0001 5.25 2.04 3.25
    0002 5.10 2.50 2.75

    3254 55 1.01 520

    Here, r2 won.

    The sampling frequency is about 5 times per second.

    I also know whether each runner won/placed/lost.

    My goal is to be able to say that runner X will win/place after ca. 50-75% of the expected race time, with say 80% accuracy.

    My problem is that I don’t know how to model this situation. I’ve seen tournament strategies – i.e., which of two runners will win – but here there is more data, both in time and in participants.

    What model should I pay attention to?
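    One hedged way to frame this is as time series classification: each race is one sample, the input is the odds matrix up to some cutoff (say 60% of the race), and the label is which runner won. A toy numpy sketch with synthetic odds (all shapes and numbers here are illustrative, not from the real data):

```python
import numpy as np

# Synthetic stand-in for the real data: odds[i, t, r] is the odds for
# runner r at tick t in race i (sampled ~5 times per second)
rng = np.random.default_rng(1)
n_races, ticks, runners = 8, 200, 3
odds = rng.uniform(1.01, 60.0, size=(n_races, ticks, runners))
winner = rng.integers(0, runners, size=n_races)  # index of the winning runner

cutoff = int(0.6 * ticks)   # only use the first ~60% of each race
X = odds[:, :cutoff, :]     # (races, timesteps, runners) sequences
y = winner                  # one class label per race

# Simplest baseline: summarise each sequence into fixed-length features
# (e.g. mean odds and latest odds per runner) for any standard classifier
X_flat = np.concatenate([X.mean(axis=1), X[:, -1, :]], axis=1)
print(X_flat.shape)  # (8, 6)
```

    From here, a classical classifier on X_flat gives a baseline, and sequence models (e.g. an LSTM fed the raw X) are only worth the extra effort if they beat it.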

  38. Zafar November 16, 2018 at 3:41 pm #

    Well, I am trying to develop my PhD problem statement. What I actually want to do:

    Existing state-of-the-art classifiers show low accuracy on imbalanced multi-label data sets. There is a need to design/develop a novel, intelligent classifier to improve accuracy on imbalanced multi-label data sets.

    Can you help me mature the problem statement by advising whether to improve an existing classifier or to start designing a new one?

    • Jason Brownlee November 17, 2018 at 5:42 am #

      I recommend talking to your research supervisor.

  39. Suleman Zafar Paracha November 19, 2018 at 5:33 pm #

    Hi,
    I just started to learn about ANNs. I want to start by solving a problem. Let’s suppose I have an image with many symbols in it. The image is not clear, as the signs are old and rough, but it gives a clear idea of what each actual sign is. Now I want to extract those signs from the image, find them in a database, and create a new image with the matched signs from the database. The new HD image will contain the signs in the same locations.
    I don’t know where to start or what tools to use for this. Thank you.

    • Jason Brownlee November 20, 2018 at 6:33 am #

      Start by building up a dataset of images and their associated clean symbols. Thousands of examples.

  40. Farzan Zaheer November 24, 2018 at 4:56 am #

    Hi,

    I have historic data on money spent on advertisements with a TV channel by various industries. Now I want to know if there will be a budget shift in the future; to be more specific, if and when there will be a budget shift from one industry to another.

    I don’t know where to start. Is it a classification problem or regression, or a combination of both, as I have to predict the shift in spending from one industry to another?

    I would be glad if you could point me in the right direction. Thank you

  41. Vince Creed December 4, 2018 at 7:32 pm #

    Hello Jason! I am currently working on a system to detect whether a frame in a video contains fire or not, using machine learning. I have no idea yet how to start. I bet it is a classification problem. Can you suggest the best algorithm to use for this problem? Your advice would be highly appreciated. Thank you!

    • Jason Brownlee December 5, 2018 at 6:13 am #

      Sounds like time series classification, a CNN-LSTM might be a good starting point.

  42. Tyrion December 7, 2018 at 5:44 pm #

    Hello Jason! I recently came across the problem of classifying, finding patterns in, and analyzing news articles about HIV from the past 10 years. What would be the best approach, and which would be the best algorithm to use?
