So, You are Working on a Machine Learning Problem…

So, you’re working on a machine learning problem.

I want to really nail down where you’re at right now.

Let me make some guesses…

So, You are Working on a Machine Learning Problem…
Photo by David Mulder, some rights reserved.

1) You Have a Problem

So you have a problem that you need to solve.

Maybe it’s your problem, an idea you have, a question, or something you want to address.

Or maybe it is a problem that was provided to you by someone else, such as a supervisor or boss.

This problem involves some historical data you have or can access. It also involves some predictions required from new or related data in the future.

Let’s dig deeper.

2) More on Your Problem

Let’s look at your problem in more detail.

You have historical data.

You have observations about something, like customers, voltages, prices, etc. collected over time.

You also have some outcome related to each observation, maybe a label like “good” or “bad” or maybe a quantity like 50.1.

The problem you want to solve is, given new observations in the future, what is the most likely related outcome?

So far so good?

3) The Solution to Your Problem

You need a program. A piece of software.

You need a thing that will take observational data as input and give you the most likely outcome as output.

The outcomes provided by the program need to be right, or really close to right. The program needs to be skillful at providing good outcomes for observations.

With such a piece of software, you could run it once for each observation you have.

You could integrate it into some other software, like an app or webpage, and make use of it.

Am I right?

4) Solve with Machine Learning

You want to solve this problem with machine learning or artificial intelligence, or something.

Someone told you to use machine learning or you just think it is the right tool for this job.

But, it’s confusing.

  • How do you use machine learning on problems like this?
  • Where do you start?
  • What math do you need to know before solving this problem?

Does this describe you?

Or maybe you’ve started working on your problem, but you’re stuck.

  • What data transforms should you use?
  • What algorithm should you use?
  • What algorithm configurations should you use?

Is this a better fit for where you’re at?

I Am Here to Help

I am working on a step-by-step playbook that will walk you through the process of defining your problem, preparing your data, selecting algorithms, and ultimately developing a final model that you can use to make predictions for your problem.

But to make this playbook as useful as possible, I need to know where you are having trouble in this process.

Please, describe where you’re stuck in the comments below.

Share your story. Or even just a small piece.

I promise to read every single one, and even offer advice where possible.

77 Responses to So, You are Working on a Machine Learning Problem…

  1. ML rookie April 4, 2018 at 5:44 am #

    Thank you for your blog. So many great posts here; I have not yet gone through all your previous posts. Let me describe where I am right now:

    1- I have an idea: To create a DL model that generates code.

    2- More: Actually my model aims to generate some templates. Those templates need some additional data from the user before they can be rendered into complete code.

    3- I have been reading about it, and I guess I need to use RNN (LSTM) models in order to generate code (or templates). My problems:
    A- If I want my DL to generate templates for a linear regression program, for example, my training data should be linear regression programs, right? How can I also input the performance of these programs as training data?
    B- Most linear regression programs have a lot in common. So, for example, how can I teach my DL model to be proficient at generating linear regression programs without necessarily going through predicting the next character or word?

    Just to summarize, I want to create a DL model that generates ML programs based on some user input 😀 What are the logical steps I can take to do so?

    • Jason Brownlee April 4, 2018 at 6:23 am #

      Interesting problem.

      I have some examples of LSTMs learning to compute that might help:
      https://machinelearningmastery.com/learn-add-numbers-seq2seq-recurrent-neural-networks/

      This is a general process to work through for a new predictive model:
      https://machinelearningmastery.com/start-here/#process

      I’d recommend spending a lot of time on defining the problem (first step) and on building a large input/output dataset to train the model.
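
      As a rough illustration of that dataset-building step (a toy sketch in the spirit of the linked seq2seq post, not code taken from it), character-level input/output pairs for an addition problem can be generated and integer-encoded like this:

```python
import random

# Character vocabulary for addition problems, padded with spaces.
chars = sorted(set("0123456789+ "))
char_to_int = {c: i for i, c in enumerate(chars)}

def make_pair(max_n=99):
    # One random input/output example, e.g. ("17+72", "89").
    a, b = random.randint(0, max_n), random.randint(0, max_n)
    return f"{a}+{b}", str(a + b)

def encode(s, length):
    # Pad to a fixed length, then map each character to an integer.
    return [char_to_int[c] for c in s.rjust(length)]

random.seed(1)
X, y = zip(*(make_pair() for _ in range(5)))
X_enc = [encode(s, 5) for s in X]  # model inputs
y_enc = [encode(s, 3) for s in y]  # model targets
```

      Pairs like these, generated in bulk, are exactly the kind of input/output dataset a seq2seq model is trained on.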

    • Bart April 5, 2018 at 12:26 am #

      Hate to spoil the daydreaming, but it is not possible to have DL write code for you:

      “…, you could not train a deep-learning model to read a product description and generate the appropriate codebase. That’s just one example among many. In general, anything that requires reasoning—like programming or applying the scientific method—long-term planning, and algorithmic data manipulation is out of reach for deep-learning models, no matter how much data you throw at them. Even learning a sorting algorithm with a deep neural network is tremendously difficult.

      This is because a deep-learning model is just a chain of simple, continuous geometric transformations mapping one vector space into another. All it can do is map one data manifold X into another manifold Y, assuming the existence of a learnable continuous transform from X to Y. A deep-learning model can be interpreted as a kind of program; but, inversely, most programs can’t be expressed as deep-learning models—for most tasks, either there exists no corresponding deep-neural network that solves the task or, even if one exists, it may not be learnable: the corresponding geometric transform may be far too complex, or there may not be appropriate data available to learn it.”

      – François Chollet, in the book “Deep learning with Keras”
      (Keras creator)

      • Bart April 5, 2018 at 12:30 am #

        Correction: the name of the book I quoted from is “Deep Learning with Python” by F. Chollet
        https://www.manning.com/books/deep-learning-with-python

      • Jason Brownlee April 5, 2018 at 6:05 am #

        I’m hesitant to count anything out. I have seen some of the learning to compute and code generating LSTM papers and it’s impressive stuff.

        Agreed, there is a long way to go.

        François does have a pessimistic outlook in his writing and tweeting. It was not long ago that object identification and automatic image captioning were a pipe dream, and now they are practically a hello world for beginners. Neural nets could “never learn a language model”, until LSTMs started working at scale.

        For a kid who used to learn XOR with backprop back in the day (me), the progress is incredible.

        • Bart April 5, 2018 at 7:48 am #

          Hello Jason, I strongly advise against over-hyping DL. In fact, phrases such as “deep learning” and “neural networks” are misleading for most of the public.

          “… a deep-learning model is just a chain of simple, continuous geometric transformations mapping one vector space into another. All it can do is map one data manifold X into another manifold Y, assuming the existence of a learnable continuous transform from X to Y….”

          Hence, if we start raising expectations that our chain of simple, continuous geometric transformations will do magic and start writing programming code on its own, we are likely to end up like pets.com in the internet bubble mania of 2001.
          Machine learning is equipped with powerful tools, DL included, but DL is no magic box.

      • Kleyn Guerreiro April 6, 2018 at 5:26 am #

        I do agree with you. But I guess one of the hardest (and inexact) tasks DL could take on, since it can be mapped to data to feed DL, is:

        Function point analysis, a method to estimate the cost of an app or system.

        As it is based on past projects, you can load previous projects’ features (word2vec from the code, screenshots, type of language, number of entities, their attributes and types, and so on) in order to quantify numeric outcomes like the number of function points for future projects or for improvements to existing ones.

  2. Hissashi April 4, 2018 at 6:07 am #

    I have tons of data at my disposal and I want to find insights for the business but I am not sure how to find the questions in the first place.

    • Jason Brownlee April 4, 2018 at 6:24 am #

      Perhaps talk to the business and ask what types of insights would really interest them, what areas, what structure, etc.

      Ideally, you want information that is actionable. E.g. where you can then devise an intervention.

  3. Emeka Farrier April 4, 2018 at 7:32 am #

    I have over 1000 hours of audio and transcribed text for the said audio. I’m also embarking on studying TensorFlow.

    My issue at present is how I should prepare this data to train a model (and which model should I use?). I don’t want to wait until I get a handle on TensorFlow and ML concepts before preparing my data appropriately for training, because I’m aware that data preparation can be 80% of the work in AI.

    • Matt April 4, 2018 at 1:04 pm #

      What are you trying to predict from the transcripts?

    • Jason Brownlee April 5, 2018 at 5:44 am #

      That does sound like a great project.

      I would recommend reading some papers on deep learning audio transcription (speech to text) type projects to see what type of representations are common.

      The material I have on preparing text data might help for one half of the project:
      https://machinelearningmastery.com/start-here/#nlp
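
      To get a sense of the representations such papers use, here is a minimal NumPy sketch (my own illustration, with a synthetic tone standing in for speech) of framing a signal and computing a magnitude spectrogram; real projects usually go one step further to log-mel or MFCC features via a library such as librosa:

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    # Slice the signal into overlapping frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    # Window each frame, then take the magnitude of the FFT.
    frames = frames * np.hanning(frame_len)
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a synthetic 440 Hz tone at 16 kHz, standing in for audio.
t = np.linspace(0, 1, 16000, endpoint=False)
spec = spectrogram(np.sin(2 * np.pi * 440 * t))
# spec has shape (n_frames, frame_len // 2 + 1): one spectral row per frame.
```

      The transcribed text on the other side of each training pair would be prepared with the NLP material linked above.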

      • Emeka Farrier April 6, 2018 at 10:04 am #

        Thanks a million. I would like to get in touch with you to keep you posted on my progress.

  4. Dan April 4, 2018 at 11:11 am #

    Thanks Jason – you are the first ML educator I have known to define the problem facing users in order to develop a solution!! My main problem is that I have variables with high variance (sometimes maybe outliers, but they can’t be excluded for convenience). I struggle with finding the best model to extract feature importance.

  5. Puja April 4, 2018 at 1:24 pm #

    I have the x-API dataset, which is related to educational data mining, and I want to use association rule mining and clustering on this dataset using the WEKA tool. Previously, I worked on this dataset using classification to predict students’ academic performance. So this time, what can I do that is new with the above-mentioned techniques?
    Could you also help me understand how training, testing and validation can be done in the WEKA tool?
    Thank you.

  6. Samarth Barthwal April 4, 2018 at 7:50 pm #

    How do I choose a paper to solve a problem? And if using transfer learning, which existing architecture should I begin with?

    • Jason Brownlee April 5, 2018 at 5:56 am #

      Start with a strong description of your problem:
      http://machinelearningmastery.com/how-to-define-your-machine-learning-problem/

      Then find papers on and related to the definition of your problem. There will be 1000 ways to solve a given problem, find methods that look skilful, simple and that you understand. Too much complexity in learning models is often a smell (e.g. code smell).
      https://en.wikipedia.org/wiki/Code_smell

      For transfer learning, again use a strong definition of your problem and project goals to guide you. Maybe using the most skilful model matters most, maybe it doesn’t. Or maybe you must test a suite of methods to see what works best for your specific problem.

  7. Muhammad Younas April 4, 2018 at 8:22 pm #

    I have a task to predict student retention by modelling student behavior from observable states (such as interaction log data containing lecture access, discussions, problems and so on) using a hidden Markov model. I have data and also some research papers related to my problem, but no idea how to implement an HMM for this type of task. Can you please refer me to any link where this type of objective is implemented with an HMM, or anything through which I can get an idea of how and where to start? Thanks

    • Jason Brownlee April 5, 2018 at 5:58 am #

      Perhaps google search and find some existing open source code that you can play around with to see if the approach will be viable. It will save days/weeks.

      Also, I have worked on similar problems and found very simple stats and quantile models to be very effective. Basically, users that use the software more stay longer. Sometimes obvious works.

  8. Ruhin shaikh April 4, 2018 at 11:16 pm #

    Hello sir,
    Sir, the problem statement which I have chosen for my project is “NETWORK ANOMALY DETECTION OF DDOS ATTACKS”.
    We have planned on using DEEP LEARNING TECHNIQUES to solve the problem.
    We have planned to use a two-stage classifier in our model. In the first stage we classify using STACKED AUTOENCODERS, and in the second stage using an RNN (LSTM).
    The first-stage classifier will mark anomalous traffic that will be fed to the second-stage classifier. In the first stage most of the known attacks will be classified properly, whereas in the second stage the novel attacks will be classified. This is done to reduce the false positives.
    The dataset which we will be using is the NSL-KDD dataset.
    The dataset contains 42 features.
    And we will be using Python as the platform.
    Sir, my questions are:
    1) The feasibility of this model.
    2) Can you please help me with sample code to detect a DDoS attack using a STACKED AUTOENCODER and LSTM?
    Sir, I’m finding difficulty in implementing this model. I would be really grateful if you could help me.

    • Jason Brownlee April 5, 2018 at 6:02 am #

      Sounds fine. Opinions on feasibility don’t matter though. Only the skill of the model matters.

      Sorry, I cannot write code for you. I hope to cover autoencoders in the future. Until then, perhaps you can find some open source code to use as a starting point?

  9. Komal April 5, 2018 at 1:27 am #

    Hi Jason,

    I am a newbie in ML. I am in the process of preparing an approach to an ML problem. I came up with the following:

    Filling missing values -> Scatter Plot -> Transformations(Log/Pow etc..) -> Normalization -> Train Model->Evaluate Model with the metrics.

    I have a couple of questions. Your inputs are highly appreciated.

    The first is around the process of choosing a transformation/normalization for any given data set. I looked for this on the internet, and most blogs suggest it is specific to the data set.

    I would like to understand if there is a way to at least narrow down to a few transformation/normalization algorithms for a given data set.

    The other important understanding I lack is the statistical importance of any metric, and how to choose the right metric over another for a given data set.

    Thanks

    • Jason Brownlee April 5, 2018 at 6:12 am #

      Most methods that use a linear/non-linear sum of the inputs benefit from scaling, think neural nets and logistic regression. Also methods that use distance calculations, think SVM and KNN.

      If unsure, try it and see, use model skill/hard results to guide you.

      Metric – for evaluating model skill? In that case, think about what matters for a given model. How do you/stakeholders know whether it is good or not? Pick a metric or metrics that make answering this question crystal clear for everyone involved.
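
      For example, a minimal sketch of the two common scalings in plain NumPy (the toy matrix is invented; scikit-learn’s MinMaxScaler and StandardScaler do the same with a proper fit/transform API):

```python
import numpy as np

# A toy matrix: each row is an observation, each column a feature
# on a very different scale.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

# Normalization (min-max): squashes each column into [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization (z-score): zero mean, unit variance per column.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```

      Whichever you use, compute the scaling statistics on the training set only and reuse them on new data, otherwise information leaks from test into training.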

  10. Mahesh Nirmal April 5, 2018 at 4:02 pm #

    The article gave incredible insights on Machine learning and its importance in the present day. Loved the content. Thanks.

  11. Sangeeta Industries April 5, 2018 at 6:05 pm #

    Hey Jason!
    Thanks for sharing a marvelous article your content is amazing.

  12. Alberto April 5, 2018 at 8:06 pm #

    Hi Jason!

    I will try to describe our problem as easy as I can:

    We need to classify some economic registers. We have about 20 different categories. There are attributes whose types are easily interpretable: price is a continuous variable, product type is a discrete variable, IVA type is an ordinal variable, …

    Our first attempt has consisted of finding the best binary classifier for each category. I’m not sure if it is the best approach, but with it we can check what kinds of algorithms work better.

    Our main problem is how to manage some attributes such as the NIF (the NIF number is the tax code allowing you to have fiscal presence in Spain). We believe that our dataset will grow, and then that “discrete variable” will have a huge variety of values… And we think that this variable can be decisive in classifying a new register… How should we treat this variable?

    The problem we see is that we need to encode this variable’s values because some machine learning algorithms only work with numbers. Using the label and count encoder strategy, we are generating a lot of columns (one per NIF code), and this can underestimate the rest of the columns…

    What do you think? Is there a machine learning algorithm that works better with this kind of variable?

    Thanks a lot for your work; your blog is very useful for us!

    • Jason Brownlee April 6, 2018 at 6:30 am #

      Thanks for sharing, good problem!

      I think you’re describing a situation where you have a categorical input (a factor) with a large number of categories (levels). E.g. a categorical variable with high and perhaps growing cardinality.

      If so, some ideas off the cuff to handle this might be:

      – Confirm that the variable adds value to the model, prove with experiments with/without it.
      – Scope all possible values and one hot encode them.
      – Integer encode labels, perhaps set an upper limit and normalize the integer (1/n)
      – Group labels into higher order categories then integer or one hot encode.
      – Analyse labels and perhaps pull out flags/binary indicators of properties of interest as new variables.

      Get creative and brainstorm domain specific feature engineering like the last suggestion.
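
      One more idea in the same spirit, sketched here as an illustration rather than a recommendation, is the hashing trick: every label maps into a fixed number of buckets, so the encoded width never grows no matter how many new NIF values appear (the bucket count and labels below are invented):

```python
import hashlib

N_BUCKETS = 32  # illustrative; tune for cardinality vs. collision tolerance

def hash_bucket(label, n_buckets=N_BUCKETS):
    # A stable hash: Python's built-in hash() is salted per process.
    digest = hashlib.md5(label.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets

def encode_one_hot(label, n_buckets=N_BUCKETS):
    # One-hot vector over hash buckets; unseen labels still encode.
    vec = [0] * n_buckets
    vec[hash_bucket(label, n_buckets)] = 1
    return vec

vec = encode_one_hot("B12345678")  # a made-up NIF-like code
```

      Collisions (two labels sharing a bucket) are the price paid for the fixed width; scikit-learn’s FeatureHasher implements the same idea at scale.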

      Does that help or did I misunderstand the question?

  13. Srihari Katti April 6, 2018 at 12:10 am #

    I live in the metropolitan city of Bengaluru and want to regulate the water supply to different parts of the city using machine learning, i.e. classify different areas by whether usage is heavy or light and distribute water accordingly from a central water supply. How do I formulate this?

  14. John R April 6, 2018 at 5:47 am #

    Hi Jason,

    Thanks so much for your blog! It’s been very helpful with getting my feet wet in applying machine learning.

    I have data on many different parameters from health sensors (heart rate, skin temperature, breathing rate, air temperature, humidity, etc.) and want to try and predict the next reading of one of them (heart rate) based on the current readings of the others.

    Any thoughts on how this could be accomplished?

    I have this data for many different people, and eventually want to model how each individual responds to changes in their measurements. For example, fit people’s heart rates may not change as much with humidity.

    Cheers!

  15. Charles Brauer April 6, 2018 at 6:12 am #

    Hello Dr. Brownlee,

    I am doing research into patterns that occur in financial data. In particular, trade data from the major exchanges.

    To see what I am working on, please visit: http://blog.cypresspoint.com/.

    The main problem I am trying to solve is modeling a dataset that is out-of-balance. Packages like SKLearn and H2O address this problem with an API argument like class_weight = ‘balanced’. This helps, but I feel that this is not enough.

    It looks like Google is ignoring the unbalanced dataset problem. That’s understandable. Their business model is based on totally balanced datasets that contain text, images and audio data.

    Any comment or suggestions will be greatly appreciated.

    Charles Brauer

  16. khaldoon April 6, 2018 at 2:38 pm #

    Hello Jason, I am working on text classification research, for which we first need to extract features, as you know. I am confused about whether to use machine learning or deep learning for my research: how do I select one, and why? Thanks.

    • Jason Brownlee April 6, 2018 at 3:53 pm #

      Perhaps take a quick survey of the literature for similar problems and see what is common.

      Perhaps start with a technique that you are familiar with, then expand from there.

      Perhaps find a tutorial on a similar problem and adapt it for your needs.

      There will not be a single best approach, try a suite of methods to see what works and incrementally improve your code and understanding until you are happy with the skill of the system. It may take a while.

  17. sai sowmya grandhi April 6, 2018 at 4:37 pm #

    I lack the consistency to keep up the work of modeling using ANNs. The results are not good enough and I lose interest. I don’t have coding skills of my own, but take code from here and there and customize it for my work. If I have to change some parameter to improve results, I seriously get overwhelmed by the amount of material I have to search through to finally get what I need. I am short of time and getting frustrated about why I took up this challenge (project) in the first place. I use RStudio for modeling and have large amounts of data to deal with (daily climate data of 40-50 years). Please suggest, Jason, what I can do to speed up my learning process.

  18. Maria April 6, 2018 at 6:25 pm #

    Hi Jason,
    thank you for all your posts about the different problems in ML and DL. They are always very detailed and therefore very helpful.

    I have to solve a multi-label classification problem with blog posts. For me, as a student in Digital Humanities, it is very difficult to understand all the different parameters and statistics. I am using doc2vec to get the vector representation of the text as input to the Keras model. I tried to find out which model is best for my problem and came to the conclusion that an LSTM should fit.

    But I still have many questions:
    – How can I get something like accuracy for the multi-label multi-class prediction? How can I evaluate the multi-label model?
    – How many and which kind of layers would be good? Several LSTM cells in a row?
    – Does it make sense to use an autoencoder between doc2vec and the LSTM to improve my accuracy?
    – How big is the impact of the doc2vec parameters on the LSTM output?
    – How can I find the best combination of all these different parameters?

    I think it is even harder for a newbie to solve a multi-label classification problem with text instead of a multi-class classification problem with images because there are a lot more useful papers, tutorials and examples about that.

    Thanks.

    • Jason Brownlee April 7, 2018 at 6:18 am #

      Great comment!

      Yes, multi-label classification is underserved. For example, I have nothing on it and I should. I hope to change that in the future.

      Every project will have lots of questions, lots of unknowns. You must get comfortable with the idea that there are no great truths out there, that all “results” and “findings” are provisional until you learn or discover something to overturn them. That is the scientific mindset required in applied machine learning.

      I would recommend writing out each question as you have done. Tackle each in turn. Survey literature on google scholar, look at code and related projects, ask experts, get provisional answers to each question and move on. Circle back as needed. Iterate, but always continue to progress the project forward.

      Many questions we have, like how many layers or how many neurons or which algorithm is best for your problem have no answer. No expert in the world can tell you. They might have ideas of things to try, but the answer is unknown and must be discovered through experimentation on your specific dataset.

      I do see this a lot and I think the remedy is a change in mindset:

      – From: “I am working on a closed problem with a single solution that I just don’t know yet”
      – To: “I am working on an open problem with no best answer and many good enough answers and must use experiments and empirical results to discover what works”.

      Does that help?

      I write about this here:
      https://machinelearningmastery.com/applied-machine-learning-is-hard/

      And here:
      https://machinelearningmastery.com/applied-machine-learning-as-a-search-problem/

  19. Patrick April 6, 2018 at 7:35 pm #

    Hi Jason,

    Thanks for your blog! I have data of all previous shipping vessel positions in the world. I am looking at a specific market, and I am trying to forecast the freight price based on features extracted (variables created) from the data (AIS shipping data).

    To this point I have mainly focused on data exploration and feature extraction: counting the number of vessels in specific areas over time, distances to the loading port, etc. So I have lots of variables that can be used. What do you recommend for finding the “right” variables, and the “right number” of variables, to use in my model? I have read that random forests might do that.

    I’ve done a lot of reading, and think that an RNN (LSTM) or an MLP might be the way to go, using diagnostics and grid search to find epochs, number of neurons, etc. I’ve also read that other types of neural networks might be used, and that this kind of problem might also be solved by a support vector regressor: in the multivariate case, when the number of variables is high and the data become complex, traditional fitting techniques such as ordinary least squares (OLS) lose efficiency. Lastly, I’ve been told that multivariate adaptive regression splines (MARS) can give some results in forecasting.

    So to summarise, I want to do a multivariate forecast for the price data based on several variables found from global vessel positions, using LSTM, MLP, SVR, MARS or other ANN algorithms.

    What do you recommend / what are your thoughts?

    One last question, do you have any good resources on stream learning? To my understanding, it might be inefficient to re-train the model every time a new observation is made.

    Thank you!

  20. iulia April 6, 2018 at 7:47 pm #

    Currently I’m working on a multiclass classification problem with RF.
    My biggest challenge for this particular problem is heavily imbalanced classes: I have one class that contains only one sample. I cannot ignore this class, I cannot collect more samples for it, and I don’t know how to generate more samples for it.

    If somebody has faced the same problem, please help 🙂

  21. Carlos Augusto April 7, 2018 at 5:35 am #

    Hi Jason, congratulations on your posts and your availability to help people with ML problems.
    We are working on an ML project that involves health features to predict infant mortality in a specific region.
    We are using regression modeling on this project, and the difficulty we are dealing with now is that none of the regression models we tested (linear, general linear model and SVM) provided good statistical measures as a result, such as a low p-value, and the residuals are not normal.
    The features in the dataset have outliers that cannot be removed, and none of the predictor features has a linear relationship with the target feature.
    The predictors show low values in the correlation matrix, which is good. The features in the dataset were selected by people experienced in the domain.
    We also tested some standardizations of the data, such as normalization, z-score and log, but they did not solve the problem.
    Do you have any guess as to where we are failing? Could you suggest some ideas for us, please?
    Thanks, and sorry for any errors in my written English; I am from Brasil and do not have mastery of the English language.

    • Jason Brownlee April 7, 2018 at 6:40 am #

      It could be a million things, for example:

      – Perhaps you need more data.
      – Perhaps you need to relax your ideas about removing outliers (inputs won’t be Gaussian, invalidating most linear methods)
      – Perhaps you need to try a suite of nonlinear methods
      – Perhaps you need to use an alternate error metric to evaluate model skill.
      – …

      No one can give you the silver bullet answer, you are going to have to work to figure it out. Gather evidence to support your findings. Be prepared to change your mind on everything in order to get the most out of the problem.

      I list a ton of ideas to think about here:
      http://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/

  22. Gautam Karmakar April 8, 2018 at 4:12 pm #

    I have a problem that I need help or guidance on: finding similar questions given a new question. I have around 1500 unlabeled questions to start with. What is the best approach?

    • Jason Brownlee April 9, 2018 at 6:06 am #

      Good question.

      There are many excellent classical text similarity metrics. Perhaps experiment with them? I would recommend starting by surveying the field to see what your options are.
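
      As a hedged sketch of one such classical metric (the toy questions are invented; scikit-learn’s TfidfVectorizer plus cosine similarity does the same at scale):

```python
import math
from collections import Counter

questions = [
    "how do i train a neural network",
    "what is the best way to train a neural network",
    "how do i cook pasta",
]

docs = [q.split() for q in questions]
n = len(docs)
# Inverse document frequency: rare words weigh more.
idf = {w: math.log(n / sum(w in d for d in docs))
       for d in docs for w in d}

def tfidf(doc):
    # Term frequency weighted by idf, as a sparse dict.
    tf = Counter(doc)
    return {w: tf[w] * idf[w] for w in tf}

def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

vecs = [tfidf(d) for d in docs]
# The two neural-network questions overlap; the pasta question shares
# no words at all with the second one.
```

      Ranking the 1500 questions by cosine similarity to a new question is a reasonable unlabeled baseline before trying anything learned.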

  23. khaldoon April 10, 2018 at 2:50 am #

    How can I use information gain in text classification? What will the output be like? For example, if we have X_train.shape = (10, 50), what is the output shape for information gain? And can we use it for classifiers like NB or SVM?

  24. Jordan April 10, 2018 at 3:03 am #

    Hello,
    I’m just starting to learn machine learning, so I don’t have much idea about this field. I don’t even know if the problem I’m going to take up is a machine learning problem, but I’m asking it here anyway.

    There is an exam that is conducted once every year, and the rank list (result) is publicly available. I have the rank lists of the last 5 years (that is, 5 rank lists, where a rank list has the rank of each aspirant and the score they gained). Is it possible for me to predict the rank of a new user based on the score of a mock test he attended?

    Is this even a machine learning problem?

  25. Anthony Weche April 13, 2018 at 9:34 am #

    Hello Jason, what is the best algorithm to use for preventive maintenance? Take, for example, wanting to do predictive maintenance early enough, before the machine breaks down. Or wanting to see that the machine has worked sufficiently before changing parts or oil.
    Please advise. Thank you.

    • Jason Brownlee April 13, 2018 at 3:30 pm #

      Great question, perhaps look into survival analysis methods?
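
      To make the suggestion concrete, here is a minimal Kaplan-Meier survival estimate in plain Python (the machine lifetimes below are invented for illustration; a library such as lifelines provides production-grade versions of this and richer regression models):

```python
def kaplan_meier(durations, failed):
    """Estimate P(survival past t) from observed lifetimes.

    durations: hours each machine ran; failed: 1 if it broke down,
    0 if it was still running when last observed (censored).
    """
    # Failures before censorings at tied times, per the usual convention.
    events = sorted(zip(durations, failed), key=lambda e: (e[0], -e[1]))
    at_risk = len(events)
    curve, s = [], 1.0
    for t, d in events:
        if d:  # a breakdown at time t shrinks the survival estimate
            s *= (at_risk - 1) / at_risk
        curve.append((t, s))
        at_risk -= 1
    return curve

# Five hypothetical machines: three broke down, two were still fine.
curve = kaplan_meier([100, 150, 150, 200, 250], [1, 0, 1, 1, 0])
# After the last failure the estimate is 4/5 * 3/4 * 1/2 = 0.3, i.e. an
# estimated 30% chance a machine survives past 200 hours.
```

      A curve like this helps pick a maintenance interval: service before the survival probability drops below whatever risk the operation can tolerate.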

  26. Novoszáth András May 12, 2018 at 12:07 am #

    My actual issue: how do I recognize, as quickly as possible, what type of model I should start to use and tinker with? After that, the main problem becomes understanding the model, because of the notation with which it is described.

    Nonetheless, these things can be learned, and I am happy to do so in the long term. A big constraint on that, however, is the lack of accessible, worked-out, solid examples providing practice opportunities. There are some really good ones out there, but it always takes lots of time and trial and error to find them.

  27. Riad June 2, 2018 at 3:41 am #

    Hello Jason,
    I want to do sentiment analysis on a dataset of movie reviews, but I don’t know where to start. Please can you help me?

  28. Amandeep June 14, 2018 at 5:38 pm #

    Excellent article,
    Thanks for sharing this good information on machine learning…. 🙂

  29. Sevval June 29, 2018 at 5:02 pm #

    Thank you for your help and this helpful website. I sent an email to you.
    Thanks

  30. German July 5, 2018 at 1:55 am #

    Hi Jason,
    We are building a logger program to track user interaction with a core business system. The data set will have the system name, object type, description, order step and a process tag.

    What we would like to predict for example is,
    – In a new logger instance, which is the process tag?
    – Which is the next step?

    Thanks,
    German.
