Archive | Machine Learning Process

Choosing Machine Learning Algorithms: Lessons from Microsoft Azure

By Jason Brownlee on August 12, 2019 in Machine Learning Process

Microsoft recently launched support for machine learning in their Azure cloud computing platform. Buried in some of their technical documentation for the platform are some resources that you may find useful for thinking about what machine learning algorithm to use in different situations. In this post we take a look at the Microsoft recommendations for […]

What To Do During Machine Learning Model Runs

By Jason Brownlee on June 7, 2016 in Machine Learning Process

There was a recent question that asked “How to not waste-time/procrastinate while ml scripts are running?” I think this is an important question. I think answers to this question show a level of organization or maturity in your approach to work. I left a small comment on this question, but in this post I elaborate […]

Common Pitfalls In Machine Learning Projects

By Jason Brownlee on June 7, 2016 in Machine Learning Process

In a recent presentation, Ben Hamner described the common pitfalls in machine learning projects he and his colleagues have observed during competitions on Kaggle. The talk was titled “Machine Learning Gremlins” and was presented in February 2014 at Strata. In this post we take a look at the pitfalls from Ben’s talk, what they look like and how […]

How To Work Through A Problem Like A Data Scientist

By Jason Brownlee on August 15, 2020 in Machine Learning Process

In a 2010 post, Hilary Mason and Chris Wiggins described the OSEMN process as a taxonomy of tasks that a data scientist should feel comfortable working on. The title of the post was “A Taxonomy of Data Science” on the now-defunct dataists blog. This process has also been used as the structure of a […]

Lessons Learned from Building Machine Learning Systems

By Jason Brownlee on September 5, 2016 in Machine Learning Process

In a recent presentation at MLConf, Xavier Amatriain described 10 lessons that he has learned about building machine learning systems as the Research/Engineering Manager at Netflix. In this post you will discover these 10 lessons in a summary from his talk and slides. 10 Lessons Learned: The 10 lessons that Xavier presents can be summarized as […]

Assessing and Comparing Classifier Performance with ROC Curves

By Jason Brownlee on March 5, 2020 in Machine Learning Process

The most commonly reported measure of classifier performance is accuracy: the percent of correct classifications obtained. This metric has the advantage of being easy to understand and makes comparison of the performance of different classifiers trivial, but it ignores many of the factors which should be taken into account when honestly assessing the performance of […]

Understand Your Problem and Get Better Results Using Exploratory Data Analysis

By Jason Brownlee on August 15, 2020 in Machine Learning Process

You often jump from problem to problem in applied machine learning and you need to get up to speed on a new dataset, fast. A classical and under-utilised approach that you can use to quickly build a relationship with a new data problem is Exploratory Data Analysis. In this post you will discover Exploratory Data Analysis (EDA), […]

Data Management Matters And Why You Need To Take It Seriously

By Jason Brownlee on March 5, 2020 in Machine Learning Process

We live in a world drowning in data. Internet tracking, stock market movement, genome sequencing technologies and their ilk all produce enormous amounts of data. Most of this data is someone else’s responsibility, generated by someone else, stored in someone else’s database, which is maintained and made available by… you guessed it… someone else. But. […]

Why Aren’t My Results As Good As I Thought? You’re Probably Overfitting

By Jason Brownlee on August 15, 2020 in Machine Learning Process

We all know the satisfaction of running an analysis and seeing the results come back the way we want them to: 80% accuracy; 85%; 90%? The temptation is strong just to turn to the Results section of the report we’re writing, and put the numbers in. But wait: as always, it’s not that straightforward. Succumbing […]

How To Get Baseline Results And Why They Matter

By Jason Brownlee on June 27, 2017 in Machine Learning Process

In my courses and guides, I teach the preparation of a baseline result before diving into spot checking algorithms. A student of mine recently asked: If a baseline is not calculated for a problem, will it make the results of other algorithms questionable? He went on to ask: If other algorithms do not give better accuracy […]
