Where does theory fit into a top-down approach to studying machine learning? In the traditional approach to teaching machine learning, theory comes first, requiring an extensive background in mathematics to understand it. In my approach to teaching machine learning, I start by teaching you how to work problems end-to-end and deliver results. […]
Search results for "regression"
Practice Machine Learning with Datasets from the UCI Machine Learning Repository
Where can you get good datasets to practice machine learning? You want datasets that are real-world, so that they are interesting and relevant, yet small enough for you to review in Excel and work through on your desktop. In this post you will discover a database of high-quality, real-world, and well-understood machine learning datasets that you […]
8 Tactics to Combat Imbalanced Classes in Your Machine Learning Dataset
Has this happened to you? You are working on your dataset. You create a classification model and get 90% accuracy immediately. “Fantastic,” you think. You dive a little deeper and discover that 90% of the data belongs to one class. Damn! This is an example of an imbalanced dataset and the frustrating results it can […]
Linear Algebra for Machine Learning
You do not need to learn linear algebra before you get started in machine learning, but at some point you may wish to dive deeper. In fact, if there were one area of mathematics I would suggest improving before the others, it would be linear algebra. It will give you the tools to help you […]
Practical Machine Learning Books for the Holidays
O’Reilly books have a reputation for being practical, hands-on, and useful, specifically the nutshell books and the so-called animal books. O’Reilly has a few new books out in time for the holidays on the topic of machine learning. I don’t want to bore you with reviews; Amazon has plenty of those. In this post we take […]
Use Random Forest: Testing 179 Classifiers on 121 Datasets
If you don’t know what algorithm to use on your problem, try a few. Alternatively, you could just try Random Forest and maybe a Gaussian SVM. In a recent study, these two algorithms were shown to be the most effective when raced against nearly 200 other algorithms, averaged over more than 100 datasets. In […]
Better Naive Bayes: 12 Tips To Get The Most From The Naive Bayes Algorithm
Naive Bayes is a simple and powerful technique that you should be testing and using on your classification problems. It is simple to understand, gives good results, and is fast at both building a model and making predictions. For these reasons alone you should take a closer look at the algorithm. In a recent blog post, you […]
Why Aren’t My Results As Good As I Thought? You’re Probably Overfitting
We all know the satisfaction of running an analysis and seeing the results come back the way we want them to: 80% accuracy; 85%; 90%? The temptation is strong just to turn to the Results section of the report we’re writing, and put the numbers in. But wait: as always, it’s not that straightforward. Succumbing […]
How To Get Baseline Results And Why They Matter
In my courses and guides, I teach the preparation of a baseline result before diving into spot-checking algorithms. A student of mine recently asked: If a baseline is not calculated for a problem, will it make the results of other algorithms questionable? He went on to ask: If other algorithms do not give better accuracy […]
Take Control By Creating Targeted Lists of Machine Learning Algorithms
Any book on machine learning will list and describe dozens of machine learning algorithms. Once you start using tools and libraries, you will discover dozens more. This can really wear you down if you think you need to know about every possible algorithm out there. A simple trick to tackle this feeling and take some […]