The book Applied Predictive Modeling teaches practical machine learning theory with code examples in R. It is an excellent book and highly recommended to machine learning practitioners and users of R for machine learning. In this post you will discover the benefits of this book and how it can help you become a better machine […]
Search results for "Deep Learning"
What is R
R is perhaps one of the most powerful and most popular platforms for statistical programming and applied machine learning. When you get serious about machine learning, you will find your way into R. In this post, you will discover what R is, where it came from and some of its most important features. Let’s get […]
Going Beyond Predictions
The predictions you make with a predictive model do not matter; it is the use of those predictions that matters. Jeremy Howard was the President and Chief Scientist of Kaggle, the competitive machine learning platform. In 2012 he presented at the O’Reilly Strata conference on what he called the Drivetrain Approach for building “data products” […]
Clever Application Of A Predictive Model
What if you could use a predictive model to find new combinations of attributes that do not exist in the data but could be valuable? In Chapter 10 of Applied Predictive Modeling, Kuhn and Johnson provide a case study that does just this. It’s a fascinating and creative example of how to use a predictive […]
Data Cleaning: Turn Messy Data into Tidy Data
Data preparation is difficult because the process is not objective, or at least it does not feel that way. Questions like “what is the best form of the data to describe the problem?” are not objective. You have to think from the perspective of the problem you want to solve and try a few different […]
Case Study: Predicting the Onset of Diabetes Within Five Years (part 2 of 3)
This is a guest post by Igor Shvartser, a clever young student I have been coaching. This post is part 2 in a 3-part series on modeling the famous Pima Indians Diabetes dataset (update: download from here). In Part 1 we defined the problem and looked at the dataset, describing observations from the patterns we […]
Project Spotlight: Face Recognition with Shashank Singh
This is a project spotlight with Shashank Singh, a programmer and machine learning enthusiast. Could you please introduce yourself? I did Bachelors of Technology in Computer Science. I co-founded a startup at 23, spectacularly crashed it by 26th birthday. After that I was feeling particularly low and pretty dry of inspiration for quite some time. I […]
Feature Selection to Improve Accuracy and Decrease Training Time
When working on a problem, you are always looking to get the most out of the data that you have available. You want the best accuracy you can get. Typically, the biggest wins are in better understanding the problem you are solving. This is why I stress that you spend so much time up front defining your […]
A Simple Intuition for Overfitting, or Why Testing on Training Data is a Bad Idea
When you first start out with machine learning, you load a dataset and try models. You might think to yourself, why can’t I just build a model with all of the data and evaluate it on the same dataset? It seems reasonable. More data to train the model is better, right? Evaluating the model and […]
Make Better Predictions with Boosting, Bagging and Blending Ensembles in Weka
Weka is the perfect platform for studying machine learning. It provides a graphical user interface for exploring and experimenting with machine learning algorithms on datasets, without you having to worry about the mathematics or the programming. In a previous post we looked at how to design and run an experiment running 3 algorithms on a […]