Before you can build machine learning models, you need to load your data into memory. In this post you will discover how to load data for machine learning in Python using scikit-learn. Let’s get started. Update March/2018: Added alternate link to download the dataset as the original appears to have been taken down. Packaged Datasets […]
Search results for "Machine Learning"
Quick and Dirty Data Analysis with Pandas
Before you can select and prepare your data for modeling, you need to understand what you’ve got to start with. If you’re using the Python stack for machine learning, a library that you can use to better understand your data is Pandas. In this post you will discover some quick and dirty recipes for […]
IPython from the shell to a book with a single tool with Fernando Perez
If you get serious about data analysis and machine learning in Python, you will make good use of IPython notebooks. In this post we will review some takeaway points made by Fernando Perez, the creator of IPython, in a keynote presentation at SciPy 2013. The title of the talk was IPython: from the shell to […]
Case Study: Predicting the Onset of Diabetes Within Five Years (part 3 of 3)
This is a guest post by Igor Shvartser, a clever young student I have been coaching. This post is part 3 in a 3-part series on modeling the famous Pima Indians Diabetes dataset that will investigate improvements to the classification accuracy and present final results (update: download from here). In Part 1 we defined the problem […]
BigML Tutorial: Develop Your First Decision Tree and Make Predictions
BigML is a new and interesting machine-learning-as-a-service company based in Corvallis, Oregon, USA. In a previous post, we reviewed the BigML service, its key features, and the ways in which you could use it in your business, on your side project, or to present to clients. In this […]
Case Study: Predicting the Onset of Diabetes Within Five Years (part 2 of 3)
This is a guest post by Igor Shvartser, a clever young student I have been coaching. This post is part 2 in a 3-part series on modeling the famous Pima Indians Diabetes dataset (update: download from here). In Part 1 we defined the problem and looked at the dataset, describing observations from the patterns we […]
Project Spotlight: Face Recognition with Shashank Singh
This is a project spotlight with Shashank Singh, a programmer and machine learning enthusiast. Could you please introduce yourself? I did a Bachelor of Technology in Computer Science. I co-founded a startup at 23 and spectacularly crashed it by my 26th birthday. After that I was feeling particularly low and pretty dry of inspiration for quite some time. I […]
Case Study: Predicting the Onset of Diabetes Within Five Years (part 1 of 3)
This is a guest post by Igor Shvartser, a clever young student I have been coaching. This post is part 1 in a 3-part series on modeling the famous Pima Indians Diabetes dataset that will introduce the problem and the data. Part 2 will investigate feature selection and spot checking algorithms and Part 3 in […]
Privacy
Who we are We are Machine Learning Mastery Pty. Ltd. Our website address is: https://machinelearningmastery.com. What personal data we collect and why we collect it Comments When visitors leave comments on the site we collect the data shown in the comments form, and also the visitor’s IP address and browser user agent string to help […]
Classification Accuracy is Not Enough: More Performance Measures You Can Use
When you build a model for a classification problem, you almost always want to look at the accuracy of that model: the number of correct predictions as a fraction of all predictions made. This is the classification accuracy. In a previous post, we looked at evaluating the robustness of a model for making predictions on unseen […]