In this post you will discover 3 recipes for penalized regression for the R platform. You can copy and paste the recipes in this post to get a jump-start on your own problem or to learn and practice with linear regression in R. Let’s get started. Each example in this post uses the longley dataset provided in the datasets […]
Linear Regression in R
In this post you will discover 4 recipes for linear regression for the R platform. You can copy and paste the recipes in this post to get a jump-start on your own problem or to learn and practice with linear regression in R. Let’s get started. Each example in this post uses the longley dataset […]
Applied Machine Learning Lessons from A Case Study of Passenger Survival Prediction
A valuable exercise when learning and practicing machine learning is to study how others apply methods and solve problems. It’s valuable because you can learn about new processes, software, graphs, and algorithms. But it is new ways of thinking about the process of solving problems with machine learning that is the most valuable part of […]
Java Machine Learning
Are you a Java programmer looking to get started with or practice machine learning? Writing programs that make use of machine learning is the best way to learn machine learning. You can write the algorithms yourself from scratch, but you can make a lot more progress if you leverage an existing open source library. In […]
How to Tune Algorithm Parameters with Scikit-Learn
Machine learning models are parameterized so that their behavior can be tuned for a given problem. Models can have many parameters and finding the best combination of parameters can be treated as a search problem. In this post, you will discover how to tune the parameters of machine learning algorithms in Python using the scikit-learn […]
Feature Selection in Python with Scikit-Learn
Not all data attributes are created equal. More is not always better when it comes to attributes or columns in your dataset. In this post you will discover how to select attributes in your data before creating a machine learning model using the scikit-learn library. Let’s get started. Update: For a more recent tutorial on feature selection in […]
Rescaling Data for Machine Learning in Python with Scikit-Learn
Your data must be prepared before you can build models. The data preparation process can involve three steps: data selection, data preprocessing and data transformation. In this post you will discover two simple data transformation methods you can apply to your data in Python using scikit-learn. Let’s get started. Update: See this post for a […]
How to Load Data in Python with Scikit-Learn
Before you can build machine learning models, you need to load your data into memory. In this post you will discover how to load data for machine learning in Python using scikit-learn. Let’s get started. Update March/2018: Added alternate link to download the dataset as the original appears to have been taken down. Packaged Datasets […]
Data Science Screencasts: A Data Origami Review
Data Origami is a new website by Cameron Davidson-Pilon that provides data science screencasts. It is a cool idea and a cool site. Cameron was kind enough to give me access to the site so that I could review it. I watched all of the videos I could and wrote up all my notes, and […]
Machine Learning is Kaggle Competitions
Julia Evans wrote a post recently titled “Machine learning isn’t Kaggle competitions”. It was an interesting post because it pointed out an important truth. If you want to solve business problems using machine learning, doing well at Kaggle competitions is not a good indicator of those skills. The rationale is that the work required to […]