A validation dataset is a sample of data held back from training your model that is used to estimate model skill while tuning the model’s hyperparameters. The validation dataset is different from the test dataset, which is also held back from the training of the model, but is instead used to give an unbiased […]
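As a rough illustration of the distinction described above, here is a minimal sketch of a three-way split using only the Python standard library. The function name `train_val_test_split` and the default fractions are assumptions made for this example; they do not come from the post.

```python
import random


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows and split them into train, validation and test lists.

    Hypothetical helper for illustration: the validation list is meant for
    tuning hyperparameters, and the test list is kept aside for a final,
    one-off estimate of model skill.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to < 1")

    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

In this sketch, hyperparameters would be compared using `val` only, and `test` would be touched once at the very end.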
Archive | Machine Learning Process
7 Ways to Handle Large Data Files for Machine Learning
Exploring and applying machine learning algorithms to datasets that are too large to fit into memory is pretty common. This leads to questions like: How do I load my multiple gigabyte data file? Algorithms crash when I try to run my dataset; what should I do? Can you help me with out-of-memory errors? In this […]
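One common way to sidestep out-of-memory errors is to stream the file rather than load it whole. The sketch below is only an assumption about what such an approach might look like, not necessarily one of the post's seven suggestions; the function name `column_mean` and the file name in the comment are hypothetical.

```python
import csv


def column_mean(lines, column):
    """Compute the mean of one numeric CSV column one row at a time.

    `lines` can be any iterable of text lines, such as an open file object,
    so only a single row is held in memory at once.
    """
    total = 0.0
    count = 0
    for row in csv.DictReader(lines):
        total += float(row[column])
        count += 1
    if count == 0:
        raise ValueError("no data rows found")
    return total / count


# Small in-memory demonstration; with a real file you would pass the file
# object instead, e.g. (hypothetical file name):
#   with open("big_data.csv", newline="") as f:
#       print(column_mean(f, "price"))
sample = ["id,price", "1,10.0", "2,20.0", "3,30.0"]
print(column_mean(sample, "price"))
```

The same pattern extends to any statistic that can be updated incrementally, such as counts, sums, minima and maxima.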
How to Train a Final Machine Learning Model
The machine learning model that we use to make predictions on new data is called the final model. There can be confusion in applied machine learning about how to train a final model. This confusion is common among beginners to the field, who ask questions such as: How do I predict with cross validation? Which […]
How to Get Started with Kaggle
4-Step Process for Getting Started and Getting Good at Competitive Machine Learning. Kaggle is a community and site for hosting machine learning competitions. Competitive machine learning can be a great way to develop and practice your skills, as well as demonstrate your capabilities. In this post, you will discover a simple 4-step process to get […]
10 Standard Datasets for Practicing Applied Machine Learning
The key to getting good at applied machine learning is practicing on lots of different datasets. This is because each problem is different, requiring subtly different data preparation and modeling methods. In this post, you will discover 10 top standard machine learning datasets that you can use for practice. Let’s dive in. Update Mar/2018: Added […]
Machine Learning Performance Improvement Cheat Sheet
32 Tips, Tricks and Hacks That You Can Use To Make Better Predictions. The most valuable part of machine learning is predictive modeling. This is the development of models that are trained on historical data and make predictions on new data. And the number one question when it comes to predictive modeling is: How can […]
Deploy Your Predictive Model To Production
5 Best Practices For Operationalizing Machine Learning. Not all predictive models are at Google-scale. Sometimes you develop a small predictive model that you want to put in your software. I recently received this reader question: Actually, there is a part that is missing in my knowledge about machine learning. All tutorials give you the steps up […]
Simple 3-Step Methodology To The Best Machine Learning Algorithm
How do you choose the best algorithm for your dataset? Machine learning is a problem of induction where general rules are learned from specific observed data from the domain. It is infeasible (impossible?) to know what representation or what algorithm to use to best learn from the data on a specific problem beforehand, without knowing the […]
How to Use a Machine Learning Checklist to Get Accurate Predictions, Reliably
How do you get accurate results using machine learning on problem after problem? The difficulty is that each problem is unique, requiring different data sources, features, algorithms, algorithm configurations and on and on. The solution is to use a checklist that guarantees a good result every time. In this post you will discover a checklist […]
Choosing Machine Learning Algorithms: Lessons from Microsoft Azure
Microsoft recently launched support for machine learning in their Azure cloud computing platform. Buried in some of their technical documentation for the platform are some resources that you may find useful for thinking about what machine learning algorithm to use in different situations. In this post we take a look at the Microsoft recommendations for […]