The question of how to learn a machine learning algorithm has come up a few times on the email list. In this post I’ll share with you the strategy I have been using for years to learn and build up a structured description of an algorithm in a step-by-step manner that I can add to, […]
Hands on Big Data by Peter Norvig
When I’m asked about resources for big data, I typically recommend people watch Peter Norvig’s Big Data tech talk to Facebook Engineering from 2009. It’s fantastic because he’s a great communicator and clearly presents the deceptively simple thesis of big data. In this blog post I summarize this video for you […]
Reproducible Machine Learning Results By Default
It is good practice to have reproducible outcomes in software projects. It might even be standard practice by now; I hope it is. You can take any developer off the street and they should be able to follow your process to check out the code base from revision control and make a build of the […]
What is Data Mining and KDD
I am very interested in processes. I want to know good ways to do things, even the best way to do things if possible. Even if you don’t have skill or deep understanding, process can get you a long way. It can lead the way and skill and deep understanding can follow. At least, I […]
4 Self-Study Machine Learning Projects
There are many paths into the field of machine learning and most start with theory. If you are a programmer then you already have the skills to decompose problems into their constituent parts and to prototype small projects in order to learn new technologies, libraries and methods. These are important skills for any professional programmer […]
How to Use Machine Learning Results
Once you have found and tuned a viable model of your problem, it is time to make use of that model. You may need to revisit your why and remind yourself what form of solution you need for the problem you are solving. The problem is not addressed until you do something with the results. […]
How to Identify Outliers in your Data
Bojan Miletic asked a question about outlier detection in datasets when working with machine learning algorithms. This post is in answer to his question. If you have a question about machine learning, sign up for the newsletter and reply to an email, or use the contact form and ask. I will answer your question and may […]
How to Improve Machine Learning Results
Having one or two algorithms that perform reasonably well on a problem is a good start, but sometimes you may be incentivised to get the best result you can given the time and resources you have available. In this post, you will review methods you can use to squeeze out extra performance and improve the […]
How to Evaluate Machine Learning Algorithms
Once you have defined your problem and prepared your data, you need to apply machine learning algorithms to the data in order to solve your problem. You can spend a lot of time choosing, running and tuning algorithms. You want to make sure you are using your time effectively to get closer to your goal. […]
How to Prepare Data For Machine Learning
Machine learning algorithms learn from data. It is critical that you feed them the right data for the problem you want to solve. Even if you have good data, you need to make sure that it is in a useful scale and format, and even that meaningful features are included. In this post you will learn […]