Julia Evans recently wrote a post titled “Machine learning isn’t Kaggle competitions”. It was an interesting post because it pointed out an important truth. If you want to solve business problems using machine learning, doing well at Kaggle competitions is not a good indicator of that skill. The rationale is that the work required to […]
Machine Learning Communities
Online communities are invaluable in machine learning, regardless of your skill level. The reason is that, like programming, you never stop learning. You simply cannot know everything; there are always new algorithms, new data and new combinations to discover and practice. Communities help. You can get your questions answered, learn by answering other people’s questions […]
Books for Machine Learning with R
R is a powerful platform for data analysis and machine learning. It is my main workhorse for things like competitions and consulting work. The reason is the large number of powerful algorithms available, all on the one platform. In this post I want to point out some resources you can use to get started in […]
Practical Advice for Getting Started in Machine Learning
David Mimno is an assistant professor in the Information Sciences department at Cornell University. He has a background and interest in Natural Language Processing (NLP), specifically topic modeling. Notably, he is the chief maintainer of MALLET, the Java-based NLP library. I recently came across a blog post by David titled “Advice for students of machine […]
Quick and Dirty Data Analysis with Pandas
Before you can select and prepare your data for modeling, you need to understand what you’ve got to start with. If you’re using the Python stack for machine learning, a library that you can use to better understand your data is Pandas. In this post you will discover some quick and dirty recipes for […]
Prepare Data for Machine Learning in Python with Pandas
If you are using the Python stack for studying and applying machine learning, then the library that you will want to use for data analysis and data manipulation is Pandas. This post gives you a quick introduction to the Pandas library and points you in the right direction for getting started. Let’s get started. Data […]
Machine Learning Algorithm Recipes in scikit-learn
You have to get your hands dirty. You can read all of the blog posts and watch all the videos in the world, but you’re not going to really get machine learning until you start practicing. The scikit-learn Python library is very easy to get up and running. Nevertheless, I see a lot […]
The Best Machine Learning Algorithm
What is the best machine learning algorithm? I get this question a lot. Maybe even daily. Sometimes it’s a general question. I figure people want to make sure they are learning the one true machine learning algorithm and not wasting their time on anything less. Most other times it is with regard to a specific […]
Computer Hardware for Machine Learning
A question that comes up from time to time is: What hardware do I need to practice machine learning? When I was a student, there was a time when I was obsessed with more speed and more cores so I could run my algorithms faster and for longer. I have changed my perspective. Big hardware […]
How I Got Started In Machine Learning
I get a lot of emails asking about how I got interested in machine learning and about my background. I don’t think my story is special or interesting, but I’m happy to share it and honored I’m asked. This post feels a little self-indulgent. I figure it can be the definitive version of my story […]