In a recent presentation, Ben Hamner described the common pitfalls in machine learning projects he and his colleagues have observed during competitions on Kaggle. The talk was titled “Machine Learning Gremlins” and was presented in February 2014 at Strata. In this post we take a look at the pitfalls from Ben’s talk, what they look like and how […]
Linear Algebra for Machine Learning
You do not need to learn linear algebra before you get started in machine learning, but at some point you may wish to dive deeper. In fact, if there were one area of mathematics I would suggest improving before the others, it would be linear algebra. It will give you the tools to help you […]
Machine Learning With Statistical And Causal Methods
In November 2014, Bernhard Scholkopf was awarded the Milner Award by the Royal Society for his contributions to machine learning. In accepting the award, he gave a layman’s presentation of his work on statistical and causal machine learning methods titled “Statistical and causal approaches to machine learning”. It’s an excellent one-hour talk and I highly recommend that you watch […]
How To Work Through A Problem Like A Data Scientist
In a 2010 post, Hilary Mason and Chris Wiggins described the OSEMN process as a taxonomy of tasks that a data scientist should feel comfortable working on. The title of the post was “A Taxonomy of Data Science” on the now-defunct dataists blog. This process has also been used as the structure of a […]
How To Get Started In Machine Learning: A Self-Study Blueprint
How do you get started in machine learning, specifically Deep Learning? This question was asked recently in the machine learning subreddit. The original poster had completed the Coursera Machine Learning course but felt like they did not have enough of a background to get started in Deep Learning. I wrote a […]
Practical Machine Learning Books for the Holidays
O’Reilly books have a reputation for being practical, hands-on and useful, specifically the nutshell books and the so-called animal books. O’Reilly have a few new books out in time for the holidays on the topic of machine learning. I don’t want to bore you with reviews; Amazon has plenty of those. In this post we take […]
Use Random Forest: Testing 179 Classifiers on 121 Datasets
If you don’t know what algorithm to use on your problem, try a few. Alternatively, you could just try Random Forest and maybe a Gaussian SVM. In a recent study these two algorithms were demonstrated to be the most effective when raced against nearly 200 other algorithms averaged over more than 100 data sets. In […]
Better Naive Bayes: 12 Tips To Get The Most From The Naive Bayes Algorithm
Naive Bayes is a simple and powerful technique that you should be testing and using on your classification problems. It is simple to understand, gives good results, and is fast at both building a model and making predictions. For these reasons alone you should take a closer look at the algorithm. In a recent blog post, you […]
Lessons Learned from Building Machine Learning Systems
In a recent presentation at MLConf, Xavier Amatriain described 10 lessons that he has learned about building machine learning systems as the Research/Engineering Manager at Netflix. In this post you will discover these 10 lessons in a summary from his talk and slides. The 10 lessons that Xavier presents can be summarized as […]
Get Your Dream Job in Machine Learning by Delivering Results
You can rise up and take on your desire to become a machine learning practitioner and data scientist. You have to work hard, learn the skills and demonstrate that you can deliver results, but you don’t need a fancy degree or a fancy background. In this post I want to demonstrate that this is […]