How do you become a data scientist? I think that depends on where you are now and what you want to do as a data scientist. Nevertheless, DataCamp recently posted an infographic describing 8 easy steps to becoming a data scientist. In this post I want to highlight and review DataCamp’s infographic. […]
Search results for "Programming Machine Learning"
Interview: Discover the Methodology and Mindset of a Kaggle Master
What does it take to do well in competitive machine learning? To really dig into this question, you need to dig into the people who do well. In 2010 I participated in a Kaggle competition to predict the outcome of future chess games. It was a fascinating problem because it required you to […]
Discover Feature Engineering, How to Engineer Features and How to Get Good at It
Feature engineering is an informal topic, but one that is widely agreed to be key to success in applied machine learning. In creating this guide I went wide and deep and synthesized all of the material I could. You will discover what feature engineering is, what problem it solves, why it matters, how […]
What is R
R is perhaps one of the most powerful and most popular platforms for statistical programming and applied machine learning. When you get serious about machine learning, you will find your way into R. In this post, you will discover what R is, where it came from and some of its most important features. Let’s get […]
Data Science Screencasts: A Data Origami Review
Data Origami is a new website by Cameron Davidson-Pilon that provides data science screencasts. It is a cool idea and a cool site. Cameron was kind enough to give me access to the site so that I could review it. I watched all of the videos I could and wrote up all my notes, and […]
Data Cleaning: Turn Messy Data into Tidy Data
Data preparation is difficult because the process is not objective, or at least it does not feel that way. Questions like “what is the best form of the data to describe the problem?” are not objective. You have to think from the perspective of the problem you want to solve and try a few different […]
The Data Analytics Handbook: CEOs and Managers
In a previous blog post we looked at the ebook of interviews with data analysts and data scientists put together by Liou, Tao and Lin. In this blog post we look at the second book in the series, titled The Data Analytics Handbook: CEOs and Managers. What are managers looking for in a Data Analyst and […]
IPython from the shell to a book with a single tool with Fernando Perez
If you get serious about data analysis and machine learning in Python, then you will make good use of IPython notebooks. In this post we will review some takeaway points made by Fernando Perez, the creator of IPython, in a keynote presentation at SciPy 2013. The title of the talk was IPython: from the shell to […]
Project Spotlight: Event Recommendation in Python with Artem Yankov
This is a project spotlight with Artem Yankov. Could you please introduce yourself? My name is Artem Yankov, and I have worked as a software engineer for Badgeville for the last 3 years. I use Ruby and Scala there, although my prior background includes various languages such as Assembly, C/C++, Python, Clojure and JS. I […]
Make Better Predictions with Boosting, Bagging and Blending Ensembles in Weka
Weka is the perfect platform for studying machine learning. It provides a graphical user interface for exploring and experimenting with machine learning algorithms on datasets, without you having to worry about the mathematics or the programming. In a previous post we looked at how to design and run an experiment running 3 algorithms on a […]