The Python conference PyCon 2014 was held recently and the videos from the conference are online.
I have been working my way through the interesting machine learning talks and will share a few of them here over the coming weeks.
A great talk if you are starting out in data science or machine learning in Python was given by Melanie Warrick, titled How to Get Started with Machine Learning. It’s about 25 minutes long. The abstract of the talk is:
Provide an introduction to machine learning to clarify what it is, what it’s not and how it fits into this picture of all the hot topics around data analytics and big data.
Computers…ability to learn without… explicit programming
She positions machine learning as the toolkit used in Artificial Intelligence and Data Science. Relatedly, she describes big data as data beyond the ability of common technology to capture and curate. This definition sits well with me. Although the talk is an introduction to machine learning, the focus is on the application of machine learning in data science.
Melanie describes the four main data science roles as data lead, data creative, data developer and data researcher and uses a graph to indicate the amount of machine learning performed by each role. She also describes a data science project workflow.
She provides a cute example of linear regression on a 2D dataset (head size vs brain weight) using scikit-learn. Usefully, she summarizes Python tools in categories:
- Explore data: pandas, statsmodels, matplotlib, numpy, unix
- Build model: scikit-learn, numpy, pandas, scipy
- Test model: scikit-learn, matplotlib
- Data products: API, Flask, Django
- Visualize: D3, matplotlib, vincent and vega, ggplot
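To make the linear regression example and the build/test steps above concrete, here is a minimal sketch using scikit-learn. It is not the code from the talk: the data is synthetic, and the variable names, units, slope and intercept are assumptions chosen purely for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the dataset used in the talk).
# The slope, intercept and noise level below are made-up values.
rng = np.random.default_rng(0)
head_size = rng.uniform(3000, 4500, size=100)   # hypothetical head sizes
noise = rng.normal(0, 20, size=100)
brain_weight = 0.25 * head_size + 350.0 + noise  # hypothetical brain weights

# scikit-learn expects a 2D feature matrix: one row per sample.
X = head_size.reshape(-1, 1)
y = brain_weight

# Build model: fit on a training split only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression()
model.fit(X_train, y_train)

# Test model: score (R^2) on data the model has not seen.
slope = model.coef_[0]
intercept = model.intercept_
r2_test = model.score(X_test, y_test)

# Predict the response for one new observation.
predicted = model.predict(np.array([[4000.0]]))[0]

print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r2_test:.3f}")
print(f"predicted brain weight for head size 4000: {predicted:.1f}")
```

Holding out a test split keeps the reported score honest, which mirrors the separate "build model" and "test model" categories in the list above.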
There is also a question at the end about contrasting Python and R, and she makes the apt comment that sticking with one language (i.e. Python) means you don’t need to change languages between research and production.
The talk is on YouTube and on the pyvideo archive. You can review the slides from the talk and download the sample code from GitHub. Melanie maintains a blog at nyghtowl.io and you can review the post on her talk here.