If you get serious about data analysis and machine learning in Python, you will make good use of IPython notebooks.
The title of the talk was IPython: from the shell to a book with a single tool; the method behind the madness.
the purpose of computing is insight, not numbers
Fernando presents what he calls a schematic for the lifecycle of a scientific idea, as follows:
- Individual: exploratory work
- Collaborative: development
- Parallel: production runs
- Publication: with reproducible results
- Education: sharing what was learned
- GoTo Step 1
He stresses that this process is not linear: you need to be able to move backward and forward through the stages. He comments that IPython was designed in October or November 2001 to address this requirement.
IPython started as a better Python shell. It grew to include live interactive plotting, then live interactive parallel computing and embedding in applications. Interactivity is important: it is the ‘I’ in IPython. The platform has been through 6 iterations and has arrived at the IPython Notebook.
IPython Notebooks allow you to have cells of executable python code and markdown descriptions. This allows a single document to include the description, computation (such as Python scriptlets and programs) and artefacts (such as results and plots) from running the computation. This is a simple but very powerful communication tool.
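To make the idea of cells concrete, here is a minimal sketch of how a notebook file can be assembled by hand. It assumes the version 4 notebook JSON layout and uses only the standard library; the helper names `markdown_cell` and `code_cell` and the cell contents are my own illustration and are not taken from the talk.

```python
import json


def markdown_cell(text):
    """Return a markdown cell holding the prose that describes the work."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return a code cell that has not been executed yet (no outputs)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


# Assumed layout: a version 4 notebook is a JSON object with a list of cells.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        markdown_cell("# Squares\nA short description of the computation."),
        code_cell("squares = [n * n for n in range(5)]\nprint(squares)"),
    ],
}

# Serialize to the text that would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

Saving `notebook_json` to a file with an `.ipynb` extension should give a document that a notebook server can open. Once the code cell is run, its results are stored in the `outputs` list, which is how the description, the computation and the artefacts end up in one file.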
Fernando describes this as Literate Computing, a step beyond Knuth’s Literate Programming.
An important contribution is the IPython Notebook Viewer, which will render any notebook for you and present it on the web. This service, used in conjunction with open source notebook files on the web (such as on GitHub), is a powerful resource.
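As a rough illustration of how the viewer and GitHub fit together, the sketch below builds a viewer link for a notebook in a public repository. It assumes the `nbviewer.org/github/<user>/<repo>/blob/<ref>/<path>` URL layout used by the hosted viewer at the time of writing; the function name and the user, repository and file names are hypothetical placeholders.

```python
from urllib.parse import quote


def nbviewer_github_url(user, repo, ref, path):
    """Build a viewer link for a notebook stored in a public GitHub repo.

    Assumption: the hosted viewer accepts /github/<user>/<repo>/blob/<ref>/<path>.
    """
    # quote() percent-encodes spaces and other unsafe characters but keeps "/".
    return "https://nbviewer.org/github/{}/{}/blob/{}/{}".format(
        quote(user), quote(repo), quote(ref), quote(path)
    )


# Placeholder values, not a real repository.
example_url = nbviewer_github_url(
    "some-user", "some-repo", "main", "notebooks/My Analysis.ipynb"
)
print(example_url)
```

Anyone with the resulting link can read the rendered notebook in a browser without installing Python.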
Fernando then provides some cornerstone notebook examples to highlight the benefits of the technology.
Reproducible Research Paper
This paper was developed and written as an IPython notebook. It includes the descriptions, computations, results and even the configuration to spin up a cluster and execute the computations in parallel. The result is completely reproducible research.
Notebook-based Technical Blogging
Jake VanderPlas writes the blog Pythonic Perambulations: Musings and ramblings through the world of Python and beyond.
Jake blogs using IPython notebooks, which lets him combine descriptions, computation and the outputs of executed computations in the form of graphs.
Bayesian Methods for Hackers
The book Bayesian Methods for Hackers was developed by Cameron Davidson-Pilon as a series of IPython notebooks (one per chapter) that you can work through.
This is a high-quality book and an excellent use case and demonstration for the technology.
Fernando spends some time describing the impressive architecture of the IPython kernel and shell, and it is well worth the time to understand this material better.