IPython from the shell to a book with a single tool with Fernando Perez

If you get serious about data analysis and machine learning in Python, you will make good use of IPython notebooks.

In this post, we will review some takeaway points made by Fernando Perez, the creator of IPython, in a keynote presentation at SciPy 2013.

The title of the talk was IPython: from the shell to a book with a single tool; the method behind the madness.

Let’s get started.

Fernando opens the talk with an excellent quote by Richard Hamming (1962) from the preface of Numerical Methods for Scientists and Engineers that bears repeating:

the purpose of computing is insight, not numbers

Fernando presents what he calls a schematic for the lifecycle of a scientific idea, as follows:

  1. Individual: exploratory work
  2. Collaborative: development
  3. Parallel: production runs
  4. Publication: with reproducible results
  5. Education: sharing what was learned
  6. GoTo Step 1
Lifecycle of a scientific idea

He stresses that this process is not linear: you need to be able to move backward and forward between the stages. He comments that IPython was designed in October or November 2001 to address this requirement.

IPython started as a better Python shell. It grew to include live interactive plotting, then live interactive parallel computing and embedding in applications. Interactivity is important; it is the ‘I’ in IPython. The platform has been through six iterations and has arrived at the IPython Notebook.

IPython Notebooks allow you to combine cells of executable Python code with Markdown descriptions. This allows a single document to include the description, the computation (such as Python scriptlets and programs) and the artefacts (such as results and plots) from running the computation. This is a simple but very powerful communication tool.
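To make the idea of one document holding description, computation and artefacts concrete, here is a minimal sketch that builds a tiny notebook by hand. It assumes the version 4 notebook JSON layout (a top-level list of cells, each marked as markdown or code, with code cells carrying their outputs). The helper names run_and_capture, markdown_cell and code_cell are hypothetical and made up for this illustration, and a real notebook runs code through a kernel rather than through exec.

```python
import contextlib
import io
import json


def run_and_capture(source):
    """Run a snippet and return whatever it printed (a stand-in for a kernel)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


def markdown_cell(text):
    """Description: a markdown cell holds prose, not code."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source, execution_count=1):
    """Computation plus artefact: the source and the output it produced."""
    stdout = run_and_capture(source)
    return {
        "cell_type": "code",
        "execution_count": execution_count,
        "metadata": {},
        "outputs": [{"output_type": "stream", "name": "stdout", "text": stdout}],
        "source": source,
    }


# Assumed layout: notebook format version 4, minor version 4.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        markdown_cell("# Sum of the first ten integers\nA tiny worked example."),
        code_cell("print(sum(range(10)))"),
    ],
}

# The whole document, prose and results included, is plain JSON text.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving notebook_json to a file with an .ipynb extension should give something a notebook viewer can open, although this has only been sketched here and not checked against every tool.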

Fernando describes this as Literate Computing, a step beyond Knuth’s Literate Programming.

An important contribution is the IPython Notebook Viewer, which will render any notebook and present it on the web. Used in conjunction with open source notebook files hosted on the web (such as on GitHub), this service is a powerful resource.

Fernando then provides some cornerstone notebook examples to highlight the benefits of the technology.

Reproducible Research Paper

The paper Collaborative cloud-enabled tools allow rapid, reproducible biological insights, and the associated materials.

This paper was developed and written as an IPython notebook. It includes the descriptions, computations, results and even the configuration to spin up a cluster to execute the computations in parallel. The result is completely reproducible research.

Notebook-based Technical Blogging

The blog Pythonic Perambulations, Musings and ramblings through the world of Python and beyond by Jake VanderPlas.

Screenshot from Pythonic Perambulations by Jake VanderPlas

Jake blogs using IPython notebooks, which lets him combine descriptions, computation and the outputs of executed computations in the form of graphs.

Bayesian Methods for Hackers

The book Bayesian Methods for Hackers was developed by Cameron Davidson-Pilon as a series of IPython notebooks (one per chapter) that you can work through.

Bayesian Methods for Hackers

This is a high-quality book and an excellent use case and demonstration for the technology.

Fernando spends some time describing the impressive architecture of the IPython kernel and shell, and it is well worth the time to understand this material better.

For more information, you can check out the IPython home page and this curated gallery of notable IPython Notebooks.

2 Responses to IPython from the shell to a book with a single tool with Fernando Perez

  1. Jesús Martínez, March 31, 2018 at 1:56 am

    IPython (now Jupyter Notebooks) is an excellent tool and I’d dare to say it changed the way data scientists, data analysts and machine learning enthusiasts share, improve, and complement their knowledge. Many companies like Kaggle, DataBricks, Zeppelin, and Skymind have adopted this way of presenting our work in an interactive way.

    Have you used Jupyter Notebooks with another language besides Python? How was your experience? Thank you very much for your time and attention.

    Keep up the good work!

    • Jason Brownlee, March 31, 2018 at 6:37 am

      I avoid them and generally recommend that students work from the command line instead, as the notebooks can introduce environment issues and hide true errors. I got a lot of emails from confused beginners when I recommended using them.