How to Create a Linux Virtual Machine For Machine Learning Development With Python 3

Linux is an excellent environment for machine learning development with Python.

The tools can be installed quickly and easily and you can develop and run large models directly.

In this tutorial, you will discover how to create and set up a Linux virtual machine for machine learning with Python.

After completing this tutorial, you will know:

  • How to download and install VirtualBox for managing virtual machines.
  • How to download and set up Fedora Linux.
  • How to install a SciPy environment for machine learning in Python 3.

This tutorial is suitable if your base operating system is Windows, Mac OS X, or Linux.

Let’s get started.

Benefits of a Linux Virtual Machine

There are a number of reasons that you may want to use a Linux virtual machine for Python machine learning development.

For example, below are five top benefits of using a virtual machine:

  • To use tools not available on your system (if you’re on Windows).
  • To install and use machine learning tools without impacting your local environment (e.g. use Python 3 tools).
  • To have highly customized environments for different projects (Python2 and Python3).
  • To save the state of the machine and pick up exactly where you left off (jump from machine to machine).
  • To share a development environment with other developers (set up once and reuse many times).

Perhaps the most beneficial point is the first: being able to easily use machine learning tools that are not supported in your environment.

I’m an OS X user, and even though machine learning tools can be installed using brew and macports, I still find it easier to set up and use Linux virtual machines for machine learning development.

Overview

This tutorial is broken down into 3 parts:

  1. Download and Install VirtualBox.
  2. Download and Install Fedora Linux in a Virtual Machine.
  3. Install Python Machine Learning Environment.

1. Download and Install VirtualBox

VirtualBox is a free, open-source platform for creating and managing virtual machines.

Once installed, you can create all the virtual machines you like, as long as you have the ISO images or CDs to install from.

[Figure: Download VirtualBox]

  • 3. Choose binaries for your workstation.
  • 4. Install the software for your system and follow the installation instructions.

[Figure: Install VirtualBox]

  • 5. Open the VirtualBox software and confirm it works.

[Figure: Start VirtualBox]

2. Download and Install Fedora Linux

I chose Fedora Linux because I think it is a kinder and gentler Linux than some.

It is a leading-edge distribution sponsored by Red Hat, intended for workstations and developers.

2.1 Download the Fedora ISO Image

Let’s start off by downloading the ISO for Fedora Linux. In this case, we will use the 64-bit version of Fedora 25.

[Figure: Download Fedora]

  • 5. You should now have an ISO file with the name:
    • “Fedora-Workstation-Live-x86_64-25-1.3.iso”.

We are now ready to create the VM in VirtualBox.

2.2 Create the Fedora Virtual Machine

Now, let’s create the Fedora virtual machine in VirtualBox.

  • 1. Open the VirtualBox software.
  • 2. Click the “New” button.
  • 3. Select the Name and operating system.
    • name: Fedora25
    • type: Linux
    • version: Fedora (64-bit)
    • Click “Continue”.

[Figure: Create Fedora VM Name and Operating System]

  • 4. Configure the Memory Size
    • 2048 MB
  • 5. Configure the Hard Disk
    • Hard disk: Create a virtual hard disk now
    • Hard disk file type: VDI (VirtualBox Disk Image)
    • Storage on physical hard disk: Dynamically allocated
    • File location and size: 10GB
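
The settings above can also be expressed as commands for VirtualBox's command-line tool, which may help if you want to recreate the VM later. The following is only a minimal sketch, assuming VBoxManage is installed alongside VirtualBox; the helper name build_vm_commands and the disk filename Fedora25.vdi are hypothetical, and the script prints the commands rather than running them.

```python
import shlex


def build_vm_commands(name="Fedora25", memory_mb=2048, disk_mb=10 * 1024):
    """Return VBoxManage commands that mirror the GUI settings above.

    Hypothetical helper: the commands are built but never executed.
    """
    disk_file = "%s.vdi" % name  # assumed filename for the virtual disk
    return [
        # Register a new 64-bit Fedora VM.
        ["VBoxManage", "createvm", "--name", name,
         "--ostype", "Fedora_64", "--register"],
        # Set the memory size in MB.
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb)],
        # Create a dynamically allocated VDI disk (size is given in MB).
        ["VBoxManage", "createmedium", "disk", "--filename", disk_file,
         "--size", str(disk_mb), "--format", "VDI", "--variant", "Standard"],
    ]


if __name__ == "__main__":
    for command in build_vm_commands():
        print(shlex.join(command))
```

This sketch stops short of attaching the disk and the ISO image to the VM, so the graphical steps in this tutorial remain the reference procedure.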

We are now ready to install Fedora from the ISO image.

2.3 Install Fedora Linux

Now, let’s install Fedora Linux on the new virtual machine.

  • 1. Select the new virtual machine and click the “Start” button.
  • 2. Click the folder icon and choose the Fedora ISO file:
    • “Fedora-Workstation-Live-x86_64-25-1.3.iso”.

[Figure: Install Fedora]

  • 3. Click the “Start” button.
  • 4. Select the first option “Start Fedora-Workstation-Live 25” and press the Enter key.
  • 5. Hit the “Esc” key to skip the check.
  • 6. Select “Live System User”.
  • 7. Select “Install to Hard Drive”.

[Figure: Install Fedora to Hard Drive]

  • 8. Complete “Language Selection” (English)
  • 9. Complete “Installation Destination” (“ATA VBOX HARDDISK”).
    • You may need to wait one minute for the VM to create the hard disk.

[Figure: Install on Virtual Hard Disk]

  • 10. Click “Begin Installation”.
  • 11. Set root password.
  • 12. Create a user for yourself.
    • Note down the username and password (so that you can use it later).
    • Tick “Make this user administrator” (so you can install software).

[Figure: Create a New User]

  • 13. Wait for the installation to complete (about 5 minutes).
  • 14. Click “Quit”, then click the power icon in the top right and select power off.

2.4 Finalize Fedora Linux Installation

Fedora Linux has been installed; let’s finalize the installation and make it ready for use.

  • 1. In VirtualBox with the Fedora25 VM selected, under “Storage”, click on “Optical Drive”.
    • Select “Remove disk from virtual drive” to eject the ISO image.
  • 2. Click the “Start” button to boot the newly installed Fedora Linux system.
  • 3. Log in as the user you created.

[Figure: Fedora Login as New User]

  • 4. Finalize installation
    • Choose language “English”
    • Click “Next”
    • Choose Keyboard “US”
    • Click “Next”
    • Configure Privacy
    • Click “Next”
    • Connect Your Online Accounts
    • Click “Skip”
    • Click “Start using Fedora”
  • 5. Close the help system that starts automatically.

We now have a Fedora Linux virtual machine ready to install new software.

3. Install Python Machine Learning Environment

Fedora uses GNOME 3 as the desktop environment.

GNOME 3 is quite different to prior versions of GNOME; you can learn how to get around by using the built-in help system.

3.1 Install Python Environment

Let’s start off by installing the required Python libraries for machine learning development.

  • 1. Open the terminal.
    • Click “Activities”
    • Type “terminal”
    • Click the icon or press Enter

[Figure: Start Terminal]

  • 2. Confirm Python3 was installed.

Type python3 --version in the terminal to print the installed version.

[Figure: Python3 Version]
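
The same check can be made from inside the interpreter, which is handy at the top of your own scripts. This is a minimal sketch; the helper name is_python3 is hypothetical.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the major version number is 3."""
    return version_info[0] == 3


if __name__ == "__main__":
    # Print the major, minor and micro version of the running interpreter.
    print("Python %d.%d.%d" % sys.version_info[:3])
    print("Python 3: %s" % is_python3())
```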

  • 3. Install the Python machine learning environment. Specifically:
    • NumPy
    • SciPy
    • Pandas
    • Matplotlib
    • Statsmodels
    • Scikit-Learn

DNF is the software installation system, formerly yum. The first time you run dnf, it will update the database of packages; this might take a minute.

In the terminal, run sudo dnf install followed by the Fedora package name for each library listed above.

Enter your password when prompted.

Confirm the installation when prompted by pressing “y” and “Enter”.
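
As a rough sketch of the install step, the script below assembles one possible dnf command. It assumes Fedora ships each library under a python3- prefixed package name; those package names are assumptions not given above, so confirm them with dnf search before running the printed command. The helper name build_install_command is hypothetical, and nothing is executed.

```python
import shlex

# Library name -> assumed Fedora package name (verify with `dnf search`).
PACKAGES = {
    "NumPy": "python3-numpy",
    "SciPy": "python3-scipy",
    "Pandas": "python3-pandas",
    "Matplotlib": "python3-matplotlib",
    "Statsmodels": "python3-statsmodels",
    "Scikit-Learn": "python3-scikit-learn",
}


def build_install_command(packages=PACKAGES):
    """Return the dnf install command as a list of arguments."""
    return ["sudo", "dnf", "install"] + sorted(packages.values())


if __name__ == "__main__":
    # Print the command so it can be reviewed and typed into the terminal.
    print(shlex.join(build_install_command()))
```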

3.2 Confirm Python Environment

Now that the environment is installed, we can confirm it by printing the versions of each required library.

  • 1. Open Gedit.
    • Click “Activities”
    • Type “gedit”
    • Click the icon or press Enter
  • 2. Type the following script and save it as versions.py in the home directory.

Copy-paste between your workstation and the VM is not available by default; you may want to open Firefox within the VM, navigate to this page, and copy and paste the script into your Gedit window.
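
A minimal sketch of such a script follows. It is an assumption about what versions.py might contain rather than a definitive listing: it tries to import each library installed in section 3.1 and prints its version, reporting any library that is missing instead of stopping with an error. The helper name library_versions is hypothetical.

```python
# versions.py: print the version of each machine learning library.
import importlib

# Import names for the libraries installed in section 3.1.
MODULES = ["scipy", "numpy", "matplotlib", "pandas", "statsmodels", "sklearn"]


def library_versions(modules=MODULES):
    """Return a dict mapping each module name to its version, or None."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = None  # the library is not installed
    return versions


if __name__ == "__main__":
    for name, version in library_versions().items():
        print("%s: %s" % (name, version or "not installed"))
```

Every line of output should show a version number; a line reading "not installed" suggests that the corresponding package was missed in the install step.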

[Figure: Write Versions Script]

  • 3. Run the script in the terminal.

From your home directory, type python3 versions.py.

[Figure: Python3 Check Library Versions]

Tips For Using the VM

This section lists some tips for using the VM for machine learning development.

  • Copy-paste and Folder Sharing. These features require the installation of “Guest Additions” in the Linux VM. I have not been able to get this to install correctly and therefore do not use these features. You can try if you like; let me know how you do in the comments.
  • Use GitHub. I recommend storing all of your code in GitHub and checking the code in and out from the VM. It makes life a lot easier for getting code and assets in and out of the VM.
  • Use Sublime. I think Sublime Text is a great text editor on Linux for development, better than Gedit at least.
  • Use AWS for large jobs. You can use the same procedure to set up Fedora Linux on Amazon Web Services for running large models in the cloud.
  • VM Tools. You can save the state of the VM at any point by closing the window. You can also take a snapshot of the VM at any point and return to the snapshot. This can be helpful if you are making large changes to the file system.
  • Python2. You can easily install Python2 alongside Python 3 in Linux and use the python (rather than python3) binary or use alternatives to switch between the two.
  • Notebooks. Consider running a notebook server inside the VM and opening up the firewall so that you can connect and run from your main workstation outside of the VM.
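
The snapshot and notebook tips can be sketched with VirtualBox's VBoxManage command-line tool. This is a hedged illustration only: the snapshot name, the rule name "notebook" and port 8888 are assumptions, the port-forwarding rule assumes the VM uses the default NAT network adapter, and the script prints the commands rather than running them.

```python
import shlex

VM_NAME = "Fedora25"  # the VM name chosen in section 2.2


def snapshot_commands(snapshot="before-big-change"):
    """Commands to take a snapshot and later restore it (hypothetical name)."""
    return [
        ["VBoxManage", "snapshot", VM_NAME, "take", snapshot],
        ["VBoxManage", "snapshot", VM_NAME, "restore", snapshot],
    ]


def notebook_forward_command(port=8888):
    """Command to forward a host TCP port to the same port inside the VM.

    The rule format is name,protocol,host ip,host port,guest ip,guest port;
    empty fields leave the IP addresses unrestricted.
    """
    rule = "notebook,tcp,,%d,,%d" % (port, port)
    return ["VBoxManage", "modifyvm", VM_NAME, "--natpf1", rule]


if __name__ == "__main__":
    for command in snapshot_commands() + [notebook_forward_command()]:
        print(shlex.join(command))
```

Both the restore and the modifyvm command expect the VM to be powered off, and the firewall inside Fedora still has to allow the chosen port.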

Do you have any tips to share? Let me know in the comments.

Summary

In this tutorial, you discovered how to set up a Linux virtual machine for Python machine learning development.

Specifically, you learned:

  • How to download and install VirtualBox, free, open-source software for managing virtual machines.
  • How to download and set up Fedora Linux, a friendly Linux distribution for developers.
  • How to install and test a Python3 environment for machine learning development.

Did you complete the tutorial?
Let me know how it went in the comments below.

12 Responses to How to Create a Linux Virtual Machine For Machine Learning Development With Python 3

  1. jm February 27, 2017 at 1:22 pm #

    Maybe the same but with Docker?

    Thanks

  2. Francis Ibok February 28, 2017 at 5:18 pm #

    Hi Jason, I am using a MacBook Pro with two operating systems already installed, OS X and Microsoft Windows 7. What should I do?

    • Jason Brownlee March 1, 2017 at 8:32 am #

      I would suggest using Mac OS X.

      You could install and use Python Anaconda, or if you are more advanced, explore using a package manager like macports (my personal preference).

      • Francis Ibok March 12, 2017 at 4:08 pm #

        That means I can uninstall Windows 7 and use only Mac OS X.

  3. Jimmy Olano March 1, 2017 at 5:48 am #

    Great article!

    The key for VBox Guest Adds is compiling the “kernel modules” into the virtual machine:

    -Just ALT+F2 and type “gnome-terminal”
    -Type “sudo yum install kernel-devel-4.8.9-300.fc25.x86_64”
    -Set iso file with VBox guest add at VirtualBox.
    -Just open the “cd” from Fedora and execute “runasroot.sh” (or use “sudo” in “/run/media/VIRTUALBOX…”).
    -Wait for compiling “kernel modules” (take a while determining your “hardware” and compiling)
    -Share files and folders from the real machine to the virtual machine by setting them in VirtualBox.
    -Done!

    • Jason Brownlee March 1, 2017 at 8:45 am #

      Fantastic Jimmy, thanks for the note.

      I’ll give it a try to confirm and maybe even update the tutorial.

  4. Marcos Keyser April 24, 2017 at 8:04 pm #

    Really interesting article!

    I would like to know more about how to run a notebook server inside the VM so that you can connect and run from your main workstation outside of the VM. Where can I start?

    Thanks

    • Jason Brownlee April 25, 2017 at 7:49 am #

      Great idea. Sorry I don’t have an example at hand.

  5. Marcos Keyser April 24, 2017 at 8:35 pm #

    This presentation about how to use Docker in a data science context is interesting. It would be great to see a blog post about this.

  6. Marcos Keyser April 24, 2017 at 8:35 pm #

    The presentation is here: https://www.youtube.com/watch?v=GOW6yQpxOIg
