How to Create a Linux Virtual Machine For Machine Learning Development With Python 3

Linux is an excellent environment for machine learning development with Python.

The tools can be installed quickly and easily, and you can develop and run large models directly.

In this tutorial, you will discover how to create and set up a Linux virtual machine for machine learning with Python.

After completing this tutorial, you will know:

  • How to download and install VirtualBox for managing virtual machines.
  • How to download and set up Fedora Linux.
  • How to install a SciPy environment for machine learning in Python 3.

This tutorial is suitable if your base operating system is Windows, Mac OS X, or Linux.

Let’s get started.

Benefits of a Linux Virtual Machine

There are a number of reasons that you may want to use a Linux virtual machine for Python machine learning development.

For example, below is a list of the top 5 benefits of using a virtual machine:

  • To use tools not available on your system (if you’re on Windows).
  • To install and use machine learning tools without impacting your local environment (e.g. use Python 3 tools).
  • To have highly customized environments for different projects (Python2 and Python3).
  • To save the state of the machine and pick up exactly where you left off (jump from machine to machine).
  • To share a development environment with other developers (set up once and reuse many times).

Perhaps the most beneficial point is the first: being able to easily use machine learning tools that are not supported in your environment.

I’m an OS X user, and even though machine learning tools can be installed using brew and macports, I still find it easier to set up and use Linux virtual machines for machine learning development.

Overview

This tutorial is broken down into 3 parts:

  1. Download and Install VirtualBox.
  2. Download and Install Fedora Linux in a Virtual Machine.
  3. Install Python Machine Learning Environment.

1. Download and Install VirtualBox

VirtualBox is a free, open-source platform for creating and managing virtual machines.

Once installed, you can create all the virtual machines you like, as long as you have the ISO images or CDs to install from.

Download VirtualBox

  • 3. Choose binaries for your workstation.
  • 4. Install the software for your system and follow the installation instructions.

Install VirtualBox

  • 5. Open the VirtualBox software and confirm it works.

Start VirtualBox

2. Download and Install Fedora Linux

I chose Fedora Linux because I think it is a kinder and gentler Linux than some.

It is a leading-edge version of Red Hat Linux intended for workstations and developers.

2.1 Download the Fedora ISO Image

Let’s start off by downloading the ISO for Fedora Linux. In this case, we will use the 64-bit version of Fedora 25.

Download Fedora

  • 5. You should now have an ISO file with the name:
    • “Fedora-Workstation-Live-x86_64-25-1.3.iso”.
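
Before creating the VM, it can be worth checking that the ISO downloaded intact. The sketch below computes the SHA-256 digest of the file so that it can be compared with the checksum Fedora publishes alongside its images. The expected value is left as a placeholder because it is not given in this tutorial, and the helper names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte ISO does not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected_hex):
    """Compare the file's digest to a published one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    iso = Path("Fedora-Workstation-Live-x86_64-25-1.3.iso")
    # Placeholder: paste the value from the checksum file published with the ISO.
    expected = "<sha256 from the Fedora CHECKSUM file>"
    if iso.exists():
        print("checksum OK" if iso_matches(iso, expected) else "checksum MISMATCH")
    else:
        print("ISO not found in the current directory:", iso)
```

If the digests differ, download the ISO again before continuing.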

We are now ready to create the VM in VirtualBox.

2.2 Create the Fedora Virtual Machine

Now, let’s create the Fedora virtual machine in VirtualBox.

  • 1. Open the VirtualBox software.
  • 2. Click the “New” button.
  • 3. Select the Name and operating system.
    • name: Fedora25
    • type: Linux
    • version: Fedora (64-bit)
    • Click “Continue”.

Create Fedora VM Name and Operating System

  • 4. Configure the Memory Size
    • 2048 MB
  • 5. Configure the Hard Disk
    • Create a virtual hard disk now
    • Hard disk file type: VDI (VirtualBox Disk Image)
    • Storage on physical hard disk: Dynamically allocated
    • File location and size: 10 GB
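
The same settings can also be applied without the GUI. The sketch below builds the equivalent VBoxManage commands (VBoxManage is VirtualBox's command-line tool) and prints them rather than running them. The disk file name Fedora25.vdi and the SATA controller name are illustrative assumptions, and the exact flags should be checked against VBoxManage --help for your VirtualBox version.

```python
import shlex

VM_NAME = "Fedora25"
DISK_FILE = "Fedora25.vdi"   # hypothetical file name for the virtual disk
MEMORY_MB = 2048
DISK_MB = 10 * 1024          # 10 GB expressed in megabytes


def vm_creation_commands(name=VM_NAME, disk=DISK_FILE,
                         memory_mb=MEMORY_MB, disk_mb=DISK_MB):
    """Build (but do not run) VBoxManage argument lists mirroring the GUI steps."""
    return [
        # Step 3: name, type and version of the VM.
        ["VBoxManage", "createvm", "--name", name,
         "--ostype", "Fedora_64", "--register"],
        # Step 4: memory size in megabytes.
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb)],
        # Step 5: a dynamically allocated VDI disk (the "Standard" variant).
        ["VBoxManage", "createmedium", "disk", "--filename", disk,
         "--size", str(disk_mb), "--format", "VDI", "--variant", "Standard"],
        # Attach the disk to a SATA controller (assumed name "SATA").
        ["VBoxManage", "storagectl", name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", disk],
    ]


if __name__ == "__main__":
    for args in vm_creation_commands():
        print(shlex.join(args))
```

Printing the commands first makes it easy to review them before pasting them into a terminal on the host.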

We are now ready to install Fedora from the ISO image.

2.3 Install Fedora Linux

Now, let’s install Fedora Linux on the new virtual machine.

  • 1. Select the new virtual machine and click the “Start” button.
  • 2. Click the folder icon and choose the Fedora ISO file:
    • “Fedora-Workstation-Live-x86_64-25-1.3.iso”.

Install Fedora

  • 3. Click the “Start” button.
  • 4. Select the first option “Start Fedora-Live-Workstation-Live 25” and press the Enter key.
  • 5. Hit the “Esc” key to skip the check.
  • 6. Select “Live System User”.
  • 7. Select “Install to Hard Drive”.

Install Fedora to Hard Drive

  • 8. Complete “Language Selection” (English).
  • 9. Complete “Installation Destination” (“ATA VBOX HARDDISK”).
    • You may need to wait one minute for the VM to create the hard disk.

Install on Virtual Hard Disk

  • 10. Click “Begin Installation”.
  • 11. Set root password.
  • 12. Create a user for yourself.
    • Note down the username and password (so that you can use it later).
    • Tick “Make this user administrator” (so you can install software).

Create a New User

  • 13. Wait for the installation to complete… (5 minutes?)
  • 14. Click “Quit”, then click the power icon in the top right and select “Power Off”.

2.4 Finalize Fedora Linux Installation

Fedora Linux has been installed; let’s finalize the installation and make it ready for use.

  • 1. In VirtualBox with the Fedora25 VM selected, under “Storage”, click on “Optical Drive”.
    • Select “Remove disk from virtual drive” to eject the ISO image.
  • 2. Click the “Start” button to boot the newly installed Fedora Linux system.
  • 3. Log in as the user you created.

Fedora Login as New User

  • 4. Finalize installation
    • Choose language “English”
    • Click “Next”
    • Choose Keyboard “US”
    • Click “Next”
    • Configure Privacy
    • Click “Next”
    • Connect Your Online Accounts
    • Click “Skip”
    • Click “Start using Fedora”
  • 5. Close the help system that starts automatically.

We now have a Fedora Linux virtual machine ready to install new software.

3. Install Python Machine Learning Environment

Fedora uses GNOME 3 as the desktop environment.

GNOME 3 is quite different from prior versions of GNOME; you can learn how to get around by using the built-in help system.

3.1 Install Python Environment

Let’s start off by installing the required Python libraries for machine learning development.

  • 1. Open the terminal.
    • Click “Activities”
    • Type “terminal”
    • Click the icon or press Enter

Start Terminal

  • 2. Confirm Python 3 is installed (it ships with Fedora Workstation).

Type: python3 --version

Python3 Version

  • 3. Install the Python machine learning environment. Specifically:
    • NumPy
    • SciPy
    • Pandas
    • Matplotlib
    • Statsmodels
    • Scikit-Learn

DNF is the software installation system, formerly yum. The first time you run dnf, it will update the database of packages; this might take a minute.

Type:

Enter your password when prompted.

Confirm the installation when prompted by pressing “y” and “Enter”.
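
As a rough sketch, the install command can be assembled from the library list above. The package names assume Fedora's convention of prefixing Python 3 builds with python3- (for example python3-numpy); check them with dnf search if the install reports a missing package. The snippet only prints the command, it does not run it.

```python
import shlex

# Libraries from the list above; the Fedora package names are assumed to be
# these names with a "python3-" prefix.
LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib", "statsmodels", "scikit-learn"]


def dnf_install_command(libraries=LIBRARIES, prefix="python3-"):
    """Return the argument list for installing the Python 3 build of each library."""
    return ["sudo", "dnf", "install"] + [prefix + name for name in libraries]


if __name__ == "__main__":
    # Print the command so it can be reviewed and pasted into the terminal.
    print(shlex.join(dnf_install_command()))
```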

3.2 Confirm Python Environment

Now that the environment is installed, we can confirm it by printing the versions of each required library.

  • 1. Open Gedit.
    • Click “Activities”
    • Type “gedit”
    • Click the icon or press Enter
  • 2. Type the following script and save it as versions.py in the home directory.

There is no copy-paste support between your workstation and the VM; you may want to open Firefox within the VM, navigate to this page, and copy and paste the script into your Gedit window.
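
A minimal sketch of such a script is shown here, assuming the six libraries from section 3.1 and their usual import names (note that Scikit-Learn is imported as sklearn). It is one reasonable way to write it rather than the only one; it reports a library as not installed instead of stopping at the first failed import.

```python
# versions.py: print the Python version and the version of each library.
import importlib
import sys

# Import names for the libraries installed in section 3.1.
MODULES = ["numpy", "scipy", "pandas", "matplotlib", "statsmodels", "sklearn"]


def library_versions(modules=MODULES):
    """Map each module name to its version string, or a note if it is missing."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    print("python: %s" % sys.version.split()[0])
    for name, version in library_versions().items():
        print("%s: %s" % (name, version))
```

Any library reported as not installed can be added with dnf and the script run again.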

Write Versions Script

  • 3. Run the script in the terminal.

Type: python3 versions.py

Python3 Check Library Versions

Tips For Using the VM

This section lists some tips for using the VM for machine learning development.

  • Copy-paste and Folder Sharing. These features require the installation of “Guest Additions” in the Linux VM. I have not been able to get this to install correctly and therefore do not use these features. You can try if you like; let me know how you do in the comments.
  • Use GitHub. I recommend storing all of your code in GitHub and checking the code in and out from the VM. It makes life a lot easier for getting code and assets in and out of the VM.
  • Use Sublime. I think Sublime Text is a great text editor on Linux for development, better than Gedit at least.
  • Use AWS for large jobs. You can use the same procedure to set up Fedora Linux on Amazon Web Services for running large models in the cloud.
  • VM Tools. You can save the VM at any point by closing the window. You can also take a snapshot of the VM at any point and return to the snapshot. This can be helpful if you are making large changes to the file system.
  • Python2. You can easily install Python 2 alongside Python 3 in Linux and use the python (rather than python3) binary, or use alternatives to switch between the two.
  • Notebooks. Consider running a notebook server inside the VM and opening up the firewall so that you can connect and run from your main workstation outside of the VM.
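
For the notebook tip, a quick way to see whether the server inside the VM is reachable from your main workstation is to test the TCP port. This is a minimal sketch that assumes the notebook server uses port 8888 (the usual Jupyter default) and that VirtualBox port forwarding maps it to 127.0.0.1 on the host; both are assumptions, so adjust the host and port to your setup.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed values: notebook port 8888, forwarded to the host's loopback address.
    host, port = "127.0.0.1", 8888
    state = "reachable" if port_is_open(host, port) else "not reachable"
    print("Notebook server at %s:%d is %s" % (host, port, state))
```

If the port is not reachable, check the VM's firewall and the port forwarding rule before looking at the notebook server itself.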

Do you have any tips to share? Let me know in the comments.

Summary

In this tutorial, you discovered how to set up a Linux virtual machine for Python machine learning development.

Specifically, you learned:

  • How to download and install VirtualBox, free, open-source software for managing virtual machines.
  • How to download and set up Fedora Linux, a friendly Linux distribution for developers.
  • How to install and test a Python 3 environment for machine learning development.

Did you complete the tutorial?
Let me know how it went in the comments below.

34 Responses to How to Create a Linux Virtual Machine For Machine Learning Development With Python 3

  1. jm February 27, 2017 at 1:22 pm

    Maybe the same but with Docker?

    Thanks

    • Jason Brownlee February 28, 2017 at 8:09 am

      Great suggestion, thanks!

    • Nunya May 2, 2018 at 9:57 pm

      You forgot to actually install Python 3; you go from Step 1: open a terminal to Step 2: confirm Python installation…?

      • Naseef February 3, 2022 at 6:16 pm

        It was already installed for me. I think it would be the same for you.

  2. Francis Ibok February 28, 2017 at 5:18 pm

    Hi Jason, I am using a MacBook Pro with two operating systems already installed, OS X and Microsoft Windows 7. What should I do?

    • Jason Brownlee March 1, 2017 at 8:32 am

      I would suggest using Mac OS X.

      You could install and use Python Anaconda, or if you are more advanced, explore using a package manager like macports (my personal preference).

      • Francis Ibok March 12, 2017 at 4:08 pm

        So that means I can uninstall Windows 7 and use only Mac OS X.

  3. Jimmy Olano March 1, 2017 at 5:48 am

    Great article!

    The key for VBox Guest Adds is compiling the “kernel modules” into the virtual machine:

    -Just ALT+F2 and type “gnome-terminal”
    -Type “sudo yum install kernel-devel-4.8.9-300.fc25.x86_64”
    -Attach the VBox Guest Additions ISO file in VirtualBox.
    -Just open the “cd” from Fedora and execute “runasroot.sh” (or use “sudo” in “/run/media/VIRTUALBOX…”).
    -Wait for the “kernel modules” to compile (it takes a while to detect your “hardware” and compile).
    -Share files and folders from the real machine to the virtual machine by configuring them in VirtualBox.
    -Done!

    • Jason Brownlee March 1, 2017 at 8:45 am

      Fantastic Jimmy, thanks for the note.

      I’ll give it a try to confirm and maybe even update the tutorial.

  4. Marcos Keyser April 24, 2017 at 8:04 pm

    Really interesting article!

    I would like to know more about how to run a notebook server inside the VM so that you can connect and run from your main workstation outside of the VM. Where can I start?

    Thanks

    • Jason Brownlee April 25, 2017 at 7:49 am

      Great idea. Sorry I don’t have an example at hand.

  5. Marcos Keyser April 24, 2017 at 8:35 pm

    This presentation about how to use Docker in a data science context is interesting. It would be great to see a blog post about this.

  6. Marcos Keyser April 24, 2017 at 8:35 pm

    The presentation is here: https://www.youtube.com/watch?v=GOW6yQpxOIg

  7. CarlosFra May 17, 2017 at 1:12 am

    I love your post, thanks for your help in my Data Science Career

  8. Nagaraj October 15, 2017 at 11:39 pm

    Tried the above steps on a Windows 10 machine and they worked like a charm. Great post. Thanks.

  9. domenico February 19, 2018 at 6:26 pm

    why not make the virtual machine available for download? basically it’s all open source 🙂

    • Jason Brownlee February 21, 2018 at 6:25 am

      Good suggestion. The main reason is because it is massive, e.g. Gigabytes.

  10. Venkita Krishnan June 27, 2018 at 12:55 am

    I had done the same using your instructions using VMWare Workstation and it works perfect. Thanks

  11. Alistair August 17, 2018 at 9:47 am

    Just a note: Fedora 28 no longer gets you to create a user or set a root password during the install. I found some information on Reddit: https://www.reddit.com/r/Fedora/comments/8g0ggh/question_about_fedora_28s_new_install_no_root/

  12. sweta January 18, 2019 at 11:31 pm

    Good information. How to do this on windows 7? It would be great if you share the information.

  13. Titan January 22, 2019 at 3:23 pm

    This was a great help.

    I am really curious and would love to get your perspective on the following:

    1) What are your thoughts on VMware with Ubuntu on it? As Ubuntu claims to be ML centric in its build structure and VMware is often pitted against VirtualBox (are there any distinct advantages?)

    2) Also, do you have any recommendation on how much RAM I should set aside for VirtualBox or VMware?

    Thanks,

  14. David February 16, 2019 at 11:44 pm

    Nice tutorial.

    I built a Virtual Machine with Anaconda and shared as OVA file on GitHub (click my name).
    The page has a description of the machine, photos and a video.

    Hope it helps.

  15. guach July 19, 2019 at 9:36 am

    Does VirtualBox support the use of GPU (Cuda)?

  16. Facebook August 31, 2019 at 7:46 pm

    Great to see such a good presentation which was more than guidable to create a Linux Virtual Machine for Machine Learning Development with Python 3.
