Computer Hardware for Machine Learning

A question that comes up from time to time is:

What hardware do I need to practice machine learning?

There was a time when I was a student when I was obsessed with more speed and more cores so I could run my algorithms faster and for longer. I have changed my perspective. Big hardware still matters, but only after you have considered a bunch of other factors.


TRS 80!
Photo by blakespot, some rights reserved.

Hardware Lessons

The lesson is, if you are just starting out, your hardware doesn’t matter. Focus on learning with small datasets that fit in memory, such as those from the UCI Machine Learning Repository.

Learn good experimental design and make sure you ask the right questions and challenge your intuitions by testing diverse algorithms and interpreting your results through the lens of statistical hypothesis testing.

Once hardware does start to matter and you really need lots of cores and a whole lot of RAM, rent it just-in-time for your carefully designed project or experiment.

More CPU! More RAM!

I was naive when I first started in artificial intelligence and machine learning. I would use all the data that was available and run it through my algorithms. I would re-run models with minor tweaks to parameters in an effort to improve the final score. I would run my models for days or weeks on end. I was obsessed.

This mainly stemmed from the fact that competitions got me interested in pushing my machine learning skills. Obsession can be good, you can learn a lot very quickly. But when misapplied, you can waste a lot of time.

I built my own machines in those days. I would update my CPU and RAM often. It was the early 2000s, before multicore was the clear path (to me) and even before GPUs were talked about much for non-graphics use (at least in my circles). I needed bigger and faster CPUs and I needed lots and lots of RAM. I even commandeered the PCs of housemates so that I could do more runs.

A little later whilst in grad school, I had access to a small cluster in the lab and proceeded to make good use of it. But things started to change and it started to matter less how much raw compute power I had available.


Getting serious with GPU hardware for machine learning.
Photo by wstryder, some rights reserved.

Results Are Wrong

The first step in my change was the discovery of good (any) experimental design. I discovered the tools of statistical hypothesis testing which allowed me to get an idea of whether one result really was significantly different (such as better) when compared to another result.

Suddenly, the fractional improvements I thought I was achieving were nothing more than statistical blips. This was an important change. I started to spend a lot more time thinking about the experimental design.
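As a rough illustration of the idea (assuming SciPy is installed, and using made-up accuracy scores), a paired significance test can tell you whether the difference between two sets of repeated evaluation results is real or just noise:

```python
# Compare two sets of repeated evaluation scores with a paired t-test,
# instead of trusting the raw averages. The scores are illustrative only.
from scipy import stats

# e.g. accuracy from 10 repeated evaluations of two model configurations
scores_a = [0.81, 0.79, 0.83, 0.80, 0.82, 0.78, 0.81, 0.80, 0.82, 0.79]
scores_b = [0.82, 0.80, 0.83, 0.81, 0.82, 0.79, 0.82, 0.81, 0.83, 0.80]

t_stat, p_value = stats.ttest_rel(scores_a, scores_b)
if p_value < 0.05:
    print("Difference is statistically significant (p=%.4f)" % p_value)
else:
    print("Difference may be a statistical blip (p=%.4f)" % p_value)
```

A fractional improvement in the mean score that fails this kind of test is exactly the sort of "blip" that isn't worth days of compute.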

Questions Are Wrong

I shifted my obsessions to making sure I was asking good questions.

I now spend a lot of time up front loading in as many questions and variations on those questions as I can think of for a given problem. I want to make sure that when I run long compute jobs, the results really matter, that they will have an impact on the problem.

You can see this when I strongly advocate spending a lot of time defining your problem.

Intuitions Are Wrong

Good hypothesis testing exposes how little you think you know. Well, it did for me and still does. I “knew” that this configuration of that algorithm was stable, reliable and good. Results, when interpreted through the lens of statistical tests, quickly taught me otherwise.

This shifted my thinking to be less reliant on my old intuitions and to rebuild my intuition through the lens of statistically significant results.

Now, I don’t assume I know which algorithm or even which class of algorithm will do well on a given problem. I spot check a diverse set and let the data guide me in.

I also strongly advise careful consideration of test options and use of tools like the Weka experimenter that bake in hypothesis testing when interpreting results.
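The Weka experimenter bakes this workflow in; the same spot-checking idea can be sketched in Python (assuming scikit-learn is installed), evaluating a diverse set of algorithms under the same cross-validation setup and letting the data guide you:

```python
# Spot check a diverse set of algorithms on a small dataset with a
# shared cross-validation setup, rather than assuming one will win.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=1),
    "naive bayes": GaussianNB(),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
    print("%-20s %.3f (+/- %.3f)" % (name, scores.mean(), scores.std()))
```

The mean and standard deviation of the scores, not a single lucky run, are what you would then feed into a significance test before picking a direction.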

Best is Not Best

For some problems, the very best results are fragile.

I used to be big into non-linear function optimization (and associated competitions) and you could expend a huge amount of compute time on exploring (in retrospect, essentially enumerating!) search spaces and come up with structures or configurations that were marginally better than easily found solutions.

The thing is, the hard-to-find configurations were commonly very strange or exploited bugs or quirks in the domain or simulator. These solutions were good for competitions or for experiments because the numbers were better, but not necessarily viable for use in the domain or in operations.

I see the same pattern in machine learning competitions. A quick and easily found solution may score lower on a given performance measure, but it is robust. Often, once you pour days, weeks, and months into tuning your models, you are building a fragile model of glass that is very much overfit to the training data and/or the leaderboard. Good for learning and for doing well in competitions, not necessarily usable in operations (for example, the Netflix Prize-winning system was not deployed).


Machine Learning in a Data Center.
Photo by bandarji, some rights reserved.

Machine Learning Hardware

There are big data problems that require big hardware. Learning about big machine learning requires big data and big hardware.

On this site, I focus on beginners starting out in machine learning, who are much better off with small data on small hardware. Once you have learned enough machine learning, you can graduate to the bigger problems.

Today, I have an iMac i7 with a bunch of cores and 8 GB of RAM. It’s a run-of-the-mill workstation and does the job. I think that your workstation or laptop is good enough to get started in machine learning.

I do need bigger hardware on occasion, such as a competition or for my own personal satisfaction. On these occasions I rent cloud infrastructure, spin up some instances and run my models, then download the CSV predictions or whatever. It’s very cheap in time and dollars.

When it comes time for you to start practicing on big hardware with big data, rent it. Invest a little bit of money in your own education, design some careful experiments and rent a cluster to execute them.

What hardware do you practice machine learning on? Leave a comment and share your experiences.

15 Responses to Computer Hardware for Machine Learning

  1. Mark February 28, 2015 at 9:58 am #

I need to upgrade my desktop and want to run some AI/machine learning algorithms on it. I see people using Nvidia graphics cards to speed things up. I am wondering which cards to get and what type of general computer specs I need. This is just for home use and I would like something of reasonable price, if not dirt cheap. Lol. Thanks.

  2. OMG August 31, 2016 at 11:46 am #

What if it is also for both ML and gaming? It is a hard and critical problem.

  3. Ganesh November 3, 2016 at 2:18 am #

    Hey Jason,

    Thanks for your blog and this writeup. I’ve found it to be very useful.

What do you think is a good heuristic limit for rows × columns type data that one can analyze on a decent laptop of the type you mention in your writeup versus, say, EC2?

    • Jason Brownlee November 3, 2016 at 8:02 am #

      Hi Ganesh,

      I need fast turn around times. I want results in minutes. This means I often scale data down to a size where I can model it in minutes. I then use big computers to help understand how the results on small data map to the full dataset.

I find the real bottleneck is ideas and testing them. You want an environment that helps you test things fast.

  4. Jatin November 25, 2016 at 7:11 pm #

I am also facing the same kind of problem. Can you specifically recommend some “cloud infrastructure”?

  5. Jonathan December 21, 2016 at 8:23 pm #

I am new to machine learning and I think I’m not ready yet to rent a cluster. How about a laptop with a decent GPU? Right now I don’t have access to large data to play with. I have a laptop with a GTX 950M.

    Awesome books, I bought 3 of them.

  6. sandy May 25, 2017 at 7:24 pm #

What is the minimum configuration needed to train a deep learning model? Do I need an NVIDIA GPU? Or is it possible on Intel HD graphics?

    • Jason Brownlee June 2, 2017 at 11:43 am #

      No, you can use the CPU until you need to train large models, then you can use AWS.

  7. Rohan June 6, 2017 at 1:59 am #

I am confused between AMD vs Intel CPUs. What should I buy for machine learning? Are there any compatibility issues with AMD CPUs and NVIDIA graphics cards?

  8. LukeJohnnywalker October 11, 2017 at 3:56 pm #

    Honestly I am only looking for an excuse to buy a high end gaming laptop. I am not getting it from here…but very educational information. Cheers.

  9. Jon Snow November 9, 2017 at 11:21 pm #

    Very good advice… I have also concluded I need to brush up on my statistical knowledge stack. It is not enough to be able to use different models without having a beyond-shallow statistical understanding of results and model behaviour.

If you can also recommend good resources where one can improve their statistical knowledge, it’d be super nice.

    Thanks!
