Use R For Machine Learning

You should use R for machine learning.

R is one of the most powerful machine learning platforms and is used by the top data scientists in the world.

In this post you will learn why you should use R for machine learning.


Let’s get started.

R Platform For Machine Learning

Photo by Christopher Woo, some rights reserved.

Why You Should Care About R

R is used by the best data scientists in the world. In surveys on Kaggle (the competitive machine learning platform), R is by far the most used machine learning tool. When professional machine learning practitioners were surveyed in 2015, again the most popular machine learning tool was R.

R is powerful because of the breadth of techniques it offers. Almost any technique that you can think of for data analysis, visualization, sampling, supervised learning and model evaluation is available in R. The platform offers more techniques than most others that you will come across.

R is state-of-the-art because it is used by academics. One of the reasons R has so many techniques is that academics who develop new algorithms often implement them in R and release them as R packages. This means that you can get access to state-of-the-art algorithms in R before they reach other platforms. It also means that some algorithms are available only in R until someone ports them to other platforms.

R is free because it is open source software. You can download it right now for free and it runs on any workstation platform you are likely to use.

Convinced?


So What Is R?

R is a language, an interpreter and a platform.

R is a computer language. It can be difficult to learn, but it will feel familiar and you will pick it up quickly if you have used other scripting languages such as Python, Ruby or Bash.

R is an interpreter. You can write scripts and save them as files. As with other scripting languages, you can then use the interpreter to run those scripts at any time. R also provides a REPL (read-eval-print loop) environment where you can type in commands and see the output immediately.

R is also a platform. You can use it to create and display graphics, to save and load state and to interface with other systems. You can do all of your exploration and development in the REPL environment if you so wish.
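As a rough illustration of the platform features mentioned above, the sketch below writes a plot to a file and then saves and reloads an object. It uses only functions that ship with R and the built-in iris dataset; the file locations and the model_notes object are made up for the example.

```r
# Draw a scatter plot to a PDF file instead of the screen.
# plot_file is a hypothetical temporary location chosen for this example.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal length", ylab = "Petal length")
invisible(dev.off())  # close the graphics device so the file is written

# Save an object to disk, remove it from the session, then load it back.
state_file <- tempfile(fileext = ".RData")
model_notes <- list(dataset = "iris", rows = nrow(iris))
save(model_notes, file = state_file)
rm(model_notes)
load(state_file)  # model_notes is available again
```

In an interactive session you would normally skip pdf() and dev.off() and let the plot appear on screen.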

Want more? Check out my previous post, What is R?

Power Is In The Packages

The power of R is in the packages.

R itself is very simple. It provides built-in commands for basic statistics and data handling. The machine learning features of R that you will use come from third-party packages. Packages are plug-ins to the R platform. You can search for, download and install them within the R environment.

Because packages are created by third parties, their quality can vary. It is a good idea to search for the best-of-breed packages that provide a specific technique you want to use. Packages provide documentation in the form of help for each package function and often vignettes that demonstrate how to use the package.

Before you write a line of code, always search to see if there is a package that can do what you need.

You can search for packages on the Comprehensive R Archive Network (CRAN).
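Here is a small sketch of working with packages from inside R. The caret package named in the comments is only an illustrative choice, and the install lines are commented out because they need a network connection. The stats and grid packages ship with R, so the documentation calls should work offline.

```r
# List the packages already installed in this R library.
installed <- rownames(installed.packages())
head(installed)

# Install a package from CRAN and load it.
# "caret" is an assumption made for this example, not a requirement.
# install.packages("caret")
# library(caret)

# Show the help index for a package.
help(package = "stats")

# List the vignettes that a package provides, if any.
vignette(package = "grid")
```

Reading the help index first is a quick way to judge whether a package covers the technique you need before you invest time in it.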

How Do You Use R For Machine Learning?

The R platform is not suitable for all types of machine learning projects. The sweet spot is to use R for exploration and for building one-off models.

Interactive Environment for Exploration

The R interactive environment is very useful for exploring and learning how to use packages and functions. You should spend a lot of time in the interactive environment when you are just starting out.
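For example, these are the kinds of commands you might type while getting to know a function. This is only a sketch; sample and mean are base R functions picked for illustration.

```r
# Show the arguments that a function accepts.
sample_args <- args(sample)
print(sample_args)

# Run the examples from a function's help page and watch the output.
example(mean)

# Inspect the structure of an object, here a built-in dataset.
str(iris)
```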

The environment is also very good if you are exploring a new problem. This is less about working the problem systematically and more about trying what-if scenarios.

It is also great if you want to use a systematic process and come up with a prototype model very quickly without the full rigmarole.
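As a sketch of how quick a prototype can be, the lines below split a built-in dataset, fit a model and estimate its error. The choice of the mtcars data, a linear regression and a 10-row hold-out set are all assumptions made for the example, not a recommended process.

```r
set.seed(7)  # make the random split repeatable

# Hold out 10 of the 32 rows of the built-in mtcars data for testing.
test_rows <- sample(nrow(mtcars), size = 10)
train <- mtcars[-test_rows, ]
test <- mtcars[test_rows, ]

# Fit a linear regression of fuel economy on weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = train)

# Estimate the error on the held-out rows.
predictions <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$mpg - predictions)^2))
print(rmse)
```

Swapping in a different model is usually a one-line change, which is what makes the interactive environment handy for what-if experiments.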

R Interactive Environment


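You can start the interactive environment on the command line by typing R and pressing Enter.

You can get help on any function by typing help(FunctionName) or the shorthand ?FunctionName, where FunctionName is the name of the function.

You can close the interactive environment by calling the quit function, quit().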

Use Scripts for One-Off Models

If you have a machine learning project, I recommend that you develop scripts.

Each task in your project can be captured in its own script, which can be documented, updated and tracked in revision control.

R scripts can be run from the command line, called from shell scripts and (my personal favorite) called from targets in a Makefile.

For example, you can call the R executable from the command line, a shell script or a Makefile to run your script file by typing R CMD BATCH your_script.R your_script.log.

This runs the script your_script.R using R in batch mode (non-interactively) and saves the output of the script in the file your_script.log.
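To make this concrete, here is what a small your_script.R might contain. The contents are hypothetical: the built-in iris data stands in for a real dataset, and the model and the output file name model.rds are assumptions made for the example.

```r
# your_script.R: a hypothetical one-off modeling script.

# Load the data (the built-in iris data stands in for a real dataset).
data(iris)

# Summarize the data. In batch mode this output goes to the log file.
print(summary(iris))

# Fit a model and report it.
fit <- lm(Petal.Length ~ Sepal.Length + Species, data = iris)
print(summary(fit))

# Save the fitted model so that a later script can reuse it.
saveRDS(fit, file = "model.rds")
```

A later script could pick up the model again with readRDS("model.rds"), which keeps each task in its own file.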

Not For Production

R is probably not the best solution for building a production model.

The techniques may be state-of-the-art, but their implementations may not follow the best software engineering practices, have tests, or scale to the size of datasets that you may need to work with.

That being said, R may be the best solution to discover what model to actually use in production.

The landscape is changing: people are writing R scripts to run operationally, and services are emerging to support larger datasets.

General Tips When Using R

Below are tips for making the most of R for machine learning.

  • Stick with basic R. Don’t write functions and serious code until you are comfortable with the environment. Stick to calling functions in packages.
  • Learn from help and vignettes. Packages come with help in the form of documentation for each function and vignettes that give you usage information. If in doubt, search for the package in your favorite search engine to find the home page of the package on CRAN. Running examples from vignettes can teach you a lot about the expected usage of a function.
  • Tabular data. Because R was built by statisticians for statisticians, it is suited for tabular data, e.g. a matrix of data as you would see in a spreadsheet.
  • Small data. R is more suited to smaller datasets, e.g. tens or hundreds of thousands of rows, but not millions.
  • Don’t program. Focus on packages and functions and how to use them well. I do not recommend learning “how to program in R” unless you want to create your own packages.
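The tabular data and small data tips can be sketched in a few lines. The example assumes the built-in iris data frame and an arbitrary sample size of 50 rows; with a larger dataset you would choose a size that fits comfortably in memory.

```r
# A data frame holds tabular data: one row per observation,
# one column per variable, much like a spreadsheet.
str(iris)

set.seed(1)  # make the random sample repeatable

# Draw a random sample of 50 rows without replacement.
sample_rows <- sample(nrow(iris), size = 50)
iris_sample <- iris[sample_rows, ]
dim(iris_sample)
```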

You Can Use R For Machine Learning

You do not need to be an R programmer. If you know how to program in another language like Java, C#, JavaScript or Python, then you can use R. You will pick up the syntax very quickly.

You do not need to be a good programmer. Getting good at using R is not about being a good programmer; it is about knowing which packages to use and how to use them well. Read up on the packages and practice using them. Don’t study how to program well in R unless you plan to write your own packages; for applied machine learning it is a waste of time.

You do not need to be a machine learning expert. There are hundreds of machine learning packages and thousands of techniques that you can use. Take your time, read the documentation and practice.

Summary

In this post you discovered that you should use R for machine learning.

It is one of the most widely used platforms for machine learning by professionals and the best data scientists in the world.

You discovered the sweet spot for R:

  • Using R for exploration and prototyping in the interactive environment.
  • Using R to develop one-off models by writing scripts.

Your Next Step

Do you want to use R for machine learning?

Get started right now!

Do you have a question? Send me an email or post a comment below.


23 Responses to Use R For Machine Learning

  1. Ivan Oboth September 7, 2016 at 8:16 pm

    Thanks Jason! As helpful as always.

  2. Xandbal September 23, 2016 at 8:19 pm

    really useful, was al ready starting in R and realized that i could learn R as a programming language but i got a lot of conflicting info on R as a programming language since its more domain specific and thus optimized for that function (is a ‘for’ command optimized in R or an add on?). So i agree with you, its the packages that makes it great, i am going to focus on that. Thanks you for cutting through that!

  3. Venu Dave December 6, 2016 at 12:23 am

    It’s really nice for beginners. Thank you so much jason brownlee

  4. Manuel December 17, 2016 at 7:03 pm

    You said that you don’t recommend R for large datasets, which language would you recommend instead?

    • Jason Brownlee December 18, 2016 at 5:29 am

      You can make your dataset small by taking a sample.

      Platforms like Hadoop and Spark are designed for large data sets. This not a recommendation, just a pointer to a different class of tool.

  5. Raghuraj singh January 27, 2017 at 3:02 pm

    It a great …… Thanks jason

  6. Sam May 1, 2017 at 4:51 am

    what is machine language why do we need to use R in that… why we cant do the same using R packages instead machine language packages???????

  7. Suman September 10, 2019 at 4:23 pm

    The best way to explain R …
    Thnx Jason ..

  8. x cheng January 1, 2020 at 12:10 pm

    Hello, Jason. Why is my Weka without CPython scripting machine learning library?

    • Jason Brownlee January 2, 2020 at 6:36 am

      Sorry, I don’t understand. Perhaps you can elaborate?

  9. Best data science software course training institute in hyderabad February 3, 2022 at 4:03 pm

    Thanks for the useful Message. Interesting to read this blog.
    This is good site and nice point of view.I learnt lots of useful information.

    • James Carmichael February 4, 2022 at 10:21 am

      You are very welcome!

  10. Stephane April 17, 2023 at 4:51 pm

    Hellow Jason,
    Can i use R as a machine language for predicting loan eligibility client by using Machine Learning?

    • James Carmichael April 18, 2023 at 10:31 am

      Hi Stephane…R is indeed powerful, however, we recommend Python and most of content is based upon Python for machine learning.

  11. Felix February 2, 2024 at 11:07 pm

    Hello Jason,

    Thanks for this piece. I want to know please, why is most ML explorative project sample scripts outside there are done in Python and not R.

    For instance, one can easily get a python implementation scripts for malicious url detection online but not the R version.

    Second, I would like to carry out an explorative research project work on anomalous user or entity behaviour on a corporate network. I am more familiar with R than python. Which language is better to implement this, and that one can find helpful sample codes online?

    Thank you

    • James Carmichael February 3, 2024 at 9:41 am

      Hi Felix…You are very welcome!

      I have focused my deep learning tutorials on the Keras library in Python.

      The main reason for this, is that skills in machine learning and deep learning in Python are in huge demand. You can learn more in this post:

      Python is the Growing Platform for Applied Machine Learning
      I believe Keras is now supported in R, and perhaps much of the library has the same API function calls and arguments.

      You may be able to port my Python-based tutorials to R with little effort.

  12. Princess Leja February 23, 2024 at 1:46 am

    Thank you Jason. I know some basics in R and plan to further R in future.
