Penalized Regression in R


In this post you will discover 3 recipes for penalized regression for the R platform.

You can copy and paste the recipes in this post to jump-start your own problem, or to learn and practice penalized linear regression in R.


Let’s get started.

Photo by Bay Area Bias, some rights reserved.

Each example in this post uses the longley dataset provided in the datasets package that comes with R. The longley dataset describes 7 macroeconomic variables observed yearly from 1947 to 1962; the first 6 are used to predict the seventh, the number of people employed each year.
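You can inspect the structure of the dataset directly in R using only base functions:

```r
# load the built-in longley dataset from the datasets package
data(longley)
# show the structure: 16 yearly observations of 7 numeric variables
str(longley)
# the first 6 columns are predictors; column 7 (Employed) is the response
head(longley)
```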

Ridge Regression

Ridge Regression creates a linear regression model that is penalized with the L2-norm (the sum of the squared coefficients). This penalty shrinks the coefficient values (and with them the complexity of the model), allowing coefficients that make only a minor contribution to the response to get close to zero.

Learn about the glmnet function in the glmnet package.
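A minimal recipe might look like the following sketch. Here alpha=0 selects a pure ridge (L2) penalty, and the fixed lambda of 0.001 is an illustrative value rather than a tuned one:

```r
# load the package (install.packages("glmnet") if needed)
library(glmnet)
# load the data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit the model; alpha=0 gives ridge regression (L2 penalty only)
fit <- glmnet(x, y, family = "gaussian", alpha = 0, lambda = 0.001)
# summarize the fit
summary(fit)
# make predictions on the training data
predictions <- predict(fit, x, type = "link")
# summarize accuracy with mean squared error
mse <- mean((y - predictions)^2)
print(mse)
```

Note that glmnet expects matrix inputs, hence the as.matrix calls.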


Least Absolute Shrinkage and Selection Operator

Least Absolute Shrinkage and Selection Operator (LASSO) creates a regression model that is penalized with the L1-norm (the sum of the absolute coefficients). This penalty shrinks the coefficient values (and with them the complexity of the model), allowing coefficients with a minor effect on the response to shrink all the way to zero.

Learn about the lars function in the lars package.
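A minimal sketch of the recipe follows. Selecting the step on the LASSO path by the minimum Cp statistic is one simple choice; cross-validation would be another:

```r
# load the package (install.packages("lars") if needed)
library(lars)
# load the data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit the model along the full LASSO path
fit <- lars(x, y, type = "lasso")
# summarize the fit
summary(fit)
# select the step on the path with the minimum Cp statistic
best_step <- which.min(fit$Cp)
# make predictions at that step
predictions <- predict(fit, x, s = best_step, type = "fit")$fit
# summarize accuracy with mean squared error
mse <- mean((y - predictions)^2)
print(mse)
```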

Elastic Net

Elastic Net creates a regression model that is penalized with both the L1-norm and the L2-norm. This combines the effects of both penalties: coefficients are shrunk (as in ridge regression) and some are set to zero (as in LASSO).

Learn about the glmnet function in the glmnet package.
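A minimal sketch of the recipe follows. Setting alpha=0.5 mixes the L1 and L2 penalties equally (alpha=0 would be pure ridge, alpha=1 pure lasso), and the fixed lambda of 0.001 is again an illustrative value rather than a tuned one:

```r
# load the package (install.packages("glmnet") if needed)
library(glmnet)
# load the data
data(longley)
x <- as.matrix(longley[, 1:6])
y <- as.matrix(longley[, 7])
# fit the model; alpha=0.5 gives a 50/50 mix of the L1 and L2 penalties
fit <- glmnet(x, y, family = "gaussian", alpha = 0.5, lambda = 0.001)
# summarize the fit
summary(fit)
# make predictions on the training data
predictions <- predict(fit, x, type = "link")
# summarize accuracy with mean squared error
mse <- mean((y - predictions)^2)
print(mse)
```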

Summary

In this post you discovered 3 recipes for penalized regression in R.

Penalization is a powerful method for attribute selection and for improving the accuracy of predictive models. For more information, see Chapter 6 of Applied Predictive Modeling by Kuhn and Johnson, which provides an excellent introduction to penalized linear regression in R for beginners.


29 Responses to Penalized Regression in R

  1. Hrvoje July 25, 2014 at 10:53 pm #

Nice article, but the first and third code examples are the same 🙂 What’s the difference?

    • jasonb July 26, 2014 at 7:40 am #

      Almost. Note the value of alpha (the elastic net mixing parameter).
      A great thing about the glmnet function is that it can do ridge, lasso and a hybrid of both. In the first example, we have used glmnet with an alpha of 0 which results in ridge regression (only L2). If alpha was set to 1 it would be lasso (only L1). Note in the third example that alpha is set to 0.5, this is the elastic net mixture of L1 and L2 at a 50% mixing.
      I hope that is clearer.

      • Hrvoje July 26, 2014 at 11:47 pm #

It’s much clearer now. Tnx a lot 🙂

  2. TropoSco August 2, 2014 at 7:50 pm #

    Thanks for the post,

I was wondering if you knew the differences (computational and statistical performance) between using the lars package and the glmnet one with alpha=1 for performing a LASSO regression?

    Thank you for your time and keep up the good work!

  3. JOY May 12, 2015 at 10:04 pm #

Nice article. I want to know how to use R for regression analysis. Could you send the step-by-step method to do it to my e-mail? I have the R package already on my laptop.

  4. Christine March 16, 2016 at 9:08 pm #

    Hi Jason, thank you so much for the very clear tutorials,
    It may be a silly question to ask, but how do I interpret the goodness-of-fit from the elastic net? I’ve obtained a value after:
    # summarize the fit
    summary(fit)
    # make predictions
    predictions <- predict(fit, x, type="link")
    # summarize accuracy
rmse <- mean((y - predictions)^2)
    print(rmse)

    but not sure how I should interpret it.
    Thanks so much!

    • keval March 24, 2017 at 2:02 pm #

      I was wondering the same, how do I interpret mse?

      • Jason Brownlee March 25, 2017 at 7:32 am #

        Take the square root and the results are in the same units as the original data.

  5. Hans June 2, 2017 at 12:25 am #

What do ‘longley[,1:6]’ and ‘longley[,7]’ mean, respectively?

    How can we replace data(longley) with own csv-data accurately?

    • Jason Brownlee June 2, 2017 at 1:01 pm #

      It specifies columns in the data.

      • Hans June 2, 2017 at 8:14 pm #

        Which columns are specified with ‘longley[,1:6]’ for example?

        When I click ‘x’ in the inspector of R Studio it gives me a table with headers:
        GNP.deflator
        GNP
        Unemployed
        Armed.Forces
        Population
        Year

        When I click y it gives me a table with header “V1”

Could it be described in words, like ‘longley[datastart,dataend:datarows]’?
        Is it a kind of subsetting?

        • Hans June 2, 2017 at 8:25 pm #

          Alternatively if it is meant to be

          longley[,1:6] = longley[,col 1 to col 6]

          and

          longley[,7] = longley[,col 7]

          what is the part before the ‘,’ ? Any wildcard?

          • Hans June 6, 2017 at 9:16 am #

            got it…
            dataset[10:12,1:3] = dataset[startRow:endRow,startColumn:endColumn]
            dataset[,1:3] = dataset[allRows,startColumn:endColumn]

  6. Hans June 2, 2017 at 8:27 pm #

    Is there a way in R Studio where I can easily see how the original table of ‘data(longley)’ is structured (all headers)?

    • Jason Brownlee June 3, 2017 at 7:24 am #

      I do not use RStudio and cannot give you advice about it sorry.

    • srepho July 5, 2017 at 2:49 pm #

      In RStudio you can use View(df) and it shows the dataframe/tibble in the Viewer

      View(longley)

Alternatively, within the dplyr package you can do glimpse(df) to get a list of the column names and the data types.

      dplyr::glimpse(longley)

  7. Hans June 6, 2017 at 9:05 am #

    How to predict one step of unseen data in the above code?

    • Hand June 6, 2017 at 11:11 am #

      Do we have to use training and testdata to predict unseen data with glmnet?
And if so, should we use the last observations of the prediction with ‘test_data’ as newx for a prediction of new unseen data?

      • Jason Brownlee June 7, 2017 at 7:08 am #

        You can use any test harness you like to estimate the skill of the model on unseen data.

  8. hiya July 11, 2017 at 5:31 pm #

Internally, what method is used – least squares or maximum likelihood?

    • Jason Brownlee July 12, 2017 at 9:41 am #

      I would recommend reading the package documentation for the specific methods.

  9. Fakhra December 25, 2017 at 1:34 am #

What if I have to use a proposed penalty other than the built-in penalty of LASSO or elastic net? How can it be used?

    • Jason Brownlee December 25, 2017 at 5:25 am #

      Good question, you might need to implement it yourself.

  10. Munira April 6, 2018 at 2:58 am #

    Hi Jason,
I have started your Machine Learning course, which seems to be very useful. I have a question regarding an analysis I am planning to conduct. I have briefly described my research setting and my questions about R packages. If you can help me with this, that would be wonderful.

I have been trying to analyze high-dimensional data (p exceeds n) with limited observations (n=50). I want to use the LASSO method for variable selection and parameter estimation to develop a prediction model. As my data is count data, the model has to be Poisson or negative binomial. I have explored several R packages and papers and finally decided to use glmnet. Now I have some questions.

I know the “glmnet” and “penalized” packages use different algorithms. As I saw “mpath” uses coordinate descent like glmnet, do “mpath” and “glmnet” provide comparable results? The only reason I am interested in “mpath” is that it allows NB regression, which “glmnet” doesn’t.

A few of the packages allow post-selection inference (p-values and confidence intervals), for example “selectiveInference” and “hdi”; however, I couldn’t find anything for Poisson or NB models. Is there any package that can help me?

    Thanks in advance.

    • Jason Brownlee April 6, 2018 at 6:34 am #

      There might be, I’m not sure off the cuff sorry.

      Perhaps try posting to the R user list?

  11. niebieska_biedronka April 20, 2018 at 6:24 pm #

As I understand from the glmnet package description (https://cran.r-project.org/web/packages/glmnet/glmnet.pdf), it does not fit ridge regression, only lasso or elastic net – I guess it is because of the penalty definition (it never reduces to the ridge penalty definition).

    • Jason Brownlee April 21, 2018 at 6:44 am #

With the glmnet function you can do ridge, lasso, and a mixture of both (i.e. elastic net).

  12. william July 28, 2018 at 8:20 pm #

Is LASSO a good method to use for feature selection on a high-dimensional dataset for a regression problem?

i.e. I’m trying to determine the best algorithm to select the best features for my output variable, which is continuous, so I’ve been using LASSO; I’m just not sure how effective it is compared to others… any suggestions for feature selection methods for regression-focused ML problems?

    thanks,
