What is the Difference Between a Parameter and a Hyperparameter?

By Jason Brownlee on June 17, 2019 in Machine Learning Process

It can be confusing when you get started in applied machine learning.

There are so many terms to learn, and many of them are not used consistently. This is especially true if you have come from another field of study that uses some of the same terms as machine learning, but with different meanings.

For example: the terms “model parameter” and “model hyperparameter.”

Not having a clear definition for these terms is a common struggle for beginners, especially those who have come from the fields of statistics or economics.

In this post, we will take a closer look at these terms.

What is the Difference Between a Parameter and a Hyperparameter?
Photo by Irol Trasmonte, some rights reserved.

What is a Model Parameter?

A model parameter is a configuration variable that is internal to the model and whose value can be estimated from data.

They are required by the model when making predictions.
Their values define the skill of the model on your problem.
They are estimated or learned from data.
They are often not set manually by the practitioner.
They are often saved as part of the learned model.

Parameters are key to machine learning algorithms. They are the part of the model that is learned from historical training data.

In classical machine learning literature, we may think of the model as the hypothesis and the parameters as the tailoring of the hypothesis to a specific set of data.

Often model parameters are estimated using an optimization algorithm, which is a type of efficient search through possible parameter values.

Statistics: In statistics, you may assume a distribution for a variable, such as a Gaussian distribution. Two parameters of the Gaussian distribution are the mean (mu) and the standard deviation (sigma). This holds in machine learning, where these parameters may be estimated from data and used as part of a predictive model.
Programming: In programming, you may pass a parameter to a function. In this case, a parameter is a function argument that could have one of a range of values. In machine learning, the specific model you are using is the function and requires parameters in order to make a prediction on new data.
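
As a rough sketch of the statistics example above, the snippet below estimates the two parameters of a Gaussian from a small sample and then uses the fitted distribution as a very simple model. The sample values and variable names are made up for illustration and are not from any real dataset.

```python
import statistics

# Hypothetical sample of heights in centimetres (illustrative data only).
heights = [172.0, 168.5, 181.2, 175.3, 169.8, 178.1, 174.4, 171.6]

# The two parameters of the assumed Gaussian are estimated from the data.
mu = statistics.mean(heights)
sigma = statistics.stdev(heights)  # sample standard deviation

# The fitted distribution can then be used to answer questions about new data.
model = statistics.NormalDist(mu, sigma)
print(f"mu={mu:.2f}, sigma={sigma:.2f}")
print(f"P(height < 170) = {model.cdf(170.0):.3f}")
```

Nothing here was set by hand except the choice of distribution; mu and sigma come entirely from the data, which is what makes them parameters.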

Whether a model has a fixed or variable number of parameters determines whether it may be referred to as “parametric” or “nonparametric”.

Some examples of model parameters include:

The weights in an artificial neural network.
The support vectors in a support vector machine.
The coefficients in a linear regression or logistic regression.
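
To make the last example concrete, here is a minimal sketch of a one-variable linear regression whose coefficients are estimated from data by ordinary least squares. The helper names fit_line and predict and the toy data are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the learned parameters to make a prediction for a new input."""
    return intercept + slope * x


# Made-up training data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)  # model parameters, learned from data
print("slope:", slope, "intercept:", intercept)
print("prediction for x=5:", predict(5.0, slope, intercept))
```

The slope and intercept are required at prediction time and would be saved with the model, which matches the properties of model parameters listed earlier.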

What is a Model Hyperparameter?

A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data.

They are often used in processes to help estimate model parameters.
They are often specified by the practitioner.
They can often be set using heuristics.
They are often tuned for a given predictive modeling problem.

We cannot know the best value for a model hyperparameter on a given problem. We may use rules of thumb, copy values used on other problems, or search for the best value by trial and error.

When a machine learning algorithm is tuned for a specific problem, such as when you are using a grid search or a random search, then you are tuning the hyperparameters of the model in order to discover the parameters of the model that result in the most skillful predictions.

Many models have important parameters which cannot be directly estimated from the data. For example, in the K-nearest neighbor classification model … This type of model parameter is referred to as a tuning parameter because there is no analytical formula available to calculate an appropriate value.

— Page 64-65, Applied Predictive Modeling, 2013
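
The tuning idea can be sketched with a tiny k-nearest neighbors classifier and a grid search over k. This is only an illustration under stated assumptions: the one-feature dataset, the candidate values of k, and the helper names knn_predict and accuracy are all made up, and a real project would more likely use a library implementation with cross-validation.

```python
from collections import Counter


def knn_predict(train_rows, query, k):
    """Classify `query` by majority vote among its k nearest training rows."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_rows, validation_rows, k):
    """Fraction of validation rows classified correctly for a given k."""
    hits = sum(knn_predict(train_rows, x, k) == y for x, y in validation_rows)
    return hits / len(validation_rows)


# Made-up one-feature dataset of (value, label); (2.8, "b") is a noisy point.
train_rows = [(1.0, "a"), (1.6, "a"), (2.2, "a"), (2.8, "b"), (3.4, "a"),
              (5.0, "b"), (5.6, "b"), (6.2, "b"), (7.0, "b")]
validation_rows = [(2.6, "a"), (3.0, "a"), (1.2, "a"),
                   (5.5, "b"), (6.8, "b"), (4.6, "b")]

# Grid search: try each candidate k and score it on held-out data.
candidate_ks = [1, 3, 5, 9]
scores = {k: accuracy(train_rows, validation_rows, k) for k in candidate_ks}
best_k = max(scores, key=scores.get)  # first k with the highest score

for k, score in scores.items():
    print(f"k={k}: validation accuracy={score:.2f}")
print("chosen k:", best_k)
```

No formula produces k from the training rows; it is chosen by trying values and comparing estimated skill, which is why it is treated as a hyperparameter.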

Model hyperparameters are often referred to as model parameters, which can make things confusing. A good rule of thumb to overcome this confusion is as follows:

If you have to specify a model parameter manually, then it is probably a model hyperparameter.

Some examples of model hyperparameters include:

The learning rate for training a neural network.
The C and sigma hyperparameters for support vector machines.
The k in k-nearest neighbors.
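
The first example can be sketched with plain gradient descent on a one-weight model. In this hedged illustration, the function name train, the toy data, and the learning rate values are assumptions; the point is only that the weight w is learned from data while the learning rate is chosen by the practitioner.

```python
def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # model parameter: starts arbitrary, is learned from data
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # step size set by the hyperparameter
    return w


# Made-up data that follows y = 3x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

# The learning rate is set by hand; each value leads to a different learned w
# after the same number of epochs.
for learning_rate in (0.001, 0.01, 0.05):
    w = train(xs, ys, learning_rate, epochs=50)
    print(f"learning_rate={learning_rate}: learned w={w:.4f}")
```

With the smallest learning rate the weight has not yet reached its target after 50 epochs, while the larger values get much closer, so the hyperparameter shapes how well the parameter is estimated.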

Summary

In this post, you discovered clear definitions of model parameters and model hyperparameters, and the difference between them.

In summary, model parameters are estimated from data automatically and model hyperparameters are set manually and are used in processes to help estimate model parameters.

Model hyperparameters are often referred to as parameters because they are the parts of the machine learning process that must be set manually and tuned.

Did this post help you clear up the confusion?
Let me know in the comments below.

Are there model parameters or hyperparameters that you are still unsure about?
Post them in the comments and I’ll do my best to help clear things up further.

153 Responses to What is the Difference Between a Parameter and a Hyperparameter?

Kiki July 26, 2017 at 7:32 am #

Awesome article! This was a big point of confusion, as I wasn’t sure what “knobs” I had at my disposal to tune my model — there are a lot of them, but they weren’t all in one place like the dash of a car. 🙂 Thank you for making this clear!

- Jason Brownlee July 26, 2017 at 8:04 am #
  
  Thanks. I’m glad it helped!
  
Dr Alan Beckles July 26, 2017 at 7:57 am #

Excellent post, Jason. Thanks!

- Jason Brownlee July 26, 2017 at 8:04 am #
  
  You’re welcome Alan.
  
ujjawal sinha July 26, 2017 at 8:02 am #

Thanks Jason , Excellent

- Jason Brownlee July 26, 2017 at 8:04 am #
  
  I’m glad it helped.
  
Wesley July 26, 2017 at 8:04 am #

Great explanation…

- Jason Brownlee July 26, 2017 at 8:05 am #
  
  Thanks Wesley.
  
Deepak Sharma July 27, 2017 at 3:57 am #

Superb explanation Jason….love reading your articles!!!

- Jason Brownlee July 27, 2017 at 8:11 am #
  
  Thanks Deepak.
  
Jie July 27, 2017 at 6:06 pm #

In the model parameters section, you give the example “The support vectors in a support vector machine.” I am a little confused: why not the coefficients in an SVM?

- Jason Brownlee July 28, 2017 at 8:29 am #
  
  We call the instances found by SVM “support vectors”; they are technically not “weights” or “coefficients”.
  
Luis July 28, 2017 at 6:15 am #

Great post, Jason. Thanks!

One question: k-nearest neighbourhood is considered a non parametric model (vs parametric models). Shouldn’t k be considered as a hyperparameter then?

- Jason Brownlee July 28, 2017 at 8:36 am #
  
  The “k” in kNN is a hyperparameter. I say exactly this Luis.
  
Luis July 28, 2017 at 6:22 am #

The confounding part was the use of “parameter” in:

“Many models have important parameters which cannot be directly estimated from the data. For example, in the K-nearest neighbor classification model … This type of model parameter is referred to as a tuning parameter because there is no analytical formula available to calculate an appropriate value.”

- Jason Brownlee July 28, 2017 at 8:37 am #
  
  Why is this confounding Luis?
  
  - Tommy July 31, 2017 at 7:39 pm #
    
    The book Applied Predictive Modeling does not contain the word hyperparameter. The article above states that many experts mix up the terms parameter and hyperparameter.
    
    So what’s the point of including the quote? Here are some potential answers:
    1. The authors used the term “tuning parameter” incorrectly, and should have used the term hyperparameter. This understanding is supported by including the quote in the section on hyperparameters. Furthermore, my understanding is that using a threshold for statistical significance as a tuning parameter may be called a hyperparameter because it
    
    However, I believe that “tuning parameter” is not an incorrect description.
    
    Also, you linked to the Wikipedia page for Bayesian hyperparameters rather than the page for hyperparameters in machine learning: https://en.wikipedia.org/wiki/Hyperparameter_optimization
    
    The Wikipedia page gives the straightforward definition: “In the context of machine learning, hyperparameters are parameters whose values are set prior to the commencement of the learning process. By contrast, the value of other parameters is derived via training.”
    
    - Tommy July 31, 2017 at 7:56 pm #
      
      Correct me if I’m wrong, but according to many definitions, hyperparameters are a type of parameter.
      
      Synonyms for hyperparameters: tuning parameters, meta parameters, free parameters
      
      Since hyperparameters are a type of parameter, the two terms are interchangeable when discussing hyperparameters. However, not all parameters are hyperparameters.
      
      - Jason Brownlee August 1, 2017 at 7:58 am #
        
        Nice perspective, thanks Tommy.
        
        I cannot disagree generally, but the distinction is important, especially if you are a beginner trying to figure out what to “configure” or “tune”.
    - Jason Brownlee August 1, 2017 at 7:56 am #
      
      Hi Tommy, I provided the quote to help clarify the definitions, not as an example of misuse. Sorry for the confusion.
      
      Nice, your definition matches with the “estimated from data vs not” approach used in the post.
      
Sasikanth July 28, 2017 at 11:58 am #

Crystal clear. Thanks Jason

- Jason Brownlee July 29, 2017 at 8:01 am #
  
  I’m glad it helped.
  
Bharath Bhushan July 28, 2017 at 4:12 pm #

thanks. I was thinking both of them refer to the same thing. Thanks for clarification.

- Jason Brownlee July 29, 2017 at 8:06 am #
  
  I’m glad it helped.
  
Ravindra July 28, 2017 at 4:34 pm #

Awesome! It was really confusing (parameters vs hyperparameters) and I was ignoring it, but this post made it very clear.
Thank You!!

- Jason Brownlee July 29, 2017 at 8:06 am #
  
  Happy it helped!
  
Abkul July 28, 2017 at 4:44 pm #

Superbly explained. Thanks for the always handy post.

- Jason Brownlee July 29, 2017 at 8:06 am #
  
  Thanks!
  
Tim July 29, 2017 at 5:31 pm #

clf = svm.SVC(C=0.01, kernel='rbf', random_state=33)

——
Is random_state a parameter or a hyperparameter?

- Jason Brownlee July 30, 2017 at 7:45 am #
  
  Deep Tim… great question!
  
  A gut check says “hyperparameter”, but we do not optimize it, we control for it. This feels wrong though. Perhaps it is neither.
  
  What I mean is, it impacts the skill of the model, or most models that are stochastic, but we do not “tune” the value for a specific model/dataset. The idea of the “best” random seed does not make sense. Instead, we would re-run the experiment n times in order to develop a robust estimate of skill. We would create an ensemble of n final models to produce a more robust set of predictions.
  
  Does that help? Am I making sense?
  
  - Nate November 3, 2019 at 12:29 pm #
    
    This makes sense. I agree “random_seed” seems like *neither* a hyperparameter nor a parameter…
    This Stack Exchange question (link below) implies “random_seed” is a *parameter*. Whereas if I had to choose, I would choose “hyperparameter” (i.e. you “tune” /”configure” it; it is not “learned by the model”)
    
    https://datascience.stackexchange.com/a/14194/77875
    
    Thanks for the article!
    
    - Jason Brownlee November 4, 2019 at 6:36 am #
      
      Thanks.
      
Vinícius July 30, 2017 at 1:28 am #

Excellent post! I am currently studying an application of Stacked Autoencoders on passive sonar classification and your posts have been very helpful for me. I have learned a lot with you. Taking advantage, do you have any material on this topic? Or novelty detection? Thank you!

- Jason Brownlee July 30, 2017 at 7:48 am #
  
  Thanks.
  
  Sorry, I don’t have posts on these topics, I hope to get to them sometime.
  
Siva August 2, 2017 at 5:40 pm #

Good clarification and explanation. Thanks!

- Jason Brownlee August 3, 2017 at 6:47 am #
  
  Thanks Siva.
  
Gunjan August 31, 2017 at 9:09 pm #

Hi Jason, good explanation. I have one doubt: if we have some hyperparameters for a given data sequence, can we predict a new set of hyperparameters when a new data sequence is given?

- Jason Brownlee September 1, 2017 at 6:45 am #
  
  Parameters and hyperparameters refer to the model, not the data.
  
Antonio September 13, 2017 at 8:29 am #

To me, a model is fully specified by its family (linear, NN etc) and its parameters. The hyper parameters are used prior to the prediction phase and have an impact on the parameters, but are no longer needed. So coefficients in a linear model are clearly parameters. The learning rate in any gradient descent procedure is a hyperparameter. Structural parameters such as the degree of a polynomial or the number of hidden units are somewhere in between, because they are decided prior to model fitting but are implicit in the parameters themselves. Whether all these numbers are chosen by an algorithm or by hand, I don’t see that as a very helpful distinction. Linear models were fitted by hand only a generation or two ago. Tukey cites drawing something like a super smoother by eye. Nobody would do that now.

- Jason Brownlee September 13, 2017 at 12:38 pm #
  
  Great note Antonio, thanks.
  
Nicolas Marx October 19, 2017 at 10:50 pm #

Hi! Great post, I was looking for this clarification. I wonder why it is not possible to tune the hyperparameters from data, using another data partition as a “hyperparameter test set”?

- Jason Brownlee October 20, 2017 at 5:35 am #
  
  You can, and this is what people do in a grid search or random search across hyperparameters.
  
john October 30, 2017 at 11:30 am #

Is a model more strongly influenced by its model parameters or by its hyperparameters?

- Jason Brownlee October 30, 2017 at 3:50 pm #
  
  A model is defined by its parameters.
  
  The hyperparameters influence the training process used to arrive at the model parameters.
  
  Does that help John?
  
Mark December 14, 2017 at 5:56 am #

A great article.

I have a question as to the mathematical meaning of a hyperparameter.
Suppose one views a machine learning process as a function f that maps some input (from a domain) to some other type of output (codomain).
Is setting a value for a hyperparameter the same as what is mathematically known as the ‘restriction of a function’?

Thanks.

- Jason Brownlee December 14, 2017 at 4:39 pm #
  
  Perhaps. I’d recommend casting this question to a mathematician.
  
ahmed December 21, 2017 at 3:04 pm #

wow!!
that is awesome , it’s finally clear

- Jason Brownlee December 21, 2017 at 3:35 pm #
  
  I’m glad to hear that!
  
Sean December 23, 2017 at 12:44 am #

Are the hyperparameters the implementation of the learning bias or inductive bias??

- Jason Brownlee December 23, 2017 at 5:20 am #
  
  The hyperparameters can influence how biased the model is. Depends on the model as to which hyperparameters have an influence.
  
Ram February 27, 2018 at 1:06 am #

In the case of XGBoost, are options such as n_estimators and max_depth called hyperparameters, whereas the numbers assigned to them, e.g. the 100 in n_estimators=100 or the 6 in max_depth=6, are called parameters? Please clarify my confusion.

- Jason Brownlee February 27, 2018 at 6:34 am #
  
  Yes, they are hyperparameters to the model and you specify values for those hyperparameters.
  
  Parameters in the model may be specific split point values within a given tree within the xgboost model (e.g. not exposed).
  
- Aditi Joshi June 22, 2020 at 5:38 am #
  
  Great information. Thank you for sharing such an informative post. I have a doubt: which hyperparameters should I consider while working on price prediction? Any suggestions for hyperparameters?
  
  - Jason Brownlee June 22, 2020 at 6:17 am #
    
    Focus on the hyperparameters of your chosen model only.
    
Archit Rao March 4, 2018 at 1:38 pm #

Really great article. In fact, all your articles are so good and easy to understand. They really helped clear up many concepts.

- Jason Brownlee March 5, 2018 at 6:21 am #
  
  Thanks!
  
Manu March 20, 2018 at 8:17 pm #

Clear explanation. Thank you so much!!

- Jason Brownlee March 21, 2018 at 6:31 am #
  
  Thanks.
  
Nisa March 22, 2018 at 5:26 pm #

Hi Jason, do we need hyperparameter tuning while using clustering algorithm such as K-Means / Gaussian Mixture Model?

Thanks in advance 🙂

- Jason Brownlee March 23, 2018 at 6:03 am #
  
  Sorry, I don’t have material on clustering. I don’t want to give you ad hoc misleading advice.
  
Anisah April 2, 2018 at 2:55 am #

Hi Jason, nice article. I am a beginner in machine learning. Your article helps me so much. But I got confused about how to choose the best parameters. And we should regularly update the parameters, right?

- Jason Brownlee April 2, 2018 at 5:25 am #
  
  The algorithm chooses the parameters via training. The hyperparameters are chosen by you. I recommend running experiments to see which hyperparameter values are best for your chosen model on your specific dataset.
  
  Model parameters may need to be updated if there is a change or drift in your data over time. I cover this here:
  https://machinelearningmastery.com/gentle-introduction-concept-drift-machine-learning/
  
NeuroMorphing May 1, 2018 at 9:06 am #

Dear Jason,

first of all thank you very much for the great article and the clarification regarding both types of parameters.

I have a question: At our organization we have the computational power to tune our hyperparameters via grid search. However, almost always when I pick the configuration of the hyperparameters that led to the maximum AUC on the training set, I face pretty bad results on the test set, which implies overfitting… Which strategy would you recommend to avoid this overfitting? Of course we could use random search (which would take much less runtime), but it very likely would not lead to the best results. So what else can we do in such a case?

Thanks in advance…

- Jason Brownlee May 2, 2018 at 5:36 am #
  
  Perhaps grid search across configs that have shown good results in the literature (e.g. use literature to define the bounding box of the search).
  
Parminder Kaur May 23, 2018 at 11:46 pm #

Hi…
I am using a deep convolutional neural network for a remote sensed image classification.
1. The variables used in training are filter weights, biases, learning rate, momentum etc…are they parameters or hyper-parameters?
2. In pooling and drop-out layer, i have defined stride factor and drop-out ratio….are they parameters or hyper-parameters?

- Jason Brownlee May 24, 2018 at 8:14 am #
  
  Weights are parameters, learning rate is a hyperparameter.
  
  Network architecture is different again. It is more model design.
  
  - Aakash Behl August 9, 2018 at 10:31 pm #
    
    is the bias vector b, a parameter or a hyperparameter? Logically I feel it is a parameter but during coursera’s deeplearning.ai assignment it says that it is a hyperparameter.
    
    - Jason Brownlee August 10, 2018 at 6:15 am #
      
      Bias is an input (1.0), the weight for the bias is a parameter of the model. The choice to use a bias input in the model may be a hyperparameter.
      
      I don’t know about the thought process behind other courses. Perhaps ask them?
      
Indranil May 28, 2018 at 4:52 pm #

So since I am a machine learning practitioner I am a hyper(-active) person.
So whatever I can tune (or directly change) is hyper parameter.
Whatever refuses to be in my direct control but is derived by training the model is just a parameter.

😀

- Jason Brownlee May 29, 2018 at 6:23 am #
  
  Nice.
  
Roald Severtson June 10, 2018 at 12:47 pm #

It might be better to say that a range of hyperparameter values are manually specified in the case where they are able to be tuned. This tuning process has itself been automated now, for example with Amazon SageMaker, so model performance relative to a specified metric is now automated both with respect to the tunable hyperparameters and the model parameters: https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html

- Jason Brownlee June 11, 2018 at 6:06 am #
  
  It can be automated.
  
Vaseem July 20, 2018 at 6:11 am #

Excellent article and very clear explanation. Thanks, Dr. Brownlee. Your website is now in my bookmark bar.

- Jason Brownlee July 20, 2018 at 6:19 am #
  
  Thanks.
  
Aakash Behl August 9, 2018 at 7:39 pm #

thank you so much Jason!!

- Jason Brownlee August 10, 2018 at 6:12 am #
  
  I’m happy it helped.
  
Zenon Uchida August 14, 2018 at 11:07 pm #

Thanks for this!
I have a question though, Are the number of layers (CNN: convolutional layer, pooling layer, dropout layer) considered to be hyperparameters? according to this (page 24) http://www.iro.umontreal.ca/~bengioy/talks/ICML-AutoML-26Jun2014.pdf, number of layers are part of DL hyperparams.

- Jason Brownlee August 15, 2018 at 6:03 am #
  
  Yes, the number of layers is a hyperparameter.
  
Ameya October 15, 2018 at 12:51 am #

Thank you very much for the clear and concise explanation ! Really appreciate your help !

- Jason Brownlee October 15, 2018 at 7:31 am #
  
  I’m happy to hear that you found the post useful.
  
Himanshu October 20, 2018 at 6:34 pm #

Your articles are awesome. Is there any way I can subscribe by email?

- Jason Brownlee October 21, 2018 at 6:10 am #
  
  Thanks!
  
  Yes, right here:
  https://machinelearningmastery.com/faq/single-faq/how-can-i-get-emails-about-new-tutorials
  
Leothorn January 4, 2019 at 8:25 pm #

I somehow felt that hyperparameters deal with those specific parameters which have a very large influence on the performance of the algorithm for small changes in value –

Something one deals with when solving Initial Value Problems and Boundary Value Problems.

I would think that in Clustering – K is a parameter given by user and the cluster radius is the hyper-parameter

Somehow this explanation that all those that are deduced with the model are parameters and those that are inputted are hyperparameters does not sound right ??

- Jason Brownlee January 5, 2019 at 6:55 am #
  
  Perhaps there are examples that don’t fit neatly into this distinction.
  
  I believe k in k-means is a hyperparameter, it is specified, not learned.
  
Abdul January 6, 2019 at 8:23 pm #

Thank you!

- Jason Brownlee January 7, 2019 at 6:28 am #
  
  I’m glad the post was helpful.
  
Shuaib January 22, 2019 at 4:07 am #

Oh man, it is really helpful and it cleared up so many questions that were on my mind.
Loveable.
Huge respect 🙂

- Jason Brownlee January 22, 2019 at 6:27 am #
  
  I’m glad it helped!
  
Rahul dwivedi February 24, 2019 at 11:23 pm #

Nicely explained!

- Jason Brownlee February 25, 2019 at 6:43 am #
  
  Thanks.
  
Rishabh July 9, 2019 at 6:10 am #

Hi,

Although it was quite helpful, how do we find the best hyperparameter value for estimating the model parameters? How do we decide on the mtry value?

- Jason Brownlee July 9, 2019 at 8:14 am #
  
  Test a suite of values and use the one that results in a model with the best estimated performance.
  
Shilpa Kulkarni July 9, 2019 at 5:01 pm #

Great explanation! Very helpful. Tks.

- Jason Brownlee July 10, 2019 at 8:04 am #
  
  Thanks, I’m glad it helped.
  
Prashant September 18, 2019 at 7:36 pm #

great post !

- Jason Brownlee September 19, 2019 at 5:55 am #
  
  Thanks, I’m glad it helped.
  
Niharika December 4, 2019 at 10:25 pm #

A very useful article.

- Jason Brownlee December 5, 2019 at 6:40 am #
  
  Thanks!
  
Suman December 19, 2019 at 9:41 pm #

Great explanation… Thanks for the awesome article.

Can you write an article on object detection, where we are able to take our model to a mobile device?

- Jason Brownlee December 20, 2019 at 6:45 am #
  
  Thanks!
  
  Perhaps start here:
  https://machinelearningmastery.com/how-to-perform-object-detection-with-yolov3-in-keras/
  
Jermaine January 23, 2020 at 4:59 am #

You are the GOAT of Machine Learning Tutorials!!!!! For those unaware, it means “Greatest Of All Time”!!!!

As always, thanks Jason!

- Jason Brownlee January 23, 2020 at 6:41 am #
  
  Thanks.
  
Fevzi January 27, 2020 at 6:02 am #

Hello Jason Brownlee,

Thank you very much for your article.

How can I reference this article?

- Jason Brownlee January 27, 2020 at 7:10 am #
  
  You’re welcome.
  
  See this:
  https://machinelearningmastery.com/faq/single-faq/how-do-i-reference-or-cite-a-book-or-blog-post
  
ashima March 5, 2020 at 12:08 am #

Hi Jason,

Can we set our own parameters in the model (fixing some parameters) before model training? For example, that water always flows, or that it always decreases at a certain temperature, or something like that? Or does the model learn this on its own while training?

Thank you

- Jason Brownlee March 5, 2020 at 6:37 am #
  
  If something is constant and does not need to be learned, you can probably exclude it from the model.
  
Ashish March 30, 2020 at 4:29 pm #

Great article. Helped me clear my doubts.

- Jason Brownlee March 31, 2020 at 7:55 am #
  
  Thanks, I’m happy to hear that.
  
Kiki April 11, 2020 at 2:51 am #

Hi Jason, do you have an idea of how to decide which parameter has the biggest influence in a certain case?

- Jason Brownlee April 11, 2020 at 6:24 am #
  
  This is called feature importance:
  https://machinelearningmastery.com/calculate-feature-importance-with-python/
  
Dina April 12, 2020 at 9:38 am #

Hi Jason. Actually I’m a bit confused about the prediction step. Let’s say I fit a model, predicted the value of something, then saved and loaded the model. Now I want to continue predicting with the model that I saved.

If I do all the steps, including the fit, and continue to use the model after that, then the prediction is OK.

My question is: if I want to predict using the model, do I still need to fit the model each time I first use the saved model? Because when I do the prediction using the model and skip the fit part, it says that the model has not been fit. 🙁

- Jason Brownlee April 12, 2020 at 1:14 pm #
  
  Great question!
  
  Yes, this refers to the possible maintenance and updating of any model required over time.
  
  You must test whether a model requires updating and, if so, evaluate different strategies for updating it so that the model remains most useful over time.
  
  - dina April 13, 2020 at 11:31 am #
    
    Thank you Jason.
    
    - Jason Brownlee April 13, 2020 at 1:50 pm #
      
      You’re welcome.
      
Paul Fullilove April 14, 2020 at 2:30 am #

Thank you, Jason, that was most enlightening.

- Jason Brownlee April 14, 2020 at 6:26 am #
  
  You’re welcome.
  
Markus June 24, 2020 at 7:47 pm #

Thanks Jason for your informative posts! I have a question, though, about two of your statements regarding hyperparameters:

“A model hyperparameter is a configuration that is external to the model and whose value cannot be estimated from data.”

“We cannot know the best value for a model hyperparameter on a given problem.”

Considering advances in AutoML and HPO, can we still say that a hyperparameter cannot be estimated from data or that we cannot know its best value? If yes, could you please elaborate why?

- Jason Brownlee June 25, 2020 at 6:14 am #
  
  Yes, we can estimate hyperparameters using search methods, but we will never know if a modeling pipeline is optimal. The search space is enormous.
  
Magnus August 12, 2020 at 7:05 am #

Clear, clean, concise and straight to the point. Thank you.

- Jason Brownlee August 12, 2020 at 7:44 am #
  
  Thanks!
  
Anvita October 4, 2020 at 11:18 pm #

amazing article !

- Jason Brownlee October 5, 2020 at 6:52 am #
  
  Thanks!
  
adam smith October 10, 2020 at 5:00 am #

Hyper-parameters are external configuration variables, whereas model parameters are internal to the system.

Since hyper-parameter values are not saved, the trained or final models are not used for prediction. Model parameters, however, are used while making predictions.
https://www.hitechnectar.com/blogs/hyperparameter-vs-parameter/

- Jason Brownlee October 10, 2020 at 7:11 am #
  
  Agreed.
  
  - Jay May 6, 2021 at 6:29 am #
    
    Jason:
    
    Adam made a very good point: “Since hyper-parameter values are not saved, the trained or final models are not used for prediction.”
    
    Then what is the purpose of hyperparameter tuning if at the end it is not going to get saved?
    
    - Jason Brownlee May 7, 2021 at 6:22 am #
      
      At the end of hyperparameter tuning, you can use the discovered “good” or “best” configuration to train a new final model and use that to make predictions.
      
Gaurav November 30, 2020 at 10:49 pm #

Thanks Jason. It was quite confusing figuring out the difference between parameters and hyperparameters. Now the difference between the two is pretty clear.

- Jason Brownlee December 1, 2020 at 6:20 am #
  
  I’m happy to hear that.
  
phillip December 11, 2020 at 10:32 pm #

Thank you for the explanation, it is very clear.
I notice that, among the characteristics of hyperparameters, you mention that they can be set using heuristics.

My understanding of this statement is that, using a grid search, I can test almost all combinations of parameters that might fit the model and see which combination is the best.

Am I right about that?

Another question about the hyperparameter tuning.
If we have a problem where the dataset is not static but dynamic, a fixed parameter combination might not fit the incoming data. Are there ways to find the parameter combination automatically?

Thank you!

- Jason Brownlee December 12, 2020 at 6:28 am #
  
  If you have a small dataset and a fast model, you can test many combinations of hyperparameters, not all.
  
  With dynamic data you may need to monitor the performance of your model over time and perhaps re-train the model or even re-tune the model periodically if performance drops below a required level.
  
Osagie Iyayi December 24, 2020 at 1:40 am #

Superb article

Thanks for Clarifying Jason !

- Jason Brownlee December 24, 2020 at 5:29 am #
  
  You’re welcome.
  
Slawa January 27, 2021 at 12:58 am #

I’m still a bit confused, are the parameters in a model fixed or trainable during testing?

- Jason Brownlee January 27, 2021 at 6:10 am #
  
  Parameters are trainable.
  
B Sidaoui February 25, 2021 at 4:22 am #

many thanks

- Jason Brownlee February 25, 2021 at 5:37 am #
  
  You’re welcome.
  
David March 26, 2021 at 1:27 am #

Would the number of hidden layers be considered a hyperparameter?

- Jason Brownlee March 26, 2021 at 6:27 am #
  
  Yes!
  
Callie Garrott April 28, 2021 at 12:35 pm #

Can someone tell me why the Bayesian Generalized General Regression model says it doesn’t need any tuning parameters?

gabriel June 8, 2021 at 11:34 pm #

Excellent post!

- Jason Brownlee June 9, 2021 at 5:43 am #
  
  Thanks.
  
Ajay July 4, 2021 at 9:13 pm #

Thank you for your clear explanation!

Is the learning rate in gradient descent an example of a hyperparameter?

- Jason Brownlee July 5, 2021 at 5:08 am #
  
  Correct.
  
Tibo July 25, 2021 at 9:05 am #

That is really very helpful. Thank you so much for the clear explanation.

- Jason Brownlee July 26, 2021 at 5:29 am #
  
  You’re welcome!
  
Nikhil December 5, 2021 at 3:58 am #

I liked this explanation from Manju Savanth.

Model Parameters are something that a model learns on its own. For example, 1) Weights or Coefficients of independent variables in Linear regression model. 2) Weights or Coefficients of independent variables SVM. 3) Split points in Decision Tree.

Model hyper-parameters are used to optimize the model performance. For example, 1)Kernel and slack in SVM. 2)Value of K in KNN. 3)Depth of tree in Decision trees.

- Adrian Tam December 8, 2021 at 7:31 am #
  
  Thanks for sharing. That’s correct.
  
Abdul Rehman July 24, 2022 at 6:19 am #

We can say that parameters are something learned during the ML/DL process.
Examples: weights and biases.
Hyperparameters are what you specify manually in order to obtain a model with optimal performance.

- James Carmichael July 24, 2022 at 9:30 am #
  
  Hi Abdul…Your understanding is correct! Hyperparameters can also be optimized through many other processes that are considered better options than a manual or ad-hoc approach:
  
  https://www.kdnuggets.com/2020/05/hyperparameter-optimization-machine-learning-models.html
  
Andrew Wilkinson September 13, 2022 at 2:38 pm #

As others have said, really nice clear summary. Thanks for producing these mini articles.

- James Carmichael September 14, 2022 at 5:51 am #
  
  Thank you for the feedback and support Andrew! We greatly appreciate it!
  
Juan September 13, 2023 at 1:50 am #

Hello.

When we fit a regression tree we use…
– Training data: to optimize the parameters (the order of the variables and their thresholds).
– Validation data: to validate the error of the model while trying different hyperparameter values. (*)

What about the pruning? Is it performed automatically by the algorithm (splitting the training data further into training and validation)? Or is it performed on the same validation data (*)?

Juan September 13, 2023 at 2:11 am #

Is “pruning” performed in the same k-fold process, with the same validation data, as the other hyperparameters?
Or is it performed with the same data used to train the tree (variable selection and thresholds), though further split into a new training-validation pair?
