Not all data attributes are created equal. More is not always better when it comes to attributes or columns in your dataset.
In this post you will discover how to select attributes in your data before creating a machine learning model using the scikit-learn library.
Kick-start your project with my new book Machine Learning Mastery With Python, including step-by-step tutorials and the Python source code files for all examples.
Let’s get started.
Update: For a more recent tutorial on feature selection in Python see the post:

Cut Down on Your Options with Feature Selection
Photo by Josh Friedman, some rights reserved
Select Features
Feature selection is a process where you automatically select those features in your data that contribute most to the prediction variable or output in which you are interested.
Having too many irrelevant features in your data can decrease the accuracy of the models. Three benefits of performing feature selection before modeling your data are:
- Reduces Overfitting: Less redundant data means less opportunity to make decisions based on noise.
- Improves Accuracy: Less misleading data means modeling accuracy improves.
- Reduces Training Time: Less data means that algorithms train faster.
Two different feature selection methods provided by the scikit-learn Python library are Recursive Feature Elimination and feature importance ranking.
Recursive Feature Elimination
The Recursive Feature Elimination (RFE) method is a feature selection approach. It works by recursively removing attributes and building a model on those attributes that remain. It uses the model accuracy to identify which attributes (and combination of attributes) contribute the most to predicting the target attribute.
This recipe shows the use of RFE on the iris flowers dataset to select 3 attributes.
# Recursive Feature Elimination
from sklearn import datasets
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
# load the iris dataset
dataset = datasets.load_iris()
# create a base classifier used to evaluate a subset of attributes (max_iter raised so the solver converges)
model = LogisticRegression(max_iter=1000)
# create the RFE model and select 3 attributes
rfe = RFE(model, n_features_to_select=3)
rfe = rfe.fit(dataset.data, dataset.target)
# summarize the selection of the attributes
print(rfe.support_)
print(rfe.ranking_)
For a more extensive tutorial on RFE for classification and regression, see the tutorial:
Feature Importance
Methods that use ensembles of decision trees (like Random Forest or Extra Trees) can also compute the relative importance of each attribute. These importance values can be used to inform a feature selection process.
This recipe shows the construction of an Extra Trees ensemble on the iris flowers dataset and the display of the relative feature importances.
# Feature Importance
from sklearn import datasets
from sklearn.ensemble import ExtraTreesClassifier
# load the iris dataset
dataset = datasets.load_iris()
# fit an Extra Trees model to the data
model = ExtraTreesClassifier()
model.fit(dataset.data, dataset.target)
# display the relative importance of each attribute
print(model.feature_importances_)
For a more extensive tutorial on feature importance with a range of algorithms, see the tutorial:
Summary
Feature selection methods can give you useful information on the relative importance or relevance of features for a given problem. You can use this information to create filtered versions of your dataset and increase the accuracy of your models.
In this post you discovered two feature selection methods you can apply in Python using the scikit-learn library.
Nice post. How are RFE and feature selection methods like chi2 different? In the end they are achieving the same goal, right?
Both seek to reduce the number of features, but they do so using different methods. Chi-squared is a univariate statistical measure that can be used to rank features, whereas RFE tests different subsets of features.
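For illustration, here is a minimal sketch of the univariate approach on the same iris dataset, using SelectKBest with the chi-squared statistic (the choice of k=2 is arbitrary):

# Univariate feature selection with the chi-squared statistic (illustrative sketch)
from sklearn import datasets
from sklearn.feature_selection import SelectKBest, chi2
dataset = datasets.load_iris()
# score each feature independently against the target and keep the top 2
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(dataset.data, dataset.target)
print(selector.scores_)        # chi-squared score per feature
print(selector.get_support())  # mask of the selected features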
Are there any benchmarks, for example a p-value, F score, or R squared, that can be used to score the importance of features?
No, the scores are relative and specific to a given problem.
Hello,
I read and watch a lot about machine learning, but you are amazing.
You are able to explain everything in a simple way and write code that everyone can understand and 'play' with, and you give good resources for anyone who wants to go deeper into the topic.
You are a good teacher.
Thank you for your work.
Thanks mitillo.
Hello,
Can you tell me which feature selection methods you suggest for time-series data?
Please see tsfresh – it's a new approach for feature selection designed for time series.
Great site Jason!
Thanks for that good post. Just wondering whether RFE is also usable for linear regression? How is the model accuracy measured?
Jason, quick question that may help someone else stumbling across this post.
The example above does RFE using an untuned model. When would/wouldn't it make sense to find some optimised hyperparameters of the model using grid search *first*, and THEN do RFE? In your experience, is this a good idea/helpful thing to do? If not, then why?
Hi Carmen, nice catch.
Short answer: we are interested in relative difference of feature subsets, not absolute best performance.
Generally, it is a good idea to use a robust method for feature selection – that is, a method that performs well on most problems with little or no tuning. This provides a baseline, and a wrapper method like RFE can focus on the relative difference in the feature subsets rather than on the optimized best performance of each subset.
There are those cases where your general method (say a random forest) falls down. In those cases, you may want to try RFE with a suite of 3-5 different wrapped methods and see what falls out. I expect that this is overkill on most problems.
Does that help?
Thanks, that helps. The only reason I'd mentioned tuning a model first (light tuning) is that, as you mentioned in your “spot checking” post, you want to give algorithms a chance to put their best foot forward. If that applies there, I don't see why it shouldn't apply to RFE.
So I figured light tuning (only on the most common hyperparameter with the most common grid values) may help here. But I see your point. Once I’ve got my code all sorted out I may try both and report back 🙂
You’re absolutely right Carmen.
There is a cost/benefit here and ultimately it will come down to experience and the “taste” of the practitioner.
In fact, much of industrial machine learning comes down to taste 🙂
Most top methods perform about as well, say at the 90-95% effort-result level. The really hard work is trying to get above that; Kaggle competitions are a good case in point.
thanks so much for your post Jason
I'm a beginner in scikit-learn and I have a little problem when using the feature selection module VarianceThreshold. The problem is when I set the variance to Var[X] = .8 * (1 - .8),
it is supposed to remove all features (that have the same value in all samples) which have a probability p > 0.8.
In my case the fifth column should be removed, since p = 8/10 > (threshold = 0.7).
#####################################
from sklearn.feature_selection import VarianceThreshold
X=[[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.01,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,255,1.00,0.00,0.01,0.00,0.00,0.00,0.00,0.00],
[0,1,2,1,29,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0.00,0.00,0.00,0.00,0.50,1.00,0.00,10,3,0.30,0.30,0.30,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,253,0.99,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,1,1,1,105,146,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,255,254,1.00,0.01,0.00,0.00,0.00,0.00,0.00,0.00],
[0,2,3,1,223,185,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,4,4,0.00,0.00,0.00,0.00,1.00,0.00,0.00,71,255,1.00,0.00,0.01,0.01,0.00,0.00,0.00,0.00]]
sel=VarianceThreshold(threshold=(.7*(1-.7)))
And this is what I get when running the script:
>>> sel.fit_transform(X)
array([[ 1., 105., 146., 1., 1., 255., 254.],
[ 1., 105., 146., 1., 1., 255., 254.],
[ 1., 105., 146., 1., 1., 255., 254.],
[ 1., 105., 146., 2., 2., 255., 254.],
[ 1., 105., 146., 2., 2., 255., 254.],
[ 1., 105., 146., 2., 2., 255., 255.],
[ 2., 29., 0., 2., 1., 10., 3.],
[ 1., 105., 146., 1., 1., 255., 253.],
[ 1., 105., 146., 2., 2., 255., 254.],
[ 3., 223., 185., 4., 4., 71., 255.]])
The second column here should not appear.
thanks;)
It is not clear to me what the fault could be. Consider posting to stackoverflow or similar?
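One thing worth checking is the per-column variance directly, since VarianceThreshold keeps only the columns whose variance exceeds the threshold. A small sketch, assuming X is the list of rows above:

# inspect the variance of each column against the threshold (illustrative sketch)
import numpy as np
variances = np.var(X, axis=0)
threshold = .7 * (1 - .7)
print(variances)
print(variances > threshold)  # True marks the columns that VarianceThreshold will keep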
Hi Jason,
I am performing feature selection (on a dataset with 100,000 rows and 32 features) using multinomial logistic regression in Python. Now, what would be the most efficient way to select features in order to build a model for a multiclass target variable (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)? I have used RFE for feature selection, but it gives Rank=1 to all features. Do I consider all features for building the model? Is there any other method for this?
Thanks in advance.
Try a suite of methods, build models based on the features and compare the performance of those models.
Can you tell me how to select features for clinical datasets from a CSV file?
Try a suite of feature selection methods, build models based on selected features, use the set of features + model that results in the best model skill.
Hi Jason, How can I print the feature name and the importance side by side?
Thanks,
Sufian
Yes, if you have an array of feature or column names you can use the same index into both arrays.
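A small sketch of that idea on the iris dataset, pairing each feature name with its importance score from a fitted tree ensemble:

# print feature names next to their importance scores (illustrative sketch)
from sklearn import datasets
from sklearn.ensemble import ExtraTreesClassifier
dataset = datasets.load_iris()
model = ExtraTreesClassifier()
model.fit(dataset.data, dataset.target)
for name, score in zip(dataset.feature_names, model.feature_importances_):
    print(name, score)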
What are the feature selection methods? And how do I build models based on the selected features?
Can you help me with this? I am new to machine learning and Python.
Sure, read this post on feature selection:
https://machinelearningmastery.com/an-introduction-to-feature-selection/
I want to remove columns which are highly correlated, like the caret package's preprocessing method does in R. How can I remove them using sklearn?
You might need to implement it yourself – e.g. calculate the correlation matrix and remove selected columns.
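A minimal sketch of that idea with pandas, assuming your data is already in a DataFrame named df and using 0.95 as an arbitrary cut-off:

# drop one column from each highly correlated pair (illustrative sketch)
import numpy as np
corr = df.corr().abs()
# keep only the upper triangle so each pair is considered once
upper = corr.where(np.triu(np.ones(corr.shape), k=1).astype(bool))
# columns that are highly correlated with an earlier column
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
df_reduced = df.drop(columns=to_drop)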
Does Keras have similar functionality to RFE that we can use?
I am using Keras for my models. I created a model. Then, I wanted to use RFE for it. The first line (rfe = RFE(model, 3)) is fine, but as soon as I want to fit the data, I get the following error:
TypeError: Cannot clone object ” (type ): it does not seem to be a scikit-learn estimator as it does not implement a ‘get_params’ methods.
You may be able to use the sklearn wrappers in Keras and then put the wrapped model within RFE.
I have posts on using the wrappers on the blog, for example:
https://machinelearningmastery.com/use-keras-deep-learning-models-scikit-learn-python/
That is awesome! I’ll read it. Thanks a lot for your reply and sharing the link.
No problem.
After using your suggestion, the Keras model does not support the support_ or ranking_ attribute.
No it does not.
Then how can we run an RFE test on a Keras model?
Perhaps you can use the Keras wrapper for the model, then use it as part of RFE?
I did that, but no success. I am pasting the code for reference:
def create_model():
# create model
model = Sequential()
model.add(Dense(1000, input_dim=v.shape[1], activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(3, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
by_name=True)
seed = 7
np.random.seed(seed)
keras_model = KerasClassifier(build_fn=create_model, epochs=10, batch_size=10, verbose=1)
rfe = RFE(keras_model, 3)
rfe = rfe.fit(v, all_label_encoded)
print(rfe.support_)
print(rfe)
The model does not support support_ and ranking_. Can you tell me exactly how to get the ranking and the support?
I’m eager to help, but I don’t have the capacity to debug code.
I have some suggestions here:
https://machinelearningmastery.com/faq/single-faq/can-you-read-review-or-debug-my-code
Your answer justifies the stuff, thanks for the reply.
@Shubham Just to clarify: the Keras classifier will not work with RFE. The answer mentioned by Jason Brownlee will not work.
Perhaps you can try running a manual search over subsets of features with the model?
Perhaps you can run RFE with a sklearn model and use the results to motivate a Keras model?
OK
Hi Jason,
Can Random Forest’s feature importance be considered as a wrapper based approach?
No.
Is it an embedded method?
No, it is not an embedded method.
Hi Jason,
Do you know how is feature importance calculated?
It depends on the algorithm.
I cover it in detail for stochastic gradient boosting here:
https://machinelearningmastery.com/feature-importance-and-feature-selection-with-xgboost-in-python/
I feel that in recursive feature selection it is more prudent to use cross-validation and let the algorithm decide how many features to retain.
Yes. I often keep all features and use subspaces or ensembles of feature selection methods.
I need to select the best features from my own dataset using a wrapper approach to feature selection; the learning algorithm is ant colony optimization and the classifier is SVM. Does anyone have any ideas?
Nice post!
But I still have a question.
I entered a Kaggle competition recently, and I evaluated my dataset by using the methods you have posted (the model is RandomForest).
Then I deleted the worst feature, and my score decreased from 0.79904 to 0.78947. Then I was confused. Should I build more features? What should I do to get a higher score (change the model? expand the features or more?), and where can I learn those things?
Thanks a lot.
Great question. You must try lots of things; this is why ML is hard:
https://machinelearningmastery.com/applied-machine-learning-is-hard/
It’s a big search problem:
https://machinelearningmastery.com/applied-machine-learning-as-a-search-problem/
Here is a list of things to try:
https://machinelearningmastery.com/machine-learning-performance-improvement-cheat-sheet/
Hi Jason,
I wanted to know if there are any existing Python libraries that can be used to rank all the features in a specific dataset based on a specific attribute for various methods like Gain Ratio, Information Gain, Chi2, rank correlation, linear correlation, and symmetric uncertainty. If not, can you please provide some steps to proceed with the same?
Thanks
Perhaps?
Each method will have a different “view” on what is important in the data. You can test each view to see what is real/useful to developing a skilful model.
What about the feature importance attribute from the decision tree classifier? Could it be used for feature selection?
http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
Sure.
Could this method be used to perform feature subset selection on groups of subsets that have to be considered together? For instance, after performing a FeatureHasher transformation you have a fixed length hash which takes up say 256 columns which have to be considered as a group. Do you have any resources for this case?
Perhaps. Try it. Sorry, I don't have material on this topic. Try a search on scholar.google.com.
Regarding the ensemble learning model, I used it to reduce the features. But how can I get to know how many features I need to select?
Great question, I answer it here:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
How large can your feature set be before the efficacy of this algorithm breaks down?
Or, because it uses subsets, it returns a reasonable feature ranking even if you fit over a large number of features?
Thanks!
It depends on the dataset.
I am using the tree classifier on my dataset and it gives different values each time I run the script. Is this a problem? Or does it differ because of the different ways the features are linked by the tree?
This is to be expected, you can learn more about this here:
https://machinelearningmastery.com/randomness-in-machine-learning/
Does anyone have Python code for feature selection for classification and regression analysis?
Perhaps start here:
https://machinelearningmastery.com/an-introduction-to-feature-selection/
Is there a way to find the best number of features for each data set?
Yes, try a suite of feature selection methods, and a suite of models and use the combination of features and model that give the best performance.
For example, which algorithm can find the optimal number of features?
There are many solutions and each with different performance. Machine learning is empirical, there’s no idea of ‘best’, just good enough given time and resources.
I recommend reading this:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
For example, there are 500 features. Is there any way to know the number of features that show the highest classification accuracy when performing a feature selection algorithm?
Test different subsets of features by building a model from them and evaluate the performance of the model. The features that lead to a model with the best performance are the features that you should use.
Hey Jason,
Again a great post, I have followed several of your posts.
I want your opinion on the type of machine learning algorithm that I can use for my project on supervised learning.
This is a common question that I answer here:
https://machinelearningmastery.com/faq/single-faq/what-algorithm-config-should-i-use
Hello Jason,
Thank you for all your content. Big fan of all your posts.
I am now stuck in deciding when to use which feature selection method ( Filter, Wrapper & Embedded ) for my problem.
Can you please help or provide any reference links where I can get the required info.
Thanks in advance. !
Vaibhav
No problem, this is a common question that I answer here:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
Hi Jason,
I have a requirement about model predictions for text classification using keras.
Suppose I enter unrelated text for model prediction, text that the model was not trained on; I want it to instantly say that the entered query is invalid.
Please suggest any methods that are available.
thanks in advance 🙂
Sorry, I don’t follow, perhaps you can elaborate?
Hi,
There are many different methods for feature selection. It depends on the algorithm I use. For example, if I use logistic regression for prediction, then I cannot use random forest for feature selection (the subset of features from random forest can be non-significant in a logistic regression model).
Is the method you suggest suitable for logistic regression?
Perhaps start with RFE?
After using logistic regression for feature selection can we apply different models such as knn, decision tree, random forest etc to get the accuracy?
Perhaps your problem is too easy or too hard and all models find the same solution?
hi, Jason,
Thanks for your post, it’s clear and useful.
But I still have some questions.
1. Should I eliminate collinearity of variables before feature selection? Some posts say collinearity is not a problem for a nonlinear model, but I am afraid that it will affect the result of feature selection.
2. There are several feature selection methods in scikit-learn; different methods may select different subsets. How do I know which subset or method is more suitable?
3. When I build a machine learning model, the performance of the model seems more related to the number of features. No matter what features I use, the accuracy increases once a certain threshold is reached. How do I explain this?
Again, thanks a lot for your patient answer.
Perhaps, try it and see for your model + data.
Good question, try them all and see what works best, see this:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
If the features are relevant to the outcome, the model will figure out how to use them. Or most models will.
Thanks for the great posts. I have a problem for feature selection and parameter tuning.
Thanks in advance for the help,
I would like to do feature selection with recursive feature elimination and cross-validated selection of the best number of features. So I use RFECV:
But I am passing an untuned model, svm.SVC(kernel='linear'), to RFECV() to find a subset of best features. So I have not addressed the tuning of hyperparameters within the model.
Does it make sense to find some optimised hyperparameters of the model using grid search first, and THEN do RFE? (However, the parameter tuning is then performed on the un-optimized feature set.)
How about doing it vice versa, i.e. first feature selection and then parameter tuning? (However, the selected features are then chosen based on the untuned model.)
Although both GridSearchCV and RFECV perform feature selection independently in each fold of the cross-validation, and I can use different splitting criteria for RFECV and GridSearchCV,
I still suspect that, as I have to use the same dataset for parameter tuning as well as for RFECV selection, it may cause overfitting. Does it?
Do I have to take out a portion of the training set to do feature selection on, and then start model selection on the remaining data in the training set?
It might make sense to use a standalone RFE within a pipeline with a given algorithm; a sketch of this idea follows.
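A hedged sketch of that idea, shown on iris for illustration: wrapping RFE and the final estimator in a Pipeline means the feature selection is re-run inside each cross-validation fold, and the number of features can be tuned alongside the model hyperparameters (the grid values here are arbitrary):

# RFE inside a pipeline, tuned with grid search (illustrative sketch)
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
X, y = load_iris(return_X_y=True)
pipeline = Pipeline([
    ('rfe', RFE(estimator=LogisticRegression(max_iter=1000))),
    ('model', LogisticRegression(max_iter=1000)),
])
# tune the number of selected features and the regularisation strength together
param_grid = {
    'rfe__n_features_to_select': [1, 2, 3],
    'model__C': [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)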
Hi,
Will Recursive Feature Elimination also work well for categorical input datasets?
Sure.
Hi Jason, thanks for your hard work !
How do you explain the following behavior? Feature importance doesn't tell you to keep the same features as RFE… which one should we trust?
The code :
# Feature Importance
from sklearn import datasets
from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
# load the iris datasets
dataset = datasets.load_iris()
# fit a Random Forest model to the data
model = RandomForestClassifier()
model.fit(dataset.data, dataset.target)
# display the relative importance of each attribute
print(model.feature_importances_)
rfe = RFE(model, n_features_to_select=1)
rfe = rfe.fit(dataset.data, dataset.target)
# summarize the selection of the attributes
print(rfe.support_)
print(rfe.ranking_)
Output:
[0.02029219 0.01598919 0.57190818 0.39181044]
[False False False True]
[3 4 2 1]
This is a common question that I answer here:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
Great explanation, but I want to extract features from videos for human activity recognition (walk, sleep, jump). However, I don't know how to load the datasets. Any help will be appreciated.
Sorry, I don't have a tutorial on loading video.
Hello Jason,
I am trying to select the best features among 80 features in my dataset. My dataset contains integer as well as string values. I got an issue while trying to select the features using the SelectKBest method. Why did such an issue happen? Could you help me in understanding this?
Good question, I answer it here:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
Thanks Jason. I have another doubt. Will all the feature selection techniques, such as SelectKBest and feature importance, prioritize the features in the same order? If not, how could we get to know which particular method is best for feature selection?
Good question, I answer it here:
https://machinelearningmastery.com/faq/single-faq/what-feature-selection-method-should-i-use
Each time I execute a feature importance method, it gives different features as the best features. How is this possible?
Yes, each method has a different “idea” of what features to use.
Test a number of different approaches and choose one that results in the best performing model.
Thank you Jason.
You’re welcome.
Hi Jason
Can you provide me Python code for correlation-based feature selection?
Yes, here:
https://machinelearningmastery.com/feature-selection-with-real-and-categorical-data/
What is the role of the p-value in machine learning algorithms? Why use it?
It is used to interpret the result of a statistical hypothesis test:
https://machinelearningmastery.com/faq/single-faq/how-do-i-interpret-a-p-value
Hello Jason,
Thank you for the descriptive article.
I am working with microbiome data analysis and would like to use machine learning to pick a set of genera which can classify samples between two categories (for example, healthy and disease).
I used the following code:
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2
from sklearn.feature_selection import SelectFpr
from sklearn.feature_selection import GenericUnivariateSelect
X = df_n #dataset with 131 columns and 51 rows
y = list(map(lambda x : x[:2], df_n.index))
bestfeatures = GenericUnivariateSelect(chi2, 'k_best')
fit = bestfeatures.fit(X,y)
pvalues = -np.log10(bestfeatures.pvalues_) #convert pvalues into log format
dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)
dfpvalues = pd.DataFrame(pvalues)
#concat two dataframes for better visualization
featureScores = pd.concat([dfcolumns,dfscores,dfpvalues],axis=1)
featureScores.columns = ['Specs', 'Score', 'pvalues'] #naming the dataframe columns
FS = featureScores.loc[featureScores['pvalues'] < 0.05, :]
print(FS.nlargest(10, 'pvalues')) #top 10 features
Specs Score pvalues
41 a1 0.206076 0.044749
22 a2 0.193496 0.042017
11 a3 0.153464 0.033324
117 a4 0.143448 0.031149
20 a5 0.143214 0.031099
45 a6 0.136450 0.029630
67 a7 0.132488 0.028769
0 a8 0.122946 0.026697
80 a9 0.120120 0.026084
123 a10 0.118977 0.025836
Now I would like to use this list of features to make a PCoA plot with Bray-Curtis, because I want to visualize how these features can distinguish the 40 samples into two different categories (already known).
Can you help me by guiding in this regard?
What is a PCoA plot and what is Bray-Curtis?
Hi,
After rfe.fit and getting the rankings of the features, how do we get the feature names according to the rankings? Also, which rankings would we choose to go ahead and train the model?
The ranking has the indexes of each feature; you can use these indexes to access the column names from an array or from your dataframe.
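A small sketch of that mapping, assuming a fitted RFE object named rfe and a pandas DataFrame X whose columns carry the feature names:

# relate RFE rankings back to column names (illustrative sketch)
for name, rank, selected in zip(X.columns, rfe.ranking_, rfe.support_):
    print(name, rank, selected)  # rank 1 means the feature was selected
# keep just the names of the selected features
selected_names = [name for name, keep in zip(X.columns, rfe.support_) if keep]
print(selected_names)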
Hi Jason,
RFE selects the feature set based on the training data.
Although in general fewer features tend to prevent overfitting, how does it ensure that the best-performing features were not due to overfitted training data, since there is no validation set in place?
Also, how does RFE differ from the importance_plot from XGBoost or random forest or gradient boosting, which shows the list of features based on gain importance?
RFE cannot help you prevent overfitting.
They are very different. RFE is calculated using any model you like and selects features based on how they impact model performance. Feature importance from ensembles of trees is calculated based on how much the features are used in the trees.
Hi,
thank you for the tutorial.
Something that is not clear to me is whether RFE is only used for classification or whether it can be used for regression problems as well.
When adapting the tutorial above to another dataset, it keeps alerting that the data is continuous. This is normally associated with classifiers, isn’t it?
Thank you once more.
It can be used for classification or regression, see examples here:
https://machinelearningmastery.com/rfe-feature-selection-in-python/
Hey there,
Can we extract the feature names from the model only?
Say you just have a fitted model and now you have to calculate its score, but the problem is you don't have the list of features used in it. You just have the model and the training dataset.
If yes, then please help me because I am stuck on this!
Thanks
It will suggest feature/column indexes, you can then relate these to the names of the features in the original dataset directly.
hi Jason,
It's a good article.
I have one doubt: if I don't know the number of features to select, how should I go about selecting the optimum number of features required for RFE?
Thanks and regards
Good question.
You can use a grid search and test each number of features from 1 to the total number of features, here is an example:
https://machinelearningmastery.com/rfe-feature-selection-in-python/
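An alternative is RFECV, which cross-validates each number of features and picks the best automatically; a minimal sketch on iris for illustration:

# let cross-validation choose the number of features (illustrative sketch)
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
X, y = load_iris(return_X_y=True)
rfecv = RFECV(estimator=LogisticRegression(max_iter=1000), cv=5)
rfecv.fit(X, y)
print(rfecv.n_features_)  # number of features chosen by cross-validation
print(rfecv.support_)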