SALE! Use code blackfriday for 40% off everything!
Hurry, sale ends soon! Click to see the full catalog.

Archive | Ensemble Learning

Box Plots of XGBoost Random Forest Feature Set Size vs. Classification Accuracy

How to Develop Random Forest Ensembles With XGBoost

The XGBoost library provides an efficient implementation of gradient boosting that can be configured to train random forest ensembles. Random forest is a simpler algorithm than gradient boosting. The XGBoost library allows the models to be trained in a way that repurposes and harnesses the computational efficiencies implemented in the library for training random forest […]

Continue Reading 0
Box Plots of LightGBM Ensemble Tree Depth vs. Classification Accuracy

How to Develop a Light Gradient Boosted Machine (LightGBM) Ensemble

Light Gradient Boosted Machine, or LightGBM for short, is an open-source library that provides an efficient and effective implementation of the gradient boosting algorithm. LightGBM extends the gradient boosting algorithm by adding a type of automatic feature selection as well as focusing on boosting examples with larger gradients. This can result in a dramatic speedup […]

Continue Reading 4
Box Plots of XGBoost Ensemble Column Ratio vs. Classification Accuracy

Extreme Gradient Boosting (XGBoost) Ensemble in Python

Extreme Gradient Boosting (XGBoost) is an open-source library that provides an efficient and effective implementation of the gradient boosting algorithm. Although other open-source implementations of the approach existed before XGBoost, the release of XGBoost appeared to unleash the power of the technique and made the applied machine learning community take notice of gradient boosting more […]

Continue Reading 0
Box and Whisker Plots of Accuracy of Singles Model Fit On Selected Features vs. Ensemble

How to Develop a Feature Selection Subspace Ensemble in Python

Random subspace ensembles consist of the same model fit on different randomly selected groups of input features (columns) in the training dataset. There are many ways to choose groups of features in the training dataset, and feature selection is a popular class of data preparation techniques designed specifically for this purpose. The features selected by […]

Continue Reading 4
Multivariate Adaptive Regression Splines (MARS) in Python

Multivariate Adaptive Regression Splines (MARS) in Python

Multivariate Adaptive Regression Splines, or MARS, is an algorithm for complex non-linear regression problems. The algorithm involves finding a set of simple linear functions that in aggregate result in the best predictive performance. In this way, MARS is a type of ensemble of simple linear functions and can achieve good performance on challenging regression problems […]

Continue Reading 22
Example of Combining Hyperplanes Using an Ensemble

Develop an Intuition for How Ensemble Learning Works

Ensembles are a machine learning method that combine the predictions from multiple models in an effort to achieve better predictive performance. There are many different types of ensembles, although all approaches have two key properties: they require that the contributing models are different so that they make different errors and they combine the predictions in […]

Continue Reading 7
Box Plot of Random Subspace Ensemble Features vs. Classification Accuracy

How to Develop a Random Subspace Ensemble With Python

Random Subspace Ensemble is a machine learning algorithm that combines the predictions from multiple decision trees trained on different subsets of columns in the training dataset. Randomly varying the columns used to train each contributing member of the ensemble has the effect of introducing diversity into the ensemble and, in turn, can lift performance over […]

Continue Reading 28
Box and Whisker Plots of Bits Per Class vs. Distribution of Classification Accuracy for ECOC

Error-Correcting Output Codes (ECOC) for Machine Learning

Machine learning algorithms, like logistic regression and support vector machines, are designed for two-class (binary) classification problems. As such, these algorithms must either be modified for multi-class (more than two) classification problems or not used at all. The Error-Correcting Output Codes method is a technique that allows a multi-class classification problem to be reframed as […]

Continue Reading 0
Why Use Ensemble Learning

Why Use Ensemble Learning?

What are the Benefits of Ensemble Methods for Machine Learning? Ensembles are predictive models that combine predictions from two or more other models. Ensemble learning methods are popular and the go-to technique when the best performance on a predictive modeling project is the most important outcome. Nevertheless, they are not always the most appropriate technique […]

Continue Reading 4