Blog - Page 48 of 136

One-vs-Rest and One-vs-One for Multi-Class Classification

By Jason Brownlee on April 27, 2021 in Ensemble Learning 74

Not all classification predictive models support multi-class classification. Algorithms such as the Perceptron, Logistic Regression, and Support Vector Machines were designed for binary classification and do not natively support classification tasks with more than two classes. One approach for using binary classification algorithms for multi-class classification problems is to split the multi-class classification dataset into multiple […]

Box Plot of Standalone and Stacking Model Accuracies for Binary Classification

Stacking Ensemble Machine Learning With Python

By Jason Brownlee on April 27, 2021 in Ensemble Learning 135

Stacking or Stacked Generalization is an ensemble machine learning algorithm. It uses a meta-learning algorithm to learn how to best combine the predictions from two or more base machine learning algorithms. The benefit of stacking is that it can harness the capabilities of a range of well-performing models on a classification or regression task and […]

Scatter Plot of Multi-Class Classification Dataset

4 Types of Classification Tasks in Machine Learning

By Jason Brownlee on August 19, 2020 in Python Machine Learning 115

Machine learning is a field of study concerned with algorithms that learn from examples. Classification is a task that requires the use of machine learning algorithms that learn how to assign a class label to examples from the problem domain. An easy-to-understand example is classifying emails as “spam” or “not spam.” […]

Scatter Plot of Synthetic Clustering Dataset With Points Colored by Known Cluster

10 Clustering Algorithms With Python

By Jason Brownlee on August 20, 2020 in Python Machine Learning 147

Clustering or cluster analysis is an unsupervised learning problem. It is often used as a data analysis technique for discovering interesting patterns in data, such as groups of customers based on their behavior. There are many clustering algorithms to choose from and no single best clustering algorithm for all cases. Instead, it is a good […]

What Is Argmax in Machine Learning?

By Jason Brownlee on August 19, 2020 in Linear Algebra 25

Argmax is a mathematical function that you may encounter in applied machine learning. For example, you may see “argmax” or “arg max” used in a research paper to describe an algorithm. You may also be instructed to use the argmax function in your algorithm implementation. This may be the first time that you encounter […]

Gradient Boosting with Scikit-Learn, XGBoost, LightGBM, and CatBoost

By Jason Brownlee on April 27, 2021 in Ensemble Learning 59

Gradient boosting is a powerful ensemble machine learning algorithm. It’s popular for structured predictive modeling problems, such as classification and regression on tabular data, and is often the main algorithm or one of the main algorithms used in winning solutions to machine learning competitions, like those on Kaggle. There are many implementations of gradient boosting […]

Bar Chart of XGBClassifier Feature Importance Scores

How to Calculate Feature Importance With Python

By Jason Brownlee on August 20, 2020 in Data Preparation 237

Feature importance refers to techniques that assign a score to input features based on how useful they are at predicting a target variable. There are many types and sources of feature importance scores, although popular examples include statistical correlation scores, coefficients calculated as part of linear models, decision trees, and permutation importance scores. Feature importance […]

How to Develop Multi-Output Regression Models with Python

By Jason Brownlee on April 27, 2021 in Ensemble Learning 219

Multioutput regression refers to regression problems that involve predicting two or more numerical values given an input example. An example might be to predict a coordinate given an input, e.g. predicting x and y values. Another example would be multi-step time series forecasting that involves predicting multiple future time steps of a given variable. Many machine […]

4 Distance Measures for Machine Learning

By Jason Brownlee on August 19, 2020 in Python Machine Learning 40

Distance measures play an important role in machine learning. They provide the foundation for many popular and effective machine learning algorithms like k-nearest neighbors for supervised learning and k-means clustering for unsupervised learning. Different distance measures must be chosen and used depending on the type of data. As such, it is important to know […]

Line Plot of Variance Threshold (X) Versus Number of Selected Features (Y)

How to Perform Data Cleaning for Machine Learning with Python

By Jason Brownlee on June 30, 2020 in Data Preparation 68

Data cleaning is a critically important step in any machine learning project. In tabular data, there are many different statistical analysis and data visualization techniques you can use to explore your data in order to identify data cleaning operations you may want to perform. Before jumping to the sophisticated methods, there are some very basic […]


MachineLearningMastery.com