Archive | Imbalanced Classification

Histogram of Examples in Each Class in the Glass Multi-Class Classification Dataset

Multi-Class Imbalanced Classification

By Jason Brownlee on January 5, 2021 in Imbalanced Classification

Imbalanced classification refers to those prediction tasks where the distribution of examples across class labels is not equal. Most imbalanced classification examples focus on binary classification tasks, yet many of the tools and techniques for imbalanced classification also directly support multi-class classification problems. In this tutorial, you will discover how to use the tools of imbalanced […]

Histogram of Variables in the E.coli Dataset

Imbalanced Multiclass Classification with the E.coli Dataset

By Jason Brownlee on January 5, 2021 in Imbalanced Classification

Multiclass classification problems are those where a label must be predicted, but there are more than two labels that may be predicted. These are challenging predictive modeling problems because a sufficiently representative number of examples of each class is required for a model to learn the problem. It is made more challenging when the number of […]

Box and Whisker Plot of Machine Learning Models on the Imbalanced Glass Identification Dataset

Imbalanced Multiclass Classification with the Glass Identification Dataset

By Jason Brownlee on August 21, 2020 in Imbalanced Classification

How to Predict the Probability of Fraudulent Credit Card Transactions

Imbalanced Classification with the Fraudulent Credit Card Transactions Dataset

By Jason Brownlee on August 21, 2020 in Imbalanced Classification

Fraud is a major problem for credit card companies, both because of the large volume of transactions that are completed each day and because many fraudulent transactions look a lot like normal transactions. Identifying fraudulent credit card transactions is a common type of imbalanced binary classification where the focus is on the positive class (is […]

How to Spot-Check Imbalanced Machine Learning Algorithms

Step-By-Step Framework for Imbalanced Classification Projects

By Jason Brownlee on March 19, 2020 in Imbalanced Classification

Classification predictive modeling problems involve predicting a class label for a given set of inputs. It is a challenging problem in general, especially if little is known about the dataset, as there are tens, if not hundreds, of machine learning algorithms to choose from. The problem is made significantly more difficult if the distribution of […]

Develop an Imbalanced Classification Model to Predict Income

Imbalanced Classification with the Adult Income Dataset

By Jason Brownlee on October 27, 2020 in Imbalanced Classification

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. A popular example is the adult income dataset, which involves predicting personal income levels as above or below $50,000 per year based on personal details such as relationship and education level. There […]

Histogram Plots of the Variables for the Phoneme Dataset

Predictive Model for the Phoneme Imbalanced Classification Dataset

By Jason Brownlee on January 5, 2021 in Imbalanced Classification

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. Nevertheless, accuracy is equally important in both classes. An example is the classification of vowel sounds from European languages as either nasal or oral in speech recognition, where there are many more […]

Develop an Imbalanced Classification Model to Detect Microcalcifications

Imbalanced Classification Model to Detect Mammography Microcalcifications

By Jason Brownlee on August 21, 2020 in Imbalanced Classification

Cancer detection is a popular example of an imbalanced classification problem because there are often significantly more cases of non-cancer than actual cancer. A standard imbalanced classification dataset is the mammography dataset that involves detecting breast cancer from radiological scans, specifically the presence of clusters of microcalcifications that appear bright on a mammogram. This dataset […]

Box and Whisker Plot of Machine Learning Models on the Imbalanced German Credit Dataset

Develop a Model for the Imbalanced Classification of Good and Bad Credit

By Jason Brownlee on January 5, 2021 in Imbalanced Classification

Misclassification errors on the minority class are more important than other types of prediction errors for some imbalanced classification tasks. One example is the problem of classifying bank customers as to whether they should receive a loan or not. Giving a loan to a bad customer marked as a good customer results in a greater […]

How to Calibrate Probabilities for Imbalanced Classification

By Jason Brownlee on August 21, 2020 in Imbalanced Classification

Many machine learning models are capable of predicting probabilities or probability-like scores for class membership. Probabilities provide a required level of granularity for evaluating and comparing models, especially on imbalanced classification problems where tools like ROC Curves are used to interpret predictions and the ROC AUC metric is used to compare model performance, both […]
