Archive | Imbalanced Classification

Histogram of Examples in Each Class in the Glass Multi-Class Classification Dataset

Multi-Class Imbalanced Classification

Imbalanced classification problems are those prediction tasks where the distribution of examples across class labels is not equal. Most imbalanced classification examples focus on binary classification tasks, yet many of the tools and techniques for imbalanced classification also directly support multi-class classification problems. In this tutorial, you will discover how to use the tools of imbalanced […]

Box and Whisker Plot of Machine Learning Models on the Imbalanced Glass Identification Dataset

Imbalanced Multiclass Classification with the Glass Identification Dataset

Multiclass classification problems are those where a label must be predicted, but there are more than two labels that may be predicted. These are challenging predictive modeling problems because a sufficiently representative number of examples of each class is required for a model to learn the problem. It is made challenging when the number of […]

How to Predict the Probability of Fraudulent Credit Card Transactions

Imbalanced Classification with the Fraudulent Credit Card Transactions Dataset

Fraud is a major problem for credit card companies, both because of the large volume of transactions that are completed each day and because many fraudulent transactions look a lot like normal transactions. Identifying fraudulent credit card transactions is a common type of imbalanced binary classification where the focus is on the positive class (is […]

How to Spot-Check Imbalanced Machine Learning Algorithms

Step-By-Step Framework for Imbalanced Classification Projects

Classification predictive modeling problems involve predicting a class label for a given set of inputs. It is a challenging problem in general, especially if little is known about the dataset, as there are tens, if not hundreds, of machine learning algorithms to choose from. The problem is made significantly more difficult if the distribution of […]

Histogram Plots of the Variables for the Phoneme Dataset

Predictive Model for the Phoneme Imbalanced Classification Dataset

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. Nevertheless, accuracy is equally important in both classes. An example is the classification of vowel sounds from European languages as either nasal or oral in speech recognition where there are many more […]

Develop an Imbalanced Classification Model to Detect Microcalcifications

Imbalanced Classification Model to Detect Mammography Microcalcifications

Cancer detection is a popular example of an imbalanced classification problem because there are often significantly more cases of non-cancer than actual cancer. A standard imbalanced classification dataset is the mammography dataset that involves detecting breast cancer from radiological scans, specifically the presence of clusters of microcalcifications that appear bright on a mammogram. This dataset […]

How to Calibrate Probabilities for Imbalanced Classification

Many machine learning models are capable of predicting probabilities or probability-like scores for class membership. Probabilities provide a required level of granularity for evaluating and comparing models, especially on imbalanced classification problems where tools like ROC Curves are used to interpret predictions and the ROC AUC metric is used to compare model performance, both […]
