Plot a Decision Surface for Machine Learning Algorithms in Python

By Jason Brownlee on August 26, 2020 in Python Machine Learning

Classification algorithms learn how to assign class labels to examples, although their decisions can appear opaque.

A popular diagnostic for understanding the decisions made by a classification algorithm is the decision surface. This is a plot that shows how a fit machine learning algorithm predicts a coarse grid across the input feature space.

A decision surface plot is a powerful tool for understanding how a given model “sees” the prediction task and how it has decided to divide the input feature space by class label.

In this tutorial, you will discover how to plot a decision surface for a classification machine learning algorithm.

After completing this tutorial, you will know:

A decision surface is a diagnostic tool for understanding how a classification algorithm divides up the feature space.
How to plot a decision surface using crisp class labels for a machine learning algorithm.
How to plot and interpret a decision surface using predicted probabilities.

Let’s get started.

Plot a Decision Surface for Machine Learning Algorithms in Python
Photo by Tony Webster, some rights reserved.

Tutorial Overview

This tutorial is divided into three parts; they are:

Decision Surface
Dataset and Model
Plot a Decision Surface

Decision Surface

Classification machine learning algorithms learn to assign labels to input examples.

Consider numeric input features for the classification task defining a continuous input feature space.

We can think of each input feature defining an axis or dimension on a feature space. Two input features would define a feature space that is a plane, with dots representing input coordinates in the input space. If there were three input variables, the feature space would be a three-dimensional volume.

Each point in the space can be assigned a class label. In terms of a two-dimensional feature space, we can think of each point on the plane having a different color, according to its assigned class.

The goal of a classification algorithm is to learn how to divide up the feature space such that labels are assigned correctly to points in the feature space, or at least, as correctly as is possible.

This is a useful geometric understanding of classification predictive modeling. We can take it one step further.

Once a classification machine learning algorithm divides a feature space, we can then classify each point in the feature space, on some arbitrary grid, to get an idea of how exactly the algorithm chose to divide up the feature space.

This is called a decision surface or decision boundary, and it provides a diagnostic tool for understanding a model on a classification predictive modeling task.

Although the notion of a “surface” suggests a two-dimensional feature space, the method can be used with feature spaces with more than two dimensions, where a surface is created for each pair of input features.
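
As a rough sketch of the pairwise idea, the feature pairs that would each get their own surface can be enumerated with itertools.combinations. The value of n_features below is an assumption chosen for illustration, and how the remaining features are handled for each plot (e.g. held fixed, or the model refit on just the pair) is a design choice that is not specified here.

```python
# enumerate the feature pairs that would each get their own decision surface
from itertools import combinations

# hypothetical number of input features (an assumption for illustration)
n_features = 4
# each pair of column indexes defines one two-dimensional plot
pairs = list(combinations(range(n_features), 2))
for ix1, ix2 in pairs:
	print('surface for features %d and %d' % (ix1, ix2))
```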

Now that we are familiar with what a decision surface is, let’s define a dataset and model for which we will later explore the decision surface.

Dataset and Model

In this section, we will define a classification task and predictive model to learn the task.

Synthetic Classification Dataset

We can use the make_blobs() scikit-learn function to define a classification task with a two-dimensional numerical feature space and each point assigned one of two class labels, i.e. a binary classification task.

...
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)

Once defined, we can then create a scatter plot of the feature space with the first feature defining the x-axis, the second feature defining the y-axis, and each sample represented as a point in the feature space.

We can then color points in the scatter plot according to their class label as either 0 or 1.

...
# create scatter plot for samples from each class
for class_value in range(2):
	# get row indexes for samples with this class
	row_ix = where(y == class_value)
	# create scatter of these samples
	pyplot.scatter(X[row_ix, 0], X[row_ix, 1])
# show the plot
pyplot.show()

Tying this together, the complete example of defining and plotting a synthetic classification dataset is listed below.

# generate binary classification dataset and plot
from numpy import where
from matplotlib import pyplot
from sklearn.datasets import make_blobs
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
# create scatter plot for samples from each class
for class_value in range(2):
	# get row indexes for samples with this class
	row_ix = where(y == class_value)
	# create scatter of these samples
	pyplot.scatter(X[row_ix, 0], X[row_ix, 1])
# show the plot
pyplot.show()

Running the example creates the dataset, then plots the dataset as a scatter plot with points colored by class label.

We can see a clear separation between examples from the two classes and we can imagine how a machine learning model might draw a line to separate the two classes, e.g. perhaps a diagonal line right through the middle of the two groups.

Scatter Plot of Binary Classification Dataset With 2D Feature Space

Fit Classification Predictive Model

We can now fit a model on our dataset.

In this case, we will fit a logistic regression algorithm because we can predict both crisp class labels and probabilities, both of which we can use in our decision surface.

We can define the model, then fit it on the training dataset.

...
# define the model
model = LogisticRegression()
# fit the model
model.fit(X, y)

Once fit, we can use the model to make a prediction for the training dataset to get an idea of how well it learned to divide the feature space of the training dataset and assign labels.

...
# make predictions
yhat = model.predict(X)

The predictions can be evaluated using classification accuracy.

...
# evaluate the predictions
acc = accuracy_score(y, yhat)
print('Accuracy: %.3f' % acc)

Tying this together, the complete example of fitting and evaluating a model on the synthetic binary classification dataset is listed below.

# example of fitting and evaluating a model on the classification dataset
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
# define the model
model = LogisticRegression()
# fit the model
model.fit(X, y)
# make predictions
yhat = model.predict(X)
# evaluate the predictions
acc = accuracy_score(y, yhat)
print('Accuracy: %.3f' % acc)

Running the example fits the model and makes a prediction for each example.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

In this case, we can see that the model achieved a classification accuracy of about 97.2 percent on the training dataset.

Accuracy: 0.972

Now that we have a dataset and model, let’s explore how we can develop a decision surface.

Plot a Decision Surface

We can create a decision surface by fitting a model on the training dataset, then using the model to make predictions for a grid of values across the input domain.

Once we have the grid of predictions, we can plot the values and their class label.

A scatter plot could be used if a fine enough grid was taken. A better approach is to use a contour plot that can interpolate the colors between the points.

The contourf() Matplotlib function can be used.

This requires a few steps.

First, we need to define a grid of points across the feature space.

To do this, we can find the minimum and maximum values for each feature and extend the range by one unit in each direction to ensure the whole feature space is covered.

...
# define bounds of the domain
min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1

We can then create a uniform sample across each dimension using the arange() function at a chosen resolution. We will use a resolution of 0.1 in this case.

...
# define the x and y scale
x1grid = arange(min1, max1, 0.1)
x2grid = arange(min2, max2, 0.1)
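
As a hedged aside, arange() takes a step size and excludes the end value, so with a floating-point step the number of points depends on rounding. If a fixed number of points per axis is preferred, linspace() is one alternative. The sketch below uses made-up bounds rather than the bounds of the dataset, and the count of 101 points is an arbitrary choice.

```python
# alternative way to sample an axis: a fixed number of points with linspace
from numpy import arange
from numpy import linspace

# hypothetical bounds for one feature (assumed values for illustration)
min1, max1 = -5.0, 5.0
# arange: fixed step size, the end value is excluded
by_step = arange(min1, max1, 0.1)
# linspace: fixed number of points, the end value is included
by_count = linspace(min1, max1, 101)
print(by_step.shape, by_count.shape)
```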

Now we need to turn this into a grid.

We can use the meshgrid() NumPy function to create a grid from these two vectors.

If the first feature x1 is our x-axis of the feature space, then we need one row of x1 values of the grid for each point on the y-axis.

Similarly, if we take x2 as our y-axis of the feature space, then we need one column of x2 values of the grid for each point on the x-axis.

The meshgrid() function will do this for us, duplicating the rows and columns as needed. It returns two grids for the two input vectors: the first contains the x-values and the second the y-values, each organized in an appropriately sized grid of rows and columns across the feature space.

...
# create all of the lines and rows of the grid
xx, yy = meshgrid(x1grid, x2grid)
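
To make the row and column duplication concrete, here is a small worked example on two short vectors. The toy values are assumptions for illustration and are unrelated to the dataset.

```python
# small worked example of meshgrid on two short vectors
from numpy import array
from numpy import meshgrid

# toy axis values (assumed for illustration, not from the dataset)
x1grid = array([0, 1, 2])
x2grid = array([10, 20])
xx, yy = meshgrid(x1grid, x2grid)
# both grids have len(x2grid) rows and len(x1grid) columns
print(xx.shape, yy.shape)
# each row of xx is a copy of x1grid
print(xx)
# each column of yy is a copy of x2grid
print(yy)
```

Both grids have the same shape, so the element at a given row and column of xx and yy together gives one (x1, x2) coordinate.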

We then need to flatten out the grid to create samples that we can feed into the model and make a prediction.

To do this, first, we flatten each grid into a vector.

...
# flatten each grid to a vector
r1, r2 = xx.flatten(), yy.flatten()
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))

Then we stack the vectors side by side as columns in an input dataset, just like our original training dataset, but at a much higher resolution.

...
# horizontal stack vectors to create x1,x2 input for the model
grid = hstack((r1,r2))
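
The flatten, reshape and stack steps can also be written more compactly. The sketch below is one possible alternative, not the approach used in this tutorial: it uses the NumPy column_stack() function, which accepts one-dimensional vectors directly, and checks on a toy grid (assumed values) that both approaches give the same array.

```python
# compact alternative: build the model input with column_stack
from numpy import arange
from numpy import array_equal
from numpy import column_stack
from numpy import hstack
from numpy import meshgrid

# toy grid (assumed values for illustration)
xx, yy = meshgrid(arange(0, 3), arange(0, 2))
# approach from the tutorial: flatten, reshape to columns, then hstack
r1, r2 = xx.flatten(), yy.flatten()
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))
grid_a = hstack((r1, r2))
# column_stack accepts the 1D vectors directly
grid_b = column_stack((xx.ravel(), yy.ravel()))
print(array_equal(grid_a, grid_b))
```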

We can then feed this into our model and get a prediction for each point in the grid.

...
# make predictions for the grid
yhat = model.predict(grid)

So far, so good.

We have a grid of values across the feature space and the class labels as predicted by our model.

Next, we need to plot the grid of values as a contour plot.

The contourf() function takes separate grids for each axis, just like what was returned from our prior call to meshgrid(). Great!

So we can use xx and yy that we prepared earlier and simply reshape the predictions (yhat) from the model to have the same shape.

...
# reshape the predictions back into a grid
zz = yhat.reshape(xx.shape)
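
The reshape works because the predictions come back in the same row-wise order in which the grid was flattened. The sketch below checks this on a toy grid, using a made-up rule (label 1 when x1 + x2 > 0) as a stand-in for model.predict(); both the grid values and the rule are assumptions for illustration.

```python
# check that reshaped predictions line up with the grid coordinates
from numpy import arange
from numpy import hstack
from numpy import meshgrid

# toy grid (assumed values for illustration)
xx, yy = meshgrid(arange(-2, 3), arange(-1, 2))
r1, r2 = xx.flatten(), yy.flatten()
grid = hstack((r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))))
# stand-in for model.predict(): label 1 when x1 + x2 > 0 (a made-up rule)
yhat = (grid[:, 0] + grid[:, 1] > 0).astype(int)
# reshape the flat predictions back into the shape of the grid
zz = yhat.reshape(xx.shape)
# zz[i, j] is the label for the point (xx[i, j], yy[i, j])
aligned = ((xx + yy > 0).astype(int) == zz).all()
print(aligned)
```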

We then plot the decision surface with a two-color colormap.

...
# plot the grid of x, y and z values as a surface
pyplot.contourf(xx, yy, zz, cmap='Paired')

We can then plot the actual points of the dataset over the top to see how well they were separated by the logistic regression decision surface.

The complete example of plotting a decision surface for a logistic regression model on our synthetic binary classification dataset is listed below.

# decision surface for logistic regression on a binary classification dataset
from numpy import where
from numpy import meshgrid
from numpy import arange
from numpy import hstack
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from matplotlib import pyplot
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
# define bounds of the domain
min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1
# define the x and y scale
x1grid = arange(min1, max1, 0.1)
x2grid = arange(min2, max2, 0.1)
# create all of the lines and rows of the grid
xx, yy = meshgrid(x1grid, x2grid)
# flatten each grid to a vector
r1, r2 = xx.flatten(), yy.flatten()
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))
# horizontal stack vectors to create x1,x2 input for the model
grid = hstack((r1,r2))
# define the model
model = LogisticRegression()
# fit the model
model.fit(X, y)
# make predictions for the grid
yhat = model.predict(grid)
# reshape the predictions back into a grid
zz = yhat.reshape(xx.shape)
# plot the grid of x, y and z values as a surface
pyplot.contourf(xx, yy, zz, cmap='Paired')
# create scatter plot for samples from each class
for class_value in range(2):
	# get row indexes for samples with this class
	row_ix = where(y == class_value)
	# create scatter of these samples
	pyplot.scatter(X[row_ix, 0], X[row_ix, 1])
# show the plot
pyplot.show()

Running the example fits the model and uses it to predict outcomes for the grid of values across the feature space and plots the result as a contour plot.

We can see, as we might have suspected, logistic regression divides the feature space using a straight line. It is a linear model, after all; this is all it can do.

Creating a decision surface is almost like magic. It gives immediate and meaningful insight into how the model has learned the task.

Try it with different algorithms, like an SVM or decision tree.
Post your resulting maps as links in the comments below!

Decision Surface for Logistic Regression on a Binary Classification Task
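
As a starting point for trying other algorithms, the grid steps can be wrapped in a small function that works with any fit classifier. This is a minimal sketch: predict_grid() is a hypothetical helper written for this example (it is not part of scikit-learn), and the decision tree with max_depth=3 is an arbitrary choice.

```python
# sketch: the same grid procedure wrapped in a function for any classifier
from numpy import arange
from numpy import hstack
from numpy import meshgrid
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

# hypothetical helper (not part of scikit-learn): predict labels over a grid
def predict_grid(model, X, step=0.1):
	# define bounds of the domain
	min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
	min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1
	# create the grid and flatten it into rows of x1,x2 input
	xx, yy = meshgrid(arange(min1, max1, step), arange(min2, max2, step))
	r1, r2 = xx.flatten(), yy.flatten()
	grid = hstack((r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))))
	# predict a label for each point and reshape back into a grid
	zz = model.predict(grid).reshape(xx.shape)
	return xx, yy, zz

# same synthetic dataset as the tutorial
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
# fit a decision tree; max_depth=3 is an arbitrary choice for illustration
model = DecisionTreeClassifier(max_depth=3, random_state=1)
model.fit(X, y)
xx, yy, zz = predict_grid(model, X)
print(xx.shape, zz.shape)
```

The returned xx, yy and zz can be passed to pyplot.contourf(xx, yy, zz, cmap='Paired') as before. Because a decision tree splits on one feature at a time, its regions are bounded by axis-aligned edges rather than a single diagonal line.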

We can add more depth to the decision surface by using the model to predict probabilities instead of class labels.

...
# make predictions for the grid
yhat = model.predict_proba(grid)
# keep just the probabilities for class 0
yhat = yhat[:, 0]

When plotted, we can see how confident or likely it is that each point in the feature space belongs to each of the class labels, as seen by the model.
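
To see what predict_proba() returns before plotting it, the sketch below inspects the output on the training data. It shows that there is one column per class in the order given by model.classes_, that each row sums to one, and that thresholding the class 1 probability at 0.5 reproduces the crisp labels for essentially all points. The variable names row_sums and agreement are introduced here for illustration.

```python
# inspect predicted probabilities from a fit logistic regression model
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# same synthetic dataset as the tutorial
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
model = LogisticRegression()
model.fit(X, y)
proba = model.predict_proba(X)
# one column per class, in the order given by model.classes_
print(model.classes_, proba.shape)
# the two columns sum to one for every row
row_sums = proba.sum(axis=1)
# fraction of rows where thresholding class 1 at 0.5 matches predict()
agreement = ((proba[:, 1] > 0.5).astype(int) == model.predict(X)).mean()
print(row_sums.min(), row_sums.max(), agreement)
```

This is why keeping only column 0 loses no information in the binary case: the probability of class 1 is one minus the probability of class 0, and the 0.5 level of the probability surface corresponds to the crisp decision boundary.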

We can use a different color map that has gradations, and show a legend so we can interpret the colors.

...
# plot the grid of x, y and z values as a surface
c = pyplot.contourf(xx, yy, zz, cmap='RdBu')
# add a legend, called a color bar
pyplot.colorbar(c)

The complete example of creating a decision surface using probabilities is listed below.

# probability decision surface for logistic regression on a binary classification dataset
from numpy import where
from numpy import meshgrid
from numpy import arange
from numpy import hstack
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from matplotlib import pyplot
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
# define bounds of the domain
min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1
# define the x and y scale
x1grid = arange(min1, max1, 0.1)
x2grid = arange(min2, max2, 0.1)
# create all of the lines and rows of the grid
xx, yy = meshgrid(x1grid, x2grid)
# flatten each grid to a vector
r1, r2 = xx.flatten(), yy.flatten()
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))
# horizontal stack vectors to create x1,x2 input for the model
grid = hstack((r1,r2))
# define the model
model = LogisticRegression()
# fit the model
model.fit(X, y)
# make predictions for the grid
yhat = model.predict_proba(grid)
# keep just the probabilities for class 0
yhat = yhat[:, 0]
# reshape the predictions back into a grid
zz = yhat.reshape(xx.shape)
# plot the grid of x, y and z values as a surface
c = pyplot.contourf(xx, yy, zz, cmap='RdBu')
# add a legend, called a color bar
pyplot.colorbar(c)
# create scatter plot for samples from each class
for class_value in range(2):
	# get row indexes for samples with this class
	row_ix = where(y == class_value)
	# create scatter of these samples
	pyplot.scatter(X[row_ix, 0], X[row_ix, 1])
# show the plot
pyplot.show()

Running the example predicts the probability of class membership for each point on the grid across the feature space and plots the result.

Here, we can see that the model is unsure (lighter colors) around the middle of the domain, given the sampling noise in that area of the feature space. We can also see that the model is very confident (full colors) in the bottom-left and top-right halves of the domain.

Together, the crisp class and probability decision surfaces are powerful diagnostic tools for understanding your model and how it divides the feature space for your predictive modeling task.

Probability Decision Surface for Logistic Regression on a Binary Classification Task

Summary

In this tutorial, you discovered how to plot a decision surface for a classification machine learning algorithm.

Specifically, you learned:

A decision surface is a diagnostic tool for understanding how a classification algorithm divides up the feature space.
How to plot a decision surface using crisp class labels for a machine learning algorithm.
How to plot and interpret a decision surface using predicted probabilities.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

19 Responses to Plot a Decision Surface for Machine Learning Algorithms in Python

Lloyd August 14, 2020 at 4:59 pm #

Great tutorial!
I’ve been performing similar tasks over the last year with a quite computationally expensive series of for-loop functions; the code here seems to speed up my old code quite a lot!
Thanks Jason

- Jason Brownlee August 15, 2020 at 6:17 am #
  
  Thanks! I’m happy to hear that it’s useful.
  
- Pratyush December 10, 2022 at 1:38 am #
  
  Hi, how can we do the same for decision trees and a spiral dataset?
  
mmo August 15, 2020 at 7:55 pm #

Thanks for the nice tutorial, very helpful!

Are there any plans to do a tutorial with three or more features? That would be very interesting. Or do you have some reading recommendations?

- Jason Brownlee August 16, 2020 at 5:50 am #
  
  You’re welcome.
  
  With more than 2 features, you would create one surface plot for each pair of input variables.
  
Nihad August 16, 2020 at 2:51 pm #

Great stuff as usual Jason.
Can you also provide a similarly simple and clear article on data discovery and cataloging?
One more thing, please:
Data source APIs
Regards

- Jason Brownlee August 17, 2020 at 5:44 am #
  
  Thanks.
  
  What is “data discovery and cataloging”?
  
John Lee August 16, 2020 at 4:39 pm #

Thanks for the great lesson!

- Jason Brownlee August 17, 2020 at 5:45 am #
  
  You’re welcome.
  
Johan Widén August 18, 2020 at 6:01 pm #

Thanks for the great tutorial!
I googled for a library module that creates a decision surface, and found this:
https://towardsdatascience.com/easily-visualize-scikit-learn-models-decision-boundaries-dd0fb3747508
I renamed the module to plot_decision_boundaries.py, because python did not accept the original name. With that, and having put the file in my current directory, I could then execute the following python code:

from numpy import where
from matplotlib import pyplot
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from plot_decision_boundaries import plot_decision_boundaries
# generate dataset
X, y = make_blobs(n_samples=1000, centers=2, n_features=2, random_state=1, cluster_std=3)
fig = pyplot.figure()
plot_decision_boundaries(X, y, LogisticRegression)
pyplot.savefig('plot_decision_boundaries_1.png')
pyplot.close(fig='all')

- Jason Brownlee August 19, 2020 at 5:56 am #
  
  Nice work.
  

Anthony The Koala November 19, 2020 at 3:07 am #

Dear Dr Jason,
The major steps of making a grid and predicting with the grid values are, to me, the most ‘complex’ operations. You can find help files on arange and meshgrid, which are used in the following:

# define bounds of the domain
min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1
# define the x and y scale
x1grid = arange(min1, max1, 0.1)
x2grid = arange(min2, max2, 0.1)
# create all of the lines and rows of the grid
xx, yy = meshgrid(x1grid, x2grid)
# flatten each grid to a vector
r1, r2 = xx.flatten(), yy.flatten()
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))
# horizontal stack vectors to create x1,x2 input for the model
grid = hstack((r1,r2))

I will explain with fake data in order to understand the above operations. Pseudocode may be involved.

# We have two features, and we want the grid to cover the range of both
min1, max1 = X[:, 0].min()-1, X[:, 0].max()+1
min2, max2 = X[:, 1].min()-1, X[:, 1].max()+1
# add more min and max pairs according to the number of features
# min3, max3 = X[:, 2].min()-1, X[:, 2].max()+1  # this is if X had three features

# make one line of values per feature
x1grid = arange(min1, max1, incfactor1)  # incfactor1 is the step between min1 and max1
x2grid = arange(min2, max2, incfactor2)  # incfactor2 is the step between min2 and max2
# what do x1grid and x2grid look like?
# the data is extremely contrived, but it illustrates what the data structures look like.
# The lengths of x1grid and x2grid are deliberately different here so the shapes are easy to tell apart.
# This is to give you the feel of the data structures from this point on.
x1grid = arange(0,10,3) ; #length = 4
x1grid
array([0, 3, 6, 9])
x2grid = arange(0,10,2) ; #length = 5
x2grid
array([0, 2, 4, 6, 8])
#The above is not the only way to generate a line.

#We are going to generate two kinds of meshgrids and observe the structure
xx,yy = meshgrid(x1grid, x2grid);
# xx is num rows x num columns = 5 x 4, where 5 = len (x2grid), 4 = len(x1grid).
# xx contains x1grid repeated 5 times row-wise
#x1grid=[0, 3, 6, 9], x2grid=[0, 2, 4, 6, 8] #  we use x1grid, repeated len(x2grid) row times
xx
array([[0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9]])
# yy is num rows x num columns = 5 x 4, where 5 = len(x2grid), 4 = len(x1grid).
# yy contains x2grid as a column, repeated 4 times column-wise
#x1grid=[0, 3, 6, 9], x2grid=[0, 2, 4, 6, 8] #  we use x2grid, repeated len(x1grid) col times
yy
array([[0, 0, 0, 0],
       [2, 2, 2, 2],
       [4, 4, 4, 4],
       [6, 6, 6, 6],
       [8, 8, 8, 8]])

#What happens if we put x2grid first then x1grid?
xxx, yyy = meshgrid(x2grid, x1grid)
# This is 4 rows by 5 columns
#x2grid=[0, 2, 4, 6, 8], x1grid=[0, 3, 6, 9] #  we use x2grid, repeated len(x1grid) row times
xxx
array([[0, 2, 4, 6, 8],
       [0, 2, 4, 6, 8],
       [0, 2, 4, 6, 8],
       [0, 2, 4, 6, 8]])
# This is 4 rows by 5 columns
#x2grid=[0, 2, 4, 6, 8], x1grid=[0, 3, 6, 9] #  we use x1grid, repeated len(x2grid) col times
yyy
array([[0, 0, 0, 0, 0],
       [3, 3, 3, 3, 3],
       [6, 6, 6, 6, 6],
       [9, 9, 9, 9, 9]])
shape(xxx), shape(yyy), shape(xx), shape(yy)
((4, 5), (4, 5), (5, 4), (5, 4))
#
# Meshgrid in general
#  a, b = meshgrid(x,y)
#
#  a is a len(y) by len(x) array, len(y) is rows, len(x) are cols. Each row is x
#  b is a len(y) by len(x) array, len(y) is rows, len(x) are cols, Each col is y
#
# Going back to xx and yy
xx
array([[0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9],
       [0, 3, 6, 9]])
yy
array([[0, 0, 0, 0],
       [2, 2, 2, 2],
       [4, 4, 4, 4],
       [6, 6, 6, 6],
       [8, 8, 8, 8]])
# Flatten xx and yy - observe with the matrix, flattening is done row-wise.
#To flatten col-wise xx.flatten('F') produces array([0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 6, 6, 6, 6, 6, 9, 9, 9, 9, 9])
r1, r2 = xx.flatten(), yy.flatten(); # r1 and r2 have shapes (20,)
r1
array([0, 3, 6, 9, 0, 3, 6, 9, 0, 3, 6, 9, 0, 3, 6, 9, 0, 3, 6, 9])
r2
array([0, 0, 0, 0, 2, 2, 2, 2, 4, 4, 4, 4, 6, 6, 6, 6, 8, 8, 8, 8])
# Pre-step to horizontal (col-wise) stacking
r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))  # r1 and r2 now have shapes (20,1)
# display r1 only; it has shape (20,1)
r1
array([[0],
       [3],
       [6],
       [9],
       [0],
       [3],
       [6],
       [9],
       [0],
       [3],
       [6],
       [9],
       [0],
       [3],
       [6],
       [9],
       [0],
       [3],
       [6],
       [9]])
grid = hstack((r1,r2))
grid
array([[0, 0],
       [3, 0],
       [6, 0],
       [9, 0],
       [0, 2],
       [3, 2],
       [6, 2],
       [9, 2],
       [0, 4],
       [3, 4],
       [6, 4],
       [9, 4],
       [0, 6],
       [3, 6],
       [6, 6],
       [9, 6],
       [0, 8],
       [3, 8],
       [6, 8],
       [9, 8]])
#### BUT the grid can be made in a few lines instead!!!!
#Start at the point where xx and yy is flattened.
from numpy import array, reshape,shape
r1, r2 = xx.flatten(), yy.flatten()
grid = array((r1.T, r2.T)).T
grid
array([[0, 0],
       [3, 0],
       [6, 0],
       [9, 0],
       [0, 2],
       [3, 2],
       [6, 2],
       [9, 2],
       [0, 4],
       [3, 4],
       [6, 4],
       [9, 4],
       [0, 6],
       [3, 6],
       [6, 6],
       [9, 6],
       [0, 8],
       [3, 8],
       [6, 8],
       [9, 8]])
#Note the data is contrived to show how data can be shaped. 
#The grid consists of two features.
# We use the grid to make predictions, then reshape the predictions
# into the shape of xx.
from matplotlib import pyplot
from numpy import where
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()  # or any classifier that has predict_proba()
model.fit(X, y)  # X, y from the original data source
yhat = model.predict_proba(grid)
# keep just the probabilities for class 0
yhat = yhat[:, 0]
# reshape the predictions back into a grid
zz = yhat.reshape(xx.shape)
c = pyplot.contourf(xx, yy, zz, cmap='RdBu')
# add a legend, called a color bar
pyplot.colorbar(c)
# create scatter plot for samples from each class
no_of_classes = 2
for class_value in range(no_of_classes):
	# get row indexes for samples with this class
	row_ix = where(y == class_value)
	# create scatter of these samples
	pyplot.scatter(X[row_ix, 0], X[row_ix, 1])
# show the plot
pyplot.show()

In summary, knowing how arrays of features (X) can be turned into a meshgrid and then into a grid helps us understand how the grid is fed into a model to predict a value at each coordinate of the decision surface. The colours of the decision surface are determined by zz, the predictions reshaped to the shape of xx, which is passed to the contourf function together with xx and yy.
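
The flow described in this summary can be sketched end to end as follows. This is a minimal sketch rather than the tutorial's exact listing: the helper name `make_grid`, the `make_blobs` settings, the step size, and the choice of `LogisticRegression` are all assumptions made for illustration.

```python
from numpy import arange, column_stack, meshgrid
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression


def make_grid(X, step=0.1):
    """Return xx, yy and an (n_points, 2) grid covering the two features of X."""
    min1, max1 = X[:, 0].min() - 1, X[:, 0].max() + 1
    min2, max2 = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = meshgrid(arange(min1, max1, step), arange(min2, max2, step))
    # flatten row-wise and pair up the coordinates, one grid point per row
    grid = column_stack((xx.flatten(), yy.flatten()))
    return xx, yy, grid


# synthetic two-class dataset with two input features (assumed settings)
X, y = make_blobs(n_samples=200, centers=2, n_features=2,
                  random_state=1, cluster_std=3)
model = LogisticRegression()
model.fit(X, y)

xx, yy, grid = make_grid(X)
# probability of class 0 at every grid point, reshaped to match xx
zz = model.predict_proba(grid)[:, 0].reshape(xx.shape)
print(xx.shape, yy.shape, grid.shape, zz.shape)
```

Passing `xx`, `yy` and `zz` to `pyplot.contourf(xx, yy, zz, cmap='RdBu')` would then draw the surface, with `xx` and `yy` giving the coordinates and `zz` giving the colours.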

I’ll be experimenting with the iris data. This project will have to be split into two plots based on petal length vs petal width, and sepal length vs sepal width.
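
One way the iris experiment could be set up is sketched below. It assumes scikit-learn's bundled `load_iris` dataset and fits a separate `LogisticRegression` on each feature pair; the dictionary names `pairs` and `surfaces` are hypothetical. Because iris has three classes, this sketch uses crisp class labels from `predict` rather than a single class probability.

```python
from numpy import arange, column_stack, meshgrid
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
# column indexes in iris.data:
# 0 sepal length, 1 sepal width, 2 petal length, 3 petal width
pairs = {"sepal": (0, 1), "petal": (2, 3)}

surfaces = {}
for name, (i, j) in pairs.items():
    # keep only the two features for this plot
    X = iris.data[:, [i, j]]
    y = iris.target
    model = LogisticRegression(max_iter=1000)
    model.fit(X, y)
    xx, yy = meshgrid(
        arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.1),
        arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.1),
    )
    grid = column_stack((xx.flatten(), yy.flatten()))
    # crisp class label (0, 1 or 2) for every grid point
    zz = model.predict(grid).reshape(xx.shape)
    surfaces[name] = (xx, yy, zz)
    print(name, iris.feature_names[i], "vs", iris.feature_names[j], zz.shape)
```

Each `(xx, yy, zz)` triple in `surfaces` can be passed to `pyplot.contourf` on its own figure or subplot, giving one decision surface per feature pair.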

Thank you,
Anthony of Sydney

Jason Brownlee November 19, 2020 at 7:48 am #

Nice work.


Yuda Mnyawami April 28, 2021 at 11:06 pm #

Hello,

I have tried the above Python code, and it worked well.

Question: How can I link these lines of code to a CSV file?

- Jason Brownlee April 29, 2021 at 6:27 am #
  
  Good question, this will help:
  https://machinelearningmastery.com/how-to-connect-model-input-data-with-predictions-for-machine-learning/
  
Murilo January 4, 2022 at 8:14 am #

Very interesting!

How could I change the code so each class is plotted with a different marker, say the orange points with marker ‘+’ and the blue points with marker ‘o’?

- James Carmichael January 4, 2022 at 10:36 am #
  
  Hi Murilo,
  
  Please refer to the following:
  
  https://matplotlib.org/stable/api/markers_api.html
  
  Regards,
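
  As a minimal sketch of the marker question above: `pyplot.scatter` accepts a `marker` argument, so the per-class loop can pass a different marker on each iteration. The `markers` mapping, the `make_blobs` settings and the output file name are assumptions for illustration, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window instead
from matplotlib import pyplot
from numpy import where
from sklearn.datasets import make_blobs

# synthetic two-class dataset (assumed settings)
X, y = make_blobs(n_samples=100, centers=2, n_features=2,
                  random_state=1, cluster_std=3)
# one marker per class label (assumed mapping for illustration)
markers = {0: "o", 1: "+"}

fig, ax = pyplot.subplots()
for class_value, marker in markers.items():
    # row indexes for samples with this class
    row_ix = where(y == class_value)[0]
    # each scatter call picks the next default colour and the chosen marker
    ax.scatter(X[row_ix, 0], X[row_ix, 1], marker=marker,
               label=f"class {class_value}")
ax.legend()
fig.savefig("markers_by_class.png")
```

  With an interactive backend, `pyplot.show()` can replace the `savefig` call.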
  
San February 8, 2023 at 3:11 pm #

What changes should I make for softmax regression with three classes?

- James Carmichael February 9, 2023 at 9:49 am #
  
  Hi San, you may find the following resources of interest:
  
  https://machinelearningmastery.com/multinomial-logistic-regression-with-python/
  
  https://towardsdatascience.com/multiclass-classification-with-softmax-regression-explained-ea320518ea5d
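
  For three classes, one possible adaptation is sketched below. It assumes a synthetic three-class `make_blobs` dataset and scikit-learn's `LogisticRegression`, which in recent versions fits a multinomial (softmax) model for multi-class targets with its default solver. The variable names `labels` and `confidence` are hypothetical.

```python
from numpy import arange, column_stack, meshgrid
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# synthetic three-class dataset with two input features (assumed settings)
X, y = make_blobs(n_samples=300, centers=3, n_features=2,
                  random_state=1, cluster_std=2)
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

xx, yy = meshgrid(
    arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.1),
    arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.1),
)
grid = column_stack((xx.flatten(), yy.flatten()))

# one probability column per class; each row sums to 1
proba = model.predict_proba(grid)
# crisp surface: index of the most probable class at each grid point
labels = proba.argmax(axis=1).reshape(xx.shape)
# probability assigned to that winning class
confidence = proba.max(axis=1).reshape(xx.shape)
print(proba.shape, labels.shape, confidence.shape)
```

  `pyplot.contourf(xx, yy, labels, cmap='Paired')` would then show the three regions. Alternatively, each column of `proba` can be reshaped to `xx.shape` and plotted as its own probability surface, one plot per class.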
  