How to Index, Slice and Reshape NumPy Arrays for Machine Learning

By Jason Brownlee on June 13, 2020 in Linear Algebra

Machine learning data is represented as arrays.

In Python, data is almost universally represented as NumPy arrays.

If you are new to Python, you may be confused by some of the Pythonic ways of accessing data, such as negative indexing and array slicing.

In this tutorial, you will discover how to manipulate and access your data correctly in NumPy arrays.

After completing this tutorial, you will know:

How to convert your list data to NumPy arrays.
How to access data using Pythonic indexing and slicing.
How to resize your data to meet the expectations of some machine learning APIs.

Let’s get started.

Update Jul/2019: Fixed small typo related to reshaping 1D data (thanks Rodrigue).

How to Index, Slice and Reshape NumPy Arrays for Machine Learning in Python
Photo by Björn Söderqvist, some rights reserved.

Tutorial Overview

This tutorial is divided into 4 parts; they are:

From List to Arrays
Array Indexing
Array Slicing
Array Reshaping

1. From List to Arrays

In general, I recommend loading your data from file using Pandas or even NumPy functions.

For examples, see the post:

How To Load Machine Learning Data in Python
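
As a minimal sketch of the NumPy route, the snippet below parses comma-separated text with the loadtxt() function. The in-memory StringIO object and its three rows are made-up stand-ins for a real CSV file; with a file on disk you would pass its path instead.

```python
# load comma-separated data with NumPy (illustrative sketch)
from io import StringIO
from numpy import loadtxt
# hypothetical file contents; stands in for a CSV file on disk
csv_text = StringIO("11,22,33\n44,55,66\n77,88,99")
# parse the text into a two-dimensional array of floats
data = loadtxt(csv_text, delimiter=",")
print(data.shape)
print(data.dtype)
```

The result is already a NumPy array, so no list conversion is needed in this case.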

This section assumes you have loaded or generated your data by other means and it is now represented using Python lists.

Let’s look at converting your data in lists to NumPy arrays.

One-Dimensional List to Array

You may load your data or generate your data and have access to it as a list.

You can convert a one-dimensional list of data to an array by calling the array() NumPy function.

# one dimensional example
from numpy import array
# list of data
data = [11, 22, 33, 44, 55]
# array of data
data = array(data)
print(data)
print(type(data))

Running the example converts the one-dimensional list to a NumPy array.

[11 22 33 44 55]
<class 'numpy.ndarray'>

Two-Dimensional List of Lists to Array

It is more likely in machine learning that you will have two-dimensional data.

That is, a table of data where each row represents a new observation and each column a new feature.

Perhaps you generated the data or loaded it using custom code and now you have a list of lists. Each list represents a new observation.

You can convert your list of lists to a NumPy array the same way as above, by calling the array() function.

# two dimensional example
from numpy import array
# list of data
data = [[11, 22],
		[33, 44],
		[55, 66]]
# array of data
data = array(data)
print(data)
print(type(data))

Running the example shows the data successfully converted.

[[11 22]
 [33 44]
 [55 66]]
<class 'numpy.ndarray'>

2. Array Indexing

Once your data is represented using a NumPy array, you can access it using indexing.

Let’s look at some examples of accessing data via indexing.

One-Dimensional Indexing

Generally, indexing works just like you would expect from your experience with other programming languages, like Java, C#, and C++.

For example, you can access elements using the bracket operator [] specifying the zero-offset index for the value to retrieve.

# simple indexing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
# index data
print(data[0])
print(data[4])

Running the example prints the first and last values in the array.

11
55

Specifying an index beyond the bounds of the array will cause an error.

# simple indexing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
# index data
print(data[5])

Running the example prints the following error:

IndexError: index 5 is out of bounds for axis 0 with size 5

One key difference is that you can use negative indexes to retrieve values offset from the end of the array.

For example, the index -1 refers to the last item in the array. The index -2 refers to the second-to-last item, and so on back to -5 for the first item in the current example.

# simple indexing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
# index data
print(data[-1])
print(data[-5])

Running the example prints the last and first items in the array.

55
11
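
One way to build intuition for negative indexes is to check that index i and index n + i refer to the same item, where n is the length of the array. The sketch below does that check and also confirms that stepping one position past the first item (index -6 here) is out of bounds; the variable names are only illustrative.

```python
# relate negative indexes to positive indexes
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
n = len(data)
# a negative index i refers to the same item as n + i
for i in range(-1, -n - 1, -1):
    assert data[i] == data[n + i]
# one step past the first item is out of bounds
try:
    data[-6]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
print(out_of_bounds)
```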

Two-Dimensional Indexing

Indexing two-dimensional data is similar to indexing one-dimensional data, except that a comma is used to separate the index for each dimension.

data[0,0]

This is different from C-based languages where a separate bracket operator is used for each dimension.

data[0][0]
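
Note that NumPy arrays accept the chained form as well: it first selects row 0 and then indexes into that row. The short check below compares both forms and shows that a plain Python list of lists only supports the chained form. The variable names are illustrative.

```python
# compare comma indexing with chained brackets
from numpy import array
rows = [[11, 22], [33, 44], [55, 66]]
data = array(rows)
# both forms return the same element from a NumPy array
print(data[0, 0])
print(data[0][0])
# a plain list of lists only supports the chained form
try:
    rows[0, 0]
    list_accepts_comma = True
except TypeError:
    list_accepts_comma = False
print(list_accepts_comma)
```

The comma form is generally preferred for arrays because it expresses the whole selection in a single indexing operation.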

For example, we can access the first row and the first column as follows:

# 2d indexing
from numpy import array
# define array
data = array([[11, 22], [33, 44], [55, 66]])
# index data
print(data[0,0])

Running the example prints the first item in the dataset.

11

If we are interested in all items in the first row, we could leave the second dimension index empty, for example:

# 2d indexing
from numpy import array
# define array
data = array([[11, 22], [33, 44], [55, 66]])
# index data
print(data[0,])

This prints the first row of data.

[11 22]
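
Leaving the second index empty is one of several equivalent spellings. The sketch below shows three ways to select the first row and, for comparison, how to select the first column using the ‘:’ slice that is covered in the next section.

```python
# select a whole row or a whole column
from numpy import array
# define array
data = array([[11, 22], [33, 44], [55, 66]])
# three equivalent ways to select the first row
print(data[0,])
print(data[0, :])
print(data[0])
# all rows of the first column
print(data[:, 0])
```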

3. Array Slicing

So far, so good; creating and indexing arrays looks familiar.

Now we come to array slicing, and this is one feature that causes problems for beginners to Python and NumPy arrays.

Structures like lists and NumPy arrays can be sliced. This means that a subsequence of the structure can be indexed and retrieved.

This is most useful in machine learning when specifying input variables and output variables, or splitting training rows from testing rows.

Slicing is specified using the colon operator ‘:’ with a ‘from’ and ‘to’ index before and after the colon respectively. The slice extends from the ‘from’ index and ends one item before the ‘to’ index.

data[from:to]
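
As a quick check of this rule, the sketch below slices with explicit bounds and confirms that the result holds ‘to’ minus ‘from’ items. The names start and stop are illustrative only (from is a reserved word in Python, so it cannot be used as a variable name).

```python
# check the from/to rule on a small array
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
start, stop = 1, 4  # hypothetical 'from' and 'to' values
part = data[start:stop]
print(part)
# the slice holds stop - start items; the item at 'to' is excluded
print(len(part) == stop - start)
```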

Let’s work through some examples.

One-Dimensional Slicing

You can access all data in an array dimension by specifying the slice ‘:’ with no indexes.

# simple slicing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data[:])

Running the example prints all elements in the array.

[11 22 33 44 55]

The first item of the array can be sliced by specifying a slice that starts at index 0 and ends at index 1 (one item before the ‘to’ index).

# simple slicing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data[0:1])

Running the example returns a subarray with the first element.

[11]

We can also use negative indexes in slices. For example, we can slice the last two items in the list by starting the slice at -2 (the second last item) and not specifying a ‘to’ index; that takes the slice to the end of the dimension.

# simple slicing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data[-2:])

Running the example returns a subarray with the last two items only.

[44 55]
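
One detail worth knowing when slicing: a slice of a NumPy array is a view onto the same data, not a copy, which differs from slicing a Python list. The sketch below shows that writing through a slice changes the original array, and that calling copy() avoids this. The variable names are illustrative.

```python
# a NumPy slice is a view onto the original data
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
tail = data[-2:]
# writing through the slice changes the original array
tail[0] = 0
print(data)
# take an explicit copy when the slice must be independent
other = array([11, 22, 33, 44, 55])
safe = other[-2:].copy()
safe[0] = 0
print(other)
```

This matters in machine learning code when a sliced subset, such as a test set, is modified in place.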

Two-Dimensional Slicing

Let’s look at the two examples of two-dimensional slicing you are most likely to use in machine learning.

Split Input and Output Features

It is common to split your loaded data into input variables (X) and the output variable (y).

We can do this by slicing all rows and all columns up to, but not including, the last column, then separately indexing the last column.

For the input features, we can select all rows and all columns except the last one by specifying ‘:’ in the rows index and ‘:-1’ in the columns index.

X = data[:, :-1]

For the output column, we can select all rows again using ‘:’ and index just the last column by specifying the -1 index.

y = data[:, -1]

Putting all of this together, we can separate a 3-column 2D dataset into input and output data as follows:

# split input and output
from numpy import array
# define array
data = array([[11, 22, 33],
		[44, 55, 66],
		[77, 88, 99]])
# separate data
X, y = data[:, :-1], data[:, -1]
print(X)
print(y)

Running the example prints the separated X and y elements. Note that X is a 2D array and y is a 1D array.

[[11 22]
 [44 55]
 [77 88]]
[33 66 99]
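
The difference in dimensionality comes from how the last column is selected: an integer index drops that dimension, while a slice keeps it. The sketch below prints the shapes to make this visible; the name y_2d is illustrative.

```python
# compare indexing and slicing the last column
from numpy import array
# define array
data = array([[11, 22, 33],
              [44, 55, 66],
              [77, 88, 99]])
X = data[:, :-1]
y = data[:, -1]      # integer index drops the column dimension
y_2d = data[:, -1:]  # slice keeps the column dimension
print(X.shape)
print(y.shape)
print(y_2d.shape)
```

Slicing with ‘-1:’ can be a convenient alternative when an API expects the output variable as a single-column 2D array.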

Split Train and Test Rows

It is common to split a loaded dataset into separate train and test sets.

This is a splitting of rows where some portion will be used to train the model and the remaining portion will be used to estimate the skill of the trained model.

This would involve slicing all columns by specifying ‘:’ in the second dimension index. The training dataset would be all rows from the beginning to the split point.

train = data[:split, :]

The test dataset would be all rows starting from the split point to the end of the dimension.

test = data[split:, :]

Putting all of this together, we can split the dataset at the contrived split point of 2.

# split train and test
from numpy import array
# define array
data = array([[11, 22, 33],
		[44, 55, 66],
		[77, 88, 99]])
# separate data
split = 2
train,test = data[:split,:],data[split:,:]
print(train)
print(test)

Running the example selects the first two rows for training and the last row for the test set.

[[11 22 33]
 [44 55 66]]
[[77 88 99]]
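
In practice the split point is often derived from the number of rows rather than hard-coded. The sketch below assumes a hypothetical training proportion of 67 percent; the value and the name train_fraction are only for illustration.

```python
# compute the split point from a proportion of rows
from numpy import array
# define array
data = array([[11, 22, 33],
              [44, 55, 66],
              [77, 88, 99]])
train_fraction = 0.67  # hypothetical proportion used for training
split = int(data.shape[0] * train_fraction)
train, test = data[:split, :], data[split:, :]
print(train.shape)
print(test.shape)
```

Note that this takes rows in their existing order; if the rows need shuffling first, that has to be done before slicing.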

4. Array Reshaping

After slicing your data, you may need to reshape it.

For example, some libraries, such as scikit-learn, may require that a one-dimensional array of output variables (y) be shaped as a two-dimensional array with one column and outcomes for each row.

Some algorithms, like the Long Short-Term Memory recurrent neural network in Keras, require input to be specified as a three-dimensional array comprised of samples, timesteps, and features.

It is important to know how to reshape your NumPy arrays so that your data meets the expectation of specific Python libraries. We will look at these two examples.

Data Shape

NumPy arrays have a shape attribute that returns a tuple of the length of each dimension of the array.

For example:

# array shape
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data.shape)

Running the example prints a tuple for the one dimension.

(5,)

A tuple with two lengths is returned for a two-dimensional array.

# array shape
from numpy import array
# list of data
data = [[11, 22],
		[33, 44],
		[55, 66]]
# array of data
data = array(data)
print(data.shape)

Running the example returns a tuple with the number of rows and columns.

(3, 2)

You can use the sizes stored in the shape tuple elsewhere in your code, for example when specifying parameters.

The elements of the tuple can be accessed just like an array, with the 0th index for the number of rows and the 1st index for the number of columns. For example:

# array shape
from numpy import array
# list of data
data = [[11, 22],
		[33, 44],
		[55, 66]]
# array of data
data = array(data)
print('Rows: %d' % data.shape[0])
print('Cols: %d' % data.shape[1])

Running the example accesses the specific size of each dimension.

Rows: 3
Cols: 2

Reshape 1D to 2D Array

It is common to need to reshape a one-dimensional array into a two-dimensional array with one column and multiple rows.

NumPy provides the reshape() function on the NumPy array object that can be used to reshape the data.

The reshape() function takes a single argument that specifies the new shape of the array. In the case of reshaping a one-dimensional array into a two-dimensional array with one column, the tuple would be the length of the array as the first dimension (data.shape[0]) and 1 for the second dimension.

data = data.reshape((data.shape[0], 1))

Putting this all together, we get the following worked example.

# reshape 1D array
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data.shape)
# reshape
data = data.reshape((data.shape[0], 1))
print(data.shape)

Running the example prints the shape of the one-dimensional array, reshapes the array to have 5 rows with 1 column, then prints this new shape.

(5,)
(5, 1)
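
As an alternative sketch, reshape() also accepts -1 for one dimension, in which case NumPy infers that length from the total number of elements. This avoids looking up data.shape[0] by hand; the name column is illustrative.

```python
# let NumPy infer the number of rows
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
# -1 asks reshape() to work out this dimension from the array size
column = data.reshape((-1, 1))
print(column.shape)
print(column)
```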

Reshape 2D to 3D Array

It is common to need to reshape two-dimensional data where each row represents a sequence into a three-dimensional array for algorithms that expect multiple samples of one or more time steps and one or more features.

A good example is the LSTM recurrent neural network model in the Keras deep learning library.

The reshape function can be used directly, specifying the new dimensionality. This is clear with an example where each sequence has multiple time steps with one observation (feature) at each time step.

We can use the sizes in the shape attribute on the array to specify the number of samples (rows) and columns (time steps) and fix the number of features at 1.

data.reshape((data.shape[0], data.shape[1], 1))

Putting this all together, we get the following worked example.

# reshape 2D array
from numpy import array
# list of data
data = [[11, 22],
		[33, 44],
		[55, 66]]
# array of data
data = array(data)
print(data.shape)
# reshape
data = data.reshape((data.shape[0], data.shape[1], 1))
print(data.shape)

Running the example first prints the size of each dimension in the 2D array, reshapes the array, then summarizes the shape of the new 3D array.

(3, 2)
(3, 2, 1)
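
A reshape only rearranges how the existing elements are indexed, so the new shape must account for exactly the same number of elements. The sketch below checks this for the (3, 2, 1) case and shows that an incompatible shape is rejected; the names samples and mismatch_raises are illustrative.

```python
# reshape must preserve the total number of elements
from numpy import array
# define array
data = array([[11, 22],
              [33, 44],
              [55, 66]])
# 3 * 2 = 6 elements fit a (3, 2, 1) array
samples = data.reshape((data.shape[0], data.shape[1], 1))
print(samples.size == data.size)
# 6 elements cannot fill a (3, 2, 2) array
try:
    data.reshape((3, 2, 2))
    mismatch_raises = False
except ValueError:
    mismatch_raises = True
print(mismatch_raises)
```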

Summary

In this tutorial, you discovered how to access and reshape data in NumPy arrays with Python.

Specifically, you learned:

How to convert your list data to NumPy arrays.
How to access data using Pythonic indexing and slicing.
How to resize your data to meet the expectations of some machine learning APIs.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

102 Responses to How to Index, Slice and Reshape NumPy Arrays for Machine Learning

ltshan October 27, 2017 at 12:58 pm #

Great articles. Still get much base knowledge though used python for a while.

Reply
- Jason Brownlee October 27, 2017 at 2:57 pm #
  
  I’m glad it helped!
  
  Reply
Ben January 8, 2018 at 7:53 pm #

I have a clarification regarding converting single dimensional array to matrix form(2D). Why we have to convert 1D to 2D in machine learning?

Reply
- Jason Brownlee January 9, 2018 at 5:27 am #
  
  Most machine learning algorithms expect a matrix as input, each row is one observation.
  
  Reply
Anthony The Koala March 24, 2018 at 8:53 am #

Dear Dr Jason,
Thank you for providing this tutorial on slicing. It will certainly help me understand material on LSTMs on your page, and your e-books.

My question is on 2-D slicing and the method of slicing. I would like a “generalised” concept of slicing.
In 1-D slicing an array can be split as:
myarray[from:to] – that is understood

In 2-D slicing, “Split Input and Output Features”, you gave two examples of splitting
x = [ : , :-1]
y = [ : , -1]

I would like clarification please in the context of myarray[from:to] especially for 2D splicing especially where there are two colons ‘:’

myarray[from: to] and x [ : , : -1] For x, to = : , and from = : – 1,

Thank you,
Anthony of NSW

Reply
- Jason Brownlee March 25, 2018 at 6:24 am #
  
  Good question Anthony!
  
  When you just provide an index instead of a slice, like -1, it will select only that column/row, in this case the last column.
  
  Does that help?
  
  Reply
Ashish K S Arya April 27, 2018 at 5:53 pm #

Hello;
thanks for such nice tutorial, i am new to numpy. While I am working on deep learning algorithms at a certain juncture in my program I need to create an array of (1000,256,256,3) dims where 1000 images data (of size 256*256*3) can be loaded.

Can you kindly help me in doing this in Python.

code looks like the following:

x_train=np.empty([256,256,3])

for i in range(0,1000):

pic= Image.open(f'{path}%s.jpg' %i)

pic=pic.resize([256,256])

x_train2= np.asarray(pic)

x_train=np.stack((x_train,x_train2))

print(x_train.shape)

Thanks

Reply
- Jason Brownlee April 28, 2018 at 5:25 am #
  
  A good trick is to load the data into a list, convert the list to an array, and then reshape the array to the required dimensionality.
  
  Reply
Fatemeh June 8, 2018 at 6:01 am #

Great. the best description I had seen.

Reply
- Jason Brownlee June 8, 2018 at 6:18 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Visalini August 25, 2018 at 4:37 pm #

Hi Jason,
How to convert a 3D array into 1D array..

Help me please..

Reply
- Jason Brownlee August 26, 2018 at 6:23 am #
  
  X = X.reshape((X.shape[0], X.shape[1], 1))
  
  Reply
- hashim July 30, 2020 at 12:40 am #
  
  X=X.reshape((X.shape[0]*X.shape[1]*X.shape[2],1))
  
  Reply
SkrewEverything August 25, 2018 at 5:34 pm #

We can access 2D array just like C: data[0][0].

Using data[0, 0] is not the only way like you said.

Which version of python are you using?

Reply
- Jason Brownlee August 26, 2018 at 6:24 am #
  
  Using [i, j] is valid for 2d numpy array access in Python 2 and 3.
  
  Reply
  - jerico April 23, 2019 at 4:53 pm #
    
    can you help me with this sir..your help will be much apreciated. thanks in advance!
    
    https://stackoverflow.com/questions/55645616/how-to-output-masks-using-vis-util-visualize-boxes-and-labels-on-image-array
    
    Reply
    - Jason Brownlee April 24, 2019 at 7:52 am #
      
      Perhaps you can summarize the issue for me in a sentence or two?
      
      Reply
ai October 1, 2018 at 3:58 am #

I’ve been searching for this information for so long. Thank you so so so much sir

Reply
- Jason Brownlee October 1, 2018 at 6:30 am #
  
  You’re welcome!
  
  Reply
Haotian Wang October 8, 2018 at 5:52 am #

Jason, you help me a lot

Reply
- Jason Brownlee October 8, 2018 at 9:29 am #
  
  I’m happy to hear that it helped.
  
  Reply
Maijama'a October 27, 2018 at 9:37 pm #

Thank you so much for this clear tutorial.

Reply
- Jason Brownlee October 28, 2018 at 6:10 am #
  
  I’m glad it helped.
  
  Reply
RasmusT January 29, 2019 at 6:42 am #

Hi, I am dealing with normalization of 3d timeseries data.

I have 3d time series data (samples, features, timesteps).

How am I supposed to reshape my 3d data to do normalization?
I am guessing it goes like this:
From [samples, features, timesteps] to ([timesteps,features] or [features,timesteps]) and then use like MinMaxScaler to fit_transform on the training data and then transform on test data?

Reply
- Jason Brownlee January 29, 2019 at 11:39 am #
  
  It is a good idea to normalize per time series across all samples.
  
  I give examples here:
  https://machinelearningmastery.com/machine-learning-data-transforms-for-time-series-forecasting/
  
  Reply
Mansoor February 4, 2019 at 7:13 pm #

Thank you very much for this very useful, very easy to follow and understand introduction to NumPy arrays. It was very helpful.

Reply
- Jason Brownlee February 5, 2019 at 8:14 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Kuruvilla Abraham February 8, 2019 at 2:24 pm #

Hi jason,

my Dataframe has non – image data of dimensions (1446736, 11).

How can i reshape this array to be fed into CNN model

Reply
- Jason Brownlee February 9, 2019 at 5:52 am #
  
  This may help:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
  
  Reply
  - Kuruvilla Abraham February 9, 2019 at 4:41 pm #
    
    thanks Jason will this apply to CNN model as well as this for LSTM ryt ?
    
    Reply
    - Jason Brownlee February 10, 2019 at 9:40 am #
      
      Yes.
      
      Reply
Eva April 9, 2019 at 8:00 pm #

Hi Jason, thanks for the tutorial, really helps strengthening the basics.
I have a somewhat related question – the numpy reshape function has a default reshape order C. When working in Keras, is there any difference in using the numpy reshape or the keras native reshape? Uses that one the same order? And if not, is it possible to use numpy native functions directly in Keras code?

Thanks! 🙂

Reply
- Jason Brownlee April 10, 2019 at 6:11 am #
  
  They solve different things.
  
  You can use numpy to reshape your arrays.
  
  Keras offers a reshape layer, to reshape tensors as part of a neural net.
  
  Reply
Rahul April 13, 2019 at 5:37 pm #

You’re awesome dude! So well explained. Thank you!

Reply
- Jason Brownlee April 14, 2019 at 5:44 am #
  
  Thanks, I’m glad it helped.
  
  Reply
Nastaran April 26, 2019 at 11:26 am #

Hi,

Just wondering how I can import a dataset of 2D arrays ?

Thanks

Reply
- Jason Brownlee April 26, 2019 at 1:57 pm #
  
  Do you mean load from file?
  
  This might help:
  https://machinelearningmastery.com/load-machine-learning-data-python/
  
  Reply
dong zhan May 27, 2019 at 12:32 pm #

very thoughtful of you to give this tutorial, otherwise, it would be much harder for me to follow your machine learning tutorial, thank you so much

Reply
- Jason Brownlee May 27, 2019 at 2:38 pm #
  
  Thanks, I’m glad it helped.
  
  Reply
Sandeep Dawre June 5, 2019 at 7:51 pm #

One of the best article!!

Reply
- Jason Brownlee June 6, 2019 at 6:22 am #
  
  Thanks!
  
  Reply
Hemanth June 18, 2019 at 4:24 am #

sir thanks a lot for this article.

Reply
- Jason Brownlee June 18, 2019 at 6:42 am #
  
  You’re welcome.
  
  Reply
Rodrigue KANKEU July 2, 2019 at 5:21 pm #

Hi Jason thanks a lot for this nice tutorial again. I have a problem with the following sentence of the section 4 about reshaping.
For example, some libraries, such as scikit-learn, may require that a one-dimensional array of output variables (y) be shaped as a two-dimensional array with one column and outcomes for each column.

I’m not getting well Isn’t?
For example, some libraries, such as scikit-learn, may require that a one-dimensional array of output variables (y) be shaped as a two-dimensional array with one column and outcomes for each “row”.
In fact I’m confused.
Thanks a lot for the work done.

Reply
- Jason Brownlee July 3, 2019 at 8:24 am #
  
  Yes, I mean row. Updated.
  
  Yes, so [1, 2, 3] is a 1D array with the shape (3,) that becomes [[1], [2], [3]], or one column with 3 rows and the shape (3, 1).
  
  Reply
Rakesh July 24, 2019 at 2:57 pm #

Dear sir, I want to input my tfidf vector of shape 156060×15103 into LSTM layer with 150 timeseries step and 15103 features .. my lstm input should look something like
(None, 150, 15103). How can I achieve this ? Please help.

If array reshaping does not help here, please suggest any alternative way of how to create lstm layer with (None, 150, 15103) using tfidf input.
My aim is to use tfidf output as input to LSTM layer with 150 timesteps.

Please help.

Reply
- Jason Brownlee July 25, 2019 at 7:39 am #
  
  Set the input_shape argument on the first LSTM layer to the desired shape.
  
  Reply
tina August 15, 2019 at 7:08 pm #

how to ask the user to input a 3d and to solve that input as an inverse

Reply
- Jason Brownlee August 16, 2019 at 7:50 am #
  
  Sorry, I don’t follow. Can you elaborate?
  
  Reply
Nick August 28, 2019 at 9:46 pm #

Hi Jason,

I am experiencing some weird indexing problem where I seemingly have the correct coordinates to call both cv2.rectangle() and plt.Rectangle() but then using the same coordintes for slicing does not work, ie. y1 and y2 need to be reversed. I have an issue up on OpenCV Github but can you take a look if you have a moment? https://github.com/opencv/opencv/issues/15406

Reply
- Jason Brownlee August 29, 2019 at 6:07 am #
  
  Sorry to hear that, this sounds like an OpenCV issue, not a Python array issue.
  
  Good Luck.
  
  Reply
Ipsita October 10, 2019 at 6:18 pm #

I have got a .dat file that contains the matrix images is of size 784-by-1990, i.e., there are totally 1990 images, and each column of the
matrix corresponds to one image of size 28-by-28 pixels.how to visualize it.Pls help.

Reply
- Jason Brownlee October 11, 2019 at 6:15 am #
  
  Perhaps this will help:
  https://machinelearningmastery.com/how-to-load-and-manipulate-images-for-deep-learning-in-python-with-pil-pillow/
  
  Reply
Abdoo October 13, 2019 at 6:55 pm #

Great Article!

I have a data to be fed into stacked LSTM. The data of the shape (81,25,12). When I do predictions using model.predict, the output will of course be 3D again (81,25,12). But I want to plot each feature by itself. I want to convert the output back into 2D and slice each column/feature for error calculations and plotting.

How can I convert 3D back into to 2D array?
thank you so much Jason!

Reply
- Jason Brownlee October 14, 2019 at 8:06 am #
  
  The output does not have to be 3d, the output can be any shape you design into your model.
  
  Nevertheless, you can reshape an array using the reshape() function:
  
  data = data.reshape((??))
  
  Reply
Ryan October 16, 2019 at 3:58 am #

Hi Jason,

Can you explain an array slicing like [:,:,1] for a 2D array please?

Reply
- Jason Brownlee October 16, 2019 at 8:11 am #
  
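  On a 3D array, [:, :, 1] selects everything along the first two dimensions at index 1 of the third dimension. On a 2D array it raises an IndexError because there are more indices than dimensions.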
  
  Reply
shadrack kodondi October 17, 2019 at 1:37 am #

the best training in ML i have ever come across…THANK YOU

Reply
- Jason Brownlee October 17, 2019 at 6:39 am #
  
  Thanks.
  
  Reply

Anthony The Koala November 15, 2019 at 9:44 am #

Dear Dr Jason,
Under the heading “Two-Dimensional Slicing”

Where it says:
“For the input features, we can select all rows and columns except the last one by…”

To clarify:
“For the input features, we can select all rows and all columns except the last column by….”

X = [:,:-1]

Where it says:
“For the output column, we can select all rows again using “:
and index just the last column by….”

To clarify:
“For the output column, we can select all rows again using “:
and index just the last row by….”

y = [:,-1]

Reason, in the last example, all columns are displayed of the last row.

Alternatively, only the last row is displayed.

Thank you,
Anthony of Sydney

Anthony The Koala November 15, 2019 at 10:04 am #

Dear Dr Jason,
My further elaboration on section 2

import numpy as np

doo = np.array([[1,2,3],[4,5,6],[7,8,9]])
doo
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

#Select all rows and all columns except the last column
doo[:,:-1]
array([[1, 2],
       [4, 5],
       [7, 8]])

#Select only all the rows and all columns except the last two columns
doo[:,:-2]
array([[1],
       [4],
       [7]])

#Select all rows and last column
doo[:,-1]
array([3, 6, 9])

#Select all rows and second column
doo[:,-2]
array([2, 5, 8])

#Select all rows and first column
doo[:,0]
array([1, 4, 7])

#Select all rows and first two columns
doo[:,0:2]
array([[1, 2],
       [4, 5],
       [7, 8]])

The list is not exhaustive, but you get a better feel by experimenting.

Thank you,
Anthony of Sydney

Anthony The Koala November 15, 2019 at 11:18 am #

Dear Dr Jason,
While the above methods look at selecting either ‘successive’ rows or columns, the finishing touch is to select particular columns or particular rows.

#The whole matrix
doo	       
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

#Select the first and last (3rd row)
doo[[0,2],:]
array([[1, 2, 3],
       [7, 8, 9]])

#Select the first and last columns
doo[:,[0,2]]	       
array([[1, 3],
       [4, 6],
       [7, 9]])

#If you want the diagonal of the matrix, use numpy's diag = no clever indexing
np.diag(doo)
array([1, 5, 9])

Still exploring the fundamentals of matrix selection,

One question please:
How if you have a 3D matrix, how to slice a matrix.

Thank you
Anthony of Sydney

Jason Brownlee November 16, 2019 at 7:17 am #

It is important to get good at slicing in Python.

Reply

Jason Brownlee November 16, 2019 at 7:16 am #

Well done!

Reply

Jason Brownlee November 16, 2019 at 7:16 am #

Thanks.

Anthony The Koala November 20, 2019 at 1:33 am #

Dear Dr Jason,
I had a go at slicing a 3D array. I was able to select particular columns and particular rows. I don’t know if 3D arrays are sliced or if there is an application for slicing 3D arrays. So here it is.

I note that the slicing techniques shown here are not exhaustive.

doo = [1 + i for i in range(36)]
doo = np.reshape(doo,[3,3,4])
# We relate all examples from the 3x3x4 array
doo; # We could also say doo[:]
array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]],

       [[13, 14, 15, 16],
        [17, 18, 19, 20],
        [21, 22, 23, 24]],

       [[25, 26, 27, 28],
        [29, 30, 31, 32],
        [33, 34, 35, 36]]])

doo[0,]    # 0th submatrix (the first)
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

doo[1,]    # 1st submatrix (the second)
array([[13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])

doo[2,]    # 2nd submatrix (the third)
array([[25, 26, 27, 28],
       [29, 30, 31, 32],
       [33, 34, 35, 36]])

doo[0:,0] ;# The first row (0th) of each submatrix
array([[ 1,  2,  3,  4],
       [13, 14, 15, 16],
       [25, 26, 27, 28]])

doo[0:,1] ;# The second row (1st) of each submatrix
array([[ 5,  6,  7,  8],
       [17, 18, 19, 20],
       [29, 30, 31, 32]])

doo[0:,2] ;# The third row (2nd) of each submatrix
array([[ 9, 10, 11, 12],
       [21, 22, 23, 24],
       [33, 34, 35, 36]])

#Getting specific columns for a particular submatrix
#doo[submatrix,[array of rows], column]

doo[0,[0,1,2],0] #1st submatrix, rows 0-2, 0th column, that is first column of first submatrix
array([1, 5, 9])

doo[0,[0,1,2],1] #1st submatrix, rows 0-2, 1st column, that is the 2nd column of first submatrix
array([ 2,  6, 10])

doo[1,[0,1,2],0] #2nd submatrix, rows 0-2, 0th column, that is the 1st column of 2nd submatrix
array([13, 17, 21])

doo[1,[0,1,2],1]  #2nd submatrix rows 0-2, 1st column, that is the 2nd column of 2nd submatrix
array([14, 18, 22])

#Specific columns of all submatrices 3x3 - please relate this to the 3x3x4 array
doo[:,:,0] #First column of all submatrices
array([[ 1,  5,  9],
       [13, 17, 21],
       [25, 29, 33]])

doo[:,:,1] #Second column of all submatrices
array([[ 2,  6, 10],
       [14, 18, 22],
       [26, 30, 34]])

#doo[:,:,2] #Third column of all submatrices
#doo[:,:,3] #Fourth column of all submatrices

#Selecting multiple columns
#Example 1st and 2nd columns of each submatrix
doo[:,:,[0,1]]
array([[[ 1,  2],
        [ 5,  6],
        [ 9, 10]],

       [[13, 14],
        [17, 18],
        [21, 22]],

       [[25, 26],
        [29, 30],
        [33, 34]]])

#Specific columns of all submatrices 3x3x3
doo[:,:,[0,1,3]]; #First, second and fourth cols of all submatrices
array([[[ 1,  2,  4],
        [ 5,  6,  8],
        [ 9, 10, 12]],

       [[13, 14, 16],
        [17, 18, 20],
        [21, 22, 24]],

       [[25, 26, 28],
        [29, 30, 32],
        [33, 34, 36]]])

#Selecting particular rows - the list is not exhaustive
#How to select particular rows of all submatrix
doo[[0,1,2],0]  #Select 1st (0th) row from each submatrix
array([[ 1,  2,  3,  4],
       [13, 14, 15, 16],
       [25, 26, 27, 28]])

doo[[0,1,2],1] #Select 2nd (1st) row from each submatrix
array([[ 5,  6,  7,  8],
       [17, 18, 19, 20],
       [29, 30, 31, 32]])

doo[[0,1,2],2] #Select 3rd (2nd) row from each submatrix
array([[ 9, 10, 11, 12],
       [21, 22, 23, 24],
       [33, 34, 35, 36]])

doo[[0,2],0]   #Select 1st (0th) row from first and last submatrix
array([[ 1,  2,  3,  4],
       [25, 26, 27, 28]])

doo[[0,2],1]   #Select 2nd (1st) row from first and last submatrix
array([[ 5,  6,  7,  8],
       [29, 30, 31, 32]])

Whether you call this selection or slicing depends on whether you use indices or the slicing operator “:”.

Whatever the procedure, the end result is that you get a subset of the original data structure.
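
One practical difference between the two is worth noting: in NumPy, the slicing operator returns a view onto the original data, while an index list returns a copy. A minimal sketch, assuming the same 3x3x4 doo array; the variable names are only illustrative:

```python
import numpy as np

doo = np.arange(1, 37).reshape(3, 3, 4)

# Same values, obtained two ways
sliced = doo[0:, 0]            # slicing operator ":" (basic indexing)
selected = doo[[0, 1, 2], 0]   # index list (advanced indexing)

same_values = np.array_equal(sliced, selected)

# A slice shares memory with the original array; an index-list selection does not
sliced_is_view = np.shares_memory(doo, sliced)
selected_is_view = np.shares_memory(doo, selected)

print(same_values, sliced_is_view, selected_is_view)
```

This matters when you modify the result: writing into sliced changes doo, whereas writing into selected leaves doo untouched.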

What is the application of 3D slicing and/or selection?

Thank you
Anthony of Sydney

Jason Brownlee November 20, 2019 at 6:19 am #

Nice work!

Anthony The Koala June 17, 2020 at 3:25 pm #

Dear Dr Adrian,
I came across an array slicing with four parameters.
I understand that it substitutes a 0 at particular locations:

#Don't worry about this, GOTO simplified
fftShift[cY - size:cY + size, cX - size:cX + size] = 0 ;

To simplify, by substituting a, b, c, d:

fftShift[a: b, c: d] = 0 ;#substitute a 0, but what are a,b,c,d?

What is a, b, c, d? What is the effect of a,b,c,d for rows and columns?

Thank you,
Anthony of Sydney

Jason Brownlee June 18, 2020 at 6:20 am #

Yes, those are the “from” and “to” indices for the rows, then for the columns.
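
A minimal sketch of that four-bound slice. The 6x6 array is a hypothetical stand-in for fftShift, and the values of cY, cX and size are assumed for illustration:

```python
import numpy as np

# Hypothetical 6x6 array standing in for fftShift
fft_shift = np.arange(36).reshape(6, 6)

# Assumed centre position and half-width of the block to zero out
cY, cX, size = 3, 3, 1

# Rows cY-size up to (but not including) cY+size,
# columns cX-size up to (but not including) cX+size
fft_shift[cY - size:cY + size, cX - size:cX + size] = 0

print(fft_shift)
```

Here a, b, c, d are 2, 4, 2, 4, so the 2x2 block in rows 2-3 and columns 2-3 is set to 0 and everything else is left alone.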

Anthony The Koala June 18, 2020 at 3:33 am #

Dear Dr Jason,
From experimentation, a and b select rows a through b-1, and at the same time c and d select columns c through d-1.
To illustrate:

doo = np.reshape([i for i in range(100)],(10,10))
doo
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

Suppose we perform a slicing operation on doo:

doo[0:3,1:2]
array([[ 1],
       [11],
       [21]])

doo[0:3,1:4]
array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]])

doo[1:3,1:4]
array([[11, 12, 13],
       [21, 22, 23]])

So if we assign 0 to those elements:

doo[1:3,1:4] = 0
>>> doo
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10,  0,  0,  0, 14, 15, 16, 17, 18, 19],
       [20,  0,  0,  0, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

Going back:

#Are two successive operations on doo, one for rows and one for columns, the same as the single combined slice? The answer is NO.
doo = np.reshape([i for i in range(100)],(10,10))
doo[1:3] = 0   ;#It does not matter the order of doing doo[1:3] = 0 or doo[:,1:4] =0
doo[:,1:4] = 0
doo
array([[ 0,  0,  0,  0,  4,  5,  6,  7,  8,  9],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [30,  0,  0,  0, 34, 35, 36, 37, 38, 39],
       [40,  0,  0,  0, 44, 45, 46, 47, 48, 49],
       [50,  0,  0,  0, 54, 55, 56, 57, 58, 59],
       [60,  0,  0,  0, 64, 65, 66, 67, 68, 69],
       [70,  0,  0,  0, 74, 75, 76, 77, 78, 79],
       [80,  0,  0,  0, 84, 85, 86, 87, 88, 89],
       [90,  0,  0,  0, 94, 95, 96, 97, 98, 99]])

In summary, for these slicing operations:

#We are doing the substitution at the intersection of the rows and the columns
doo[1:3,1:4] = 0
#We are substituting on all of the selected rows and all of the selected columns (their union)
doo[1:3] = 0   ;# Select the rows 
doo[:,1:4] = 0 ;# Select the columns
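
The difference can be checked by counting how many elements each approach changes. A minimal sketch, assuming the same 10x10 array; the helper fresh() and the variable names are only illustrative:

```python
import numpy as np

def fresh():
    """Return a new 10x10 array holding 0..99."""
    return np.arange(100).reshape(10, 10)

# Single combined slice: only the intersection of rows 1-2 and columns 1-3
block = fresh()
block[1:3, 1:4] = 0

# Two successive slices: all of rows 1-2, then all of columns 1-3
cross = fresh()
cross[1:3] = 0
cross[:, 1:4] = 0

# Count the elements that differ from the untouched array
changed_block = int((block != fresh()).sum())
changed_cross = int((cross != fresh()).sum())

print(changed_block, changed_cross)
```

The combined slice changes 2 x 3 = 6 elements, while the two successive slices change 20 + 24 = 44.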

I hope to exhaust all the possible methods of index slicing

Thank you,
Anthony of Sydney

spankwire November 30, 2019 at 4:37 am #

In more advanced use cases, you may find yourself needing to switch the dimensions of a certain matrix. This is often the case in machine learning applications where a model expects a shape for its inputs that is different from that of your dataset. NumPy’s

Reply
- Jason Brownlee November 30, 2019 at 6:32 am #
  
  Yes, I think I give examples using moveaxis() with images here:
  https://machinelearningmastery.com/a-gentle-introduction-to-channels-first-and-channels-last-image-formats-for-deep-learning/
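
As a rough illustration of moveaxis(), here is a minimal sketch that assumes a hypothetical channels-last image of 4 rows, 5 columns and 3 channels; the shapes and variable names are made up for the example:

```python
import numpy as np

# Hypothetical channels-last image: (rows, columns, channels)
image = np.zeros((4, 5, 3))

# Move the last axis (channels) to the front: (channels, rows, columns)
channels_first = np.moveaxis(image, -1, 0)

# Move it back again to recover the channels-last layout
channels_last = np.moveaxis(channels_first, 0, -1)

print(channels_first.shape)
print(channels_last.shape)
```

Only the named axis moves; the remaining axes keep their relative order.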
  
  Reply
Iyanda Taofeek December 13, 2019 at 10:06 am #

Please, how can I convert a 1D array to a 7D array?

Reply
- Jason Brownlee December 13, 2019 at 1:42 pm #
  
  You can use the reshape() function to specify the dimensions of an existing NumPy array.
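
A minimal sketch of what that could look like, assuming a 1D array of 128 elements (chosen only because 128 = 2**7 divides evenly across seven axes):

```python
import numpy as np

# Assumed example data: 128 values in one dimension
data = np.arange(128)

# Seven axes of length 2 each; the total element count must stay 128
seven_d = data.reshape((2,) * 7)

# Alternative: keep all values on the first axis and add six axes of length 1
thin = data.reshape(-1, 1, 1, 1, 1, 1, 1)

print(seven_d.shape, seven_d.ndim)
print(thin.shape, thin.ndim)
```

The product of the new dimensions has to equal the number of elements in the original array, otherwise reshape() raises an error.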
  
  Reply
Pratik Chavhan January 10, 2020 at 3:24 am #

Nice workflow for NumPy slicing and reshaping. I got a lot of knowledge from your article, and it helped me a lot in my workspace. Thanks.

Reply
- Jason Brownlee January 10, 2020 at 7:29 am #
  
  I’m happy to hear that!
  
  Reply
Alexiy February 28, 2020 at 8:22 am #

Explained better than on Stack Overflow. Thank you, man)

Reply
- Jason Brownlee February 28, 2020 at 1:24 pm #
  
  Thanks!
  
  Reply
Seham March 13, 2020 at 1:46 am #

Such a great tutorial, thanks very much for your great work. keep the good work up

Reply
- Jason Brownlee March 13, 2020 at 8:19 am #
  
  Thanks, I’m happy it helps!
  
  Reply
Rajendran May 21, 2020 at 9:37 pm #

This tutorial really helps a lot. Thank you so much. How can I give my input to an RNN when my dataset is a 3D array, for example (no. of samples, no. of rows, no. of columns)? The RNN accepts input as (samples, time steps, features), right?

Reply
- Jason Brownlee May 22, 2020 at 6:07 am #
  
  You’re welcome.
  
  Good question, see this:
  https://machinelearningmastery.com/faq/single-faq/what-is-the-difference-between-samples-timesteps-and-features-for-lstm-input
  
  Reply
Winner June 11, 2020 at 4:44 am #

‘How to Index, Slice and Reshape NumPy Arrays for Machine Learning’

Ok, I have been at this for weeks…., going through your post and then trying to figure out how you did it, especially when the code does readily run on my system.
But this post most eloquently explained the root of the problem I have been facing over the last few weeks.
And that has to do with my lack of understanding/knowledge that different python ‘libraries’ and network models require different data formatting for input/output.
Today I have found this post which explains the fundamentals of data preparedness for beginners like myself.
Thank you very much.
I have three questions for you:
1. Is this all in your book?
2. Is there some way you can redesign this website so that a ‘table of content’ for this site could show on the left sidebar similar to (https://pandas.pydata.org/docs/getting_started/intro_tutorials/index.html ) starting at the beginner level? This will surely help with ease of navigation and troubleshooting. It takes me HOURS to find some of the answers that I am looking for and it’s all right under my nose.
3. Do you have a hard copy version of your book?

Winner

Reply
- Jason Brownlee June 11, 2020 at 6:04 am #
  
  I do cover the basics of array indexing/manipulation in this book:
  https://machinelearningmastery.com/linear_algebra_for_machine_learning/
  
  Great suggestion, this may help:
  https://machinelearningmastery.com/start-here/
  
  On hard copies:
  https://machinelearningmastery.com/faq/single-faq/can-i-get-a-hard-copy-of-your-book
  
  Reply
Han June 12, 2020 at 7:19 pm #

Thanks a lot Jason! This is really clear and helpful for me as a beginner. Just wanted to check if there’s a typo in this sentence: “It is common to need to reshape a one-dimensional array into a two-dimensional array with one column and multiple arrays (sic).” Do you mean *rows?

Reply
- Jason Brownlee June 13, 2020 at 5:56 am #
  
  Thanks, fixed!
  
  Reply
mm October 16, 2020 at 10:35 am #

How do I enter matrix values from a tkinter text box?

Reply
- Jason Brownlee October 16, 2020 at 1:49 pm #
  
  What is tkinter?
  
  Reply
  - mm October 29, 2020 at 10:47 am #
    
    What is meant is how to create a NumPy array and fill in its values from user input via a graphical interface.
    
    Reply
    - Jason Brownlee October 29, 2020 at 1:42 pm #
      
      Sorry, I don’t have any tutorials on using a graphical interface for creating numpy arrays.
      
      Reply
Ariel January 5, 2021 at 5:51 pm #

A very nice tutorial. One point to improve: I’d add a few words on the train_test_split() function, or a link to some reference, like the one below:
https://realpython.com/train-test-split-python-data/

Reply
- Jason Brownlee January 6, 2021 at 6:24 am #
  
  You can learn more here:
  https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms/
  
  Reply
engimp March 13, 2021 at 4:04 am #

first class contribution, thank you

Reply
- Jason Brownlee March 13, 2021 at 5:34 am #
  
  Thanks!
  
  Reply
Guillaume April 9, 2021 at 3:21 pm #

Hi Jason,

What would be the best way to retrieve a 1d index from a shape. Let’s say I have a state made of 2 variables which are binned on (20, 30) and I want to construct a Q value table of shape (600, 3).

I would frequently need to transform 2D indices into a 1D index. For example, going from [1, 2] to 1*20 + 2=22. Do we have a way to do that easily in NumPy?

The only thing I can think of is:
if dims = [20, 30]
dims[-1] = 1
and idx = np.dot(dims, state_idx) where state_idx = [1, 2] for example

Is there something more elegant?

Reply
- Jason Brownlee April 10, 2021 at 6:02 am #
  
  Not sure I follow your question sorry.
  
  Perhaps try posting to stackoverflow.
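
For reference, NumPy has a built-in pair of functions for this kind of conversion, np.ravel_multi_index and np.unravel_index. A minimal sketch, assuming the (20, 30) binning from the question and row-major (C) order; note that under that assumption [1, 2] maps to 1*30 + 2 = 32, which differs from the 1*20 + 2 = 22 in the question:

```python
import numpy as np

dims = (20, 30)      # bins per state variable, as in the question
state_idx = (1, 2)   # 2D index to convert

# 2D index -> 1D index (row-major: 1 * 30 + 2)
flat = np.ravel_multi_index(state_idx, dims)

# 1D index -> 2D index, to check the round trip
back = np.unravel_index(flat, dims)

# Hypothetical Q table with one row per flattened state and 3 actions
q_table = np.zeros((dims[0] * dims[1], 3))
q_row = q_table[flat]

print(flat, back, q_row.shape)
```

Because the mapping is one-to-one over the whole (20, 30) grid, every state gets its own row in the (600, 3) table.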
  
  Reply
K PRASANTH May 17, 2021 at 10:49 pm #

Hi Jason,

1. How could I resolve the layer incompatibility and the reshaping of the array data?
2. What parameters can improve the accuracy of an LSTM model, for example the optimizer, metrics, etc.?

Reply
- Jason Brownlee May 18, 2021 at 6:15 am #
  
  Perhaps you can use a reshape layer?
  
  This part of the site is about improving the performance of models:
  https://machinelearningmastery.com/start-here/#better
  
  Reply
CraigM November 20, 2021 at 12:45 am #

Hi Jason,

I’m trying to multiply two 2D arrays of unknown sizes while keeping the last column of the larger array and copying it into the resulting array. For example:

Array A = [1,2,3,4,5]
[6,7,8,9,10]
[11,12,13,14,15]

Array B = [1,2,3,4]
[6,7,8,9]
[11,12,13,14]

C = AxB, with the shape remaining (3,5) and the 5th column being 5,10,15.

I know indexing uses specific positions, however is there a way to use the nth position to split this into sub-arrays? I would expect Array A to be split into a sub-array (in this case 4 columns) to perform the multiplication.

Alternatively, is there a way to insert 1s into a final column of the smaller array to make them the same size, which would give the same result?

Any assistance is greatly appreciated.

Best regards,
Craig

Reply
- Adrian Tam November 20, 2021 at 2:42 am #
  
  What you mean is surely not matrix multiplication. If you want to do multiplication elementwise, you can do this:
  
  C = A.copy()
  C[:, :4] = C[:, :4] * B
  
  You must use slicing so that the part of C you multiply has exactly the same shape as B.
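
A runnable sketch of this, using the A and B from the question. Taking the column count from B.shape instead of hard-coding 4, and the padding-with-ones variant, are assumptions added for illustration:

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10],
              [11, 12, 13, 14, 15]])
B = np.array([[1, 2, 3, 4],
              [6, 7, 8, 9],
              [11, 12, 13, 14]])

# Number of leading columns that B covers
n = B.shape[1]

# Multiply the first n columns elementwise; remaining columns of A are kept
C = A.copy()
C[:, :n] = A[:, :n] * B

# Alternative from the question: pad B with columns of ones, then multiply
ones = np.ones((B.shape[0], A.shape[1] - n), dtype=B.dtype)
C_alt = A * np.hstack([B, ones])

print(C)
print(np.array_equal(C, C_alt))
```

Both routes give a (3, 5) result whose last column is still 5, 10, 15.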
  
  Reply
Tesfaye August 20, 2022 at 10:17 pm #

Thank you for what you did on machine learning and deep learning; it is helpful.

Reply
- James Carmichael August 21, 2022 at 7:50 am #
  
  You are very welcome Tesfaye! We wish you the best on your machine learning journey!
  
  Reply
