A Gentle Introduction to Vectors in R

By Adrian Tam on August 20, 2023 in R for Data Science

R is a language for programming with data. Unlike in many other languages, the primitive data types in R are not scalars but vectors. Understanding how to work with vectors is therefore crucial to writing or reading R code. In this post, you will learn about various vector operations in R. Specifically, you will learn:

What the fundamental data objects in R are
How to work with vectors in R

Let’s get started.

Figure: A Gentle Introduction to Vectors in R. Photo by Frame Harirak. Some rights reserved.

Overview

This post is divided into three parts; they are:

Fundamental Data Objects
Operations on Vectors
From Vector to Matrix

Fundamental Data Objects

In other programming languages like Python, the fundamental data elements are integers, floats, and strings. In R, however, the fundamental objects are vectors, lists, factors, data frames, and environments. R does have data types such as character, numeric, integer, logical, and complex, but it natively deals with, for example, a vector of integers rather than a single integer.

Let’s start with the simplest data object. To create a vector of the integers from 5 to 10, you can type 5:10:

> 5:10
[1] 5 6 7 8 9 10

The syntax uses a colon to separate the two end values, and R will fill in the rest as consecutive integers. You can assign this vector to a variable and retrieve one of its values:

> x <- 5:10
> x[2]
[1] 6

In R, vectors are indexed from 1, not 0. This follows the convention in the mathematics literature. You can also use multiple indices at once to extract a sub-vector, e.g.,

x[c(1,3,5)]

will produce a vector (5,7,9).
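
As a small sketch of the indexing rules above (the variable names `picked` and `sliced` are only illustrative), the following shows that the first element sits at index 1 and that any integer vector, including a colon range, can serve as an index:

```r
# Build the same vector as above; R indices start at 1.
x <- 5:10

first  <- x[1]            # the first element (there is no element 0)
picked <- x[c(1, 3, 5)]   # elements at positions 1, 3, and 5
sliced <- x[2:4]          # a colon range is itself an index vector

print(first)    # 5
print(picked)   # 5 7 9
print(sliced)   # 6 7 8
```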

The colon syntax works for integer vectors. But a more complicated pattern requires the use of the seq() function, for example:

> seq(from=-2, to=2, by=0.5)
[1] -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0

This creates a numeric vector with an even step size. To check the data type of the vectors, we can run the following:

> x <- -5:5
> y <- seq(-2, 2, 0.5)
> x
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5
> y
[1] -2.0 -1.5 -1.0 -0.5 0.0 0.5 1.0 1.5 2.0
> is.numeric(x)
[1] TRUE
> is.numeric(y)
[1] TRUE
> is.integer(x)
[1] TRUE
> is.integer(y)
[1] FALSE

Indeed, the vector built by seq() here is numeric but not of integer type, because the step size is fractional. Note that is.numeric() returns TRUE for both vectors, since an integer vector is also numeric. To check if a vector is an integer vector, we use the is.integer() function. The name “is.integer” has a dot in it: a dot is a legitimate character in identifiers in R syntax.
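
To look at the storage type directly, one option is the base function typeof(). The sketch below also uses the `length.out` argument of seq(), which asks for a number of points instead of a step size. The name `y_by_count` is only illustrative:

```r
x <- -5:5                  # colon syntax gives an integer vector
y <- seq(-2, 2, 0.5)       # a fractional step gives a double vector

print(typeof(x))           # "integer"
print(typeof(y))           # "double"

# Same end points, but specify how many points instead of the step size
y_by_count <- seq(from = -2, to = 2, length.out = 9)
print(isTRUE(all.equal(y, y_by_count)))   # TRUE

# With only whole-number end points and no step, seq() behaves like the colon
print(typeof(seq(1, 5)))   # "integer"
```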

In the above, we created two vectors. We can concatenate them using:

> c(x,y)
[1] -5.0 -4.0 -3.0 -2.0 -1.0 0.0 1.0 2.0 3.0 4.0 5.0 -2.0 -1.5 -1.0 -0.5
[16] 0.0 0.5 1.0 1.5 2.0

Note that the integers in vector x have been converted into floating point values to make a consistent vector type. You can convert data explicitly, but there may be side effects. For example, converting the above into an integer vector truncates the fractional part of each floating point value (toward zero, so -1.5 becomes -1 and 1.5 becomes 1):

> z <- c(x,y)
> as.integer(z)
[1] -5 -4 -3 -2 -1 0 1 2 3 4 5 -2 -1 -1 0 0 0 1 1 2
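
The output above shows that as.integer() drops the fractional part instead of rounding to the nearest integer. As a minimal sketch with made-up values (the vector `v` is not from the example above), compare it with floor() and round():

```r
# Illustrative values chosen to avoid exact .5 ties
v <- c(-1.7, -0.4, 0.4, 1.7)

truncated <- as.integer(v)   # drops the fraction, moving toward zero
floored   <- floor(v)        # always moves down
rounded   <- round(v)        # moves to the nearest whole number

print(truncated)   # -1  0  0  1
print(floored)     # -2 -1  0  1
print(rounded)     # -2  0  0  2
```

If nearest-integer behavior is what you want, calling round() before as.integer() is one way to get it.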

Operations on Vectors

In R, most operations on vectors are applied elementwise. For example,

> c(10, 9, 8, 7) %/% 3
[1] 3 3 2 2
> c(10, 9, 8, 7) %% 3
[1] 1 0 2 1

In the above, c(10, 9, 8, 7) concatenates four 1-element vectors. The operator “%/%” performs integer division (with the remainder discarded), while the operator “%%” returns the remainder. Other operators in R are similar to those in other languages, such as +, -, *, /, and ^ for addition, subtraction, multiplication, division, and exponentiation, respectively.
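
A quick way to convince yourself that these operators work elementwise is to rebuild the original vector from the quotient and remainder. This is only a sketch, and the names `a`, `quotient`, and `remainder` are illustrative:

```r
a <- c(10, 9, 8, 7)

quotient  <- a %/% 3   # 3 3 2 2
remainder <- a %%  3   # 1 0 2 1

# Each element satisfies: a = quotient * 3 + remainder
print(quotient * 3 + remainder)   # 10 9 8 7

# Two vectors of equal length are combined position by position
print(a + c(1, 2, 3, 4))          # 11 11 11 11
print(a ^ 2)                      # 100 81 64 49
```

Note that the single value 3 is reused for every element of `a`, which is why no loop is needed.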

The other mathematical operations are as you would expect. For example, these are the exponential function and the natural logarithm, applied to the vector x <- -5:5 defined earlier:

> exp(x)
[1] 6.737947e-03 1.831564e-02 4.978707e-02 1.353353e-01 3.678794e-01
[6] 1.000000e+00 2.718282e+00 7.389056e+00 2.008554e+01 5.459815e+01
[11] 1.484132e+02
> log(x)
[1] NaN NaN NaN NaN NaN -Inf 0.0000000
[8] 0.6931472 1.0986123 1.3862944 1.6094379
Warning message:
In log(x) : NaNs produced
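
The warning appears because the logarithm is not defined for negative numbers. One possible way to avoid it, sketched here with the illustrative names `positive` and `logs`, is to keep only the positive elements before calling log():

```r
x <- -5:5

# A logical condition inside [] keeps only the elements where it is TRUE
positive <- x[x > 0]
logs <- log(positive)

# exp() undoes log(), up to floating point error
print(isTRUE(all.equal(exp(logs), as.numeric(positive))))   # TRUE

# A negative input yields NaN; the warning is silenced here on purpose
print(is.nan(suppressWarnings(log(-1))))                     # TRUE
```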

You may refer to the R documentation for the list of built-in functions. Or you can check out the help using the R command: library(help = "base")

From Vector to Matrix

A matrix in R is built from a vector. For example, the matrix

$$
A = \begin{bmatrix}
9 & 2 & 1 \\
5 & -1 & 6 \\
4 & 0 & -2
\end{bmatrix}
$$

can be built by filling a vector into three columns:

> A <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol=3)
> print(A)
     [,1] [,2] [,3]
[1,]    9    2    1
[2,]    5   -1    6
[3,]    4    0   -2

Note that a vector is filled into a matrix by columns, but you can provide an additional argument, “byrow=TRUE”, to change this behavior.
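
As a sketch of the difference, the same matrix can be built either way, provided the input vector is listed in the matching order. The names `by_col` and `by_row` are only illustrative:

```r
# Default: values fill the first column, then the second, and so on
by_col <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol = 3)

# byrow = TRUE: values fill the first row, then the second, and so on
by_row <- matrix(c(9, 2, 1, 5, -1, 6, 4, 0, -2), ncol = 3, byrow = TRUE)

print(identical(by_col, by_row))   # TRUE
```

Listing the values row by row is often easier to read, because it matches how the matrix is written on paper.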

You can tell the matrix’s dimension with

> dim(A)
[1] 3 3

This output is indeed a vector. Hence you can find the number of rows with:

> dim(A)[1]
[1] 3
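
Base R also provides nrow() and ncol() as shorthand for the two entries of dim(). A minimal sketch on the same matrix:

```r
A <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol = 3)

print(nrow(A))      # 3, same as dim(A)[1]
print(ncol(A))      # 3, same as dim(A)[2]
print(length(A))    # 9, the number of elements in the underlying vector
```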

In this example, you have a square matrix. Hence you can find its determinant and inverse with the following:

> det(A)
[1] 90
> solve(A)
           [,1]        [,2]       [,3]
[1,] 0.02222222  0.04444444  0.1444444
[2,] 0.37777778 -0.24444444 -0.5444444
[3,] 0.04444444  0.08888889 -0.2111111

As you may have guessed, there are many more matrix operations built in, including chol(), qr(), and svd() for various matrix decompositions. You can verify that the inverse above is right by multiplying it with the original matrix:

> A.inv <- solve(A)
> A %*% A.inv
             [,1]          [,2]          [,3]
[1,] 1.000000e+00 -6.938894e-17 -1.110223e-16
[2,] 6.938894e-17  1.000000e+00  0.000000e+00
[3,] 2.775558e-17 -8.326673e-17  1.000000e+00

Matrix multiplication uses the operator “%*%” since “*” is for elementwise multiplication. Apart from floating point rounding error, the product shown above is an identity matrix.
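
Rather than reading the tiny off-diagonal values by eye, you could compare the product against an identity matrix with a tolerance. The sketch below uses all.equal() and diag(), both from base R; the name `product` is illustrative:

```r
A <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol = 3)
A.inv <- solve(A)

product <- A %*% A.inv   # matrix product, not elementwise

# diag(3) is the 3x3 identity; all.equal() tolerates small rounding errors
print(isTRUE(all.equal(product, diag(3))))   # TRUE

# The elementwise product is a different thing entirely
print(isTRUE(all.equal(A * A.inv, diag(3)))) # FALSE
```

An exact comparison with `==` would likely fail here because of the rounding error, which is why a tolerance-based check is used.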

With a matrix, you can extract a row, a column, or a particular element with the following:

> A
     [,1] [,2] [,3]
[1,]    9    2    1
[2,]    5   -1    6
[3,]    4    0   -2
> A[,1]
[1] 9 5 4
> A[2,]
[1]  5 -1  6
> A[3,2]
[1] 0
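
The same bracket syntax accepts index vectors, so you can also take out a block of the matrix. As a sketch (the names `block`, `row_vec`, and `row_mat` are illustrative), note that a single row comes back as a plain vector unless you pass `drop = FALSE`:

```r
A <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol = 3)

# Rows 1 to 2, columns 1 and 3
block <- A[1:2, c(1, 3)]
print(block)

row_vec <- A[2, ]                 # a plain vector: no dimensions
row_mat <- A[2, , drop = FALSE]   # still a matrix with 1 row

print(is.null(dim(row_vec)))      # TRUE
print(dim(row_mat))               # 1 3
```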

Conversely, you can build a larger matrix by “binding” two matrices along the rows or along the columns:

> A
     [,1] [,2] [,3]
[1,]    9    2    1
[2,]    5   -1    6
[3,]    4    0   -2
> A.inv
           [,1]        [,2]       [,3]
[1,] 0.02222222  0.04444444  0.1444444
[2,] 0.37777778 -0.24444444 -0.5444444
[3,] 0.04444444  0.08888889 -0.2111111
> cbind(A, A.inv)
     [,1] [,2] [,3]       [,4]        [,5]       [,6]
[1,]    9    2    1 0.02222222  0.04444444  0.1444444
[2,]    5   -1    6 0.37777778 -0.24444444 -0.5444444
[3,]    4    0   -2 0.04444444  0.08888889 -0.2111111
> rbind(A, A.inv)
           [,1]        [,2]       [,3]
[1,] 9.00000000  2.00000000  1.0000000
[2,] 5.00000000 -1.00000000  6.0000000
[3,] 4.00000000  0.00000000 -2.0000000
[4,] 0.02222222  0.04444444  0.1444444
[5,] 0.37777778 -0.24444444 -0.5444444
[6,] 0.04444444  0.08888889 -0.2111111
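
A short sketch of how the dimensions change when binding. Here the identity matrix from diag(3) stands in for the second matrix, and the names `wide`, `tall`, and `extra_col` are illustrative:

```r
A <- matrix(c(9, 5, 4, 2, -1, 0, 1, 6, -2), ncol = 3)
I3 <- diag(3)

wide <- cbind(A, I3)   # side by side: row counts must match
tall <- rbind(A, I3)   # stacked: column counts must match

print(dim(wide))       # 3 6
print(dim(tall))       # 6 3

# A plain vector can be bound as one extra column
extra_col <- cbind(A, c(1, 2, 3))
print(dim(extra_col))  # 3 4
```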

Summary

In this post, you learned how to manipulate vectors, the fundamental data object in R. Specifically, you learned how to:

Create a vector in R
Perform vector operations in R
Convert a vector into a matrix and perform some matrix operations
