One of the most popular plotting libraries in R is not the set of plotting functions in base R but the ggplot2 library. People use it because it is flexible. The library follows the philosophy of the “grammar of graphics”: instead of generating a visualization immediately upon a function call, you define what should be in the plot and can refine it further before rendering it into a picture. In this post, you will learn about ggplot2 and see some examples. In particular, you will learn:
- How to make use of ggplot2 to create a plot from a dataset
- How to create various charts and graphics with multiple facets using ggplot2
Let’s get started.

Using ggplot2 for Visualization in R.
Photo by Alice Dietrich. Some rights reserved.
Overview
This post is divided into two parts; they are:
- Getting Started with ggplot2
- Examples of Plots with ggplot2
Getting Started with ggplot2
You need to install ggplot2 in your R environment with the following:
```r
install.packages("ggplot2")
```
Once you have it installed, you need to load it to use its features:
```r
library(ggplot2)
```
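If the script may run on a machine where ggplot2 is already installed, a guarded install avoids downloading it every time. The following is a minimal sketch; the check with requireNamespace() is an addition for illustration, not something the steps above require:

```r
# Install ggplot2 only if it is not already available, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
    install.packages("ggplot2")
}
library(ggplot2)

# Report which version is in use; some syntax differs between releases.
print(packageVersion("ggplot2"))
```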
An example of ggplot2 would be to load a simple dataset such as the iris classification dataset and make a plot:
```r
iris |> ggplot() + geom_point(aes(x=Petal.Width, y=Petal.Length, color=Species))
```
This first creates a plot object from the dataset iris. At this point, it is only a clean slate. Then you add a scatter plot to the canvas, namely, the data points as separate dots. This is done by adding geom_point() onto the ggplot object and using aes() to specify the coordinates and color of each point.
The result is a scatter plot of petal length against petal width, with each species drawn in a different color.
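To make the layering explicit, the sketch below builds the same plot in two steps and stores each step in a variable. The variable names canvas and scatter are made up for this illustration; layer_data() is a ggplot2 helper that returns the data a given layer will draw.

```r
library(ggplot2)

# Step 1: a plot object bound to the dataset, with no layers yet.
# Printing it shows only an empty panel.
canvas <- ggplot(iris)

# Step 2: add a point layer; aes() maps columns to x, y, and color.
scatter <- canvas +
    geom_point(aes(x = Petal.Width, y = Petal.Length, color = Species))

# The point layer holds one row per observation in iris.
points <- layer_data(scatter, 1)
print(nrow(points))
print(head(points[, c("x", "y", "colour")]))

# Nothing is drawn until the object is printed.
print(scatter)
```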
The result is a plot object that you can assign to a variable. To see the grammar of graphics in action, notice that the two axes are labeled after the column names by default; you can add modifiers to change the axis labels as well as the theme:
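```r
# Store the scatter plot in a variable instead of drawing it right away
picture <- iris |> ggplot() + geom_point(aes(x=Petal.Width, y=Petal.Length, color=Species))
# Add a title, axis labels, and a theme to the stored plot
picture <- picture +
    labs(title="Scatter plot of iris dataset", x="Petal width/cm", y="Petal length/cm") +
    theme_classic()
# Print the object to render it
picture
```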
This gives you a slightly different picture, with a title, relabeled axes, and the classic theme.
If you want to overlay a different plot or change some style in the picture, all you need to do is add the corresponding modifier function to the plot object.
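As a sketch of this idea, the example below takes a stored plot and adds a further layer and further style modifiers, leaving the original object untouched. The choice of a per-species regression line, the legend position, and the output file name are assumptions made for illustration only.

```r
library(ggplot2)

# Base plot; color is mapped in ggplot() so that later layers inherit it.
base <- ggplot(iris, aes(x = Petal.Width, y = Petal.Length, color = Species)) +
    geom_point()

# Adding to a plot object returns a new object; `base` itself is unchanged.
styled <- base +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # one line per species
    labs(title = "Iris petals with per-species trend lines") +
    theme_classic() +
    theme(legend.position = "bottom")

# Render to a file only once the plot is refined.
out_file <- file.path(tempdir(), "iris_petals.png")
ggsave(out_file, plot = styled, width = 6, height = 4)
```

Because every modifier returns a new plot object, you can keep several variants of the same base plot side by side and decide later which one to render.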
Examples of Plots with ggplot2
Let’s see some more examples with ggplot2.
Consider the mtcars dataset, which looks like the following:
```r
print(mtcars)
```
```
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
```
This dataset has only 32 rows, and each row has 11 attributes. Let’s consider only the column mpg. Below is how you create a histogram and a density plot on the same chart:
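```r
# after_stat(count) replaces the older ..count.. notation, deprecated since ggplot2 3.4.0
ggplot(mtcars, aes(x=mpg, y=after_stat(count))) + geom_histogram(bins=10) + geom_density()
```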
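Because the histogram above counts observations per bin while the density curve is scaled independently of the bin width, the two layers generally do not line up in height. One common variation, sketched here as an alternative rather than as the method used above, is to put the histogram on the density scale with after_stat(density) so that both layers share the same y-axis units. The fill and line colors are arbitrary choices for this illustration.

```r
library(ggplot2)

# Histogram rescaled so that the bar areas sum to 1, matching the density curve.
p <- ggplot(mtcars, aes(x = mpg)) +
    geom_histogram(aes(y = after_stat(density)), bins = 10,
                   fill = "gray80", color = "white") +
    geom_density(color = "steelblue")

# Check: bar height times bar width, summed over all bars, should equal 1.
bars <- layer_data(p, 1)
total_area <- sum(bars$density * (bars$xmax - bars$xmin))
print(total_area)

print(p)
```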
Here you see how you can overlay two different plots on the same chart. Another example that you may find useful is to overlay a scatter plot with a linear regression line:
```r
ggplot(mtcars, aes(x=wt, y=mpg)) + geom_point() + geom_smooth(method=lm)
```
Here you defined a ggplot object with the x- and y-axes specified, then drew the points and the smoothed line. Beware that you need to use method=lm in geom_smooth() for a straight line. By default, for a dataset this small (fewer than 1,000 observations), it is equivalent to method=loess, which draws a curve generated by the locally estimated scatterplot smoothing algorithm.
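To see the difference between the two methods, the sketch below draws both on the same scatter plot and fits the same straight line directly with lm(). The colors and the decision to hide the confidence bands with se = FALSE are choices made for this illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    # Straight line from ordinary least squares
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "red") +
    # Curve from locally estimated scatterplot smoothing
    geom_smooth(method = "loess", formula = y ~ x, se = FALSE, color = "blue")
print(p)

# The red line is the same model you get from fitting lm() yourself.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))  # intercept and slope; the slope is negative
```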
Sometimes, you would like to plot three different attributes. Instead of a 3D plot, you may try a 2D plot with multiple facets if one of the attributes is a categorical variable. Below is an example of plotting the attribute mpg against wt, separated by the different values of cyl:
```r
ggplot(mtcars, aes(wt, mpg)) + geom_point() + facet_grid(rows=vars(cyl))
```
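facet_grid() can also split on two categorical variables at once, one for rows and one for columns. The following sketch adds the transmission column am as a column facet; picking am and using label_both for the strip labels are assumptions made for illustration.

```r
library(ggplot2)

# One row of panels per cyl value, one column per am value
# (in mtcars, am is 0 for automatic and 1 for manual).
p <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    facet_grid(rows = vars(cyl), cols = vars(am), labeller = label_both)

# Each point is assigned to exactly one panel.
panels <- layer_data(p, 1)$PANEL
print(table(panels))

print(p)
```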
Note that you get column facets if you pass a cols= parameter to facet_grid() instead of rows=. One downside of facets is that the plots must be of a similar nature. If you want to put two different plots side by side with the highest flexibility, you may want to look at the package cowplot:
```r
library(cowplot)
left <- ggplot(mtcars, aes(group=cyl, x=cyl, y=mpg)) + geom_boxplot() + theme_bw()
right <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + geom_smooth(method=lm)
plot_grid(left, right)
```
This produces a single figure with the box plot on the left and the scatter plot with its regression line on the right.
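plot_grid() accepts further arguments to control the layout. The sketch below labels the two panels and gives the scatter plot more room; the label text, the width ratio, and the output file name are illustrative choices.

```r
library(ggplot2)
library(cowplot)

left <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
    geom_boxplot() +
    labs(x = "cyl") +
    theme_bw()
right <- ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x)

# Two panels in one row, labeled A and B, with the right panel 1.5 times as wide.
combined <- plot_grid(left, right, labels = c("A", "B"), ncol = 2,
                      rel_widths = c(1, 1.5))

# plot_grid() returns a ggplot object, so ggsave() can write it to a file.
out_file <- file.path(tempdir(), "mtcars_side_by_side.png")
ggsave(out_file, plot = combined, width = 8, height = 4)
```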
Further Readings
This section lists some resources for further study of the material above:
Books
- Beginning Data Science in R 4, second edition, by Thomas Mailund
- R Graphics Cookbook, second edition, by Winston Chang
- The Grammar of Graphics, second edition, by Leland Wilkinson
Online materials
- ggplot2 on tidyverse
- ggplot2 graph gallery
- Local regression on Wikipedia
Summary
In this post, you learned about the library ggplot2 in R. In particular, you learned:
- How to create plots using the grammar of graphics
- How to create scatter plots, line plots, and histograms using ggplot2
- How to create multiple plots in the same graph