Error bars are used to visualize the variability of the plotted data. They can be added to graphs such as dot plots, bar plots or line graphs to provide an additional layer of detail on the presented data.
Generally, error bars show one of the following: the standard deviation, the standard error, a confidence interval or the interquartile range.
The length of an error bar helps reveal the uncertainty of a data point: a short error bar shows that the values are concentrated around the plotted average, signalling that the average is a more reliable summary, while a long error bar indicates that the values are more spread out and the average is less reliable.
This article describes how to add error bars to a plot using the ggplot2 R package. You will learn how to create bar plots and line plots with error bars.
Contents:
- Loading required R package
- Data preparation
- Key R functions and error plot types
- Basic error bars
- Grouped error bars
- Conclusion
Loading required R package
Load the ggplot2 package and set the default theme to theme_classic() with the legend at the top of the plot:
library(ggplot2)
theme_set(
  theme_classic() +
    theme(legend.position = "top")
)
Data preparation
- Prepare the data: the ToothGrowth data set.
df <- ToothGrowth
df$dose <- as.factor(df$dose)
head(df, 3)
##    len supp dose
## 1  4.2   VC  0.5
## 2 11.5   VC  0.5
## 3  7.3   VC  0.5
- Compute summary statistics for the variable len organized into groups by the variable dose:
library(dplyr)
df.summary <- df %>%
  group_by(dose) %>%
  summarise(
    sd = sd(len, na.rm = TRUE),
    len = mean(len)
  )
df.summary
## # A tibble: 3 x 3
##   dose     sd   len
##   <fct> <dbl> <dbl>
## 1 0.5    4.50  10.6
## 2 1      4.42  19.7
## 3 2      3.77  26.1
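The summary above stores the standard deviation, but the same pattern extends to the other statistics commonly shown as error bars. The following is a minimal sketch, not part of the original workflow: the table name df.stats and the columns n, se and ci are illustrative, and the confidence interval is a t-based 95% interval that assumes roughly normal data within each group.

```r
library(dplyr)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Mean of len by dose, with three candidate error-bar half-widths:
# standard deviation, standard error, and a t-based 95% CI half-width
df.stats <- df %>%
  group_by(dose) %>%
  summarise(
    n    = n(),
    mean = mean(len),
    sd   = sd(len),
    se   = sd / sqrt(n),                  # uses the sd and n columns defined above
    ci   = qt(0.975, df = n - 1) * se     # assumption: normal-theory interval
  )
df.stats
```

To plot standard errors instead of standard deviations, one would then map ymin = mean - se and ymax = mean + se in the error plot function.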
Key R functions and error plot types
Key functions to create error plots using the summary statistics data:
- geom_crossbar() for hollow bar with middle indicated by horizontal line
- geom_errorbar() for error bars
- geom_errorbarh() for horizontal error bars
- geom_linerange() for drawing an interval represented by a vertical line
- geom_pointrange() for creating an interval represented by a vertical line, with a point in the middle.
Start by initializing ggplot with the summary statistics data:
- Specify x and y as usual
- Specify ymin = len-sd and ymax = len+sd to add lower and upper error bars. If you want to add only upper error bars but not the lower ones, use ymin = len (instead of len-sd) and ymax = len+sd.
# Initialize ggplot with data
f <- ggplot(
  df.summary,
  aes(x = dose, y = len, ymin = len-sd, ymax = len+sd)
)
Possible error plots:
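geom_crossbar() and geom_linerange() are listed among the key functions but are not used in the examples that follow. As a rough sketch of how they might be drawn from the same kind of initialized plot (the object names p.crossbar and p.linerange are illustrative, and the summary table is rebuilt here with base R so that the block runs on its own):

```r
library(ggplot2)

# Rebuild the mean and SD of len by dose with base R
df <- ToothGrowth
df$dose <- as.factor(df$dose)
df.summary <- data.frame(
  dose = factor(levels(df$dose), levels = levels(df$dose)),
  len  = as.numeric(tapply(df$len, df$dose, mean)),
  sd   = as.numeric(tapply(df$len, df$dose, sd))
)

f <- ggplot(
  df.summary,
  aes(x = dose, y = len, ymin = len - sd, ymax = len + sd)
)

# Hollow box spanning mean +/- SD, with a horizontal line at the mean
p.crossbar <- f + geom_crossbar(width = 0.4)

# Bare vertical interval, with the mean added as a separate point layer
p.linerange <- f + geom_linerange() + geom_point(size = 2)

p.crossbar
p.linerange
```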
Basic error bars
Create simple error plots:
# Vertical line with point in the middle
f + geom_pointrange()

# Standard error bars
f + geom_errorbar(width = 0.2) +
  geom_point(size = 1.5)
Create horizontal error bars. Put dose on the y-axis and len on the x-axis. Specify xmin and xmax.
# Horizontal error bars with mean points
ggplot(df.summary, aes(x = len, y = dose, xmin = len-sd, xmax = len+sd)) +
  geom_point() +
  geom_errorbarh(height = 0.2)
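Recent ggplot2 releases can also draw horizontal error bars with geom_errorbar() itself. The sketch below assumes ggplot2 3.3.0 or later, where geom_errorbar() accepts an orientation argument; the object name p.horiz is illustrative, and the summary table is rebuilt with base R so the block is self-contained.

```r
library(ggplot2)

# Rebuild the mean and SD of len by dose with base R
df <- ToothGrowth
df$dose <- as.factor(df$dose)
df.summary <- data.frame(
  dose = factor(levels(df$dose), levels = levels(df$dose)),
  len  = as.numeric(tapply(df$len, df$dose, mean)),
  sd   = as.numeric(tapply(df$len, df$dose, sd))
)

# Assumption: ggplot2 >= 3.3.0, where orientation = "y" makes the
# error bars run along the x-axis; width then sets the size of the end caps
p.horiz <- ggplot(
  df.summary,
  aes(x = len, y = dose, xmin = len - sd, xmax = len + sd)
) +
  geom_point() +
  geom_errorbar(width = 0.2, orientation = "y")

p.horiz
```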
- Add jitter points (representing individual points), dot plots and violin plots. For this, you should initialize ggplot with the original data (df) and specify the df.summary data in the error plot function, here geom_pointrange().
# Combine with jitter points
ggplot(df, aes(dose, len)) +
  geom_jitter(position = position_jitter(0.2), color = "darkgray") +
  geom_pointrange(aes(ymin = len-sd, ymax = len+sd), data = df.summary)

# Combine with violin plots
ggplot(df, aes(dose, len)) +
  geom_violin(color = "darkgray", trim = FALSE) +
  geom_pointrange(aes(ymin = len-sd, ymax = len+sd), data = df.summary)
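As an alternative to passing a precomputed summary table, the mean and error limits can be computed by ggplot2 while plotting. The following is a minimal sketch using stat_summary() with a small helper function; the helper name mean_sd is hypothetical and is not part of ggplot2.

```r
library(ggplot2)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Hypothetical helper: returns the mean and mean +/- 1 SD in the
# y / ymin / ymax columns that stat_summary(fun.data = ...) expects
mean_sd <- function(v) {
  data.frame(y = mean(v), ymin = mean(v) - sd(v), ymax = mean(v) + sd(v))
}

p <- ggplot(df, aes(dose, len)) +
  geom_jitter(position = position_jitter(0.2), color = "darkgray") +
  stat_summary(fun.data = mean_sd, geom = "pointrange")

p
```

This keeps a single data source for all layers, at the cost of recomputing the statistics each time the plot is built.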
- Create basic bar/line plots of mean +/- error. For these, only the df.summary data is needed:
  - Add lower and upper error bars for the line plot: ymin = len-sd and ymax = len+sd.
  - Add only upper error bars for the bar plot: ymin = len (instead of len-sd) and ymax = len+sd.
Note that, for a line plot with a single line, you should always specify group = 1 in aes(). Because dose is a factor, ggplot2 otherwise treats each level as its own group and draws no connecting line.
# (1) Line plot
ggplot(df.summary, aes(dose, len)) +
  geom_line(aes(group = 1)) +
  geom_errorbar(aes(ymin = len-sd, ymax = len+sd), width = 0.2) +
  geom_point(size = 2)

# (2) Bar plot
ggplot(df.summary, aes(dose, len)) +
  geom_col(fill = "lightgray", color = "black") +
  geom_errorbar(aes(ymin = len, ymax = len+sd), width = 0.2)
For line plot, you might want to treat x-axis as numeric:
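df.sum2 <- df.summary
# Convert via character so that the actual dose values (0.5, 1, 2) are kept;
# as.numeric() applied directly to a factor returns the level codes (1, 2, 3)
df.sum2$dose <- as.numeric(as.character(df.sum2$dose))
ggplot(df.sum2, aes(dose, len)) +
  geom_line() +
  geom_errorbar(aes(ymin = len-sd, ymax = len+sd), width = 0.2) +
  geom_point(size = 2)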
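One caveat when turning a factor such as dose into a numeric axis: calling as.numeric() directly on a factor returns its internal level codes rather than the values printed as labels. A short base R illustration, using a stand-alone factor rather than the summary table:

```r
# A factor with the same levels as the dose variable
dose <- factor(c(0.5, 1, 2))

as.numeric(dose)                 # internal level codes: 1 2 3
as.numeric(as.character(dose))   # the dose values themselves: 0.5 1.0 2.0
```

With the level codes, the points would be evenly spaced at 1, 2 and 3; with the real values, the spacing on the x-axis reflects the actual doses.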
- Bar plots and line plots + jitter points. We need the original df data for the jitter points and the df.summary data for the other geom layers.
  - For the line plot: First, add jitter points, then add lines + error bars + mean points on top of the jitter points.
  - For the bar plot: First, add the bar plot, then add jitter points + error bars on top of the bars.
# (1) Create a line plot of means +
# individual jitter points + error bars
ggplot(df, aes(dose, len)) +
  geom_jitter(position = position_jitter(0.2), color = "darkgray") +
  geom_line(aes(group = 1), data = df.summary) +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd),
    data = df.summary, width = 0.2
  ) +
  geom_point(data = df.summary, size = 2)

# (2) Bar plots of means + individual jitter points + errors
ggplot(df, aes(dose, len)) +
  geom_col(data = df.summary, fill = NA, color = "black") +
  geom_jitter(position = position_jitter(0.2), color = "black") +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd),
    data = df.summary, width = 0.2
  )
Grouped error bars
Case of one continuous variable (len) and two grouping variables (dose, supp).
- Compute the summary statistics of len grouped by dose and supp:
library(dplyr)
df.summary2 <- df %>%
  group_by(dose, supp) %>%
  summarise(
    sd = sd(len),
    len = mean(len)
  )
df.summary2
## # A tibble: 6 x 4
## # Groups:   dose [?]
##   dose  supp     sd   len
##   <fct> <fct> <dbl> <dbl>
## 1 0.5   OJ     4.46 13.2 
## 2 0.5   VC     2.75  7.98
## 3 1     OJ     3.91 22.7 
## 4 1     VC     2.52 16.8 
## 5 2     OJ     2.66 26.1 
## 6 2     VC     4.80 26.1
- Create error plots for multiple groups:
- pointrange colored by groups (supp)
- standard error bars + mean points colored by groups (supp)
# (1) Pointrange: Vertical line with point in the middle
ggplot(df.summary2, aes(dose, len)) +
  geom_pointrange(
    aes(ymin = len-sd, ymax = len+sd, color = supp),
    position = position_dodge(0.3)
  ) +
  scale_color_manual(values = c("#00AFBB", "#E7B800"))

# (2) Standard error bars
ggplot(df.summary2, aes(dose, len)) +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd, color = supp),
    position = position_dodge(0.3), width = 0.2
  ) +
  geom_point(aes(color = supp), position = position_dodge(0.3)) +
  scale_color_manual(values = c("#00AFBB", "#E7B800"))
- Create simple line/bar plots for multiple groups.
  - Line plots: change linetype by groups (supp)
  - Bar plots: change fill color by groups (supp)
# (1) Line plot + error bars
ggplot(df.summary2, aes(dose, len)) +
  geom_line(aes(linetype = supp, group = supp)) +
  geom_point() +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd, group = supp),
    width = 0.2
  )

# (2) Bar plots + upper error bars.
ggplot(df.summary2, aes(dose, len)) +
  geom_col(aes(fill = supp), position = position_dodge(0.8), width = 0.7) +
  geom_errorbar(
    aes(ymin = len, ymax = len+sd, group = supp),
    width = 0.2, position = position_dodge(0.8)
  ) +
  scale_fill_manual(values = c("grey80", "grey30"))
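In a grouped line plot, the error bars of the two supp groups sit at the same x position and can overlap. One possible refinement, sketched here under the assumption that a small horizontal offset is acceptable, is to share a single position_dodge() object across the line, error bar and point layers so that they stay aligned. The names pd and p.dodged are illustrative, and the summary table is rebuilt with base R so that the block is self-contained.

```r
library(ggplot2)

df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Mean and SD of len for each dose x supp combination (base R);
# both aggregate() calls return rows in the same order
df.summary2 <- aggregate(len ~ dose + supp, data = df, FUN = mean)
df.summary2$sd <- aggregate(len ~ dose + supp, data = df, FUN = sd)$len

# One dodge specification reused by every layer keeps them aligned
pd <- position_dodge(0.3)

p.dodged <- ggplot(df.summary2, aes(dose, len, group = supp)) +
  geom_line(aes(linetype = supp), position = pd) +
  geom_errorbar(
    aes(ymin = len - sd, ymax = len + sd),
    width = 0.2, position = pd
  ) +
  geom_point(position = pd)

p.dodged
```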
- Add jitter points:
# Line plots with jittered points
ggplot(df, aes(dose, len, color = supp)) +
  geom_jitter(position = position_jitter(0.2)) +
  geom_line(aes(group = supp), data = df.summary2) +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd),
    data = df.summary2, width = 0.2
  ) +
  scale_color_manual(values = c("#00AFBB", "#E7B800")) +
  theme(legend.position = "top")

# Bar plots + jittered points + error bars
ggplot(df, aes(dose, len, color = supp)) +
  geom_col(
    data = df.summary2, position = position_dodge(0.8),
    width = 0.7, fill = "white"
  ) +
  geom_jitter(
    position = position_jitterdodge(jitter.width = 0.2, dodge.width = 0.8)
  ) +
  geom_errorbar(
    aes(ymin = len-sd, ymax = len+sd),
    data = df.summary2, width = 0.2, position = position_dodge(0.8)
  ) +
  scale_color_manual(values = c("#00AFBB", "#E7B800")) +
  theme(legend.position = "top")
Conclusion
This article describes how to add error bars to plots created using the ggplot2 R package.