Let’s use the HomeRun_2014 data to learn how to draw side-by-side boxplots. Recall that this data set records every home run hit during the 2014 Major League Baseball season. We will draw side-by-side boxplots of the quantitative variable “TrueDist” (distance, in feet, of a home run) by “Type” (PL - plenty, ND - no doubt, JE - just enough, ITP - inside the park).

HomeRun <- read.csv("https://sullystats.github.io/Statistics6e/Data/HomeRun_2014.csv")
head(HomeRun,n=4)
##        Date           Hitter HitterTeam           Pitcher PitcherTeam INN
## 1 9/28/2014   Rizzo, Anthony        CHC       Fiers, Mike         MIL   1
## 2 9/28/2014 Bernadina, Roger        LAD      Scahill, Rob         COL   6
## 3 9/28/2014     Duvall, Adam         SF     Stauffer, Tim          SD   4
## 4 9/28/2014      Duda, Lucas        NYM Foltynewicz, Mike         HOU   8
##          Ballpark TrueDist SpeedOffBat Elev.Angle Horiz.Angle Apex Type
## 1     Miller Park      441       109.1       22.7        86.7   81   PL
## 2 Dodger Stadi...      424       113.2       27.7        62.3   98   ND
## 3       AT&T Park      423       103.6       31.9       112.9   98   ND
## 4      Citi Field      417       106.3       26.5        73.0   83   PL

Using Base R

Now, use the boxplot( ) command. The syntax is

boxplot(quant_var ~ qualitative_var, horizontal = TRUE, main = "…", xlab = "…", col = "#6897bb")

boxplot(HomeRun$TrueDist ~ HomeRun$Type, horizontal = TRUE, main = "Home Run Distance by Type", xlab = "Distance (in feet)", ylab ="Type", col = "#6897bb")
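The same formula can also be written with the data = argument, so the columns are referred to by name instead of with HomeRun$. The sketch below uses a small made-up data frame (hr_demo is a hypothetical stand-in, not the real 2014 data) so that it runs without a download. It also shows that boxplot( ) invisibly returns the numbers it used to draw each box.

```r
# Hypothetical stand-in for HomeRun: made-up distances, not real 2014 data
hr_demo <- data.frame(
  TrueDist = c(440, 425, 421, 415, 398, 405, 372, 380, 365),
  Type     = c("PL", "ND", "ND", "PL", "PL", "ND", "JE", "JE", "JE")
)

# data = lets the formula refer to columns by name, without hr_demo$
bp <- boxplot(TrueDist ~ Type, data = hr_demo, horizontal = TRUE,
              main = "Home Run Distance by Type (illustrative data)",
              xlab = "Distance (in feet)", ylab = "Type", col = "#6897bb")

# boxplot() invisibly returns the values behind the plot
bp$names  # group labels, in plotting order
bp$stats  # one column per group: lower whisker, Q1, median, Q3, upper whisker
```

The groups are plotted in the order of the factor levels of Type, which is alphabetical by default.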

Note: In the example above, the quantitative variable was in one column, while the qualitative variable was in a separate column. If, however, the values of the quantitative variable for each level of the qualitative variable are in separate columns (such as distance for JE home runs in one column, distance for ND home runs in a second column, and so on), use the following syntax.

boxplot(table$column1, table$column2, names = c("column1_name", "column2_name"))
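As a minimal sketch of the separate-columns case, assume a table with one column of distances per home run type. The data frame hr_wide and its values are made up for illustration only. The last two lines show one alternative, using base R's stack( ) to reshape the table into one quantitative column and one qualitative column, so that the formula syntax applies again.

```r
# Hypothetical wide-format table: one column of distances per home run type
# (made-up values for illustration only)
hr_wide <- data.frame(
  JE = c(365, 372, 380, 391),
  ND = c(424, 423, 405, 436),
  PL = c(441, 417, 398, 410)
)

# Pass each column as its own argument and label the boxes with names =
boxplot(hr_wide$JE, hr_wide$ND, hr_wide$PL,
        names = c("JE", "ND", "PL"),
        horizontal = TRUE, col = "#6897bb",
        main = "Home Run Distance by Type (illustrative data)",
        xlab = "Distance (in feet)")

# Alternative: stack() reshapes to long format with columns "values" and "ind"
hr_long <- stack(hr_wide)
boxplot(values ~ ind, data = hr_long, horizontal = TRUE)
```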

Using Mosaic

To have Mosaic draw a horizontal boxplot by a qualitative variable, use the syntax:

function(qual_variable ~ quant_variable, data = df_name)

The command for a boxplot in Mosaic is bwplot( ).

library(mosaic)
bwplot(Type ~ TrueDist, data = HomeRun, col = "black", fill = "#6897bb", main = "Home Run Distance by Type", xlab = "Distance (in feet)", ylab = "Type")
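The orientation of the boxes follows the order of the variables in the formula. With the qualitative variable on the left, as above, the boxes are horizontal; swapping the two sides should give vertical boxes. The sketch below illustrates this with a small made-up data frame (hr_demo is hypothetical, not the real 2014 data). It loads lattice directly, since bwplot( ) is a lattice function that becomes available when mosaic is loaded.

```r
library(lattice)  # library(mosaic) also makes bwplot() available

# Hypothetical stand-in for HomeRun: made-up distances, not real 2014 data
hr_demo <- data.frame(
  TrueDist = c(440, 425, 421, 415, 398, 405, 372, 380, 365),
  Type     = factor(c("PL", "ND", "ND", "PL", "PL", "ND", "JE", "JE", "JE"))
)

# Quantitative variable on the left of ~ puts distance on the vertical axis
p <- bwplot(TrueDist ~ Type, data = hr_demo, fill = "#6897bb",
            main = "Home Run Distance by Type (illustrative data)",
            xlab = "Type", ylab = "Distance (in feet)")

# lattice plots are objects; print() draws them (needed inside scripts)
print(p)
```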