Base R

A binary variable coded 0/1 can be entered directly in a regression formula in R, where it acts as an indicator variable with no extra steps. To illustrate this, follow Example 2 from Section 14.4. First, enter the data, coding the variable *sex* as 0 for males and 1 for females.

Lic_Drivers <- c(12, 6424, 6941, 18068, 20406, 19898, 14430, 8194, 4803, 12, 6139, 6816, 17664, 20063, 19984, 14441, 8400, 5375)
Crashes <- c(227, 5180, 5016, 8595, 7990, 7118, 4527, 2274, 2022, 77, 2113, 1531, 2780, 2742, 2285, 1514, 938, 980)
Sex <- rep(c(0, 1), each = 9)
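
Before fitting the model, it can help to confirm that the three vectors line up as intended. The following is a minimal sketch, assuming the vectors are collected into a data frame; the name crash_data is illustrative and does not come from the example.

```r
# Data from the example: 9 rows for males (Sex = 0), then 9 for females (Sex = 1)
Lic_Drivers <- c(12, 6424, 6941, 18068, 20406, 19898, 14430, 8194, 4803,
                 12, 6139, 6816, 17664, 20063, 19984, 14441, 8400, 5375)
Crashes <- c(227, 5180, 5016, 8595, 7990, 7118, 4527, 2274, 2022,
             77, 2113, 1531, 2780, 2742, 2285, 1514, 938, 980)
Sex <- rep(c(0, 1), each = 9)

# crash_data is a hypothetical name chosen for this sketch
crash_data <- data.frame(Lic_Drivers, Crashes, Sex)

# Show the structure of the data frame and count the rows in each Sex group
str(crash_data)
table(crash_data$Sex)
```

If the data were entered correctly, the data frame has 18 rows and the table shows 9 observations for each value of Sex.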

To fit a multiple linear regression of Crashes on Lic_Drivers and Sex, enter

lm(Crashes ~ Lic_Drivers + Sex)
## 
## Call:
## lm(formula = Crashes ~ Lic_Drivers + Sex)
## 
## Coefficients:
## (Intercept)  Lic_Drivers          Sex  
##   2288.6706       0.2254   -3102.8274
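
To see that the 0/1 coding behaves as an indicator variable, one option is to compare it with the same variable stored as a factor. This is only a sketch: the names Sex_f, fit_num, and fit_fac are illustrative, and it relies on R's default treatment contrasts, under which the first factor level (here "Male") is the reference group.

```r
# Data from the example
Lic_Drivers <- c(12, 6424, 6941, 18068, 20406, 19898, 14430, 8194, 4803,
                 12, 6139, 6816, 17664, 20063, 19984, 14441, 8400, 5375)
Crashes <- c(227, 5180, 5016, 8595, 7990, 7118, 4527, 2274, 2022,
             77, 2113, 1531, 2780, 2742, 2285, 1514, 938, 980)
Sex <- rep(c(0, 1), each = 9)

# Hypothetical factor version of Sex with readable labels
Sex_f <- factor(Sex, levels = c(0, 1), labels = c("Male", "Female"))

fit_num <- lm(Crashes ~ Lic_Drivers + Sex)    # numeric 0/1 coding
fit_fac <- lm(Crashes ~ Lic_Drivers + Sex_f)  # factor coding

# The factor fit labels the indicator coefficient Sex_fFemale
coef(fit_fac)

# Compare the estimates, ignoring the coefficient names
all.equal(unname(coef(fit_num)), unname(coef(fit_fac)))
```

Both fits use the same indicator column, so the estimates agree; only the coefficient name differs.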

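The least-squares regression model is y-hat = 2289 + 0.2254(Lic_Drivers) − 3103(Sex). Because Sex is 0 for males and 1 for females, the coefficient of Sex indicates that the predicted value of Crashes is about 3103 lower for females than for males with the same value of Lic_Drivers.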
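
The fitted model can also be used for prediction with predict(). The sketch below is illustrative only: the names crash_data, fit, and new_groups are hypothetical, and the value 10000 for Lic_Drivers is an arbitrary choice, not taken from the example.

```r
# Data from the example, stored in a data frame (crash_data is a hypothetical name)
crash_data <- data.frame(
  Lic_Drivers = c(12, 6424, 6941, 18068, 20406, 19898, 14430, 8194, 4803,
                  12, 6139, 6816, 17664, 20063, 19984, 14441, 8400, 5375),
  Crashes = c(227, 5180, 5016, 8595, 7990, 7118, 4527, 2274, 2022,
              77, 2113, 1531, 2780, 2742, 2285, 1514, 938, 980),
  Sex = rep(c(0, 1), each = 9)
)

fit <- lm(Crashes ~ Lic_Drivers + Sex, data = crash_data)

# Two hypothetical groups with the same Lic_Drivers value (assumed: 10000),
# one male (Sex = 0) and one female (Sex = 1)
new_groups <- data.frame(Lic_Drivers = c(10000, 10000), Sex = c(0, 1))

pred <- predict(fit, newdata = new_groups)
pred

# Female prediction minus male prediction; this equals the Sex coefficient
unname(pred[2] - pred[1])
```

Holding Lic_Drivers fixed, the two predictions differ by exactly the coefficient of Sex, which is how an indicator variable shifts the intercept between groups.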