14.3 Confidence and Prediction Intervals

Michael Sullivan

2023-08-28

Base R

To find the least-squares regression model, use the lm() command.

We will use the cholesterol data from Section 14.1, Table 4.

Age <- c(25, 25, 28, 32, 32, 32, 38, 42, 48, 51, 51, 58, 62, 65)
Fat <- c(19,28,19,16,24,20,31,20,26,24,32,21,21,30)
Cholesterol <- c(180, 195, 186, 180, 210, 197, 239, 183, 204, 221, 243, 208, 228, 269)
Table4 <- data.frame('Age'=Age, 'Fat'=Fat,'Cholesterol'=Cholesterol)
head(Table4)

##   Age Fat Cholesterol
## 1  25  19         180
## 2  25  28         195
## 3  28  19         186
## 4  32  16         180
## 5  32  24         210
## 6  32  20         197

Find the least-squares regression model and save it as an object.

lm_object <- lm(Cholesterol ~ Age + Fat, data=Table4)
summary(lm_object)

##
## Call:
## lm(formula = Cholesterol ~ Age + Fat, data = Table4)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -19.874  -8.192   3.479   8.151  14.907
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  90.8415    15.9887   5.682 0.000142 ***
## Age           1.0142     0.2427   4.179 0.001540 **
## Fat           3.2443     0.6632   4.892 0.000478 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 11.42 on 11 degrees of freedom
## Multiple R-squared:  0.8473, Adjusted R-squared:  0.8196
## F-statistic: 30.53 on 2 and 11 DF,  p-value: 3.239e-05

To create a confidence interval for the mean value of the response variable, use the predict() command with interval = 'confidence'.

predict(lm_object,newdata=data.frame('Age'=32,'Fat'=23),interval='confidence',level=0.95)

##        fit      lwr      upr
## 1 197.9142 189.4487 206.3797

# NOTE: To get a confidence interval for a different age or fat intake, change the values 32 and 23 in newdata. To use a different level of confidence, change the value of level.
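As a rough illustration of the note above, the sketch below asks predict() for several intervals at once and at a 90% level. The ages 45 and 60, the 0.90 level, and the names new_values and ci_90 are hypothetical choices made only for this example; the data are the Table 4 values entered earlier.

```r
# Table 4 data, re-entered so this block runs on its own
Table4 <- data.frame(
  Age = c(25, 25, 28, 32, 32, 32, 38, 42, 48, 51, 51, 58, 62, 65),
  Fat = c(19, 28, 19, 16, 24, 20, 31, 20, 26, 24, 32, 21, 21, 30),
  Cholesterol = c(180, 195, 186, 180, 210, 197, 239, 183, 204, 221, 243, 208, 228, 269)
)
lm_object <- lm(Cholesterol ~ Age + Fat, data = Table4)

# Hypothetical scenarios: three ages, each with 23 grams of saturated fat
new_values <- data.frame(Age = c(32, 45, 60), Fat = c(23, 23, 23))

# One row of output per row of new_values, at a 90% level of confidence
ci_90 <- predict(lm_object, newdata = new_values,
                 interval = "confidence", level = 0.90)
ci_90
```

Each row of the result lines up with the corresponding row of new_values, so the first row is the same scenario as above but with a narrower 90% interval.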

We are 95% confident that the mean total cholesterol of all 32-year-old females who consume 23 grams of saturated fat daily is between 189.4 mg/dL and 206.4 mg/dL.
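To see where the bounds come from, the interval can be rebuilt by hand as fit plus or minus a critical t-value times the standard error of the fitted mean. This is only a sketch: it relies on the se.fit = TRUE option of predict(), and the names pred, t_crit, lower, and upper are introduced here for illustration.

```r
# Table 4 data, re-entered so this block runs on its own
Table4 <- data.frame(
  Age = c(25, 25, 28, 32, 32, 32, 38, 42, 48, 51, 51, 58, 62, 65),
  Fat = c(19, 28, 19, 16, 24, 20, 31, 20, 26, 24, 32, 21, 21, 30),
  Cholesterol = c(180, 195, 186, 180, 210, 197, 239, 183, 204, 221, 243, 208, 228, 269)
)
lm_object <- lm(Cholesterol ~ Age + Fat, data = Table4)

# se.fit = TRUE also returns the standard error of the fitted mean
# and the residual degrees of freedom
pred <- predict(lm_object, newdata = data.frame(Age = 32, Fat = 23), se.fit = TRUE)

# Critical value for 95% confidence: 2.5% in each tail
t_crit <- qt(0.975, df = pred$df)

lower <- pred$fit - t_crit * pred$se.fit
upper <- pred$fit + t_crit * pred$se.fit
c(fit = unname(pred$fit), lwr = unname(lower), upr = unname(upper))
```

The result should agree with the lwr and upr columns that predict() reports with interval = 'confidence'.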

To create a prediction interval for an individual response, change the interval to 'prediction'.

predict(lm_object,newdata=data.frame('Age'=32,'Fat'=23),interval='prediction',level=0.95)

##        fit     lwr      upr
## 1 197.9142 171.396 224.4324

We are 95% confident that the total cholesterol of a randomly selected 32-year-old female who consumes 23 grams of saturated fat daily is between 171.4 mg/dL and 224.4 mg/dL.
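The prediction interval is wider than the confidence interval because it accounts for the variability of an individual around the mean as well as the uncertainty in the estimated mean. A minimal sketch of that calculation follows, assuming the usual unweighted model; the names pred, se_pred, t_crit, lower, and upper are introduced here for illustration.

```r
# Table 4 data, re-entered so this block runs on its own
Table4 <- data.frame(
  Age = c(25, 25, 28, 32, 32, 32, 38, 42, 48, 51, 51, 58, 62, 65),
  Fat = c(19, 28, 19, 16, 24, 20, 31, 20, 26, 24, 32, 21, 21, 30),
  Cholesterol = c(180, 195, 186, 180, 210, 197, 239, 183, 204, 221, 243, 208, 228, 269)
)
lm_object <- lm(Cholesterol ~ Age + Fat, data = Table4)

pred <- predict(lm_object, newdata = data.frame(Age = 32, Fat = 23), se.fit = TRUE)

# Standard error for an individual response: combine the uncertainty in the
# fitted mean (se.fit) with the residual standard error (residual.scale)
se_pred <- sqrt(pred$se.fit^2 + pred$residual.scale^2)

# Critical value for 95% confidence: 2.5% in each tail
t_crit <- qt(0.975, df = pred$df)

lower <- pred$fit - t_crit * se_pred
upper <- pred$fit + t_crit * se_pred
c(fit = unname(pred$fit), lwr = unname(lower), upr = unname(upper))
```

Because the residual standard error (about 11.42 here) is added under the square root, se_pred is always larger than se.fit, which is why the prediction interval contains the confidence interval.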