Test Hypotheses about a Mean Using Simulation

install.packages("mosaic")    #Only needed once, if mosaic is not already installed.

We are going to follow Example 1 from Section 10.3A.

Coors Field is home to the Colorado Rockies baseball team and is located in Denver, Colorado. Denver is approximately one mile above sea level, and the air there is “thinner.” Therefore, baseballs are thought to travel farther in this stadium. Does the evidence support this belief? In a random sample of 15 home runs hit in Coors Field, the mean distance the ball traveled was 411.75 feet. Does this represent evidence to suggest that the ball travels farther in Coors Field than it does in the other Major League ballparks?

Load the HomeRuns_2018 data set (non-traditional home runs have already been removed from it).

HomeRun <- read.csv("https://sullystats.github.io/Statistics6e/Data/HomeRuns_2018.csv")
head(HomeRun,n=3)

##   Distance
## 1      505
## 2      489
## 3      481

Find the population mean of the data HomeRun.

library(mosaic)
mean(~Distance,data=HomeRun)

## [1] 397.6047

The population mean is approximately 397.6 feet. Therefore, to determine whether home runs at Coors Field travel farther than home runs overall, we test the following hypotheses:

\(H_0:\mu = 397.6\) feet
\(H_1:\mu > 397.6\) feet

Obtain 2000 simple random samples of size n = 15 from the population data, HomeRun. For each sample, compute the mean.

set.seed(10)    #Use a seed so the results are reproducible.
SampleMean <- bind_rows(do(2000) * c(mean = mean(~Distance, data = sample(HomeRun,15))))
head(SampleMean,n=4)

##       mean
## 1 398.2000
## 2 398.2667
## 3 398.5333
## 4 390.6667

histogram(~mean,data=SampleMean,v=411.75,main="Distribution of Sample Mean Home Run Distance", xlab="Sample Mean Distance (in feet)")
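To see what the `do(2000) * ...` step is doing, the same resampling idea can be sketched in base R with `replicate()`. This is only an illustrative equivalent, not the method used above: the helper name `simulate_sample_means` and the small `toy_population` vector are hypothetical, and the random draws will not match the mosaic output line for line.

```r
# Hypothetical helper: draw `reps` simple random samples of size `n`
# (without replacement) from `population` and return each sample's mean.
simulate_sample_means <- function(population, n, reps) {
  replicate(reps, mean(sample(population, size = n)))
}

# Toy stand-in population (made-up values, NOT the home run data).
toy_population <- c(350, 372, 388, 395, 401, 410, 415, 423, 437, 452)

set.seed(10)
toy_means <- simulate_sample_means(toy_population, n = 5, reps = 2000)

length(toy_means)  # one mean per simulated sample
range(toy_means)   # every sample mean lies within the population's range
```

With the data loaded as above, the corresponding call would be `simulate_sample_means(HomeRun$Distance, n = 15, reps = 2000)`.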

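Finally, determine the proportion of sample means that are as extreme as or more extreme than 411.75 feet. Because the alternative hypothesis is right-tailed, this is the proportion of sample means that are at least 411.75 feet.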

prop(~(mean >= 411.75),data=SampleMean)

## prop_TRUE 
##    0.0195

The estimate of the P-value is 0.0195.
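Because this P-value comes from 2000 simulated samples, it carries some simulation error of its own. One rough way to gauge that error (an addition here, not part of the textbook example) is the usual standard error of a proportion, sqrt(p(1 - p)/B), where B is the number of simulated samples. The variable names below are illustrative.

```r
p_hat <- 0.0195   # estimated P-value reported above
B     <- 2000     # number of simulated samples

# Standard error of a proportion based on B independent simulations
se_p <- sqrt(p_hat * (1 - p_hat) / B)
round(se_p, 4)    # about 0.0031
```

Under this approximation, a different seed could plausibly give an estimate anywhere from roughly 0.013 to 0.026, so the conclusion should not hinge on the third decimal place.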

Simulating Results Based on the Normal Model

IQ scores are approximately normal with mean \(\mu\) = 100 and standard deviation \(\sigma\) = 15. What is the likelihood of obtaining a sample mean IQ of 110 or higher based on a random sample of n = 10 individuals who have an advanced or professional degree?

To answer this question, we obtain 2000 simple random samples of size n = 10 from a population that is approximately normal with \(\mu\) = 100 and \(\sigma\) = 15.

set.seed(50)
SampleIQ <- bind_rows(do(2000)*c(meanIQ = mean(rnorm(10,mean=100,sd=15))))
head(SampleIQ,n=4)

##      meanIQ
## 1 101.18910
## 2  91.75132
## 3  96.53523
## 4 110.04012

Now, draw a histogram of the sample means and estimate the P-value by computing the proportion of sample means that are 110 or higher.

histogram(~meanIQ,data=SampleIQ,v=110,main="Mean IQ with n = 10",xlab="Sample Mean")

prop(~(meanIQ>=110),data=SampleIQ)

## prop_TRUE 
##     0.018

The proportion of sample means that are 110 or higher is 0.018.
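As a check on the simulation, the normal model gives this probability directly: the mean of n = 10 independent draws from a normal population with \(\mu\) = 100 and \(\sigma\) = 15 is itself normal with mean 100 and standard deviation \(15/\sqrt{10}\). A minimal sketch using base R's `pnorm()` (the variable names are illustrative):

```r
mu    <- 100
sigma <- 15
n     <- 10

se_mean <- sigma / sqrt(n)   # standard deviation of the sample mean
# Upper-tail probability P(sample mean >= 110) under the normal model
p_exact <- pnorm(110, mean = mu, sd = se_mean, lower.tail = FALSE)
round(p_exact, 4)            # about 0.0175
```

The exact value of about 0.0175 is close to the simulated proportion of 0.018, which is what we would expect from 2000 simulated samples.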