Hypothesis Tests for a Population Proportion

All raw data from Sullivan Statistics: Informed Decisions Using Data 6/e may be found at

First, be sure the package mosaic is installed.

install.packages('mosaic')

Hypothesis Test for a Population Proportion (Summarized Data)

Example 1 Humira is a medication used to treat rheumatoid arthritis (RA). In clinical trials of Humira, 705 subjects diagnosed with RA were administered 40 mg of Humira every other week. Of the 705 subjects, 66 reported nausea as a side effect. It is known that the proportion of RA subjects in similar studies receiving a placebo who report nausea as a side effect is 0.08. Does the sample evidence suggest that a higher proportion of subjects receiving Humira experience nausea as a side effect than those taking a placebo? Use the \(\alpha = 0.05\) level of significance. Source: rxabbvie.com

Use the binom.test function from the mosaic package, where x is the number of successes and n is the number of trials. The syntax is

binom.test(x, n, p = proportion in null, alternative = "greater" or "less" or "two.sided")

library(mosaic)
binom.test(66,705,p=0.08,alternative="greater")

## 
## 
## 
## data:  66 out of 705
## number of successes = 66, number of trials = 705, p-value = 0.1051
## alternative hypothesis: true probability of success is greater than 0.08
## 95 percent confidence interval:
##  0.07615269 1.00000000
## sample estimates:
## probability of success 
##             0.09361702

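The P-value is 0.1051. Because the P-value is greater than the 0.05 level of significance, we do not reject the null hypothesis. There is not sufficient evidence to conclude that a higher proportion of subjects receiving Humira experience nausea than those taking a placebo.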
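For reference, the exact P-value reported by binom.test for a right-tailed test is the binomial probability of observing 66 or more successes in 705 trials when the true proportion is 0.08. The sketch below recomputes it with the base R function pbinom, using only the counts from Example 1; the variable names are illustrative and not part of the original example.

```r
# Counts from Example 1 (variable names are illustrative)
x  <- 66      # subjects reporting nausea
n  <- 705     # subjects in the trial
p0 <- 0.08    # proportion stated in the null hypothesis

# P(X >= x) under the null equals P(X > x - 1), the upper tail beyond x - 1
p_exact <- pbinom(x - 1, size = n, prob = p0, lower.tail = FALSE)
p_exact
```

The value should agree with the P-value shown in the binom.test output.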

Note: binom.test does not report a z test statistic. If you want the test statistic, do the following.

phat <- 66/705                            #66 successes out of 705 trials
z <- (phat - 0.08)/sqrt(0.08*0.92/705)    #Proportion in null is 0.08; n = 705
z

## [1] 1.332716
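The z statistic comes from the normal approximation to the binomial, so a P-value based on it will differ slightly from the exact P-value reported by binom.test. A minimal sketch, assuming the commonly used check that np0(1 - p0) is at least 10 before relying on the normal model, and using the base R function pnorm for the right-tail area (variable names are illustrative):

```r
n    <- 705
p0   <- 0.08
phat <- 66/n

# Condition commonly checked before using the normal model
n*p0*(1 - p0) >= 10

z <- (phat - p0)/sqrt(p0*(1 - p0)/n)

# Right-tailed P-value: area under the standard normal curve to the right of z
p_normal <- pnorm(z, lower.tail = FALSE)
p_normal
```

The resulting P-value is roughly 0.09, which leads to the same decision at the 0.05 level of significance as the exact test.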

Hypothesis Test for a Population Proportion (Raw Data)

Let’s consider the fares charged for ALL Chicago taxi rides on a single day.

Taxi <- read.csv("https://sullystats.github.io/Statistics6e/Data/ChicagoTaxi.csv")
head(Taxi,n=4)

##   Trip  Fare Payment
## 1  300  6.50    Cash
## 2 1281 42.25  Credit
## 3  780 10.75    Cash
## 4  900 17.00  Credit

We are going to work with the variable “Payment”, which is how the fare is paid - cash or credit.

Now, let’s take a random sample of n = 50 rides from this data set and determine whether the sample evidence suggests that a majority (more than half) of fares are paid with cash. So, with p denoting the proportion of all fares paid with cash, we are testing

\(H_0: p = 0.5\)
\(H_1: p > 0.5\)

Taxi_Sample <- sample(Taxi,50)    #Randomly select 50 rows; the rows selected change from run to run
head(Taxi_Sample,n=4)

##       Trip  Fare Payment orig.id
## 17644  475  7.25    Cash   17644
## 5102   249  5.75  Credit    5102
## 27656  900 26.25    Cash   27656
## 9345    60  3.50    Cash    9345
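Because the sample is random, the rows shown above, and any counts computed from them, change each time the code is run. One way to make a draw repeatable is to set a seed first. The sketch below uses base R only and a small made-up data frame standing in for Taxi, so the data values, the object names, and the seed are all hypothetical.

```r
# Hypothetical stand-in for the Taxi data (values invented for illustration)
Toy <- data.frame(
  Fare    = c(6.50, 42.25, 10.75, 17.00, 7.25, 5.75, 26.25, 3.50),
  Payment = c("Cash", "Credit", "Cash", "Credit", "Cash", "Credit", "Cash", "Cash")
)

set.seed(123)                     # arbitrary seed; any fixed value works
rows <- sample(nrow(Toy), 4)      # 4 row numbers drawn without replacement
Toy_Sample <- Toy[rows, ]
Toy_Sample

# Re-setting the same seed reproduces the same rows
set.seed(123)
identical(rows, sample(nrow(Toy), 4))
```

With the real data, the same idea would be a set.seed call placed immediately before the sampling step.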

Now, use the binom.test function from the mosaic package.

library(mosaic)
binom.test(~(Payment=="Cash"),data=Taxi_Sample,p=0.5,alternative = "greater")  #Define a success as Payment equal to "Cash".

## 
## 
## 
## data:  Taxi_Sample$(Payment == "Cash")  [with success = TRUE]
## number of successes = 29, number of trials = 50, p-value = 0.1611
## alternative hypothesis: true probability of success is greater than 0.5
## 95 percent confidence interval:
##  0.4539892 1.0000000
## sample estimates:
## probability of success 
##                   0.58
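The P-value of 0.1611 is larger than common levels of significance such as 0.05, so this particular sample does not provide sufficient evidence that a majority of fares are paid with cash. As a cross-check, the same test can be run from the summarized counts in the output above (29 cash payments in 50 rides). A different random sample would give different counts. This sketch uses the base R version of binom.test, so it needs no additional packages; the variable names are illustrative.

```r
# Counts taken from the output above; they depend on the random sample drawn
cash_count <- 29
n_rides    <- 50

result <- stats::binom.test(cash_count, n_rides, p = 0.5, alternative = "greater")
result$p.value     # P-value of the right-tailed test
result$estimate    # sample proportion of cash payments
```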