Load Data

Option 1: Use Excel or Google Sheets to create a CSV file. The first column should hold the class midpoints, \(x_i\), and the second column should hold the frequencies, \(f_i\). Be sure to give each column a title, because R uses the titles as column names.

We will use the data in Table 14 of Section 3.3.

Table14 <- read.csv("https://sullystats.github.io/Statistics6e/Data/Chapter3/Table14.csv")
head(Table14,n=4)
##   Midpoint Frequency
## 1     62.5         1
## 2     87.5         0
## 3    112.5         7
## 4    137.5        10

Option 2: Enter the data directly into R and create a data frame. Note that the frequency column is named Freq here, not Frequency as in the CSV file.

Midpoint <- c(62.5,87.5,112.5,137.5,162.5,187.5,212.5,237.5,262.5,287.5,312.5)
Freq <- c(1,0,7,10,5,4,13,4,5,0,1)
Table14a <- data.frame(Midpoint,Freq)
head(Table14a,n=4)
##   Midpoint Freq
## 1     62.5    1
## 2     87.5    0
## 3    112.5    7
## 4    137.5   10
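Hand-typed data is easy to get wrong, so a quick sanity check can help before any statistics are computed. The sketch below assumes the Option 2 vectors; the names class_width and n are introduced here for illustration only. It checks that each midpoint has a frequency, that the midpoints are evenly spaced, and that the frequencies sum to the sample size of 50.

```r
# Re-enter the grouped data from Option 2 so this block runs on its own
Midpoint <- c(62.5,87.5,112.5,137.5,162.5,187.5,212.5,237.5,262.5,287.5,312.5)
Freq <- c(1,0,7,10,5,4,13,4,5,0,1)
Table14a <- data.frame(Midpoint,Freq)

# Every class midpoint should have exactly one frequency
stopifnot(length(Midpoint) == length(Freq))

# diff() gives the gaps between neighboring midpoints;
# a single unique value means the classes are evenly spaced
class_width <- unique(diff(Table14a$Midpoint))   # illustrative name

# Total number of observations represented by the table
n <- sum(Table14a$Freq)                          # illustrative name

class_width
n
```

If class_width contains more than one value, or n is not the expected sample size, a midpoint or frequency was probably mistyped.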

To find the mean and standard deviation from grouped data, we need to install a package called Weighted.Desc.Stat. The package only needs to be installed once.

install.packages("Weighted.Desc.Stat")

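Now, we will load the Weighted.Desc.Stat package with library() and find the mean and standard deviation of the data in Table 14. In both functions, the first argument is the vector of midpoints and the second is the vector of frequencies, which act as the weights.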

library(Weighted.Desc.Stat)
w.mean(Table14$Midpoint,Table14$Frequency)   # Find the mean
## [1] 182.5
w.sd(Table14$Midpoint,Table14$Frequency)    # Find the standard deviation
## [1] 53.61903

The weighted mean is \(\overline{x}_w\) = $182.50.
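For reference, this value matches the usual grouped-data formula, in which each class midpoint is weighted by its frequency. Written out with the Table 14 values (the sums were worked by hand from the midpoints and frequencies listed under Option 2):

\[
\overline{x}_w = \frac{\sum x_i f_i}{\sum f_i} = \frac{9125}{50} = 182.5
\]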

The standard deviation returned by w.sd is a population standard deviation, which means it divides by \(n\) rather than \(n - 1\). Because the data in Table 14 are sample data, we need to multiply the weighted standard deviation by \(\sqrt{\frac{n}{n - 1}}\). In this problem, \(n = 50\) (the sum of the frequencies), so multiply the weighted standard deviation by \(\sqrt{\frac{50}{49}}\).

w.sd(Table14$Midpoint,Table14$Frequency)*sqrt(50/49)  # Find the sample standard deviation. 
## [1] 54.1634

So, the weighted sample standard deviation is $54.16.
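As a cross-check, the same quantities can be computed in base R without the package. This is a minimal sketch of the grouped-data formulas, not the package's internal code. The variable names n, xbar, ss, pop_sd, and samp_sd are introduced here for illustration. Computing n from the frequencies avoids hard-coding 50 and 49.

```r
# Grouped data from Table 14 (same values as Option 2)
Midpoint <- c(62.5,87.5,112.5,137.5,162.5,187.5,212.5,237.5,262.5,287.5,312.5)
Freq <- c(1,0,7,10,5,4,13,4,5,0,1)

n <- sum(Freq)                        # sample size = sum of the frequencies
xbar <- sum(Midpoint * Freq) / n      # weighted (grouped) mean

# Frequency-weighted sum of squared deviations from the mean
ss <- sum(Freq * (Midpoint - xbar)^2)

pop_sd <- sqrt(ss / n)                # population form: divide by n
samp_sd <- sqrt(ss / (n - 1))         # sample form: divide by n - 1

xbar
pop_sd
samp_sd
```

The three values should agree with the w.mean and w.sd results above: 182.5, about 53.62, and about 54.16. Note that samp_sd equals pop_sd multiplied by sqrt(n/(n - 1)), which is the correction applied above.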