Constructing Histograms from Continuous Data

Again, we will use the data from Table 12 in Section 2.2.

Table12 <- read.csv("https://sullystats.github.io/Statistics6e/Data/Chapter2/Table12.csv")

Now, let’s draw a histogram of the data.

Using Mosaic

Use the gf_histogram command. Don’t forget to install the mosaic package (if necessary) and load the mosaic library. Two arguments needed within gf_histogram are binwidth and boundary. The argument binwidth=25 makes the class width 25. The argument boundary=50 aligns the classes so that one class starts at 50, which makes 50 the lower class limit of the first class for these data.

install.packages("mosaic")  # package names are case-sensitive; run only if mosaic is not yet installed
library(mosaic)
gf_histogram(~Fine,data=Table12,binwidth=25,boundary=50,col="black",fill="blue",ylab="Frequency",title="Fines for Parking and Camera Violations in New York City")
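
To see which classes a class width of 25 and a starting limit of 50 produce, the class limits can be tabulated in base R. The following is a minimal sketch, not part of the original example: the vector fines holds made-up values (not the Table 12 data), and the sketch assumes the smallest value is at least 50. It uses left-closed classes such as [50,75); a plotting function may assign values that fall exactly on a class limit differently, depending on which side of each class it treats as closed.

```r
# Hypothetical fines in dollars (illustration only; not the Table 12 data)
fines <- c(50, 65, 65, 95, 115, 115, 115, 180, 265)

width <- 25   # class width, playing the role of binwidth=25
start <- 50   # lower class limit of the first class, playing the role of boundary=50

# Class limits: start, start + width, ... far enough to cover the largest value
upper  <- start + width * (floor((max(fines) - start) / width) + 1)
limits <- seq(start, upper, by = width)

# right = FALSE makes each class include its lower limit: [50,75), [75,100), ...
classes <- cut(fines, breaks = limits, right = FALSE)
freq <- table(classes)
print(freq)
```

Each count in the printed table corresponds to the height of one bar in a frequency histogram built on the same classes.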

To create a relative frequency histogram, use the gf_refine function to add a secondary axis that divides each count by the number of observations. Again, we are going to use two functions in one command, so use the plus (+) sign.

gf_histogram(~Fine,data=Table12,binwidth=25,boundary=50,col="black",fill="blue",title="Fines for Parking and Camera Violations in New York City") + gf_refine(scale_y_continuous(sec.axis=sec_axis(trans=~./nrow(Table12),name="Relative Frequency")))

The left axis of the histogram still shows counts, but the right axis shows relative frequencies. A little strange, but it gets the point across.
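
The arithmetic behind the secondary axis is just a division by the number of observations. As a small sketch under stated assumptions, the value n = 9 and the tick positions below are hypothetical stand-ins for nrow(Table12) and the count axis, and to_relative is an illustrative helper name rather than a mosaic function.

```r
# Hypothetical number of observations (plays the role of nrow(Table12))
n <- 9

# Same rule as the secondary-axis formula ~ . / nrow(Table12)
to_relative <- function(count) count / n

# Hypothetical tick marks on the count (left) axis
count_ticks <- c(0, 1, 2, 3)

# Each count tick lines up with this relative frequency on the right axis
axis_map <- data.frame(count = count_ticks,
                       relative_frequency = round(to_relative(count_ticks), 3))
print(axis_map)
```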

Using Base R

To create a histogram with frequencies as the labels, use the following function:

hist(table$column_name, breaks=seq(0,max,by=#))

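Notice that # represents the class width of the histogram, 0 is the lower boundary of the first class, and max is the upper boundary of the last class. The breaks must span all of the data, or hist will return an error.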

hist(Table12$Fine, breaks = seq(12.5,337.5,by=25),xlab = "Fine Amount ($)", main = "Fines for Parking and Camera Violations in New York City", col = '#6897bb', labels = TRUE)

axis(side=1, at=seq(0,350,25), labels=seq(0,350,25))
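
One way to pick breaks that are guaranteed to span the data is to round the smallest and largest values outward to multiples of the class width. The sketch below uses a made-up vector fines (not the Table 12 data) and also shows how the right argument of hist changes which class a value on a boundary belongs to. It is an illustration of the idea, not the breaks used in the example above.

```r
# Hypothetical fines in dollars (illustration only; not the Table 12 data)
fines <- c(50, 65, 75, 95, 115, 115, 125, 180, 265)
width <- 25

# Round outward to multiples of the class width so the breaks cover all values
lower  <- floor(min(fines) / width) * width
upper  <- (floor(max(fines) / width) + 1) * width
breaks <- seq(lower, upper, by = width)

# Default: classes are closed on the right, so 75 is counted in the first class
h_right <- hist(fines, breaks = breaks, plot = FALSE)

# right = FALSE: classes are closed on the left, so 75 is counted in [75,100)
h_left <- hist(fines, breaks = breaks, right = FALSE, plot = FALSE)

print(rbind(right_closed = h_right$counts, left_closed = h_left$counts))
```

The two rows differ only where a value sits exactly on a class boundary, which is why offset breaks such as 12.5, 37.5, and so on avoid the question for whole-dollar amounts.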

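Adding probability = TRUE to the command above does not plot relative frequencies. It plots densities, which are the relative frequencies divided by the class width (25 here). To create a relative frequency histogram, compute the histogram without plotting it, replace the counts with relative frequencies, and then plot the result. Change the name of ylab.

h <- hist(Table12$Fine, breaks = seq(12.5,337.5,by=25), plot = FALSE)  # compute the classes and counts only
h$counts <- round(h$counts / sum(h$counts), 3)  # convert counts to relative frequencies
plot(h, xlab = "Fine Amount ($)", ylab = "Relative Frequency", main = "Fines for Parking and Camera Violations in New York City", col = '#6897bb', labels = TRUE)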

axis(side=1, at=seq(0,350,25), labels=seq(0,350,25))
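
The link between the density scale reported by hist and relative frequency can be checked numerically. This is a minimal sketch using a made-up vector fines (not the Table 12 data): for equal-width classes, each density equals the relative frequency divided by the class width, so the densities multiplied by the width sum to 1.

```r
# Hypothetical fines in dollars (illustration only; not the Table 12 data)
fines <- c(50, 65, 65, 95, 115, 115, 115, 180, 265)
width <- 25

# Compute the histogram without drawing it
h <- hist(fines, breaks = seq(50, 275, by = width), plot = FALSE)

# Relative frequency = class count / total count
rel_freq <- h$counts / sum(h$counts)

# Density = relative frequency / class width
print(all.equal(h$density, rel_freq / width))

# Bar areas (density * width) add up to 1; bar heights on the density scale do not
print(sum(h$density * width))
```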