Randomization

Implementation Guide: Simulation and Randomization

The Guidelines for Assessment and Instruction in Statistics Education (GAISE) College Report 2016, endorsed by the American Statistical Association, acknowledges that there are new, “innovative ways to teach the logic of statistical inference.” The most popular techniques for introducing inference are simulations and resampling methods (randomization tests and bootstrapping). I would encourage you to read George Cobb’s article “The Introductory Statistics Course: A Ptolemaic Curriculum,” available at

http://escholarship.org/uc/item/6hb3k0nz

To assist in reaching the goal of expanding students’ understanding of the logic of statistical inference, we offer simulation and randomization material to accompany any of the Sullivan Statistics titles. Below, we lay out a guide to assist you in utilizing these materials in your classes.

Chapter 10 Hypothesis Tests Regarding a Parameter

Sections 10.2A and 10.2B: Simulation-Based Inference to Hypothesis Testing

  • Section 10.2A presents hypothesis tests for a proportion using simulation. Two approaches may be used to build the null model. The first uses coin flipping. This model is useful for describing a random process (such as whether each stock in a series of stocks goes up or down). Emphasize that each flip of the coin represents a choice, and state what a head represents on each flip. Clearly define the concept of a null model. Going through a tactile simulation with actual coins is a good idea. The second approach uses the urn applet in StatCrunch. Its advantage is that you actually build a population based on the proportion stated in the null hypothesis and randomly select outcomes from that population. This helps students develop an intuitive feel for the null model and the meaning of the P-value. One final item to consider is whether to use counts or proportions for the test statistic. Counts are easier for students and remove one layer of complication. Proportions are the basis of the test statistic when we segue to using the normal model to estimate P-values. Once the null model is built, be sure to point out the shape, center, and spread of the outcomes of the simulation.
  • Once you complete Section 10.2A, jump into Section 10.2B. This section utilizes the normal model to obtain P-values for hypothesis tests on a proportion. Continue to emphasize the interpretation of P-values. You might consider presenting simulation side-by-side with the normal model approach so students can see the similarity between the two approaches.
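For instructors who want to preview the side-by-side comparison outside StatCrunch, the following is a minimal Python sketch of the idea, not the StatCrunch applet itself. It builds a null model by repeated "flips" with success probability p0 (a literal coin corresponds to p0 = 0.5; other values behave like the urn), uses the count of successes as the test statistic, and then computes the normal-model P-value for the same data. The function names and the example numbers (100 stocks, 58 going up) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_null_counts(n, p0, reps=5000, seed=1):
    """Build the null model: each repetition is n flips, where a
    success (a "head") occurs with probability p0."""
    rng = random.Random(seed)
    return [sum(rng.random() < p0 for _ in range(n)) for _ in range(reps)]


def simulated_p_value(null_counts, observed_count):
    """Right-tailed P-value: the share of simulated counts that are
    at least as large as the observed count."""
    extreme = sum(c >= observed_count for c in null_counts)
    return extreme / len(null_counts)


def normal_p_value(n, p0, observed_count):
    """Right-tailed P-value from the normal model for the sample proportion."""
    p_hat = observed_count / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)


# Hypothetical data: 58 of 100 stocks went up; the null model says p = 0.5.
counts = simulate_null_counts(n=100, p0=0.5)
print("simulation P-value:  ", simulated_p_value(counts, 58))
print("normal-model P-value:", normal_p_value(100, 0.5, 58))
```

The two P-values will generally be close but not identical, which can itself be a useful discussion point: the simulation works with discrete counts, while the normal model is a continuous approximation.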

Section 10.3A: Hypothesis Tests on a Population Mean Using Simulation and the Bootstrap

Section 10.3A is optional and may be covered prior to the discussion of Section 10.3. This material utilizes both simulation and the bootstrap to perform hypothesis tests on a population mean. Section 10.3A begins by using simulation to estimate P-values. If you did not cover bootstrapping in Chapter 9, cover only Objective 1.

Chapter 11 Inference on Two Population Parameters

Section 11.1A Inference about Two Population Proportions (Randomization Method)

Begin Section 11.1A with a small-sample example of random assignment. This allows you to use a tactile activity (using something like index cards). For example, using the Math Redesign Program (MRP) data from Table 1, let 33 green index cards represent passing students and 15 red index cards represent failing students. Shuffle the cards and then deal 24 cards to the MRP course. Ask students to explain why this is just like randomly assigning 24 students to the MRP course under the assumption that the pass rates in the two courses are equal. In fact, allow each student in class to do this random assignment using the cards. Be sure to aggregate the results of the class to determine the proportion of random assignments that led to a result as extreme as or more extreme than the one observed. If cards are not available, you could still do a tactile activity by utilizing StatCrunch (as was done in Figure 1 of Section 11.1A). Point out that the test statistic is as simple as the count of the number of MRP students who pass! Finally, use the StatCrunch randomization applet to obtain a P-value.
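The card activity above can also be mimicked in a few lines of Python, which may help when aggregating many more shuffles than a class can do by hand. This is only a sketch under stated assumptions: the deck sizes (33 passing, 15 failing, 24 dealt to the MRP course) come from the activity, but the observed count of 20 passing MRP students used below is a hypothetical placeholder, so substitute the actual count from Table 1. The function names are illustrative.

```python
import random


def deal_once(rng, n_pass=33, n_fail=15, n_dealt=24):
    """Shuffle the deck and deal n_dealt cards to the MRP course.
    Returns the number of passing (green) cards that were dealt."""
    deck = ["pass"] * n_pass + ["fail"] * n_fail
    rng.shuffle(deck)
    return deck[:n_dealt].count("pass")


def randomization_p_value(observed_mrp_passes, reps=10000, seed=1):
    """Proportion of random assignments with at least as many passing
    MRP students as observed (a right-tailed test)."""
    rng = random.Random(seed)
    counts = [deal_once(rng) for _ in range(reps)]
    return sum(c >= observed_mrp_passes for c in counts) / reps


# Hypothetical observed count; replace 20 with the value from Table 1.
print(randomization_p_value(observed_mrp_passes=20))
```

Each call to deal_once plays the role of one student's shuffle and deal, and randomization_p_value plays the role of aggregating the class results.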

Of course, with the small sample example, the conditions for using the normal model do not apply, so now cover Example 1 from Section 11.1A to obtain an estimate of the P-value using random assignment. Once this example is complete, cover Section 11.1 and introduce the methods using the normal model to obtain a P-value.

Finally, assign the problems from Section 11.1A and the corresponding problems from Section 11.1 so that students may compare the two approaches.

Section 11.2A Inference about Two Means: Dependent Samples

If you did not cover Section 9.5 on Bootstrapping from Statistics 6/e, do not cover the material in Section 11.2A.

Introduce the bootstrapping approach to hypothesis tests for two dependent means by covering Example 1 from Section 11.2A. Then, present hypothesis tests for two dependent means using Student’s t-distribution from Section 11.2. Compare the P-value from the bootstrapping approach to the P-value from Student’s t-distribution.
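As a rough illustration of what a bootstrap test on dependent samples can look like, here is a Python sketch of one common approach: compute the paired differences, shift them so their mean is zero (so the resampled population satisfies the null hypothesis), and resample with replacement. This is an assumption about the general technique, and the procedure in Example 1 of Section 11.2A may differ in its details. The data and function name are hypothetical.

```python
import random
from statistics import mean


def bootstrap_paired_p_value(before, after, reps=5000, seed=1):
    """Two-tailed bootstrap P-value for H0: mean difference = 0."""
    diffs = [a - b for a, b in zip(after, before)]
    observed = mean(diffs)
    # Shift the differences so the bootstrap population has mean 0,
    # which is what the null hypothesis claims.
    shifted = [d - observed for d in diffs]
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        resample = rng.choices(shifted, k=len(shifted))  # with replacement
        if abs(mean(resample)) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# Hypothetical paired measurements on the same eight individuals.
before = [70, 65, 80, 75, 68, 72, 77, 69]
after = [72, 64, 83, 75, 71, 70, 80, 72]
print(bootstrap_paired_p_value(before, after))
```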

Assign the problems from Section 11.2A and the corresponding problems from Section 11.2 so that students may compare the two approaches.

Section 11.3A Inference about Two Means: Independent Samples

As with Section 11.1A, it is recommended that you begin the material in Section 11.3A with a tactile activity. Using blue index cards, record each male’s time spent on the homework; using red index cards, record each female’s time spent on the homework. Shuffle the cards and deal 12 cards to represent the homework times randomly assigned to the male group; the remaining cards represent the homework times randomly assigned to the female group. Compute the difference in the sample mean times. As in Section 11.1A, it is best if each student completes this tactile activity. Ask students to explain why this is just like randomly assigning 12 students to the “male” group and 12 students to the “female” group under the assumption that the statement in the null hypothesis is true (that the mean study time for the two groups is equal). Next, use the StatCrunch randomization applet to obtain a P-value.
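The shuffle-and-deal activity for two means can be sketched in Python as follows. The 12-and-12 group sizes match the activity, but the homework times listed are hypothetical placeholders (in minutes), not the data from the text, and the function names are illustrative.

```python
import random
from statistics import mean


def shuffled_difference(rng, times, n_first):
    """Shuffle all cards, deal n_first to the first group, and return
    the difference in sample means (first group minus second group)."""
    cards = list(times)
    rng.shuffle(cards)
    return mean(cards[:n_first]) - mean(cards[n_first:])


def two_mean_randomization_p_value(group_a, group_b, reps=5000, seed=1):
    """Two-tailed P-value: proportion of random assignments whose
    difference in means is at least as extreme as the observed one."""
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    rng = random.Random(seed)
    diffs = [shuffled_difference(rng, pooled, len(group_a)) for _ in range(reps)]
    extreme = sum(abs(d) >= abs(observed) for d in diffs)
    return observed, extreme / reps


# Hypothetical homework times in minutes (12 males, 12 females).
male = [81, 95, 70, 88, 102, 76, 90, 85, 79, 93, 87, 74]
female = [92, 105, 88, 97, 110, 84, 99, 101, 90, 95, 108, 86]
print(two_mean_randomization_p_value(male, female))
```

Pooling the cards before shuffling is what encodes the null hypothesis: if the group label does not matter, any 12 of the 24 times could equally well have landed in the first group.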

Now, introduce how Student’s t-distribution may be used to estimate the P-value (provided certain conditions are satisfied). Use the t-distribution to estimate the P-value for the student study time example.   Compare the results to those obtained using random assignment.

Assign the problems from Section 11.3A and the corresponding problems from Section 11.3 so that students may compare the two approaches.

Chapter 14 Testing the Significance of the Least-Squares Regression Model

Section 14.1A Inference on the Slope of the Least-Squares Regression Line (Randomization Method)

The material in Section 14.1A corresponds to Section 14.1 if you are using Statistics 6/e or Interactive Statistics 2/e. The material in Section 14.1A corresponds to Section 12.3 if you are using Fundamentals of Statistics 5/e.

Begin this section by presenting the Zillow material. A tactile activity could be created by writing the various selling prices on one set of index cards and the Zestimates on another set. Shuffle the selling price cards and deal one selling price card to each Zestimate card. Compute the slope for the randomized data. Aggregate the results of the class so that students can determine the proportion of randomly assigned data sets whose slope is as extreme as or more extreme than the observed slope. Ask students to explain how the random assignment reflects the statement in the null hypothesis.
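A Python sketch of the same card activity is given below. It assumes the Zestimate is the explanatory variable and the selling price is the response, which is an assumption on our part; swap the arguments if the text sets the model up the other way. The values (in thousands of dollars) are hypothetical, not the Zillow data from the section, and the function names are illustrative.

```python
import random
from statistics import mean


def slope(x, y):
    """Slope of the least-squares regression line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def slope_randomization_p_value(zestimates, prices, reps=5000, seed=1):
    """Two-tailed P-value: proportion of shuffles whose slope is at
    least as extreme as the observed slope."""
    observed = slope(zestimates, prices)
    rng = random.Random(seed)
    shuffled = list(prices)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # deal one price card to each Zestimate card
        if abs(slope(zestimates, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / reps


# Hypothetical values in thousands of dollars.
zestimates = [291.5, 320.0, 371.5, 303.5, 351.5, 314.0, 332.5, 295.0, 313.0, 368.0]
prices = [268.0, 305.0, 375.0, 283.0, 350.0, 275.0, 356.0, 300.0, 285.0, 385.0]
print(slope_randomization_p_value(zestimates, prices))
```

Shuffling the selling prices breaks any link between the two variables, which is exactly the claim of a null hypothesis that the slope is zero.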

Next, use the randomization applet in StatCrunch to obtain an estimate of the P-value.

Now, introduce the methods of inference on the slope of the least-squares regression line using Student’s t-distribution.   Go over Example 1 from Section 14.1A and compare the results to those obtained using Student’s t-distribution.

Assign the problems from Section 14.1A and the corresponding problems from Section 14.1 (if using Statistics 6/e or Interactive Statistics 2/e) or Section 12.3 (if using Fundamentals of Statistics 5/e).