Inference on Two Population Parameters

Chapter 9 introduced estimation of a single population parameter – the population proportion and the population mean. Chapter 10 introduced hypothesis testing on a single population parameter. This chapter continues inference. Here, we consider comparing two population parameters. We compare two independent population proportions, two dependent means, and two independent means. The discussion of comparing two dependent proportions is in Section 12.3 because the approach uses the chi-square distribution. Just as we did in Chapter 10, we focus on the P-value approach to hypothesis testing.

This chapter also includes optional presentations on randomization tests for two independent proportions (Section 11.1A) and two independent means (Section 11.3A). There is also an optional presentation on using the bootstrap to test hypotheses regarding two dependent means (Section 11.2A).

What to Emphasize

There are two very important concepts to emphasize as you proceed through this chapter:

(1) How are the data collected?
(2) How many levels are there for the explanatory variable and what type of variable is the response variable?

Be sure students understand how the data are collected. In other words, are the data collected via an observational study (such as a prospective cohort study) or via a designed experiment (such as a matched-pairs design)? It is worthwhile to review the various data collection techniques. Review the various observational studies (cross-sectional, case-control, and cohort) and the two experimental designs we considered (completely randomized design and matched-pairs design). In this review, focus on situations where there are two levels of a qualitative explanatory variable (because we are comparing two populations throughout the chapter). For example, discuss a study in which a group of 1000 volunteers is randomly divided into two groups – a treatment group and a control group. The treatment group gets a new experimental drug; the control group gets a placebo. The response variable might be whether a patient experiences headache as a side effect (a qualitative response variable with two possible outcomes, summarized by a proportion). Or, the response variable might be time until the symptoms go away (a quantitative response variable). The type of analysis performed to test hypotheses or construct confidence intervals depends on the response variable (and, of course, the number of levels of the explanatory variable and type of explanatory variable). For all examples, always review the data collection method, the type and levels of the explanatory variable, and the type of response variable.

Continue to emphasize the interpretation of the P-value and confidence intervals. In addition, continue to discuss the difference between statistical and practical significance.

Inference about Two Population Proportions: Independent Samples – The chapter begins with hypothesis tests on two independent population proportions. Just as in Chapters 9 and 10, the reason for this is that tests regarding two independent population proportions (Section 11.1) utilize the normal model, which is familiar to students. There is an optional Section 11.1A that includes a discussion on using randomization tests to compare two population proportions. This approach offers an intuitive introduction to comparing two independent proportions. If you choose to use this method, we recommend that you start with Section 11.1A and then move to Section 11.1. We recommend this approach because through the randomization method, students will see how a normal model may be appropriate in testing hypotheses about two independent proportions.

Begin by discussing the difference between independent and dependent sampling. Include the difference between a matched-pairs design (dependent) and a completely randomized design (independent) in the discussion.
This section deals with inference on a qualitative variable with two possible outcomes. In addition, the explanatory variable has two levels (such as male/female or with a treatment/without a treatment). It is important for students to explain why inference on a proportion with independent samples is the appropriate analysis to utilize for each problem.
Should you decide to present randomization tests as an introduction to hypothesis testing, you will need the Randomization test for two proportions applet found in StatCrunch under Applets > Resampling. There are a few different examples you could use in class that are found in the Student Activity Workbook. In addition, there are many new problems available in MyStatLab and in the additional exercises.
Consider requiring students to find the P-value of a right- or left-tailed hypothesis test directly. That is, have students find the test statistic and use the normal model to find the P-value. After finding a P-value for one problem directly, feel free to rely on technology to obtain P-values. Compare the P-values obtained from the normal model to those obtained using randomization tests.
Continue to emphasize the interpretation of the P-value for hypothesis testing problems. Emphasize the interpretation of confidence intervals for estimation problems.
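
For instructors who want to show the by-hand calculation next to a randomization test outside of StatCrunch, the following Python sketch may help. It is illustrative only: the counts (60 of 500 versus 40 of 500) are made up, the function names are hypothetical, and the label-shuffling routine is a generic randomization test rather than a reproduction of the StatCrunch applet.

```python
import random
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, right-tailed P-value) for H0: p1 = p2 versus H1: p1 > p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # estimate of the common p under H0
    se = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1_hat - p2_hat) / se
    return z, 1 - NormalDist().cdf(z)


def randomization_p_value(x1, n1, x2, n2, reps=2000, seed=1):
    """Estimate the right-tailed P-value by reshuffling the group labels."""
    rng = random.Random(seed)
    total_successes = x1 + x2
    outcomes = [1] * total_successes + [0] * (n1 + n2 - total_successes)
    observed = x1 / n1 - x2 / n2
    extreme = 0
    for _ in range(reps):
        rng.shuffle(outcomes)
        s1 = sum(outcomes[:n1])  # successes landing in "group 1"
        diff = s1 / n1 - (total_successes - s1) / n2
        if diff >= observed - 1e-12:  # as extreme or more extreme
            extreme += 1
    return extreme / reps


# Hypothetical data: 60 of 500 treated and 40 of 500 placebo patients
# report headache as a side effect.
z, p_normal = two_proportion_z(60, 500, 40, 500)
p_random = randomization_p_value(60, 500, 40, 500)
print(f"z = {z:.2f}, normal-model P-value = {p_normal:.4f}")
print(f"randomization P-value = {p_random:.4f}")
```

The two P-values should be close but not identical, which can open a discussion of why the normal model is a reasonable approximation to the randomization distribution.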

Inference about Two Population Means: Dependent Samples – Section 11.2 presents inference on two population means from dependent samples using Student’s t-distribution. There is an optional Section 11.2A that utilizes the bootstrap method to perform inference on two dependent means. You must have covered the bootstrap method for obtaining confidence intervals (Section 9.4) if you are going to cover the material in Section 11.2A. There is no advantage to covering Section 11.2A prior to covering Section 11.2, however. So, it is entirely up to you whether you cover Section 11.2 before or after Section 11.2A.

Begin by reviewing the sampling distribution of the sample mean and the properties of Student’s t-distribution. Be sure students understand that the procedures for testing this type of hypothesis, or constructing this type of confidence interval, are the same as those for inference on a single population mean.
In both Sections 11.2 and 11.2A, the data are collected via matched samples. There are two levels of the explanatory variable, with each treatment applied to one member of a matched pair (twins, husband/wife, before/after). The explanatory variable is qualitative, but the response variable is quantitative. Because the data are collected through matched pairs, the variable of interest is the difference in the value of the response variable. Because we use a difference, it is extremely important that students explain how the difference was determined. For example, if the matched pairs are husband/wife, state whether the difference is “husband – wife” or “wife – husband”. This is important because it determines the direction of the alternative hypothesis for one-tailed tests.
Rely on technology to obtain P-values. However, feel free to go through an example that illustrates how a P-value might be estimated by hand using Student’s t-distribution.
Continue to emphasize the interpretation of the P-value. For each hypothesis test, require students to explain what the P-value represents.
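
To make the point about the direction of the difference concrete, here is a small Python sketch. The sleep data and the function name `paired_t` are invented for illustration. The sketch computes only the test statistic and degrees of freedom, leaving the P-value to technology as recommended above.

```python
from statistics import mean, stdev


def paired_t(first, second):
    """Return (mean difference, t statistic, df) using d = first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation of the differences
    t = d_bar / (s_d / n ** 0.5)
    return d_bar, t, n - 1


# Hypothetical matched-pairs data: hours of sleep per night for eight couples.
husband = [6.5, 7.0, 6.0, 7.5, 6.8, 7.2, 6.1, 6.9]
wife = [7.0, 7.4, 6.1, 7.9, 7.5, 7.0, 6.8, 7.3]

d_hw, t_hw, df = paired_t(husband, wife)  # d = husband - wife
d_wh, t_wh, _ = paired_t(wife, husband)   # d = wife - husband

# "Husbands sleep less than their wives" is H1: mu_d < 0 for the first
# definition of d, but H1: mu_d > 0 for the second.
print(f"husband - wife: mean d = {d_hw:.3f}, t = {t_hw:.3f}, df = {df}")
print(f"wife - husband: mean d = {d_wh:.3f}, t = {t_wh:.3f}, df = {df}")
```

Reversing the order of subtraction flips the sign of both the mean difference and the test statistic, so the tail used for a one-tailed P-value must flip with it.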

Inference about Two Population Means: Independent Samples – The last inferential technique of the chapter is inference on two population means from independent samples. Section 11.3 discusses inference on two independent means using Student’s t-distribution. There is an optional Section 11.3A that utilizes randomization tests to test hypotheses regarding two independent means. The randomization tests for two independent means follow the same logic as the randomization tests for two independent proportions. If you choose to use this method, we recommend that you start with Section 11.3A and then move to Section 11.3. We recommend this approach because through the randomization method, students will see how Student’s t-distribution may be appropriate in testing hypotheses about two independent means.

Begin with a review of what it means for a sample to be independent. Review the completely randomized design with two levels of treatment (such as experimental drug versus placebo).
Emphasize that the response variable is quantitative. Because the data are independent, we do not compute differences of matched samples. Rather, we look at the difference in the sample means from the two independent samples.
Should you decide to present randomization tests as part of the presentation of comparing two independent means, you will need the Randomization test for two means applet found in StatCrunch under Applets > Resampling. Problems that allow you to present this approach to hypothesis testing may be found in the Student Activity Workbook. In addition, there are many new problems available in MyLabStatistics and in the additional exercises.
There are two techniques based on probability models that could be used to compare two independent means. The first method follows Student’s t-distribution and requires that the two independent samples come from populations with the same standard deviation. This approach to inference is referred to as the “pooled t-test.” The problem with this approach is that it can be difficult to verify the requirement of equal standard deviations. Why? It requires using an F-test of equality of variances, which is very sensitive to departures from its normality requirement. Many statisticians have recommended against using the F-test to test for equality of variances, and, therefore, recommend against the “pooled t-test.” See Moser and Stevens, “Homogeneity of Variance in the Two-Sample Means Test,” American Statistician 46(1). For this reason, we only present the second method, Welch’s t, which does not assume equality of population standard deviations. This test statistic approximately follows Student’s t-distribution. The nice thing is that Welch’s t and the “pooled t” give the same test statistic when the two sample sizes are equal (only the degrees of freedom can differ) – so this allows for a discussion of the importance of quality data collection prior to inference.
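
The relationship between Welch’s t and the pooled t can be checked numerically. The Python sketch below uses made-up data and hypothetical function names. Its degrees of freedom for Welch’s t come from the Welch–Satterthwaite formula that software typically reports, which may differ from a conservative by-hand choice such as the smaller of n1 – 1 and n2 – 1.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / (v1 + v2) ** 0.5
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def pooled_t(x, y):
    """Pooled t statistic (assumes equal population standard deviations)."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    return t, n1 + n2 - 2


# Hypothetical response times (days until symptoms go away).
drug = [12, 15, 11, 18, 14, 16]
placebo = [20, 14, 25, 17, 30, 22]

# Equal sample sizes: the two test statistics agree; the df do not.
t_w, df_w = welch_t(drug, placebo)
t_p, df_p = pooled_t(drug, placebo)
print(f"equal n:   Welch t = {t_w:.4f} (df = {df_w:.2f}), "
      f"pooled t = {t_p:.4f} (df = {df_p})")

# Unequal sample sizes: the two test statistics no longer agree.
t_w2, df_w2 = welch_t(drug[:4], placebo)
t_p2, df_p2 = pooled_t(drug[:4], placebo)
print(f"unequal n: Welch t = {t_w2:.4f} (df = {df_w2:.2f}), "
      f"pooled t = {t_p2:.4f} (df = {df_p2})")
```

With equal sample sizes the standard errors are algebraically identical, so any difference in the P-values comes only from the degrees of freedom.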

Inference about Two Population Standard Deviations – The material in this section is optional and may be skipped without loss of continuity.

Begin the section by introducing students to the F-distribution. If you are doing any by-hand calculations, be sure to review how to read the F-table (Table IX).
Emphasize to students that the F-test for comparing two population standard deviations (or variances) is not robust. It is highly sensitive to departures from the normality requirement.
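
One possible way to let students see this lack of robustness is a short simulation. The Python sketch below is an assumption-laden illustration, not part of the text: it draws both samples from the same population, so the population standard deviations are equal by construction. It estimates a 5% cut-off for the ratio of sample variances under normality, then counts how often heavy-tailed (Laplace) populations exceed that cut-off. The sample size, seed, and function names are arbitrary choices.

```python
import random
from statistics import variance


def f_ratio(sample1, sample2):
    """Larger sample variance divided by the smaller one."""
    v1, v2 = variance(sample1), variance(sample2)
    return max(v1, v2) / min(v1, v2)


def simulate_ratios(draw, n, reps, rng):
    """Variance ratios when both samples come from the SAME population."""
    return [
        f_ratio([draw(rng) for _ in range(n)], [draw(rng) for _ in range(n)])
        for _ in range(reps)
    ]


def normal_draw(rng):
    return rng.gauss(0, 1)


def heavy_tailed_draw(rng):
    # Difference of two exponentials: a Laplace (double-exponential) value,
    # symmetric like the normal but with heavier tails.
    return rng.expovariate(1) - rng.expovariate(1)


rng = random.Random(2024)
n, reps = 20, 4000

# Empirical 5% cut-off for the ratio when the populations really are normal.
normal_ratios = sorted(simulate_ratios(normal_draw, n, reps, rng))
cutoff = normal_ratios[reps * 95 // 100]

# How often do equal-variance, non-normal populations exceed that cut-off?
heavy_ratios = simulate_ratios(heavy_tailed_draw, n, reps, rng)
false_alarm_rate = sum(r > cutoff for r in heavy_ratios) / reps

print(f"cut-off under normality: {cutoff:.2f}")
print(f"false-alarm rate with heavy tails: {false_alarm_rate:.3f}")
```

If the test were robust, the false-alarm rate would stay near 0.05; a noticeably larger value shows that departures from normality alone can lead to rejecting equal standard deviations.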

Putting It Together: Which Procedure Do I Use? – Students often have difficulty reading problems and ascertaining which statistical technique to utilize. For this reason, we wrote the Putting It Together section (Section 11.4) to provide a mix of hypothesis tests and confidence intervals. If you are using MyStatLab to build an assignment, be sure to include some problems that require hypothesis tests or confidence intervals on a single parameter. Being able to determine the type of inference required for a problem is a very important skill to develop in students. Emphasize that proportions are based on qualitative data with two outcomes; means are based on quantitative data where we are interested in a measure of center (or “typical” value); standard deviations are based on quantitative data where we are interested in measures of consistency or spread.
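
The decision process described above can be summarized as a lookup. The Python sketch below is one hypothetical way to encode it for a class demonstration. The labels and the function name `choose_procedure` are our own, and the table covers only the two-population situations named in this chapter.

```python
# Keys: (type of response variable, type of sampling, parameter of interest)
PROCEDURES = {
    ("qualitative", "independent", "proportion"):
        "Two-proportion z procedures (Section 11.1)",
    ("qualitative", "dependent", "proportion"):
        "Chi-square-based approach for dependent proportions (Section 12.3)",
    ("quantitative", "dependent", "mean"):
        "t procedures on the matched-pairs differences (Section 11.2)",
    ("quantitative", "independent", "mean"):
        "Welch's t procedures (Section 11.3)",
    ("quantitative", "independent", "standard deviation"):
        "F-test for two standard deviations (optional section)",
}


def choose_procedure(response, sampling, parameter):
    """Look up the two-population procedure for a described study."""
    key = (response.lower(), sampling.lower(), parameter.lower())
    if key not in PROCEDURES:
        raise ValueError(f"No two-population procedure listed for {key}")
    return PROCEDURES[key]


# Completely randomized design, drug versus placebo, time until symptoms end.
print(choose_procedure("quantitative", "independent", "mean"))

# Before/after measurements on the same individuals.
print(choose_procedure("quantitative", "dependent", "mean"))
```

Asking students to supply the three inputs for a given problem, before any computation, mirrors the questions emphasized throughout the chapter.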

Ideas for Traditional/Online/Blended/Flipped

This chapter allows you to develop the skill of selecting the appropriate inferential approach to use in analyzing sample data. Besides discussing the mechanics of inference within each section, spend time developing the skill of identifying the type of data collection utilized, the explanatory variable and the number of levels it has, and the response variable (along with whether it is qualitative with two possible outcomes or quantitative).
To help students understand what the P-value is measuring for the various hypothesis tests, we recommend using the randomization applets in StatCrunch. For online classes, you might consider making a video where you illustrate how the randomization technique works. For in-class sessions, use small groups along with the suggested problems in the Student Activity Workbook to explore this approach to hypothesis testing.
Use the discussion board to ask questions about the appropriate inferential technique to use. Do not require students to perform the inference; just ask which method should be used. For example, give an example where data are collected through a completely randomized design with two levels of treatment. Ask students to identify the type of data collection utilized, the explanatory variable, the number of levels of the explanatory variable, and the response variable. Is the response variable qualitative or quantitative? Finally, ask students to explain which inferential technique to use and why. Another possibility is to post abstracts from journal articles and ask the same questions. You could also give the P-value from the study and ask for a conclusion.