Drawing conclusions based on data from a random sample
• Understand sampling variability in the context of estimating a population proportion or a population mean.
• Use data from a random sample to estimate a population proportion.
• Use data from a random sample to estimate a population mean.
• Calculate and interpret margin of error in context.
• Understand the relationship between sample size and margin of error.
A good description of how this section might play out in the classroom can be found on pp. 8–10 of Progressions for the Common Core State Standards in Mathematics: High School Statistics and Probability.
The focus of this section is on standards S-IC.B.3 and S-IC.B.4. This section will require a substantial time commitment. Students build on what they have learned about distributions and sampling variability as they use data from a random sample to learn about the value of a population proportion or a population mean.
It is probably easiest to begin by focusing on estimating a population proportion. Simulation can be used to approximate the sampling distribution of a sample proportion. This can be done using physical simulations (for example, drawing from a bag of beads that contains 40% blue beads) or using one of the many technology applets (for example, rossmanchance.com/applets/Reeses3/ReesesPieces.html, which builds up a sampling distribution of the proportion of orange candies in a random sample after the user specifies the population proportion and the sample size). It is recommended that at least one physical simulation be carried out before turning to technology so that students understand the process that the technology is implementing.
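For teachers who want to see what such an applet does behind the scenes, the following is a minimal sketch of the bead simulation in Python. The sample size (50), the number of repetitions (1000), and the function name are illustrative assumptions, not part of any particular applet; the sketch also assumes the bag is large enough that each draw is blue with probability 0.4.

```python
import random
from collections import Counter


def simulate_sample_proportions(p, n, num_samples, seed=None):
    """Return sample proportions from repeated random samples of size n.

    Each draw is treated as blue with probability p, which assumes a
    large bag (or drawing with replacement).
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        blue = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(blue / n)
    return proportions


# Illustrative settings: 40% blue beads, samples of 50, 1000 repetitions.
props = simulate_sample_proportions(p=0.4, n=50, num_samples=1000, seed=1)

# Text dot plot of the simulated sampling distribution
# (each * stands for about 5 samples).
counts = Counter(props)
for value in sorted(counts):
    print(f"{value:.2f} {'*' * (counts[value] // 5)}")
```

The printed plot should pile up around 0.40, which mirrors what students see when they pool their results from the physical bead activity.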
Simulated sampling distributions become the basis for the important discussion of how the sampling distribution provides information about the anticipated accuracy of estimates based on random samples.
Margin of error associated with an estimate of a population proportion can be motivated based on the simulated sampling distribution in one of two ways:
1. In the context of the sampling distribution of a sample proportion, general properties can be described. These include:
- The sampling distribution is approximately normal if the sample size is large enough and the population proportion is not too close to 0 or 1.
- The sampling distribution is centered at the value of the population proportion, meaning that sample proportions tend to cluster around the actual value of the population proportion.
- The standard deviation of the sampling distribution (the standard deviation of the sample proportion) is approximately equal to $\sqrt{p(1-p)/n}$.
Based on these properties, the margin of error for estimating a population proportion is approximately $2\sqrt{p(1-p)/n}$.
2. A less formal approach introduces $1/\sqrt{n}$ as a conservative estimate of the margin of error (this is the maximum value of $2\sqrt{p(1-p)/n}$, which occurs when $p = 1/2$). The simulated sampling distributions can then be used to convince students that this value is a reasonable way to describe error, and at times a bit large, which is the “conservative” part.
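The two approaches can be compared numerically. The sketch below evaluates both expressions for a few values of $p$; the function names and the sample size of 400 are illustrative choices, not taken from the standards or the Progressions document.

```python
import math


def margin_of_error(p, n):
    """Approximate margin of error for a sample proportion: 2*sqrt(p(1-p)/n)."""
    return 2 * math.sqrt(p * (1 - p) / n)


def conservative_margin_of_error(n):
    """Conservative margin of error: 1/sqrt(n), the value at p = 1/2."""
    return 1 / math.sqrt(n)


n = 400  # illustrative sample size
for p in (0.1, 0.3, 0.5):
    print(
        f"p = {p:.1f}: formula gives {margin_of_error(p, n):.3f}, "
        f"conservative gives {conservative_margin_of_error(n):.3f}"
    )
```

The two values agree at $p = 1/2$, and the conservative value is larger for every other $p$, which is why it can safely be used when the population proportion is unknown.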
Once margin of error has been developed, the formula and/or simulation results can be used to explore the relationship between sample size and margin of error.
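One way to make this relationship concrete is to tabulate the conservative margin of error for several sample sizes. This is a small sketch with sample sizes chosen for illustration; each size is four times the previous one.

```python
import math


def conservative_margin_of_error(n):
    """Conservative margin of error for a sample proportion: 1/sqrt(n)."""
    return 1 / math.sqrt(n)


# Each sample size is four times the previous one (illustrative values).
sizes = [100, 400, 1600, 6400]
margins = {n: conservative_margin_of_error(n) for n in sizes}

for n in sizes:
    print(f"n = {n:5d}   margin of error is about {margins[n]:.4f}")
```

Because the sample size appears under a square root, quadrupling the sample size only halves the margin of error. Students can be asked to predict the next row of the table before computing it.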
Students should practice interpreting margin of error in context. This is a good place to bring in statements from the media where a margin of error is reported and to have students explain what is meant by the margin of error.
The sampling distribution also provides information that allows testing of claims about a population proportion. By comparing an observed sample proportion to what would be expected under a specified model (for example, a model that specifies that the population proportion is 0.6), a decision can be reached about whether the observed data are consistent with the model or whether they provide evidence that the model is not a believable description of the population.
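This comparison can be sketched as a simulation. The model below uses the population proportion of 0.6 from the example above; the sample size of 100, the observed sample proportion of 0.48, and the number of repetitions are hypothetical values chosen only for illustration.

```python
import random


def simulate_sample_proportions(p, n, num_samples, seed=None):
    """Return sample proportions from repeated random samples under the model."""
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(n) if rng.random() < p) / n
        for _ in range(num_samples)
    ]


claimed_p = 0.6   # population proportion specified by the model
n = 100           # hypothetical sample size
observed = 0.48   # hypothetical observed sample proportion

sims = simulate_sample_proportions(claimed_p, n, num_samples=2000, seed=2)

# Count simulated proportions at least as far from 0.6 as the observed one.
# The small tolerance guards against floating-point rounding at the boundary.
distance = abs(observed - claimed_p)
extreme = sum(1 for x in sims if abs(x - claimed_p) >= distance - 1e-9)
fraction = extreme / len(sims)

print(f"Fraction of simulated samples at least as extreme: {fraction:.3f}")
```

If only a small fraction of the simulated proportions are as far from 0.6 as the observed value, the observed data would be unusual under the model, which is informal evidence that the model is not a believable description of the population.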
The final part of this section should consider using sample data to estimate a population mean. Simulating to obtain a sampling distribution is not as easy here as in the case of proportions, but there are a number of good applets that can be used to carry out such simulations (for example, rossmanchance.com/applets/SampleMeans/SampleMeans.html).
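For readers who prefer code to an applet, here is a minimal sketch of the same idea. The population is entirely made up: 10,000 right-skewed values with a mean of roughly 20, generated only so that there is something to sample from. The sample size of 40 and the 1000 repetitions are likewise assumptions.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical right-skewed population of 10,000 values (mean roughly 20).
population = [rng.expovariate(1 / 20) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def simulate_sample_means(population, n, num_samples, rng):
    """Return the means of repeated random samples of size n."""
    return [
        statistics.fmean(rng.sample(population, n))
        for _ in range(num_samples)
    ]


means = simulate_sample_means(population, n=40, num_samples=1000, rng=rng)

print(f"Population mean:            {pop_mean:.2f}")
print(f"Mean of the sample means:   {statistics.fmean(means):.2f}")
print(f"SD of the sample means:     {statistics.stdev(means):.2f}")
print(f"Population SD / sqrt(n):    {pop_sd / 40 ** 0.5:.2f}")
```

The sample means cluster around the population mean, and their spread is close to the population standard deviation divided by $\sqrt{n}$, which parallels the properties described earlier for sample proportions.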
Tasks
WHAT: Students use simulation to determine if Sarah, a chimpanzee, is able to solve problems. The task is based on a study which is referenced in the task commentary. This study provides students with an interesting context in which to carry out a simulation and to use simulation results to assess whether observed behavior is consistent with a “guessing” model.
WHY: This task involves students in two important aspects of statistical reasoning: providing a probabilistic model for the situation at hand, and defining a way to collect data to determine whether or not the observed data are reasonably likely to occur under the chosen model.
WHAT: Students use simulated sampling distributions to draw conclusions about a population proportion in the context of a survey of student opinion regarding a plan to implement block scheduling at their school.
WHY: This task provides an additional opportunity for students to use simulated sampling distributions to draw conclusions.
External Resources
Description
WHAT: In this lesson, students use simulation to evaluate a claim about a population proportion. Using a population of M&M’s, they take samples, record the proportion of green M&M’s in each sample, and create a graphical representation of the sampling distribution. They then use the simulated distribution to informally evaluate a claim about the population proportion.
WHY: This lesson provides hands-on experience in carrying out a simulation to construct an approximate sampling distribution (S-IC.A.2). The unit One Variable Statistics also lists an M&M’s activity. If that was used, Population Parameters with M&M’s provides an opportunity for students to contrast this analysis with the earlier one, noting the increase in sophistication.
Description
WHAT: In this investigation, students use simulation to develop an estimate of the margin of error associated with a published estimate of a population proportion. The context is a bit dated now, but it would be easy to modify the investigation to use a current media statement.
WHY: This introduces the conservative margin of error for estimating a population proportion and uses simulation to justify the formula for the conservative margin of error (S-IC.B.4).
Note that a one-time fee is required to access this resource, which is #13 out of 15 in the American Statistical Association publication Making Sense of Statistical Studies.