Task
Students in a high school mathematics class decided that their term project would be a study of the strictness of the parents or guardians of students in the school. Their goal was to estimate the proportion of students in the school who thought of their parents or guardians as “strict”. They did not have time to interview all 1000 students in the school, so they planned to obtain data from a sample of students.
(a) Describe the parameter of interest and a statistic the students could use to estimate the parameter.
(b) Is the best design for this study a sample survey, an experiment, or an observational study? Explain your reasoning.
(c) The students quickly realized that, as there is no definition of “strict”, they could not simply ask a student, “Are your parents or guardians strict?” Write three questions that could provide objective data related to strictness.
(d) Describe an appropriate method for obtaining a sample of 100 students, based on your answer in part (a) above.
IM Commentary
(a) Student responses should recognize that a parameter is a numerical summary of a population and a statistic is a numerical summary of a sample. The sample proportion is a natural statistic to use, but others are possible. Some textbooks, for example, encourage the use of a "plus 2" estimator for population proportions, in which the numerator is the number of "yes" responses in the sample plus 2 and the denominator is the sample size plus 4.
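To make the two statistics concrete, here is a minimal sketch that computes both from the same sample. The count of 37 "yes" responses is a made-up illustration, not data from the task, and the function names are hypothetical.

```python
def sample_proportion(yes_count, sample_size):
    """Ordinary sample proportion: yes responses divided by sample size."""
    return yes_count / sample_size


def plus_two_proportion(yes_count, sample_size):
    """'Plus 2' estimator: add 2 to the numerator and 4 to the denominator."""
    return (yes_count + 2) / (sample_size + 4)


# Hypothetical survey result (an assumption for illustration only).
yes_count = 37
sample_size = 100

p_hat = sample_proportion(yes_count, sample_size)
p_plus_two = plus_two_proportion(yes_count, sample_size)

print(f"sample proportion: {p_hat:.3f}")
print(f"plus 2 estimate:   {p_plus_two:.3f}")
```

With a sample of 100 the two estimates differ only slightly; the adjustment matters more for small samples or for proportions near 0 or 1.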
(b) This part requires previous introduction to the terminology of study design.
(c) This is a good question to use for class discussion, as many issues arise. The question assumes that students will know why you can't simply ask "Are your parents or guardians strict?" But not all students will understand why this is a problem. Students should understand that the lack of an agreed-upon definition of "strict" means that answers to the question may vary more than if there were a precise definition, and this will cause measurement error in the survey. Another possibility is that some students will not answer the question because, without a definition, they do not know how to answer. If so, there will be many non-responses in the sample, which could lead to a biased estimate. Finally, the instructor should be aware that students may suggest numerical questions (e.g., "How old were you when your parents allowed..."), which make the analysis more difficult. The instructor may have to work to steer students toward yes/no questions (as in the solution), where the analysis concerns only the proportion of respondents who answer "yes."
(d) It is important that the student's answer indicate that a random sample should be taken. Students should also specify as precisely as possible a mechanism for obtaining the sample. One "test" of whether the answer is specific enough is whether another student could follow the directions unambiguously. Other sampling schemes are possible, too. For example, students might specify stratified sampling, in which a separate random sample is chosen from each class (first-years, sophomores, juniors, seniors). Again, students should specify the mechanism for taking the random sample.
One reason for requiring that students specify the mechanism used to collect the sample is that the term "random" is often used, colloquially, to mean "arbitrary" or "haphazard." But taking an arbitrary sample can lead to bias in the sample. For this reason, students need to make it very clear that they understand what is meant by "random sample".
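The stratified scheme mentioned in (d) could be specified as in the sketch below. The class sizes are invented for illustration, and drawing the same 10% from each class (proportional allocation) is one reasonable choice, not something the task prescribes. Students are identified by their position on a hypothetical class roster.

```python
import random

# Hypothetical class sizes that sum to 1000 (an assumption, not from the task).
class_sizes = {
    "first-years": 270,
    "sophomores": 260,
    "juniors": 240,
    "seniors": 230,
}

POPULATION_SIZE = sum(class_sizes.values())
TOTAL_SAMPLE_SIZE = 100

# Seeding makes the draw reproducible so another student can check it.
rng = random.Random(7)

stratified_sample = {}
for class_name, size in class_sizes.items():
    # Proportional allocation using integer arithmetic: 10% of each class.
    k = size * TOTAL_SAMPLE_SIZE // POPULATION_SIZE
    # Draw k roster positions (1..size) without replacement within the class.
    stratified_sample[class_name] = sorted(rng.sample(range(1, size + 1), k))

for class_name, positions in stratified_sample.items():
    print(class_name, len(positions), positions[:5], "...")
```

Because every step is written down, including the seed, the procedure passes the "could another student follow it" test described above.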
Solution
(a) The parameter of interest is the proportion of all 1000 students at the school who think of their parents or guardians as strict. A possible statistic to estimate this parameter is the proportion of students in the collected sample who think of their parents or guardians as strict.
(b) The best design would be a sample survey, because we are interested in estimating a population parameter, namely, the proportion of all students at the school who think of their parents or guardians as "strict". No treatment is being imposed on the students, so an experiment is not appropriate. It is less time-consuming and less costly to take a random sample of students than to interview all students at the school.
(c) Answers will vary. "Do your parents require you to do your homework before you can meet with your friends?" "Do your parents require that you be home before 11pm on a weekend night?" "Do your parents limit your mobile phone time?"
(d) Answers will vary. A list of all students should be obtained from the principal's office, and a subset of student names should be taken from the list by randomly sampling without replacement. For example, the students could read triplets of digits from a random number table so that 000 represents the first student on the principal's list and 999 the last. The students would begin at an arbitrary point in the table and then write down consecutive triplets until they had obtained the desired sample size. If a three-digit number has already been selected, they should skip that triplet and move on to the next. Alternatively, a computer could be asked to take a random sample without replacement from the integers 1 through 1000.
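Both mechanisms in (d) can be sketched in a few lines. The first part uses Python's standard `random.sample` to draw 100 of the integers 1 through 1000 without replacement. The second part mimics the random number table procedure on a string of digits. The function name, the seed, and the short digit string are illustrative assumptions, not part of the task.

```python
import random

POPULATION_SIZE = 1000
SAMPLE_SIZE = 100

# Computer method: draw 100 distinct integers from 1..1000.
rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
computer_sample = sorted(rng.sample(range(1, POPULATION_SIZE + 1), SAMPLE_SIZE))


def sample_from_digit_table(digits, sample_size):
    """Read consecutive triplets of digits, skipping any triplet already seen.

    Label 000 stands for the first student on the list and 999 for the last.
    Returns fewer than sample_size labels if the digits run out first.
    """
    chosen = []
    seen = set()
    for start in range(0, len(digits) - 2, 3):
        label = int(digits[start:start + 3])
        if label in seen:
            continue  # repeated triplet: skip it and read the next one
        seen.add(label)
        chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen


# Tiny made-up stretch of a random number table; 123 repeats and is skipped.
table_digits = "123907123045"
print(sample_from_digit_table(table_digits, 3))
print(computer_sample[:10])
```

In the table method the label is an offset into the list (label 0 is the first name), while the computer method returns positions 1 through 1000 directly; either convention works as long as it is stated before the sample is drawn.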