Describing Data Sets with Outliers
Certain data sets, such as home prices and household or individual income, are often described using the median instead of the mean. The questions below explore the mean and median in several different situations to help you understand the information that each one communicates.
- Give an example of a set of five positive numbers whose median is 10 and whose mean is larger than 10.
- Find the mean and median of the following set of numbers: $\{10, 15, 25, 30, 30, 50, 55, 55, 60, 80\}$. What happens to the mean and median of these numbers if 80 is replaced by 800?
- The brightness of celestial bodies depends on many factors, two of the most important being the distance from Earth and size. The eight brightest objects in the night sky are listed below with their approximate distance from Earth (in light years).
| Object | Distance in light years |
|---|---|
| Moon | 0.000000038 |
| Venus | 0.0000048 |
| Jupiter | 0.000067 |
| Mars | 0.0000076 |
| Mercury | 0.0000095 |
| Sirius | 8.6 |
| Canopus | 310 |
| Saturn | 0.00014 |

Calculate the mean and median for these distances. Would the typical distance of these celestial bodies best be communicated using the mean or the median? Why?
- What impact do the very large values in the data set have on the mean?
- Suppose that a sample of 100 homes in the metropolitan Phoenix area had a median sales price of \$300,000. The mean value of these homes was \$1,000,000. Explain how this could happen. Why might the median price be more informative than the mean price in describing a typical house price?
- Suppose the mean annual income for a sample of one hundred Minneapolis residents was \$50,000. Do you think the median income for this sample would have been greater than, equal to, or less than \$50,000? Explain.
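
After working the questions by hand, one way to check the arithmetic is with a short script. The sketch below is not part of the original task: it assumes Python's standard `statistics` module, and the names `original`, `modified`, `distances`, and `summarize` are illustrative. It computes the mean and median for the ten-number set, for the same set with 80 replaced by 800, and for the table of distances.

```python
from statistics import mean, median

# The ten numbers from the second question.
original = [10, 15, 25, 30, 30, 50, 55, 55, 60, 80]
# The same list with the largest value, 80, replaced by 800.
modified = original[:-1] + [800]

# Distances in light years, copied from the table in the third question.
distances = {
    "Moon": 0.000000038,
    "Venus": 0.0000048,
    "Jupiter": 0.000067,
    "Mars": 0.0000076,
    "Mercury": 0.0000095,
    "Sirius": 8.6,
    "Canopus": 310,
    "Saturn": 0.00014,
}


def summarize(values):
    """Return the (mean, median) pair for a collection of numbers."""
    values = list(values)
    return mean(values), median(values)


orig_mean, orig_median = summarize(original)
mod_mean, mod_median = summarize(modified)
dist_mean, dist_median = summarize(distances.values())

print("original: mean =", orig_mean, "median =", orig_median)
print("modified: mean =", mod_mean, "median =", mod_median)
print("distances: mean =", dist_mean, "median =", dist_median)
```

Comparing the first two printed lines shows how a single extreme value affects each statistic. The third line can be compared with the individual distances in the table to judge which statistic better reflects a typical value.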