# Choosing an appropriate growth model

Alignments to Content Standards: F-LE.A.1 F-LE.A.2

Below are population estimates for the larger metropolitan areas of Paris (France), Shenzhen (China), and Lagos (Nigeria) for each decade between 1950 and 2010:

| City | 1950 | 1960 | 1970 | 1980 | 1990 | 2000 | 2010 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Paris | 6,300,000 | 7,400,000 | 8,200,000 | 8,700,000 | 9,300,000 | 9,700,000 | 10,500,000 |
| Shenzhen | 3,100 | 8,000 | 22,000 | 58,000 | 875,000 | 6,600,000 | 10,000,000 |
| Lagos | 330,000 | 760,000 | 1,400,000 | 2,600,000 | 4,800,000 | 7,300,000 | 11,000,000 |

1. For each city, decide if the population data can be accurately modeled by a linear, quadratic, and/or exponential function. Explain.
2. If you found one or more good models for a city population, what predictions would those models make for future decades? Are these reasonable?

## IM Commentary

The goal of this task is to examine some population data from a modeling perspective. Because large urban centers and their growth are governed by many complex factors, we cannot expect a simple model (linear, quadratic, or exponential) to give accurate values or predictions over large stretches of time. Deciding on an appropriate model is a delicate process requiring careful analysis. Some key aspects which the teacher may wish to bring out include

1. Examining the "fit" of the model to the data. This includes having the appropriate shape (increasing/decreasing and concavity) as well as, ideally, having overestimates and underestimates "balance out." Detailed analysis of this type belongs to the domain of statistics.
2. Predictive value: will the model help make accurate predictions, either in the short term or the long term?
3. Simplicity: it is possible that a given set of data can be accurately modeled in multiple ways. In fact, any set of data which is roughly linear can be modeled by a quadratic or exponential function by choosing an appropriate domain and scaling the variable appropriately. In these situations, however, the model will have unnecessary complexity with little explanatory value.

The teacher should encourage examining all possible models for each population and may wish to emphasize that for a given city there may be no good model, one good model, or more than one. Moreover, if students are familiar with the statistical analysis of the standard S-ID.6, "Represent data on two quantitative variables on a scatter plot, and describe how the variables are related," then this is a good opportunity to have students investigate this approach along with the successive differences, second differences, and quotients presented in the solution. This should lead to a lively discussion of how the two methods assess the fit of the data to a given model: a robust, accurate model should fit the data well from either perspective. The task is well suited to group work. Ideally, the groups will come up with different, valid models, which they can then compare with one another while arguing in favor of their choices.

The data for this task has been taken from http://www.geohive.com/earth/cy_aggmillion2.aspx although only two significant figures have been listed for each entry (except for the Paris 2010 population, where we used three). The table on the website gives predicted populations for 2020 and 2025, which students might use in analyzing their answers to part 2.

Here is a sketch showing a quadratic fit for the population of Lagos:

Here are sketches for linear and quadratic approximations for the Paris population (clicking on the little squiggle next to the function turns on and off that graph so that you can work with one at a time or see how the two fits compare):

For Shenzhen, none of the models give reasonable results. The exponential function can capture the first several data points of Shenzhen well, but then the later dates are grossly underestimated. If we go for accuracy on the later dates (as in the sketch below), then the model predicts that the population for the earlier dates is effectively zero. Similarly, the quadratic function which we put in (designed to pass very close to three of the data points) takes negative values for some of the dates.

This task has been written with a view toward MP4, Model with Mathematics. One of the biggest challenges of modeling problems is that there is no clear-cut technique or strategy to apply. In this case, the universe of possible models has been cut to three (linear, quadratic, or exponential), so it is not open-ended. Nonetheless, much interesting discussion can and should take place over which of the three best interprets each set of data. The task is also written to exemplify MP5, Use Appropriate Tools Strategically. Technology such as Desmos allows students to visualize the different models and obtain confirmation about which models are best. Graphing technology is also essential to see the meaning of all of the numbers in the tables: without it, students can only observe trends in the data, and this is not enough to determine which models are the most accurate.

This task was designed for an NSF supported summer program for teachers and undergraduate students held at the University of New Mexico from July 29 through August 2, 2013 (http://www.math.unm.edu/mctp/). The participants greatly helped to enrich the solution by proposing alternate models for the different data sets.

## Solution

1. To get an idea of which model might be appropriate, we have plotted the population data together on the graph below, with the data points for each city joined by line segments. Visual examination suggests that the Paris population is approximately linear, that the Lagos population could be modeled by a quadratic function, and that the Shenzhen population is likely exponential, though its rate of growth slows down after the turn of the 21st century.

To analyze the data more closely, we make tables, one for each city population. In the tables, we have listed the populations in thousands for simplicity, to avoid writing extra zeroes. To test for linear growth, we use the fact that a linear function is characterized by having equal differences over equal intervals. For quadratic growth, we use the fact that second differences (that is, differences of differences) over equal intervals are equal. Finally, exponential growth is characterized by having equal quotients over equal intervals. With this in mind, we make tables for each of the three cities, giving successive differences, second differences, and quotients over the 60-year period for which we have data.
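The three tests described above can be sketched in a few lines of Python. This is only an illustration: the function name `difference_table` is introduced here and is not part of the task, and the Paris figures are copied from the task statement (in thousands).

```python
def difference_table(populations):
    """Return first differences, second differences, and successive quotients."""
    # Differences between consecutive entries (test for linear growth).
    first = [b - a for a, b in zip(populations, populations[1:])]
    # Differences of those differences (test for quadratic growth).
    second = [b - a for a, b in zip(first, first[1:])]
    # Ratios of consecutive entries (test for exponential growth).
    quotients = [b / a for a, b in zip(populations, populations[1:])]
    return first, second, quotients


# Paris populations in thousands, one entry per decade from 1950 to 2010.
paris = [6300, 7400, 8200, 8700, 9300, 9700, 10500]

first, second, quotients = difference_table(paris)
print(first)                              # [1100, 800, 500, 600, 400, 800]
print(second)                             # [-300, -300, 100, -200, 400]
print([round(q, 2) for q in quotients])   # [1.17, 1.11, 1.06, 1.07, 1.04, 1.08]
```

Passing the Shenzhen or Lagos list instead gives the corresponding columns for those cities.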

Paris Population 1950 - 2010

| Year | Population (in thousands) | Change in Population (in thousands) | Change in Population Changes (in thousands) | Successive Population Quotients |
| --- | --- | --- | --- | --- |
| 1950 | 6300 | ---- | ---- | ---- |
| 1960 | 7400 | 1100 | ---- | 1.17 |
| 1970 | 8200 | 800 | -300 | 1.11 |
| 1980 | 8700 | 500 | -300 | 1.06 |
| 1990 | 9300 | 600 | 100 | 1.07 |
| 2000 | 9700 | 400 | -200 | 1.04 |
| 2010 | 10500 | 800 | 400 | 1.08 |

The population growth in Paris for this period is always positive, ranging from a low of 400 thousand to a high of 1100 thousand per decade. The mean rate of growth is 700 thousand per decade. A linear model with a growth rate of about 700 thousand per decade would be appropriate, although it would miss the general trend of slower growth (save for the last decade).
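The mean rate quoted above can be checked directly, and comparing the data against a line with that slope shows the trend such a line misses. The sketch below is an illustration only: anchoring the line at the 1950 value is an assumption made here for simplicity, not a fit taken from the task, and `linear_model` is a hypothetical name.

```python
years = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
paris = [6300, 7400, 8200, 8700, 9300, 9700, 10500]  # thousands

# Mean change per decade: total change divided by the number of decades.
mean_rate = (paris[-1] - paris[0]) / (len(paris) - 1)


def linear_model(year):
    """Assumed line: starts at the 1950 value and rises by mean_rate per decade."""
    return paris[0] + mean_rate * (year - years[0]) / 10


# Actual population minus the value given by the line, decade by decade.
residuals = [p - linear_model(y) for y, p in zip(years, paris)]

print(mean_rate)   # 700.0
print(residuals)   # [0.0, 400.0, 500.0, 300.0, 200.0, -100.0, 0.0]
```

Under this assumed line, the data sit above the model through the middle decades and fall back to it by the end, which is the pattern expected when growth is slowing.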

The second differences for this table are relatively uniform and are mostly negative, indicating that the rate of growth is slowing down. Except for the last data point, where the growth began to accelerate, a quadratic model would be able both to capture the growth and indicate that the general trend is toward slower growth.

The successive quotients are all relatively close to 1, so an exponential model with a base slightly larger than 1 might model the data relatively well. But, except for the last decade, the successive quotients are decreasing, meaning that the rate of growth is slowing down. This makes an exponential model inappropriate, since successive quotients of an exponential model over equal intervals would be constant.
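To see why constant quotients characterize exponential growth, write an exponential model as $f(t) = a\,b^{t}$ (the symbols $a$, $b$, and $h$ are notation introduced here for illustration and do not appear in the task). Then for any fixed interval $h$:

$$\frac{f(t+h)}{f(t)} = \frac{a\,b^{t+h}}{a\,b^{t}} = b^{h},$$

which does not depend on $t$. With $h$ equal to one decade, an exponential model would make every entry in the quotient column the same number, whereas the Paris quotients drift from 1.17 down to 1.04 before rising to 1.08.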

Shenzhen Population 1950 - 2010

| Year | Population (in thousands) | Change in Population (in thousands) | Change in Population Changes (in thousands) | Successive Population Quotients |
| --- | --- | --- | --- | --- |
| 1950 | 3.1 | ---- | ---- | ---- |
| 1960 | 8 | 4.9 | ---- | 2.6 |
| 1970 | 22 | 14 | 9.1 | 2.8 |
| 1980 | 58 | 36 | 22 | 2.6 |
| 1990 | 875 | 817 | 781 | 15 |
| 2000 | 6600 | 5725 | 4908 | 7.5 |
| 2010 | 10000 | 3400 | -2325 | 1.5 |

The successive differences for the Shenzhen population grow extremely rapidly and then start to decrease at the very end. Linear growth would mean constant differences so the table shows that a linear model would not be appropriate for the Shenzhen population.

The second differences increase very rapidly and then become negative at the very end. A quadratic function is therefore not likely to approximate the data very well.

The successive quotients are steady at the beginning, increase rapidly, and then decrease at the end. So an exponential model is also not likely to give an accurate fit but of the three choices it is probably the best.
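As a rough illustration of how an exponential model fails here, the sketch below builds one from the early data only and extends it to 2010. Taking the growth factor as the geometric mean over 1950 to 1980 is an assumption made for this example, and `exponential_model` is a hypothetical name; this is not the model used in the sketches in the commentary.

```python
shenzhen = [3.1, 8, 22, 58, 875, 6600, 10000]  # thousands, 1950 through 2010

# Assumed growth factor per decade: geometric mean over the three decades
# from 1950 to 1980, where the quotients are roughly steady.
factor = (shenzhen[3] / shenzhen[0]) ** (1 / 3)


def exponential_model(decades_since_1950):
    """Exponential curve through the 1950 and 1980 data points."""
    return shenzhen[0] * factor ** decades_since_1950


# Print year, model value (rounded), and recorded value side by side.
for k, actual in enumerate(shenzhen):
    print(1950 + 10 * k, round(exponential_model(k)), actual)
```

The factor comes out near 2.65 per decade, so the curve tracks the data through 1980 but reaches only about 1,100 thousand by 2010, roughly a tenth of the recorded 10,000 thousand.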

Lagos Population 1950 - 2010

| Year | Population (in thousands) | Change in Population (in thousands) | Change in Population Changes (in thousands) | Successive Population Quotients |
| --- | --- | --- | --- | --- |
| 1950 | 330 | ---- | ---- | ---- |
| 1960 | 760 | 430 | ---- | 2.3 |
| 1970 | 1400 | 640 | 210 | 1.8 |
| 1980 | 2600 | 1200 | 560 | 1.9 |
| 1990 | 4800 | 2200 | 1000 | 1.8 |
| 2000 | 7300 | 2500 | 300 | 1.5 |
| 2010 | 11000 | 3700 | 1200 | 1.5 |

The Lagos data shows consistently increasing differences in the population, so this rules out a linear model. The second differences for the Lagos data are all positive, showing that the rate of growth is increasing over this period of time. These second differences are not constant, but they also do not vary wildly as they do for the Shenzhen population, so a quadratic model might be appropriate. The successive quotients are generally decreasing (from 2.3 down to 1.5): since the successive quotients for an exponential model are all equal, an exponential model will either underestimate the population at the beginning of the period or overestimate the population toward the end (or both).

2. For Paris, a linear model would predict a constant growth of somewhere near 700,000 people per decade. A quadratic model, on the other hand, would predict that the population will continue to grow at a slower and slower rate until it reaches a maximum and then begins to decrease. Of these two, the quadratic is more likely to be realistic.

For Shenzhen, none of these models will make accurate predictions. The growth rate of Shenzhen over this 60-year period was phenomenally high and cannot continue, due to constraints on space and resources. A linear, quadratic, or exponential model will predict that this growth will continue. We can already see in the table that the growth is slowing down dramatically toward 2010.

As with Shenzhen, none of the models can make accurate predictions for Lagos over the long term. The growth of Lagos is less dramatic than that of Shenzhen (allowing for the possibility of a good quadratic approximation), but this approximation will prove inadequate very quickly as limits on space and resources slow down the population growth. The quadratic model could remain accurate for a few more years (perhaps for a decade or two) but not for the long term. For example, the Desmos sketch in the commentary, which models the Lagos population very well, predicts a population of about 15,000,000 by 2020 and close to 20,000,000 by 2030. Both of these estimates are perhaps realistic, but as time continues the quadratic model will predict more and more rapid growth, which is eventually not possible.
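To give a feel for how a quadratic model extrapolates, the sketch below passes a quadratic through three of the Lagos data points and evaluates it at later dates. This is not the Desmos model from the commentary: the choice of the 1950, 1980, and 2010 points is an assumption made here for illustration, and the resulting curve matches the data exactly only at those three points. The name `quadratic_through` is hypothetical.

```python
from fractions import Fraction


def quadratic_through(points):
    """Lagrange interpolation: the unique quadratic through three points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def model(x):
        # Each term equals its y-value at one point and zero at the other two.
        return (y0 * Fraction((x - x1) * (x - x2), (x0 - x1) * (x0 - x2))
                + y1 * Fraction((x - x0) * (x - x2), (x1 - x0) * (x1 - x2))
                + y2 * Fraction((x - x0) * (x - x1), (x2 - x0) * (x2 - x1)))

    return model


# (decades since 1950, population in thousands) for 1950, 1980, and 2010.
lagos_model = quadratic_through([(0, 330), (3, 2600), (6, 11000)])

for year in (2020, 2030, 2050):
    decades = (year - 1950) // 10
    print(year, round(float(lagos_model(decades))))
# 2020 15162
# 2030 20006
# 2050 31736
```

For this particular choice of points the extrapolated values for 2020 and 2030 land near 15 million and 20 million, and the decade-to-decade increase keeps growing without bound, which is the behavior that eventually makes any quadratic model unrealistic.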