| A statistical simulation and analysis, aimed at A-level
statisticians. Developed from an idea by Jo Tomalin
and Dave Cassell.
Two hundred dandelion seeds are scattered at random on a 10 by 8 patch of soil.
They all grow. The patch is then subdivided into 80 plots each measuring 1 by 1
and the number of dandelions in each plot is counted. How many plots would you
expect to have no dandelions in? How many with just one dandelion? What kind of
distribution is involved? How could you model it?
The answer is... find out by growing them on your
Texas Instruments TI-83 graphical calculator! Through programming, the
calculator can simulate the planting of the seeds, the division of the patch
into 80 plots and the counting of seeds in each plot. The data can then be
processed into a frequency distribution and finally modelled by either the
binomial or the Poisson probability distribution. From the student's point of
view, there are many aspects of this process to appreciate. The programming
itself is worth studying, as are the various stages of the simulation, since
they are all key elements in any statistics course. A good way of presenting
this simulation to a class is to use the OHP screen that is available for the
TI-83.
First a program is required to simulate the
planting of the seeds.
Input T
ClrDraw
PlotsOff
ClrAllLists
0->Xmin
10->Xmax
0->Ymin
8->Ymax
For(N,1,80)
N->L1(N)
0->L2(N)
End
T->L1(81)
For(M,1,T)
10*rand->X
8*rand->Y
Pt-On(X,Y)
10*int(Y)+int(X+1)->N
L2(N)+1->L2(N)
End
Pause
For(U,1,8)
Line(0,U,10,U)
End
For(T,1,10)
Line(T,0,T,8)
End
Pause
|
T represents the number of dandelions to be planted, for
example 200 (and is stored at the end of List L1 for use later in the program).
The program clears all lists and displays and sets up the axes. It sets up list
L1 to contain the plot numbers 1 to 80. A loop is then run that creates and
plots random points (the dandelions) in the 10 by 8 grid and keeps a tally of
how many points there are in each of the 80 plots.
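
The plot number used in the tally can be written out explicitly. As a worked check (the sample point below is made up purely for illustration), the line 10*int(Y)+int(X+1)->N numbers the plots row by row, starting from the bottom left:

```latex
N = 10\lfloor Y \rfloor + \lfloor X \rfloor + 1, \qquad 0 \le X < 10,\; 0 \le Y < 8
% Example: (X, Y) = (3.7, 2.4) gives N = 10 \cdot 2 + 3 + 1 = 24
```

So a seed at (3.7, 2.4) would be counted in plot 24, the fourth plot of the third row, and N always lies between 1 and 80.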

When all the points have been displayed, the grid showing
the 80 plots is added.

|
ClrList L3,L4
max(L2)+1->P
For(N,1,P)
N-1->L3(N)
0->L4(N)
End
For(M,1,80)
L2(M)+1->Q
L4(Q)+1->L4(Q)
End
max(L4)+5->Ymax
max(L3)+1->Xmax
Plot1(Histogram,L3,L4)
DispGraph
Pause
|
This section of the program uses lists L3 and
L4 to create a frequency count of the plot tallies held in list L2: L3 holds
the possible counts 0, 1, 2, ... and L4 the number of plots with each count.
The data is then displayed as a histogram.
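
A quick sanity check on the frequency table may be worth adding at this point. The lines below are a suggested addition, not part of the original listing; they could be placed after the tally loop. The first value displayed should be 80 (every plot is counted once) and the second should equal the number of seeds planted, which is stored in L1(81).

```
Disp sum(L4)
Disp sum(L3*L4)
Disp L1(81)
Pause
```

Note that T itself cannot be used for the comparison here, because the loop that draws the vertical grid lines reuses T as its counter.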

|
L1(81)/80->X
max(L3)->N
X/N->P
80*binompdf(N,P,L3)->L6
L3+.5->L5
Plot2(xyLine,L5,L6)
Pause
80*poissonpdf(X,L3)->L6
Plot2(xyLine,L5,L6)
Pause
|
Finally this section works out the theoretical
expected frequencies, based first on the Binomial distribution and secondly
on the Poisson distribution. The number of dandelions planted (T), stored in
L1(81), divided by the number of plots (80) gives the average number of
dandelions per plot, X. This is used for the first parameter in calculating the
Poisson probabilities. For the Binomial distribution, the number of trials, N
(taken by the program as the largest count observed in any plot), and the
probability of success, P (the average X divided by N), are found and then used
in calculating the binomial probabilities. (Note that 0.5 is added to the
values in list L3 in order that the points are plotted in the middle of each
bar of the histogram.)
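
As a worked illustration of the Poisson calculation, take the 200 seeds of the opening question. The symbols λ (the mean per plot) and E_k (the expected number of plots containing exactly k dandelions) are notation introduced here, not names used in the program:

```latex
\lambda = \frac{200}{80} = 2.5, \qquad
E_k = 80 \cdot \frac{e^{-\lambda}\lambda^{k}}{k!}

E_0 = 80\,e^{-2.5} \approx 6.6, \qquad
E_1 = 80 \cdot 2.5\,e^{-2.5} \approx 16.4
```

Under the Poisson model, then, roughly 6 or 7 plots would be expected to have no dandelions and about 16 to have exactly one. These are the values that 80*poissonpdf(X,L3) places in the first two entries of L6.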
 
The simulation can be run many times for various values of
T. The illustrations above show an example of T=400. Higher values should
provide a better "fit". If this does not happen, then you can ask why: either
the model is wrong or the data isn't randomly distributed. Here are some more
displays comparing the results with the expected Poisson frequencies:
The Poisson distribution is usually found to give a better "fit"
than the Binomial, particularly for large values of T. The quality of the fit
could be measured, of course, using the "Chi-squared" distribution.
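
One way the fit might be measured on the calculator is sketched below. It is not part of the original program and rests on several assumptions: it is run after the final Pause, when L6 holds the Poisson expected frequencies; the variable names C and D are arbitrary; the degrees of freedom are taken as the number of classes minus 1; and no pooling of classes with small expected frequencies is attempted, although a careful test would normally require it. 1E99 is typed with the EE key and stands in for "infinity".

```
sum((L4-L6)²/L6)->C
dim(L3)-1->D
Disp C
Disp χ²cdf(C,1E99,D)
```

The first value displayed is the chi-squared statistic and the second is the upper-tail probability, so a small second value would suggest a poor fit. Replacing L6 with the binomial expected frequencies would allow the two models to be compared in the same way.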
The simulation could be developed to include considerations
like:
- some dandelions dying; a probability of survival could be
associated with each seed
- a dandelion's death affecting its neighbours; dandelions
within say 0.1 units from a dead seed having an increased probability of
dying
- various types of soil in the different plots; a probability
of survival could be associated with each plot
- inclusion of the "chi-squared" test for goodness of fit.
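
As a starting point for the first of these extensions, the planting loop could be altered along the following lines. This is only a sketch: the variable S (the survival probability, entered by the user) and the prompt text are assumptions introduced here, and a seed is simply not drawn or counted if it fails to survive.

```
Input "SURVIVAL PROB?",S
For(M,1,T)
10*rand->X
8*rand->Y
If rand<S
Then
Pt-On(X,Y)
10*int(Y)+int(X+1)->N
L2(N)+1->L2(N)
End
End
```

With this change the average number of dandelions per plot is no longer T/80, so the later section would need to use sum(L2)/80 in place of L1(81)/80 when calculating the expected frequencies.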
(Thanks to Prof. Dr. L. Paditz for checking the program listing. At
his site you will find a version of this program for the Casio CFX-9850G PLUS.
Jo Tomalin has adapted the program and teaching materials for a Casio 7400
and the TI-80 and can be emailed for a copy.) |