HOW TO TEACH RESAMPLING STATS ALONG WITH A STANDARD TEXT

                         Julian L. Simon and Peter Bruce


                                  INTRODUCTION

                  A simple and effective way to teach the resampling

             method at the introductory level is to use your usual

             text and course outline, and present the resampling

             method immediately following the conventional method

             for all the problems that you demonstrate in class.

                  This tactic may be illustrated with the text

             Introductory Statistics for Business and Economics

             (Wiley, 1990), by Thomas H. Wonnacott and Ronald J.

             Wonnacott.  This text was chosen for illustration

             because one of us expects to use it for a class soon,

             and also is acquainted with Tom Wonnacott.  It was not

             chosen because it lends itself particularly well to the

             resampling approach; resampling fits with other texts

             just about as well.

                  Notes to teachers are either indented or in

             brackets.  Other material is intended to be read by

             students.


                              CONFIDENCE INTERVALS

                  W and W begin their book, and the first chapter on

             "The Nature of Statistics," with an example of the

             reliability of a simple randomly-selected 1988

             presidential election poll, showing 840 votes for Bush

             and 660 votes for Dukakis out of 1500.  W and W






             estimate 95% confidence limits for the population

             proportion of Bush supporters in conventional fashion.
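
             For teachers who want the conventional calculation in
             executable form, here is a minimal Python sketch.  It
             assumes the usual large-sample normal approximation
             (the sample proportion plus or minus 1.96 standard
             errors).  W and W's own formula is not reproduced in
             this note, and the function name is ours.

```python
import math


def normal_approx_interval(successes, n, z=1.96):
    """Return (low, high) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# 840 votes for Bush out of 1500 polled
low, high = normal_approx_interval(840, 1500)
print(f"{low:.3f} to {high:.3f}")
```

             For the poll data this works out to roughly 53.5% to
             58.5% for Bush.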






                  After showing this, the teacher may proceed by

             lecturing as follows:


             One can also estimate the confidence intervals in a fashion

        different from the classical approach just shown.  The resampling

        method works by experimentally drawing samples from a population

        like the one you wish to investigate.  Let's see how it is done.

             We draw samples of size 1500 from a population whose

        proportion we estimate using the information from the survey

        results, which showed a proportion of .56 Bush supporters.  (One

        makes the same assumption when using the classical method.)  [Note

        to the teacher:  Spend some more time here on the logic of this

        assumption, or else postpone the discussion until later.]

        Then we examine the results of those samples to see how much they

        vary from one another.  We can do this with an urn containing 56

        red balls and 44 black balls (or 5600 red and 4400 black balls),

        putting back the ball every time we draw one.  [The class can

        actually do this, and then go on to the computer procedure below,

        after noting that the procedure by hand is perfectly

        satisfactory, but gets tedious.  Or the teacher can immediately

        skip to the computer procedure, after just describing the urn

        procedure.  So we move on to:]

             Let's do this with the computer program RESAMPLING STATS.

             We first draw a single sample of 1500 "voters" with these

        commands:1


             GENERATE 1500 1,100 A

                  This command draws 1500 balls randomly with numbers

                  between 1 and 100, and puts them in a location we'll






                  call A.  We will let 1-56 = red (Bush), 57-100 = black

                  (Dukakis)

             COUNT A between 1 56 B

                  This command counts the number of red balls in the

                  sample of 1500, and puts the count in location B.

        -----------------------------------------------------------------

        1[Technical note to teacher: To conserve memory, Resampling Stats

        limits vectors to 1000 elements unless you otherwise specify.

        Therefore, this program needs the following command:


        MAXSIZE A 1500     This increases the size allowed for vector A

        to accommodate our 1500 "voters".]

        ----------------------------------------------------------------

              Please recall our purpose, which is to find out how much

        the sample results vary from one another.  Therefore, to find out

        the results from a good many samples, we next repeat the process

        (say) 100 times, keep score of the result each time, and then end

        the process when 100 trials are completed.   Then after the 100

        simulated samples have been drawn, we construct a histogram of

        the results.  We do all this by adding a few commands to the one-

        sample program we wrote above, as follows:


        REPEAT 100   Take 100 samples from our simulated population


             GENERATE 1500 1,100 A  Take 1500 balls randomly between 1

                  and 100, and put them in a location we'll call A.  Let

                  1-56 = red (Bush), 57-100 = black (Dukakis).

             COUNT A between 1 56 B   Count the number of red balls and put

                  the count in location B.







             SCORE B Z   Record the result of this trial on the

                  "scoreboard" Z.

        END        End the above experiment loop, go back to the

                   beginning, and repeat until 100 trials have been

                   completed.

        HISTOGRAM Z   Diagram the results of the 100 trials, and show the

                   mean.  The results may be seen in Figure W1.
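
             For classes without RESAMPLING STATS, the same
             experiment can be sketched in Python.  This is only an
             illustrative translation: the function name, the seed,
             and the crude text histogram are our assumptions, not
             features of the package.

```python
import random
from collections import Counter


def simulate_poll(n_voters=1500, pct_bush=56, n_trials=100, seed=1):
    """Return the count of 'Bush' balls in each of n_trials simulated polls."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_trials):                                    # REPEAT 100
        sample = [rng.randint(1, 100) for _ in range(n_voters)]  # GENERATE 1500 1,100 A
        bush = sum(1 for ball in sample if ball <= pct_bush)     # COUNT A between 1 56 B
        counts.append(bush)                                      # SCORE B Z
    return counts


counts = simulate_poll()

# HISTOGRAM Z: one row of stars per bin of 10 votes, then the mean
bins = Counter((c // 10) * 10 for c in counts)
for start in sorted(bins):
    print(f"{start}-{start + 9}: {'*' * bins[start]}")
print("mean:", sum(counts) / len(counts))
```

             The exact counts depend on the random numbers drawn, so
             a class run will not reproduce Figure W1 bar for bar.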







             In the histogram we see that sample results range all the

        way from 786 (53%) favoring Bush to 888 (59%) favoring Bush.  The

        results clearly vary greatly from one trial sample to another,

        teaching the crucial lesson of variability.  Our first estimate

        of the sampling "margin of error" is clearly about 6%.  If we

        were to do a thousand more samples, or ten thousand, however, we

        would expect the range of sample results to be greater: a few "far

        out" samples are more likely to be generated by chance in a

        thousand than in a hundred samples.  We solve this dilemma by

        specifying a "confidence interval" that includes the vast

        majority -- say 95% -- of our sample results.  In this case, the

        range 801 (53.4%) to 871 (58.1%) includes 95% of the trial

        results and, therefore, is our estimate of a "95% confidence

        interval".  You will learn later how to get RESAMPLING STATS to

        examine all your trial results and find the endpoints of this

        interval for you.
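
        The idea of cutting off the extreme trials can be sketched in
        Python as follows.  This is not the RESAMPLING STATS procedure
        itself; the helper name, the seed, and the choice of 1000
        trials are assumptions made for illustration.

```python
import random


def percentile_interval(values, coverage=0.95):
    """Return the trial results that leave (1 - coverage) / 2 of the trials in each tail."""
    ordered = sorted(values)
    tail = (1 - coverage) / 2
    lo_index = int(tail * len(ordered))        # number of trials cut from the low end
    hi_index = len(ordered) - 1 - lo_index     # same number cut from the high end
    return ordered[lo_index], ordered[hi_index]


rng = random.Random(2)
# 1000 simulated polls of 1500 voters, each voter for Bush with probability .56
trials = [sum(rng.random() < 0.56 for _ in range(1500)) for _ in range(1000)]
low, high = percentile_interval(trials)
print(low, high, round(low / 1500, 3), round(high / 1500, 3))
```

        With 1000 trials, 25 results are set aside at each end and the
        interval runs between the remaining extremes.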

                  It is important that without any further ado,

             resampling provides an intellectually complete answer

             to the question that W and W raise in their very first

             pages but cannot answer in a meaningful fashion.  They

             must throw a formula at the reader that the reader

             cannot possibly understand at that point, and indeed

             may never be able to fully understand, even after

             waiting many chapters for the answer to be provided

             with classical methods.  But because W and W are so

             anxious to immediately get the reader swimming in the

             waters of inferential statistics, rather than

             postponing that entry for several chapters, they are






             forced to provide a baffling formula.

                  In contrast, resampling can in the very first

             pages provide a procedure and an answer to the problem

             at hand that students can follow and understand in its

             entirety.  This enables W and W to satisfy their desire

             to immediately introduce inferential statistics,

             without paying the price of baffling and scaring the

             reader.

                  The instructor might try to construct a program in

             BASIC to handle the resampling procedure.  But it will

             soon be clear even to a person adept with that language

             that the program will not be simple to write.  And the

             program will certainly be quite obscure to students who

             do not already understand BASIC, whereas the RESAMPLING

             STATS program above can be understood without prior

             programming experience in any language.

                  Showing a conventional solution with Minitab at

             this point would be entirely meaningless to the

             beginning student, another point in favor of resampling

             and of RESAMPLING STATS.


                               PROBABILITY THEORY

                  W and W next present a lovely opportunity to show

             what resampling can do in the context of probability

             theory.  On page 83 they show how to calculate the

             probability of not getting a boy in five children,

             using the multiplication rule.








             The teacher can then continue and ask:



             What is the probability of getting exactly four girls in

        five children?  The answer cannot be arrived at with a simple

        rule.  You could work this problem in the same manner that the

        earlier problem about boy-girl-boy was worked, constructing the

        entire sample space (W and W examples 3-2 to 3-4), but this

        obviously would be tedious.  And if the problem were 14 girls out

        of 19 children, it would obviously be impossible to handle with

        sample-space analysis.

             Another way to estimate the chances of getting four girls in

        five children is by resampling (or Monte Carlo) experimentation.

        You might make a first approximation that the probability of a

        girl being born is the same as that of a boy.  And you could

        then use coins to stand for children, a head for a girl and a tail

        for a boy.  Continue as follows:

        1.  Toss a coin 5 times, letting heads = girl, tails = boy.

        2.  Count how often you got a head.

        3.  Record "yes" if 4 heads, "no" if not.

        4.  Repeat steps 1-3, say, 50 times.

        5.  Count how many of the 50 trials had a "yes".



             Instead of using coins, we can do the simulation on the

        computer with RESAMPLING STATS.  This time we'll be more

        realistic and assume that the probability of a girl is 48%, and of a

        boy 52%.  A program to arrive at an estimate is









             REPEAT 1000    Do the experiment 1000 times

             GENERATE 5 1,100 A    Generate randomly five numbers between

                  1 and 100 and put them in a location called A.  Let

                  1-48 = girl, 49-100 = boy.

             COUNT A <=48 B    Count the number of girls, put the result in B

             SCORE B Z     Keep score of the result of each trial

        END       End one trial, go back and repeat until all 1000 are

                  complete, then proceed

        HISTOGRAM Z     Produce a histogram of the trial results.
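
        A rough Python counterpart of this program is sketched below.
        Instead of drawing a histogram, it tallies directly the trials
        with exactly four girls.  The function name and seed are
        illustrative assumptions.

```python
import random


def estimate_four_girls(n_trials=1000, seed=3):
    """Estimate P(exactly 4 girls in 5 births) when P(girl) = 48%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):                                # REPEAT 1000
        births = [rng.randint(1, 100) for _ in range(5)]     # GENERATE 5 1,100 A
        girls = sum(1 for b in births if b <= 48)            # COUNT A <=48 B
        if girls == 4:                                       # read off the "4" bar
            hits += 1
    return hits / n_trials


print(estimate_four_girls())
```

        The estimate changes a little from run to run; more trials
        give a steadier figure.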







                              BINOMIAL DISTRIBUTION

                  When W and W discuss the binomial distribution,

             they show how to calculate the probability that, from a

             population of microwave ovens that are 80% perfect, a

             sample of 10 will be half perfect and half imperfect

             (p. 119).  After that deductive calculation, the

             resampling procedure -- just like the program for four

             girls out of five children just above -- may be shown.

             Students may be told that they can take their choice of

             which way to handle problems in real life, and on exams

             -- with the binomial formula, or with the RESAMPLING

             STATS program.  If correctly done, both methods will

             arrive at the same result.  If experience holds, most

             students will tend to opt for simulation.
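
             To let students check the agreement for themselves,
             the sketch below computes the oven probability both
             ways in Python.  The function names, the seed, and the
             20,000 trials are our assumptions; the 80% figure and
             the sample of 10 come from the W and W problem.

```python
import math
import random


def binomial_pmf(k, n, p):
    """Binomial formula: probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def simulate_pmf(k, n, p, n_trials=20000, seed=4):
    """Share of simulated samples of size n that contain exactly k successes."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(n_trials)
        if sum(rng.random() < p for _ in range(n)) == k
    )
    return hits / n_trials


exact = binomial_pmf(5, 10, 0.8)     # 5 perfect ovens out of 10
approx = simulate_pmf(5, 10, 0.8)
print(round(exact, 4), approx)
```

             The formula gives about 0.026; the simulated share
             should land close to it, with the gap shrinking as the
             number of trials grows.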

                  Some students will feel that there is something

             illegitimate about simulation, perhaps because it is

             not "exact".  It sometimes helps to point out to the

             students that any probability formula such as the

             binomial is itself only a mathematical shortcut to the

             full procedure of specifying the entire sample space.

             The use of the t-distribution in a two-sample problem

             is an excellent example:  it is a mathematically

             convenient way of describing what happens in a

             randomization procedure, developed in an era in which

             lack of computing power kept people from carrying out

             randomizations for all but the smallest data sets.

             Simulation is simply another shortcut.  When one sees

             that both the formula method and the simulation method






             are on the same footing in this respect, resampling is

             more likely to seem legitimate.

                  W and W then (p. 120) tell the students that

             instead of the formula, they can use a table in the

             back of the book.  At this point the student's

             intuition is of course shut off, because the logic of a

             table is impenetrable to all.  Once again the

             Resampling Stats procedure is shown, and the students

             can see for themselves that they can completely

             understand everything that is happening.

                  Here again one may wish to compare a BASIC program

             with RESAMPLING STATS in performing the resampling

             procedure.  This is the program that Gnanadesikan et al.

             (The Art and Technique of Simulation, Dale Seymour,

             1987) use to simulate repeated coin tosses:



        80  INPUT "ENTER THE NUMBER OF KEY COMPONENTS";N

        100  INPUT "ENTER THE NUMBER OF TRIALS";NT

        120  DIM T$(NT,N),C(2*N)

        140  FOR I = 1 TO NT

        150  LET NH = 0

        160  FOR J = 1 TO N

        170  LET X = RND (1)

        180  IF X < .5 THEN 220

        190  T$ (I,J) = "H"

        200  NH = NH + 1

        210  GOTO 230

        220  T$ (I,J) = "T"







        230  IF J = N THEN 270

        250  GOTO 270

        270  NEXT J

        280  C(NH + 1) = C(NH + 1) + 1

        290  NEXT I

        330  FOR K = 1 TO N + 1

        350  NEXT K

        360  END


                  The above BASIC program is written in general form

             and does not specify a particular number of coins and

             heads, as RESAMPLING STATS does.  (We have simplified

             the program by removing the many "print" statements.)

             Note that the RESAMPLING STATS program listed above

             does the same job, for a sample of 5 coins.


                        INTERLUDE: THE GENERAL PROCEDURE

             The procedural steps taken in solving the particular problem

        above were chosen to fit the specific facts.  We can also

        describe the steps in a more general fashion.  The generalized

        procedure describes what we do when we estimate a probability

        using resampling problem-solving operations.

             Step A.  Construct a simulated population or "universe" of

        random numbers or cards or dice or another randomizing mechanism

        whose composition is similar to the universe whose behavior we

        wish to describe and investigate.  The term "universe" refers to

        the system that is relevant for a single simple event.  For

        example:







             A coin with two sides, or two sets of random numbers "1-52"

        and "53-100", simulates the system that produces a single male or

        female birth, when we are estimating the probability of four

        girls in the first five children.  Notice that in this universe

        the probability of a girl remains the same from trial event to

        trial event -- that is, the trials are independent --

        demonstrating a universe from which we sample with

        replacement.

             Hard thinking is required in order to determine the

        appropriate "real" universe whose properties interest you.

             Step(s) B.  Specify the procedure that produces a pseudo-

        sample which simulates the real-life sample in which we are

        interested.  That is, one must specify the procedural rules by

        which the sample is drawn from the simulated universe.  These

        rules must correspond to the behavior of the real universe in

        which you are interested.  To put it another way, the simulation

        procedure must produce simple experimental events with the same

        probabilities that the simple events have in the real world.  For

        example:

             In the case of four daughters in five children, you can

        draw a card and then replace it if you are using a deck of red

        and black cards.  Or if you are using a random-numbers table, the

        random numbers automatically simulate replacement.  Just as the

        chances of having a boy or a girl do not change depending on the

        sex of the preceding child, so we want to ensure through

        replacement that the chances do not change each time we choose

        from the deck of cards.

             Recording the outcome of the sampling must be indicated as






        part of this step, e.g. "record `yes' if a girl, `no' if a

        boy."

             Step(s) C.  If several simple events must be combined into a

        composite event, and if the composite event was not described in

        the procedure in step B, describe it now.  For example:

             For the four girls in five children, the procedure for

        each simple event of a single birth was described in step B.  Now

        we must specify repeating the simple event five times, and

        determine whether the outcome is or is not four girls.

             Recording of "four girls" or "not four girls"

        is part of this step.  This record indicates the results of all

        the trials and is the basis for a tabulation of the final result.

             Step(s) D.  Calculate the result from the tabulation of outcomes

        of the resampling trials.  For example: the proportion of "yes"

        trials estimates the likelihood of the event described in step C.

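
        Steps A through D can be sketched as one small Python function.
        This is a minimal illustration under our own assumptions (the
        function name, its parameters, and the 48/52 universe of labels
        are ours); it always samples with replacement.

```python
import random


def resample_probability(universe, sample_size, is_success, n_trials=10000, seed=5):
    """Estimate the probability of a composite event by repeated sampling."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_trials):
        # Step B: draw one pseudo-sample, with replacement, from the universe
        sample = [rng.choice(universe) for _ in range(sample_size)]
        # Step C: decide whether the composite event occurred, and record it
        if is_success(sample):
            successes += 1
    # Step D: the proportion of "yes" trials is the estimate
    return successes / n_trials


# Step A: a universe in which 48 of every 100 births are girls
universe = ["girl"] * 48 + ["boy"] * 52
estimate = resample_probability(universe, 5, lambda s: s.count("girl") == 4)
print(estimate)
```

        Changing the universe, the sample size, or the test in the last
        argument adapts the same four steps to a different problem.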

                RANDOM SAMPLING AND THE DISTRIBUTION OF THE MEAN

             W and W pose the following problem (p. 202):  "A population

        of men on a large midwestern campus has a mean height of mu = 69

        inches, and a standard deviation sigma = 3.22 inches.  If a

        random sample of n = 10 men is drawn, what is the chance the

        sample mean X-bar will be within 2 inches of the population mean

        mu?"






             The framing of this question reveals the unrealistic fashion

        in which classical statistics poses most questions.  The data for

        the population necessarily arise discretely, and the parameter of

        the standard deviation is a derived computation; beginning

        the discussion with the standard deviation given as a datum

        immediately removes the problem from a realistic setting.

             Luckily, W and W earlier present data on the heights of 200

        men (p. 28).








             We take those observations as our supposed population, that

        is, as our best estimate of what the population is like.  We now

        draw samples of 10 from this collection.  Whether we draw them

        with or without replacement depends on what we are assuming the

        collection to be - the entire population, or a sample from it.

        If the latter, we must discuss why it is reasonable to consider

        it our best estimate of the population, and then draw from it.

             [It is unfortunate for pedagogical purposes that W and W

        present the data in grouped format.  The student may therefore

        leap to the unsound conclusion that the appropriate procedure is

        to rearrange the raw data into bins to produce a frequency

        histogram, and then do a bootstrap confidence interval using not

        the original data we collected, but the values of the bin centers

        and their frequencies.  The teacher should forestall that

        possibility. ]

            Programs for the two different situations are as follows:

        SAMPLING WITHOUT REPLACEMENT:


        READ file "heights" A      Read the height data from an ASCII

                  file called "heights" located in the same directory as

                  RESAMPLING STATS.  The heights should be listed in a

                  column; they will become vector A.

        REPEAT 100      Repeat the following trial 100 times

             SHUFFLE A A      Shuffle the height vector A, keep calling

                              it A

             TAKE A 1,10 B      Take the first 10 (without replacement),

                                put them in B

             MEAN B C         Calculate their mean







             SCORE C Z        Keep score of the mean

        END          End one trial, go back and repeat until all 100 are

                     complete, then proceed to the next step

        HISTOGRAM Z      Produce a histogram of the "resample" means


        SAMPLING WITH REPLACEMENT:



        READ file "heights" A      Read the height data from an ASCII

                  file called "heights" located in the same directory as

                  RESAMPLING STATS.  The heights should be listed in a

                  column; they will become vector A.

        REPEAT 100      Repeat the following trial 100 times

             SAMPLE 10 A B      Take a sample of size 10, with

                                replacement, put them in B

             MEAN B C        Calculate its mean

             SCORE C Z       Keep score of the mean

        END          End one trial, go back and repeat until all 100 are

                     complete, then proceed to the next step

        HISTOGRAM Z      Produce a histogram of the "resample" means
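
        The two sampling schemes can be compared in a short Python
        sketch.  The W and W height data are not reproduced in this
        note, so the 200 heights below are MADE-UP PLACEHOLDER values
        and must be replaced with the real observations.  The helper
        name, the seed, and the 1000 trials are likewise assumptions.

```python
import random
import statistics

rng = random.Random(6)

# PLACEHOLDER DATA (not the W and W heights): 200 invented heights in inches
heights = [round(rng.gauss(69, 3.22)) for _ in range(200)]
center = statistics.mean(heights)


def share_within(draw_sample, n_trials=1000, tolerance=2.0):
    """Share of trial means that fall within `tolerance` inches of the center."""
    means = [statistics.mean(draw_sample()) for _ in range(n_trials)]
    return sum(abs(m - center) <= tolerance for m in means) / n_trials


# Without replacement: like SHUFFLE followed by TAKE of the first 10
without = share_within(lambda: rng.sample(heights, 10))
# With replacement: like SAMPLE 10 A B
with_repl = share_within(lambda: rng.choices(heights, k=10))
print(without, with_repl)
```

        Each printed share estimates the chance that a sample mean of
        10 falls within 2 inches of the mean of the 200 heights.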







                  W and W show a Monte Carlo simulation for their

             height problem (p. 222).








        The teacher may compare the clarity of the RESAMPLING STATS

        bootstrap-like treatment with the treatment using the normal

        distribution and the computer.


                                  THE BOOTSTRAP

                  Happily, W and W provide an introduction to the

             bootstrap in the context of confidence intervals.  They

             suggest, however, that it is for use "in situations too

             complex for standard theory to handle" (p. 277).  Here

             the teacher may recall how a very similar technique was

             used successfully right at the start of the course (see

             above), and remind students how easy it is to do this

             with RESAMPLING STATS. So how about doing a bootstrap

             right here, using the 200 heights as a sample, not a

             population?  Here's the program:



        BOOTSTRAP SAMPLING:


        READ file "heights" A      Read the height data from an ASCII

                  file called "heights" located in the same directory as

                  RESAMPLING STATS.  The heights should be listed in a

                  column; they will become vector A.

        REPEAT 100      Repeat the following trial 100 times

             SAMPLE 200 A B     Take a sample of size 200, selected

                                randomly and with replacement, from our

                                original sample

             MEAN B C           Calculate the mean of the resample

             SCORE C Z          Keep score of the mean







        END          End one trial, go back and repeat until all 100 are

                     complete, then proceed to the next step

        HISTOGRAM Z      Produce a histogram of the "resample" means
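
        A Python sketch of the same bootstrap follows.  As before, the
        heights are MADE-UP PLACEHOLDER values standing in for the W
        and W data, and the function name, the seed, and the use of
        1000 resamples (rather than 100) are our assumptions.

```python
import random


def bootstrap_means(data, n_resamples, rng):
    """Mean of each resample of len(data) values drawn with replacement."""
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(n_resamples)]


rng = random.Random(7)
# PLACEHOLDER DATA (not the W and W heights): 200 invented heights in inches
heights = [round(rng.gauss(69, 3.22)) for _ in range(200)]

means = sorted(bootstrap_means(heights, 1000, rng))
# Middle 95% of the 1000 resample means: drop 25 from each end
low, high = means[25], means[974]
print(round(low, 2), round(high, 2))
```

        The two printed values bracket the middle 95% of the resample
        means.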









                               HYPOTHESIS TESTING

                  W and W begin their discussion of hypothesis

             testing (p. 288) with samples of 10 men's salaries and

             5 women's salaries, and they ask if there is a

             difference between the groups.  (The actual difference

             is $5,000.)  They deal with the problem with the t

             test.

                  Minitab or other software may also be presented at

             this point.








                  After completing the demonstration with the t test

             (and perhaps standard software), the teacher may

             proceed as follows with a modified randomization test

             that samples with replacement.


        COPY (13 11 19 15 22 20 14 17 14 15) A      Copy the data for

                                                    the men's salaries

        COPY (9 12 8 10 16) B       Copy the data for the women's

                                    salaries

        CONCAT A B C       Put all the data together in the same vector

        REPEAT 100         Repeat the following procedure 100 times

             SAMPLE 10 C D      Select 10 salaries, at random and with

                  replacement (our original sample was assumed to be from

                  a larger population), and put them in a vector called D

             MEAN D DD          Calculate the mean salary in this group

             SAMPLE 5 C E       Select 5 salaries, at random and with

                                replacement, and put them in E

             MEAN E EE          Calculate the mean salary in this group

             SUBTRACT DD EE F   Find out by how much the "male" average

                                exceeds the "female" average

             SCORE F Z          Keep score of the difference

        END

        HISTOGRAM Z             Produce a histogram of trial differences
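
        The salary program translates into Python roughly as follows.
        The data are those given above (in thousands of dollars); the
        function name, the seed, the 1000 trials, and the `replace`
        switch are illustrative assumptions.  Setting replace=False
        gives an ordinary randomization test that shuffles the 15
        salaries instead of sampling them with replacement.

```python
import random

men = [13, 11, 19, 15, 22, 20, 14, 17, 14, 15]
women = [9, 12, 8, 10, 16]
observed = sum(men) / len(men) - sum(women) / len(women)   # 16 - 11 = 5.0


def null_differences(n_trials=1000, seed=8, replace=True):
    """Differences in group means when group labels carry no information."""
    rng = random.Random(seed)
    pooled = men + women                                   # CONCAT A B C
    diffs = []
    for _ in range(n_trials):                              # REPEAT
        if replace:
            d = rng.choices(pooled, k=len(men))            # SAMPLE 10 C D
            e = rng.choices(pooled, k=len(women))          # SAMPLE 5 C E
        else:
            shuffled = rng.sample(pooled, len(pooled))     # shuffle, then split
            d, e = shuffled[:len(men)], shuffled[len(men):]
        diffs.append(sum(d) / len(d) - sum(e) / len(e))    # SUBTRACT DD EE F
    return diffs


diffs = null_differences()
share = sum(d >= observed for d in diffs) / len(diffs)
print(observed, share)
```

        The printed share is the proportion of trials in which chance
        alone produced a difference at least as large as the observed
        $5,000.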







             In the histogram we see that randomly-drawn samples produced

        differences in average salary that were generally less than

        $4,000; only once was there a difference greater than $5,000.


                  The class may then discuss the pros and cons of

             the classical and the resampling approaches for this

             problem.  Again, the students may be told that they may

             use either method on examinations.  As long as the data

             are given in their full form, the students are likely

             to opt for the resampling method.


                                   DISCUSSION

                  We have presented only a very few illustrative

             problems.  But even with this small set, the teacher

             should be able to have a good idea of the place of

             resampling when taught in parallel with the classical

             methods.  And even this small a sample of problems is

             sufficient to provide a reasonable sense of how the

             general resampling method deals with the garden variety

             of statistical and probabilistic problems.

                  A definition of resampling, a bit of its history,

             and other background materials that may be used at one

             place or another in the course may be found in the

             enclosed article from Chance.



        howteach statwork  disk 1-210 May 14, 1991