The ComPADRE Collections

Why Teach using Data Simulations?

Initial Publication Date: May 4, 2007

Why use Data Simulations?

There are many reasons to use data simulation in the classroom. These include:


Simulation is an important tool used by statisticians to solve problems. In a plenary talk at the First US Conference on Teaching Statistics, George Cobb included simulation in the three R's of inference: Randomize (data production), Repeat (by simulation to see what's typical), and Reject (any model that puts your data in its tail). Therefore, students need to learn how to use simulation as part of statistical problem solving.


Simulating data can help students visualize and build a deep understanding of difficult and abstract statistical concepts. Simulations allow students to see dynamic processes, rather than static figures and illustrations. Simulations are one of the best ways to promote a deeper understanding of statistical concepts by allowing students to pose "what if" questions and test them using data (Garfield & Ben-Zvi, in press). For example, what happens to the shape, center, and spread of a sampling distribution if the sample size is increased? Simulations can also be used to help students understand random processes and outcomes, seeing that a random variable can have an unpredictable outcome yet have a predictable pattern over the long run.
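
The "what if" question about sample size can be explored with a few lines of code. The following is a minimal sketch, not a prescribed classroom tool: the skewed population, the sample sizes, the number of repetitions, and the function name `sampling_distribution` are all assumptions chosen for illustration.

```python
import random
import statistics

def sampling_distribution(population, sample_size, repetitions, rng):
    """Return the means of `repetitions` random samples drawn with replacement."""
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(repetitions)
    ]

rng = random.Random(2007)  # fixed seed so a class can reproduce the same run

# Hypothetical right-skewed population (exponential with mean about 10).
population = [rng.expovariate(1 / 10) for _ in range(10_000)]

results = {}
for n in (5, 25, 100):  # illustrative sample sizes
    means = sampling_distribution(population, n, 2_000, rng)
    results[n] = (statistics.mean(means), statistics.stdev(means))
    print(f"n={n:>3}  center={results[n][0]:.2f}  spread={results[n][1]:.2f}")
```

Students can compare the printed center and spread across sample sizes, and plot a histogram of each set of means to examine how the shape changes.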


Simulations give students a way to address questions of statistical inference informally, before they study the topic formally later in the course. For example, in a class-designed experiment testing whether students can correctly identify Coke or Pepsi in a blind taste test, students can compare the experimental results to what might have happened by chance alone and decide whether their result is plausibly due to chance or points to something else. This can be done without any formal hypothesis testing, by simulating the data that would result if a student were just guessing.
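
One way to carry out the "just guessing" comparison is sketched below. The number of tastings (20), the observed number correct (15), and the function name `simulate_guessing` are hypothetical values chosen for illustration, not results from the activity; the sketch also assumes that a guesser is right with probability 0.5 on each tasting.

```python
import random

def simulate_guessing(n_trials, repetitions, rng):
    """Count correct identifications in each simulated session of pure guessing."""
    return [
        sum(rng.random() < 0.5 for _ in range(n_trials))  # each guess right half the time
        for _ in range(repetitions)
    ]

rng = random.Random(1)   # fixed seed for a reproducible run

n_trials = 20            # hypothetical number of tastings
observed_correct = 15    # hypothetical class result

counts = simulate_guessing(n_trials, 10_000, rng)

# Proportion of guessing-only sessions that did at least as well as the class result.
proportion_as_extreme = sum(c >= observed_correct for c in counts) / len(counts)
print(f"Guessing alone gave {observed_correct}+ correct in "
      f"{proportion_as_extreme:.1%} of simulated sessions")
```

If guessing rarely produces a result as good as the observed one, students have informal grounds to doubt that the taster was only guessing.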


Simulations provide a way to actively engage students in making and testing conjectures about data, developing their reasoning about statistical concepts and procedures. Simulations can encourage students to develop their analytical reasoning by requiring them to analyze processes and set up a series of steps for predicting outcomes (i.e., modeling) (Gnanadesikan, Scheaffer, & Swift, 1987). For example, students can be asked to predict what will happen if a "One Son" policy is enforced, limiting families to one son. They can reason about what the average family size might be and what the ratio of boys to girls would be. These predictions can then be tested by simulating births and collecting simulated data on family size, which can be graphed and summarized to answer the research question. The simulated results can be compared to students' conjectures, motivating them to reason about differences between what they predicted and what the simulation shows.
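
The "One Son" simulation might be sketched as follows. This sketch rests on several assumptions that are not stated in the text: every family keeps having children until its first son and then stops, each birth is a boy with probability 0.5, and births are independent. The function name `simulate_family` and the number of families are likewise illustrative.

```python
import random
import statistics

def simulate_family(rng, p_boy=0.5):
    """Simulate births until the first son; return (boys, girls) for one family."""
    girls = 0
    while rng.random() >= p_boy:  # this birth is a girl, so the family continues
        girls += 1
    return 1, girls               # the family stops at its first son

rng = random.Random(42)  # fixed seed for a reproducible run
families = [simulate_family(rng) for _ in range(10_000)]

sizes = [boys + girls for boys, girls in families]
total_boys = sum(boys for boys, _ in families)
total_girls = sum(girls for _, girls in families)
average_size = statistics.mean(sizes)

print(f"Average number of children per family: {average_size:.2f}")
print(f"Boys per girl: {total_boys / total_girls:.2f}")
```

Students can record their conjectures first, then compare them with the simulated average family size and sex ratio. Under the assumptions above, probability theory gives an expected family size of two children and, on average, equal numbers of boys and girls, which many students find surprising.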


For more discussion of why to use data simulations, see Chance & Rossman (2006).