## PSY 2061 Assignment Minitab and Variability


For my last several posts, I’ve been writing about the problems associated with variability. First, I showed how variability is bad for customers. Next, I showed how variability is generally harder to control than the mean. In this post, I’ll show yet one more way that variability causes problems!

Variability can dramatically reduce your statistical power during hypothesis testing. Statistical power is the probability that a test will detect a difference (or effect) that actually exists.
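To make that definition concrete, here is a rough simulation sketch. It repeatedly draws two random samples from normal populations with means of 10 and 11, runs a pooled 2-sample t test on each pair, and reports the fraction of tests that detect the difference. The group size of 20, the standard deviations of 0.5, 1, and 2, and the function name `estimate_power` are all assumptions chosen for illustration; none of them come from the plots discussed in this post.

```python
import random
import statistics
from math import sqrt

N = 20          # observations per group (assumed for illustration)
T_CRIT = 2.024  # two-sided 5% critical value of t with 2*N - 2 = 38 degrees of freedom

def estimate_power(sd, mean_a=10.0, mean_b=11.0, trials=1000, seed=1):
    """Estimate power as the share of simulated studies that reject 'no difference'."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(trials):
        a = [rng.gauss(mean_a, sd) for _ in range(N)]
        b = [rng.gauss(mean_b, sd) for _ in range(N)]
        # With equal group sizes, the pooled variance is the average of the two variances.
        pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
        t = (statistics.mean(b) - statistics.mean(a)) / sqrt(pooled_var * 2 / N)
        if abs(t) > T_CRIT:
            detections += 1
    return detections / trials

for sd in (0.5, 1.0, 2.0):  # assumed low, medium, and high variability
    print(f"sd = {sd}: estimated power = {estimate_power(sd):.2f}")
```

The true difference between the means is the same in every run. Only the standard deviation changes, and the estimated power falls as it grows.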

It’s always a good practice to understand the variability present in your subject matter and how it affects your ability to draw conclusions. Even when you can’t reduce the variability, you can plan accordingly to ensure that your study has adequate power.

(As a bonus for readers of this blog, this post contains the information necessary to solve the mystery that I will pose in my first post of the new year!)

#### HOW VARIABILITY AFFECTS STATISTICAL POWER

Higher variability reduces your ability to detect statistical significance. But how?

The probability distribution plots below illustrate how this works. These three plots represent cases where we would use 2-sample t tests to determine whether the two populations have different means. These plots represent entire populations, so we *know* that the three pairs of populations are truly different. However, for statistical analysis, we almost always use *samples* from the population, and samples give us a fuzzier picture.

For random samples, increasing the sample size is like increasing the resolution of a picture of the populations. With just a few observations, the picture is so fuzzy that we’d only be able to see differences between the most distinct of populations. However, if we collect a very large sample, the picture becomes sharp enough to detect the difference between even very similar populations.
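One way to put a number on "resolution" is the standard error of the difference between two sample means, which for two groups of equal size n and equal standard deviation is sd × √(2/n). The short sketch below prints it for a few sample sizes. The standard deviation of 1 and the sample sizes are assumed values for illustration, and `se_of_difference` is a hypothetical helper name.

```python
from math import sqrt

def se_of_difference(sd, n):
    """Standard error of the difference between two sample means.

    Assumes both groups have the same standard deviation `sd`
    and the same number of observations `n`.
    """
    return sd * sqrt(2 / n)

for n in (5, 20, 80, 320):  # each step quadruples the sample size
    print(f"n = {n:3d} per group: standard error = {se_of_difference(1.0, n):.3f}")
```

Quadrupling the sample size halves the standard error, so the picture sharpens, but with diminishing returns for each added observation.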

Each plot below displays two populations that we are studying. In every plot, the two populations have means of 10 and 11, so the difference between the means is always 1. What changes from plot to plot is the standard deviation.

When the populations in these graphs are less visually distinct, we need higher resolution in our picture (i.e., a larger sample) to detect the difference.

For the high variability group, the two populations are virtually indistinguishable. It’s going to take a fairly high resolution (that is, a fairly large sample) to see this difference.
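For a rough sense of how much resolution each case demands, the sketch below uses the common normal-approximation formula for the sample size per group in a two-sided, two-sample comparison: n ≈ 2 × ((z for 1 − α/2 + z for power) × sd / difference)². The standard deviations of 0.5, 1, and 2, the 5% significance level, and the 80% power target are assumptions for illustration, not values read from the plots, and `n_per_group` is a hypothetical helper name. Because it uses the normal distribution rather than the t distribution, it slightly understates what a 2-sample t test needs, especially when the result is small.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sd, diff=1.0, alpha=0.05, power=0.80):
    """Approximate observations needed per group (normal approximation).

    sd    : common standard deviation of both populations (assumed)
    diff  : true difference between the population means
    alpha : two-sided significance level
    power : desired probability of detecting the difference
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / diff) ** 2)

for sd in (0.5, 1.0, 2.0):  # assumed low, medium, and high variability
    print(f"sd = {sd}: about {n_per_group(sd)} observations per group")
```

Because the standard deviation is squared in the formula, doubling the variability roughly quadruples the sample size needed to detect the same difference of 1.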