From our previous articles in this topic, you should remember that:
- The standard deviation $\sigma$ measures the typical distance of data values from the mean $\mu$.
- A larger $\sigma$ means the data are more spread out; a smaller $\sigma$ means the data are more tightly clustered around the mean.
Interpreting "Within 1 Standard Deviation"
- If a dataset has mean $\mu$ and standard deviation $\sigma$, then:
- Values within 1 standard deviation are in the interval $[\mu - \sigma, \mu + \sigma]$.
- Values more than 1 standard deviation away are outside that interval.
- This is a practical interpretation tool:
- $\mu \pm \sigma$ gives a range of values you can call "fairly typical" in many contexts.
- Values well beyond $\mu \pm 2\sigma$ are often "unusual", especially if the distribution is approximately normal.
This “typical range” idea works for any distribution, but the percentages you can attach to it (like 68%) only apply when the distribution is roughly normal.
- Do not interpret $\sigma$ as "the maximum distance from the mean".
- Standard deviation is a typical spread, not a boundary.
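The interval check described above can be sketched in a few lines of Python. The function name `within_k_sd` and the numbers ($\mu = 50$, $\sigma = 4$, and the sample values) are made up for illustration and do not come from any dataset in this article.

```python
def within_k_sd(x, mu, sigma, k=1):
    """Return True if x lies in the closed interval [mu - k*sigma, mu + k*sigma]."""
    return abs(x - mu) <= k * sigma


# Hypothetical mean and standard deviation, chosen only for illustration.
mu, sigma = 50.0, 4.0

for x in [47, 50, 54, 55, 61]:
    if within_k_sd(x, mu, sigma):
        label = "within 1 SD (fairly typical)"
    elif within_k_sd(x, mu, sigma, k=2):
        label = "between 1 and 2 SD from the mean"
    else:
        label = "more than 2 SD from the mean (often unusual)"
    print(x, label)
```

Note that a value such as 61 is perfectly possible here: nothing stops data from lying further than $\sigma$ (or even $2\sigma$) from the mean.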
Standard Deviation and the Normal Distribution
- Many natural measurements (such as the heights or masses of adults in a given population) are approximately normally distributed: symmetric, bell-shaped, with most values near the mean.
- When data follow a normal distribution, you can infer:
- About 68% of values lie within 1 standard deviation of the mean: $$\mu-\sigma \leq X \leq \mu+\sigma$$
- About 95% of values lie within 2 standard deviations of the mean: $$\mu-2\sigma \leq X \leq \mu+2\sigma$$
- About 99.7% of values lie within 3 standard deviations of the mean: $$\mu-3\sigma \leq X \leq \mu+3\sigma$$
Normal distribution
A symmetric distribution in which most values lie close to the mean and the frequencies tail off evenly in both directions. Its frequency graph is a bell-shaped curve.
These percentages are powerful because they let you make quick, reasonable judgments about what range contains "most" of the data.
- If a question tells you the data are normally distributed, immediately think "68-95-99.7".
- The rule is often enough on its own to interpret typical ranges or estimate probabilities without heavy calculation.
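To see where the rounded figures 68, 95 and 99.7 come from, the following sketch computes the probabilities for a normal distribution using the error function from Python's standard library. The helper name `prob_within` is ours; the identity $P(|X-\mu| \leq k\sigma) = \operatorname{erf}(k/\sqrt{2})$ is a standard result and does not depend on the particular values of $\mu$ and $\sigma$.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The rule of thumb simply rounds these values, which is why the percentages are stated as "about" 68%, 95% and 99.7%.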
Interpreting Mean and Standard Deviation
- A skier's run times (minutes) are recorded for $n=15$ runs, with:
- $\sum x = 129$
- $\sum x^2 = 1181$
- Mean: $$\mu = \frac{\sum x}{n} = \frac{129}{15} = 8.6$$
- Population standard deviation: $$\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{1181}{15} - (8.6)^2} \approx 2.18$$
- Interpretation if times are normally distributed: About 68% of run times lie between $8.6 - 2.18 = 6.42$ and $8.6 + 2.18 = 10.78$ minutes.
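The skier calculation can be checked with a short script. This is only a sketch that reuses the summary values given above ($n$, $\sum x$, $\sum x^2$); the variable names are our own, and the final interval is meaningful as a "68%" range only under the stated assumption that the times are normally distributed.

```python
import math

# Summary statistics from the skier example.
n = 15
sum_x = 129
sum_x2 = 1181

mean = sum_x / n                    # mu = (sum of x) / n
variance = sum_x2 / n - mean ** 2   # mean of squares minus square of mean
sd = math.sqrt(variance)            # population standard deviation

low, high = mean - sd, mean + sd    # the interval mu - sigma to mu + sigma

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"1-SD interval: {low:.2f} to {high:.2f} minutes")
# mean = 8.60, sd = 2.18
# 1-SD interval: 6.42 to 10.78 minutes
```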
Comparing Groups with Standard Deviation
- Standard deviation is especially meaningful when comparing two groups.
- Suppose two companies have starting salaries:
- Company A: mean €45000, standard deviation €5000
- Company B: mean €41000, standard deviation €8000
- Interpretation:
- Company A pays more on average (higher mean).
- Company B's salaries vary more (higher standard deviation), so an individual starting salary there is less predictable. Some employees may earn much more than the mean, but others may earn much less.
- So "which is better" depends on your preference:
- If you value predictability/consistency, you prefer the smaller standard deviation.
- If you are willing to take a risk for a chance at a higher-than-average salary, you might prefer the larger standard deviation.
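One way to make this comparison concrete is to compute the "fairly typical" range $\mu \pm \sigma$ for each company. The sketch below uses the means and standard deviations given above; the helper `typical_range` is a name we introduce for illustration, and no assumption is made about the shape of either salary distribution, so no percentages are attached to the ranges.

```python
def typical_range(mean, sd, k=1):
    """Return the interval (mean - k*sd, mean + k*sd)."""
    return mean - k * sd, mean + k * sd


# (mean, standard deviation) in euros, taken from the example above.
companies = {"A": (45000, 5000), "B": (41000, 8000)}

for name, (mean, sd) in companies.items():
    low, high = typical_range(mean, sd)
    print(f"Company {name}: typical range {low} to {high} euros")
# Company A: typical range 40000 to 50000 euros
# Company B: typical range 33000 to 49000 euros
```

On these figures, Company B's typical range reaches much lower than Company A's, while its upper end is slightly below Company A's upper end.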
Common Interpretation Statements
Given $\mu$ and $\sigma$, you should be able to write clear conclusions such as:
- "The data are tightly clustered around the mean, so results are consistent." (small $\sigma$)
- "There is a wide spread in the data, indicating high variability." (large $\sigma$)
- "If the distribution is normal, about 68% of values lie between $\mu-\sigma$ and $\mu+\sigma$."
- "A value of $x$ is about 2 standard deviations above the mean, so it is relatively unusual."
- Always interpret $\sigma$ in the original units (minutes, marks, euros, cm).
- A standard deviation of 12.5 marks means "a typical deviation from the mean is about 12.5 marks", not "12.5%".
- When a question says "normally distributed", think immediately of the $68-95-99.7$ rule and the interval $\mu \pm k \sigma$.
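To decide how many standard deviations a value lies from the mean (the $k$ in $\mu \pm k\sigma$), divide its distance from the mean by $\sigma$. The sketch below applies this to the skier data from earlier ($\mu = 8.6$, $\sigma \approx 2.18$); the run time of 13 minutes is a hypothetical value chosen for illustration, and the function name `sds_from_mean` is our own.

```python
def sds_from_mean(x, mu, sigma):
    """Signed number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


mu, sigma = 8.6, 2.18   # skier example: mean and standard deviation in minutes
x = 13.0                # hypothetical run time, not part of the original data

k = sds_from_mean(x, mu, sigma)
print(f"{x} minutes is {k:.2f} standard deviations from the mean")
# 13.0 minutes is 2.02 standard deviations from the mean
```

A result of about 2 supports a statement such as "this run time is about 2 standard deviations above the mean, so it is relatively unusual".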
- In one sentence, explain what standard deviation measures.
- If two classes have the same mean test score, what does the class with the larger standard deviation look like (in terms of spread and consistency)?
- For a normal distribution with mean 70 and standard deviation 5, estimate the interval containing about 95% of values.
- What summary statistics do you need to use $\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$?