A normal distribution is a continuous probability distribution that is symmetric and bell-shaped, characterized by its mean \$\mu\$ and standard deviation \$\sigma\$.
The normal distribution is one of the most important probability distributions in statistics because many natural phenomena and measurement errors approximately follow this pattern.
The normal distribution is defined by the probability density function (PDF):
\$\$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\$\$
where:
- \$\mu\$ is the mean (center of the distribution).
- \$\sigma\$ is the standard deviation (controls the spread of the distribution).
The normal distribution is completely determined by its mean and standard deviation. Changing \$\mu\$ shifts the curve along the horizontal axis, and changing \$\sigma\$ stretches or compresses it around the mean.
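To make the roles of the two parameters concrete, the density formula can be evaluated directly. The following is a minimal sketch using only Python's standard library; the function name `normal_pdf` and the sample parameter values are illustrative assumptions, not part of the original text.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Evaluate the normal density with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Peak height of the curve (the density at x = mu) for two illustrative parameter choices.
peak_narrow = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_wide = normal_pdf(20.0, mu=20.0, sigma=5.0)

print(peak_narrow, peak_wide)
```

Because the area under the curve must remain \$1\$, the larger \$\sigma\$ gives a peak that is five times lower: the curve is wider and flatter, while \$\mu\$ only determines where the peak sits.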
The standard normal distribution is a normal distribution with a mean of \$0\$ and a standard deviation of \$1\$.
The standard normal distribution is denoted by \$Z \sim N(0, 1)\$ and has the probability density function:
\$\$f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}\$\$
The standard normal distribution is used as a reference for all normal distributions. Any normal distribution can be transformed into the standard normal distribution using the z-score.
The z-score is a measure of how many standard deviations a data point is from the mean of the distribution.
The z-score is calculated as:
\$\$z = \frac{x - \mu}{\sigma}\$\$
where:
- \$x\$ is the data point.
- \$\mu\$ is the mean of the distribution.
- \$\sigma\$ is the standard deviation of the distribution.
The z-score allows us to compare data points from different normal distributions by expressing each one on the common scale of the standard normal distribution.
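As a small illustration of that comparison, the sketch below standardizes two values taken from different distributions. The helper name `z_score` and the numbers (a score of 85 from a distribution with mean 70 and standard deviation 10, and a score of 130 from one with mean 100 and standard deviation 15) are hypothetical and chosen only for the example.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical data points from two different normal distributions.
z_first = z_score(85.0, mu=70.0, sigma=10.0)
z_second = z_score(130.0, mu=100.0, sigma=15.0)

print(z_first, z_second)
```

Although the raw values are on different scales, the z-scores are directly comparable: under these assumed parameters the second value lies further above its own mean, measured in standard deviations, than the first.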
The cumulative distribution function (CDF) of a normal distribution gives the probability that a random variable \$X\$ is less than or equal to a certain value \$x\$:
\$\$P(X \leq x) = \int_{-\infty}^{x} f(t) \, dt\$\$
For the standard normal distribution, the CDF is denoted by \$\Phi(z)\$:
\$\$ \Phi(z) = P(Z \leq z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}} \, dt \$\$
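The CDF of the standard normal distribution is often used to find probabilities for any normal distribution by first converting to z-scores: if \$X\$ is normal with mean \$\mu\$ and standard deviation \$\sigma\$, then \$P(X \leq x) = \Phi\left(\frac{x - \mu}{\sigma}\right)\$.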
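The integral defining \$\Phi(z)\$ has no closed form in elementary functions, so in practice it is evaluated numerically or read from a table. One common route, sketched below, uses the standard identity \$\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right]\$ together with `math.erf` from Python's standard library. The identity is not stated in the text above, and the function names and sample values are assumptions made for illustration.

```python
import math

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for Z ~ N(0, 1), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal X, obtained by converting x to a z-score first."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return standard_normal_cdf((x - mu) / sigma)

# Probability that a standard normal value falls within one standard deviation of the mean.
p_within_one_sigma = standard_normal_cdf(1.0) - standard_normal_cdf(-1.0)

# Hypothetical example: P(X <= 110) when mu = 100 and sigma = 15.
p_below_110 = normal_cdf(110.0, mu=100.0, sigma=15.0)

print(p_within_one_sigma, p_below_110)
```

Probabilities for intervals follow by subtraction, \$P(a < X \leq b) = P(X \leq b) - P(X \leq a)\$, and upper-tail probabilities by the complement, \$P(X > x) = 1 - P(X \leq x)\$.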
1. A random variable \$X\$ follows a normal distribution with mean \$\mu = 20\$ and standard deviation \$\sigma = 5\$. Find the probability that \$X\$ is greater than \$25\$.
2. A random variable \$Y\$ follows a normal distribution with mean \$\mu = 0\$ and standard deviation \$\sigma = 1\$. Find the probability that \$Y\$ is between \$-1.5\$ and \$1.5\$.
How do we know that the normal distribution accurately models real-world phenomena? What are the limitations of using mathematical models to represent complex systems?