The *variance* is a measure of the spread or dispersion within a set of data. There are two types: the population variance, usually denoted by $\sigma^2$, and the sample variance, usually denoted by $s^2$.

The *population variance* is the variance of the whole population. To calculate the population variance, use the formula \[\sigma^2=\frac{1}{N}\sum\limits_{i=1}^N (x_i-\mu)^2\] where $N$ is the size of the population consisting of $x_1, x_2, \ldots, x_N$ and $\mu$ is the population mean.

Usually we only have a sample; the *sample variance* is the variance of this sample. Given a sample $x_1, x_2, \ldots, x_n$ of size $n$ with sample mean $\bar{x}$, the sample variance is calculated using \[s^2=\frac{1}{n-1}\sum\limits_{i=1}^n (x_i-\bar{x})^2 \text{.}\]

Make sure you know when each one applies. The population variance requires every value in the population, whereas the sample variance requires only a subset of it. For example, if we take ten words at random from this page to calculate the variance of their length, a sample variance would be needed. To find the population variance, the length of every word on the page would be needed.
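The distinction can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function names `population_variance` and `sample_variance`, the example sentence standing in for "this page", and the sample size of four are all assumptions made for the sketch.

```python
import random


def population_variance(values):
    """Variance of a whole population: divide the sum of squares by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Variance of a sample: divide the sum of squares by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical "page" of text, used only for illustration.
page = "the quick brown fox jumps over the lazy dog"
word_lengths = [len(word) for word in page.split()]

# Every word length is available, so the population formula applies.
print(population_variance(word_lengths))

# Four words drawn at random form a sample, so the sample formula applies.
rng = random.Random(0)  # fixed seed so the draw is repeatable
sample = rng.sample(word_lengths, 4)
print(sample_variance(sample))
```

The only difference between the two functions is the divisor, $N$ versus $n-1$, which mirrors the two formulas above.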

For a random variable $X$, the variance is defined as follows:

\[\mathrm{Var}[X] = \mathrm{E}[(X- \mathrm{E}[X])^2 ]\text{.}\]

However, this calculation can be time-consuming, as it involves calculating the difference between each possible value of $X$ and the mean (which is equal to $\mathrm{E}[X]$ and abbreviated as $\mu$), squaring this difference, and then finding the expected value of these squared differences.

If we expand the formula for the variance, we see \begin{align} \mathrm{Var}[X] &= \mathrm{E}[(X - \mathrm{E}[X])^2 ] \\ &= \mathrm{E}[X^2 - 2X \mathrm{E}[X] + (\mathrm{E}[X])^2] \\ &= \mathrm{E}[X^2] - 2 \mathrm{E}[X]\mathrm{E}[X] + ( \mathrm{E}[X])^2 \\ &= \mathrm{E}[X^2] - 2 (\mathrm{E}[X])^2 + ( \mathrm{E}[X])^2 \\ & = \mathrm{E}[X^2] - (\mathrm{E}[X])^2\text{,} \end{align} where the third line uses the linearity of expectation and the fact that $\mathrm{E}[X]$ is a constant.

So now we have an alternative formula, \[\mathrm{Var}[X] = \mathrm{E}[X^2]- (\mathrm{E}[X])^2\text{.}\]

Given a discrete random variable $X$ over a sample space $S$, we can calculate the variance in one of the following ways: \begin{align} \mathrm{Var}[X] &= \sum\limits_{x\in S} \mathrm{P}[X=x](x - \mu)^2\text{,} \\ \mathrm{Var}[X] &= \sum\limits_{x\in S} \{ \mathrm{P}[X=x]\cdot x^2 \} - \mu ^2\text{.} \end{align}
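As a quick check that the two discrete formulas agree, the sketch below evaluates both for a fair six-sided die. The die is an illustrative choice, not an example from the text, and the variable names are assumptions. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative choice).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: sum of P[X = x] * x over the sample space.
mu = sum(p * x for x, p in pmf.items())

# First formula: sum of P[X = x] * (x - mu)^2.
var_definition = sum(p * (x - mu) ** 2 for x, p in pmf.items())

# Second formula: sum of P[X = x] * x^2, minus mu^2.
var_shortcut = sum(p * x ** 2 for x, p in pmf.items()) - mu ** 2

print(mu, var_definition, var_shortcut)  # 7/2 35/12 35/12
```

Both routes give $\frac{35}{12}$, but the second needs only one pass over the squared values once $\mu$ is known.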

Given a continuous random variable $X$ over a sample space $S$ with probability density function $f(x)$, we can calculate the variance in one of the following ways: \begin{align*} \mathrm{Var}[X] &= \int\limits_{x\in S} f(x)\cdot (x - \mu)^2 \mathrm{d} x\text{,} \\ \mathrm{Var}[X] &= \int\limits_{x\in S} (f(x)\cdot x^2 )\mathrm{d} x - \mu ^2\text{.} \end{align*}

These results are obtained by combining the definition of variance with the formulae for the expected value for both cases.

**Note:** In the discrete case the mean $\mu = \displaystyle \sum\limits_{x\in S} \mathrm{P}[X=x] \cdot x$ whereas in the continuous case $\displaystyle \mu = \int\limits_{x\in S} f(x)\cdot x \mathrm{d} x$.
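The continuous formulas can be checked numerically in the same spirit. The sketch below assumes the illustrative density $f(x) = 2x$ on $[0, 1]$, which is not taken from the text, and approximates each integral with a simple midpoint rule. The helper `integrate` is a hypothetical name written for this example. For this density the exact values are $\mu = \frac{2}{3}$ and $\mathrm{Var}[X] = \frac{1}{18}$.

```python
def integrate(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))


def f(x):
    """Illustrative probability density on [0, 1]; it integrates to 1."""
    return 2 * x


a, b = 0.0, 1.0

# Mean: integral of f(x) * x.
mu = integrate(lambda x: f(x) * x, a, b)

# First formula: integral of f(x) * (x - mu)^2.
var_definition = integrate(lambda x: f(x) * (x - mu) ** 2, a, b)

# Second formula: integral of f(x) * x^2, minus mu^2.
var_shortcut = integrate(lambda x: f(x) * x ** 2, a, b) - mu ** 2

print(mu, var_definition, var_shortcut)
```

The two variance estimates should agree with each other and with $\frac{1}{18} \approx 0.0556$ up to the small error of the numerical integration.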

The *standard deviation*, often denoted by $\sigma$, is the positive square root of the variance. Data sets with a small standard deviation are tightly grouped around the mean, whereas a larger standard deviation indicates the data is more spread out.

The *population standard deviation* is the standard deviation of the entire population and is often denoted by $\sigma$. It is given by the formula \[\sigma = \sqrt{\frac{1}{N}\sum\limits_{i=1}^{N} (x_i - \mu)^2}\] where $N$ is the size of the population consisting of $x_1, x_2, \ldots, x_N$ and $\mu$ is the population mean.

The *sample standard deviation*, often represented by $s$, is calculated using the formula \[s= \sqrt{ \frac{1}{n-1} \sum\limits_{i=1}^n (x_i-\bar{x})^2}\] where $n$ is the number of observations obtained in the sample, $x_1, x_2, \ldots, x_n$ are the obtained observations and $\bar{x}$ is the sample mean. To understand why $\frac{1}{n-1}$ is used rather than $\frac{1}{n}$ see degrees of freedom.
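Python's standard `statistics` module provides both versions: `statistics.pstdev` uses the population formula and `statistics.stdev` uses the sample formula. The data below are illustrative values chosen for this sketch, not taken from the text.

```python
import statistics

# Illustrative data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population standard deviation: sqrt(32 / 8).
sigma = statistics.pstdev(data)

# Sample standard deviation: sqrt(32 / 7).
s = statistics.stdev(data)

print(sigma, s)
```

For the same data the sample standard deviation is always slightly larger than the population one, because the sum of squares is divided by $n-1$ rather than $n$.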

The lengths, in seconds, of the thirteen songs on an album are \[128, 219, 316, 189, 512, 98, 155, 110, 468, 177, 203, 73, 252\text{.}\] Calculate the standard deviation.

First calculate the mean. \begin{align} \mu &= \frac{1}{N}\sum\limits_{i=1}^Nx_i \\ &= \frac{1}{13}( 128+ 219+ 316+ 189+ 512+ 98+ 155+ 110+ 468+177 + 203 + 73 + 252) \\ &=\frac{1}{13} (2900) \\ &\approx 223.0769\text{.} \end{align}

Because we have the lengths of every song on the album, we calculate the population standard deviation. This is done using the formula \[\sigma = \sqrt{\frac{1}{N}\sum\limits_{i=1}^N (x_i-\mu)^2 } \text{.}\]

So the squared deviation of each value from the mean needs to be calculated.

\begin{align} (x_1-\mu)^2 &= (128-223.0769)^2 = (-95.0769)^2 = 9039.6169 \\ (x_2-\mu)^2 &=(219-223.0769)^2 = (-4.0769)^2 = 16.6211 \\ (x_3-\mu)^2 &=(316-223.0769)^2= (92.9231)^2 = 8634.7025 \\ (x_4-\mu)^2 &=(189-223.0769)^2= (-34.0769)^2 = 1161.2351 \\ (x_5-\mu)^2 &=(512-223.0769)^2= (288.9231)^2 = 83476.5577 \\ (x_6-\mu)^2 &=(98-223.0769)^2 = (-125.0769)^2 = 15644.2309 \\ (x_7-\mu)^2 &=(155-223.0769)^2 = (-68.0769)^2 = 4634.4643 \\ (x_8-\mu)^2 &= (110-223.0769)^2= (-113.0769)^2= 12786.3853 \\ (x_9-\mu)^2 &=(468-223.0769)^2 = (244.9231)^2 = 59987.3249 \\ (x_{10}-\mu)^2 &= (177-223.0769)^2 = (-46.0769)^2 =2123.0807 \\ (x_{11}-\mu)^2 &= (203-223.0769)^2 = (-20.0769)^2 = 403.0819\\ (x_{12}-\mu)^2 &= (73-223.0769)^2 = (-150.0769)^2 = 22523.0759 \\ (x_{13}-\mu)^2 &= (252-223.0769)^2 = (28.9231)^2 = 836.5457 \end{align}

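These squared deviations sum to approximately $221266.92$. So, by substituting into \[\sigma = \sqrt{\frac{1}{N}\sum\limits_{i=1}^N (x_i-\mu)^2}\] we obtain \[\sigma \approx \sqrt{\frac{221266.92}{13}} \approx 130.4628\text{.}\]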
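The hand calculation can be checked with a few lines of Python. This is a minimal sketch that follows the same steps (mean, squared deviations, square root of their average); the variable names are assumptions made for the example.

```python
import math

# Song lengths in seconds, as given in the example.
lengths = [128, 219, 316, 189, 512, 98, 155, 110, 468, 177, 203, 73, 252]

N = len(lengths)
mu = sum(lengths) / N

# Squared deviation of each song length from the mean.
squared_deviations = [(x - mu) ** 2 for x in lengths]

# Population standard deviation, since every song on the album is included.
sigma = math.sqrt(sum(squared_deviations) / N)

print(round(mu, 4), round(sigma, 4))
```

Because the code keeps full floating-point precision for $\mu$, its result may differ from the hand calculation in the last decimal place.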

Dr. Lee Fawcett calculates the standard deviation of a set of data.

This workbook produced by HELM is a good revision aid, containing key points for revision and many worked examples.

- Descriptive statistics including work on standard deviation and variance.