Three of the most famous words in all of mathematics:
mean (average), median (middle), and mode (most frequent)
all three have some claim to being the center of a data set
The Law of Large Numbers
Sampling distributions
Calculating the mean of grouped data
Understand that we need to make a distinction between the mean of a population,
which we refer to with the Greek letter \( \mu \), and the mean of a sample,
which we refer to with \( \bar{x} \), pronounced "x-bar".
2.6 Skewness and Mean, Median, and Mode
recommended:
Chapter 2 Practice (pp. 113-115): 49-68
Chapter 2 Homework (p. 123): 20
skewed distributions, left- and right-skewed distributions
the behavior of mean, median, and mode in different skewed distributions
2.7 Measures of the Spread of the Data
recommended:
Chapter 2 Practice (p. 115): 69-73
Chapter 2 Homework (pp. 123): 21-49
deviation is the difference of a data value from the mean,
\( x-\bar{x} \) or \( x-\mu \)
notice that the sum all of the deviations from the mean in a sample or in
a population will always be \( 0 \), hence the "average deviation" will
always be \( 0 \), why?
for this reason average deviation is not useful at all if we want to
measure the variability of data in a data set
we're going to need to be more creative to describe "variation" and
"variability"
variance is the average of the squares of the deviations
of a data set
except in the most unusual case, variance can never be \( 0 \)
standard deviation is the square root of average of the squares of the
deviations, or more briefly, the square root of the variance
standard deviation provides us with the tool that we are seeking to
measuring the variability of data in a data set.
For technical reasons which we will discuss later, again we make a distinction
between the standard deviation of a population, which we refer to with the Greek
letter \( \sigma \), and the standard deviation of a sample, which we refer to
with \( s \)
\( \displaystyle{\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_{i}-\mu)^{2}}{N}}}\),
population size: \( N \), often abbreviated: \( \displaystyle{\sigma = \sqrt{\frac{\sum(x-\mu)^{2}}{N}}}\)
\( \displaystyle{s = \sqrt{\frac{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}{n-1}}}\),
sample size: \( n \), often abbreviated: \( \displaystyle{s = \sqrt{\frac{\sum(x-\bar{x})^{2}}{n-1}}}\)
Sampling variability
Calculating the standard deviation of grouped data
We expect that, if our sample is representative, the sample mean and the sample
standard deviation will be "close" to the population mean and the population
standard deviation.
One more powerful use of the standard deviation:
two different data sets, with potentially different means and standard
deviations, can be hard to compare directly,
but by calculating the number of standard deviations (called the
\(z\)-score) of a given data value from the mean in its data set, we
can compare how data values from different data sets stand in relation
to their data sets.