1
Variance and Standard Deviation (3) Frequency Distributions

2
Standard Deviation
Standard deviation = $\sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$
Standard deviation can more conveniently be written
$s_x = \sqrt{\frac{\sum x_i^2 - n\bar{x}^2}{n}}$ or $s_x = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$
… this makes manual calculations much simpler
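As a quick check that the definition and the more convenient form agree, here is a minimal Python sketch. The function names and the sample data are made up for illustration and are not taken from the slides.

```python
import math

def sd_definition(values):
    """Standard deviation from the definition: root of the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_shortcut(values):
    """Standard deviation from the shortcut: root of (mean of squares minus square of mean)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)

# Hypothetical data set, chosen only so that the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(sd_definition(sample))
print(sd_shortcut(sample))
```

Both functions divide by n, matching the formulas on this slide, so they should print the same value apart from floating-point rounding.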

3
Frequency Distributions
Visits to the doctors: Mean = 47 / 20 = 2.35

4
Ruler Experiment

5
Ruler Experiment – Mean
Estimate of mean = total (based on mid-points) / total frequency

6
Frequency Distributions
Mean within Frequency Distributions
Within a frequency distribution, the mean is defined as $\bar{x} = \frac{\sum x_i f_i}{n}$, where $f_i$ means frequency and $n = \sum f_i$ is the total frequency.
Where data is provided in ranges, the $x_i$ values are the mid-points of the ranges. The result is an estimate of the mean, since it assumes that values are evenly distributed within each range.
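The mid-point estimate can be sketched in a few lines of Python. The class boundaries and frequencies below are hypothetical, since the slide's own table did not survive, and `grouped_mean` is an illustrative name.

```python
def grouped_mean(classes):
    """Estimate the mean of grouped data.

    classes is a list of (lower, upper, frequency) tuples; each class is
    represented by its mid-point, as described on the slide.
    """
    n = sum(freq for _, _, freq in classes)
    total = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return total / n

# Hypothetical grouped data: (lower bound, upper bound, frequency).
example = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]

print(grouped_mean(example))  # mid-points 5, 15, 25 give 140 / 10 = 14.0
```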

7
Standard Deviation with Frequency Distributions
Previously, we arrived at the formula: $s_x = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$
… now $\bar{x} = \frac{\sum x_i f_i}{n}$ ($f_i$ means frequency)
The $\sum x_i^2$ part can also be calculated from the tables

9
Standard Deviation with Frequency Distributions
Previously, we arrived at the formula: $s_x = \sqrt{\frac{\sum x_i^2}{n} - \bar{x}^2}$
… with a frequency distribution, it becomes $s_x = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2}$ ($f_i$ means frequency)
… where $\bar{x} = \frac{\sum x_i f_i}{n}$
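A minimal sketch of the frequency-distribution form of the formula, assuming the data arrives as a list of values (or mid-points) and a matching list of frequencies. The function name and the numbers are hypothetical.

```python
import math

def grouped_sd(values, freqs):
    """Standard deviation for a frequency table: sqrt(sum(x^2 f)/n - mean^2)."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    mean_of_squares = sum(x * x * f for x, f in zip(values, freqs)) / n
    return math.sqrt(mean_of_squares - mean ** 2)

# Hypothetical table: mid-points 5, 15, 25 with frequencies 3, 5, 2.
midpoints = [5, 15, 25]
frequencies = [3, 5, 2]

# sum(x^2 f) = 2450, n = 10, mean = 14, so the result is sqrt(245 - 196) = 7.0
print(grouped_sd(midpoints, frequencies))
```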

10
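$s_x = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2}$
Boys: $\sum x_i^2 f_i = 24662.5$, $\bar{x} = 13.6$, $n = 90$, so $s_x = \sqrt{\frac{24662.5}{90} - 13.6^2} = 9.44$
Girls: $\sum x_i^2 f_i = 22087.5$, $\bar{x} = 15.7$, $n = 70$, so $s_x = \sqrt{\frac{22087.5}{70} - 15.7^2} = 8.31$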
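The figures on this slide can be checked from the summary totals alone. The sketch below assumes the total 24662.5 belongs to the boys and 22087.5 to the girls, which is the pairing that reproduces the quoted standard deviations. `sd_from_totals` is an illustrative helper name.

```python
import math

def sd_from_totals(sum_x2f, mean, n):
    """Standard deviation from the total of x^2 * f, the mean and the total frequency."""
    return math.sqrt(sum_x2f / n - mean ** 2)

# Totals, means and sample sizes as quoted on the slide (pairing assumed as above).
boys_sd = sd_from_totals(24662.5, 13.6, 90)
girls_sd = sd_from_totals(22087.5, 15.7, 70)

print(round(boys_sd, 2))   # 9.44
print(round(girls_sd, 2))  # 8.31
```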

11
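$s_x = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2}$
Mean = 47 / 20 = 2.35
$s_x = \sqrt{\frac{\sum x_i^2 f_i}{20} - 2.35^2} = \sqrt{8.35 - 5.5225} = \sqrt{2.8275} = 1.68$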

12
The right average?
In a 5-person office:
The boss makes 50K
The 2 secretaries make 14K each
The sales rep makes 25K
The trainee sales rep gets 16K
The median pay is 16K (in order: 14, 14, 16, 25, 50)
The modal pay is 14K
The mean pay is 119K / 5 = 23.8K
… which represents the ‘best average’?
The boss says “on average you earn over 23K in my office”
The sales rep says “on average you only get 16K in my office”
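The three averages for the office can be reproduced with Python's standard `statistics` module. This is only a check of the arithmetic, using the salaries from the slide in thousands.

```python
import statistics

# Salaries in thousands: boss, two secretaries, sales rep, trainee sales rep.
pay = [50, 14, 14, 25, 16]

median_pay = statistics.median(pay)  # middle value of the sorted list
modal_pay = statistics.mode(pay)     # most common value
mean_pay = statistics.mean(pay)      # total divided by the number of people

print(median_pay, modal_pay, mean_pay)
```

The boss quotes the mean and the sales rep quotes the median. Both are correct calculations of different averages.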

13
Suppose this had been our experiment
Mean, median, spread?
A mean and standard deviation cannot be calculated, since not all data values are known

14
You can still estimate the median and inter-quartile ranges
90 boys tested, 70 girls tested
Median boy = 11 cm; median girl = 17 cm

15
You can still estimate the median and inter-quartile ranges
Boy IQR = 14 cm; Girl IQR = 14 cm
90 boys tested, 70 girls tested
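One way to estimate a median and inter-quartile range from grouped data is linear interpolation within the class that contains the target cumulative frequency. This is the numerical equivalent of reading values off a cumulative frequency graph drawn with straight lines. The sketch below assumes that approach. The table is hypothetical and `grouped_quantile` is an illustrative name, not the method used on the slides.

```python
def grouped_quantile(classes, p):
    """Estimate the value below which a proportion p of the data lies.

    classes is a list of (lower, upper, frequency) tuples in ascending order.
    Values are assumed to be spread evenly within each class.
    """
    n = sum(freq for _, _, freq in classes)
    target = p * n
    cumulative = 0
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Interpolate linearly inside the class that contains the target.
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    return classes[-1][1]

# Hypothetical table whose top class has an uncertain upper limit.
table = [(0, 10, 10), (10, 20, 20), (20, 30, 10)]
table_wider_top = [(0, 10, 10), (10, 20, 20), (20, 100, 10)]

for t in (table, table_wider_top):
    median = grouped_quantile(t, 0.5)
    iqr = grouped_quantile(t, 0.75) - grouped_quantile(t, 0.25)
    print(median, iqr)
```

In this example the quartiles fall below the top class, so changing its upper limit leaves the median and IQR untouched.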

16
Pros and Cons of different averages (mean and median) and measures of spread (inter-quartile range and standard deviation)
Median and inter-quartile range are unaffected by extreme values, and are therefore the most suitable measures when extreme values occur
Median and inter-quartile range can be calculated with some data missing (in the end ranges)
Mean and standard deviation include all values
Mean and standard deviation are more ‘sensitive’ measures: they provide a better picture of the whole data
You can therefore choose the values that bias the interpretation in your favour!
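The effect of an extreme value on each measure can be seen in a small experiment. The data below is made up for illustration. The quartiles use `statistics.quantiles` with its default method, which is one of several quartile conventions and may differ slightly from the textbook's.

```python
import statistics

def summary(data):
    """Return (mean, standard deviation, median, inter-quartile range)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # three cut points: Q1, Q2, Q3
    return (
        statistics.mean(data),
        statistics.pstdev(data),  # divides by n, like the formula in these slides
        statistics.median(data),
        q3 - q1,
    )

# Hypothetical data: identical except for one extreme value at the top.
ordinary = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 90]

print(summary(ordinary))
print(summary(with_extreme))
```

For this data, the mean and standard deviation move noticeably when the largest value changes, while the median and inter-quartile range stay the same.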

17
“There are three kinds of lies: lies, damned lies and statistics.” (Mark Twain)

18
Activity Page 29 of your Statistics 1 book. Read and make a memory map

19
IQR = 39 cm; Median = 58 cm

20
Median and IQR are unaffected by a change in the upper range
IQR = 39 cm; Median = 58 cm

21
Estimate of mean = 4940 / 80 = 61.75 sec

22
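$s_x = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2}$, with $\bar{x} = 61.75$ and $n = 80$
$s_x^2 = \frac{\sum x_i^2 f_i}{80} - 61.75^2 = 951.9$, so $s_x = \sqrt{951.9} \approx 30.85$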

23
Mean and SD are changed slightly by a change in the upper range
$s_x = \sqrt{\frac{\sum x_i^2 f_i}{n} - \bar{x}^2}$, with $n = 80$
Values before the change, shown in brackets: $\bar{x}$ = (61.75), $s_x$ calculation = (951.9)
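The sensitivity of the mean and standard deviation to the assumed upper limit of the top class can be illustrated with a short sketch. The table below is hypothetical, since the slide's own table did not survive, and `grouped_mean_sd` is an illustrative name.

```python
import math

def grouped_mean_sd(classes):
    """Estimate mean and standard deviation from (lower, upper, frequency) classes."""
    n = sum(freq for _, _, freq in classes)
    mids = [((lower + upper) / 2, freq) for lower, upper, freq in classes]
    mean = sum(x * f for x, f in mids) / n
    mean_of_squares = sum(x * x * f for x, f in mids) / n
    return mean, math.sqrt(mean_of_squares - mean ** 2)

# Hypothetical tables with 80 values; only the top class limit differs.
top_at_120 = [(0, 40, 20), (40, 80, 40), (80, 120, 20)]
top_at_140 = [(0, 40, 20), (40, 80, 40), (80, 140, 20)]

print(grouped_mean_sd(top_at_120))
print(grouped_mean_sd(top_at_140))
```

Moving the upper limit shifts the top mid-point, which feeds into both totals, so both estimates change.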
