# Estimating a Parameter: Which Population Values Are Plausible? {#param-estim}

> Key concepts: point estimate, interval estimate, confidence (level), precision, standard error, critical value, confidence interval.

Watch this micro lecture on estimation for an overview of the chapter.

```{r, echo=FALSE, out.width="640px", fig.pos='H', fig.align='center', dev="png", screenshot.opts = list(delay = 5)}
knitr::include_url("https://www.youtube.com/embed/4DGmVyQeqKc", height = "360px")
```

### Summary {-}

```{block2, type='rmdimportant'}
Given our sample, what are plausible population values?
```

In this chapter, we set out to make educated guesses of a population value (parameter, often called "the true value") based on our sample. This type of guessing is called _estimation_. Our first guess will be a single value for the population value. We merely guess that the population value is equal to the value of the sample statistic. This guess is the most precise guess that we can make, but, most likely, it is wrong.

Our second guess uses the sampling distribution to make a statement about the approximate population value. In essence, we calculate an interval that we are confident will contain the population value. We can increase our confidence by widening the interval, but this decreases the precision of our guess.

## Point Estimate

```{r point-estimates, fig.pos='H', fig.align='center', fig.cap="A sample from a population with unknown proportion of yellow candies.", eval=FALSE, echo=FALSE}
# A button allows to draw a sample of size 20 from a population with five colours and random proportion of yellow candies within the range .10 .40 ; the sample is shown as a dotplot (stacks of coloured dots) with the proportion of yellow as a number ; result: true population proportion, sample proportion, and student's proportion with current winner (closest to true population proportion) and a bar chart showing cumulative number of wins for sample proportion versus student's choice.

Figure \@ref(fig:point-estimates) generates random samples from a population of candies for which we do not know the proportion of yellow candies. It uses the sample proportion as estimate of the population proportion. Note that each sample is drawn from a different population.

1. Can you do better than the sample proportion? Draw a sample and guess the population proportion. See which estimate is better. Repeat this until you are convinced that you can or cannot do better than the sample proportion.
```

If we have to name one value for the population value, our best guess is the value of the sample statistic. For example, if 18% of the candies in our sample bag are yellow, our best guess for the proportion of yellow candies in the population of all candies from which this bag was filled, is .18. What other number can we give if we only have our sample? This type of guess is called a _point estimate_ and we use it a lot.

The sample statistic is the best estimate of the population value only if the sample statistic is an unbiased estimator of the population value. As we have learned in Section \@ref(unbiased-est), the true population value is equal to the mean of the sampling distribution for an unbiased estimator. The mean of the sampling distribution is the expected value for the sample.

In other words, an unbiased estimator neither systematically overestimates the population value, nor does it systematically underestimate the population value. With an unbiased estimator, then, there is no reason to prefer a value higher or lower than the sample value as our estimate of the population value.

Even though the value of the statistic in the sample is our best guess, it is very unlikely that our sample statistic is exactly equal to the population value (parameter). The recurrent theme in our discussion of random samples is that a random sample differs from the population because of chance during the sampling process. The precise population value is highly unlikely to actually appear in our sample.
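A small simulation can make both points concrete: the sample statistic is right on average, yet an individual sample usually misses. The sketch below is only an illustration. The population proportion (.20), the bag size (20 candies), and the number of simulated bags are assumptions chosen for the example, not values from this chapter.

```r
# Illustrative assumptions: population proportion and bag size are made up.
set.seed(123)
pop_prop  <- 0.20    # assumed true proportion of yellow candies
n_candies <- 20      # assumed number of candies per sample bag
n_samples <- 10000   # number of simulated sample bags

# Proportion of yellow candies in each simulated sample bag.
sample_props <- rbinom(n_samples, size = n_candies, prob = pop_prop) / n_candies

# Unbiased estimator: the mean of all sample proportions is close to .20.
mean_of_props <- mean(sample_props)

# Share of sample bags whose proportion equals the population value exactly.
share_exact <- mean(abs(sample_props - pop_prop) < 1e-9)

mean_of_props
share_exact
```

Under these assumptions, the average of the sample proportions is very close to the population proportion, while only a minority of the individual sample bags reproduce it exactly. For a continuous statistic such as average candy weight, an exact hit is even less likely.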

The sample statistic value is our best point estimate but it is nearly certain to be wrong. It may be slightly or far off the mark but it will hardly ever be spot on. For this reason, it is better to estimate a range within which the population value falls. Let us turn to this in the next section.

## Interval Estimate for the Sample Statistic

The sampling distribution of a continuous sample statistic tells us the probability of finding a range of scores for the sample statistic in a random sample. For example, the average weight of candies in a sample bag is a continuous random variable. The sampling distribution tells us the probability of drawing a sample with average candy weight between 2.0 and 3.6 grams. We can use this range as our _interval estimate_.

Note that we are reasoning from sampling distribution to sample now. This is not what we want to do in actual research, where we want to reason from sample to sampling distribution to population. We get to that in Section \@ref(ci-parameter). For now, assume that we know the true sampling distribution.

Remember that the average or expected value of a sampling distribution is equal to the population value if the estimator is unbiased. For example, the mean weight of yellow candies averaged over a very large number of samples is equal to the mean weight of yellow candies in the population. For an interval estimate, we now select the sample statistic values that are closest to the average of the sampling distribution.

Between which boundaries do we find the sample statistic values that are closest to the population value? Of course, we have to specify what we mean by "closest". Which part of all samples do we want to include? A popular proportion is 95%, so we want to know the boundary values that include 95% of all samples that are closest to the population value. For example, between which boundaries is average candy weight situated for 95% of all samples that are closest to the average candy weight in the population?

```{r ci-borders, fig.pos='H', fig.align='center', fig.cap="Within which interval do we find the sample results that are closest to the population value?", echo=FALSE, out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
# Graph a normal distribution with mean 2.8 and standard deviation equal to a random number between 0.05 and 0.2 ; x-axis with scale and labelled "Average candy weight"; add two vertical lines, one to the extreme left, one to the extreme right with their values on the x-axis displayed and the percentage of observations (area) between the two lines displayed (initially near 50%) ; slider moves right line and left line (in opposite directions) and adjusts percentage of area between lines ; note that the lines cannot be moved across the centre of the distribution
knitr::include_app("http://82.196.4.233:3838/apps/ci-borders/", height="290px")
```

Figure \@ref(fig:ci-borders) shows the sampling distribution of average sample candy weight.

<A name="question3.2.1"></A>
```{block2, type='rmdquestion'}
1. What is the average candy weight in the population of candies? [<img src="icons/2answer.png" width=115px align="right">](#answer3.2.1)
```

<A name="question3.2.2"></A>
```{block2, type='rmdquestion'}
2. Move the slider until you have found the interval containing 95% of all samples that are closest to the (true) population value. What are the upper and lower limits of the interval that contains these samples? [<img src="icons/2answer.png" width=115px align="right">](#answer3.2.2)
```

Say, for instance, that the 95% of all possible samples in the middle of the sampling distribution have an average candy weight ranging from 1.6 to 4.0 grams. The proportion .95 can be interpreted as a probability. Our sampling distribution tells us that we have 95% probability that the average candy weight lies between 1.6 and 4.0 grams in a random sample that we draw from this population.
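If we are willing to treat the sampling distribution as normal, R can find such boundaries for us. The sketch below assumes a mean of 2.8 grams and a standard error of 0.6 grams. The value 0.6 is an assumption chosen so that the result roughly matches the 1.6 to 4.0 example.

```r
# Assumed sampling distribution: normal, mean 2.8 grams, standard error 0.6 grams.
pop_mean  <- 2.8
std_error <- 0.6   # assumption, chosen to roughly match the 1.6 to 4.0 example

# Boundaries that cut off 2.5% in each tail, leaving the middle 95% of samples.
limits <- qnorm(c(0.025, 0.975), mean = pop_mean, sd = std_error)
round(limits, 1)

# Probability of drawing a sample with average candy weight between 1.6 and 4.0.
prob_inside <- pnorm(4.0, mean = pop_mean, sd = std_error) -
  pnorm(1.6, mean = pop_mean, sd = std_error)
prob_inside
```

`qnorm()` goes from a probability to a boundary value, and `pnorm()` goes from a boundary value back to a probability. With these assumed values, the rounded limits are 1.6 and 4.0 grams and the probability of falling between them is roughly .95.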

We now have boundary values, that is, a range of sample statistic values, and a probability of drawing a sample with a statistic falling within this range. The probability shows our _confidence_ in the estimate. It is called the _confidence level_ of an interval estimate.

### Answers {-}

<A name="answer3.2.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* A sample mean is an unbiased estimator of the population mean. As a
consequence, the mean of the sampling distribution is equal to the population
value, in this case the population mean.
* The (normal) sampling distribution is symmetrical, so the mean of the
sampling distribution is in the middle, exactly under the top of the sampling
distribution. Here, the value is 2.8 (gram). This is average candy weight in
the population. [<img src="icons/2question.png" width=161px align="right">](#question3.2.1)
```

<A name="answer3.2.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* The value of the upper (right) limit is 4.37. With this slider setting, the area in the center of the graph represents 95 per cent of all samples.
* To obtain the lower (left) limit, calculate the difference between the
upper limit (4.37) and the mean of the distribution (2.8): 4.37 - 2.8 = 1.57. Then subtract this difference from the distribution mean (2.8) to get the lower (left) limit of the interval: 2.8 - 1.57 = 1.23. This is the value where the vertical line on the left hits the horizontal axis in the figure.
* Tip: If the cursor is on the slider handle, you can change the slider value
in minimal steps with the left and right arrow keys on your keyboard. [<img src="icons/2question.png" width=161px align="right">](#question3.2.2)
```

## Precision, Standard Error, and Sample Size {#precisionsesamplesize}

The width of the estimated interval represents the _precision_ of our estimate. The wider the interval, the less precise our estimate. With a less precise interval estimate, we will have to take into account a wider variety of outcomes in our sample.

```{r interval-level, fig.pos='H', fig.align='center', fig.cap="How does the confidence level affect the precision of an interval estimate?", echo=FALSE, out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
# (as in ci-borders) Graph a normal distribution with mean 2.8 and standard deviation equal to a random number between 0.05 and 0.2 ; x-axis with scale and labelled "Average candy weight" ; vertical lines for boundaries  interval estimate ; show a double-pointed arrow on top of or above/under the x-axis representing the confidence interval for the initial confidence level (95%) and sample size (30); add a slider to adjust the confidence level (50%-100%?) ; update the vertical lines (boundaries ) and the arrow representing the precision of the confidence interval if the slider position changes.
knitr::include_app("http://82.196.4.233:3838/apps/interval-level/", height="310px")
```

<A name="question3.3.1"></A>
```{block2, type='rmdquestion'}
1. How does the precision (width) of the interval estimate (represented by the double-sided arrow) change if you change the confidence level? Check what happens if you change the confidence level slider in Figure \@ref(fig:interval-level). [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.1)
```

<A name="question3.3.2"></A>
```{block2, type='rmdquestion'}
2. What happens if we want to be 100% certain that our interval contains the average candy weight of the next sample that we will draw? [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.2)
```

If we want to predict something, we value precision. We would rather conclude that the average weight of candies in the next sample we draw is between 2.0 and 3.6 grams than between 1.6 and 4.0 grams. If we were satisfied with a very imprecise estimate, we would not need to do any research at all. With relatively little knowledge about the candies that we are investigating, we could straightaway predict that the average candy weight is between zero and ten grams. The goal of our research is to find a more precise estimate.

There are several ways to increase the precision of our interval estimate, that is, to obtain a narrower interval for our estimate. The easiest and least useful way is to decrease our confidence that our estimate is correct. If we lower the confidence that we are right, we can discard a large number of other possible sample statistic outcomes and focus on a narrower range of sample outcomes around the true population value.

This method is not useful because we sacrifice our confidence that the range includes the outcome in the sample that we are going to draw. What is the use of a more precise estimate if we are less certain that it predicts correctly? Therefore, we usually do not change the confidence level and leave it at 95% or thereabouts (90%, 99%). It is important to be quite sure that our prediction will be right.
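The trade-off between confidence and precision can be tabulated. The sketch below again assumes a normal sampling distribution with mean 2.8 grams and an illustrative standard error of 0.6 grams, and computes the interval for three common confidence levels.

```r
# Assumed sampling distribution: normal, mean 2.8 grams, standard error 0.6 grams.
pop_mean    <- 2.8
std_error   <- 0.6                   # illustrative assumption
conf_levels <- c(0.90, 0.95, 0.99)

# Probability left in each tail for every confidence level.
tail_probs <- (1 - conf_levels) / 2

lower <- qnorm(tail_probs, mean = pop_mean, sd = std_error)
upper <- qnorm(1 - tail_probs, mean = pop_mean, sd = std_error)

intervals <- data.frame(
  conf_level = conf_levels,
  lower      = round(lower, 2),
  upper      = round(upper, 2),
  width      = round(upper - lower, 2)
)
intervals
```

The `width` column grows with the confidence level: a higher confidence level always costs precision as long as the standard error stays the same.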

### Sample size {#sample-size}

A less practical but very useful method of narrowing the interval estimate is increasing sample size. If we buy a larger bag containing more candies, we get a better idea of average candy weight in the population and a better idea of the averages that we should expect in our sample.

```{r interval-size, fig.pos='H', fig.align='center', fig.cap="How does sample size affect the precision of an interval estimate?", echo=FALSE, out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
# same as interval-level but the slider adjusts sample size (N between 5 and 100, steps of 5) ; update the line/arrow representing the precision of the confidence interval if the slider position changes ; also update the normal curve and ensure that the scale of the x-axis remains the same, so it is clear that the sampling distribution becomes more peaked for larger samples.
knitr::include_app("http://82.196.4.233:3838/apps/interval-size/", height="310px")
```

Figure \@ref(fig:interval-size) shows a sampling distribution of average candy weight in candy sample bags. The size of the horizontal arrow represents the precision of the interval estimate: the shorter the arrow, the more precise the interval estimate.

<A name="question3.3.3"></A>
```{block2, type='rmdquestion'}
3. How does the precision of the interval estimate change if you change the size of the sample? Check what happens if you change the sample size slider. [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.3)
```

<A name="question3.3.4"></A>
```{block2, type='rmdquestion'}
4. How does the shape of the sampling distribution change if you change sample size? Explain what this means for the values of the sample statistic. [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.4)
```

As you may have noticed while playing with Figure \@ref(fig:interval-size), a larger sample yields a narrower, that is, more precise interval. You may have expected intuitively that larger samples give more precise estimates because they offer more information. This intuition is correct.

In a larger sample, an observation above the mean is more likely to be compensated by an observation below the mean. Just because there are more observations, it is less likely that we sample relatively high scores but no or considerably fewer scores that are relatively low.

The larger the sample, the more the distribution of scores for a variable in the sample will resemble the distribution of scores for this variable in the population. As a consequence, a sample statistic value will be closer to the population value for this statistic.

Larger samples resemble the population more closely, and therefore large samples drawn from the same population are more similar. The result is that the sample statistic values in the sampling distribution are less varied and more similar. They are more concentrated around the true population value. The middle 95% of all sample statistic values are closer to the centre, so the sampling distribution is more peaked.
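This can be checked with a simulation. The sketch below assumes that candy weights in the population are normally distributed with mean 2.8 grams and standard deviation 0.9 grams. Both the normal shape and the standard deviation are assumptions made for illustration. For each sample size, it draws many samples and looks up the boundaries of the middle 95% of the sample means.

```r
set.seed(2024)
pop_mean     <- 2.8             # average candy weight in the population
pop_sd       <- 0.9             # assumed spread of candy weights
sample_sizes <- c(5, 30, 100)   # candies per sample bag
n_samples    <- 5000            # simulated sample bags per sample size

# For each sample size: simulate sample means, keep the 2.5% and 97.5% quantiles.
middle_95 <- sapply(sample_sizes, function(n) {
  sample_means <- replicate(n_samples, mean(rnorm(n, mean = pop_mean, sd = pop_sd)))
  quantile(sample_means, probs = c(0.025, 0.975))
})
colnames(middle_95) <- paste0("n = ", sample_sizes)

# Width of the interval that holds the middle 95% of sample means.
interval_widths <- middle_95[2, ] - middle_95[1, ]

round(middle_95, 2)
round(interval_widths, 2)
```

The interval that holds the middle 95% of sample means becomes narrower as the sample bags get larger, which is the more peaked sampling distribution described above.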

### Standard error {#standard-error}

The concentration of sample statistic values, such as average candy weight in a sample bag, around the centre (mean) of the sampling distribution is expressed by the standard deviation of the sampling distribution. Up until now, we have only paid attention to the centre of the sampling distribution, its mean, because it is the expected value in a sample and it is equal to the population value if the estimator is unbiased.

Now, we start looking at the standard deviation of the sampling distribution as well, because it tells us how precise our interval estimate is going to be. The sampling distribution's standard deviation is so important that it has received a special name: the _standard error_.

```{r se-point-est, fig.pos='H', fig.align='center', fig.cap="How does sample size affect the standard error?", echo=FALSE, out.width="420px", screenshot.opts = list(delay = 5), dev="png"}
# Show how the standard error depends on sample size.
knitr::include_app("http://82.196.4.233:3838/apps/se-point-est/", height="310px")
```

<A name="question3.3.5"></A>
```{block2, type='rmdquestion'}
5. How does the standard error depend on the size of the sample? Play around with the sample size slider in Figure \@ref(fig:se-point-est) to find the answer. [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.5)
```

<A name="question3.3.6"></A>
```{block2, type='rmdquestion'}
6. How does the shape of the sampling distribution express the standard error? Can you explain why the standard error, which is the standard deviation of the sampling distribution, affects the shape of the sampling distribution in this way? [<img src="icons/2answer.png" width=115px align="right">](#answer3.3.6)
```

The word _error_ reminds us that the standard error represents the size of the error that we are likely to make (on average under many repetitions) if we use the value of the sample statistic as a point estimate for the population value.

Let us assume, for instance, that the standard error of the average weight of candies in samples is 0.6. Loosely stated, this means that the average difference between true average candy weight and average candy weight in a sample is 0.6 if we draw very many samples from the same population.
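The qualifier "loosely" matters. The standard error is a standard deviation, so strictly speaking it is the square root of the average squared error, which is somewhat larger than the plain average of the absolute errors. The sketch below illustrates the difference, assuming a normal sampling distribution with mean 2.8 and standard error 0.6.

```r
set.seed(42)
pop_mean  <- 2.8
std_error <- 0.6      # standard error from the example
n_samples <- 100000   # number of simulated samples

# Assumption: sample means follow a normal sampling distribution.
sample_means <- rnorm(n_samples, mean = pop_mean, sd = std_error)
errors <- sample_means - pop_mean   # how far each sample mean is off the mark

sd_of_means        <- sd(sample_means)       # standard deviation of the sample means
root_mean_sq_error <- sqrt(mean(errors^2))   # root of the average squared error
mean_abs_error     <- mean(abs(errors))      # plain average of the absolute errors

round(c(sd = sd_of_means, rmse = root_mean_sq_error, mae = mean_abs_error), 2)
```

Under these assumptions, the first two numbers are close to 0.6, whereas the average absolute error is closer to 0.48. Either way, the standard error tells us how large a miss we should typically expect.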

The smaller the standard error, the more the sample statistic values resemble the true population value, and the more precise our interval estimate is with a given confidence level, for instance, 95%. Because we like more precise interval estimates, we prefer small standard errors over large standard errors.

It is easy to obtain smaller standard errors: just increase sample size. See Figure \@ref(fig:interval-size), where larger samples yield more peaked sampling distributions. In a peaked distribution, values are closer to the mean and the standard error is smaller. In our example, average candy weights in larger sample bags are closer to the average candy weight in the population.

In practice, however, it is both time-consuming and expensive to draw a very large sample. Usually, we want to settle on the optimal size of the sample, namely a sample that is large enough to have interval estimates at the confidence level and precision that we need but as small as possible to save on time and expenses. We return to this matter in Chapter \@ref(power).

The standard error may also depend on other factors, such as the variation in population scores. In our example, more variation in the weight of candies in the population produces a larger standard error for average candy weight in a sample bag. If there are more very heavy candies and very light candies, it is easier to draw a sample with several heavy candies or with several very light candies. Average weight in these sample bags will be too high or too low. We cannot influence the variation in candy weights in the population, so let us ignore this factor influencing the standard error.
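For a sample mean from a simple random sample, both influences are captured by a standard result that this chapter does not spell out: the standard error equals the population standard deviation divided by the square root of the sample size. The helper function `standard_error_mean()` and all input values below are hypothetical and serve only to illustrate the pattern.

```r
# Hypothetical helper: standard error of a sample mean (simple random sample).
standard_error_mean <- function(pop_sd, n) {
  pop_sd / sqrt(n)
}

# Same population spread (assumed 0.9 grams), increasing sample size.
se_by_size <- standard_error_mean(pop_sd = 0.9, n = c(10, 40, 160))

# Same sample size (40 candies), increasing spread of candy weights.
se_by_sd <- standard_error_mean(pop_sd = c(0.5, 1.0, 2.0), n = 40)

round(se_by_size, 3)
round(se_by_sd, 3)
```

Note the square root: we need four times as many candies to cut the standard error in half, which is one reason why very precise estimates are expensive.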

### Answers {-}
<A name="answer3.3.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The lower our confidence level, the more sample outcomes we exclude, the
narrower the interval estimate: precision is higher.
* The higher our confidence level, the more sample outcomes we include, the
broader the interval estimate: precision is lower. [<img src="icons/2question.png" width=161px align="right">](#question3.3.1)
```

<A name="answer3.3.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* If we want to be 100% sure, we cannot rule out any possible average weight
for the sample that we will draw. As a consequence, the interval must contain all
possible average candy weights, starting just above zero (a candy must have
some weight) and ending at the highest weight that we think is possible (10
grams, 1 kilo, 100 kilos?). Of course, such an interval is completely useless
because it tells us that any meaningful value is possible. Well, we could have
figured that out without doing research. [<img src="icons/2question.png"
width=161px align="right">](#question3.3.2)
```

<A name="answer3.3.3"></A>
```{block2, type='rmdanswer'}
Answer to Question 3.

* Larger samples offer more information, so they allow a more precise estimate
of a population value, such as average candy weight. The precision of the
interval will increase, so the interval will become narrower, the arrow will
become shorter.
* The computational reason: A larger sample yields a smaller standard error
(se). See the next section. [<img src="icons/2question.png" width=161px align="right">](#question3.3.3)
```

<A name="answer3.3.4"></A>
```{block2, type='rmdanswer'}
Answer to Question 4.

* Remember that the sampling distribution represents the sample statistic
scores of all possible samples. The percentages, therefore, express the share
of samples with particular scores.
* If a larger part of all sample means are found near the (true) population
mean, which happens for larger samples, the sampling distribution becomes more
peaked/less flat. [<img src="icons/2question.png" width=161px align="right">](#question3.3.4)
```

<A name="answer3.3.5"></A>
```{block2, type='rmdanswer'}
Answer to Question 5.

* If we increase sample size, the standard error becomes smaller. Conversely, the smaller the sample, the larger the standard error. [<img src="icons/2question.png" width=161px align="right">](#question3.3.5)
```

<A name="answer3.3.6"></A>
```{block2, type='rmdanswer'}
Answer to Question 6.

* The larger the standard error, the f(l)atter the sampling distribution. In contrast, a smaller standard error yields a more peaked sampling distribution.
* A standard deviation is a measure of variation or spread. The larger a standard deviation, the more the scores in the distribution are spread out; the scores are on average further away from the mean.
* The standard error is the standard deviation of the sampling distribution. With a larger standard error, the sample means (the scores in the sampling distribution) are on average further away from the mean of the sampling distribution. As a result we get fatter tails and a flatter sampling distribution. [<img src="icons/2question.png" width=161px align="right">](#question3.3.6)
```

## Critical Values {#crit-values}

In the preceding section, we learned that the standard error is related to the precision of the interval estimate. A larger standard error yields a less precise estimate, that is, a wider interval estimate.

We are interested in the interval that includes a particular percentage of all samples that can be drawn, usually the 95% of all samples that are closest to the population value. In our current example, these are the 95% of all samples with an average candy weight that is closest to average candy weight in the population (2.8 grams).

In theoretical probability distributions like the normal distribution, the percentage of samples is related to the standard error. If we know the standard error, we know the interval within which we find the 95% of samples that are closest to the population value.

```{r crit-values, fig.pos='H', fig.align='center', fig.cap="Standardized sample outcomes and the standard error.", echo=FALSE, out.width="440px", screenshot.opts = list(delay = 5), dev="png"}
# same as interval-size but the slider adjusts the standard error (start value = 0.1, so one standard error aligns with 0.1 grams above or below average; slider range 0.05 and 0.15?) ; the vertical lines are fixed at 2.5% and 97.5% of the cumulative area under the normal curve ; first x-axis has fixed scale in grams (population mean is 2.8) ; add a second x-axis representing z scores with values -1.96, -1.0, 0, 1.0, 1.96 ; if the standard error is adjusted, the normal curve changes as well as the scale of the second x-axis and the two vertical lines at -1.96 and 1.96.
knitr::include_app("http://82.196.4.233:3838/apps/crit-values/", height="360px")
```

Figure \@ref(fig:crit-values) shows the sampling distribution of average candy weight per sample bag. It contains two horizontal axes, one with average candy weight in grams (bottom) and one with average candy weight in standard errors, also called _z_ scores (top).

<A name="question3.4.1"></A>
```{block2, type='rmdquestion'}
1. How do the two horizontal axes tell you the size of the standard error in grams? [<img src="icons/2answer.png" width=115px align="right">](#answer3.4.1)
```

<A name="question3.4.2"></A>
```{block2, type='rmdquestion'}
2. How do you expect the location of the vertical lines to change if you change the size of the standard error? Check your expectation by using the slider. [<img src="icons/2answer.png" width=115px align="right">](#answer3.4.2)
```

In Figure \@ref(fig:crit-values), we approximate the sampling distribution with a theoretical probability distribution, namely the normal distribution. The theoretical probability distribution links probabilities (areas under the curve) to sample statistic outcome values (scores on the horizontal axis). For example, we have 2.5% probability of drawing a sample bag with average candy weight below 1.2 grams and 2.5% probability of drawing a sample bag with average candy weight over 4.4 grams.

### Standardization and _z_ scores

The average candy weights that are associated with 2.5% and 97.5% probabilities in Figure \@ref(fig:crit-values) depend on the standard error, and therefore on the size of the sample that we draw. As you may notice while playing with Figure \@ref(fig:interval-size), changing the size of the sample also changes the average candy weights that mark the 2.5% and 97.5% probabilities.

We can simplify the situation if we _standardize_ the sampling distribution: Subtract the mean of the sampling distribution from each sample mean in this distribution, and divide the result by the standard error. Thus, we transform the sampling distribution into a distribution of standardized scores. The mean of the new standardized variable is always zero and its standard deviation is always one.
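The effect of standardization can be illustrated with two simulated sampling distributions that differ only in their standard error. The sketch below is a rough illustration: the helper `standardize_means()`, the standard errors 0.2 and 0.8, and the normal shape of the sampling distribution are assumptions.

```r
set.seed(7)
pop_mean <- 2.8   # mean of the sampling distribution (population mean)

# Hypothetical helper: simulate sample means and turn them into z scores.
standardize_means <- function(std_error, n_samples = 20000) {
  sample_means <- rnorm(n_samples, mean = pop_mean, sd = std_error)
  # Subtract the mean of the sampling distribution, divide by the standard error.
  (sample_means - pop_mean) / std_error
}

z_small_se <- standardize_means(std_error = 0.2)   # peaked sampling distribution
z_large_se <- standardize_means(std_error = 0.8)   # flat sampling distribution

summary_table <- data.frame(
  std_error = c(0.2, 0.8),
  mean_z    = c(mean(z_small_se), mean(z_large_se)),
  sd_z      = c(sd(z_small_se), sd(z_large_se))
)
round(summary_table, 2)
```

Whatever the standard error in grams, the standardized scores have a mean close to zero and a standard deviation close to one, so both sampling distributions end up on the same scale.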

If we use the normal distribution for standardized scores, which is called the _standard-normal distribution_ or _z distribution_, there is a single _z_ value that marks the boundary between the top 2.5% and the bottom 97.5% of all samples. This _z_ value is 1.96. If we combine this value with -1.96, separating the bottom 2.5% of all samples from the rest, we obtain an interval [-1.96, 1.96] containing the 95% of all samples that are closest to the mean of the sampling distribution.

In a standard-normal or _z_ distribution, 1.96 is called a _critical value_. Together with its negative (-1.96), it separates the 95% sample statistic outcomes that are closest to the parameter, hence that are most likely to appear, from the 5% that are furthest away and least likely to appear. There are also critical _z_ values for other probabilities, for instance, 1.64 for the middle 90% of all samples and 2.58 for the middle 99% in a standard-normal distribution.
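These critical values need not be looked up in a table. As a minimal sketch, the quantile function `qnorm()` of the standard-normal distribution returns them directly. The middle 95% leaves 2.5% in each tail, so the upper critical value is the _z_ score below which 97.5% of all samples fall.

```r
conf_levels <- c(0.90, 0.95, 0.99)

# Upper critical value: the z score that leaves half of (1 - confidence level)
# in the right-hand tail of the standard-normal distribution.
critical_values <- qnorm(1 - (1 - conf_levels) / 2)
names(critical_values) <- c("90%", "95%", "99%")

round(critical_values, 2)
```

Rounded to two decimals, this gives the values 1.64, 1.96, and 2.58 mentioned above. The lower critical values are simply the negatives of these numbers because the standard-normal distribution is symmetrical around zero.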

### Interval estimates from critical values and standard errors {#int-est-sample-mean}

Critical values in a theoretical probability distribution tell us the boundaries, or range, of the interval estimate expressed in standard errors. In a normal distribution, 95% of all sample means are situated no more than 1.96 standard errors from the population mean.

If the standard error is 0.5 and the population mean is 2.8 grams, we have 95% probability that the mean candy weight in a sample that we draw from this population lies between 1.82 grams (this is 1.96 times 0.5 subtracted from 2.8) and 3.78 grams.

Critical values make it easy to calculate an interval estimate if we know the standard error. Just take the population value and add the critical value times the standard error to obtain the upper limit of the interval estimate. Subtract the critical value times the standard error from the population value to obtain the lower limit.

```{block2, type='rmdimportant'}
- Lower limit of the interval estimate = population value -- critical value * standard error.
- Upper limit of the interval estimate = population value + critical value * standard error.
```
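
The two rules can be wrapped in a small R function. This is a minimal sketch: `interval_estimate()` is a hypothetical helper that we define here, not a function from R or from a package, and it assumes that the normal distribution is a good approximation of the sampling distribution.

```r
# Interval estimate around a centre value (here: the population value).
# The critical value is taken from the standard-normal distribution.
interval_estimate <- function(centre, se, level = 0.95) {
  critical_value <- qnorm(1 - (1 - level) / 2)
  c(lower = centre - critical_value * se,
    upper = centre + critical_value * se)
}

# The example from the text: population mean 2.8 grams, standard error 0.5.
round(interval_estimate(centre = 2.8, se = 0.5), 2)  # lower 1.82, upper 3.78
```

Note that `qnorm()` uses the unrounded critical value (1.959964 instead of 1.96), which makes no difference at two decimal places in this example.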

(Standard) normal distributions make life easier for us, because there is a fixed critical value for each probability, such as 1.96 for 95% probability, which is well worth memorizing.

### Answers {-}

<A name="answer3.4.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The axis at the bottom gives average candy weight in grams and the axis at the top shows average candy weight in standard errors from the population mean.
* One standard error equals the distance or difference between _z_ score zero (0) and _z_ score one (1). If we follow the lines down from these standard error scores, we obtain the mean average candy weight (2.8 grams) and the average weight one standard error above the mean (3.6 grams).
* The difference is 0.8 grams, so one standard error represents 0.8 grams (in average candy weight per sample bag). This is indeed the initial value on the slider. [<img src="icons/2question.png" width=161px align="right">](#question3.4.1)
```

<A name="answer3.4.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* If we increase the standard error, one standard error coincides with more
grams, so the sampling distribution grows wider; it becomes less peaked. The
variation in sample outcomes (average candy weight) will increase.
* The dotted lines representing plus or minus 1.00 and 1.96 will move away
from the centre but the centre will be fixed at the true average candy weight
in the population, in this example 2.8 grams. [<img src="icons/2question.png" width=161px align="right">](#question3.4.2)
```

## Confidence Interval for a Parameter {#ci-parameter}

Working through the preceding sections, it may have occurred to you that it is all very well to be able to estimate the value of a statistic in a new sample with a particular precision and probability, but that this is not what we are interested in. Instead, we want to estimate the value of the statistic in the population.

For example, we don't care much about the average weight of candies in our sample bag or in the next sample bag that we may buy. We want to say something about the average weight of candies in the population. How can we do this?

In addition, you may have realized that, if we know the sampling distribution, we also know the precise population value, for instance, average candy weight. After all, the average of the sampling distribution is equal to the population mean for an unbiased estimator. In the preceding paragraphs, we acted as if we knew the sampling distribution. If we know the sampling distribution, and it then follows that we also know the population value, why would we even care about estimating an interval?

Our problem is this: We want to estimate a population value using probabilities. For probabilities we need the sampling distribution but for the sampling distribution, we must know the population. A vicious circle.

```{r exactapproachfigure, eval=TRUE, echo=FALSE, out.width="300px", fig.pos='H', fig.align='center', fig.cap="Probabilities of a sample with a particular number of yellow candies if 20 per cent of the candies are yellow in the population."}
knitr::include_graphics("figures/exactapproach.png")
```

In the exact approach to the sampling distribution of the proportion of yellow candies in a sample bag (Figure \@ref(fig:exactapproachfigure)), for instance, we must know the proportion of yellow candies in the population. If we know the population proportion, we can calculate the exact probability of getting a sample bag with a particular proportion of yellow candies. But we don't know the population proportion of yellow candies; we want to estimate it.

```{r normal-param, eval=FALSE, echo=FALSE, fig.pos='H', fig.align='center', fig.cap="Which population statistics do we need to know if we want to use the standard normal distribution as sampling distribution?"}
# show an X axis labeled 'sample mean' with a scale from 0 to 5 grams ; add sample mean as labelled dot at 3.4 ; add a normal curve with population mean 2.8, a standard error 0.1 (equivalent to sample standard deviation of 1.0 and a sample size of 25) ; show checkboxes for sample size, sample mean, sample standard deviation, standard error, population mean, and population standard error with their (fixed) values ; select all checkboxes ; if the user deselects a checkbox, either nothing happens or a series of 5 normal curves is shown for different values of the statistic associated with the checkbox.
# - sample size: only needed if standard error is deselected
# - sample mean: not necessary
# - sample standard deviation: if population standard deviation or standard error is selected, the sample standard deviation is not needed ; if both standard deviations and the standard error are not selected, show normal curves with different standard deviations
# - standard error: not needed if sample size and sample or population standard deviation are selected ; otherwise, show normal curves with different standard deviations
# - population mean: if deselected, show normal curves for different values
# - population standard deviation: see sample standard deviation

![Example layout](figures/CI_2.png)

Figure \@ref(fig:normal-param) shows a standard normal distribution that we could use to approximate the sampling distribution of average candy weight in a sample (bag) of candies. This theoretical distribution can be used only if some of the statistics (see the check boxes) are known. Uncheck a box if you think this statistic is _not_ needed to use the standard normal distribution. If you are right, nothing happens. If you are wrong, you will see a series of normal curves for different values of the statistic. Then, check the statistic again and try another statistic.

1. Which mean is required for using the standard normal distribution? Which characteristic of the distribution is fixed by this statistic?

2. Do you have to know the standard deviation of the sample as well as that  of the population?

3. Which values do you have to know _only if_ you don't know the standard error? In other words, which statistics affect the standard error (you have learned about before)? Deselect the standard error and experiment with the other options.
```

A theoretical probability distribution can only be used as an approximation of a sampling distribution if we know some characteristics of the population. We know that the sampling distribution of sample means always has the bell shape of a normal or standard-normal (_z_) distribution or _t_ distribution. However, knowing the shape is not sufficient for using the theoretical distribution as an approximation of the sampling distribution.

We must also know the population mean because it specifies where the centre of the sampling distribution is located. So, we must know the population mean to use a theoretical probability distribution to estimate the population mean. This sounds like a problem that only Baron von M&uuml;nchhausen can solve. How can we drag ourselves by the hair out of this swamp?

By the way, we also need the standard error to know how peaked or flat the bell shape is. The standard error can usually be estimated from the data in our sample. But let us not worry about how the standard error is being estimated and focus on estimating the population mean.

### Imaginary population values  {#imag-pop-values}

How can we find plausible population means? Figure \@ref(fig:pop-means-ci) shows average candy weight in a random sample (lower scale). Click somewhere under the top axis to select a possible value for the population mean. The app will then display the interval of most plausible sample means if this were the true population value (green if it contains the actual sample mean, red otherwise). In addition, it calculates the _z_ value of the actual sample mean if the selected value were the true population value.

```{r pop-means-ci, fig.pos='H', fig.align='center', fig.cap="For which population means is our sample mean plausible?", echo=FALSE, out.width="630px", screenshot.opts = list(delay = 5), dev="png"}
# draw two horizontal lines, the top line labeled 'population' and the bottom line labeled 'sample', both lines with a numerical scale (0-5) ; generate a sample mean and standard error within a particular range, say 1-4 for the sample mean and the standard error in a range convenient to have interval estimates within the 0-5 or -1-6 range ; mark the sample mean on the lower line and show the value of the standard error somewhere in the app ; if the user clicks on the upper line, the corresponding number is shown as an (imaginary) population mean ; the 95% probability interval estimate for the sample mean is shown as a horizontal line segment on top/near the lower line (plus a light triangle starting at the population value) ; if the line segment overlaps with the sample mean, the selected population value is marked by a green dot and the line segment is green, they are red otherwise ; also show the z value of the sample mean for the chosen population mean ; at next click, remove old interval estimate (line segment plus triangle) but keep the previously selected population mean ; a 'Reset' button generates new values to restart the interaction (repeat the assignment with different values for the sample mean and standard error to emphasize that the z values remain the same)
knitr::include_app("http://82.196.4.233:3838/apps/pop-means-ci/", height="310px")
```

<A name="question3.5.1"></A>
```{block2, type='rmdquestion'}
1. Click repeatedly on Figure \@ref(fig:pop-means-ci) to find the highest and lowest value of the population mean for which the sample mean is in the interval of sample means that have 95% probability of occurring. [<img src="icons/2answer.png" width=115px align="right">](#answer3.5.1)
```

<A name="question3.5.2"></A>
```{block2, type='rmdquestion'}
2. If you find the highest or lowest value of the population mean for which the sample mean is in the interval of sample means that have 95% probability of occurring, a horizontal arrow appears with the label: 1.96 * SE.
What does this arrow represent and why does it carry this label? [<img src="icons/2answer.png" width=115px align="right">](#answer3.5.2)
```

<A name="question3.5.3"></A>
```{block2, type='rmdquestion'}
3. How does the _z_ value of the sample mean help you to minimize the number of clicks you need? [<img src="icons/2answer.png" width=115px align="right">](#answer3.5.3)
```

<A name="question3.5.4"></A>
```{block2, type='rmdquestion'}
4. How does the depicted interval estimate help you to minimize the number of clicks you need? [<img src="icons/2answer.png" width=115px align="right">](#answer3.5.4)
```

<A name="question3.5.5"></A>
```{block2, type='rmdquestion'}
5. What is the most efficient strategy (minimum number of clicks) to determine the lower and upper limits of the population means for which the sample mean is among the 95% most likely samples? Explain why this is the most efficient strategy. [<img src="icons/2answer.png" width=115px align="right">](#answer3.5.5)
```

How do we solve the M&uuml;nchhausen problem that we must know the population mean to estimate the population mean? A solution is that we select a lot of imaginary population means. We play a _What If?_-game for each imaginary population mean: What are probable sample results if this is the true population value? We calculate the interval within which the sample mean is expected to fall if the imaginary mean were the true population mean. We use a fixed confidence level, usually a probability of 95%.

As a next step, we check if the mean of the sample that we have actually drawn falls within this interval. If it does, we conclude that this (imaginary) population mean is not at odds with the sample that we have drawn. In contrast, if our sample mean falls outside the interval, we conclude that this population mean is not plausible because our sample is too unlikely to be drawn from a population with this mean.

In this way, we can find all population means that are _consistent_ with our sample. If the true population mean is any of these imaginary means, our actual sample mean is among the 95% of sample means that are most likely to be drawn.

While playing with Figure \@ref(fig:pop-means-ci), you may have noticed the _z_ values of the sample mean for the lowest and highest population means for which the sample mean is still within the interval. When you hit the lower bound of the population means, the sample mean has a _z_ value around 1.96 while it has a _z_ value around -1.96 for the highest population mean in the range.

It is not a coincidence that we find the critical values of the standard-normal distribution when we reach the minimum and maximum population means that are plausible. We are using the standard-normal distribution to approximate the sampling distribution of the sample mean. The critical _z_ value 1.96 marks the upper limit of the interval containing the 95% of all samples with means closest to the population mean and -1.96 marks the lower limit. A distance of 1.96 standard errors, then, is the maximum distance between a population mean and a sample mean that belongs to the 95% of sample means closest to the population mean.

As a consequence, we may simply calculate the range of plausible population values by adding 1.96 standard errors to the sample mean and subtracting 1.96 standard errors from it. This is much more efficient than selecting a lot of imaginary population means!

```{block2, type='rmdimportant'}
- Confidence interval lower limit = sample value -- critical value * standard error.
- Confidence interval upper limit = sample value + critical value * standard error.

For example, the 95% confidence interval based on a sample mean:

- Lower limit = sample mean - 1.96 * standard error.
- Upper limit = sample mean + 1.96 * standard error.
```

This can be illustrated with an example: If average candy weight in our sample is 2.8 grams and the standard error is 0.5, the lower and upper boundary for plausible population means are 1.82 grams (this is 2.8 minus 1.96 times 0.5) and 3.78 grams (2.8 plus 1.96 times 0.5).

Haven't we seen this calculation before? Yes, we have, in Section \@ref(int-est-sample-mean), where we estimated the interval for sample means. We now simply reverse the calculation, using the sample mean to estimate an interval of plausible population means instead of the other way around.
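
The _What If?_-game and the shortcut can be compared in a short R sketch. It uses the example values from this section (sample mean 2.8 grams, standard error 0.5) and a grid of imaginary population means from 0 to 6 grams in steps of 0.01; the grid and the variable names are our own choices for illustration.

```r
sample_mean <- 2.8      # average candy weight in our sample (grams)
se <- 0.5               # standard error
critical_value <- 1.96  # critical z value for 95%

# What If? game: try a lot of imaginary population means.
imaginary_means <- seq(0, 6, by = 0.01)

# z value of the actual sample mean under each imaginary population mean.
z <- (sample_mean - imaginary_means) / se

# An imaginary population mean is plausible if the actual sample mean is
# no more than 1.96 standard errors away from it.
plausible <- imaginary_means[abs(z) <= critical_value]
range(plausible)

# Shortcut: construct the interval around the sample mean.
shortcut <- sample_mean + c(-1, 1) * critical_value * se
shortcut
```

The lowest and highest plausible means found on the grid agree with the shortcut to within one grid step (0.01 grams). A finer grid brings the two closer together, at the cost of more calculations.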

### Confidence interval {#conf-interval}

The upper and lower bounds for the population means that are plausible constitute an interval estimate of the parameter. This interval is linked to a probability, for instance, 95%.

It is very important that we understand that this is NOT the probability that the parameter has a particular value, or that it falls within the interval. The parameter is _not_ a random variable because it is not affected by the random sample that we draw and therefore it does not have a probability.

The parameter has one value, which is either within or outside the interval that we have constructed. We just don't know. But we do know that our sample is more likely for population values within the interval.

We use the term _confidence_ instead of probability when we use this interval to estimate a parameter. The interval is called a _confidence interval_ and we usually add the confidence level, for instance, the 95% confidence interval (abbreviated: 95% CI). We are 95% confident that the parameter falls within the 95% confidence interval. If the 95% confidence interval for average candy weight ranges from 2.4 to 3.2 grams, we write in our report:

```{block2, type='rmdimportant'}
We are 95% confident that average candy weight in the population is between 2.4 and 3.2 grams.
```
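
One way to get a feeling for the word _confidence_ is a simulation in which we, for once, know the population. The R sketch below is an illustration under stated assumptions, not part of the candy data: candy weights are assumed to be normally distributed with a population mean of 2.8 grams and a population standard deviation of 0.5 grams, each sample bag contains 25 candies, and the standard error is treated as known.

```r
set.seed(2024)
pop_mean <- 2.8           # assumed true average candy weight (grams)
pop_sd <- 0.5             # assumed population standard deviation
n <- 25                   # candies per sample bag
se <- pop_sd / sqrt(n)    # standard error of the sample mean (0.1)

# Draw 10,000 sample bags and calculate the mean of each bag.
sample_means <- replicate(10000, mean(rnorm(n, mean = pop_mean, sd = pop_sd)))

# Construct a 95% confidence interval around each sample mean.
lower <- sample_means - 1.96 * se
upper <- sample_means + 1.96 * se

# The parameter is fixed; the intervals vary from sample to sample.
covered <- lower <= pop_mean & pop_mean <= upper
mean(covered)  # proportion of intervals containing the true mean, near 0.95
```

The population mean never changes in this simulation. What changes from bag to bag is the interval, and about 95% of the intervals contain the population mean.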

```{block2, type='rmdneyman'}
Jerzy Neyman introduced the concept of a confidence interval in 1937:

"In what follows, we shall consider in full detail the problem of estimation by interval. We shall show that it can be solved entirely on the ground of the theory of probability as adopted in this paper, without appealing to any new principles or measures of uncertainty in our judgements". [@RefWorks:3929: 347]

Photo of Jerzy Neyman by Ohonik, Commons Wikimedia, CC BY-SA 4.0
```

### Confidence intervals with bootstrapping {#bootstrap-confidenceinterval}

If we approximate the sampling distribution with a theoretical probability distribution such as the normal (_z_) or _t_ distribution, critical values and the standard error are used to calculate the confidence interval (see Section \@ref(imag-pop-values)).

There are theoretical probability distributions that do not work with a standard error, such as the _F_ distribution or chi-squared distribution. If we use those distributions to approximate the sampling distribution of a continuous sample statistic, for instance, the association between two categorical variables, we cannot use the formula for a confidence interval (Section \@ref(imag-pop-values)) because we do not have a standard error. We must use bootstrapping to obtain a confidence interval.

```{r bootstrap-ci, eval=FALSE, echo=FALSE, fig.pos='H', fig.align='center', fig.cap="How do we construct confidence intervals with bootstrapping?"}
# Adapt app bootstrapping: Show Initial Sample and Sampling Distribution, buttons Bootstrap 1000 samples and Draw new initial sample. Increase vertical size of sampling distribution but reduce range from 0 to 0.5.
# Add vertical lines for 95% confidence interval limits calculated from percentiles in the bootstrapped sampling distribution and limits calculated from the standard error (= standard deviation of the bootstrapped sampling distribution) times the critical value 1.96 (using a normal approximation).
# Finally, display the average of the bootstrapped sampling distribution as well as the true population proportion (0.2).

# In questions, point out the symmetry of the normal approximation to the confidence interval versus the asymmetry of the percentile-based confidence interval.
```

As you might remember from Section \@ref(boot-approx), we simulate a sampling distribution if we bootstrap a statistic, for instance median candy weight in a sample bag. We can use this sampling distribution to construct a confidence interval. For example, we take the values separating the bottom 2.5% and the top 2.5% of all samples in the bootstrapped sampling distribution as the lower and upper limits of the 95% confidence interval.
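
The percentile approach can be sketched in R as follows. The candy weights are simulated here (50 candies, normally distributed around 2.8 grams), so the data and the variable names are assumptions for illustration only; with real data, replace `candy_weights` by the observed weights.

```r
set.seed(42)
# Hypothetical initial sample: 50 candy weights in grams.
candy_weights <- rnorm(50, mean = 2.8, sd = 0.2)

# Bootstrap: draw 5,000 samples with replacement from the initial sample,
# each as large as the initial sample, and store the median of each.
boot_medians <- replicate(5000, median(sample(candy_weights, replace = TRUE)))

# The values separating the bottom 2.5% and the top 2.5% of the bootstrapped
# sampling distribution are the limits of the 95% confidence interval.
ci <- quantile(boot_medians, probs = c(0.025, 0.975))
ci
```

Because bootstrapping draws random samples, the limits change slightly if we use a different seed or a different number of bootstrap samples.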

It is also possible to construct the entire sampling distribution in exact approaches to the sampling distribution. Both the standard error and percentiles can be used to create confidence intervals. This can be very demanding in terms of computer time, so exact approaches to the sampling distribution usually only report _p_ values (see Section \@ref(pvalue)), not confidence intervals.

### Answers {-}

<A name="answer3.5.1"></A>
```{block2, type='rmdanswer'}
Answer to Question 1.

* The closer the sample mean is to the population mean, the higher the
probability of drawing a sample with more or less this mean. Population means
close to the sample mean will turn up as green dots and their intervals will
include the sample mean.
* The more we move away from the sample mean, the less likely we are to draw a
sample with a mean that differs at least this much from the population mean. At
some point, the population mean is too far away from the sample mean, so the
actual sample mean is no longer within the range of the 95% most likely sample
means to be drawn from a population with the selected population mean. [<img src="icons/2question.png" width=161px align="right">](#question3.5.1)
```

<A name="answer3.5.2"></A>
```{block2, type='rmdanswer'}
Answer to Question 2.

* The arrow represents the distance (difference) between the mean of the sample (black dot in the bottom of the figure) and the lowest or highest plausible population value given our sample mean (the green dot in the top of the figure).
* The label specifies this distance or difference: the critical value in a standard-normal distribution with a confidence level of 95\%, namely 1.96, multiplied by the standard error (SE). The 95\% most likely sample means are by definition no further away from the true population mean than the critical value times the standard error.
* The population mean is plausible because the actual sample mean is within this distance. [<img src="icons/2question.png" width=161px align="right">](#question3.5.2)
```

<A name="answer3.5.3"></A>
```{block2, type='rmdanswer'}
Answer to Question 3.

* A _z_ value expresses the distance between the sample mean and the population
mean in standard errors. In a standard-normal distribution, 95% of all
observations differ no more than 1.96 from the mean. The lowest and highest
population means for which the current sample mean is among the 95% most
plausible samples, then, are reached if the _z_ value of the sample mean is
(almost) 1.96 or -1.96.
* So the _z_ value tells us how close we are to the limit. [<img src="icons/2question.png" width=161px align="right">](#question3.5.3)
```

<A name="answer3.5.4"></A>
```{block2, type='rmdanswer'}
Answer to Question 4.

* The interval estimate shows the width of the interval between the lowest and
highest population average for which the current sample mean is among the 95%
most plausible samples.
* If the current sample mean is at the left limit of the interval, we reach
the highest population mean for which the sample mean is within the interval.
Increase the population mean and the interval will no longer include the
current sample mean. In a similar way, the lowest population mean for which
the interval includes the sample mean is found if the sample mean is on the
right limit of the interval. [<img src="icons/2question.png" width=161px align="right">](#question3.5.4)
```

<A name="answer3.5.5"></A>
```{block2, type='rmdanswer'}
Answer to Question 5.

Using the answer to Question 4, we may use the following steps to find the
lower and upper limits:

a. Click in the middle of the sample mean. This selects the population mean
that is equal to the sample mean.
b. Click on the left limit of the interval. This should be (very near to) the
lowest population mean, the interval of which still includes the sample mean.
c. Repeat Step a to display again the interval for the population mean that is
equal to the sample mean.
d. Click the right limit of this interval to find the highest population mean
such that the current sample mean is still among the 95% most plausible
samples.

* You may not hit the nail spot-on with every click but the procedure itself
is efficient. You can only do better by guessing the values of the two limits
directly and being very, very lucky.
* The important lesson here: Instead of constructing the interval around the
population mean, we can construct it around the sample mean to obtain the
range of population means that are plausible given this sample. We flip the
procedure! [<img src="icons/2question.png" width=161px align="right">](#question3.5.5)
```

## Confidence Intervals in SPSS {#SPSS-CI}

### Instruction

```{r SPSSconflevel, echo=FALSE, out.width="640px", fig.pos='H', fig.align='center', fig.cap="(ref:conflevelSPSS)", dev="png", screenshot.opts = list(delay = 5)}
knitr::include_url("https://www.youtube.com/embed/GoGrsHpfIWM", height = "360px")
# SPSS usually displays the 95% confidence interval automatically if you use the standard-normal (_z_) or _t_ distribution.
# * t tests on one or two means: 95% CI displayed. Adjust confidence level under options.
# * post-hoc tests with 1WAY-ANOVA: 95% CI by default, adjust significance level (under Post-Hoc) to change confidence level (1 - significance level)
# * post-hoc tests with 2WAY-ANOVA: 95% CI by default, adjust significance level (under Options) to change confidence level (1 - significance level)
# * t test on regression coefficient: ask for CI under Statistics, level can be adjusted there.
# Bootstrapping: always a confidence interval
# * example: median with Frequencies
# * correlation: only with bootstrapping
# No confidence intervals:
# * non-parametric tests (just categorical variables - irrelevant)
```

### Exercises

<A name="question3.6.1"></A>
```{block2, type='rmdquestion'}
1. Download the data set [candies.sav](http://82.196.4.233:3838/data/candies.sav) and use SPSS to calculate the 95% and 99% confidence intervals of average candy weight.

    Hint: Use the *Analyze > Compare Means > One-Sample T Test* command and leave the test value at zero. As an alternative, use *Analyze > Descriptive Statistics > Explore*.

    Interpret the results and explain why the 99% confidence interval is wider than the 95% confidence interval. [<img src="icons/2answer.png" width=115px align="right">](#answer3.6.1)
```

<A name="question3.6.2"></A>
```{block2, type='rmdquestion'}
2. Let SPSS calculate the 95% confidence interval for median candy weight. Interpret the result. The data are in "candies.sav".

    Remember the SPSS exercises in Section \@ref(boot-spss). [<img src="icons/2answer.png" width=115px align="right">](#answer3.6.2)
```

<A name="question3.6.3"></A>
```{block2, type='rmdquestion'}
3. Use SPSS to determine the 95% confidence interval for a paired-samples _t_ test on candy colour fading under sunlight (variables colour_pre and colour_post in "candies.sav"). In your interpretation of the confidence interval, clarify the meaning of the statistic for which the confidence interval was calculated.

    The paired-samples _t_ test is available in SPSS under *Analyze > Compare Means*. [<img src="icons/2answer.png" width=115px align="right">](#answer3.6.3)
```

<A name="question3.6.4"></A>
```{block2, type='rmdquestion'}
4. Use SPSS to determine if candy colourfulness after exposure to sunlight (colour_post) depends on candy weight and candy sweetness. Interpret the 95% confidence intervals for both effects.

    Hint: Use regression analysis, which is available under *Analyze > Regression > Linear* in SPSS. [<img src="icons/2answer.png" width=115px align="right">](#answer3.6.4)
```

### Answers {-}
<A name="answer3.6.1"></A>
```{block2, type='rmdanswer'}
Answer to Exercise 1.

SPSS syntax:

\* Check data.
FREQUENCIES VARIABLES=weight
  /FORMAT=NOTABLE
  /HISTOGRAM NORMAL
  /ORDER=ANALYSIS.
\* 95% CI.
T-TEST
  /TESTVAL=0
  /MISSING=ANALYSIS
  /VARIABLES=weight
  /CRITERIA=CI(.95).
\* 99% CI.
T-TEST
  /TESTVAL=0
  /MISSING=ANALYSIS
  /VARIABLES=weight
  /CRITERIA=CI(.99).

Check data:

There are no impossible values on variable weight.

Check assumptions:

This is a _t_ test, so sample size should be over 30 or the variable should be
normally distributed in the population. The size of this sample (_N_ = 50) is
well over 30, so we do not have to worry about the normal distribution of candy
weight in the population.

Interpret the results:

Average weight of candies in the population is between 2.79 and 2.88 grams with 95% confidence and between 2.77 and 2.90 grams with 99% confidence.
If we want to be more confident, we must allow for a broader range of values, so the confidence interval is wider. [<img src="icons/2question.png" width=161px align="right">](#question3.6.1)
```

<A name="answer3.6.2"></A>
```{block2, type='rmdanswer'}
Answer to Exercise 2.

SPSS syntax:

\* Check data.
FREQUENCIES VARIABLES=weight
  /ORDER=ANALYSIS.
\* Bootstrap on median candy weight.
BOOTSTRAP
  /SAMPLING METHOD=SIMPLE
  /VARIABLES INPUT=weight
  /CRITERIA CILEVEL=95 CITYPE=BCA  NSAMPLES=5000
  /MISSING USERMISSING=EXCLUDE.
FREQUENCIES VARIABLES=weight
  /FORMAT=NOTABLE
  /STATISTICS=MEDIAN
  /ORDER=ANALYSIS.

Check data:

There are no impossible values on the weight variable.

Check assumptions:

The measurement level of variable weight is OK.

Interpret the results:

Median candy weight in the sample is 2.81 grams. With 95% confidence, we expect median candy weight to be between 2.77 (or 2.78) and 2.92 (or 2.91) grams in the population of all candies.
The 95% interval borders can be slightly different because bootstrapping takes random samples. [<img src="icons/2question.png" width=161px align="right">](#question3.6.2)
```

<A name="answer3.6.3"></A>
```{block2, type='rmdanswer'}
Answer to Exercise 3.

SPSS syntax:

\* Check data.
FREQUENCIES VARIABLES=colour_pre colour_post
  /FORMAT=NOTABLE
  /HISTOGRAM NORMAL
  /ORDER=ANALYSIS.
\* Paired-samples t test.
T-TEST PAIRS=colour_pre WITH colour_post (PAIRED)
  /CRITERIA=CI(.9500)
  /MISSING=ANALYSIS.

Check data:

There are no impossible values on the variables colour_pre and colour_post.

Check assumptions:

Sample size (*N* = 50) is well over 30, so we do not have to worry about the normal distribution of the difference between colour_pre and colour_post in the population.

Interpret the results:

We are 95% confident that candy colourfulness in the population fades under sunlight by 1.59 to 2.17 points on average.  [<img src="icons/2question.png" width=161px align="right">](#question3.6.3)
```

<A name="answer3.6.4"></A>
```{block2, type='rmdanswer'}
Answer to Exercise 4.

SPSS syntax:

\* Check data: Assumption checks in Chapter 8.
\* Check for impossible values.
FREQUENCIES VARIABLES=weight sweetness colour_post
  /ORDER=ANALYSIS.
\* Regression of colour_post on weight and sweetness.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT colour_post
  /METHOD=ENTER weight sweetness.

Do not forget to ask for the confidence intervals under the _Statistics_ button in the regression dialog screen!

Check data:

There are no impossible values on the variables.

Check assumptions:

Assumptions are presented later (Chapter 8).

Interpret the results:

We are 95% confident that candy colourfulness decreases by 0.19 to 0.46 for each additional unit of sweetness in the population.
We are not confident about the direction of the effect of candy weight on colourfulness. This effect can be negative (up to a 1.15 decrease for each additional gram of candy weight with 95% confidence) or positive (up to a 2.28 increase for each additional gram of candy weight with 95% confidence). [<img src="icons/2question.png" width=161px align="right">](#question3.6.4)
```

## Test Your Understanding

```{r estimation, fig.pos='H', fig.align='center', fig.cap="Point and interval estimates, confidence intervals.", echo=FALSE, out.width="775px", screenshot.opts = list(delay = 5), dev="png"}
# Combination of apps interval-level, interval-size, and crit-values: A normal curve (M = 2.8, SE = 0.1 (SD population = 0.5 and sample size = 25)) as sampling distribution with two vertical lines marking interval limits (initially set at 2.5%/95%/2.5%) ; percentages indicating the area under the curve within and outside the limits ; double arrow from mean to interval limits indicating the interval estimate ; double x-axis: the first in grams and the second in standard errors, both axes labelled ; 2 sliders: confidence level and sample size ; any slider change will change the position of the interval limits (to represent the selected confidence level) ; changing confidence level also changes the percentages ; changing sample size also changes the scale of the x-axis in standard errors ; the value of the standard error is shown to highlight its relation with sample size.
knitr::include_app("http://82.196.4.233:3838/apps/estimation/", height="258px")
```

Figure \@ref(fig:estimation) shows the sampling distribution of average candy weight in a sample bag, which is a normal distribution.

<A name="question3.7.1"></A>
```{block2, type='rmdquestion'}
1. What is the most likely estimate for average candy weight in the population? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.1)
```

<A name="question3.7.2"></A>
```{block2, type='rmdquestion'}
2. The percentage in between the two vertical lines can be interpreted as a probability. A probability of what? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.2)
```

<A name="question3.7.3"></A>
```{block2, type='rmdquestion'}
3. The double arrow represents an interval of sample means, in this example, average candy weight. What happens if you change the confidence level? Explain why this makes sense. [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.3)
```

<A name="question3.7.4"></A>
```{block2, type='rmdquestion'}
4. What happens to the graph if you change sample size? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.4)
```

<A name="question3.7.5"></A>
```{block2, type='rmdquestion'}
5. What happens to the standard error if you change sample size? How are sample size and standard error linked? What characteristic of the sampling distribution is expressed by the standard error? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.5)
```

<A name="question3.7.6"></A>
```{block2, type='rmdquestion'}
6. The values of the interval limits---average candy weights in this example---on the scale in standard errors are called critical values. What happens to the critical values if you change sample size? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.6)
```

<A name="question3.7.7"></A>
```{block2, type='rmdquestion'}
7. What happens to the critical values if you change the confidence level? [<img src="icons/2answer.png" width=115px align="right">](#answer3.7.7)
```

### Answers {-}

```{block2, type='rmdanswer', echo=!ch3}
Answers to the Test Your Understanding questions will be shown in the web book when the last tutor group has discussed this chapter.
```

<A name="answer3.7.1"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 1.

* The average of a distribution of sample means is the expected value
(expectation) of the sample mean. It equals the population mean because a
sample mean is an unbiased estimator of the population mean.
* The mean of average candy weight over all samples is 2.8 grams in this
example. This is our best guess (point estimate) for average candy weight in
the population.
* The density (not the probability!) of the sampling distribution is at its
maximum at 2.8 grams. [<img src="icons/2question.png" width=161px align="right">](#question3.7.1)
```

<A name="answer3.7.2"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 2.

* The probability of drawing a sample (bag) with a sample mean (average candy
weight) between the value of the left (lower) limit and the right (upper)
limit. [<img src="icons/2question.png" width=161px align="right">](#question3.7.2)
```

<A name="answer3.7.3"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 3.

* Raising the confidence level increases the width of the interval because it
increases the area under the curve between the interval limits.
* This makes sense because the confidence level is the probability of drawing
a sample with a mean (sample statistic value) within the interval.
* If we want to have a higher probability, we must be more inclusive, that is,
we must allow for a greater range of sample means. [<img src="icons/2question.png" width=161px align="right">](#question3.7.3)
```

<A name="answer3.7.4"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 4.

* A larger sample size makes the sampling distribution more peaked, so the
interval containing the middle 95% of all samples becomes narrower.
* We have to expect less variation in sample means, so we have a more precise
estimate of the sample mean that we are very likely (95% probability) to draw. [<img src="icons/2question.png" width=161px align="right">](#question3.7.4)
```

<A name="answer3.7.5"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 5.

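* Increasing sample size decreases the standard error, so the two are
negatively related: the standard error of a mean is the population standard
deviation divided by the square root of the sample size. A larger sample size
creates a more peaked sampling distribution, which indicates lower variation
(sample means are closer to the mean of the sampling distribution).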
This makes sense: A larger sample contains more information, so its mean
should (on average) be closer to the true population mean (hence to the mean
of the sampling distribution).
* The standard error expresses the variation in the sampling distribution.
Actually, the standard error is just the standard deviation of the sampling
distribution. [<img src="icons/2question.png" width=161px align="right">](#question3.7.5)
```
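
The link between sample size and standard error can be checked with a few lines of R. This is a minimal sketch that assumes the settings of the figure (population standard deviation 0.5, sample size 25); the helper name `standard_error` and the two larger sample sizes are added here for illustration only.

```r
# Standard error of a sample mean: population SD divided by sqrt(sample size).
standard_error <- function(sd_population, n) {
  sd_population / sqrt(n)
}

sd_population <- 0.5      # population SD assumed in the figure
n <- c(25, 100, 400)      # 25 is the figure's sample size; 100 and 400 are illustrative
se <- standard_error(sd_population, n)

data.frame(sample_size = n, standard_error = se)
# standard errors: 0.1, 0.05, 0.025
```

Each time the sample size is multiplied by four, the standard error is halved, so precision improves more slowly than sample size grows.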

<A name="answer3.7.6"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 6.

* Nothing changes. Critical values are fixed values in terms of standard
errors if the confidence level (and the degrees of freedom that we will
discuss later) do not change.
* If this were a _t_ distribution, the critical values would decrease slightly
with larger sample size but this change is too small to be of practical
relevance, so it is ignored in this web book. [<img src="icons/2question.png" width=161px align="right">](#question3.7.6)
```

<A name="answer3.7.7"></A>
```{block2, type='rmdanswer', echo=ch3}
Answer to Question 7.

* Higher confidence levels yield more extreme critical values because
the interval must cover a larger proportion of all possible samples. [<img src="icons/2question.png" width=161px align="right">](#question3.7.7)
```
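
Critical values can be looked up directly in R. The sketch below is an illustration rather than part of the app: it uses `qnorm()` for the standard normal distribution and `qt()` for the _t_ distribution, and the confidence levels and degrees of freedom are chosen here as examples.

```r
# Two-sided critical value: the quantile that leaves (1 - level) / 2
# of the probability in each tail of the sampling distribution.
conf_level <- c(0.90, 0.95, 0.99)
z_crit <- qnorm(1 - (1 - conf_level) / 2)
round(z_crit, 2)
# 1.64 1.96 2.58: higher confidence levels give more extreme critical values

# For a t distribution, the 95% critical value also depends on the degrees
# of freedom (here: sample size minus one for sample sizes 25, 100, and 1000).
df <- c(24, 99, 999)
t_crit <- qt(0.975, df = df)
round(t_crit, 2)
# 2.06 1.98 1.96: the critical value approaches 1.96 as the sample grows
```

In the normal distribution, the critical values do not depend on sample size at all; only the confidence level moves them.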

## Take-Home Points

* If a sample statistic is an unbiased estimator, we can use it as a point estimate for the value of the statistic in the population.

* A point estimate may come close to the population value but it is almost certainly not correct.

* A 95% confidence interval is an interval estimate of the population value. We are 95% confident that the population value lies within this interval. Note that confidence is not a probability!

* A larger sample or a lower confidence level yields a narrower, that is, a more precise confidence interval.

* A larger sample yields a smaller standard error, which yields a more precise confidence interval because the limits of a 95% confidence interval fall one standard error times the critical value below and above the value of the sample statistic.
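
The last two points can be made concrete with a short R sketch. It assumes a normal sampling distribution and borrows the numbers of the figure in Test Your Understanding (mean 2.8 grams, population standard deviation 0.5, sample size 25), treating 2.8 as if it were an observed sample mean. The function name `confidence_interval` is hypothetical, introduced only for this illustration.

```r
# Limits of a confidence interval: estimate -/+ critical value * standard error.
confidence_interval <- function(estimate, se, level = 0.95) {
  crit <- qnorm(1 - (1 - level) / 2)   # critical value from the standard normal
  c(lower = estimate - crit * se, upper = estimate + crit * se)
}

sample_mean <- 2.8                      # assumed sample mean (grams)
se_25 <- 0.5 / sqrt(25)                 # standard error for n = 25
se_100 <- 0.5 / sqrt(100)               # standard error for n = 100

ci_95 <- confidence_interval(sample_mean, se_25)                 # baseline
ci_99 <- confidence_interval(sample_mean, se_25, level = 0.99)   # higher confidence
ci_95_large <- confidence_interval(sample_mean, se_100)          # larger sample

round(rbind(ci_95, ci_99, ci_95_large), 3)
```

Comparing the three rows shows both effects: the 99% interval is wider than the 95% interval, and the interval for the larger sample is narrower than the baseline.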