Is it true that as the size of a sample increases the mean of the distribution of sample means increases?

Sampling Distribution of the Mean

David M. Lane

Prerequisites

Introduction to Sampling Distributions, Variance Sum Law I

Learning Objectives

  1. State the mean and variance of the sampling distribution of the mean
  2. Compute the standard error of the mean
  3. State the central limit theorem

The sampling distribution of the mean was defined in the section introducing sampling distributions. This section reviews some important properties of the sampling distribution of the mean introduced in the demonstrations in this chapter.

Mean

The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean μ, then the mean of the sampling distribution of the mean is also μ. The symbol \(\mu_M\) is used to refer to the mean of the sampling distribution of the mean. Therefore, the formula for the mean of the sampling distribution of the mean can be written as:

\(\mu_M = \mu\)
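A quick simulation can make this concrete. The sketch below is only an illustration: it assumes a hypothetical uniform population on the interval from 0 to 32 (so that μ = 16), and the function name, number of replications, and seed are arbitrary choices rather than anything from the text. It uses only Python's standard library.

```python
import random
import statistics

def mean_of_sample_means(n, reps=5000, seed=0):
    """Average of `reps` sample means, each computed from n scores."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    means = [
        # Assumed population: uniform on [0, 32], so mu = 16.
        statistics.fmean(rng.uniform(0, 32) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.fmean(means)

# The mean of the sample means should stay near 16 for every sample size.
for n in (2, 10, 50):
    print(n, round(mean_of_sample_means(n), 2))
```

Each printed value should land close to 16 whatever the sample size: the mean of the sampling distribution does not drift upward as N grows.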

Variance

The variance of the sampling distribution of the mean is computed as follows:

\(\sigma_M^2 = \frac{\sigma^2}{N}\)

That is, the variance of the sampling distribution of the mean is the population variance divided by N, the sample size (the number of scores used to compute a mean). Thus, the larger the sample size, the smaller the variance of the sampling distribution of the mean.

(optional) This expression can be derived very easily from the variance sum law. Let's begin by computing the variance of the sampling distribution of the sum of three numbers sampled from a population with variance σ². The variance of the sum would be σ² + σ² + σ². For N numbers, the variance would be Nσ². Since the mean is 1/N times the sum, the variance of the sampling distribution of the mean would be 1/N² times the variance of the sum, which equals σ²/N.
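Written out step by step, the argument runs as follows. The symbols \(X_1, \dots, X_N\) for the individual scores are notation introduced here for illustration, and the first line assumes, as the variance sum law requires, that the scores are sampled independently.

\[
\begin{aligned}
\operatorname{Var}\left(\sum_{i=1}^{N} X_i\right) &= \sum_{i=1}^{N} \operatorname{Var}(X_i) = N\sigma^2 \\
\sigma_M^2 = \operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right) &= \frac{1}{N^2}\cdot N\sigma^2 = \frac{\sigma^2}{N}
\end{aligned}
\]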

The standard error of the mean is the standard deviation of the sampling distribution of the mean. It is therefore the square root of the variance of the sampling distribution of the mean and can be written as:

\(\sigma_M = \frac{\sigma}{\sqrt{N}}\)

The standard error is represented by a σ because it is a standard deviation. The subscript (M) indicates that the standard error in question is the standard error of the mean.
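As a small worked example, the standard error can be computed directly from the population standard deviation and the sample size. The function name and the numbers below (σ = 12, N = 9) are hypothetical, chosen only so that the arithmetic is easy to check by hand.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 12, samples of 9 scores.
print(standard_error(12, 9))  # 12 / 3 = 4.0
```

Quadrupling the sample size only halves the standard error, because N enters through its square root.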

Central Limit Theorem

The central limit theorem states that:

Given a population with a finite mean μ and a finite non-zero variance σ², the sampling distribution of the mean approaches a normal distribution with a mean of μ and a variance of σ²/N as N, the sample size, increases.

The expressions for the mean and variance of the sampling distribution of the mean are not new or remarkable. What is remarkable is that regardless of the shape of the parent population, the sampling distribution of the mean approaches a normal distribution as N increases. If you have used the "Central Limit Theorem Demo," you have already seen this for yourself. As a reminder, Figure 1 shows the results of the simulation for N = 2 and N = 10. The parent population was a uniform distribution. You can see that the distribution for N = 2 is far from a normal distribution. Nonetheless, it does show that the scores are denser in the middle than in the tails. For N = 10 the distribution is quite close to a normal distribution. Notice that the means of the two distributions are the same, but that the spread of the distribution for N = 10 is smaller.


Figure 1. A simulation of a sampling distribution. The parent population is uniform. The blue line under "16" indicates that 16 is the mean. The red line extends from one standard deviation below the mean to one standard deviation above it.
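A simulation along these lines can be sketched in a few lines of Python. The population below (uniform on 0 to 32, which gives a mean of 16) is an assumption made for illustration, as are the function names and the number of replications. Excess kurtosis is used here as one rough numerical gauge of shape: it is 0 for a normal distribution and -1.2 for a uniform one.

```python
import random
import statistics

def sample_means(n, reps=20000, seed=1):
    """Draw `reps` samples of size n and return their means."""
    rng = random.Random(seed)
    # Assumed parent population: uniform on [0, 32].
    return [
        statistics.fmean(rng.uniform(0, 32) for _ in range(n))
        for _ in range(reps)
    ]

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m = statistics.fmean(xs)
    var = statistics.fmean((x - m) ** 2 for x in xs)
    fourth = statistics.fmean((x - m) ** 4 for x in xs)
    return fourth / var ** 2 - 3.0

for n in (2, 10):
    means = sample_means(n)
    print(
        n,
        round(statistics.fmean(means), 2),    # stays near 16
        round(statistics.pstdev(means), 2),   # shrinks as n grows
        round(excess_kurtosis(means), 2),     # moves toward 0 as n grows
    )
```

The means of the two simulated distributions should agree closely, while the spread for N = 10 is smaller and the excess kurtosis sits nearer to 0 than it does for N = 2.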

Figure 2 shows how closely the sampling distribution of the mean approximates a normal distribution even when the parent population is very non-normal. If you look closely you can see that the sampling distributions do have a slight positive skew. The larger the sample size, the closer the sampling distribution of the mean would be to a normal distribution.


Figure 2. A simulation of a sampling distribution. The parent population is very non-normal.

Examples of the Central Limit Theorem

    Law of Large Numbers

    The law of large numbers says that if you take samples of larger and larger size from any population with a finite mean, then the sample mean, \(\overline x\), tends to get closer and closer to the true population mean, \(\mu\). (The mean of the sampling distribution, \(\mu_{\overline x}\), does not change with sample size: it equals \(\mu\) for every \(n\).) From the Central Limit Theorem, we know that as \(n\) gets larger and larger, the sample means approximately follow a normal distribution. The larger \(n\) gets, the smaller the standard deviation of the sampling distribution gets. (Remember that the standard deviation for the sampling distribution of \(\overline X\) is \(\frac{\sigma}{\sqrt{n}}\).) This means that the sample mean \(\overline x\) is likely to be closer to the population mean \(\mu\) as \(n\) increases. We can say that \(\mu\) is the value that the sample means approach as \(n\) gets larger. The Central Limit Theorem illustrates the law of large numbers.
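The law of large numbers can be watched directly by tracking the mean of a single sample as it grows. The sketch below assumes a hypothetical exponential population with mean 1; the function name, checkpoints, and seed are illustrative choices only.

```python
import random

def running_means(checkpoints, seed=2):
    """Mean of one growing sample, recorded at the given sample sizes."""
    rng = random.Random(seed)
    total = 0.0
    recorded = {}
    for i in range(1, max(checkpoints) + 1):
        total += rng.expovariate(1.0)  # assumed population: exponential, mean 1
        if i in checkpoints:
            recorded[i] = total / i
    return recorded

for size, mean in running_means((10, 1000, 100000)).items():
    print(size, round(mean, 3))
```

The early values can wander noticeably, but the mean of the largest sample should sit very close to the population mean of 1.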

    This concept is so important, and plays such a critical role in what follows, that it deserves to be developed further. Indeed, two critical issues flow from the Central Limit Theorem and the application of the Law of Large Numbers to it. These are:

    1. The sampling distribution of means is approximately normal for sufficiently large samples, regardless of the underlying distribution of the population observations, and
    2. the standard deviation of the sampling distribution decreases as the size of the samples used to calculate the means increases.

    Taking these in order, it would seem counterintuitive that the population may have any distribution and yet the distribution of means coming from it would be approximately normal. With the use of computers, experiments can be simulated that show the process by which the sampling distribution changes as the sample size is increased. These simulations show visually the results of the mathematical proof of the Central Limit Theorem.

    Here are three examples of very different population distributions and the evolution of the sampling distribution to a normal distribution as the sample size increases. The top panel in these cases represents the histogram for the original data. The three panels show the histograms for 1,000 randomly drawn samples for different sample sizes: \(n=10\), \(n= 25\) and \(n=50\). As the sample size increases, and the number of samples taken remains constant, the distribution of the 1,000 sample means becomes closer to the smooth line that represents the normal distribution.

    Figure \(\PageIndex{3}\) is for a normal distribution of individual observations and we would expect the sampling distribution to converge on the normal quickly. The results show this and show that even at a very small sample size the distribution is close to the normal distribution.

    Figure \(\PageIndex{3}\)

    Figure \(\PageIndex{4}\) shows a uniform distribution, for which the sampling distribution, somewhat surprisingly, approaches the normal distribution quickly, even with a sample size of only 10.


    Figure \(\PageIndex{4}\)

    Figure \(\PageIndex{5}\) is a skewed distribution. This last one could be an exponential, geometric, or binomial with a small probability of success creating the skew in the distribution. For skewed distributions our intuition would say that this will take larger sample sizes to move to a normal distribution and indeed that is what we observe from the simulation. Nevertheless, at a sample size of 50, not considered a very large sample, the distribution of sample means has very decidedly gained the shape of the normal distribution.

    Figure \(\PageIndex{5}\)
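The slower convergence for a skewed population can be checked numerically. The sketch below assumes an exponential population (one of the candidates named above) and measures the skewness of the simulated sample means at the three sample sizes used in the figures; the function name, the 10,000 replications, and the seed are illustrative assumptions. For an exponential population the skewness of the sample mean is known to be \(2/\sqrt{n}\), so it shrinks toward the normal value of 0 only gradually.

```python
import random
import statistics

def skewness_of_sample_means(n, reps=10000, seed=3):
    """Skewness of `reps` sample means, each from n exponential scores."""
    rng = random.Random(seed)
    means = [
        # Assumed population: exponential with mean 1 (strong positive skew).
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    m = statistics.fmean(means)
    sd = statistics.pstdev(means)
    # Third standardized moment: 0 for a symmetric distribution.
    return statistics.fmean(((x - m) / sd) ** 3 for x in means)

for n in (10, 25, 50):
    print(n, round(skewness_of_sample_means(n), 2))
```

The printed skewness should fall as n goes from 10 to 25 to 50 while remaining positive, which matches the slight residual skew that is typical of such simulations.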

    The Central Limit Theorem provides more than the proof that the sampling distribution of means is normally distributed. It also provides us with the mean and standard deviation of this distribution. Further, as discussed above, the expected value of the mean, \(\mu_{\overline{x}}\), is equal to the mean of the population of the original data which is what we are interested in estimating from the sample we took. We have already inserted this conclusion of the Central Limit Theorem into the formula we use for standardizing from the sampling distribution to the standard normal distribution. And finally, the Central Limit Theorem has also provided the standard deviation of the sampling distribution, \(\sigma_{\overline{x}}=\frac{\sigma}{\sqrt{n}}\), and this is critical to have to calculate probabilities of values of the new random variable, \(\overline x\).

    Figure \(\PageIndex{6}\) shows a sampling distribution. The mean has been marked on the horizontal axis of the \(\overline X\)'s and the standard deviation has been written to the right above the distribution. Notice that the standard deviation of the sampling distribution is the original standard deviation of the population, divided by the square root of the sample size. We have already seen that as the sample size increases the sampling distribution becomes closer and closer to the normal distribution. As this happens, the standard deviation of the sampling distribution changes in another way; the standard deviation decreases as \(n\) increases. At very large \(n\), the standard deviation of the sampling distribution becomes very small, and in the limit as \(n\) goes to infinity the distribution collapses on top of the population mean. This is consistent with the fact that the mean of the sampling distribution, \(\mu_{\overline{x}}\), is the population mean, \(\mu\).

    Figure \(\PageIndex{6}\)

    At non-extreme values of \(n\), this relationship between the standard deviation of the sampling distribution and the sample size plays a very important part in our ability to estimate the parameters we are interested in.

    Figure \(\PageIndex{7}\) shows three sampling distributions. The only change that was made is the sample size that was used to get the sample means for each distribution. As the sample size \(n\) increases from 10 to 30 to 50, the standard deviations of the respective sampling distributions decrease because the square root of the sample size is in the denominator of the standard deviations of the sampling distributions.

    Figure \(\PageIndex{7}\)
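To put numbers on this, take a hypothetical population standard deviation of \(\sigma = 10\) (a value assumed here purely for illustration; the figure does not state one). The three standard deviations of the sampling distributions would then be approximately:

\[
\sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}}: \qquad \frac{10}{\sqrt{10}} \approx 3.16, \quad \frac{10}{\sqrt{30}} \approx 1.83, \quad \frac{10}{\sqrt{50}} \approx 1.41
\]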

    The implications of this are very important. Figure \(\PageIndex{8}\) shows the effect of the sample size on the confidence we will have in our estimates. These are two sampling distributions from the same population. One sampling distribution was created with samples of size 10 and the other with samples of size 50. All other things constant, the sampling distribution with sample size 50 has a smaller standard deviation, which causes the graph to be higher and narrower. The important effect of this is that, for the same probability of falling within one standard deviation of the mean, this distribution covers a much narrower range of possible values than the other distribution. One standard deviation is marked on the \(\overline X\) axis for each distribution. This is shown by the two arrows that are plus or minus one standard deviation for each distribution. For the sampling distribution with the smaller sample size, the range of values within one standard deviation of the mean is much greater. A simple question is, would you rather have a sample mean from the narrow, tight distribution, or the flat, wide distribution as the estimate of the population mean? Your answer tells us why people intuitively will always choose data from a large sample rather than a small sample. The sample mean they are getting is coming from a more compact distribution. This concept will be the foundation for what will be called level of confidence in the next unit.

    Figure \(\PageIndex{8}\)
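The comparison between the two sample sizes can be made numerically as well. The sketch below assumes a hypothetical population with \(\mu = 100\) and \(\sigma = 15\); these values and the function name are illustrative and do not come from the figure.

```python
import math

def one_se_range(mu, sigma, n):
    """Interval from one standard error below mu to one standard error above."""
    se = sigma / math.sqrt(n)
    return mu - se, mu + se

# Hypothetical population: mu = 100, sigma = 15.
for n in (10, 50):
    low, high = one_se_range(100, 15, n)
    print(n, round(low, 2), round(high, 2), round(high - low, 2))
```

Whatever values are chosen for the population, the range for samples of size 10 is wider than the range for samples of size 50 by a factor of \(\sqrt{50/10} = \sqrt{5}\), or roughly 2.24.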

    When the sample size increases, does the mean of the sampling distribution increase?

    No. The mean of the sampling distribution equals the population mean (µ) at every sample size, so it does not increase as the sample size grows. What changes is the shape and spread: as sample sizes increase, the sampling distribution approaches a normal distribution and its standard deviation shrinks.

    What increases when sample size increases?

    Because we have more data and therefore more information, our estimate is more precise. As our sample size increases, the confidence in our estimate increases, our uncertainty decreases and we have greater precision.

    What does increasing the sample size mean?

    A larger sample size makes it easier to detect a real effect as statistically significant, since the precision of the result increases with the sample size. This is to be expected because the larger the sample size, the more accurately the sample is expected to mirror the behavior of the whole group.