
Central-limit Theorem

Statement: Given a sufficiently large sample size drawn from a population with finite variance, the distribution of the sample mean is approximately normal and centered on the mean of the population. When we draw repeated samples from a given population, the distribution of the sample means converges to the normal distribution as the sample size grows, irrespective of the shape of the population distribution. As the sample size n increases, the sampling distribution of the mean can be approximated by a normal distribution with mean µ and standard deviation σ/√n.
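The statement above can be written more formally. The following is the standard textbook form for independent, identically distributed variables; the symbols X_i and \bar{X}_n are introduced here for illustration and do not appear in the original text.

```latex
Let $X_1, X_2, \dots, X_n$ be independent, identically distributed random
variables with mean $\mu$ and finite variance $\sigma^2$, and let
\[
  \bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i .
\]
Then, as $n \to \infty$,
\[
  \frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
  \;\xrightarrow{\;d\;}\; \mathcal{N}(0, 1),
\]
so that for large $n$
\[
  \bar{X}_n \;\approx\; \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right),
  \qquad
  \sum_{i=1}^{n} X_i \;\approx\; \mathcal{N}\!\left(n\mu,\; n\sigma^2\right).
\]
```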

In many applications, the random variable of interest is a sum of a large number of independent random variables, and the CLT justifies modeling it with the normal distribution. Examples of such random variables are found in almost every discipline:

Errors in laboratory measurements are usually modeled by normal random variables.

In communication and signal processing, the most frequently used model for noise is Gaussian noise.

When we select random samples from a population to obtain statistics (mean, variance…) about the population, we often assume the resulting statistic is a normal random variable.

If you have a problem in which you are interested in a sum of one thousand random variables, it might be extremely difficult to find the distribution of the sum by direct calculation. Using the CLT we can immediately write down an approximate distribution for the sum, if we know the mean and variance of the random variables.
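As a rough illustration of the one-thousand-variable case, the sketch below approximates the sum of 1000 independent Uniform(0, 1) variables by a normal distribution with mean nµ and standard deviation σ√n, then compares one probability against a small simulation. The choice of a uniform population, the threshold 510, and the function names are assumptions made for this example only.

```python
import math
import random
from statistics import NormalDist


def clt_sum_approximation(n, mu, sigma):
    """Normal approximation to the sum of n i.i.d. variables
    with mean mu and standard deviation sigma."""
    return NormalDist(mu=n * mu, sigma=sigma * math.sqrt(n))


def simulate_sum_probability(n, threshold, trials, seed=0):
    """Estimate P(sum of n Uniform(0, 1) draws <= threshold) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.random() for _ in range(n))
        if total <= threshold:
            hits += 1
    return hits / trials


n = 1000
mu = 0.5                    # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)   # standard deviation of Uniform(0, 1)

approx = clt_sum_approximation(n, mu, sigma)
p_clt = approx.cdf(510)                                  # CLT answer
p_sim = simulate_sum_probability(n, 510, trials=2000)    # simulated check

print(f"CLT approximation: {p_clt:.4f}")
print(f"Simulation:        {p_sim:.4f}")
```

Only the mean and variance of a single variable are needed to get the CLT answer; the simulation is there purely as a sanity check.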

As a common rule of thumb, we can use the normal approximation if n ≥ 30.

Three different components of the central limit theorem

(1) Successive sampling from a population

(2) Increasing sample size

(3) Population distribution.

Example:

The resulting frequency distributions, each based on 500 means, are shown here. For n = 4, 4 scores were sampled from a uniform distribution 500 times and the mean was computed each time. The same was done with means of 7 scores for n = 7 and 10 scores for n = 10. As n increases, the spread of the distributions decreases and the distributions become more concentrated around the center (clustering around the mean).

[Figure: Statistics - Central-limit Theorem (i2tutorials.com)]
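The sampling experiment in the example can be reproduced with a short simulation. This is a minimal sketch, assuming the scores come from a Uniform(0, 1) distribution (the original does not give the range); the function name sample_means and the fixed seed are hypothetical choices for illustration.

```python
import math
import random
import statistics


def sample_means(n, num_samples=500, seed=0):
    """Draw num_samples samples of size n from Uniform(0, 1)
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(num_samples)
    ]


population_sd = math.sqrt(1 / 12)  # standard deviation of Uniform(0, 1)

results = {}
for n in (4, 7, 10):
    means = sample_means(n)
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    results[n] = (center, spread)
    # The CLT predicts a spread of sigma / sqrt(n) for the sample means.
    print(
        f"n = {n:2d}: mean of means = {center:.3f}, "
        f"spread = {spread:.3f}, sigma/sqrt(n) = {population_sd / math.sqrt(n):.3f}"
    )
```

The printed spread should shrink as n grows and stay close to σ/√n, which is the clustering around the mean described in the example.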

Limitations of the central limit theorem:

The values must be drawn independently (and so be uncorrelated) from the same distribution, which must have a finite mean and variance.

The rate of convergence depends on the skewness of the distribution.

Sums from an exponential distribution converge to the normal distribution at smaller sample sizes. Sums from a lognormal distribution require larger sample sizes.
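The effect of skewness on convergence can be sketched numerically. The example below assumes an exponential distribution with rate 1 (skewness 2) and a lognormal distribution with parameters 0 and 1 (skewness about 6.18); these parameters, the sample size n = 30, and the helper names are illustrative assumptions. It measures how far the standardized sample means are from the standard normal, using the largest gap between the empirical and normal cumulative distribution functions.

```python
import math
import random
from statistics import NormalDist


def skewness_of_mean(population_skewness, n):
    """Skewness of the mean of n i.i.d. draws: population skewness / sqrt(n)."""
    return population_skewness / math.sqrt(n)


def ks_distance_to_normal(draw, mu, sigma, n, num_samples, rng):
    """Largest gap between the empirical CDF of standardized sample means
    and the standard normal CDF."""
    z = sorted(
        (sum(draw(rng) for _ in range(n)) / n - mu) / (sigma / math.sqrt(n))
        for _ in range(num_samples)
    )
    std_normal = NormalDist()
    gap = 0.0
    for i, value in enumerate(z):
        cdf = std_normal.cdf(value)
        gap = max(gap, abs((i + 1) / num_samples - cdf), abs(i / num_samples - cdf))
    return gap


EXP_SKEW = 2.0                                          # exponential, any rate
LOGNORMAL_SKEW = (math.e + 2) * math.sqrt(math.e - 1)   # lognormal(0, 1)

n = 30
rng = random.Random(0)

ks_exponential = ks_distance_to_normal(
    lambda r: r.expovariate(1.0), 1.0, 1.0, n, 20000, rng
)

ln_mu = math.exp(0.5)                          # mean of lognormal(0, 1)
ln_sigma = math.sqrt((math.e - 1) * math.e)    # std dev of lognormal(0, 1)
ks_lognormal = ks_distance_to_normal(
    lambda r: r.lognormvariate(0.0, 1.0), ln_mu, ln_sigma, n, 20000, rng
)

print(f"exponential: skewness of mean = {skewness_of_mean(EXP_SKEW, n):.3f}, "
      f"distance to normal = {ks_exponential:.3f}")
print(f"lognormal:   skewness of mean = {skewness_of_mean(LOGNORMAL_SKEW, n):.3f}, "
      f"distance to normal = {ks_lognormal:.3f}")
```

At the same n, the lognormal sample means keep more residual skewness and sit further from the normal curve, so a larger sample is needed for a comparable approximation.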