Sampling distribution

A sampling distribution is the probability distribution of a statistic (such as the mean, mode, median, standard deviation, or range) computed from randomly selected samples of a population rather than from the entire population. This distribution helps in hypothesis testing (assessing the likelihood of an outcome).

The properties of a sampling distribution vary with the sample size relative to the population size. Properties of sampling distributions:

The shape of the distribution is symmetric and approximately normal.

There are no outliers or other deviations from the overall pattern.

The center of the distribution must be very close to the true population mean.

The population is assumed to be normally distributed. If the sample size is large enough, the sampling distribution will also be approximately normal, and it is then fully determined by its mean and standard deviation.

Example:

A random sample of 20 people is selected from the population of women in Hyderabad between the ages of 22 and 35, and the mean height of the sample is computed. The sample mean might be lower or higher than the population mean, but it is unlikely to equal it exactly. The most common measure of how much sample means differ from each other is the standard deviation of the sampling distribution of the mean, known as the standard error of the mean. The standard error of the mean is small if all the sample means are very close to the population mean, and large if the sample means vary considerably.
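The example can be sketched as a short simulation. The population below is hypothetical: 10,000 synthetic heights in centimetres with an assumed mean of 158 and standard deviation of 6. It is not real data for Hyderabad, and the names `population`, `sample_mean`, and `standard_error` are illustrative only.

```python
import random
import statistics

# Hypothetical population of heights in cm (assumed values, not real data).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(158.0, 6.0) for _ in range(10_000)]
population_mean = statistics.fmean(population)

def sample_mean(pop, n, rng):
    """Mean of one simple random sample of size n, drawn without replacement."""
    return statistics.fmean(rng.sample(pop, n))

# Draw many samples of 20 people and record the mean height of each sample.
sample_means = [sample_mean(population, 20, rng) for _ in range(2_000)]

# The standard deviation of these sample means estimates the standard error.
standard_error = statistics.stdev(sample_means)

print(f"population mean:       {population_mean:.2f}")
print(f"mean of sample means:  {statistics.fmean(sample_means):.2f}")
print(f"standard error (n=20): {standard_error:.2f}")
```

Individual sample means land on either side of the population mean, while their average stays close to it. Under these assumed values the standard error should come out near 6 / √20, which is roughly 1.34.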

Central Limit Theorem:

The central limit theorem states that:

In a population with a finite mean μ and a finite non-zero variance σ², the sampling distribution of the mean approaches a normal distribution with a mean of μ and a variance of σ²/N as the sample size N increases. The larger the sample size, the closer the sampling distribution of the mean is to a normal distribution.
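A rough simulation can illustrate the theorem. It assumes an exponential population with rate 1 (so μ = 1 and σ² = 1), chosen only because it is clearly non-normal. The sample size is written `n` in the code, and the helper names `simulate_sample_means` and `skewness` are illustrative.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def simulate_sample_means(n, trials=5_000):
    """Return `trials` sample means, each from n draws of an exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(trials)
    ]

def skewness(values):
    """Skewness of the values; it is 0 for a perfectly symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

results = {}
for n in (2, 10, 50):
    means = simulate_sample_means(n)
    results[n] = (
        statistics.fmean(means),      # should be close to mu = 1
        statistics.pvariance(means),  # should be close to sigma^2 / n = 1 / n
        skewness(means),              # should shrink toward 0 as n grows
    )
    print(n, [round(x, 3) for x in results[n]])
```

As the sample size grows, the mean of the sample means stays near μ, their variance tracks σ²/n, and the skewness moves toward 0, which is what a drift toward a normal shape looks like.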

The sampling distribution of the difference between means can be computed by repeating the following steps:

Sample n1 scores from Population 1 and n2 scores from Population 2,

Compute the means of the two samples (M1 and M2), and

Compute the difference between means, M1 − M2.

The distribution of these differences is the sampling distribution of the difference between means. With μ1 and μ2 denoting the two population means, the mean of this sampling distribution is:

$\mu_{M_1 - M_2} = \mu_1 - \mu_2$

From the variance sum law, we know that for independent samples:

$\sigma^2_{M_1 - M_2} = \sigma^2_{M_1} + \sigma^2_{M_2}$

$\sigma^2_{M_1 - M_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$

where σ1² and σ2² are the variances of the two populations. The standard error of the difference between means is therefore:

$\sigma_{M_1 - M_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
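The three sampling steps can be sketched in code and compared with the standard error that follows from the variance sum law for independent samples, √(σ1²/n1 + σ2²/n2). The two normal populations and their parameters are assumed values chosen for illustration, and `se_difference` is a hypothetical helper name.

```python
import math
import random
import statistics

def se_difference(sigma1, n1, sigma2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Assumed population parameters and sample sizes (illustrative only).
mu1, sigma1, n1 = 50.0, 8.0, 16
mu2, sigma2, n2 = 45.0, 6.0, 9

rng = random.Random(1)  # fixed seed so the run is repeatable
differences = []
for _ in range(5_000):
    # Step 1: sample n1 scores from Population 1 and n2 scores from Population 2.
    scores1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
    scores2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
    # Step 2: compute the two sample means.
    m1 = statistics.fmean(scores1)
    m2 = statistics.fmean(scores2)
    # Step 3: compute the difference between the means.
    differences.append(m1 - m2)

print("simulated mean of differences:", round(statistics.fmean(differences), 3))
print("theoretical mean:             ", mu1 - mu2)
print("simulated standard error:     ", round(statistics.stdev(differences), 3))
print("theoretical standard error:   ", round(se_difference(sigma1, n1, sigma2, n2), 3))
```

With these values the variance is 64/16 + 36/9 = 8, so the theoretical standard error is √8. The simulated mean and standard deviation of the differences should fall close to the theoretical figures.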

Many sampling distributions are derived from the normal distribution implied by the central limit theorem. This holds for:

the normal distribution for sample means and proportions;

the t distribution for sample means in a t-test and for beta coefficients in regression analysis;

the chi-square distribution for variances;

the F distribution for variance ratios in ANOVA.

Sampling distribution of the Mean:

The mean of the sampling distribution of the mean equals the mean of the population. However, the standard deviation of the sampling distribution differs from that of the population.

If the population is large enough, it is given by

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

where σ is the standard deviation of the population,

σx̄ is the standard deviation of the sampling distribution of the mean (the standard error), and n is the sample size.

The mean of the sampling distribution, $\mu_{\bar{x}} = \mu$ (the mean of the population)

The standard error of the sampling distribution, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}}$

where σ is the standard deviation of the population,

N is the population size, and

n is the sample size.

In the standard error formula, the factor $\sqrt{\dfrac{N - n}{N - 1}}$ is called the finite population correction, or fpc. When the population size is very large relative to the sample size, the fpc is approximately equal to 1, and the standard error formula can be approximated by:

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

It is safe to use this approximation when the sample size is no bigger than 1/20 of the population size.
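A minimal sketch of both forms of the formula, assuming the fpc is √((N − n)/(N − 1)). The function name `standard_error_mean` and the numbers are illustrative, not taken from a particular data set.

```python
import math

def standard_error_mean(sigma, n, N=None):
    """Standard error of the mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if given, the finite population correction is applied
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction (fpc)
    return se

sigma, n = 10.0, 50  # assumed values for illustration
print("approximate SE (no fpc):", round(standard_error_mean(sigma, n), 4))
for N in (200, 1_000, 100_000):
    fpc = math.sqrt((N - n) / (N - 1))
    print(f"N={N}: fpc={fpc:.4f}, SE={standard_error_mean(sigma, n, N):.4f}")
```

With n = 50 and N = 1,000 the sample is exactly 1/20 of the population and the fpc is about 0.975, so dropping it changes the standard error by only a few percent. For smaller populations the correction matters more.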

Sampling distribution of the Proportion

In a population of size N, suppose that the probability of an event occurring (a success) is P and the probability of the event not occurring (a failure) is Q. From this population, we draw all possible samples of size n. Within each sample, we determine the proportion of successes p and failures q, thus creating a sampling distribution of the proportion.

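We find that the mean of the sampling distribution of the proportion (μp) is equal to the probability of success in the population (P). The standard error of the sampling distribution (σp) is determined by the standard deviation of the population (σ), the population size, and the sample size.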

The mean of the sampling distribution of the proportion, $\mu_p = P$

The standard error of the sampling distribution,

$\sigma_p = \dfrac{\sigma}{\sqrt{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}} = \sqrt{\dfrac{PQ}{n}} \cdot \sqrt{\dfrac{N - n}{N - 1}}$

where the second form follows because the standard deviation of a success/failure population is $\sigma = \sqrt{PQ}$.

When the population size is very large relative to the sample size, the fpc is approximately equal to 1,

so the standard error formula can be approximated by $\sigma_p = \sqrt{\dfrac{PQ}{n}}$

It is safe to use this approximation when the sample size is no bigger than 1/20 of the population size.
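The same calculation for proportions can be sketched as follows, assuming Q = 1 − P and the standard error √(PQ/n), multiplied by the fpc √((N − n)/(N − 1)) when a population size is supplied. The function name `standard_error_proportion` and the example values are illustrative.

```python
import math

def standard_error_proportion(P, n, N=None):
    """Standard error of a sample proportion.

    P: probability of success in the population
    n: sample size
    N: population size; if given, the finite population correction is applied
    """
    Q = 1.0 - P  # probability of failure
    se = math.sqrt(P * Q / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction (fpc)
    return se

# Assumed example: P = 0.5 and samples of size 100.
print("large population:", standard_error_proportion(0.5, 100))
print("N = 2,000:       ", round(standard_error_proportion(0.5, 100, N=2_000), 5))
```

For P = 0.5 and n = 100 the approximate standard error is √(0.25/100) = 0.05. With N = 2,000 the sample is 1/20 of the population, and the corrected value is only slightly smaller.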

To make it easier:

Use the normal distribution if the population standard deviation is known or if the sample size is large.

Use the t-distribution if the population standard deviation is unknown, particularly when the sample size is small.
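This rule of thumb can be written as a small helper. The function name `choose_distribution` and the cutoff of 30 for a "large" sample are assumptions for illustration; 30 is a common convention, not a value stated above.

```python
def choose_distribution(sigma_known, n, large_n=30):
    """Return "normal" or "t" following the rule of thumb.

    sigma_known: True if the population standard deviation is known
    n:           sample size
    large_n:     assumed cutoff for a "large" sample (a convention, not a fixed rule)
    """
    if sigma_known or n >= large_n:
        return "normal"
    return "t"

print(choose_distribution(sigma_known=True, n=12))    # known sigma
print(choose_distribution(sigma_known=False, n=12))   # unknown sigma, small sample
print(choose_distribution(sigma_known=False, n=200))  # unknown sigma, large sample
```

When the standard deviation is unknown but the sample is large, the two rules overlap. This sketch resolves the overlap in favour of the normal distribution, which is reasonable because the t-distribution is very close to normal for large samples.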