Statistical Methods

Probability & Samples

by Adam J. McKee

Using F. J. Gravetter and L. B. Wallnau's Essentials of Statistics for the Behavioral Sciences (4th Ed.).


Definition

In a situation where several different outcomes are possible, we define the probability for any particular outcome as a fraction or proportion.

Proportions

Probability is defined as a proportion.

This definition makes it possible to restate any probability problem as a proportion.

We can also change probabilities into percentages and fractions—just like we can with proportions.

Percentages

P(spade) = 13/52 = ¼ (drawing a card)

P(heads) = ½ (coin toss)

We can also express these as percentages:

¼ = 0.25 = 25%

½ = 0.50 = 50%
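The conversions above can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are made up for the example.

```python
from fractions import Fraction

# Probability as a proportion: favorable outcomes / total outcomes
p_spade = Fraction(13, 52)   # 13 spades in a 52-card deck
p_heads = Fraction(1, 2)     # 1 head among the 2 sides of a coin

for label, p in [("spade", p_spade), ("heads", p_heads)]:
    # The same value shown as a fraction, a decimal, and a percentage
    print(f"P({label}) = {p} = {float(p):.2f} = {float(p):.0%}")
```

Fraction reduces 13/52 to 1/4 automatically, which mirrors the hand calculation.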

Restricted Range of Probability

There are two limits to probability:

Things that never happen

Things that always happen

Say we have a jar with 10 white marbles in it:

The probability of drawing out a black marble is 0/10 = 0.00

The probability of drawing out a white marble is 10/10 = 1.00

All other probabilities fall between 0 and 1.

Clarifying our Definition

For our definition of probability to be accurate, it is necessary that the outcomes be obtained by a process called random sampling.

To be a random sample, two requirements must be met:

Each individual in the population must have an equal chance of being selected

There must be a constant probability for each and every selection (this requires sampling with replacement)

An Equal Chance of Being Selected

This phrase means that individual outcomes are equally likely.

Thus we cannot conclude that the probability of life on Mars is 50% simply because there are two options (there is life or there is not): the two outcomes are not equally likely.

Constant Probability

If we have 20 marbles in a jar, 10 black and 10 white, and we draw out a white one, we change the probability for the next draw: the chance of drawing white drops from 10/20 to 9/19.

The solution is to put the marble back after every draw. This is called sampling with replacement.
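A short sketch of the marble jar makes the difference concrete. The helper p_white is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction

def p_white(white, black):
    """Probability of drawing a white marble from the jar."""
    return Fraction(white, white + black)

white, black = 10, 10
first = p_white(white, black)               # 10/20 on the first draw

# Without replacement: the drawn white marble stays out of the jar
second_without = p_white(white - 1, black)  # 9/19

# With replacement: the marble goes back, so the jar is unchanged
second_with = p_white(white, black)         # 10/20 again

print(first, second_without, second_with)
```

Only the draw made with replacement keeps the probability constant from one selection to the next.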

Sampling Error

Sampling error is the discrepancy, or amount of error, between a sample statistic and its corresponding population parameter.

Furthermore, samples are variable: they are not all the same.

If you take two separate samples from the same population, the samples will be different.

Multiple Samples from the Same Population

Two separate samples probably will be different even though they are taken from the same population.

The samples will have different individuals, different scores, different means, and so forth.
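A small simulation can illustrate this variability. The population below (10,000 scores with a mean near 100 and a standard deviation near 15) is invented for the example, and the seed is fixed only so that the run is repeatable.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of scores (assumed values, not from the text)
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = mean(population)

# Two random samples of n = 25, each drawn with replacement
sample_a = rng.choices(population, k=25)
sample_b = rng.choices(population, k=25)

for name, sample in [("A", sample_a), ("B", sample_b)]:
    m = mean(sample)
    # Sampling error: sample statistic minus population parameter
    print(f"sample {name}: mean = {m:.2f}, sampling error = {m - mu:+.2f}")
```

The two samples contain different scores and give different means, and each mean misses the population mean by its own sampling error.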

Distribution of Sample Means

The distribution of sample means is the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population.

Sampling Distribution

A sampling distribution is a distribution of statistics obtained by selecting all the possible samples of a specific size from a population.
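For a very small population, every possible sample can be listed and the distribution of sample means built directly. The sketch below assumes a hypothetical population of four scores (2, 4, 6, 8) and samples of size n = 2 drawn with replacement.

```python
from collections import Counter
from itertools import product
from statistics import mean

population = [2, 4, 6, 8]   # hypothetical tiny population
n = 2

# Every possible sample of size n, drawn with replacement (4**2 = 16 samples)
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

# Frequency of each sample mean: this is the distribution of sample means
distribution = Counter(sample_means)
for m in sorted(distribution):
    print(m, "#" * distribution[m])

print("population mean:", mean(population))
print("mean of sample means:", mean(sample_means))
```

The printed frequencies rise toward the middle and fall off at the extremes, and the mean of all 16 sample means equals the population mean.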

Characteristics of the Sampling Distribution

Note that the distribution of sample means has some predictable and some very useful characteristics:

Characteristic #1

The sample means tend to pile up around the population mean. It should not be surprising that sample means tend to represent the population mean. After all, the point of using a sample is that it represents the population.

Characteristic #2

The distribution of sample means is approximately normal in shape.

This will be very useful to us because we already know a lot about the characteristics of normal distributions.

Characteristic #3

Finally, note that we can use the distribution of sample means to answer probability questions about the sample means.

Specifically, we can determine the probability of obtaining a particular sample mean.

Central Limit Theorem

For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and a standard deviation of σ/√n and will approach a normal distribution as n approaches infinity.

Value of the Central Limit Theorem

Two important facts:

It describes the distribution for any population, no matter what shape, mean, or standard deviation.

The distribution of sample means approaches a normal distribution very rapidly: By the time the sample size reaches n = 30, the distribution is almost perfectly normal.
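The theorem can be checked by simulation. The sketch below assumes a strongly skewed (exponential) population invented for the example. It draws many samples of n = 30 with replacement and compares the mean and standard deviation of the sample means with μ and σ/√n.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical, strongly skewed population (exponential, not normal)
population = [rng.expovariate(1.0) for _ in range(20_000)]
mu, sigma = mean(population), pstdev(population)

n = 30
# Draw many random samples with replacement and record each sample mean
sample_means = [mean(rng.choices(population, k=n)) for _ in range(2_000)]

print(f"mu = {mu:.3f}, mean of sample means = {mean(sample_means):.3f}")
print(f"sigma/sqrt(n) = {sigma / sqrt(n):.3f}, "
      f"SD of sample means = {pstdev(sample_means):.3f}")
```

Because the simulation uses 2,000 samples rather than every possible sample, the two pairs of values should agree closely but not exactly.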

The Shape of the Distribution of Sample Means

The distribution of sample means tends to be normal.

This fact should not be surprising: whenever you take a sample from a population, you expect the sample mean to be near the population mean.

When you take lots of different samples, you expect the sample means to "pile up" around the population mean, resulting in a normal distribution.

The Mean of the Distribution of Sample Means

The mean of the distribution of sample means is equal to the population mean and is called the expected value of bar X.

The Standard Error of the Mean of the Sample Distribution of Means

The standard deviation for the distribution of sample means is called the standard error of the mean.

This defines the typical (standard) distance from the mean.

The standard error measures how much difference should be expected, on average, between bar X and μ.

Defining Standard Error of the Mean

The standard deviation of the distribution of sample means is called the standard error of the mean.

The standard error measures the standard amount of difference one should expect between bar X and μ simply due to chance.
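With the standard error in hand, a probability question about a sample mean can be answered by treating the distribution of sample means as normal. The values below (μ = 100, σ = 15, n = 25, and an observed sample mean of 106) are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 100, 15, 25          # hypothetical population and sample size
standard_error = sigma / sqrt(n)    # 15 / 5 = 3

# Distribution of sample means, treated as normal
sample_mean_dist = NormalDist(mu, standard_error)

observed_mean = 106
z = (observed_mean - mu) / standard_error          # standard errors above mu
p_above = 1 - sample_mean_dist.cdf(observed_mean)  # upper-tail probability

print(f"z = {z:.2f}, P(sample mean > {observed_mean}) = {p_above:.4f}")
# prints: z = 2.00, P(sample mean > 106) = 0.0228
```

A sample mean of 106 lies two standard errors above μ, so a mean that large or larger would be expected from only about 2% of random samples.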

Notation for Standard Error

The standard error of the mean is written as σ with the subscript bar X, and it equals σ/√n.

Expectation of the Sample

Although we do not expect a sample mean to be exactly the same as the population mean, it should be a good estimate.

The standard error tells us how good the estimate will be.
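The σ/√n formula from the central limit theorem shows how the quality of the estimate depends on sample size. A minimal sketch, assuming a hypothetical population standard deviation of 20:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 20  # hypothetical population standard deviation
for n in (1, 4, 25, 100):
    print(f"n = {n:>3}: standard error = {standard_error(sigma, n):.1f}")
```

The standard error falls from 20.0 at n = 1 to 2.0 at n = 100, so larger samples give sample means that sit closer, on average, to the population mean.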

