AP Statistics – Unit 5: Sampling Distributions

5.1 Introducing Statistics: Why Is My Sample Not Like Yours?

When we take a sample from a population, our sample statistic (like x̄ or p̂) won't exactly equal the true population parameter. This natural variation is called sampling variability.

Sampling Variability: The natural variation in sample statistics from sample to sample. Different random samples from the same population will produce different statistics.

Parameters vs. Statistics

Concept             | Parameter                                  | Statistic
Definition          | A number that describes the population     | A number that describes a sample
Symbol (Proportion) | p (population proportion)                  | p̂ (sample proportion, "p-hat")
Symbol (Mean)       | μ (population mean)                        | x̄ (sample mean, "x-bar")
Symbol (SD)         | σ (population standard deviation)          | s (sample standard deviation)
Known?              | Usually unknown (what we want to estimate) | Known from our sample data
Variability         | Fixed (doesn't change)                     | Varies from sample to sample

🎯 Memory Trick: Parameters describe Populations (both start with P). Statistics describe Samples (both start with S).

What is a Sampling Distribution?

Sampling Distribution: The distribution of all possible values of a sample statistic when taking repeated random samples of the same size from a population.

It shows how the statistic varies across all possible samples.

Thought Experiment: Imagine taking ALL possible samples of size n = 50 from a population.

• Sample 1: Calculate p̂₁

• Sample 2: Calculate p̂₂

• Sample 3: Calculate p̂₃

• ... and so on for every possible sample

The distribution of all these p̂ values is the sampling distribution of p̂.
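The thought experiment above can be approximated with a short simulation. This is a sketch with hypothetical values (true p = 0.6, n = 50, 10,000 samples) — the exact numbers aren't from the text, but the behavior is general: the p̂ values cluster around p with a predictable spread.

```python
import random
import statistics

# Hypothetical population: true proportion p = 0.6. Draw many random
# samples of size n = 50 and record each sample proportion p-hat.
random.seed(1)
p, n = 0.6, 50

p_hats = []
for _ in range(10_000):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    p_hats.append(sum(sample) / n)

# The simulated sampling distribution is centered near p, even though
# each individual p-hat misses the true value.
print(round(statistics.mean(p_hats), 3))   # close to p = 0.6
print(round(statistics.stdev(p_hats), 3))  # close to sqrt(p(1-p)/n) ≈ 0.069
```

No single p̂ equals p, but the distribution of all of them is centered at p — exactly the sampling variability idea.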

[Figure: From Population to Sampling Distribution — repeated samples (p̂₁, p̂₂, p̂₃, …) drawn from a population with parameter p combine into the sampling distribution of all p̂ values.]
Exam Tip: A sampling distribution is NOT a distribution of raw data — it's a distribution of a statistic (like p̂ or x̄) calculated from many samples.

5.2 The Normal Distribution, Revisited

The Normal distribution is the most important distribution in statistics. Many sampling distributions are approximately Normal, which allows us to calculate probabilities using z-scores.

Normal Distribution: A symmetric, bell-shaped distribution completely described by its mean (μ) and standard deviation (σ).

Notation: X ~ N(μ, σ) means "X follows a Normal distribution with mean μ and standard deviation σ"

[Figure: A Normal curve with μ at the center and μ ± σ, μ ± 2σ marked, illustrating the 68-95-99.7 Empirical Rule.]

The Empirical Rule (68-95-99.7 Rule)

For any Normal distribution:

  • 68% of data falls within 1 standard deviation of the mean (μ ± σ)
  • 95% of data falls within 2 standard deviations of the mean (μ ± 2σ)
  • 99.7% of data falls within 3 standard deviations of the mean (μ ± 3σ)
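The Empirical Rule percentages are approximations of exact Normal probabilities, which can be verified with Python's built-in `statistics.NormalDist` (available since Python 3.8):

```python
from statistics import NormalDist

# Check the 68-95-99.7 rule against exact standard Normal probabilities.
Z = NormalDist(0, 1)

for k in (1, 2, 3):
    within = Z.cdf(k) - Z.cdf(-k)   # P(mu - k*sigma < X < mu + k*sigma)
    print(f"within {k} SD: {within:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Note the "95%" figure is really 95.45% — the rule rounds for easy recall.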

Z-Scores and the Standard Normal Distribution

Z-Score: The number of standard deviations a value is from the mean.

z = (x − μ) / σ

Standard Normal Distribution: A Normal distribution with μ = 0 and σ = 1, denoted N(0, 1).

Example: Heights of adult women are N(64, 2.5) inches. Find the probability a woman is taller than 69 inches.

Step 1: Calculate z-score

z = (69 − 64) / 2.5 = 5 / 2.5 = 2.0

Step 2: Use Normal table or calculator

P(Z > 2.0) = 1 − 0.9772 = 0.0228

About 2.28% of women are taller than 69 inches.
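The same calculation can be checked without a Normal table. A sketch using `statistics.NormalDist`, mirroring the heights example:

```python
from statistics import NormalDist

# Heights of adult women: N(64, 2.5). Find P(X > 69).
heights = NormalDist(mu=64, sigma=2.5)

z = (69 - 64) / 2.5              # z-score: 2.0
p_taller = 1 - heights.cdf(69)   # same as P(Z > 2.0)
print(z, round(p_taller, 4))     # 2.0 0.0228
```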

Calculator Commands (TI-83/84)

normalcdf(lower, upper, μ, σ)

Find P(lower < X < upper) for N(μ, σ)

invNorm(area, μ, σ)

Find x-value with given area to the LEFT

🎯 Finding Percentiles:

To find the value at the 90th percentile: Use invNorm(0.90, μ, σ)

This gives the x-value where 90% of the data falls below.
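The invNorm idea has a direct Python analogue in `NormalDist.inv_cdf`, which also takes area to the LEFT. The N(500, 100) score distribution below is a hypothetical example, not from the text:

```python
from statistics import NormalDist

# invNorm equivalent: inv_cdf returns the x-value with the given
# area to the LEFT. Hypothetical scores distributed N(500, 100).
scores = NormalDist(mu=500, sigma=100)

p90 = scores.inv_cdf(0.90)   # 90th percentile
print(round(p90, 1))         # about 628.2
```

So 90% of scores fall below roughly 628 — the same answer invNorm(0.90, 500, 100) would give.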

Exam Tip: Always draw a Normal curve, mark the mean, shade the region of interest, and label known values. This helps avoid calculator errors!

5.3 The Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most powerful results in statistics. It explains why Normal distributions appear so often in statistical inference.

The Central Limit Theorem

When we take sufficiently large random samples from any population, the sampling distribution of the sample mean (x̄) will be approximately Normal.

This is true regardless of the shape of the original population!

Key Details of the CLT

For a sampling distribution of x̄:

Mean (Center)

μ_x̄ = μ

Same as population mean

Standard Deviation (Spread)

σ_x̄ = σ/√n

Called "standard error"

When is the Sample Size "Large Enough"?

Population is Normal

The sampling distribution of x̄ is exactly Normal for any sample size n.

No minimum n required!

Population is Not Normal

Rule of thumb: n ≥ 30 is usually sufficient for the CLT to apply.

More skewed populations need larger n.

[Figure: The CLT in Action — a skewed population, many samples of size n, and the resulting sampling distribution of the sample means x̄₁, x̄₂, x̄₃, …, which is approximately Normal. As n increases, the sampling distribution becomes more Normal and the standard error (σ/√n) decreases.]
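You can watch the CLT happen with a simulation. This sketch uses an exponential population (strongly right-skewed, mean 1, SD 1 — hypothetical choices) and shows that the sample means still behave Normally with the predicted center and spread:

```python
import random
import statistics

# CLT demo: a right-skewed population (exponential, rate 1), yet the
# sampling distribution of x-bar is approximately Normal for n = 30.
random.seed(2)
n = 30
mu, sigma = 1.0, 1.0   # exponential(rate=1) has mean 1 and SD 1

means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(10_000)]

print(round(statistics.mean(means), 2))   # near mu = 1.0
print(round(statistics.stdev(means), 2))  # near sigma/sqrt(n) ≈ 0.18
```

A histogram of `means` would look bell-shaped even though the raw exponential data is heavily skewed — the CLT applies to the statistic, not the data.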

⚠️ What the CLT Does NOT Say:

• It does NOT say the sample data becomes Normal

• It does NOT say individual observations are Normal

• It applies to the sampling distribution of the statistic, not the raw data

Exam Tip: The CLT explains why we can use Normal-based inference (z-tests, confidence intervals) even when the population isn't Normal — as long as n is large enough!

5.4 Biased and Unbiased Point Estimates

A point estimate is a single value used to estimate a population parameter. But not all estimates are equally good — we want estimates that are unbiased.

Point Estimate: A single value (statistic) calculated from sample data to estimate a population parameter.

Unbiased Estimator: A statistic whose sampling distribution is centered at (has mean equal to) the true parameter value.

[Figure: Biased vs. Unbiased Estimators — two sampling distributions: an unbiased estimator centered at the true parameter (μ_statistic = parameter) and a biased one centered away from it (μ_statistic ≠ parameter).]

Common Point Estimates

Parameter                 | Point Estimate                 | Unbiased?
Population mean (μ)       | Sample mean (x̄)               | ✓ Yes
Population proportion (p) | Sample proportion (p̂)         | ✓ Yes
Population variance (σ²)  | Sample variance (s²) with n−1  | ✓ Yes
Population SD (σ)         | Sample SD (s)                  | ✗ Slightly biased

🎯 Why n−1 for Sample Variance?

Using n−1 (instead of n) in the denominator of s² corrects for bias. This is called "Bessel's correction." It makes s² an unbiased estimator of σ².

s² = Σ(xᵢ − x̄)² / (n − 1)
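Bessel's correction can be demonstrated by simulation. A sketch with hypothetical values (Normal population with σ = 2, so σ² = 4, and tiny samples of n = 5 where the bias is easy to see):

```python
import random
import statistics

# Compare dividing by n vs n-1 across many samples from a population
# with known variance sigma^2 = 4 (Normal, sigma = 2).
random.seed(3)
n = 5

biased, unbiased = [], []
for _ in range(20_000):
    x = [random.gauss(0, 2) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    biased.append(ss / n)          # divides by n  -> systematically too small
    unbiased.append(ss / (n - 1))  # Bessel's correction

print(round(statistics.mean(biased), 2))    # about 3.2 (= sigma^2 * (n-1)/n)
print(round(statistics.mean(unbiased), 2))  # about 4.0 (= sigma^2)
```

Averaged over many samples, the n−1 version hits σ² = 4, while dividing by n underestimates it — that systematic shortfall is exactly what "biased" means.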

Variability of Point Estimates

A good estimator should have:

  • Low bias: Centered at the true parameter
  • Low variability: Small spread (tight sampling distribution)

Key insight: Larger sample sizes reduce variability but don't affect bias. An estimator with a systematic error will be biased regardless of sample size.

Analogy: Target Practice

Unbiased + Low variability: Shots clustered at the bullseye ✓

Unbiased + High variability: Shots scattered but centered on bullseye

Biased + Low variability: Shots clustered but off-center

Biased + High variability: Shots scattered and off-center

Exam Tip: When asked "Is this an unbiased estimator?" ask yourself: "If I took many samples, would the average of all my estimates equal the true parameter?"

5.5 Sampling Distributions for Sample Proportions

When we sample from a population and calculate the sample proportion p̂, this statistic has its own distribution with predictable properties.

Sampling Distribution of p̂

Shape: Approximately Normal (if conditions are met)

Center: μ_p̂ = p (the true population proportion)

Spread: σ_p̂ = √[p(1−p)/n]

Conditions for Normality

For the sampling distribution of p̂ to be approximately Normal:

1. Random Sample

Data must come from a random sample or randomized experiment.

2. Independence (10% Condition)

When sampling without replacement, n ≤ 10% of population size.

3. Large Counts (Success/Failure Condition)

np ≥ 10   AND   n(1−p) ≥ 10

Both expected counts must be at least 10.

Formulas for the Sampling Distribution of p̂

Mean

μ_p̂ = p

Standard Deviation (Standard Error)

σ_p̂ = √[p(1−p)/n]

Example: A polling company samples 400 voters. If the true proportion supporting a candidate is p = 0.55, describe the sampling distribution of p̂.

Check Conditions:

• Random: Assume random sample ✓

• 10%: 400 < 10% of all voters ✓

• Large Counts: np = 400(0.55) = 220 ≥ 10 ✓

n(1−p) = 400(0.45) = 180 ≥ 10 ✓

Sampling Distribution:

μ_p̂ = 0.55

σ_p̂ = √[(0.55)(0.45)/400] = √[0.2475/400] = √0.000619 ≈ 0.0249

p̂ ~ N(0.55, 0.0249)

Follow-up: What's the probability that p̂ > 0.60?

z = (0.60 − 0.55) / 0.0249 = 0.05 / 0.0249 ≈ 2.01

P(p̂ > 0.60) = P(Z > 2.01) ≈ 0.0222

There's about a 2.2% chance the sample proportion exceeds 0.60.
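The polling example above can be reproduced end-to-end in a few lines — a sketch that checks the Large Counts condition and then does the Normal calculation:

```python
from math import sqrt
from statistics import NormalDist

# Worked example from the text: n = 400 voters, true p = 0.55.
p, n = 0.55, 400

# Large Counts check: both expected counts must be at least 10.
assert n * p >= 10 and n * (1 - p) >= 10

se = sqrt(p * (1 - p) / n)           # standard deviation of p-hat
dist = NormalDist(mu=p, sigma=se)

print(round(se, 4))                  # 0.0249
print(round(1 - dist.cdf(0.60), 4))  # P(p-hat > 0.60) ≈ 0.0222
```

This matches the hand calculation: normalcdf(0.60, 1E99, 0.55, 0.0249) on the TI-84 gives the same tail probability.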

Exam Tip: Always check ALL THREE conditions before using Normal calculations. State them explicitly on the AP exam — this is worth points!

5.6 Sampling Distributions for Differences in Sample Proportions

When comparing two populations, we often want to estimate the difference between their proportions. The sampling distribution of p̂₁ − p̂₂ tells us how this difference varies.

Setting: Two independent random samples from two populations:

  • Population 1: proportion p₁, sample size n₁, sample proportion p̂₁
  • Population 2: proportion p₂, sample size n₂, sample proportion p̂₂

Sampling Distribution of p̂₁ − p̂₂

Shape: Approximately Normal (if conditions are met)

Center: μ_(p̂₁−p̂₂) = p₁ − p₂

Spread: σ_(p̂₁−p̂₂) = √[p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂]

Conditions for Normality

For EACH sample independently:

  • Random: Both samples are random samples
  • 10%: n₁ ≤ 10% of population 1, n₂ ≤ 10% of population 2
  • Large Counts: n₁p₁ ≥ 10, n₁(1−p₁) ≥ 10, n₂p₂ ≥ 10, n₂(1−p₂) ≥ 10

Additionally: The two samples must be independent of each other.

🎯 Why Do Variances Add?

When combining independent random variables, variances always add (even for subtraction!):

Var(p̂₁ − p̂₂) = Var(p̂₁) + Var(p̂₂)

This is why we add the variance terms under the square root.

Example: In a study comparing two treatments:

• Treatment A: n₁ = 200, true success rate p₁ = 0.72

• Treatment B: n₂ = 250, true success rate p₂ = 0.65

Describe the sampling distribution of p̂₁ − p̂₂:

μ_(p̂₁−p̂₂) = 0.72 − 0.65 = 0.07

σ_(p̂₁−p̂₂) = √[(0.72)(0.28)/200 + (0.65)(0.35)/250]

= √[0.001008 + 0.00091]

= √0.001918 ≈ 0.0438

p̂₁ − p̂₂ ~ N(0.07, 0.0438)

Follow-up: What's the probability that p̂₁ − p̂₂ < 0 (Treatment B appears better)?

z = (0 − 0.07) / 0.0438 = −1.60

P(p̂₁ − p̂₂ < 0) = P(Z < −1.60) ≈ 0.0548

There's about a 5.5% chance of observing Treatment B outperforming A in a sample, even though A is truly better.
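The treatment-comparison example translates directly to code; note how the two variance terms are added before taking the square root:

```python
from math import sqrt
from statistics import NormalDist

# Worked example: Treatment A (n1 = 200, p1 = 0.72) vs B (n2 = 250, p2 = 0.65).
p1, n1 = 0.72, 200
p2, n2 = 0.65, 250

mu_diff = p1 - p2                                     # 0.07
se_diff = sqrt(p1*(1 - p1)/n1 + p2*(1 - p2)/n2)       # variances ADD

diff = NormalDist(mu=mu_diff, sigma=se_diff)
print(round(se_diff, 4))      # 0.0438
print(round(diff.cdf(0), 3))  # P(p1-hat - p2-hat < 0) ≈ 0.055
```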

Exam Tip: The formula has TWO fractions under the square root — one for each sample. Don't forget to add them before taking the square root!

5.7 Sampling Distributions for Sample Means

The sampling distribution of the sample mean x̄ is one of the most important concepts in statistics. This is where the Central Limit Theorem really shines.

Sampling Distribution of x̄

Shape: Normal if population is Normal, OR approximately Normal if n ≥ 30 (CLT)

Center: μ_x̄ = μ (the population mean)

Spread: σ_x̄ = σ/√n (standard error)

Conditions for Normality

The sampling distribution of x̄ is approximately Normal if:

  • Random: Data comes from a random sample or randomized experiment
  • Independence (10% Condition): n ≤ 10% of population (if sampling without replacement)
  • Normal/Large Sample:
    • Population is Normal, OR
    • n ≥ 30 (Central Limit Theorem)

Formulas for the Sampling Distribution of x̄

Mean

μ_x̄ = μ

Standard Error

σ_x̄ = σ/√n

🎯 Key Insight: Standard Error Decreases with Sample Size

As n increases, σ_x̄ = σ/√n gets smaller. This means:

  • Larger samples give more precise estimates
  • To cut the standard error in half, you need 4× the sample size
  • To cut it by a factor of 10, you need 100× the sample size
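The square-root relationship in the bullets above is easy to verify numerically (σ = 10 here is an arbitrary illustrative value):

```python
from math import sqrt

# How the standard error sigma/sqrt(n) shrinks as n grows (sigma = 10).
sigma = 10
for n in (25, 100, 2500):
    print(n, sigma / sqrt(n))
# 25 2.0
# 100 1.0   <- 4x the sample size halves the SE
# 2500 0.2  <- 100x the sample size cuts it by a factor of 10
```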

Example: The heights of adult males have mean μ = 70 inches and standard deviation σ = 3 inches. A random sample of 36 men is selected.

Describe the sampling distribution of x̄ (since n = 36 ≥ 30, the CLT guarantees an approximately Normal shape):

μ_x̄ = 70 inches

σ_x̄ = 3/√36 = 3/6 = 0.5 inches

x̄ ~ N(70, 0.5)

Find P(x̄ > 71):

z = (71 − 70) / 0.5 = 2.0

P(x̄ > 71) = P(Z > 2.0) ≈ 0.0228

Comparing Individual Values to Sample Means

Individual value X: z = (x − μ) / σ

  • Use the population SD (σ)
  • Higher variability — individual values spread more

Sample mean x̄: z = (x̄ − μ) / (σ/√n)

  • Use the standard error (σ/√n)
  • Lower variability — averaging reduces spread
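The contrast is striking when you run both calculations on the heights example (μ = 70, σ = 3, n = 36), treating the population as Normal:

```python
from math import sqrt
from statistics import NormalDist

# Heights example: mu = 70, sigma = 3, sample of n = 36 men.
mu, sigma, n = 70, 3, 36

individual = NormalDist(mu, sigma)             # one randomly chosen man
sample_mean = NormalDist(mu, sigma / sqrt(n))  # x-bar uses the standard error

print(round(1 - individual.cdf(71), 4))    # P(X > 71)    ≈ 0.3694
print(round(1 - sample_mean.cdf(71), 4))   # P(x-bar > 71) ≈ 0.0228
```

One man taller than 71 inches is common (about 37%); a sample of 36 averaging above 71 is rare (about 2%). Using σ where σ/√n belongs would miss this entirely.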

⚠️ Common Mistake: Don't confuse the standard deviation of the population (σ) with the standard error of the mean (σ/√n). When working with sample means, ALWAYS use the standard error!

Exam Tip: Read carefully — "probability that a randomly selected individual..." uses σ. "Probability that the sample mean..." uses σ/√n.

5.8 Sampling Distributions for Differences in Sample Means

When comparing two populations' means, we use the sampling distribution of x̄₁ − x̄₂. This is essential for comparing treatment effects, group differences, and more.

Setting: Two independent random samples from two populations:

  • Population 1: mean μ₁, SD σ₁, sample size n₁, sample mean x̄₁
  • Population 2: mean μ₂, SD σ₂, sample size n₂, sample mean x̄₂

Sampling Distribution of x̄₁ − x̄₂

Shape: Normal if both populations are Normal, OR approximately Normal if both n₁ ≥ 30 and n₂ ≥ 30

Center: μ_(x̄₁−x̄₂) = μ₁ − μ₂

Spread: σ_(x̄₁−x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)

Conditions for Normality

For the sampling distribution of x̄₁ − x̄₂ to be approximately Normal:

  • Random: Both samples are random samples
  • Independence:
    • The two samples are independent of each other
    • 10% condition: n₁ ≤ 10% of pop 1, n₂ ≤ 10% of pop 2
  • Normal/Large Sample: Both populations are Normal, OR both n₁ ≥ 30 and n₂ ≥ 30

Formulas for x̄₁ − x̄₂

Mean

μ_(x̄₁−x̄₂) = μ₁ − μ₂

Standard Error

σ_(x̄₁−x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)

Example: Comparing test scores between two schools:

• School A: μ₁ = 520, σ₁ = 80, n₁ = 50

• School B: μ₂ = 490, σ₂ = 75, n₂ = 45

Describe the sampling distribution of x̄₁ − x̄₂:

μ_(x̄₁−x̄₂) = 520 − 490 = 30

σ_(x̄₁−x̄₂) = √(80²/50 + 75²/45)

= √(128 + 125)

= √253 ≈ 15.91

x̄₁ − x̄₂ ~ N(30, 15.91)

Follow-up: What's the probability that x̄₂ > x̄₁ (School B's sample mean exceeds School A's)?

This is asking for P(x̄₁ − x̄₂ < 0):

z = (0 − 30) / 15.91 = −1.89

P(x̄₁ − x̄₂ < 0) = P(Z < −1.89) ≈ 0.0294

There's about a 3% chance School B's sample outperforms School A's, even though A has the higher population mean.
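The two-school example as code — the variances σ₁²/n₁ and σ₂²/n₂ are added under the square root, exactly as the formula requires:

```python
from math import sqrt
from statistics import NormalDist

# Worked example: School A (mu1=520, s1=80, n1=50) vs School B (mu2=490, s2=75, n2=45).
mu1, s1, n1 = 520, 80, 50
mu2, s2, n2 = 490, 75, 45

mu_diff = mu1 - mu2                      # 30
se_diff = sqrt(s1**2/n1 + s2**2/n2)      # sqrt(128 + 125) = sqrt(253)

diff = NormalDist(mu=mu_diff, sigma=se_diff)
print(round(se_diff, 2))      # 15.91
print(round(diff.cdf(0), 3))  # P(x1-bar - x2-bar < 0) ≈ 0.030
```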

Summary: All Sampling Distributions

Statistic  | Mean     | Standard Error                       | Normal Condition
p̂         | p        | √[p(1−p)/n]                          | np ≥ 10 and n(1−p) ≥ 10
p̂₁ − p̂₂  | p₁ − p₂  | √[p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂]         | Large Counts for both samples
x̄         | μ        | σ/√n                                 | Population Normal OR n ≥ 30
x̄₁ − x̄₂  | μ₁ − μ₂  | √(σ₁²/n₁ + σ₂²/n₂)                   | Both populations Normal OR both n ≥ 30

Calculator Tips for Normal Calculations

normalcdf(lower, upper, μ, σ)

For P(lower < statistic < upper), use the appropriate μ and standard error

invNorm(area, μ, σ)

Find the statistic value at a given percentile

Exam Tip: For difference of means, remember: variances add, so you have σ₁²/n₁ + σ₂²/n₂ under the square root. The ²'s are on the σ's, not under them!

Unit 5 Key Takeaways

Sampling variability: Statistics vary from sample to sample

Central Limit Theorem: x̄ is approximately Normal for large n (≥30)

Unbiased estimator: Mean of sampling distribution equals parameter

Standard error of p̂: √[p(1−p)/n]

Standard error of x̄: σ/√n

For differences: Variances ADD (under the square root)

Always check: Random, Independence (10%), Normal/Large Sample

normalcdf(lower, upper, μ, σ) | invNorm(area, μ, σ) | z = (statistic − μ)/SE

End of Unit 5 Study Guide.
