5.1 Introducing Statistics: Why Is My Sample Not Like Yours?
When we take a sample from a population, our sample statistic (like x̄ or p̂) won't exactly equal the true population parameter. This natural variation is called sampling variability.
Sampling Variability: The natural variation in sample statistics from sample to sample. Different random samples from the same population will produce different statistics.
Parameters vs. Statistics
| Concept | Parameter | Statistic |
|---|---|---|
| Definition | A number that describes the population | A number that describes a sample |
| Symbol (Proportion) | p (population proportion) | p̂ (sample proportion, "p-hat") |
| Symbol (Mean) | μ (population mean) | x̄ (sample mean, "x-bar") |
| Symbol (SD) | σ (population standard deviation) | s (sample standard deviation) |
| Known? | Usually unknown (what we want to estimate) | Known from our sample data |
| Variability | Fixed (doesn't change) | Varies from sample to sample |
🎯 Memory Trick: Parameters describe Populations (both start with P). Statistics describe Samples (both start with S).
What is a Sampling Distribution?
Sampling Distribution: The distribution of all possible values of a sample statistic when taking repeated random samples of the same size from a population.
It shows how the statistic varies across all possible samples.
Thought Experiment: Imagine taking ALL possible samples of size n = 50 from a population.
• Sample 1: Calculate p̂₁
• Sample 2: Calculate p̂₂
• Sample 3: Calculate p̂₃
• ... and so on for every possible sample
The distribution of all these p̂ values is the sampling distribution of p̂.
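We can't take ALL possible samples, but a quick simulation approximates the idea. Here is a minimal Python sketch (not part of the AP calculator workflow) that draws many samples from a hypothetical population with true proportion p = 0.55 and looks at how the p̂ values are distributed:

```python
import random
import statistics

# Hypothetical population: true proportion p = 0.55 of "successes".
random.seed(1)
p, n, num_samples = 0.55, 50, 10_000

# Draw many random samples of size n; record each sample proportion p-hat.
p_hats = [sum(random.random() < p for _ in range(n)) / n
          for _ in range(num_samples)]

# The simulated sampling distribution is centered near p,
# with spread near sqrt(p(1-p)/n) = sqrt(0.55*0.45/50) ≈ 0.070.
print(round(statistics.mean(p_hats), 3))
print(round(statistics.stdev(p_hats), 3))
```

The simulated center and spread match the formulas developed later in Section 5.5.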
5.2 The Normal Distribution, Revisited
The Normal distribution is the most important distribution in statistics. Many sampling distributions are approximately Normal, which allows us to calculate probabilities using z-scores.
Normal Distribution: A symmetric, bell-shaped distribution completely described by its mean (μ) and standard deviation (σ).
Notation: X ~ N(μ, σ) means "X follows a Normal distribution with mean μ and standard deviation σ"
The Empirical Rule (68-95-99.7 Rule)
For any Normal distribution:
- 68% of data falls within 1 standard deviation of the mean (μ ± σ)
- 95% of data falls within 2 standard deviations of the mean (μ ± 2σ)
- 99.7% of data falls within 3 standard deviations of the mean (μ ± 3σ)
Z-Scores and the Standard Normal Distribution
Z-Score: The number of standard deviations a value is from the mean: z = (x − μ) / σ.
Standard Normal Distribution: A Normal distribution with μ = 0 and σ = 1, denoted N(0, 1).
Example: Heights of adult women are N(64, 2.5) inches. Find the probability a woman is taller than 69 inches.
Step 1: Calculate z-score
z = (69 − 64) / 2.5 = 5 / 2.5 = 2.0
Step 2: Use Normal table or calculator
P(Z > 2.0) = 1 − 0.9772 = 0.0228
About 2.28% of women are taller than 69 inches.
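If you prefer software to a Normal table, Python's standard library can check this example. A short sketch using `statistics.NormalDist` (an alternative to the TI-84 commands below, not a required tool):

```python
from statistics import NormalDist

# Heights of adult women: N(64, 2.5) inches, as in the example above.
heights = NormalDist(mu=64, sigma=2.5)

z = (69 - 64) / 2.5             # z-score: 2.0
p_taller = 1 - heights.cdf(69)  # P(X > 69) = P(Z > 2.0)
print(z, round(p_taller, 4))    # matches the table value, about 0.0228
```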
Calculator Commands (TI-83/84)
normalcdf(lower, upper, μ, σ)
Find P(lower < X < upper) for N(μ, σ)
invNorm(area, μ, σ)
Find x-value with given area to the LEFT
🎯 Finding Percentiles:
To find the value at the 90th percentile: Use invNorm(0.90, μ, σ)
This gives the x-value where 90% of the data falls below.
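The same percentile lookup works in Python via `NormalDist.inv_cdf`, the stdlib analogue of invNorm (shown here as a supplement, using the women's-height distribution from the earlier example):

```python
from statistics import NormalDist

# 90th percentile of N(64, 2.5): the x-value with area 0.90 to the LEFT.
x90 = NormalDist(mu=64, sigma=2.5).inv_cdf(0.90)
print(round(x90, 2))  # 64 + 1.2816 * 2.5, about 67.20 inches
```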
5.3 The Central Limit Theorem
The Central Limit Theorem (CLT) is one of the most powerful results in statistics. It explains why Normal distributions appear so often in statistical inference.
The Central Limit Theorem
When we take sufficiently large random samples from any population, the sampling distribution of the sample mean (x̄) will be approximately Normal.
This is true regardless of the shape of the original population!
Key Details of the CLT
For a sampling distribution of x̄:
Mean (Center): μ_x̄ = μ — the same as the population mean.
Standard Deviation (Spread): σ_x̄ = σ/√n — called the "standard error."
When is the Sample Size "Large Enough"?
Population is Normal
The sampling distribution of x̄ is exactly Normal for any sample size n.
No minimum n required!
Population is Not Normal
Rule of thumb: n ≥ 30 is usually sufficient for the CLT to apply.
More skewed populations need larger n.
⚠️ What the CLT Does NOT Say:
• It does NOT say the sample data becomes Normal
• It does NOT say individual observations are Normal
• It applies to the sampling distribution of the statistic, not the raw data
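A simulation makes the CLT concrete. The sketch below (a Python supplement, using an exponential population as a stand-in for "strongly skewed") draws sample means of size n = 30 from a right-skewed population with mean 1 and standard deviation 1; their distribution comes out centered at μ = 1 with spread near σ/√n, even though the raw data are far from Normal:

```python
import random
import statistics

# Strongly right-skewed population: exponential with mean 1 (so sigma = 1).
random.seed(2)

def sample_mean(n):
    """Mean of one random sample of size n from the skewed population."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(30) for _ in range(5000)]

print(round(statistics.mean(means), 2))   # center: near mu = 1.0
print(round(statistics.stdev(means), 2))  # spread: near 1/sqrt(30) ≈ 0.18
```

Note that it is the distribution of the 5,000 *means* that looks Normal; each individual sample is still skewed.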
5.4 Biased and Unbiased Point Estimates
A point estimate is a single value used to estimate a population parameter. But not all estimates are equally good — we want estimates that are unbiased.
Point Estimate: A single value (statistic) calculated from sample data to estimate a population parameter.
Unbiased Estimator: A statistic whose sampling distribution is centered at (has mean equal to) the true parameter value.
Common Point Estimates
| Parameter | Point Estimate | Unbiased? |
|---|---|---|
| Population mean (μ) | Sample mean (x̄) | ✓ Yes |
| Population proportion (p) | Sample proportion (p̂) | ✓ Yes |
| Population variance (σ²) | Sample variance (s²) with n−1 | ✓ Yes |
| Population SD (σ) | Sample SD (s) | ✗ Slightly biased |
🎯 Why n−1 for Sample Variance?
Using n−1 (instead of n) in the denominator of s² corrects for bias. This is called "Bessel's correction." It makes s² an unbiased estimator of σ².
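You can see Bessel's correction at work with a small simulation. This Python sketch (an illustration, not a derivation) draws many samples of size n = 5 from N(0, 1), where σ² = 1, and averages the two versions of the variance estimate:

```python
import random
import statistics

random.seed(3)
n, reps = 5, 20_000
biased, unbiased = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    m = statistics.mean(sample)
    ss = sum((x - m) ** 2 for x in sample)
    biased.append(ss / n)          # divide by n: systematically undershoots
    unbiased.append(ss / (n - 1))  # Bessel's correction

# Averaged over many samples: the n version lands near (n-1)/n = 0.8,
# while the n-1 version lands near the true sigma^2 = 1.
print(round(statistics.mean(biased), 2))
print(round(statistics.mean(unbiased), 2))
```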
Variability of Point Estimates
A good estimator should have:
- Low bias: Centered at the true parameter
- Low variability: Small spread (tight sampling distribution)
Key insight: Larger sample sizes reduce variability but don't affect bias. An estimator with a systematic error will be biased regardless of sample size.
Analogy: Target Practice
Unbiased + Low variability: Shots clustered at the bullseye ✓
Unbiased + High variability: Shots scattered but centered on bullseye
Biased + Low variability: Shots clustered but off-center
Biased + High variability: Shots scattered and off-center
5.5 Sampling Distributions for Sample Proportions
When we sample from a population and calculate the sample proportion p̂, this statistic has its own distribution with predictable properties.
Sampling Distribution of p̂
Shape: Approximately Normal (if conditions are met)
Center: μ_p̂ = p (the true population proportion)
Spread: σ_p̂ = √[p(1−p)/n]
Conditions for Normality
For the sampling distribution of p̂ to be approximately Normal:
1. Random Sample
Data must come from a random sample or randomized experiment.
2. Independence (10% Condition)
When sampling without replacement, n ≤ 10% of population size.
3. Large Counts (Success/Failure Condition)
Both expected counts must be at least 10: np ≥ 10 and n(1−p) ≥ 10.
Formulas for the Sampling Distribution of p̂
Mean: μ_p̂ = p
Standard Deviation (Standard Error): σ_p̂ = √[p(1−p)/n]
Example: A polling company samples 400 voters. If the true proportion supporting a candidate is p = 0.55, describe the sampling distribution of p̂.
Check Conditions:
• Random: Assume random sample ✓
• 10%: 400 < 10% of all voters ✓
• Large Counts: np = 400(0.55) = 220 ≥ 10 ✓
n(1−p) = 400(0.45) = 180 ≥ 10 ✓
Sampling Distribution:
μ_p̂ = 0.55
σ_p̂ = √[(0.55)(0.45)/400] = √[0.2475/400] = √0.000619 ≈ 0.0249
p̂ ~ N(0.55, 0.0249)
Follow-up: What's the probability that p̂ > 0.60?
z = (0.60 − 0.55) / 0.0249 = 0.05 / 0.0249 ≈ 2.01
P(p̂ > 0.60) = P(Z > 2.01) ≈ 0.0222
There's about a 2.2% chance the sample proportion exceeds 0.60.
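The polling calculation above can be reproduced in a few lines of Python (a check on the by-hand work, equivalent to normalcdf with the standard error):

```python
from math import sqrt
from statistics import NormalDist

# Polling example: p = 0.55, n = 400.
p, n = 0.55, 400
se = sqrt(p * (1 - p) / n)              # standard error, about 0.0249
prob = 1 - NormalDist(p, se).cdf(0.60)  # P(p-hat > 0.60)
print(round(se, 4), round(prob, 4))
```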
5.6 Sampling Distributions for Differences in Sample Proportions
When comparing two populations, we often want to estimate the difference between their proportions. The sampling distribution of p̂₁ − p̂₂ tells us how this difference varies.
Setting: Two independent random samples from two populations:
- Population 1: proportion p₁, sample size n₁, sample proportion p̂₁
- Population 2: proportion p₂, sample size n₂, sample proportion p̂₂
Sampling Distribution of p̂₁ − p̂₂
Shape: Approximately Normal (if conditions are met)
Center: μ_(p̂₁−p̂₂) = p₁ − p₂
Spread: σ_(p̂₁−p̂₂) = √[p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂]
Conditions for Normality
For EACH sample independently:
- Random: Both samples are random samples
- 10%: n₁ ≤ 10% of population 1, n₂ ≤ 10% of population 2
- Large Counts: n₁p₁ ≥ 10, n₁(1−p₁) ≥ 10, n₂p₂ ≥ 10, n₂(1−p₂) ≥ 10
Additionally: The two samples must be independent of each other.
🎯 Why Do Variances Add?
When combining independent random variables, variances always add (even for subtraction!):
Var(X − Y) = Var(X) + Var(Y)
This is why we add the variance terms under the square root.
Example: In a study comparing two treatments:
• Treatment A: n₁ = 200, true success rate p₁ = 0.72
• Treatment B: n₂ = 250, true success rate p₂ = 0.65
Describe the sampling distribution of p̂₁ − p̂₂:
μ_(p̂₁−p̂₂) = 0.72 − 0.65 = 0.07
σ_(p̂₁−p̂₂) = √[(0.72)(0.28)/200 + (0.65)(0.35)/250]
= √[0.001008 + 0.00091]
= √0.001918 ≈ 0.0438
p̂₁ − p̂₂ ~ N(0.07, 0.0438)
Follow-up: What's the probability that p̂₁ − p̂₂ < 0 (Treatment B appears better)?
z = (0 − 0.07) / 0.0438 = −1.60
P(p̂₁ − p̂₂ < 0) = P(Z < −1.60) ≈ 0.0548
There's about a 5.5% chance of observing Treatment B outperforming A in a sample, even though A is truly better.
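Here is the same treatment comparison done in Python (a sketch mirroring the by-hand steps; the small difference from 0.0548 comes from not rounding z to −1.60):

```python
from math import sqrt
from statistics import NormalDist

# Treatment comparison: p1 = 0.72 (n1 = 200) vs p2 = 0.65 (n2 = 250).
p1, n1, p2, n2 = 0.72, 200, 0.65, 250
mu_diff = p1 - p2                                 # center: 0.07
se_diff = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)       # variances ADD, about 0.0438
prob = NormalDist(mu_diff, se_diff).cdf(0)        # P(p-hat1 - p-hat2 < 0)
print(round(se_diff, 4), round(prob, 4))
```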
5.7 Sampling Distributions for Sample Means
The sampling distribution of the sample mean x̄ is one of the most important concepts in statistics. This is where the Central Limit Theorem really shines.
Sampling Distribution of x̄
Shape: Normal if population is Normal, OR approximately Normal if n ≥ 30 (CLT)
Center: μ_x̄ = μ (the population mean)
Spread: σ_x̄ = σ/√n (standard error)
Conditions for Normality
The sampling distribution of x̄ is approximately Normal if:
- Random: Data comes from a random sample or randomized experiment
- Independence (10% Condition): n ≤ 10% of population (if sampling without replacement)
- Normal/Large Sample:
- Population is Normal, OR
- n ≥ 30 (Central Limit Theorem)
Formulas for the Sampling Distribution of x̄
Mean: μ_x̄ = μ
Standard Error: σ_x̄ = σ/√n
🎯 Key Insight: Standard Error Decreases with Sample Size
As n increases, σ_x̄ = σ/√n gets smaller. This means:
- Larger samples give more precise estimates
- To cut the standard error in half, you need 4× the sample size
- To cut it by a factor of 10, you need 100× the sample size
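The √n scaling is easy to verify numerically. Using σ = 3 (the value in the example that follows), this sketch shows the 4× and 100× sample-size claims:

```python
from math import sqrt

# Standard error sigma/sqrt(n) for sigma = 3 at increasing sample sizes.
sigma = 3
for n in (36, 144, 3600):  # base n, 4x the base, 100x the base
    print(n, sigma / sqrt(n))  # 0.5, then half of that, then a tenth
```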
Example: The heights of adult males are distributed with μ = 70 inches and σ = 3 inches. A sample of 36 men is selected.
Describe the sampling distribution of x̄ (approximately Normal by the CLT, since n = 36 ≥ 30):
μ_x̄ = 70 inches
σ_x̄ = 3/√36 = 3/6 = 0.5 inches
x̄ ~ N(70, 0.5)
Find P(x̄ > 71):
z = (71 − 70) / 0.5 = 2.0
P(x̄ > 71) = P(Z > 2.0) ≈ 0.0228
Comparing Individual Values to Sample Means
Individual Value X
z = (x − μ) / σ
Use population SD (σ)
Higher variability — individual values spread more
Sample Mean x̄
z = (x̄ − μ) / (σ/√n)
Use standard error (σ/√n)
Lower variability — averaging reduces spread
⚠️ Common Mistake: Don't confuse the standard deviation of the population (σ) with the standard error of the mean (σ/√n). When working with sample means, ALWAYS use the standard error!
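The contrast is easy to see numerically with the heights example. This Python sketch computes P(one man is taller than 71) versus P(the mean of 36 men exceeds 71):

```python
from statistics import NormalDist

# Heights example: individuals X ~ N(70, 3); sample means x-bar ~ N(70, 0.5).
individual = NormalDist(70, 3)
sample_mean = NormalDist(70, 3 / 36 ** 0.5)  # standard error 3/sqrt(36) = 0.5

print(round(1 - individual.cdf(71), 4))   # P(X > 71): uses sigma, about 0.37
print(round(1 - sample_mean.cdf(71), 4))  # P(x-bar > 71): uses SE, about 0.02
```

A single man is often taller than 71 inches, but a sample *average* of 36 men almost never is: averaging dramatically reduces spread.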
5.8 Sampling Distributions for Differences in Sample Means
When comparing two populations' means, we use the sampling distribution of x̄₁ − x̄₂. This is essential for comparing treatment effects, group differences, and more.
Setting: Two independent random samples from two populations:
- Population 1: mean μ₁, SD σ₁, sample size n₁, sample mean x̄₁
- Population 2: mean μ₂, SD σ₂, sample size n₂, sample mean x̄₂
Sampling Distribution of x̄₁ − x̄₂
Shape: Normal if both populations are Normal, OR approximately Normal if both n₁ ≥ 30 and n₂ ≥ 30
Center: μ_(x̄₁−x̄₂) = μ₁ − μ₂
Spread: σ_(x̄₁−x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)
Conditions for Normality
For the sampling distribution of x̄₁ − x̄₂ to be approximately Normal:
- Random: Both samples are random samples
- Independence:
- The two samples are independent of each other
- 10% condition: n₁ ≤ 10% of pop 1, n₂ ≤ 10% of pop 2
- Normal/Large Sample: Both populations are Normal, OR both n₁ ≥ 30 and n₂ ≥ 30
Formulas for x̄₁ − x̄₂
Mean: μ_(x̄₁−x̄₂) = μ₁ − μ₂
Standard Error: σ_(x̄₁−x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)
Example: Comparing test scores between two schools:
• School A: μ₁ = 520, σ₁ = 80, n₁ = 50
• School B: μ₂ = 490, σ₂ = 75, n₂ = 45
Describe the sampling distribution of x̄₁ − x̄₂:
μ_(x̄₁−x̄₂) = 520 − 490 = 30
σ_(x̄₁−x̄₂) = √(80²/50 + 75²/45)
= √(128 + 125)
= √253 ≈ 15.91
x̄₁ − x̄₂ ~ N(30, 15.91)
Follow-up: What's the probability that x̄₂ > x̄₁ (School B's sample mean exceeds School A's)?
This is asking for P(x̄₁ − x̄₂ < 0):
z = (0 − 30) / 15.91 = −1.89
P(x̄₁ − x̄₂ < 0) = P(Z < −1.89) ≈ 0.0294
There's about a 3% chance School B's sample outperforms School A's, even though A has the higher population mean.
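The school comparison in Python (a check on the by-hand arithmetic; the tiny difference from 0.0294 comes from not rounding z to −1.89):

```python
from math import sqrt
from statistics import NormalDist

# School comparison: A has mu1 = 520, sigma1 = 80, n1 = 50;
#                    B has mu2 = 490, sigma2 = 75, n2 = 45.
mu_diff = 520 - 490                         # center: 30
se_diff = sqrt(80**2 / 50 + 75**2 / 45)     # sqrt(128 + 125), about 15.91
prob = NormalDist(mu_diff, se_diff).cdf(0)  # P(x-bar1 - x-bar2 < 0)
print(round(se_diff, 2), round(prob, 4))
```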
Summary: All Sampling Distributions
| Statistic | Mean | Standard Error | Normal Condition |
|---|---|---|---|
| p̂ | p | √[p(1−p)/n] | np ≥ 10 and n(1−p) ≥ 10 |
| p̂₁ − p̂₂ | p₁ − p₂ | √[p₁(1−p₁)/n₁ + p₂(1−p₂)/n₂] | Large counts for both samples |
| x̄ | μ | σ/√n | Population Normal OR n ≥ 30 |
| x̄₁ − x̄₂ | μ₁ − μ₂ | √(σ₁²/n₁ + σ₂²/n₂) | Both populations Normal OR both n ≥ 30 |
Calculator Tips for Normal Calculations
normalcdf(lower, upper, μ, σ)
For P(lower < statistic < upper), use the appropriate μ and standard error
invNorm(area, μ, σ)
Find the statistic value at a given percentile
Unit 5 Key Takeaways
Sampling variability: Statistics vary from sample to sample
Central Limit Theorem: x̄ is approximately Normal for large n (≥30)
Unbiased estimator: Mean of sampling distribution equals parameter
Standard error of p̂: √[p(1−p)/n]
Standard error of x̄: σ/√n
For differences: Variances ADD (under the square root)
Always check: Random, Independence (10%), Normal/Large Sample
normalcdf(lower, upper, μ, σ) | invNorm(area, μ, σ) | z = (statistic − μ)/SE
End of Unit 5 Study Guide.