unit-6 – HighFiveAP

6.1 Introducing Statistics: Why Be Normal?

In Unit 5, we discovered the power of the Normal distribution when describing sampling distributions. For categorical data (proportions), we rely on the Normal model to make inferences about an entire population based on a single sample.

The Foundation of Inference: To estimate a parameter or test a claim, we must ensure the sampling distribution of our statistic (like p̂) is approximately Normal. We check this using the Large Counts Condition.

np ≥ 10 AND n(1−p) ≥ 10

When this condition is met, we can use z-scores and the standard Normal curve to calculate margins of error and p-values. If the condition fails, the sampling distribution may be noticeably skewed, and Normal-based calculations become unreliable.

Exam Tip: Never just state "Large Counts met." Always show the actual multiplication with your specific sample size and proportion to prove they are $\ge 10$.
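The Large Counts check can be sketched in a few lines of Python. This is only an illustration: the function name `large_counts_ok` and the sample values are made up for this example and do not come from any AP resource.

```python
def large_counts_ok(n, p, threshold=10):
    """Return True when both expected counts are at least the threshold."""
    expected_successes = n * p        # n * p
    expected_failures = n * (1 - p)   # n * (1 - p)
    return expected_successes >= threshold and expected_failures >= threshold


# Hypothetical samples: n = 200 with p = 0.70, then n = 30 with p = 0.10
print(large_counts_ok(200, 0.70))  # 140 and 60 are both at least 10
print(large_counts_ok(30, 0.10))   # 3 is below 10, so the check fails
```

On the exam you would still write out the two products by hand; the code only mirrors that arithmetic.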

6.2 Constructing a Confidence Interval for a Population Proportion

A Confidence Interval (CI) provides a range of plausible values for the unknown population proportion. Because sample statistics naturally vary (sampling variability), a point estimate alone isn't enough.

Formula: 1-Sample Z-Interval for p

p̂ ± z* √[p̂(1−p̂) / n]

Point Estimate ± Margin of Error
(where Margin of Error = Critical Value × Standard Error)

The Four-Step Process (PANIC)

On Free Response questions, use the PANIC acronym to ensure full credit.

Step	What to do
P	Parameter: Define the parameter p in context (e.g., "p = the true proportion of...").
A	Assess Conditions: 1. Random sample/assignment 2. 10% Condition (n ≤ 10% of pop) 3. Large Counts (np̂ ≥ 10, n(1−p̂) ≥ 10)
N	Name Procedure: State "1-Sample Z-Interval for p".
I & C	Interval & Conclude: Calculate the interval and interpret it: "We are C% confident that the interval from [lower] to [upper] captures the true proportion of..."

Example: HighFiveAP surveys a random sample of 200 high school students and finds that 140 of them prefer digital flashcards over paper. Create a 95% confidence interval for the true proportion.

1. Point Estimate (p̂):

140 / 200 = 0.70

2. Critical Value (z*):

For 95% confidence, z* = 1.96

3. Calculation:

0.70 ± 1.96 * √[(0.70)(0.30) / 200]
0.70 ± 1.96 * 0.0324
0.70 ± 0.0635 ➔ (0.6365, 0.7635)
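The same arithmetic can be reproduced with Python's standard library, as a rough check on the hand calculation. The function name `one_prop_z_interval` is hypothetical; `NormalDist().inv_cdf` supplies the critical value instead of a table.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence=0.95):
    """1-sample z-interval for a proportion: p_hat ± z* · sqrt(p_hat(1 − p_hat)/n)."""
    p_hat = x / n                                             # point estimate
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


lower, upper = one_prop_z_interval(140, 200)
print(f"({lower:.4f}, {upper:.4f})")
```

For the flashcard survey this should agree with the interval above to four decimal places, since the software critical value (about 1.96) matches the table value used by hand.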

Calculator Commands (TI-83/84)

STAT ➔ TESTS ➔ A: 1-PropZInt

x: success count (must be a whole number!), n: sample size, C-Level: confidence level (e.g., 0.95)

6.3 Justifying a Claim Based on a Confidence Interval

Once we have a confidence interval, we can use it to evaluate claims made about the population parameter.

Plausible Claims

If a hypothesized value IS INSIDE the confidence interval, it is a plausible value for the true parameter. We cannot reject the claim.

Implausible Claims

If a hypothesized value IS OUTSIDE the confidence interval, we have convincing evidence that the claim is false.

Scenario: A school district claims that exactly 80% of its students prefer digital flashcards over paper. Does our interval from Section 6.2, (0.6365, 0.7635), support the district's claim?

Conclusion: No. Because 0.80 is not contained in our 95% confidence interval, we have convincing evidence against the district's claim. Every plausible value in the interval is below 0.80, so the data suggest the true proportion is lower.
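The inside/outside rule is simple enough to write as a one-line check. The sketch below uses a made-up helper name, `claim_is_plausible`, and the interval from Section 6.2.

```python
def claim_is_plausible(claimed_value, interval):
    """A claimed value is plausible when it lies inside the confidence interval."""
    lower, upper = interval
    return lower <= claimed_value <= upper


interval = (0.6365, 0.7635)   # 95% interval from Section 6.2
print(claim_is_plausible(0.80, interval))  # False: 0.80 is outside the interval
print(claim_is_plausible(0.70, interval))  # True: 0.70 is inside the interval
```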

6.4 Setting Up a Test for a Population Proportion

A Hypothesis Test provides a formal process for weighing evidence against a specific claim. It starts by setting up two competing hypotheses.

Null Hypothesis (H₀): The claim we weigh evidence against. It represents "no difference" or the status quo. It always uses an equal sign (e.g., p = p₀).

Alternative Hypothesis (Hₐ): The claim we are trying to find evidence FOR. It uses inequalities (<, >, or ≠).

The Test Statistic

To test the claim, we calculate a test statistic (z-score) which tells us how many standard errors our sample proportion (p̂) is away from the null value (p₀).

Test Statistic Formula

z = (p̂ − p₀) / √[p₀(1−p₀) / n]

⚠️ Critical Difference: Notice that the denominator uses p₀ (the null value), not p̂! In a hypothesis test, we build our distribution assuming the null hypothesis is true, so we use p₀ for both the standard error and the Large Counts condition check.
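As an illustration, here is the test statistic computed in Python. It reuses the 140-out-of-200 sample from Section 6.2 and treats the 0.80 claim from Section 6.3 as the null value; that pairing is an assumption made for this sketch, and the function name is hypothetical.

```python
from math import sqrt


def one_prop_z_statistic(x, n, p0):
    """z = (p_hat − p0) / sqrt(p0(1 − p0)/n), with the denominator built from p0."""
    p_hat = x / n
    standard_error = sqrt(p0 * (1 - p0) / n)   # uses the null value, not p_hat
    return (p_hat - p0) / standard_error


z = one_prop_z_statistic(140, 200, 0.80)
print(round(z, 2))   # about -3.54: p_hat sits roughly 3.5 standard errors below p0
```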

6.5 Interpreting p-Values

The p-value is a central concept in statistical inference. It measures the strength of your evidence against H₀.

The Golden Definition of a p-value

The probability of getting a sample statistic as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true.

Interpretation Example: "If the true proportion of cats who like the new magic wand is really 0.50 (H₀), there is only a 0.03 (3%) probability of getting a sample proportion of 0.63 or higher purely by random chance."
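Once a z-score is in hand, the p-value is a tail area under the standard Normal curve. The sketch below assumes three labels for the alternative ("greater", "less", "two-sided") that were chosen just for this example, and uses z = 1.88 as a hypothetical test statistic whose upper-tail area is close to the 0.03 in the interpretation above.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """Tail area beyond z under the standard Normal curve, assuming H0 is true."""
    std_normal = NormalDist()   # mean 0, standard deviation 1
    if alternative == "greater":      # Ha: p > p0
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # Ha: p < p0
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: p ≠ p0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(round(p_value(1.88, "greater"), 2))   # roughly 0.03
```

A two-sided alternative doubles the one-tail area, which is why the same z-score gives weaker evidence against H₀ when the alternative is ≠.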

6.6 Concluding a Test for a Population Proportion

To make a final decision, we compare our p-value to our pre-selected significance level ($\alpha$), which is usually 0.05.

p-value ≤ α	p-value > α
Decision: Reject H₀	Decision: Fail to Reject H₀
Because the p-value is less than α, the result is statistically significant. We HAVE convincing evidence for Hₐ.	Because the p-value is greater than α, the result is not statistically significant. We DO NOT HAVE convincing evidence for Hₐ.

🎯 Memory Trick: "If the p is low, the null must go!" (Reject). "If the p is high, the null must fly!" (Fail to Reject).

Exam Tip: Never say we "accept H₀" or "prove H₀." We only ever fail to find enough evidence to reject it. It's like a courtroom: a defendant is found "not guilty" (fail to reject), rather than proven "innocent."
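The decision rule can be written as a tiny function. This is a sketch with a made-up name, `conclude`; on the exam the conclusion must still be stated in context.

```python
def conclude(p_value, alpha=0.05):
    """Compare the p-value to alpha and return the matching decision wording."""
    if p_value <= alpha:
        return "Reject H0: convincing evidence for Ha"
    return "Fail to reject H0: not convincing evidence for Ha"


print(conclude(0.03))   # low p-value, so the null must go
print(conclude(0.20))   # high p-value, so we fail to reject
```

Note that neither branch ever returns "accept H0".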

6.7 Potential Errors When Performing Tests

Because inference relies on partial data, occasionally a random sample will lead us to the wrong conclusion. We classify these into two types of errors.

Type I Error (False Positive)

Rejecting H₀ when H₀ is actually true.

You find convincing evidence for a change/effect that doesn't actually exist.
Probability of a Type I error = α (Significance Level).

Type II Error (False Negative)

Failing to reject H₀ when Hₐ is actually true.

You miss a real effect or difference because your sample didn't show enough evidence.
Probability is denoted by β.

Power of a Test: The probability that a test correctly rejects a false null hypothesis.
Power = 1 − P(Type II Error)

You can increase power by increasing sample size ($n$) or increasing the significance level ($\alpha$).
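To see why a larger sample size or significance level raises power, here is a rough sketch for an upper-tailed one-proportion z-test (Hₐ: p > p₀). The values p₀ = 0.50 and a true proportion of 0.60 are hypothetical, as is the function name; power calculations like this are beyond what the exam asks you to compute by hand.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(p0, p_alt, n, alpha=0.05):
    """Approximate power of a one-proportion z-test with Ha: p > p0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0
    cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)
    # Spread of p_hat if the true proportion is really p_alt
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)
    # Power = P(p_hat > cutoff when p = p_alt) = 1 − P(Type II error)
    return 1 - std_normal.cdf((cutoff - p_alt) / sd_alt)


print(power_upper_tail(0.50, 0.60, n=100))               # baseline
print(power_upper_tail(0.50, 0.60, n=400))               # larger n, more power
print(power_upper_tail(0.50, 0.60, n=100, alpha=0.10))   # larger alpha, more power
```

The trade-off is that raising α also raises the probability of a Type I error.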

6.8 Confidence Intervals for the Difference of Two Proportions

When we want to estimate the difference between two populations (or two treatment groups in an experiment), we calculate a 2-Sample Z-Interval.

Formula: 2-Sample Z-Interval for $p_1 - p_2$

(p̂₁ − p̂₂) ± z* √[p̂₁(1−p̂₁) / n₁ + p̂₂(1−p̂₂) / n₂]

Conditions Update: You must check the Random, 10%, and Large Counts conditions for BOTH samples independently before proceeding.
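A minimal sketch of the two-sample interval in Python follows. The counts (60 of 100 versus 45 of 100) are invented for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """2-sample z-interval for p1 − p2, using each sample's own proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample contributes its own variance term
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin_of_error = z_star * standard_error
    difference = p1_hat - p2_hat
    return difference - margin_of_error, difference + margin_of_error


lower, upper = two_prop_z_interval(60, 100, 45, 100)
print(f"({lower:.3f}, {upper:.3f})")
```

The standard error here is not pooled, because an interval does not assume the two proportions are equal.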

6.9 Justifying a Claim Based on a Confidence Interval for a Difference

When analyzing an interval for $p_1 - p_2$, the most critical number to look for is Zero (0).

Interval Contains 0

Example: (-0.05, 0.12)

Since 0 is a plausible value for the difference, it is plausible that there is no difference between the two proportions. We cannot justify a claim that one group's proportion is greater than the other's.

Interval Does Not Contain 0

Example: (0.04, 0.18) OR (-0.22, -0.08)

Since 0 is not plausible, we have convincing evidence that there is a difference between the groups.
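The zero check can be sketched as a small helper applied to the example intervals above. The name `describe_difference` and the wording of its return values are made up for this illustration.

```python
def describe_difference(interval):
    """Classify an interval for p1 − p2 by whether it contains 0."""
    lower, upper = interval
    if lower <= 0 <= upper:
        return "0 is plausible: no convincing evidence of a difference"
    if lower > 0:
        return "entirely positive: convincing evidence that p1 > p2"
    return "entirely negative: convincing evidence that p1 < p2"


for example in [(-0.05, 0.12), (0.04, 0.18), (-0.22, -0.08)]:
    print(example, "->", describe_difference(example))
```

The sign of the interval also tells you the direction of the difference, which depends on which group was labeled 1.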

6.10 Setting Up a Test for the Difference of Two Population Proportions

When testing to see if two populations differ, our Null Hypothesis assumes they are exactly the same.

H₀: p₁ = p₂ (or p₁ − p₂ = 0)

Hₐ: p₁ > p₂ (or <, or ≠)

The Pooled Proportion ($\hat{p}_C$)

If we assume under $H_0$ that the two populations have the same proportion, we shouldn't use two different $p$ values to calculate our standard error. Instead, we combine (pool) the successes and sample sizes into one giant sample.

Pooled p̂_C = (x₁ + x₂) / (n₁ + n₂)

Exam Tip: For the Large Counts condition in a 2-sample test, use this pooled proportion ($\hat{p}_C$) to check all four counts: $n_1\hat{p}_C \ge 10$, $n_1(1-\hat{p}_C) \ge 10$, $n_2\hat{p}_C \ge 10$, and $n_2(1-\hat{p}_C) \ge 10$.
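Here is a short sketch of pooling and the pooled Large Counts check. The function names and the counts (60 of 100 and 45 of 100) are hypothetical.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Combine successes and sample sizes into one overall proportion."""
    return (x1 + x2) / (n1 + n2)


def pooled_large_counts_ok(x1, n1, x2, n2, threshold=10):
    """Check all four expected counts using the pooled proportion."""
    p_c = pooled_proportion(x1, n1, x2, n2)
    counts = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    return all(count >= threshold for count in counts)


print(pooled_proportion(60, 100, 45, 100))        # 105 / 200 = 0.525
print(pooled_large_counts_ok(60, 100, 45, 100))   # all four counts are at least 10
```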

6.11 Carrying Out a Test for the Difference of Proportions

Finally, we calculate the z-statistic for a two-sample test using the pooled proportion.

2-Sample Z-Test Statistic Formula

z = [(p̂₁ − p̂₂) − 0] / √[p̂_C(1−p̂_C) / n₁ + p̂_C(1−p̂_C) / n₂]

Once you calculate the z-score, find the p-value using your normal distribution tools, compare it to $\alpha$, and state your conclusion in context, exactly as done in Section 6.6.
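Putting the pieces together, a two-sample test might be sketched as follows. The data are invented (60 of 100 versus 45 of 100), the function name is hypothetical, and the p-value shown is for a two-sided alternative; for a one-sided Hₐ you would take a single tail instead.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: p1 = p2 using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c = (x1 + x2) / (n1 + n2)   # pooled proportion under H0
    standard_error = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    z = ((p1_hat - p2_hat) - 0) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


z, p = two_prop_z_test(60, 100, 45, 100)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

With these made-up counts the p-value falls below 0.05, so at α = 0.05 the conclusion would be to reject H₀.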

Calculator Commands (TI-83/84)

STAT ➔ TESTS ➔ 6: 2-PropZTest

x1, n1: sample 1 stats. x2, n2: sample 2 stats. The calculator handles the pooled proportion for you in the background!

Unit 6 Key Takeaways

Confidence Intervals: Point Estimate ± Margin of Error

PANIC for intervals, PHANTOMS for tests

p-value: Prob of result this extreme assuming H₀ is true

Type I Error: Reject true H₀ | Type II Error: Fail to reject false H₀

Use p₀ for SE in 1-Prop Tests, Use pooled $\hat{p}_C$ for SE in 2-Prop Tests
