AP Statistics – Unit 4: Probability, Random Variables, and Probability Distributions

4.1 Introducing Statistics: Random and Non-Random Patterns?

Probability is the mathematical language of randomness. Understanding the difference between random and non-random patterns is fundamental to statistical inference.

Randomness: A process is random if individual outcomes are unpredictable, but there is a regular pattern in the long run over many repetitions.

Random vs. Non-Random Patterns

Random

Individual outcomes are unpredictable, but follow a predictable pattern in the long run.

  • Coin flips (can't predict next flip)
  • Rolling dice
  • Random sampling

Short-term: Unpredictable
Long-term: Predictable pattern

Non-Random (Deterministic)

Outcomes follow a fixed, predictable pattern.

  • Mathematical formulas
  • Physical laws
  • Programmed sequences

Both short-term and long-term: Predictable

The Law of Large Numbers

Law of Large Numbers: As the number of trials increases, the observed proportion of outcomes gets closer and closer to the true probability.

Example: Flipping a fair coin

• After 10 flips: Might get 70% heads (not surprising)

• After 100 flips: Probably closer to 50% heads

• After 10,000 flips: Very close to 50% heads

The proportion approaches 0.5 as trials increase — this is the Law of Large Numbers in action!
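This long-run behavior is easy to see in a quick simulation. Here is a minimal Python sketch (the function name and seed are our own choices):

```python
import random

def proportion_heads(n_flips, seed=42):
    """Flip a fair coin n_flips times and return the proportion of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The proportion settles toward 0.5 as the number of flips grows
for n in (10, 100, 10_000):
    print(n, proportion_heads(n))
```

With a small n the proportion can wander far from 0.5; with 10,000 flips it lands very close, which is the Law of Large Numbers at work.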

⚠️ Common Misconception: "The Gambler's Fallacy"

After getting 5 heads in a row, the next flip is NOT "due" to be tails! Each flip is independent — the coin has no memory. The Law of Large Numbers works over MANY trials, not by "evening out" in the short term.

Exam Tip: The Law of Large Numbers explains why probability "works" — it's about long-run relative frequency, not short-term predictions.

4.2 Estimating Probabilities Using Simulation

When theoretical probability is difficult to calculate, we can use simulation to estimate probabilities by mimicking the random process many times.

Simulation: Using a model to imitate a random process, then observing outcomes over many repetitions to estimate probabilities.

Steps for Designing a Simulation

Steps (illustrated by simulating a 70% free-throw shooter):

1. Model: Describe how to use a random device to imitate one trial of the process. Example: use random digits 0-9; let 0-6 = make and 7-9 = miss (gives 70% probability).
2. Repetition: Describe how to run one complete trial. Example: generate 10 random digits to simulate 10 free throws.
3. Run Many Trials: Perform many trials (at least 100) and record results. Example: repeat the 10-shot simulation 200 times, recording makes each time.
4. Estimate: Calculate the proportion of trials with the outcome of interest. Example: if 156 out of 200 trials had 7+ makes, estimate P(7+ makes) ≈ 0.78.

Example Simulation: What's the probability of getting at least one "6" when rolling a die 4 times?

Model: Use random digits 1-6 (ignore 0, 7, 8, 9). Or use a die!

Trial: Generate 4 random digits (1-6), check if any are 6.

Results after 100 trials: 52 trials had at least one 6.

Estimate: P(at least one 6 in 4 rolls) ≈ 52/100 = 0.52

(Theoretical answer: 1 - (5/6)⁴ ≈ 0.518 — our simulation is close!)
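The same simulation is easy to run in code; this Python sketch uses 100,000 trials instead of 100 (the trial count and seed are our own choices):

```python
import random

def estimate_at_least_one_six(trials=100_000, seed=1):
    """Estimate P(at least one 6 in 4 die rolls) by simulation."""
    rng = random.Random(seed)
    hits = sum(any(rng.randint(1, 6) == 6 for _ in range(4))
               for _ in range(trials))
    return hits / trials

estimate = estimate_at_least_one_six()
theoretical = 1 - (5 / 6) ** 4   # ≈ 0.518
print(round(estimate, 3), round(theoretical, 3))
```

With this many trials, the estimate reliably lands within about a percentage point of the theoretical answer.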

Using a Random Number Table

• Read digits in groups (singles, pairs, triples) depending on your model

• Skip digits that don't fit your model

• Continue in a consistent direction (left to right, top to bottom)

Example: For 70% probability, let 00-69 = success and 70-99 = failure.

Exam Tip: When describing a simulation, be SPECIFIC about your model: What digits represent what outcomes? How do you handle repeated digits? How many trials?

4.3 Introduction to Probability

Probability measures how likely an event is to occur, ranging from 0 (impossible) to 1 (certain). Understanding probability rules is essential for statistical inference.

Basic Probability Concepts

Term                    Definition                                              Example (Rolling a Die)
Sample Space (S)        The set of ALL possible outcomes                        S = {1, 2, 3, 4, 5, 6}
Event                   A subset of the sample space; a collection of outcomes  A = "rolling even" = {2, 4, 6}
Probability of Event A  P(A) = (# of outcomes in A) / (# of outcomes in S)      P(even) = 3/6 = 0.5
Complement (Aᶜ)         All outcomes NOT in event A                             Aᶜ = "rolling odd" = {1, 3, 5}

Basic Probability Rules

Rule 1: Range

0 ≤ P(A) ≤ 1

Probability is always between 0 and 1

Rule 2: All Outcomes

P(S) = 1

Something must happen

Rule 3: Complement

P(Aᶜ) = 1 − P(A)

P(not A) = 1 − P(A)

Rule 4: Equally Likely

P(A) = |A| / |S|

Count favorable / count total

Probability scale: 0 = impossible, 0.5 = equally likely to occur or not, 1 = certain.

🎯 The Complement Rule is Your Friend:

When asked for P("at least one"), it's often easier to calculate:

P(at least one) = 1 − P(none)
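As a quick worked example (the 30% success rate and four attempts are hypothetical numbers, not from the text above):

```python
# Each of 4 independent attempts succeeds with probability p = 0.3.
# Complement rule: P(at least one success) = 1 - P(no successes)
p = 0.3
p_none = (1 - p) ** 4            # all four attempts fail: 0.7^4 = 0.2401
p_at_least_one = 1 - p_none      # 0.7599
print(p_at_least_one)
```

Computing P(none) takes one multiplication chain; computing P(at least one) directly would require adding four separate cases.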

Exam Tip: "At least one" means 1 or more. Use the complement: P(at least 1) = 1 − P(0). This is almost always the easiest approach!

4.4 Mutually Exclusive Events

Some events cannot happen at the same time. Understanding when events are mutually exclusive is crucial for calculating probabilities correctly.

Mutually Exclusive (Disjoint) Events: Two events that cannot occur at the same time. If one happens, the other cannot happen.

P(A and B) = 0
[Venn diagrams: mutually exclusive events A and B have no overlap; events that are NOT mutually exclusive share an overlap region A∩B.]

Examples of Mutually Exclusive Events:

• Rolling a die: "getting a 2" and "getting a 5" ✓

• Drawing a card: "getting a heart" and "getting a spade" ✓

• A student's grade: "getting an A" and "getting a B" ✓

NOT Mutually Exclusive:

• Drawing a card: "getting a heart" and "getting a queen" ✗ (Queen of Hearts exists!)

• A student: "plays sports" and "plays music" ✗ (can do both)

Addition Rule for Mutually Exclusive Events

If A and B are mutually exclusive:

P(A or B) = P(A) + P(B)

This extends to multiple events: P(A or B or C) = P(A) + P(B) + P(C) if all are mutually exclusive.

Example: Rolling a die. Find P(rolling a 2 or a 5).

Since "rolling a 2" and "rolling a 5" are mutually exclusive:

P(2 or 5) = P(2) + P(5) = 1/6 + 1/6 = 2/6 = 1/3
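The same arithmetic, done with exact fractions in Python (a small sketch):

```python
from fractions import Fraction

# "Rolling a 2" and "rolling a 5" are mutually exclusive,
# so their probabilities simply add
p_2_or_5 = Fraction(1, 6) + Fraction(1, 6)
print(p_2_or_5)   # 1/3
```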

⚠️ Important: You can ONLY add probabilities directly when events are mutually exclusive! If events can happen together, you must use the General Addition Rule (covered in 4.6).

Exam Tip: Before adding probabilities, ALWAYS check if events are mutually exclusive. Ask: "Can both happen at the same time?" If yes, don't just add!

4.5 Conditional Probability

Conditional probability is the probability of an event occurring given that another event has already occurred. It answers: "How does knowing one thing affect the probability of another?"

Conditional Probability: The probability of event A occurring, given that event B has occurred.

P(A | B) =
P(A and B)
P(B)

Read "P(A | B)" as "the probability of A given B"

🎯 Intuition: When we know B has happened, we're restricting our sample space to only the outcomes where B occurs. Then we ask: of those outcomes, how many also have A?

Example: A survey of 200 students found:

         Plays Sports   Doesn't Play   Total
Male          60             40         100
Female        45             55         100
Total        105             95         200

Find P(Plays Sports | Male):

Given that a student is male, what's the probability they play sports?

P(Sports | Male) = 60/100 = 0.60

Find P(Male | Plays Sports):

Given that a student plays sports, what's the probability they're male?

P(Male | Sports) = 60/105 ≈ 0.571

Notice: P(A | B) ≠ P(B | A) in general!

General Multiplication Rule

Rearranging the conditional probability formula gives us the General Multiplication Rule:

P(A and B) = P(B) × P(A | B)

Or equivalently: P(A and B) = P(A) × P(B | A)

Example: A bag has 5 red and 3 blue marbles. Draw 2 marbles without replacement. Find P(both red).

P(1st red) = 5/8

P(2nd red | 1st red) = 4/7 (one red already gone)

P(both red) = (5/8) × (4/7) = 20/56 = 5/14 ≈ 0.357
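The exact answer and a simulation check, sketched in Python (trial count and seed are our own choices; `random.sample` models drawing without replacement):

```python
import random
from fractions import Fraction

# Exact answer via the general multiplication rule
exact = Fraction(5, 8) * Fraction(4, 7)      # 5/14

def estimate_both_red(trials=100_000, seed=7):
    """Simulate drawing 2 marbles without replacement from 5 red, 3 blue."""
    rng = random.Random(seed)
    bag = ["R"] * 5 + ["B"] * 3
    hits = sum(rng.sample(bag, 2) == ["R", "R"] for _ in range(trials))
    return hits / trials

print(exact, round(estimate_both_red(), 3))
```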

Exam Tip: Pay attention to "with replacement" vs "without replacement." Without replacement changes the probabilities for subsequent draws!

4.6 Independent Events and Unions of Events

Two events are independent if knowing that one occurred doesn't change the probability of the other. This concept is crucial for multiplication rules.

Independent Events

Independent Events: Events A and B are independent if:

P(A | B) = P(A)  or  P(B | A) = P(B)

Knowing B occurred doesn't change the probability of A (and vice versa).

Independent Events ✓

  • Flipping two separate coins
  • Rolling two dice
  • Drawing WITH replacement
  • Unrelated events (rain today, coin flip)

NOT Independent ✗

  • Drawing WITHOUT replacement (including sampling without replacement)
  • Related events (studying, test score)
  • Events where one affects the other

Multiplication Rule for Independent Events

If A and B are independent:

P(A and B) = P(A) × P(B)

This extends to multiple independent events: P(A and B and C) = P(A) × P(B) × P(C)

Example: Flip a coin and roll a die. Find P(Heads and 6).

Events are independent (coin doesn't affect die).

P(H and 6) = P(H) × P(6) = (1/2) × (1/6) = 1/12

⚠️ Independent ≠ Mutually Exclusive!

Mutually exclusive: Events CANNOT happen together. P(A and B) = 0.

Independent: Events DON'T AFFECT each other. P(A and B) = P(A) × P(B).

In fact, if A and B are mutually exclusive (and both have P > 0), they CANNOT be independent!

General Addition Rule (Union of Events)

For ANY events A and B:

P(A or B) = P(A) + P(B) − P(A and B)

We subtract P(A and B) to avoid counting the overlap twice.

[Venn diagram: A and B overlap in A∩B; P(A∪B) = P(A) + P(B) − P(A∩B).]

Example: P(Heart) = 13/52, P(Queen) = 4/52, P(Queen of Hearts) = 1/52

Find P(Heart or Queen):

P(Heart or Queen) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13
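The same calculation with exact fractions (a small Python sketch):

```python
from fractions import Fraction

p_heart = Fraction(13, 52)
p_queen = Fraction(4, 52)
p_queen_of_hearts = Fraction(1, 52)   # the overlap, subtracted once

p_heart_or_queen = p_heart + p_queen - p_queen_of_hearts
print(p_heart_or_queen)   # 4/13
```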

Exam Tip: Always check if events are independent OR mutually exclusive before choosing a formula. The wrong assumption leads to wrong answers!

4.7 Introduction to Random Variables and Probability Distributions

A random variable assigns a numerical value to each outcome of a random process. This allows us to use mathematical tools to analyze random phenomena.

Random Variable: A variable whose value is a numerical outcome of a random phenomenon. Denoted with capital letters (X, Y, Z).

Types of Random Variables

Discrete Random Variable

Takes a countable number of possible values (often whole numbers).

  • Number of heads in 10 flips
  • Number of customers per hour
  • Number of defective items

Can list all possible values

Continuous Random Variable

Takes any value in an interval (infinitely many possible values).

  • Height of a person
  • Time to complete a task
  • Temperature

Cannot list all possible values

Probability Distribution

Probability Distribution: A description of all possible values of a random variable and their associated probabilities.

Requirements:

  • Every probability is between 0 and 1
  • All probabilities sum to 1

Example: Let X = number of heads when flipping a coin twice

x          0     1     2
P(X = x)  1/4   2/4   1/4

Check: 1/4 + 2/4 + 1/4 = 4/4 = 1 ✓
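Both requirements are easy to check in code; a small sketch (the dictionary representation and function name are our own choices):

```python
from fractions import Fraction

# X = number of heads in two fair-coin flips
dist = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

def is_valid_distribution(dist):
    """Both requirements: every probability in [0, 1], and they sum to 1."""
    return all(0 <= p <= 1 for p in dist.values()) and sum(dist.values()) == 1

print(is_valid_distribution(dist))                                    # True
print(is_valid_distribution({0: Fraction(1, 2), 1: Fraction(1, 4)}))  # False
```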

Probability Histogram

[Probability histogram: X = number of heads in 2 flips; bars at x = 0, 1, 2 with heights 0.25, 0.50, 0.25.]

Exam Tip: When given a probability distribution, always verify that probabilities sum to 1. If they don't, there's an error!

4.8 Mean and Standard Deviation of Random Variables

Just like data sets, random variables have a mean (expected value) and standard deviation. These describe the center and spread of the probability distribution.

Expected Value (Mean)

Expected Value (Mean) of X: The long-run average value of X over many, many repetitions.

μₓ = E(X) = Σ xᵢ · P(xᵢ)

Multiply each value by its probability, then add them all up.

Example: X = number of heads in 2 coin flips

x    P(X = x)   x · P(x)
0      0.25       0
1      0.50       0.50
2      0.25       0.50
                E(X) = 1.00

On average, you expect 1 head in 2 coin flips.

Standard Deviation

Variance of X:

σ²ₓ = Var(X) = Σ (xᵢ − μₓ)² · P(xᵢ)

Standard Deviation of X:

σₓ = SD(X) = √Var(X)

🎯 Interpretation:

μₓ (mean): The "balance point" of the distribution — what you expect on average

σₓ (SD): The typical distance from the mean — measures spread/variability
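The formulas for the coin-flip distribution above, worked out in Python (a short sketch):

```python
import math

# X = number of heads in two fair-coin flips
values = [0, 1, 2]
probs = [0.25, 0.50, 0.25]

mean = sum(x * p for x, p in zip(values, probs))                   # E(X)
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
sd = math.sqrt(variance)

print(mean, variance, round(sd, 4))   # 1.0 0.5 0.7071
```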

Calculator Tip

Enter x-values in L1 and probabilities in L2, then:

STAT → CALC → 1:1-Var Stats L1, L2

This gives the mean (labeled x̄ on the calculator, though here it is really μₓ) and standard deviation (σx) of the distribution.

Exam Tip: The expected value does NOT have to be a possible value of X! For example, E(X) = 1 for coin flips, but you can't actually flip "1" heads — you get 0, 1, or 2.

4.9 Combining Random Variables

When we add, subtract, or scale random variables, we need rules for how the means and standard deviations combine.

Linear Transformations: aX + b

If Y = aX + b (multiply by constant a, then add constant b):

μᵧ = a·μₓ + b
σᵧ = |a|·σₓ

Adding a constant shifts the mean but doesn't change spread. Multiplying changes both.

Example: Test scores X have μₓ = 70 and σₓ = 10.

The teacher curves by doubling and adding 5: Y = 2X + 5

μᵧ = 2(70) + 5 = 145

σᵧ = 2(10) = 20

Adding/Subtracting Random Variables

For any random variables X and Y:

μₓ₊ᵧ = μₓ + μᵧ
μₓ₋ᵧ = μₓ − μᵧ

If X and Y are INDEPENDENT:

σ²ₓ₊ᵧ = σ²ₓ + σ²ᵧ
σ²ₓ₋ᵧ = σ²ₓ + σ²ᵧ

⚠️ Critical Point: Variances ALWAYS ADD!

Whether you're adding or subtracting random variables, you ADD the variances. This seems counterintuitive for subtraction, but both addition and subtraction increase variability.

Remember: σₓ₊ᵧ = √(σ²ₓ + σ²ᵧ), NOT σₓ + σᵧ

Example: X = your score (μ = 80, σ = 5) and Y = opponent's score (μ = 75, σ = 4)

Let D = X − Y (your margin of victory)

μ_D = 80 − 75 = 5

σ²_D = 5² + 4² = 25 + 16 = 41

σ_D = √41 ≈ 6.4

(Assuming X and Y are independent)
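A simulation confirms that variances add under subtraction. This sketch assumes normally distributed scores purely for illustration; the rule itself only requires independence:

```python
import math
import random
import statistics

rng = random.Random(3)
n = 200_000

# Independent X (mean 80, SD 5) and Y (mean 75, SD 4); D = X - Y
margins = [rng.gauss(80, 5) - rng.gauss(75, 4) for _ in range(n)]

print(round(statistics.mean(margins), 2))    # close to 5
print(round(statistics.stdev(margins), 2))   # close to sqrt(41) ≈ 6.40
```

Note that the simulated SD comes out near 6.4, not 5 − 4 = 1 or 5 + 4 = 9: the variances added even though the variables were subtracted.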

Summary: Combining Random Variables

Means: Add or subtract as indicated
Variances: ALWAYS ADD (if independent)
Standard Deviations: Take square root of combined variance

Exam Tip: Never add standard deviations directly! You must convert to variances, add them, then take the square root.

4.10 Introduction to the Binomial Distribution

The binomial distribution is one of the most important probability distributions. It models the number of successes in a fixed number of independent trials.

The Binomial Setting (BINS)

B.I.N.S. — The Four Conditions

B = Binary: Each trial has only two outcomes (success/failure)

I = Independent: Trials are independent of each other

N = Number: Fixed number of trials (n)

S = Success: Same probability of success (p) for each trial

Example: Is this binomial? "Flip a coin 10 times and count the number of heads."

Binary: Each flip is heads or tails

Independent: Each flip doesn't affect others

Number: Fixed at n = 10 flips

Success: p = 0.5 for each flip

YES, this is binomial! X ~ Binomial(n=10, p=0.5)

Example: Is this binomial? "Draw cards without replacement until you get an ace."

Binary: Ace or not ace

Independent: NO! Drawing without replacement changes probabilities

Number: Number of draws is not fixed

Success: Same definition of success

NO, this is NOT binomial (fails I and N)

Binomial Probability Formula

If X ~ Binomial(n, p), the probability of exactly k successes is:

P(X = k) = ₙCₖ · pᵏ · (1−p)ⁿ⁻ᵏ

Where ₙCₖ = n! / [k!(n-k)!] is the number of ways to choose k successes from n trials.

🎯 Understanding the Formula:

ₙCₖ = number of ways to arrange k successes in n trials

pᵏ = probability of k successes

(1−p)ⁿ⁻ᵏ = probability of (n−k) failures
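The formula translates directly into Python using `math.comb` for ₙCₖ; this sketch also builds the cumulative version the calculator computes:

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p): nCk * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): the sum of the pmf from 0 through k."""
    return sum(binom_pmf(n, p, j) for j in range(k + 1))

# P(exactly 5 heads in 10 fair flips) = 252/1024
print(round(binom_pmf(10, 0.5, 5), 4))   # 0.2461
```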

Calculator Commands (TI-83/84)

binompdf(n, p, k)

P(X = k) — probability of EXACTLY k successes

binomcdf(n, p, k)

P(X ≤ k) — probability of AT MOST k successes

Exam Tip: Before using binomial formulas, ALWAYS verify BINS conditions. If the problem says "without replacement" from a small population, it's probably NOT binomial!

4.11 Parameters for a Binomial Distribution

The binomial distribution has simple formulas for its mean and standard deviation, making it easy to describe the "typical" number of successes.

Mean and Standard Deviation

For X ~ Binomial(n, p):

Mean (Expected Value)

μₓ = np

Standard Deviation

σₓ = √np(1−p)

Example: A basketball player makes 80% of free throws. In 50 attempts:

X ~ Binomial(n = 50, p = 0.80)

μₓ = np = 50(0.80) = 40 makes

σₓ = √50(0.80)(0.20) = √8 ≈ 2.83 makes

Interpretation: On average, expect 40 makes, typically varying by about 2.83 from that.
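The two formulas for this example, computed in Python (a short sketch):

```python
import math

n, p = 50, 0.80
mu = n * p                           # expected number of makes
sigma = math.sqrt(n * p * (1 - p))   # typical deviation from that

print(mu, round(sigma, 2))   # 40.0 2.83
```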

Shape of the Binomial Distribution

Condition   Shape
p = 0.5     Symmetric
p < 0.5     Skewed right (tail toward high values)
p > 0.5     Skewed left (tail toward low values)
Large n     Approaches normal (regardless of p)

[Graphs: binomial distributions with n = 10 for p = 0.2 (skewed right), p = 0.5 (symmetric), and p = 0.8 (skewed left); the tallest bars sit near the mean np.]

🎯 10% Rule for "Almost Independent":

When sampling WITHOUT replacement, we can still use binomial if the sample is less than 10% of the population. The probabilities change so slightly that they're "close enough" to independent.

Exam Tip: Know these formulas by heart: μ = np and σ = √np(1−p). The AP exam frequently asks you to calculate and interpret these values.

4.12 The Geometric Distribution

The geometric distribution models the number of trials needed to get the first success. Unlike binomial, the number of trials is not fixed.

The Geometric Setting

Geometric Setting Conditions:

  • Binary: Each trial has two outcomes (success/failure)
  • Independent: Trials are independent
  • Same p: Probability of success is the same for each trial
  • Count trials until FIRST success

Key difference from binomial: In geometric, we count how many trials until success. In binomial, we count how many successes in fixed trials.

Geometric Probability Formula

If X ~ Geometric(p), the probability that the first success occurs on trial k:

P(X = k) = (1−p)ᵏ⁻¹ · p

This is: (fail k−1 times) × (succeed on trial k)

Example: A basketball player makes 70% of free throws. Find the probability that the first make is on the 3rd attempt.

X ~ Geometric(p = 0.7)

P(X = 3) = (0.3)² × (0.7) = 0.09 × 0.7 = 0.063

There's about a 6.3% chance the first make is on the 3rd attempt.

Mean and Standard Deviation

For X ~ Geometric(p):

Mean

μₓ = 1/p

Standard Deviation

σₓ = √(1−p) / p

Example: For the 70% free throw shooter:

μₓ = 1/0.7 ≈ 1.43 attempts

On average it takes about 1.43 attempts to get the first make; most often the first make comes on the 1st or 2nd attempt.

Calculating "At Least" Probabilities

🎯 Useful Shortcut:

P(X > k) = (1−p)ᵏ

This is the probability of NO successes in the first k trials.

Therefore: P(X ≤ k) = 1 − (1−p)ᵏ
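The pmf and the shortcut are one-liners in Python; a small sketch using the 70% free-throw shooter from above (function names are our own choices):

```python
def geom_pmf(p, k):
    """P(X = k): fail k-1 times, then succeed on trial k."""
    return (1 - p) ** (k - 1) * p

def geom_sf(p, k):
    """P(X > k): no successes in the first k trials."""
    return (1 - p) ** k

# 70% free-throw shooter
print(round(geom_pmf(0.7, 3), 3))     # 0.063  (first make on attempt 3)
print(round(geom_sf(0.7, 2), 2))      # 0.09   (first make after attempt 2)
print(round(1 - geom_sf(0.7, 2), 2))  # 0.91   (first make within 2 attempts)
```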

Calculator Commands (TI-83/84)

geometpdf(p, k)

P(X = k) — probability first success is on trial k

geometcdf(p, k)

P(X ≤ k) — probability first success is on or before trial k

Binomial vs. Geometric

Feature             Binomial                          Geometric
Question asked      How many successes in n trials?   How many trials until first success?
Fixed value         n (number of trials)              Nothing; trials continue until success
Random variable X   Number of successes               Trial number of first success
Possible values     0, 1, 2, ..., n                   1, 2, 3, ... (infinite)
Mean                μ = np                            μ = 1/p
Exam Tip: Read carefully! "Number of successes in 10 trials" = binomial. "Number of trials until first success" = geometric. The wording determines which distribution to use.

Unit 4 Key Takeaways

Law of Large Numbers: Proportions approach true probability over many trials

Addition Rule: P(A or B) = P(A) + P(B) − P(A and B)

Multiplication Rule: P(A and B) = P(A) × P(B|A)

Independent events: P(A and B) = P(A) × P(B)

Random variable mean: μ = Σ x·P(x)

Combining RVs: Add means; ADD variances (if independent)

Binomial: BINS conditions, μ = np, σ = √np(1−p)

Geometric: First success, μ = 1/p

binompdf/cdf(n,p,k) | geometpdf/cdf(p,k) | P(Aᶜ) = 1 − P(A)

End of Unit 4 Study Guide.
