4.1 Introducing Statistics: Random and Non-Random Patterns?
Probability is the mathematical language of randomness. Understanding the difference between random and non-random patterns is fundamental to statistical inference.
Randomness: A process is random if individual outcomes are unpredictable, but there is a regular pattern in the long run over many repetitions.
Random vs. Non-Random Patterns
Random
Individual outcomes are unpredictable, but follow a predictable pattern in the long run.
- Coin flips (can't predict next flip)
- Rolling dice
- Random sampling
Short-term: Unpredictable
Long-term: Predictable pattern
Non-Random (Deterministic)
Outcomes follow a fixed, predictable pattern.
- Mathematical formulas
- Physical laws
- Programmed sequences
Both short-term and long-term: Predictable
The Law of Large Numbers
Law of Large Numbers: As the number of trials increases, the observed proportion of outcomes gets closer and closer to the true probability.
Example: Flipping a fair coin
• After 10 flips: Might get 70% heads (not surprising)
• After 100 flips: Probably closer to 50% heads
• After 10,000 flips: Very close to 50% heads
The proportion approaches 0.5 as trials increase — this is the Law of Large Numbers in action!
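A short simulation makes this concrete (a sketch in Python using only the standard library; the seed is arbitrary, chosen just for reproducibility):

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# Flip a fair coin n times and report the observed proportion of heads.
for n in (10, 100, 10_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    print(f"{n:>6} flips: proportion of heads = {heads / n:.3f}")
```

The proportion wanders for small n but settles near 0.5 as n grows, which is the Law of Large Numbers at work.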
⚠️ Common Misconception: "The Gambler's Fallacy"
After getting 5 heads in a row, the next flip is NOT "due" to be tails! Each flip is independent — the coin has no memory. The Law of Large Numbers works over MANY trials, not by "evening out" in the short term.
4.2 Estimating Probabilities Using Simulation
When theoretical probability is difficult to calculate, we can use simulation to estimate probabilities by mimicking the random process many times.
Simulation: Using a model to imitate a random process, then observing outcomes over many repetitions to estimate probabilities.
Steps for Designing a Simulation
| Step | Description | Example: Simulating Free Throws (70% shooter) |
|---|---|---|
| 1. Model | Describe how to use a random device to imitate one trial of the process. | Use random digits 0-9. Let 0-6 = make, 7-9 = miss (gives 70% probability) |
| 2. Repetition | Describe how to run one complete trial. | Generate 10 random digits to simulate 10 free throws. |
| 3. Run Many Trials | Perform many trials (at least 100) and record results. | Repeat the 10-shot simulation 200 times, recording makes each time. |
| 4. Estimate | Calculate proportion of trials with the outcome of interest. | If 156 out of 200 trials had 7+ makes, estimate P(7+ makes) ≈ 0.78 |
Example Simulation: What's the probability of getting at least one "6" when rolling a die 4 times?
Model: Use random digits 1-6 (ignore 0, 7, 8, 9). Or use a die!
Trial: Generate 4 random digits (1-6), check if any are 6.
Results after 100 trials: 52 trials had at least one 6.
Estimate: P(at least one 6 in 4 rolls) ≈ 52/100 = 0.52
(Theoretical answer: 1 - (5/6)⁴ ≈ 0.518 — our simulation is close!)
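The four-step recipe can be sketched in Python, with `random.randint` playing the role of the random digit table (seed and trial count are arbitrary choices):

```python
import random

random.seed(42)  # arbitrary seed for reproducibility

# Model + trial: roll a die 4 times; success = at least one 6.
def trial() -> bool:
    return any(random.randint(1, 6) == 6 for _ in range(4))

# Run many trials, then estimate the probability.
n_trials = 10_000
successes = sum(trial() for _ in range(n_trials))
print(f"P(at least one 6 in 4 rolls) is about {successes / n_trials:.3f}")
```

With 10,000 trials the estimate should land close to the theoretical 0.518.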
Using a Random Number Table
• Read digits in groups (singles, pairs, triples) depending on your model
• Skip digits that don't fit your model
• Continue in a consistent direction (left to right, top to bottom)
Example: For 70% probability, let 00-69 = success, 70-99 = failure
4.3 Introduction to Probability
Probability measures how likely an event is to occur, ranging from 0 (impossible) to 1 (certain). Understanding probability rules is essential for statistical inference.
Basic Probability Concepts
| Term | Definition | Example (Rolling a Die) |
|---|---|---|
| Sample Space (S) | The set of ALL possible outcomes | S = {1, 2, 3, 4, 5, 6} |
| Event | A subset of the sample space; a collection of outcomes | A = "rolling even" = {2, 4, 6} |
| Probability of Event A | P(A) = (# of outcomes in A) / (# of outcomes in S) | P(even) = 3/6 = 0.5 |
| Complement (Aᶜ) | All outcomes NOT in event A | Aᶜ = "rolling odd" = {1, 3, 5} |
Basic Probability Rules
Rule 1: Range
Probability is always between 0 and 1: 0 ≤ P(A) ≤ 1
Rule 2: All Outcomes
Something must happen: P(S) = 1
Rule 3: Complement
P(not A) = 1 − P(A)
Rule 4: Equally Likely
P(A) = (# of favorable outcomes) / (# of total outcomes)
🎯 The Complement Rule is Your Friend:
When asked for P("at least one"), it's often easier to calculate:
P(at least one) = 1 − P(none)
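Applied to the die example from 4.2, the shortcut is just a couple of lines (a quick sketch):

```python
# P(at least one 6 in 4 rolls) = 1 - P(no 6 in any of the 4 rolls)
p_none = (5 / 6) ** 4          # each roll misses the 6 with probability 5/6
p_at_least_one = 1 - p_none

print(round(p_at_least_one, 3))  # 0.518
```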
4.4 Mutually Exclusive Events
Some events cannot happen at the same time. Understanding when events are mutually exclusive is crucial for calculating probabilities correctly.
Mutually Exclusive (Disjoint) Events: Two events that cannot occur at the same time. If one happens, the other cannot happen.
Examples of Mutually Exclusive Events:
• Rolling a die: "getting a 2" and "getting a 5" ✓
• Drawing a card: "getting a heart" and "getting a spade" ✓
• A student's grade: "getting an A" and "getting a B" ✓
NOT Mutually Exclusive:
• Drawing a card: "getting a heart" and "getting a queen" ✗ (Queen of Hearts exists!)
• A student: "plays sports" and "plays music" ✗ (can do both)
Addition Rule for Mutually Exclusive Events
If A and B are mutually exclusive:
P(A or B) = P(A) + P(B)
This extends to multiple events: P(A or B or C) = P(A) + P(B) + P(C) if all are mutually exclusive.
Example: Rolling a die. Find P(rolling a 2 or a 5).
Since "rolling a 2" and "rolling a 5" are mutually exclusive:
P(2 or 5) = P(2) + P(5) = 1/6 + 1/6 = 2/6 = 1/3
⚠️ Important: You can ONLY add probabilities directly when events are mutually exclusive! If events can happen together, you must use the General Addition Rule (covered in 4.6).
4.5 Conditional Probability
Conditional probability is the probability of an event occurring given that another event has already occurred. It answers: "How does knowing one thing affect the probability of another?"
Conditional Probability: The probability of event A occurring, given that event B has occurred:
P(A | B) = P(A and B) / P(B)
Read "P(A | B)" as "the probability of A given B"
🎯 Intuition: When we know B has happened, we're restricting our sample space to only the outcomes where B occurs. Then we ask: of those outcomes, how many also have A?
Example: A survey of 200 students found:
| | Plays Sports | Doesn't Play | Total |
|---|---|---|---|
| Male | 60 | 40 | 100 |
| Female | 45 | 55 | 100 |
| Total | 105 | 95 | 200 |
Find P(Plays Sports | Male):
Given that a student is male, what's the probability they play sports?
P(Sports | Male) = 60/100 = 0.60
Find P(Male | Plays Sports):
Given that a student plays sports, what's the probability they're male?
P(Male | Sports) = 60/105 ≈ 0.571
Notice: P(A | B) ≠ P(B | A) in general!
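Both conditional probabilities can be computed straight from the table's counts (a sketch; the variable names are mine):

```python
# Counts from the two-way table above (survey of 200 students).
male_sports, male_no = 60, 40
female_sports, female_no = 45, 55

total_male = male_sports + male_no          # 100 males
total_sports = male_sports + female_sports  # 105 students who play sports

# Conditioning = restricting the sample space to the "given" group.
p_sports_given_male = male_sports / total_male
p_male_given_sports = male_sports / total_sports

print(round(p_sports_given_male, 3))  # 0.6
print(round(p_male_given_sports, 3))  # 0.571
```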
General Multiplication Rule
Rearranging the conditional probability formula gives us the General Multiplication Rule:
P(A and B) = P(B) × P(A | B)
Or equivalently: P(A and B) = P(A) × P(B | A)
Example: A bag has 5 red and 3 blue marbles. Draw 2 marbles without replacement. Find P(both red).
P(1st red) = 5/8
P(2nd red | 1st red) = 4/7 (one red already gone)
P(both red) = (5/8) × (4/7) = 20/56 = 5/14 ≈ 0.357
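The marble answer can be double-checked by simulation (a sketch; `random.sample` draws without replacement, matching the setup):

```python
import random

random.seed(0)  # arbitrary seed for reproducibility

bag = ["R"] * 5 + ["B"] * 3  # 5 red, 3 blue marbles

# Draw 2 without replacement many times; count trials where both are red.
n_trials = 100_000
both_red = sum(random.sample(bag, 2) == ["R", "R"] for _ in range(n_trials))

print(f"Simulated P(both red) is about {both_red / n_trials:.3f}")  # theory: 5/14
```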
4.6 Independent Events and Unions of Events
Two events are independent if knowing that one occurred doesn't change the probability of the other. This concept is crucial for multiplication rules.
Independent Events
Independent Events: Events A and B are independent if:
P(A | B) = P(A) and P(B | A) = P(B)
Knowing B occurred doesn't change the probability of A (and vice versa).
Independent Events ✓
- Flipping two separate coins
- Rolling two dice
- Drawing WITH replacement
- Unrelated events (rain today, coin flip)
NOT Independent ✗
- Drawing WITHOUT replacement
- Related events (studying, test score)
- Events where one affects the other
Multiplication Rule for Independent Events
If A and B are independent:
P(A and B) = P(A) × P(B)
This extends to multiple independent events: P(A and B and C) = P(A) × P(B) × P(C)
Example: Flip a coin and roll a die. Find P(Heads and 6).
Events are independent (coin doesn't affect die).
P(H and 6) = P(H) × P(6) = (1/2) × (1/6) = 1/12
⚠️ Independent ≠ Mutually Exclusive!
Mutually exclusive: Events CANNOT happen together. P(A and B) = 0.
Independent: Events DON'T AFFECT each other. P(A and B) = P(A) × P(B).
In fact, if A and B are mutually exclusive (and both have P > 0), they CANNOT be independent!
General Addition Rule (Union of Events)
For ANY events A and B:
P(A or B) = P(A) + P(B) − P(A and B)
We subtract P(A and B) to avoid counting the overlap twice.
Example: P(Heart) = 13/52, P(Queen) = 4/52, P(Queen of Hearts) = 1/52
Find P(Heart or Queen):
P(Heart or Queen) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13
4.7 Introduction to Random Variables and Probability Distributions
A random variable assigns a numerical value to each outcome of a random process. This allows us to use mathematical tools to analyze random phenomena.
Random Variable: A variable whose value is a numerical outcome of a random phenomenon. Denoted with capital letters (X, Y, Z).
Types of Random Variables
Discrete Random Variable
Takes a countable number of possible values (often whole numbers).
- Number of heads in 10 flips
- Number of customers per hour
- Number of defective items
Can list all possible values
Continuous Random Variable
Takes any value in an interval (infinitely many possible values).
- Height of a person
- Time to complete a task
- Temperature
Cannot list all possible values
Probability Distribution
Probability Distribution: A description of all possible values of a random variable and their associated probabilities.
Requirements:
- Every probability is between 0 and 1
- All probabilities sum to 1
Example: Let X = number of heads when flipping a coin twice
| x | 0 | 1 | 2 |
|---|---|---|---|
| P(X = x) | 1/4 | 2/4 | 1/4 |
Check: 1/4 + 2/4 + 1/4 = 4/4 = 1 ✓
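The same table can be built by enumerating the sample space (a sketch using exact fractions so the probabilities check out exactly):

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair-coin flips: HH, HT, TH, TT (each with probability 1/4).
dist: dict[int, Fraction] = {}
for flips in product("HT", repeat=2):
    x = flips.count("H")  # X = number of heads in this outcome
    dist[x] = dist.get(x, Fraction(0)) + Fraction(1, 4)

assert sum(dist.values()) == 1  # all probabilities sum to 1
print({x: str(p) for x, p in sorted(dist.items())})  # {0: '1/4', 1: '1/2', 2: '1/4'}
```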
Probability Histogram
A probability histogram displays a discrete distribution as a bar graph: one bar per value of X, with each bar's height equal to that value's probability.
4.8 Mean and Standard Deviation of Random Variables
Just like data sets, random variables have a mean (expected value) and standard deviation. These describe the center and spread of the probability distribution.
Expected Value (Mean)
Expected Value (Mean) of X: The long-run average value of X over many, many repetitions.
μₓ = E(X) = Σ x · P(x)
Multiply each value by its probability, then add them all up.
Example: X = number of heads in 2 coin flips
| x | P(X = x) | x · P(x) |
|---|---|---|
| 0 | 0.25 | 0 |
| 1 | 0.50 | 0.50 |
| 2 | 0.25 | 0.50 |
| Total | 1.00 | E(X) = 1.00 |
On average, you expect 1 head in 2 coin flips.
Standard Deviation
Variance of X: σ²ₓ = Σ (x − μₓ)² · P(x)
Standard Deviation of X: σₓ = √σ²ₓ
🎯 Interpretation:
μₓ (mean): The "balance point" of the distribution — what you expect on average
σₓ (SD): The typical distance from the mean — measures spread/variability
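Both formulas are only a few lines of code (a sketch, using the 2-flip distribution from 4.7):

```python
import math

values = [0, 1, 2]          # X = number of heads in 2 flips
probs = [0.25, 0.50, 0.25]

mean = sum(x * p for x, p in zip(values, probs))                    # μ = Σ x·P(x)
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))  # σ² = Σ (x−μ)²·P(x)
sd = math.sqrt(variance)

print(mean)          # 1.0
print(round(sd, 3))  # 0.707
```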
Calculator Tip
Enter x-values in L1 and probabilities in L2, then:
STAT → CALC → 1:1-Var Stats L1, L2
This gives you the mean and standard deviation of the distribution. (The calculator labels the mean x̄, but because L2 holds probabilities, the value it reports is actually μₓ.)
4.9 Combining Random Variables
When we add, subtract, or scale random variables, we need rules for how the means and standard deviations combine.
Linear Transformations: aX + b
If Y = aX + b (multiply by constant a, then add constant b):
μᵧ = aμₓ + b and σᵧ = |a|σₓ
Adding a constant shifts the mean but doesn't change spread. Multiplying changes both.
Example: Test scores X have μₓ = 70 and σₓ = 10.
The teacher curves by doubling and adding 5: Y = 2X + 5
μᵧ = 2(70) + 5 = 145
σᵧ = 2(10) = 20
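The curve example as code (a sketch of the aX + b rules):

```python
# Curve Y = 2X + 5 applied to test scores with mean 70 and SD 10.
a, b = 2, 5
mu_x, sigma_x = 70, 10

mu_y = a * mu_x + b         # the mean is scaled AND shifted
sigma_y = abs(a) * sigma_x  # the spread is only scaled; the +b shift doesn't matter

print(mu_y, sigma_y)  # 145 20
```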
Adding/Subtracting Random Variables
For any random variables X and Y:
μₓ₊ᵧ = μₓ + μᵧ and μₓ₋ᵧ = μₓ − μᵧ
If X and Y are INDEPENDENT:
σ²ₓ₊ᵧ = σ²ₓ + σ²ᵧ and σ²ₓ₋ᵧ = σ²ₓ + σ²ᵧ
⚠️ Critical Point: Variances ALWAYS ADD!
Whether you're adding or subtracting random variables, you ADD the variances. This seems counterintuitive for subtraction, but both addition and subtraction increase variability.
Remember: σₓ₊ᵧ = √(σ²ₓ + σ²ᵧ), NOT σₓ + σᵧ
Example: X = your score (μ = 80, σ = 5) and Y = opponent's score (μ = 75, σ = 4)
Let D = X − Y (your margin of victory)
μ_D = 80 − 75 = 5
σ²_D = 5² + 4² = 25 + 16 = 41
σ_D = √41 ≈ 6.4
(Assuming X and Y are independent)
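The margin-of-victory calculation as code (a sketch, assuming independence as stated):

```python
import math

mu_x, sigma_x = 80, 5  # your score
mu_y, sigma_y = 75, 4  # opponent's score

mu_d = mu_x - mu_y                # means subtract for a difference...
var_d = sigma_x**2 + sigma_y**2   # ...but variances still ADD
sigma_d = math.sqrt(var_d)

print(mu_d, round(sigma_d, 1))  # 5 6.4
```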
Summary: Combining Random Variables
| Means: | Add or subtract as indicated |
| Variances: | ALWAYS ADD (if independent) |
| Standard Deviations: | Take square root of combined variance |
4.10 Introduction to the Binomial Distribution
The binomial distribution is one of the most important probability distributions. It models the number of successes in a fixed number of independent trials.
The Binomial Setting (BINS)
B.I.N.S. — The Four Conditions
B — Binary: Each trial has only two outcomes (success/failure)
I — Independent: Trials are independent of each other
N — Number: Fixed number of trials (n)
S — Success: Same probability of success (p) for each trial
Example: Is this binomial? "Flip a coin 10 times and count the number of heads."
✓ Binary: Each flip is heads or tails
✓ Independent: Each flip doesn't affect others
✓ Number: Fixed at n = 10 flips
✓ Success: p = 0.5 for each flip
YES, this is binomial! X ~ Binomial(n=10, p=0.5)
Example: Is this binomial? "Draw cards without replacement until you get an ace."
✓ Binary: Ace or not ace
✗ Independent: NO! Drawing without replacement changes probabilities
✗ Number: Number of draws is not fixed
✓ Success: Same definition of success
NO, this is NOT binomial (fails I and N)
Binomial Probability Formula
If X ~ Binomial(n, p), the probability of exactly k successes is:
P(X = k) = ₙCₖ · pᵏ · (1−p)ⁿ⁻ᵏ
Where ₙCₖ = n! / [k!(n-k)!] is the number of ways to choose k successes from n trials.
🎯 Understanding the Formula:
ₙCₖ = number of ways to arrange k successes in n trials
pᵏ = probability of k successes
(1−p)ⁿ⁻ᵏ = probability of (n−k) failures
Calculator Commands (TI-83/84)
binompdf(n, p, k)
P(X = k) — probability of EXACTLY k successes
binomcdf(n, p, k)
P(X ≤ k) — probability of AT MOST k successes
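Away from the calculator, both commands can be sketched in a few lines with `math.comb` (the function names just mirror the TI commands):

```python
from math import comb

def binompdf(n: int, p: float, k: int) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binomcdf(n: int, p: float, k: int) -> float:
    """P(X <= k): accumulate the pdf from 0 through k."""
    return sum(binompdf(n, p, i) for i in range(k + 1))

# Ten flips of a fair coin:
print(round(binompdf(10, 0.5, 5), 4))  # 0.2461
print(round(binomcdf(10, 0.5, 5), 3))  # 0.623
```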
4.11 Parameters for a Binomial Distribution
The binomial distribution has simple formulas for its mean and standard deviation, making it easy to describe the "typical" number of successes.
Mean and Standard Deviation
For X ~ Binomial(n, p):
Mean (Expected Value): μₓ = np
Standard Deviation: σₓ = √(np(1−p))
Example: A basketball player makes 80% of free throws. In 50 attempts:
X ~ Binomial(n = 50, p = 0.80)
μₓ = np = 50(0.80) = 40 makes
σₓ = √(50 × 0.80 × 0.20) = √8 ≈ 2.83 makes
Interpretation: On average, expect 40 makes, typically varying by about 2.83 from that.
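The free-throw numbers check out in two lines (a quick sketch):

```python
import math

n, p = 50, 0.80  # X ~ Binomial(50, 0.80)

mu = n * p                          # μ = np
sigma = math.sqrt(n * p * (1 - p))  # σ = √(np(1−p))

print(mu, round(sigma, 2))  # 40.0 2.83
```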
Shape of the Binomial Distribution
| Condition | Shape |
|---|---|
| p = 0.5 | Symmetric |
| p < 0.5 | Skewed right (tail toward high values) |
| p > 0.5 | Skewed left (tail toward low values) |
| Large n | Approaches normal (regardless of p) |
🎯 10% Rule for "Almost Independent":
When sampling WITHOUT replacement, we can still use binomial if the sample is less than 10% of the population. The probabilities change so slightly that they're "close enough" to independent.
4.12 The Geometric Distribution
The geometric distribution models the number of trials needed to get the first success. Unlike binomial, the number of trials is not fixed.
The Geometric Setting
Geometric Setting Conditions:
- Binary: Each trial has two outcomes (success/failure)
- Independent: Trials are independent
- Same p: Probability of success is the same for each trial
- Count trials until FIRST success
Key difference from binomial: In geometric, we count how many trials until success. In binomial, we count how many successes in fixed trials.
Geometric Probability Formula
If X ~ Geometric(p), the probability that the first success occurs on trial k:
P(X = k) = (1−p)ᵏ⁻¹ · p
This is: (fail k−1 times) × (succeed on trial k)
Example: A basketball player makes 70% of free throws. Find the probability that the first make is on the 3rd attempt.
X ~ Geometric(p = 0.7)
P(X = 3) = (0.3)² × (0.7) = 0.09 × 0.7 = 0.063
There's about a 6.3% chance the first make is on the 3rd attempt.
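The formula is one line of code (a sketch; the name mirrors the TI command):

```python
def geometpdf(p: float, k: int) -> float:
    """P(X = k) for X ~ Geometric(p): fail k-1 times, then succeed once."""
    return (1 - p) ** (k - 1) * p

# 70% free-throw shooter, first make on attempt 3:
print(round(geometpdf(0.7, 3), 3))  # 0.063
```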
Mean and Standard Deviation
For X ~ Geometric(p):
Mean: μₓ = 1/p
Standard Deviation: σₓ = √(1−p) / p
Example: For the 70% free throw shooter:
μₓ = 1/0.7 ≈ 1.43 attempts
On average, the first make comes after about 1.43 attempts (usually on the 1st or 2nd).
Calculating "At Least" Probabilities
🎯 Useful Shortcut:
P(X > k) = (1−p)ᵏ
This is the probability of NO successes in the first k trials.
Therefore: P(X ≤ k) = 1 − (1−p)ᵏ
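The shortcut can be cross-checked against a direct sum of the geometric formula (a sketch, again for the 70% shooter):

```python
p, k = 0.7, 3  # 70% shooter, first k = 3 attempts

p_greater = (1 - p) ** k   # P(X > 3): miss all of the first 3 attempts
p_at_most = 1 - p_greater  # P(X <= 3) via the complement

# Same answer by summing P(X = 1) + P(X = 2) + P(X = 3) directly.
pmf_sum = sum((1 - p) ** (i - 1) * p for i in range(1, k + 1))
assert abs(pmf_sum - p_at_most) < 1e-12

print(round(p_at_most, 3))  # 0.973
```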
Calculator Commands (TI-83/84)
geometpdf(p, k)
P(X = k) — probability first success is on trial k
geometcdf(p, k)
P(X ≤ k) — probability first success is on or before trial k
Binomial vs. Geometric
| Feature | Binomial | Geometric |
|---|---|---|
| Question asked | How many successes in n trials? | How many trials until first success? |
| Fixed value | n (number of trials) | Nothing — trials continue until success |
| Random variable X | Number of successes | Trial number of first success |
| Possible values | 0, 1, 2, ..., n | 1, 2, 3, ... (infinite) |
| Mean | μ = np | μ = 1/p |
Unit 4 Key Takeaways
Law of Large Numbers: Proportions approach true probability over many trials
Addition Rule: P(A or B) = P(A) + P(B) − P(A and B)
Multiplication Rule: P(A and B) = P(A) × P(B|A)
Independent events: P(A and B) = P(A) × P(B)
Random variable mean: μ = Σ x·P(x)
Combining RVs: Add means; ADD variances (if independent)
Binomial: BINS conditions, μ = np, σ = √(np(1−p))
Geometric: First success, μ = 1/p
binompdf/cdf(n,p,k) | geometpdf/cdf(p,k) | P(Aᶜ) = 1 − P(A)
End of Unit 4 Study Guide.