StatsIQ · AP Statistics

Lesson 10: Conditional Probability

Unit 2 · Phase 2 · Statistical Practice 3: Analyze Data

Topics: Conditional probability `P(A|B) = P(A∩B) / P(B)`; reading joint, marginal, and conditional probabilities from a two-way table; the general multiplication rule `P(A∩B) = P(B)·P(A|B)`; tree diagrams for multi-stage processes; checking independence via `P(A|B) = P(A)`; an intuitive "reverse the conditioning" (Bayes) example.

Calculator: Minimal. Everything here is arithmetic you can do on the home screen of the TI-84: dividing two counts, or multiplying along the branches of a tree. No special statistical function is needed.

Objectives:

Compute a conditional probability from a two-way table by choosing the correct denominator.
Use the general multiplication rule and tree diagrams to find probabilities for two-stage processes.
Decide whether two events are independent by comparing `P(A|B)` to `P(A)`, and reverse the conditioning to answer "given the result, what's the chance of the cause?"

(a) Warm-Up

Last lesson you learned that for independent events, P(A∩B) = P(A)·P(B). But most events in the real world are not independent — knowing one thing changes the odds of another.

Here's the hook. Suppose 64% of students in a school pass a pop quiz. You'd guess any random student has a 64% chance of passing. But now I tell you this particular student got a full night's sleep. Does your guess change?

It should. Among well-rested students, the pass rate turns out to be 80%. Among sleep-deprived students, only 40%. The condition — "got enough sleep" — rewired the probability.

That's conditional probability: the probability of an event given that we already know something. Write it P(Pass | Enough Sleep), read aloud as "the probability of passing given enough sleep." The vertical bar means "given."

Today you'll learn to pull these conditional probabilities straight out of a two-way table, chain them together with tree diagrams, test whether two events are independent, and even run the logic backwards — the seed of Bayes' theorem.

(b) Core Concept

The definition

The conditional probability of A given B is

P(A|B) = P(A∩B) / P(B)

In words: of all the outcomes where B happens, what fraction also have A? The event we condition on — B — becomes the new denominator, the new "universe." This is the single most important idea in the lesson: conditioning shrinks the sample space. We throw away every outcome where B didn't happen and re-measure within what's left. (The formula requires P(B) > 0.)

Reading a two-way table

A two-way table (also called a contingency table) cross-classifies a group of individuals by two categorical variables. It is the most common place the AP exam hides conditional probability. Here are 200 students classified by sleep (≥ 7 hours the night before, or not) and quiz result.

	Passed	Failed	Total
Enough sleep	96	24	120
Not enough	32	48	80
Total	128	72	200

Three kinds of probability live in this table. Pick a student at random.

Joint probability — the chance of both categories at once. The cell count over the grand total.

P(Enough ∩ Passed) = 96/200 = 0.48
P(Not enough ∩ Failed) = 48/200 = 0.24

Marginal probability — the chance of one category, ignoring the other. A row or column total over the grand total. (These live in the margins of the table — hence the name.)

P(Passed) = 128/200 = 0.64
P(Enough) = 120/200 = 0.60

Conditional probability — restrict to one row or one column, then divide within it.

P(Passed | Enough) = 96/120 = 0.80 — we used only the "Enough sleep" row; its total 120 is the denominator.
P(Passed | Not enough) = 32/80 = 0.40 — only the "Not enough" row.
P(Enough | Passed) = 96/128 = 0.75 — only the "Passed" column; its total 128 is the denominator.

Notice that the condition tells you which total to put in the denominator. "Given enough sleep" → row total 120. "Given passed" → column total 128. Get the denominator right and you are 90% of the way home.
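The same bookkeeping can be sketched in a few lines of Python. This is only an illustration using the counts from the sleep table; the dictionary layout and the helper names (`row_total`, `col_total`, `joint`, `conditional_on_row`, `conditional_on_col`) are assumptions made for this sketch, not part of the AP course.

```python
# Counts from the sleep/quiz table above (rows = sleep, columns = quiz result).
table = {
    "Enough sleep": {"Passed": 96, "Failed": 24},
    "Not enough": {"Passed": 32, "Failed": 48},
}

def row_total(row):
    return sum(table[row].values())

def col_total(col):
    return sum(cells[col] for cells in table.values())

grand_total = sum(row_total(r) for r in table)

def joint(row, col):
    """P(row and col): cell count over the grand total."""
    return table[row][col] / grand_total

def conditional_on_row(col, row):
    """P(col | row): stay inside one row, divide by that row's total."""
    return table[row][col] / row_total(row)

def conditional_on_col(row, col):
    """P(row | col): stay inside one column, divide by that column's total."""
    return table[row][col] / col_total(col)

print(joint("Enough sleep", "Passed"))               # 0.48
print(col_total("Passed") / grand_total)             # 0.64 (marginal)
print(conditional_on_row("Passed", "Enough sleep"))  # 0.8
print(conditional_on_col("Enough sleep", "Passed"))  # 0.75
```

The numerator is the same cell (96) in three of the four calls; only the denominator changes.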

The general multiplication rule

Rearrange the definition and you get a rule that works whether or not events are independent:

P(A∩B) = P(B)·P(A|B)

The chance both happen equals the chance the first happens, times the chance the second happens given the first. Check it against the table:

P(Enough ∩ Passed) = P(Enough) · P(Passed | Enough)
                   = 0.60 · 0.80
                   = 0.48   ✓  (matches 96/200)

Last lesson's special rule P(A∩B) = P(A)·P(B) is just this rule for the case where P(A|B) = P(A) — i.e., when the events are independent. The general rule is the safe default; the special one is only valid after you've confirmed independence.
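To see why the general rule is the safe default, here is a small Python check that applies both rules to the sleep table. It is a sketch with variable names chosen for illustration; exact fractions are used so rounding does not blur the comparison.

```python
from fractions import Fraction

p_enough = Fraction(120, 200)              # P(Enough) = 0.60
p_passed = Fraction(128, 200)              # P(Passed) = 0.64
p_passed_given_enough = Fraction(96, 120)  # P(Passed | Enough) = 0.80

general = p_enough * p_passed_given_enough  # general multiplication rule
special = p_enough * p_passed               # special rule, assumes independence

print(float(general))  # 0.48, matches 96/200
print(float(special))  # 0.384, does not match the table
```

The special rule gives 0.384 instead of the true 0.48 because sleep and passing are associated in this table.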

Checking independence

Two events A and B are independent if knowing one happened does not change the probability of the other:

P(A|B) = P(A) (independent) versus P(A|B) ≠ P(A) (dependent / associated)

Test it with our table:

P(Passed | Enough) = 0.80
P(Passed) = 0.64

Since 0.80 ≠ 0.64, sleep and passing are not independent — they're associated. Well-rested students really do pass at a higher rate than the overall population. (You only need one comparison to break independence. If even a single conditional differs from the marginal, the events are dependent.)
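The comparison can be written as a tiny function. The sketch below is illustrative only: the function name `is_independent` and its argument order are assumptions, and the second call uses made-up counts that do not come from this lesson.

```python
from fractions import Fraction

def is_independent(cell, row_total, col_total, grand_total):
    """True when P(column event | row event) equals P(column event)."""
    conditional = Fraction(cell, row_total)       # P(A | B)
    marginal = Fraction(col_total, grand_total)   # P(A)
    return conditional == marginal

# Sleep table: P(Passed | Enough) = 96/120 vs. P(Passed) = 128/200
print(is_independent(96, 120, 128, 200))  # False

# Hypothetical counts (not from the lesson) where the two rates agree: 30/100 vs. 60/200
print(is_independent(30, 100, 60, 200))   # True
```

Working with exact fractions avoids a false "not equal" caused by decimal rounding.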

Tree diagrams for multi-stage processes

When a process happens in stages — first this, then that — a tree diagram organizes it. Each branch is labeled with a probability; the second stage uses conditional probabilities given the first. You multiply along a path (that's the general multiplication rule) and add across paths that lead to the same outcome.

Suppose 60% of students get enough sleep, and the pass rates above hold:

[GRAPH: Tree diagram, two stages. 
Stage 1 (Sleep) splits from a single root node into two branches:
 - top branch "Enough sleep" labeled 0.60
 - bottom branch "Not enough" labeled 0.40
Stage 2 (Quiz) — each Stage-1 node splits again:
 From "Enough sleep": "Passed" labeled 0.80 (top), "Failed" labeled 0.20 (bottom).
 From "Not enough": "Passed" labeled 0.40 (top), "Failed" labeled 0.60 (bottom).
Four end paths with joint probabilities shown at the tips:
 Enough & Passed: 0.60×0.80 = 0.48
 Enough & Failed: 0.60×0.20 = 0.12
 Not enough & Passed: 0.40×0.40 = 0.16
 Not enough & Failed: 0.40×0.60 = 0.24
The four tip probabilities sum to 1.00.]

Read off the overall pass rate by adding the two "Passed" paths:

P(Passed) = 0.48 + 0.16 = 0.64   ✓

— exactly the marginal we found in the table. The tree and the table tell the same story.
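"Multiply along a path, add across paths" translates directly into code. The following Python sketch rebuilds the four tip probabilities of the tree; the dictionary names are assumptions for illustration, and values are rounded only when printed.

```python
p_sleep = {"Enough": 0.60, "Not enough": 0.40}
p_quiz_given_sleep = {
    "Enough": {"Passed": 0.80, "Failed": 0.20},
    "Not enough": {"Passed": 0.40, "Failed": 0.60},
}

# Multiply along each path: P(sleep) * P(quiz | sleep)
paths = {
    (sleep, quiz): p_sleep[sleep] * p_quiz
    for sleep, branches in p_quiz_given_sleep.items()
    for quiz, p_quiz in branches.items()
}

# Add across every path that ends in "Passed"
p_passed = sum(p for (sleep, quiz), p in paths.items() if quiz == "Passed")

for path, p in paths.items():
    print(path, round(p, 2))
print("P(Passed) =", round(p_passed, 2))
```

A quick sanity check for any tree is that the tip probabilities sum to 1, which `sum(paths.values())` confirms here up to floating-point rounding.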

Reversing the conditioning (intuitive Bayes)

The tree gives P(Passed | Enough) directly. But what if you observe the result and want the cause — P(Enough | Passed)? That's a reverse-conditioning problem, the heart of Bayes' theorem. You don't need a scary formula. Just build the joint probabilities and re-divide:

P(Enough | Passed) = P(Enough ∩ Passed) / P(Passed)
                   = 0.48 / 0.64
                   = 0.75

Of the students who passed, 75% were well-rested. Notice we flipped the condition: the tree handed us P(Passed | Enough) = 0.80, but the reversed question P(Enough | Passed) = 0.75 is a different number. Mixing these two up is the classic conditional-probability blunder — and the next section's first warning.
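The "build the joints, then re-divide" recipe can be sketched as a short Python function. The name `posterior` and its two dictionary arguments are assumptions made for this illustration.

```python
def posterior(priors, likelihoods):
    """Return P(cause | result) for each cause.

    priors[c]      = P(c)
    likelihoods[c] = P(result | c)
    """
    joints = {c: priors[c] * likelihoods[c] for c in priors}  # P(c and result)
    total = sum(joints.values())                              # P(result)
    return {c: joints[c] / total for c in joints}

given_passed = posterior(
    priors={"Enough": 0.60, "Not enough": 0.40},
    likelihoods={"Enough": 0.80, "Not enough": 0.40},  # P(Passed | sleep group)
)
print(round(given_passed["Enough"], 2))      # 0.75
print(round(given_passed["Not enough"], 2))  # 0.25
```

The two reversed probabilities add to 1 because, once we know the student passed, they must belong to one sleep group or the other.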

(c) Worked Examples

Example 1 — Conditional probability from a table (easy)

A streaming service surveys 500 users about whether they have a paid plan and whether they watched a show last week.

	Watched	Didn't	Total
Paid	210	90	300
Free	80	120	200
Total	290	210	500

(a) Find P(Watched | Paid). (b) Find P(Paid | Watched).

Strategy. The condition sets the denominator. "Given Paid" → row total 300. "Given Watched" → column total 290.

Solution.

(a) P(Watched | Paid)  = 210/300 = 0.70
(b) P(Paid | Watched)  = 210/290 ≈ 0.724

Interpretation. 70% of paid users watched; but of everyone who watched, about 72.4% were paid users. Same joint cell (210), different denominators, different answers — that's reverse conditioning in action.

Example 2 — General multiplication rule (medium)

A box has 5 red and 3 blue marbles. You draw two without replacement. Find P(both red).

Strategy. Drawing without replacement makes the draws dependent — the second probability depends on the first. Use the general multiplication rule, not P(A)·P(B).

Solution.

P(1st red) = 5/8
P(2nd red | 1st red) = 4/7      (one red gone, 7 marbles left)
P(both red) = 5/8 · 4/7 = 20/56 = 5/14 ≈ 0.357

Interpretation. About 35.7%. If you'd wrongly used 5/8 · 5/8 ≈ 0.391, you'd have treated the draws as independent, which is the most common mistake with "without replacement" problems.
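One way to double-check an answer like this is to list every ordered pair of marbles and count. The Python sketch below does that and compares the result with the general multiplication rule; the variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["R"] * 5 + ["B"] * 3  # 5 red, 3 blue

# General multiplication rule: P(1st red) * P(2nd red | 1st red)
p_rule = Fraction(5, 8) * Fraction(4, 7)

# Brute force: every ordered pair of two different marbles is equally likely
pairs = list(permutations(range(len(marbles)), 2))
both_red = sum(1 for i, j in pairs if marbles[i] == "R" and marbles[j] == "R")
p_enum = Fraction(both_red, len(pairs))

print(p_rule, p_enum)           # 5/14 5/14
print(round(float(p_rule), 3))  # 0.357
```

There are 8 × 7 = 56 ordered pairs, and 5 × 4 = 20 of them are red-red, which is the same 20/56 found in the solution.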

Example 3 — Two-stage tree diagram (medium → AP-style)

A factory runs two machines. Machine A makes 70% of the parts; Machine B makes 30%. Machine A's defect rate is 2%; Machine B's is 5%. A part is chosen at random. Find the probability it is defective.

Strategy. Two stages: which machine, then defective or not. Multiply along paths, add the defective paths.

Solution.

[GRAPH: Tree diagram. Root splits into Machine A (0.70) and Machine B (0.30).
 From A: Defective 0.02, Good 0.98.  From B: Defective 0.05, Good 0.95.
 Defective paths: A&Def = 0.70×0.02 = 0.014;  B&Def = 0.30×0.05 = 0.015.]

P(Defective) = (0.70)(0.02) + (0.30)(0.05)
             = 0.014 + 0.015
             = 0.029

Interpretation. About 2.9% of all parts are defective — a weighted blend of the two machines' rates.

Example 4 — Reverse conditioning / Bayes (AP-style)

Continue Example 3. A randomly chosen part is found to be defective. What is the probability it came from Machine B?

Strategy. We want P(B | Defective) but the tree gave P(Defective | B). Reverse it: the defective B-path over the total defective probability.

Solution.

P(B | Defective) = P(B ∩ Defective) / P(Defective)
                 = 0.015 / 0.029
                 ≈ 0.517

Interpretation. Even though Machine B makes only 30% of parts, it produces about 51.7% of the defective ones — because its defect rate is more than double A's. Observing the defect shifted our belief toward B. That's Bayesian updating, no formula memorization required: build the joint probabilities, re-divide by the new condition.
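Examples 3 and 4 can also be checked by counting parts in an imagined batch, which turns the tree back into a two-way table. The sketch below assumes a hypothetical batch of 10,000 parts; that batch size is not in the problem and is chosen only so every count is a whole number.

```python
batch = 10_000  # hypothetical batch size (assumption, not from the problem)

parts_a = batch * 70 // 100        # 7,000 parts from Machine A
parts_b = batch * 30 // 100        # 3,000 parts from Machine B
defective_a = parts_a * 2 // 100   # 140 defective parts from A
defective_b = parts_b * 5 // 100   # 150 defective parts from B

total_defective = defective_a + defective_b          # 290
p_defective = total_defective / batch                # Example 3
p_b_given_defective = defective_b / total_defective  # Example 4

print(p_defective)                    # 0.029
print(round(p_b_given_defective, 3))  # 0.517
```

Seen as counts, the reversal is just "150 of the 290 defective parts came from B," the same column-total denominator used with two-way tables.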

(d) Common Mistakes

1. Flipping the condition — P(A|B) vs. P(B|A). These are almost never equal. In Example 1, P(Watched|Paid) = 0.70 but P(Paid|Watched) ≈ 0.724. Fix: circle the word after "given." That variable's total is your denominator. "Given paid" means the paid group is your whole world.

2. Using P(A)·P(B) when events are dependent. The special multiplication rule only works after you've confirmed independence. With "without replacement," "given," or any two-way table showing association, the events are usually dependent — use P(A∩B) = P(B)·P(A|B). Fix: default to the general rule; switch to the product rule only when you've checked P(A|B) = P(A).

3. Wrong denominator — using the grand total for a conditional. P(Passed | Enough) is 96/120, not 96/200. The grand total gives the joint probability P(Passed ∩ Enough), a different quantity. Fix: conditional → restrict to one row or column; joint → use the grand total.

4. Calling events independent because they're mutually exclusive. Mutually exclusive events (they can't both happen) with nonzero probabilities are actually dependent: if one occurs, the other's probability drops to 0. Fix: independence is about P(A|B) = P(A), not about non-overlap.

5. Forgetting to add all paths to an outcome in a tree. P(Defective) needed both the A-path and the B-path. Stopping after one path undercounts. Fix: find every path that ends in your event, then sum.

(e) Practice Problems

Use this table (350 commuters, classified by transport mode and whether they were late) for Problems 1–3.

	Late	On time	Total
Drove	60	140	200
Transit	45	105	150
Total	105	245	350

Question 1

P(Late | Drove) is closest to:

(A) 0.171
(B) 0.300
(C) 0.571
(D) 0.300 of all commuters

Question 2

P(Drove | Late) equals:

(A) 60/200
(B) 60/105
(C) 60/350
(D) 105/350

Question 3

Are "Drove" and "Late" independent? Justify with a probability comparison.

Question 4

P(A) = 0.5, P(B) = 0.4, P(A∩B) = 0.2. Find P(A|B).

(A) 0.20
(B) 0.40
(C) 0.50
(D) 0.80

Question 5

Using the values in Problem 4, are A and B independent?

(A) Yes, because P(A|B) = P(A)
(B) Yes, because P(A∩B) ≠ 0
(C) No, because P(A∩B) ≠ 0
(D) No, because P(A|B) ≠ P(B)

Question 6

A bag has 4 green and 6 yellow chips. You draw two without replacement. P(both green) is:

(A) 0.16
(B) 0.133
(C) 0.144
(D) 0.067

Question 7

80% of flights from an airport are on time. Of on-time flights, 95% of bags arrive; of late flights, 70% of bags arrive. The probability a randomly chosen flight is on time and the bag arrives is:

(A) 0.76
(B) 0.95
(C) 0.80
(D) 0.56

Question 8

Using Problem 7's setup, the overall probability a bag arrives is:

(A) 0.760
(B) 0.900
(C) 0.760 + 0.14 = 0.900
(D) 0.825

Question 9

P(rain) = 0.30. If it rains, P(traffic jam | rain) = 0.60; if dry, P(traffic jam | dry) = 0.10. Find P(traffic jam).

(A) 0.18
(B) 0.25
(C) 0.70
(D) 0.07

Question 10

(In context) Continuing Problem 9: there was a traffic jam this morning. Find P(rain | traffic jam). Show your work.

Question 11

A test for a condition has P(+ | condition) = 0.90 and P(+ | no condition) = 0.20. The condition affects 10% of people. Find P(+).

(A) 0.27
(B) 0.90
(C) 0.11
(D) 0.20

Question 12

(In context) Using Problem 11, a person tests positive. Find P(condition | +). Interpret your answer in one sentence.

Question 13

Which statement describes a joint probability?

(A) The fraction of paid users who watched
(B) The fraction of all users who are paid and watched
(C) The fraction of all users who are paid
(D) The fraction of watchers who are paid

Question 14

If P(A|B) = P(A), then P(A∩B) equals:

(A) P(A) + P(B)
(B) P(A)·P(B)
(C) P(B)
(D) 0

Question 15

(In context) In a class, P(plays a sport) = 0.5, P(plays an instrument) = 0.4, and P(plays a sport | plays an instrument) = 0.5. Are the two activities independent? Explain in one sentence using the numbers.

---

## (f) FRQ Practice — Free Response (10 points)

Statistical Practice 3: Analyze Data

A public-health researcher studies the relationship between regular exercise and getting the flu during one winter season. She records data on 400 adults, classifying each by whether they exercise regularly and whether they caught the flu.

| | Got the flu | Did not get the flu | Total |
|-----------------------|:-------:|:------:|:---------:|
| Exercises regularly | 36 | 144 | 180 |
| Does not exercise | 88 | 132 | 220 |
| Total | 124 | 276 | 400 |

(a) A person is selected at random from the 400 adults. Find the probability that the person got the flu. (2 points)

(b) Find the probability that a randomly selected person got the flu given that they exercise regularly. (2 points)

(c) A randomly selected person did not exercise regularly. Find the probability that this person got the flu. (2 points)

(d) Based on your answers to (a)–(c), is "getting the flu" independent of "exercising regularly"? Justify your answer with an appropriate probability comparison. (2 points)

(e) A randomly selected person got the flu. Find the probability that this person exercises regularly, and interpret this value in context. (2 points)

---

### Model Response

(a) Use the marginal probability — the flu column total over the grand total.

P(flu) = 124/400 = 0.31

(b) Condition on "exercises regularly," so restrict to that row (total 180).

P(flu | exercises) = 36/180 = 0.20

(c) Condition on "does not exercise," so restrict to that row (total 220).

P(flu | does not exercise) = 88/220 = 0.40

(d) Compare a conditional probability to the corresponding marginal:

P(flu | exercises) = 0.20      P(flu) = 0.31
0.20 ≠ 0.31

Because P(flu | exercises) ≠ P(flu), the two events are not independent — they are associated. Knowing that a person exercises regularly lowers the probability that they got the flu (from 31% overall to 20%).

(e) Reverse the conditioning: condition on "got flu," so restrict to the flu column (total 124).

P(exercises | flu) = 36/124 ≈ 0.290

Interpretation: Of the adults who got the flu, about 29.0% exercise regularly. (Equivalently, most people who got the flu — about 71% — did not exercise regularly.)

Rubric (10 points)

Part	Point	Earned for
(a)	1	Correct denominator: uses grand total 400 (or recognizes a marginal probability is needed)
	1	Correct value `124/400 = 0.31`
(b)	1	Correct denominator: uses the "exercises" row total 180
	1	Correct value `36/180 = 0.20`
(c)	1	Correct denominator: uses the "does not exercise" row total 220
	1	Correct value `88/220 = 0.40`
(d)	1	States a valid comparison, e.g. `P(flu | exercises)` vs `P(flu)` (or the two row conditionals against each other)
	1	Correct conclusion "not independent" consistent with the numbers compared
(e)	1	Correct value `36/124 ≈ 0.290` (denominator is the flu column total 124)
	1	Interpretation in context: "of those who got the flu, ~29% exercise regularly"
Total	10

Where students lose points:

Part (a): writing 36/180 or 36/124 — that's a conditional, not the requested marginal. Must use the grand total 400.
Parts (b)/(c): dividing by 400 instead of the row total — that gives the joint probability, not the conditional asked for.
Part (d): comparing the wrong quantities, or stating "not independent" with no numerical justification. A statement such as "they look related" earns the comparison point only if it is backed by numbers; the conclusion point requires the conclusion to match the numbers shown. (Comparing the two conditionals 0.20 ≠ 0.40 is also fully acceptable.)
Part (e): flipping the condition and reporting 36/180 = 0.20 (that's P(flu|exercises), not P(exercises|flu)). The denominator must be the flu column total 124. The interpretation point is lost if the sentence omits context ("about 29%" with no mention of what group it refers to).

🔑 Answer Key

Practice Problems

1. (B) 0.300. P(Late | Drove) = 60/200 = 0.30, restricting to the "Drove" row.

- (A) 60/350 uses the grand total — that's the joint probability P(Late ∩ Drove).

- (C) 60/105 is the reversed condition P(Drove | Late).

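- (D) repeats the value 0.300 but attaches it to the wrong group: a proportion "of all commuters" has denominator 350, like the marginal P(Late) = 105/350, whereas the question conditions on the 200 drivers.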

2. (B) 60/105 ≈ 0.571. "Given Late" → use the Late column total 105. P(Drove | Late) = 60/105.

- (A) 60/200 is P(Late | Drove) — the condition flipped.

- (C) 60/350 is the joint probability.

- (D) 105/350 is the marginal P(Late).

3. Independent. P(Late | Drove) = 60/200 = 0.30, and the marginal P(Late) = 105/350 = 0.30. Because 0.30 = 0.30, knowing that a commuter drove does not change the probability of being late. (Check the other row too: P(Late | Transit) = 45/150 = 0.30, also equal. Every conditional equals the marginal, confirming independence.)

4. (C) 0.50. P(A|B) = P(A∩B)/P(B) = 0.2/0.4 = 0.50.

- (A) 0.20 is the joint P(A∩B). (B) 0.40 is P(B). (D) 0.80 divides the wrong pair, P(B)/P(A) = 0.4/0.5.

5. (A) Yes. P(A|B) = 0.50 = P(A), so they're independent.

- (C) wrongly equates "overlapping" with "dependent." (B) overlap doesn't decide independence. (D) compares P(A|B) to P(B), the wrong benchmark.

6. (B) 0.133. Without replacement: P(both green) = (4/10)(3/9) = 12/90 ≈ 0.133.

- (A) 0.16 = (4/10)² treats draws as independent (with replacement). (C) 0.144 mis-multiplies. (D) 0.067 halves the answer.

7. (A) 0.76. P(on time ∩ bag arrives) = 0.80 · 0.95 = 0.76.

- (D) 0.56 uses the late-flight bag rate 0.70. (B)/(C) report single rates, not the joint.

8. (B) 0.900. P(bag arrives) = (0.80)(0.95) + (0.20)(0.70) = 0.76 + 0.14 = 0.90. (Choice (C) shows this same path sum and reaches the same value, so it is not a true distractor; (B) is the intended answer.)

- (A) 0.760 forgets the late-flight path. (D) 0.825 averages the two rates without weighting.

9. (B) 0.25. P(jam) = (0.30)(0.60) + (0.70)(0.10) = 0.18 + 0.07 = 0.25.

- (A) 0.18 is only the rain path. (D) 0.07 is only the dry path. (C) 0.70 adds the conditionals incorrectly.

10. P(rain | traffic jam) = 0.18/0.25 = 0.72. Reverse conditioning: the rain-and-jam joint (0.30)(0.60)=0.18 over total jam probability 0.25 (from Problem 9). So 72% of jam mornings were rainy.

11. (A) 0.27. P(+) = (0.10)(0.90) + (0.90)(0.20) = 0.09 + 0.18 = 0.27.

- (C) 0.11 = 0.09 + 0.02 weights the false-positive path by 0.10 instead of 0.90. (B)/(D) report single conditional rates.

12. P(condition | +) = 0.09/0.27 ≈ 0.333. Joint (0.10)(0.90)=0.09 over P(+)=0.27. Interpretation: Even after a positive test, only about a 1-in-3 chance the person actually has the condition — because the condition is rare and false positives are common. (This is the classic base-rate lesson.)

13. (B). A joint probability counts individuals in both categories out of everyone (grand total). (A) and (D) are conditionals; (C) is a marginal.

14. (B) P(A)·P(B). When P(A|B) = P(A), substitute into P(A∩B) = P(B)·P(A|B) to get P(B)·P(A). That's the definition of independence.

15. Independent. Compare P(sport | instrument) = 0.5 to P(sport) = 0.5. They're equal, so knowing a student plays an instrument doesn't change the 0.5 chance they play a sport. (Justification: P(sport | instrument) = P(sport) = 0.5.)

FRQ: Full model response and 10-point rubric appear in Section (f) above. Key recomputed values: P(flu) = 124/400 = 0.31; P(flu | exercises) = 36/180 = 0.20; P(flu | no exercise) = 88/220 = 0.40; independence fails since 0.20 ≠ 0.31; reversed P(exercises | flu) = 36/124 ≈ 0.290.

---

StatsIQ · Lesson 10 of 30 · Unit 2: Probability, Random Variables, and Probability Distributions · Phase 2: Probability

This lesson is independent study material aligned to the 2026–27 AP Statistics Course and Exam Description. AP® is a trademark registered by the College Board, which is not affiliated with and does not endorse this product.

Accuracy review: All probabilities in this lesson were independently recomputed. Two-way-table conditionals confirmed against their row/column totals; all joint, marginal, and conditional values cross-checked for internal consistency. Reviewed for statistical accuracy by Isaac, retired actuary.
