Open your phone's screen-time report. You're looking at a tiny statistical study — and you didn't even design it. Your phone recorded a number of minutes every day, sorted them by app, and handed you an average.
Now answer this honestly: is your screen time going up or down?
Here's the thing — to answer that, you need more than a number. You need to know which days got measured (just this week? this month?), what counts as screen time (does a podcast playing in your pocket count?), and what "going up" even means (up compared to what?).
That fuzzy feeling you have right now ("I'm not totally sure what I'm even being asked") is exactly the problem statistics exists to solve. Every good statistical study starts by turning a vague wondering into a question that data can actually answer. That's where we start today, and it's the first thing the new AP exam tests.
Statistics is the science of learning from data. But "data" doesn't mean a spreadsheet of random numbers — it means measurements taken on individuals to answer a question. Let's build the vocabulary, because on the AP exam, using the wrong word costs points.
The individuals (also called observational units) are the people, animals, or things you collect data about. They don't have to be people — they could be cars, countries, tweets, or basketball games.
A variable is any characteristic you record about each individual. If your individuals are students, your variables might be height, GPA, favorite sport, and number of siblings.
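Variables come in two flavors, and telling them apart is a skill you'll use in every single lesson:
Categorical variables place each individual into a group or category (favorite sport, blood type).
Quantitative variables take numerical values for which arithmetic, such as averaging, makes sense (height, number of siblings).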
Watch out: numbers aren't automatically quantitative. A jersey number is a label (categorical) even though it's written with digits. Ask yourself: "Does averaging these produce something meaningful?" Average jersey number = nonsense. Average height = useful. That's your test.
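To see the averaging test in action, here is a minimal Python sketch. The jersey numbers and heights are made-up illustration values, not data from this lesson. The point is only that software will happily average both lists; deciding whether the result means anything is still your job.

```python
from statistics import mean

# Hypothetical values for five players (invented for illustration).
jersey_numbers = [4, 11, 23, 30, 77]      # labels that happen to be written with digits
heights_cm = [178, 185, 191, 198, 203]    # measured amounts

# The arithmetic runs in both cases...
avg_jersey = mean(jersey_numbers)   # 29, but no player "is" a 29: meaningless
avg_height = mean(heights_cm)       # 191, a typical height: useful

print(f"Average jersey number: {avg_jersey}")
print(f"Average height (cm): {avg_height}")
```

Both calls succeed, which is exactly why the test has to be applied by a person: the code cannot tell a label from an amount.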
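Quantitative variables split further:
Discrete variables are counted in whole units (number of siblings, number of texts sent).
Continuous variables are measured and can take any value in an interval (height, reaction time).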
The population is the entire group you want to know about. The sample is the part of the population you actually collect data from.
Why not just measure the whole population? Usually it's impossible or wildly expensive. You can't survey all of the roughly 17 million U.S. high schoolers, so you study a sample of, say, 1,500 of them and use that sample to learn about the whole population. That move, from sample to population, is the engine of the entire course.
This is the distinction students mess up most, so slow down here.
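A parameter is a number that describes the population: for example, the population mean μ, the population proportion p, or the population standard deviation σ.
A statistic is a number that describes the sample: for example, the sample mean x̄, the sample proportion p̂, or the sample standard deviation s.
A memory hook: population goes with parameter; sample goes with statistic. Both pairs share a first letter.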
Worked mini-example — parameter or statistic?
A polling company wants to know what fraction of all registered U.S. voters approve of a new law. The true (unknown) fraction is p, a parameter — it describes the whole population.
They can't ask everyone, so they survey 1,200 randomly chosen voters and find that 624 approve. The fraction 624/1200 = 0.52 is p̂ = 0.52, a statistic: it describes only the sample.
The whole point of the study is to use the statistic (0.52) to estimate the parameter (p, unknown). The statistic is what we have; the parameter is what we want.
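The arithmetic in this mini-example can be checked with a few lines of Python. The variable names are my own; the counts (624 approving out of 1,200 surveyed) come from the example itself.

```python
# Counts from the polling example.
approve = 624   # sampled voters who approve of the law
n = 1200        # sampled voters in total

# Sample proportion p-hat: a statistic, computed from the sample only.
p_hat = approve / n
print(f"p-hat = {p_hat:.2f}")
```

Notice what the code cannot do: nothing here computes the population proportion p. That value stays unknown, and the sample proportion is only an estimate of it.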
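There are two big jobs in statistics:
Descriptive statistics summarizes the data you actually have, using graphs and numbers that describe the sample itself.
Inferential statistics uses a sample to draw conclusions about the population, while accounting for uncertainty.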
A statistical question is one that anticipates variability — it expects the answers to differ from individual to individual, and it can only be answered by collecting data.
"How tall is the principal?" is not a statistical question — there's one answer, no variability. But "How tall are the students at this school?" is statistical: heights vary, and you'd need to collect data to describe them.
A good statistical question is:
The revised AP framework adds an explicit skill: formulating an investigative question as the first step of a real statistical study. This is Practice 1, and it shows up on FRQ 1. Don't skip it.
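An investigative question is the precise, data-answerable question that drives an entire study. Going from a vague wondering to a sharp investigative question usually means nailing down four things:
The individuals: who or what you will collect data about.
The variable(s): what you will record about each individual, and how you will measure it.
The population: the whole group you want a conclusion about.
The comparison or relationship: which groups you are comparing, or which variables you are relating.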
Watch a vague wondering become an investigative question:
The second version names the individuals (Lincoln High students), the variable (hours of sleep on a typical school night), and the comparison (athletes vs. non-athletes). Now you could actually go collect data and answer it.
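Every statistical study moves through four stages. This is your roadmap for the whole course:
Ask: formulate an investigative question.
Collect: gather data, for example by choosing a random sample and recording the variable of interest.
Analyze: summarize the data with graphs and numbers.
Interpret: state a conclusion in context.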
We'll spiral through these four moves again and again. Today is almost entirely about Ask — because a study built on a fuzzy question can't be rescued by fancy analysis later.
A quick orientation, since the new exam allows a TI-84 on every question. Your calculator will crunch means, standard deviations, probabilities, and full inference procedures for you — we'll build those skills lesson by lesson.
What it will not do: decide whether a variable is categorical or quantitative, choose the right population, formulate your question, or interpret a result in context. The calculator gives numbers. You give meaning. That division of labor is the single most important idea in this course — and it's why "interpretation is the point" will be a drumbeat from here to the exam.
Problem. A school records the following for each student: (i) grade level (9, 10, 11, 12), (ii) number of AP courses taken, (iii) height in centimeters, (iv) primary mode of transport to school (bus, car, walk, bike).
For each, state whether it's categorical or quantitative; if quantitative, state discrete or continuous.
Strategy. For each variable, ask: "Is averaging these meaningful?" If no → categorical. If yes → quantitative, then ask "counted (discrete) or measured (continuous)?"
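Solution.
(i) Grade level: categorical. The values 9 through 12 label groups of students, and averaging them does not describe anyone.
(ii) Number of AP courses taken: quantitative, discrete (counted).
(iii) Height in centimeters: quantitative, continuous (measured).
(iv) Primary mode of transport: categorical.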
Interpretation. Notice (i): digits don't make a variable quantitative. Always check whether arithmetic means something.
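The strategy for this problem is a two-question decision procedure, so it can be sketched as a small Python function. The function name `classify` and its arguments are hypothetical, introduced only for illustration. The judgments passed in (is averaging meaningful? counted or measured?) are supplied by a person, not computed.

```python
def classify(averaging_meaningful: bool, counted: bool = False) -> str:
    """Apply the two strategy questions, in order.

    averaging_meaningful: your answer to "Is averaging these meaningful?"
    counted: your answer to "counted (True) or measured (False)?";
             ignored when the variable is categorical.
    """
    if not averaging_meaningful:
        return "categorical"
    return "quantitative, discrete" if counted else "quantitative, continuous"


# Human judgments for the four school variables in the problem.
answers = {
    "grade level": classify(averaging_meaningful=False),
    "number of AP courses": classify(averaging_meaningful=True, counted=True),
    "height in cm": classify(averaging_meaningful=True, counted=False),
    "mode of transport": classify(averaging_meaningful=False),
}

for variable, kind in answers.items():
    print(f"{variable}: {kind}")
```

The order of the questions matters: "counted or measured?" is only asked once a variable has passed the averaging test.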
Problem. A streaming service wants to know the average number of hours all its 40 million subscribers watch per week. It pulls a random sample of 5,000 subscribers and finds their mean is x̄ = 11.2 hours.
Identify the (a) individuals, (b) population, (c) sample, (d) parameter, (e) statistic.
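Solution.
(a) Individuals: the service's subscribers.
(b) Population: all 40 million subscribers.
(c) Sample: the 5,000 randomly selected subscribers.
(d) Parameter: μ, the mean weekly viewing hours for all 40 million subscribers (unknown).
(e) Statistic: x̄ = 11.2 hours, the mean for the 5,000 sampled subscribers.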
Interpretation. The company would like to know μ (the parameter) but can only compute x̄ (the statistic). Using 11.2 hours to estimate μ is inference — exactly what Units 3–5 make rigorous.
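A small simulation can make the parameter/statistic split concrete. This is only a sketch under loud assumptions: the "population" below is 100,000 made-up viewing times drawn from a uniform distribution, not the service's real 40 million subscribers, and the distribution shape is invented. Because the population is built in code, its mean can be computed directly, which is exactly what the real company cannot do.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Stand-in population: 100,000 invented weekly viewing hours (NOT real data).
population = [random.uniform(0, 22) for _ in range(100_000)]
mu = mean(population)   # parameter: knowable here only because we built the population

# Simple random sample of 5,000 individuals, as in the problem.
sample = random.sample(population, 5_000)
x_bar = mean(sample)    # statistic: what the company can actually compute

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The two printed values should land close together but not match exactly. Change the seed and the sample mean shifts a little while the population mean for that run stays fixed, which is why a statistic is treated as an estimate rather than the answer.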
Problem. Classify each as a statistical question (anticipates variability, needs data) or not, and fix the non-statistical ones.
(i) "What was LeBron James's point total in last night's game?"
(ii) "How many points per game do NBA starters average this season?"
(iii) "What is my resting heart rate right now?"
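Solution.
(i) Not statistical: one player, one game, one fixed answer. A statistical version: "How many points per game did LeBron James score across this season's games?"
(ii) Statistical: scoring averages differ from starter to starter, and answering requires collecting data on many players.
(iii) Not statistical: one person at one moment gives a single value. A statistical version: "What are the resting heart rates of the students in this class?"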
Interpretation. The test is variability: if every reasonable observation gives the same answer, it isn't a statistical question.
Problem. A health teacher wonders, "Does energy-drink use hurt students' grades?" Rewrite this as a precise investigative question suitable for a statistical study, and identify the individuals, variable(s), population, and the comparison or relationship being studied.
Strategy. Pin down the four pieces: individuals, variable(s) + how measured, population, and the comparison/relationship. Make every term something you could actually record.
Solution.
"Among the 1,800 students at Riverside High this semester, is there an association between the number of energy drinks a student consumes in a typical week (self-reported) and their semester GPA?"
Interpretation. The original wording ("hurt") sneaks in causation and is too vague to measure. The rewrite is measurable and honest — it asks about association, not cause. (Why we can't jump to "causes" yet is Lesson 8. For now: a good investigative question never claims more than the data can deliver.)
1. Treating any number as quantitative.
Wrong: "Zip code is quantitative because it's a number." Why it's wrong: averaging zip codes is meaningless; the digits are a label. Fix: apply the averaging test — if the average is nonsense, it's categorical.
2. Swapping parameter and statistic.
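Wrong: calling the sample mean a parameter. Why it's wrong: a parameter describes the population (usually unknown); a statistic describes the sample (computed from data). Fix: use the notation cues. The symbols μ, σ, and p describe the population (parameters); x̄, s, and p̂ describe the sample (statistics). Note that p is not a Greek letter, so learn the proportion pair p and p̂ on its own. Remember the pairing: population–parameter, sample–statistic.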
3. Confusing the sample with the population.
Wrong: "The population is the 5,000 people surveyed." Why it's wrong: those 5,000 are the sample; the population is the whole group you want to learn about. Fix: ask "Who do we ultimately want a conclusion about?" — that's the population.
4. Writing a question with no variability.
Wrong: "How tall is the tallest player?" as a statistical question. Why it's wrong: it has one fixed answer. Fix: phrase it so answers vary across individuals ("How do the heights of the players vary?").
5. Sneaking causation into an investigative question too early.
Wrong: "Does sugar cause hyperactivity in kids?" from survey data. Why it's wrong: observational data usually can't establish cause (Lesson 8). Fix: ask about an association unless the study is a designed experiment.
5. […] 12/300 = 0.04 defective is a:
6. […] p. This p is a:
11. A poll surveys 1,000 of a city's 600,000 adults about a transit plan. Identify each of the following as population, sample, parameter, or statistic:
(a) the 600,000 adults; (b) the 1,000 surveyed; (c) the true % of all adults who support the plan; (d) the 58% of surveyed adults who support it.
12. For each variable on hospital patients, state categorical or quantitative; if quantitative, discrete or continuous:
(a) number of nights stayed; (b) blood pressure (mm Hg); (c) insurance provider; (d) body temperature (°F).
13. (Interpretation) A nutritionist claims: "Based on my sample of 50 clients, people who eat breakfast weigh less." Explain, in context, the difference between what this describes about the sample and what it would take to make an inference about all people. Use the words descriptive, inferential, sample, and population.
14. (Interpretation) A student writes the investigative question: "Are phones bad?" Explain why this is not yet a usable statistical question, then rewrite it as a precise investigative question. Identify the individuals, the variable(s) and how you'd measure them, and the population.
15. Which step of the investigative process (ask → collect → analyze → interpret) does each describe?
(a) Choosing a random sample of 200 voters and recording their party.
(b) Writing the question "Do seniors and juniors differ in average commute time?"
(c) Concluding "Seniors commute about 8 minutes longer, on average, than juniors at this school."
(d) Making a boxplot of commute times by grade.
---
1. B. Blood type sorts people into labeled groups → categorical. A, C, D are all numerical measurements where arithmetic is meaningful → quantitative.
2. B. Number of texts is a count of whole units → quantitative discrete. A (eye color) and D (country) are categorical. C (weight) is quantitative but continuous, not discrete.
3. C. Reaction time is measured and can take any value in an interval → quantitative continuous. B is the classic trap: time is measured, not counted, so it's continuous, not discrete.
4. B. The 200,000 daily cases are the entire group of interest → population. A would be the 300 inspected; C/D are numbers describing data, not a group size.
5. C. 0.04 is computed from the sample of 300 → statistic. A (parameter) describes the whole population; B names a group, not a number; D (variable) is a characteristic, not this computed value.
6. C. p is the proportion for all cases (the population), usually unknown → parameter. A is the sample analog (p̂); B/D name groups/individuals, not this number.
7. B. Stating confidence about all cases generalizes from sample to population → inferential. A, C, D only summarize the sample itself → descriptive.
8. C. Sleep varies across students and needs data → statistical question. A and D have single fixed answers; B is one value for one person on one night — no variability.
9. C. A jersey number is a label, not an amount (averaging jersey numbers is meaningless) → categorical. A/B wrongly treat the digits as quantitative; D (parameter) is a number describing a population, which is unrelated here.
10. C. It names individuals (gym adults), measurable variables (weekly coffee, resting heart rate), and a relationship → a strong investigative question. A is vague and uses "good"; B has one fixed answer (not statistical); D is a yes/no opinion item, not a study question.
11. (a) population (all 600,000 adults); (b) sample (the 1,000 surveyed); (c) parameter (true % of all adults — describes the population, unknown); (d) statistic (58% computed from the sample).
12. (a) quantitative, discrete (counted nights); (b) quantitative, continuous (measured); (c) categorical (insurance provider is a label); (d) quantitative, continuous (measured temperature).
13. Sample answer. The claim "in my sample of 50 clients, breakfast-eaters weigh less" is descriptive — it summarizes a pattern within the sample of 50 and says nothing certain beyond them. To make an inferential claim — that breakfast-eaters weigh less among the whole population of people — she'd need a well-designed study (ideally random selection, and an experiment to address cause) and a procedure that accounts for uncertainty. With only 50 self-selected clients, she can describe the sample but cannot reliably generalize to the population. (Full credit: correctly separates a within-sample description from a population-level generalization using all four terms.)
14. Sample answer. "Are phones bad?" can't be answered with data: "bad" isn't a measurable variable, and no individuals or population are specified. A usable rewrite: "Among students at our school, is there an association between hours of daily phone use (self-reported) and self-reported hours of sleep on school nights?" Individuals: students at our school. Variables: daily phone-use hours and nightly sleep hours, both quantitative, gathered by survey. Population: all students at our school. (Full credit: explains why "bad" isn't measurable, then gives a measurable question with individuals, measured variable(s), and population.)
15. (a) Collect; (b) Ask; (c) Interpret; (d) Analyze.
---
StatsIQ · Lesson 1 of 30 · Unit 1 · Aligned to the 2026–27 AP Statistics framework. Not affiliated with the College Board. AP is a registered trademark of the College Board. Content pending statistical-accuracy review (Isaac).