In the 1990s a wave of headlines announced that kids who sleep with a night-light on are far more likely to become nearsighted. The correlation was real and strong; researchers had measured it. Parents started flipping off lights. Then a 2000 follow-up found the actual story: nearsighted parents are more likely both to have nearsighted children (they pass it on genetically) and to leave a light on in the nursery. The night-light didn't cause anything. A hidden third factor, the parents' eyes, was pulling both strings.
That's not a cute footnote. That's the whole reason research methods exist, and it's the reason this is one of the two most heavily tested topics on the AP exam. Psychology lives or dies on a single question: does this thing actually cause that thing, or do they just travel together? Get the method wrong and you'll confidently believe something false — like a generation of parents rearranging nurseries. This lesson is your tool kit for never being that parent.
Every research method in psychology is trying to do one of three jobs. Descriptive methods (case studies, naturalistic observation, surveys) just describe what's happening — they take a careful snapshot. Correlational methods measure whether two variables move together. Only one method can establish that one thing causes another: the experiment. Keep that hierarchy in your head — describe, relate, cause — because the AP exam's favorite trap is offering you a study that can only describe or relate and daring you to claim it proved a cause.
An experiment is a study in which the researcher deliberately manipulates one variable to see its effect on another, while controlling everything else. The variable you manipulate is the independent variable (IV) — the suspected cause. The variable you measure is the dependent variable (DV) — the outcome, the thing that "depends" on the IV. Mnemonic: the DV is the Data you collect at the end; the IV is what I (the experimenter) change.
To know whether the IV did anything, you need a comparison. The experimental group receives the treatment (the IV); the control group does not, serving as the baseline. If the only systematic difference between the groups is the IV, then any reliable difference in the DV can be attributed to the IV.
But "the only difference" is doing enormous work in that sentence. Suppose your experimental group happened to be smarter, better rested, or more motivated than your control group to begin with — then you can't tell whether the IV or those pre-existing differences caused the outcome. A variable other than the IV that could explain your results is a confounding variable. Confounds are the assassins of experiments, and beating them is what the whole apparatus below is for.
The single most powerful defense is random assignment: placing each participant into the experimental or control group purely by chance (a coin flip, a random number generator). Random assignment doesn't make the groups identical, but it spreads pre-existing differences (intelligence, mood, prior experience) across both groups by chance alone, so that on average they balance out, and the balance gets closer as the groups get larger. This is the feature that makes an experiment an experiment. (Do not confuse it with random sampling, which is about who gets into the study; we'll hit that under surveys.)
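To see why chance alone does the balancing, here is a minimal sketch in Python. Everything in it is made up for illustration: the 200 volunteers, their "baseline memory score," and the helper name randomly_assign are assumptions, not part of any real study.

```python
import random
import statistics

def randomly_assign(participants, rng):
    """Shuffle participants and split them into two equal-sized groups."""
    shuffled = participants[:]      # copy so the caller's list is untouched
    rng.shuffle(shuffled)           # chance alone decides the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical pre-existing trait: a baseline memory score for 200 volunteers.
baseline_scores = [rng.gauss(100, 15) for _ in range(200)]

experimental, control = randomly_assign(baseline_scores, rng)

mean_exp = statistics.mean(experimental)
mean_ctl = statistics.mean(control)
print(f"experimental mean baseline: {mean_exp:.1f}")
print(f"control mean baseline:      {mean_ctl:.1f}")
print(f"difference:                 {mean_exp - mean_ctl:+.1f}")
```

The two group means should land close together even though nobody sorted people by score. Shrink the list to 10 volunteers and the gap will usually grow, which is why larger groups give random assignment more room to work.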
Try This. Design a one-sentence experiment testing whether caffeine improves memory. Name the IV (caffeine vs. no caffeine), the DV (words recalled on a memory test), the experimental group (gets caffeine), the control group (gets a decaf drink), and one confound you'd have to control (time of day, since alertness varies across the day). If you can fill in all five blanks, you can dissect any experiment the exam throws at you.
Before you can measure anything, you have to define it in measurable terms. Operationalization means stating a variable as the specific operations used to measure or manipulate it. "Aggression" is vague; "number of times the child hits the inflatable doll in five minutes" is operationalized. "Stress" is vague; "cortisol level in saliva" or "self-rated stress on a 1–10 scale" is operationalized. Operational definitions make a study replicable — another researcher can run it exactly — and the exam loves asking you to identify or state one.
Even with random assignment, people's expectations can poison results. If participants know they got the real drug, they may report feeling better just because they expect to — the placebo effect, a change caused by belief in a treatment rather than the treatment itself. To control it, the control group gets a placebo: an inert substance or fake treatment (a sugar pill) that's indistinguishable from the real thing.
In a single-blind study, the participants don't know which group they're in, so their expectations can't differ. But the researcher's expectations can leak too — unconsciously smiling more at the treatment group, scoring their results more generously. In a double-blind study, neither the participants nor the researchers who interact with them know who's in which group until the data are collected. Double-blind is the gold standard for drug and therapy research precisely because it shuts down expectations on both sides.
A correlational study measures the relationship between two variables without manipulating anything — you just observe and record. The result is a correlation coefficient (r), a number from −1.00 to +1.00 that captures the strength and direction of the relationship. A positive correlation means the variables move in the same direction (more studying, higher grades). A negative correlation means they move in opposite directions (more stress, less sleep). A correlation near zero means no consistent relationship. Crucially, the sign tells you direction, not strength — an r of −.85 is a stronger relationship than +.30.
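If you want to see where r comes from, the sketch below computes the Pearson correlation coefficient by hand. The function name pearson_r and all four data lists are hypothetical, invented only to mirror the studying/grades and stress/sleep examples above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together...
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # ...scaled by how much each varies on its own.
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

# Hypothetical data: hours studied and exam grade for six students.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_grade = [62, 65, 71, 74, 80, 85]

# Hypothetical data: self-rated stress (1-10) and hours of sleep.
stress = [2, 4, 5, 7, 8, 9]
hours_sleep = [8.5, 8.0, 7.0, 6.5, 6.0, 5.0]

r_study = pearson_r(hours_studied, exam_grade)
r_sleep = pearson_r(stress, hours_sleep)
print(f"studying vs. grades: r = {r_study:+.2f}")  # positive: same direction
print(f"stress vs. sleep:    r = {r_sleep:+.2f}")  # negative: opposite directions
```

Both made-up relationships are strong, but one has a plus sign and one has a minus sign. That is the sign-versus-strength distinction in action.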
Here is the line you must tattoo on your brain: correlation does not prove causation. When two variables correlate, there are always three possibilities. (1) A causes B. (2) B causes A — the directionality problem (does low self-esteem cause depression, or does depression lower self-esteem?). (3) A hidden third variable causes both, while neither causes the other (the night-lights and nearsightedness — the parents' eyes were the third variable). Because a correlational study can't rule out (2) and (3), it can never establish cause. Only the experiment, with its manipulation and control, can.
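Possibility (3) is easy to demonstrate with a simulation. In the sketch below, the night-light has no effect on a child's eyes at all; only the parent's nearsightedness matters. Every probability is invented for illustration and is not taken from the real studies.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

def simulate_child(rng):
    """Return (parent_myopic, night_light, child_myopic) for one simulated child."""
    parent_myopic = rng.random() < 0.40
    # Hypothetical: nearsighted parents leave a light on more often.
    night_light = rng.random() < (0.70 if parent_myopic else 0.20)
    # Hypothetical: the child's risk depends ONLY on the parent, never on the light.
    child_myopic = rng.random() < (0.50 if parent_myopic else 0.10)
    return parent_myopic, night_light, child_myopic

children = [simulate_child(rng) for _ in range(20_000)]

def myopia_rate(rows):
    """Share of children in `rows` who are nearsighted."""
    rows = list(rows)
    return sum(child for _, _, child in rows) / len(rows)

# Ignoring the parents, the night-light looks dangerous.
rate_light = myopia_rate(c for c in children if c[1])
rate_dark = myopia_rate(c for c in children if not c[1])
print(f"all children:   light {rate_light:.2f}  vs.  dark {rate_dark:.2f}")

# Comparing like with like (same parent status), the gap should mostly vanish.
gaps_within = {}
for parent_status in (True, False):
    stratum = [c for c in children if c[0] == parent_status]
    with_light = myopia_rate(c for c in stratum if c[1])
    without_light = myopia_rate(c for c in stratum if not c[1])
    gaps_within[parent_status] = with_light - without_light
    print(f"parent myopic={parent_status}: light {with_light:.2f}  vs.  dark {without_light:.2f}")
```

The overall comparison shows a large gap, yet inside each parent group the light makes roughly no difference. A correlational study that never measured the parents would have no way to see this.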
One specific correlational error worth a name: an illusory correlation is a perceived relationship between two things that aren't actually related — like believing a full moon causes more emergency-room chaos, or that you always hit traffic when you're already late. We notice and remember the hits, ignore the misses, and invent a pattern that isn't there.
A survey gathers self-reported attitudes or behaviors from many people through questions. Surveys are fast and wide, but vulnerable. Wording effects mean small changes in phrasing shift answers — people support "aid to the needy" far more than "welfare," though they're the same thing. Sampling bias occurs when your sample doesn't represent the population (the entire group you want to draw conclusions about); your sample is the subset you actually study. The defense is a random sample, in which every member of the population has an equal chance of being selected — that's what lets you generalize from sample to population. (Again: random sampling = who's in the study; random assignment = which group they land in.)
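Here is a small sketch of why a random sample generalizes and a convenient one may not. The population is entirely hypothetical: ten schools of 500 students each, with one school that studies far more than the rest.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: (school, weekly study hours) for 5,000 students.
# School 0 is an outlier that studies about twice as much as the others.
population = []
for school in range(10):
    school_mean = 12 if school == 0 else 6
    for _ in range(500):
        population.append((school, rng.gauss(school_mean, 2)))

true_mean = statistics.mean(hours for _, hours in population)

# Biased sample: 200 students, all from the one unusual school.
biased_sample = [hours for school, hours in population if school == 0][:200]

# Random sample: every student in the population has an equal chance of selection.
random_sample = [hours for _, hours in rng.sample(population, 200)]

biased_mean = statistics.mean(biased_sample)
random_mean = statistics.mean(random_sample)
print(f"population mean:    {true_mean:.1f} hours")
print(f"biased sample mean: {biased_mean:.1f} hours")
print(f"random sample mean: {random_mean:.1f} hours")
```

Both samples have the same size, so size is not what rescues the random one. What matters is that nobody had a better chance of being picked than anybody else.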
A case study examines one individual or small group in great depth. It's superb for rare conditions (a patient with a unique brain injury) and generates hypotheses, but you can't generalize from one person. Naturalistic observation watches behavior in its natural setting without intervening (a primatologist recording chimps in the wild). It captures real behavior but sacrifices control, and risks the observer changing the behavior just by being present. Finally, a meta-analysis isn't a study of people at all — it's a statistical study of studies, pooling the results of many investigations to estimate an overall effect. It's powerful because it averages out the quirks of any single study.
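To make "pooling" concrete, the sketch below uses one common approach, an inverse-variance weighted average, in which more precise studies count for more. This goes a step beyond what the AP exam asks, the three studies and their numbers are hypothetical, and real meta-analyses often use more elaborate models.

```python
import math

def pooled_effect(studies):
    """Inverse-variance weighted average of (effect_size, standard_error) pairs."""
    # A study with a smaller standard error is more precise, so it gets more weight.
    weights = [1 / se ** 2 for _, se in studies]
    pooled = sum(w * effect for w, (effect, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical studies of the same treatment: (effect size, standard error).
studies = [
    (0.30, 0.10),  # large, precise study
    (0.10, 0.20),  # smaller study
    (0.50, 0.25),  # smallest, noisiest study
]

pooled, pooled_se = pooled_effect(studies)
print(f"pooled effect: {pooled:.2f} (standard error {pooled_se:.2f})")
```

The pooled estimate sits closest to the most precise study, and its standard error is smaller than that of any single study. That is the sense in which a meta-analysis averages out the quirks of individual investigations.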
Loftus & Palmer's "smashed vs. hit" experiment (1974).
Who & when: Elizabeth Loftus and John Palmer, 1974 — a landmark demonstration of how leading questions can distort memory, and a clean model of experimental design.
What they did: Participants watched film clips of car accidents, then answered questions about them. The independent variable was a single word in one question: "About how fast were the cars going when they ___ each other?" The blank was filled with smashed, collided, bumped, hit, or contacted, and that verb was the only thing that differed across groups. The dependent variable was the participants' estimated speed.
What they found: The verb changed the memory. "Smashed" produced an average estimate of about 40.8 mph; "contacted," only about 31.8 mph. In a second experiment, participants questioned a week after viewing who'd heard "smashed" were more than twice as likely as those who'd heard "hit" to falsely remember seeing broken glass that was never in the film.
Why it matters: It's a textbook experiment (one cleanly manipulated IV, a measured DV, random assignment to verb conditions), and it supports a causal claim: the wording caused the memory distortion. It also launched the study of the misinformation effect (Lesson 11) and reshaped how courts treat eyewitness testimony. For the AP exam, Loftus = memory is reconstructive, plus a model of how a tiny change in the IV yields a real causal finding.
Scenario 1. A researcher finds that teenagers who spend more hours on social media report more symptoms of depression. A news site runs the headline: "Social Media Causes Teen Depression."
What's the methodological problem? This is a correlational study, so the headline overreaches — correlation does not prove causation. At least three readings survive: heavy social-media use could worsen depression; depression could drive teens toward more social-media use (the directionality problem); or a third variable — say, social isolation or poor sleep — could cause both. Only a true experiment randomly assigning teens to different social-media levels could support a causal claim.
Scenario 2. A company tests a new "focus" supplement. Volunteers are randomly assigned to take either the supplement or an identical-looking sugar pill. Neither the volunteers nor the staff handing out pills and scoring the focus tests know who got what until the study ends.
Identify the design features. This is a double-blind experiment with a placebo control. The sugar pill is the placebo, controlling for the placebo effect; "neither volunteers nor staff know" makes it double-blind, controlling both participant expectations and experimenter bias; random assignment balances pre-existing differences. The IV is supplement-vs-placebo; the DV is the focus-test score.
Scenario 3. A psychologist wants to understand the daily life and language of a child raised in extreme isolation, a situation far too rare and unethical to create on purpose.
Which method fits, and what's the limitation? A case study — the only practical way to study a unique, unrepeatable individual in depth. The trade-off is generalizability: findings from one extraordinary case can suggest hypotheses but can't be assumed to apply to children in general.
Independent vs. Dependent variable. Constantly swapped. The IV is what I (the experimenter) manipulate — the cause. The DV is the Data I measure at the end — the effect that depends on the IV. Test tip: the IV usually comes first in time (the treatment), the DV after (the result).
Random assignment vs. Random sampling. Both say "random," opposite jobs. Random sampling = how you pick who enters the study (for generalizing to a population). Random assignment = how you sort participants into groups once they're in (for ruling out confounds). Mnemonic: Sampling = who's in the Study; Assignment = which Arm they're in.
Correlation vs. Causation. The headline killer. A correlation tells you two things move together; it cannot tell you which causes which or whether a third variable causes both. If a study didn't manipulate a variable and use random assignment, it cannot claim cause — full stop.
Positive/negative correlation vs. strong/weak. Sign ≠ strength. Positive/negative is direction (same way vs. opposite ways); strength is how close to ±1. So −.90 is a strong (negative) correlation and +.10 is a weak (positive) one. Don't read the minus sign as "weaker."
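A quick way to drill the sign-versus-strength rule is to rank correlations by absolute value. The sketch below does exactly that; the four labels are hypothetical placeholders, and the r values are the ones used as examples in this lesson.

```python
# Hypothetical labels paired with example r values.
correlations = {
    "pair A": -0.90,
    "pair B": +0.10,
    "pair C": -0.85,
    "pair D": +0.30,
}

def direction(r):
    """Read direction from the sign only."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Strength is distance from zero, so sort by absolute value, strongest first.
ranked = sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)

for label, r in ranked:
    print(f"{label}: r = {r:+.2f}  direction: {direction(r)}  strength: {abs(r):.2f}")
```

The two negative correlations come out on top. If your instinct was to put +.30 first, you were reading the sign as strength.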
Four-choice MCQs in current AP format. Answers and explanations in section (h).
1. (B) Independent variable. The tutoring program is what the researcher manipulates (the suspected cause). (A) the DV is the measured outcome — test scores; (C) a confound is an unwanted extra variable, not the planned treatment; (D) the control group is a group, not a variable.
2. (B). Random assignment spreads pre-existing differences randomly across groups so they cancel out, isolating the IV's effect. (A) describes random sampling, a different procedure; (C) describes blinding; (D) random assignment doesn't change sample size.
3. (B). A negative sign means the variables move in opposite directions (more sleep → fewer errors), and .78 is close to 1, so the relationship is strong. (A) reverses the direction; (C) overclaims cause from a correlation; (D) misreads the sign as weakness — sign is direction, not strength.
4. (C). A third variable (population size) plausibly inflates both churches and crime, with neither causing the other. (A) and (B) leap to causation a correlation can't support; (D) "illusory" means a perceived relationship that isn't statistically real — here the relationship is real, just non-causal.
5. (C). An operational definition states a variable in specific, measurable terms, making the study replicable and quantifiable. (A) ethics is unrelated; (B) blinding concerns who knows the conditions; (D) operationalizing doesn't determine the method type.
6. (C). "Double" blind means both participants and the researchers interacting with them are kept unaware, controlling expectations on both sides. (A) describes single-blind; (B) is incomplete; (D) blinding statisticians isn't what defines the term.
7. (C) Naturalistic observation. Watching and recording real behavior in its natural setting without intervening is the definition. (A) a case study probes one individual in depth; (B) an experiment manipulates a variable; (D) a survey collects self-report.
8. (B). The placebo effect is improvement caused by belief in a treatment rather than the treatment itself. (A) describes a real drug effect, not the placebo effect; (C) is experimenter bias; (D) is sampling bias.
9. (B) Wording effects. Identical underlying issues drawing different support based on phrasing is the textbook wording effect. (A) sampling bias concerns who is asked; (C) random assignment is an experimental procedure; (D) directionality is a correlational problem.
10. (B) Sampling bias. A sample from one wealthy private school doesn't represent all U.S. high schoolers, so conclusions can't generalize. (A) and (D) are experimental-control concepts irrelevant to a survey's representativeness; (C) there's no manipulated IV here.
11. (B) Illusory correlation. Perceiving a relationship (socks → wins) where none statistically exists, by remembering hits and ignoring misses, is an illusory correlation. (A) requires an actual measured relationship; (C) a third variable requires two truly correlated variables; (D) operationalization is about defining variables.
12. (C) Meta-analysis. Statistically combining the results of many existing studies to estimate an overall effect is a meta-analysis. (A), (B), and (D) each involve collecting new data from participants, not pooling prior studies.
13. (C) Confounding variable. Time of day (morning vs. post-lunch) differs systematically between groups alongside the IV, offering an alternative explanation for any difference — the definition of a confound. (A) the DV is the quiz score; (B) no fake treatment is involved; (D) the coin flip was random assignment, which the colleague isn't disputing.
14. (B) Strong and negative. A clear downward slope means negative (variables move oppositely); tight clustering around the line means strong. (A) misreads the downward slope as positive; (C) "weak" contradicts the tight clustering; (D) zero would show no slope/random scatter.
15. (B) Social connection. Strength is judged by absolute distance from zero, ignoring sign: |.62| > |−.45| > |.21| > |−.08|, so +.62 is strongest. (A), (C), and (D) all have smaller absolute values, even though commute time's negative sign might tempt a "direction = strength" error.
---
PsyIQ · Lesson 2 of 30 · Unit 1: Biological Bases of Behavior. Q1-style practice modeled on the redesigned (2025+) AP Psychology exam. Not affiliated with the College Board. AP is a registered trademark of the College Board. Content pending external psychology QC.