CSPIQ · AP Computer Science Principles

Lesson 13: Binary Search & Algorithmic Efficiency

Big Idea 3 (AAP) · Phase 3

Objectives

Run binary search by hand: middle, compare, discard half, repeat — and count the comparisons exactly
State the prerequisite that disqualifies binary search (unsorted data) and compare step-counts vs. linear search
Use the CED's efficiency language: reasonable time (constant/linear/square/polynomial) vs. unreasonable time (exponential/factorial); heuristics for when exact is too slow; undecidable for when no algorithm can exist

Warm-Up

I'm thinking of a number from 1 to 1,000. Each guess, I say "higher," "lower," or "correct."

Strategy A: guess 1, 2, 3, ... — up to 1,000 guesses. Strategy B: guess 500. "Higher." Guess 750. "Lower." Guess 625...

Strategy B halves the possibilities every guess: 1000 → 500 → 250 → 125 → 63 → 32 → 16 → 8 → 4 → 2 → 1. Ten guesses, guaranteed, for any number up to 1,000. Twenty guesses would handle a million.

Strategy B is binary search. It's why your contacts app finds "Yusuf" instantly among thousands of names. But it needs one thing to work — you already used it without noticing. The number line was ordered. "Higher/lower" is meaningless without order. Hold that thought.

Core Concept

Binary search: the halving algorithm

Binary search finds a target in a sorted list by repeatedly checking the middle element and discarding the half that can't contain the target.

Find 71 in [12, 25, 38, 46, 59, 71, 88] (7 elements, sorted):

Step	Remaining portion	Middle	Compare	Action
1	indexes 1–7	index 4 → 46	71 > 46	Discard left half; keep 5–7
2	indexes 5–7	index 6 → 71	71 = 71	Found — 2 comparisons

Rules for hand-execution: middle of the current portion (for an even-sized portion, the exam will pick a convention — typically the lower middle — and the question will be built so it doesn't matter or is stated); compare; discard the impossible half including the middle element; repeat until found or nothing remains ("not in list" is a valid result).

Step counts: binary vs. linear

Linear search (Lesson 12) checks one element per step: worst case = LENGTH comparisons. Binary search eliminates half per step: worst case ≈ the number of times you can halve the list before reaching 1.

List size	Linear worst case	Binary worst case
16	16	5
100	100	7
1,000	1,000	10
1,000,000	1,000,000	20

The binary column grows by one step each time the list doubles — that's the halving signature. Exam questions: "maximum comparisons to find a value in a sorted list of 64?" → 64 → 32 → 16 → 8 → 4 → 2 → 1 is 6 halvings → 7 checks at most (count the checks, not just the halvings — with the exam's usual convention, a 64-element list needs at most 7 middle-checks). When in doubt, simulate a small case by hand and count comparisons.

The prerequisite, in bold: binary search requires sorted data. Unsorted list → "discard half" is invalid (the target could be anywhere) → binary search simply cannot be used — not "runs slower," CANNOT. Exam answer choices that apply binary search to unsorted data are wrong by definition; sorting first is possible but is itself work.

Reasonable vs. unreasonable time

The CED classifies algorithm efficiency by how the number of steps scales with input size n:

Reasonable time: step counts that grow as a polynomial in n — constant, linear (n), square (n²), cube... Linear search (n) and binary search (better than linear) are both reasonable; so is comparing every pair in a list (n²).
Unreasonable time: step counts that grow exponentially (2ⁿ) or worse (factorial, n!). Doubling with every added input element means even fast computers die at modest n: 2⁶⁰ steps outlives us all.

The classification cares about scaling, not absolute speed. An n² algorithm on a big input beats a 2ⁿ algorithm on a medium one. Signature phrasings: "tries every possible subset/combination/ordering" → exponential/factorial → unreasonable. "Examines each element (once, or a fixed number of times per element)" → polynomial → reasonable.

(You may know Big-O notation from elsewhere. AP CSP does not use it — answer in the CED's words: reasonable/unreasonable, polynomial/exponential.)

Heuristics: good-enough, fast enough

When the exact algorithm is unreasonable, we don't give up — we approximate. A heuristic is a problem-solving approach that finds a solution that is good enough (not guaranteed optimal or exact) in reasonable time.

Standard example: a delivery service routing a truck through 50 cities. Trying every route (factorial — unreasonable) is out. Heuristic: "always drive to the nearest unvisited city." Result: a good route, quickly — maybe not the best route, and that's the accepted trade. Exam signals for heuristic-correct answers: "approximate," "good enough," "not guaranteed optimal," "in reasonable time."

Undecidable problems: where algorithms end

Some problems cannot be solved by ANY algorithm — not "slow," not "unsolved so far": provably impossible. An undecidable problem has no algorithm that produces a correct yes/no answer for every possible input. An algorithm may work for some inputs — undecidability means no single algorithm handles all of them.

The famous one: the halting problem — write a program that examines any program + its input and always correctly answers "will this ever stop running?" Proven impossible (Turing, 1936). Consequence you've already lived: no tool can reliably pre-detect every infinite loop (Lesson 11) — your IDE genuinely can't always warn you.

Keep the categories straight — three different flavors of "hard": - Unreasonable: an exact algorithm exists but scales exponentially (usable heuristics often exist) - Unsolved: no one has found an efficient method yet - Undecidable: mathematically NO algorithm can exist for all inputs

Worked Examples

Example 1 (easy): Hand-run binary search

Problem: Binary search for 14 in [3, 8, 14, 27, 40]. List the elements examined.

Solution: Middle of 1–5: index 3 → 14. Found in one comparison. Elements examined: just 14.

Interpretation: Yes, sometimes it's instant — the middle is the target. Don't overwork easy cases; count what actually happens.

Example 2 (medium): Count the maximum

Problem: A sorted list has 32 elements. What is the maximum number of elements binary search examines to determine whether a target is present?

Solution: Each check halves what remains: 32 → 16 → 8 → 4 → 2 → 1. Checks: after examining a middle, the remainder halves; worst case examines a middle at sizes 32, 16, 8, 4, 2, 1 → 6 examinations.

Interpretation: Reliable recipe: repeatedly halve the size down to 1, counting each stage including the size-1 check. For 2ᵏ elements, that's k + 1 stages... but note 32 → ... → 1 lists six sizes (32, 16, 8, 4, 2, 1) with five halvings — and standard AP answers accept the six-checks count. When an exam item wants this, its choices are spaced far apart (6 vs. 16 vs. 32) precisely so convention nuances don't matter. Compute the halving chain; pick the nearby answer.

Example 3 (medium): The prerequisite

Problem: A program has the list [42, 7, 91, 15, 60] and must determine whether 15 is present. A student proposes binary search "because it's faster." What's wrong, and what are the options?

Solution: The list is unsorted — binary search's discard-half logic is invalid here; it can produce wrong answers (search for 15: middle is 91; 15 < 91 discards the right half — but there's nothing meaningfully "left" in unsorted data; the logic just doesn't apply). Options: (1) linear search as-is — 5 checks, fine; (2) sort first, then binary search — worth it only if you'll search many times, since sorting costs more than one linear search.

Interpretation: The exam asks this as "why can't binary search be used?" — answer: the data is not sorted. Four words, one point.

Example 4 (AP-style): Classify the algorithms

Problem: Which algorithm runs in unreasonable time?

(A) Finding the largest value in a list by examining each element once (B) Checking whether a list contains duplicates by comparing every pair of elements (C) Finding the best possible schedule by evaluating every possible ordering of 20 tasks (D) Binary search on a sorted list of one million elements

Solution: (C). Every ordering of 20 tasks = 20! ≈ 2.4 × 10¹⁸ evaluations — factorial, unreasonable. (A) is linear; (B) is n² — polynomial, reasonable (slower than linear, still reasonable!); (D) is ~20 steps.

Interpretation: (B) is the tempter — "every pair" sounds explosive but is only n², squarely polynomial. The unreasonable signature is every subset/ordering/combination, not every pair.

Common Mistakes

Binary search on unsorted data. It's not slow there — it's invalid. Check for the word "sorted" before selecting any binary-search answer.
Counting halvings instead of checks (or vice versa). Simulate the chain (32 → 16 → ... → 1) and count what the question asks — element examinations. Choices are spaced so the chain gets you there.
Calling n² unreasonable. Polynomial = reasonable, even when it's slower than linear. Unreasonable starts at exponential (2ⁿ).
Thinking heuristics guarantee the best answer. A heuristic trades optimality for speed. "Guaranteed optimal" + "heuristic" in the same answer choice = wrong choice.
"Undecidable" = "very hard." No: undecidable = no algorithm can exist for all inputs, proven. Hard-but-solvable is "unreasonable"; both terms appear in the same question stem to test the distinction.

Practice Problems

Question 1

Binary search requires that the list being searched is:

(A) Shorter than 100 elements
(B) Composed of numbers only
(C) Stored in a spreadsheet
(D) Sorted

Question 2

Binary search for 88 in [10, 25, 40, 55, 70, 88, 95]: which element is examined FIRST?

(A) 10
(B) 88
(C) 95
(D) 55

Question 3

In problem 2, how many elements are examined in total to find 88?

(A) 6
(B) 3
(C) 2
(D) 7

Question 4

A sorted list has 1,000,000 elements. Binary search's worst case is closest to:

(A) 500,000 examinations
(B) 1,000 examinations
(C) 20 examinations
(D) 1,000,000 examinations

Question 5

When a sorted list's size doubles, binary search's worst-case examinations:

(A) Double
(B) Increase by one
(C) Square
(D) Stay exactly the same

Question 6

Which situation REQUIRES linear search rather than binary search?

(A) The list is sorted in descending order
(B) The list is very long
(C) The list is unsorted and cannot be sorted first
(D) The target might not be in the list

Question 7

An algorithm's steps grow as n² for input size n. The CED classifies it as running in:

(A) Unreasonable time, because n² grows quickly
(B) Reasonable time, because n² is polynomial
(C) Undecidable time
(D) Constant time

Question 8

An algorithm finds the optimal seating chart by testing all 2ⁿ possible guest subsets. For large n this is:

(A) Reasonable, because computers are fast
(B) A heuristic
(C) Undecidable
(D) Unreasonable, because the steps grow exponentially with n

Question 9

A mapping app needs driving routes in milliseconds, so instead of evaluating every possible route it always extends the path toward the destination. This approach is best described as:

(A) A heuristic that finds a good-enough route in reasonable time without guaranteeing the optimal one
(B) An undecidable algorithm
(C) Binary search applied to roads
(D) An algorithm guaranteed to find the shortest route

Question 10

The halting problem is significant because it proves that:

(A) All programs eventually halt
(B) Some problems cannot be solved by any algorithm for all inputs
(C) Exponential algorithms can always be made polynomial
(D) Sorted lists cannot contain duplicates

Question 11

(Select two answers.) Which statements about search algorithms are true?

(A) Linear search works on both sorted and unsorted lists
(B) Binary search examines every element in the worst case
(C) For large sorted lists, binary search requires far fewer examinations than linear search
(D) Linear search requires the list to be sorted

Question 12

Because the halting problem is undecidable, which tool is IMPOSSIBLE to build?

(A) A tool that detects some infinite loops
(B) A tool that correctly determines, for every possible program and input, whether it will halt
(C) A tool that measures how long a program has been running
(D) A tool that stops a program after a time limit

Create PT Connection

Efficiency talk can elevate a written response from adequate to sharp. If your PT procedure searches your list, you can say why you chose linear search:

"My list is unsorted because entries are added in the order the user creates them, so my procedure uses a linear search — examining each element once — which runs in reasonable time for the list sizes my program handles."

One sentence, three graded concepts (data organization, algorithm choice, efficiency vocabulary). Don't force binary search into your PT to seem fancy — using it on an unsorted list is a correctness error a scorer will notice, and sorting adds complexity you'd have to explain. Choose the algorithm your data actually supports; explain the choice. That's what mastery looks like at this level.

Show answer key & explanations

(g) Answer Key

1. (D). The prerequisite. Everything else is irrelevant to validity.

2. (D). Seven elements → first middle is index 4 → 55. Binary search always opens in the middle, never at an end.

3. (C). 88 > 55 → keep indexes 5–7. Middle: index 6 → 88. Found. Two examinations: 55, then 88.

4. (C). A million halves to one in about 20 steps (2²⁰ ≈ 1,048,576). This astonishing smallness is the point of binary search.

5. (B). One extra halving handles twice the data — the signature of halving algorithms. (A) describes linear search's response to doubling.

6. (C). No sort available → binary search's logic is invalid → linear is the only option. (A): descending is still sorted (search with flipped comparisons); (B) favors binary, not linear; (D): both handle absent targets fine.

7. (B). Polynomial (n²) = reasonable, by CED definition — regardless of it being slower than linear. This exact discrimination is a real exam item.

8. (D). 2ⁿ = exponential = unreasonable. (B): trying ALL subsets is the exact opposite of a heuristic — it's brute force.

9. (A). Trades guaranteed-optimal for fast + good-enough = heuristic, textbook definition with the standard use case. (D) claims the guarantee the heuristic explicitly gives up.

10. (B). Undecidability, in one sentence. (A) is false and (C) is a fantasy; (D) is from a different universe.

11. (A) and (C). (A): linear search has no prerequisites — that's its virtue. (C): 20 vs. 1,000,000 examinations. (B): binary's worst case is the halving chain, a tiny fraction of "every element." (D): reverses the prerequisite — it's binary that needs sorting.

12. (B). The impossible tool is the universal halt-decider — "every possible program and input." (A), (C), (D) are all buildable (and exist); detecting some infinite loops is easy. The word "every" carries the entire question — undecidability is always about all inputs.

Answer letter distribution check: D, D, C, C, B, C, B, D, A, B, A+C, B — singles: A×1, B×4, C×3, D×3 + multi (A,C). D-weight delivered as planned; cumulative spread through L13 ≈ A 23%, B 30%, C 26%, D 21%.

Exam tip: Three vocabulary lanes, one question each, nearly every form: (1) sorted-prerequisite, (2) reasonable = polynomial / unreasonable = exponential, (3) undecidable = no algorithm for all inputs, ever. If you can recite those three lines cold, this lesson has paid for itself before any halving chain gets counted.

← All lessons

‹ Lesson 12

Lesson 14 ›