The Geometric Distribution — AP Statistics

AP Statistics · CED Unit 4: Probability, Random Variables, and Probability Distributions · 14 min read

1. What Is The Geometric Distribution? ★★☆☆☆ ⏱ 3 min

The geometric distribution is a discrete probability distribution that models the number of independent trials required to get the first success in a series of repeated Bernoulli (two-outcome) trials. It is often called the "waiting-time distribution" because we measure how many trials we wait for the first success.

Unlike the binomial distribution, which fixes the number of trials and counts the number of successes, the geometric distribution reverses this framing: it fixes the number of successes at one and lets the number of trials be the random variable of interest. AP Statistics exclusively uses the "shifted" convention, where $X$ counts every trial up to and including the first success, so the smallest possible value is 1. This matches the official CED definition.

2. Conditions for a Geometric Setting ★★☆☆☆ ⏱ 4 min

Before you can use the geometric distribution to calculate probabilities or expected values, you must confirm that your scenario meets all four required conditions, abbreviated **BITS**:

**B**: Binary outcomes: each trial results in either a "success" (the outcome we are waiting for) or a "failure" (the other outcome).
**I**: Independent trials: the outcome of one trial does not change the probability of success for any other trial.
**T**: Trials until the first success: the number of trials is not fixed in advance; the value we measure is the number of trials needed to get the first success.
**S**: Same success probability: the probability of success $p$ is the same for every trial.

📐 Worked Example

A coffee shop runs a promotion where 15% of coffee cups have a coupon for a free donut. A customer buys one coffee at a time until they get a coupon. Can the number of coffees the customer buys be modeled with a geometric distribution? Check all conditions.

Check the two-outcome condition: Each coffee cup either has a coupon (success) or does not (failure). Only two outcomes, so condition B is satisfied.
Check independence: Coupons are randomly distributed, so one cup having a coupon does not change the chance another cup has a coupon. Condition I is satisfied.
Check what we count: We count the number of coffees (trials) until the first coupon, so the number of trials is not fixed in advance. Condition T is satisfied.
Check constant probability: The probability of a coupon is 15% for all cups, so $p=0.15$ is constant. Condition S is satisfied.
Conclusion: All conditions are met, so this can be modeled with a geometric distribution.

Exam tip: When asked to identify the appropriate distribution for a scenario, first answer the question 'are we counting trials until a success, or counting successes in a fixed number of trials?' This separates geometric settings from binomial ones, which is the most common mix-up.

3. Geometric Probability Calculations (PMF and CDF) ★★★☆☆ ⏱ 4 min

Once you confirm a scenario meets the geometric conditions, you can calculate probabilities using two core formulas: the probability mass function (PMF) for the probability of first success on an exact trial, and the cumulative distribution function (CDF) for the probability of first success by a certain trial.

To get the probability that the first success occurs *exactly* on the $k$-th trial, you must have $k-1$ consecutive failures first, followed by a success on the $k$-th trial. Because trials are independent, we multiply the probabilities:

$$P(X = k) = (1-p)^{k-1}p$$

for $k = 1, 2, 3, \ldots$
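The PMF above can be sketched as a small Python function. The function name `geometric_pmf` is an illustrative choice, and the value $p = 0.15$ is borrowed from the coffee-shop example; this is a minimal sketch rather than a required exam technique.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): probability the first success lands exactly on trial k.

    Trials are counted starting at 1 (the AP "shifted" convention).
    """
    if k < 1:
        return 0.0  # the first success cannot occur before trial 1
    # k - 1 failures in a row, then one success
    return (1 - p) ** (k - 1) * p


p = 0.15  # coupon probability from the coffee-shop example
for k in range(1, 6):
    print(k, round(geometric_pmf(k, p), 4))
```

The printed probabilities shrink as $k$ grows, because each extra trial multiplies the probability by another factor of $1-p$.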

For cumulative probability, the probability that the first success occurs *on or before* the $k$-th trial is equal to 1 minus the probability that the first $k$ trials are all failures, which gives a convenient shortcut:

$$P(X \leq k) = 1 - (1-p)^k$$

We can rearrange this to get the probability that the first success occurs *after* the $k$-th trial:

$$P(X > k) = (1-p)^k$$

This shortcut saves significant time on the exam, as you do not need to sum multiple individual probabilities.
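To see that the shortcut agrees with adding up individual probabilities, the following sketch compares the two approaches. The helper names (`geometric_pmf`, `geometric_cdf`, `cdf_by_summing`) and the test values $p = 0.08$, $k = 5$ are assumptions chosen for illustration.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) with trials counted from 1."""
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0


def geometric_cdf(k: int, p: float) -> float:
    """P(X <= k) using the shortcut 1 - (1 - p)^k."""
    return 1 - (1 - p) ** k if k >= 1 else 0.0


def cdf_by_summing(k: int, p: float) -> float:
    """P(X <= k) the long way: add P(X = 1) through P(X = k)."""
    return sum(geometric_pmf(j, p) for j in range(1, k + 1))


p, k = 0.08, 5
print("shortcut:", geometric_cdf(k, p))
print("summed:  ", cdf_by_summing(k, p))
print("P(X > k):", 1 - geometric_cdf(k, p))
```

Both lines report the same cumulative probability up to floating-point rounding, which is why the shortcut is safe to use in place of the sum.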

Exam tip: If you are asked for $P(X < k)$, always adjust the cutoff to get the correct exponent: $P(X < k) = P(X \leq k-1) = 1 - (1-p)^{k-1}$ to avoid off-by-one errors that are common on MCQs.
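To make the off-by-one adjustment concrete, here is a short worked comparison. It uses $p = 0.15$ from the coffee-shop example and a cutoff of $k = 4$, which is chosen here purely for illustration.

$$P(X < 4) = P(X \leq 3) = 1 - (0.85)^3 \approx 0.386$$

$$P(X \leq 4) = 1 - (0.85)^4 \approx 0.478$$

The two answers differ by $P(X = 4) = (0.85)^3(0.15) \approx 0.092$, which is exactly the probability that is gained or lost by misreading the inequality.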

4. Mean and Standard Deviation of a Geometric Random Variable ★★★☆☆ ⏱ 3 min

The geometric distribution has simple, intuitive formulas for the mean (expected value) and standard deviation. The expected value, which is the long-run average number of trials needed to get the first success, is:

$$E(X) = \mu_X = \frac{1}{p}$$

This makes intuitive sense: if the probability of success is 1/10, you expect to wait 10 trials on average for the first success. Lower probability of success means a higher expected number of trials, which matches the formula.

The variance of $X$ is $\text{Var}(X) = \frac{1-p}{p^2}$, so the standard deviation (a measure of the spread of the distribution) is:

$$\sigma_X = \frac{\sqrt{1-p}}{p}$$

All geometric distributions are right-skewed: the highest probability is always at $k=1$, and probabilities get smaller as $k$ increases. On FRQs, you are almost always required to interpret the expected value in context, which requires connecting it to the long-run average over many repetitions.
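The mean and standard deviation formulas can be checked informally with a simulation. The sketch below assumes $p = 0.1$ (the value used in the intuition above), a fixed seed, and 100,000 repetitions; the function name `simulate_geometric` is hypothetical, and the simulated values will only approximate the formulas.

```python
import random
import statistics


def simulate_geometric(p: float, rng: random.Random) -> int:
    """Run Bernoulli trials until the first success; return the trial count."""
    trials = 1
    # rng.random() < p counts as a success, so keep going while it is >= p
    while rng.random() >= p:
        trials += 1
    return trials


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.1
sample = [simulate_geometric(p, rng) for _ in range(100_000)]

print("simulated mean:", statistics.mean(sample), " formula:", 1 / p)
print("simulated sd:  ", statistics.pstdev(sample), " formula:", (1 - p) ** 0.5 / p)
```

Averaging over many simulated waits is the same "long-run average over many repetitions" idea that an expected-value interpretation should describe.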

Exam tip: Always include the phrases 'on average' and 'over many repetitions' when interpreting expected value on FRQs to earn full credit for the interpretation.

5. AP Style Practice Examples ★★★★☆ ⏱ 4 min

📐 Worked Example

A warehouse ships smartphones, and 8% of all smartphones have a battery defect. A quality control inspector tests one randomly selected phone at a time until he finds a phone with a battery defect. What is the probability that he finds the first defective phone on the 5th phone he tests?
Options: A) 0.053, B) 0.057, C) 0.341, D) 0.659

This scenario meets all geometric conditions: we count trials until the first success, with independent trials and a constant 8% defect probability. Use the geometric PMF $P(X=k) = (1-p)^{k-1}p$.
Substitute $k=5$ and $p=0.08$:
$P(X=5) = (0.92)^4(0.08) \approx 0.7164 \times 0.08 \approx 0.057$
This matches option B. Option A uses the zero-based convention, $(0.92)^5(0.08) \approx 0.053$, which counts five failures before the success; option C is $P(X \leq 5) = 1 - (0.92)^5 \approx 0.341$; and option D is $P(X > 5) = (0.92)^5 \approx 0.659$. Correct answer: B.
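As a check on the arithmetic, the quantities that typically show up as answer choices in this kind of question can be computed directly. The variable names below are illustrative, and the values $p = 0.08$ and $k = 5$ come from the problem statement.

```python
p = 0.08  # probability a phone has a battery defect
k = 5     # trial on which the first defect is found

exact = (1 - p) ** (k - 1) * p    # P(X = 5): four good phones, then a defect
extra_failure = (1 - p) ** k * p  # zero-based mistake: five good phones first
at_most = 1 - (1 - p) ** k        # P(X <= 5)
more_than = (1 - p) ** k          # P(X > 5)

for label, value in [
    ("P(X = 5)", exact),
    ("zero-based mistake", extra_failure),
    ("P(X <= 5)", at_most),
    ("P(X > 5)", more_than),
]:
    print(f"{label}: {value:.3f}")
```

Computing the tempting wrong answers alongside the right one is a quick way to see which misreading produces each distractor.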

📐 Worked Example

A street artist sells hand-painted portraits and has a 15% chance of making a sale to any random passerby who stops to look at their work. Assume the passersby act independently of one another. Let $X$ be the number of passersby who stop until the artist makes their first sale of the day, counting the passerby who buys.
(a) Verify that $X$ can be modeled with a geometric distribution.
(b) Calculate $P(X > 6)$ and interpret this probability in context.
(c) Find the expected value of $X$ and interpret it in context.

(a) Check the four BITS conditions: 1. Two outcomes: each passerby either buys a portrait (success) or does not (failure), so B is satisfied. 2. Independent: the problem states that passersby are independent, so I is satisfied. 3. We count passersby (trials) until the first sale, so the number of trials is not fixed in advance and T is satisfied. 4. The probability of a sale is 15% for every passerby, so S is satisfied. All conditions are met.
(b) Use the geometric shortcut for $P(X > k)$:
$P(X > 6) = (0.85)^6 \approx 0.377$
Interpretation: There is about a 37.7% chance that the artist will not make a sale to any of the first 6 passersby who stop.
(c) Calculate the expected value:
$E(X) = 1/p = 1/0.15 \approx 6.67$
Interpretation: Over many days, the artist needs about 6.67 passersby to stop, on average, to make the first sale of the day (counting the passerby who buys).

📐 Worked Example

A geneticist is studying a recessive trait in pea plants. Each offspring plant has a 25% chance of expressing the recessive trait, independent of other offspring. The geneticist is growing plants one at a time until they get 1 plant that expresses the recessive trait for an experiment. Each plant costs \$1.20 in supplies. What is the expected total cost of the experiment? What is the probability the geneticist gets the desired plant within the first 3 plants grown?

Let $X$ be the number of plants grown until the first recessive trait plant is obtained. $X$ is a geometric random variable with $p=0.25$.
Calculate the expected number of plants:
$E(X) = 1/0.25 = 4$
Multiply by cost per plant to get expected total cost:
$4 \times 1.20 = 4.80$
The expected total cost is \$4.80.
Calculate the probability of getting the plant within the first 3 plants:
$P(X \leq 3) = 1 - (0.75)^3 \approx 1 - 0.4219 = 0.5781$
There is a 57.8% chance the geneticist will get the desired plant within the first 3 plants grown.
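The cost calculation relies on the fact that multiplying a random variable by a constant multiplies its expected value by the same constant, so $E(1.20X) = 1.20\,E(X)$. A minimal sketch of both calculations, with variable names chosen for illustration:

```python
p = 0.25               # chance a plant expresses the recessive trait
cost_per_plant = 1.20  # dollars of supplies per plant

expected_plants = 1 / p                            # E(X) = 1/p
expected_cost = cost_per_plant * expected_plants   # E(1.20 X) = 1.20 E(X)
prob_within_3 = 1 - (1 - p) ** 3                   # P(X <= 3)

print(f"Expected plants: {expected_plants:.2f}")
print(f"Expected cost:   ${expected_cost:.2f}")
print(f"P(X <= 3):       {prob_within_3:.4f}")
```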

Common Pitfalls

Why: Confusion between different geometric distribution conventions used in different textbooks; AP exclusively uses the shifted (trial-counting) convention.

Why: Both use Bernoulli trials, so students forget to check what is being counted and whether the number of trials is fixed.

Why: Off-by-one error from misinterpreting the inequality cutoff.

Why: Confusion between probability of success per trial and expected number of trials until first success.

Why: Students forget that independence is violated when sampling without replacement from small populations, just like in binomial settings.
