Introducing Hypothesis Tests for Proportions — AP Statistics

AP Statistics · Inference for Categorical Data: Proportions · 14 min read

1. Stating Null and Alternative Hypotheses ★★☆☆☆ ⏱ 3 min

Every hypothesis test for a proportion starts with two competing claims about the unknown population proportion $p$. The **null hypothesis** ($H_0$) is the default claim of no effect, no difference, or status quo; by convention it is written with an equals sign, $H_0: p = p_0$, where $p_0$ is the hypothesized value. The **alternative hypothesis** ($H_a$) is the research claim we seek evidence for, and never includes equality. It takes one of three forms:

One-sided (left-tailed): $H_a: p < p_0$ (true proportion suspected lower than null claim)
One-sided (right-tailed): $H_a: p > p_0$ (true proportion suspected higher than null claim)
Two-sided: $H_a: p \neq p_0$ (true proportion suspected different, no direction given)

Exam tip: Always define the parameter $p$ in context before writing your hypotheses. AP Statistics graders require this step for full credit, even if your hypotheses are written correctly.

2. Checking Conditions for a One-Proportion Z-Test ★★☆☆☆ ⏱ 3 min

Before conducting inference, we must check three conditions, summarized as Random, Independent, and Normal. Together they justify modeling the sampling distribution of $\hat{p}$ as approximately normal, so that the p-value from the z-test is a trustworthy approximation.

**Random**: The data come from a random sample or a randomized experiment, so the sample can be treated as representative of the population.
**Independent**: When sampling without replacement, the 10% condition requires that the sample size $n$ be less than 10% of the population size $N$ ($n < 0.1N$). This allows observations to be treated as approximately independent.
**Normal (Large Counts Condition)**: The sampling distribution of $\hat{p}$ is approximately normal if $np_0 \geq 10$ and $n(1-p_0) \geq 10$. For hypothesis tests, we use the null hypothesized value $p_0$ (not $\hat{p}$), because we assume $H_0$ is true for the test.
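
The two numeric checks above can be sketched in a few lines of Python. This is only an illustration: the function name `check_conditions` and its parameters are hypothetical, and the Random condition cannot be computed, so it is passed in as a judgment made from the problem statement.

```python
def check_conditions(n, p0, population_size=None, is_random=True):
    """Sketch of the Random / 10% / Large Counts checks (hypothetical helper)."""
    # Large Counts uses the null value p0, not the sample proportion.
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    large_counts = expected_successes >= 10 and expected_failures >= 10

    # If the population size is unknown, the 10% condition cannot be
    # verified numerically, so report None rather than guessing.
    if population_size is None:
        ten_percent = None
    else:
        ten_percent = n < 0.1 * population_size

    return {
        "random": is_random,
        "ten_percent": ten_percent,
        "large_counts": large_counts,
    }


# Example: n = 120 and p0 = 0.25 with an assumed population of 100,000.
print(check_conditions(120, 0.25, population_size=100_000))
```

A `None` result for the 10% condition signals that the check still has to be argued in words, as the exam tip below describes.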

Exam tip: If the problem does not explicitly state the population size, assume the 10% condition is met as long as the population is clearly much larger than the sample.

3. Calculating the Test Statistic and P-Value ★★★☆☆ ⏱ 4 min

If all conditions are met, we assume $H_0$ is true, so the sampling distribution of $\hat{p}$ is approximately normal with mean $p_0$ and standard deviation $\sqrt{\frac{p_0(1-p_0)}{n}}$, the standard error under the null hypothesis.

$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
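
As a minimal sketch, the test statistic translates directly into Python. The function name `one_prop_z` is a hypothetical choice for illustration; the key detail is that the standard deviation is built from $p_0$, not $\hat{p}$.

```python
from math import sqrt


def one_prop_z(successes, n, p0):
    """z statistic for a one-proportion z-test (hypothetical helper)."""
    p_hat = successes / n                 # sample proportion
    sd_null = sqrt(p0 * (1 - p0) / n)     # standard deviation assuming H0 is true
    return (p_hat - p0) / sd_null


# A sample proportion equal to p0 sits exactly at the center of the
# null distribution, so the statistic is zero.
print(one_prop_z(30, 120, 0.25))
```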

The **p-value** is the probability of observing a test statistic as extreme or more extreme than the one calculated from your sample, *assuming $H_0$ is true*. The p-value calculation depends on the form of the alternative hypothesis:

Left-tailed ($H_a: p < p_0$): $\text{p-value} = P(Z < z)$
Right-tailed ($H_a: p > p_0$): $\text{p-value} = P(Z > z)$
Two-sided ($H_a: p \neq p_0$): $\text{p-value} = 2P(Z > |z|)$ (double the single-tail area)
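
The three cases can be sketched with the standard normal distribution from Python's standard library (`statistics.NormalDist`). The function name `p_value` and the labels `"less"`, `"greater"`, and `"two-sided"` are assumptions made for this example.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """p-value for a z statistic under the chosen alternative (hypothetical helper)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":          # H_a: p < p0
        return std_normal.cdf(z)
    if alternative == "greater":       # H_a: p > p0
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":     # H_a: p != p0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# The same z gives a two-sided p-value twice the size of the one-tail area.
print(p_value(-1.47, "less"), p_value(-1.47, "two-sided"))
```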

Exam tip: On free response questions, you must show the formula for the z-test statistic to earn full credit, even if you use a calculator to get the final value.

4. Drawing a Conclusion in Context ★★☆☆☆ ⏱ 2 min

After calculating the p-value, we compare it to a pre-specified significance level $\alpha$, almost always $\alpha = 0.05$ unless another value is given in the problem. There are only two statistically correct conclusions:

If $\text{p-value} < \alpha$: **Reject $H_0$**. There is convincing statistical evidence to support the alternative hypothesis $H_a$ in context.
If $\text{p-value} \geq \alpha$: **Fail to reject $H_0$**. There is not convincing statistical evidence to support the alternative hypothesis $H_a$ in context.
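
The decision rule is a single comparison, sketched below. The function name `decide` is hypothetical, and the default of 0.05 mirrors the convention mentioned above. The code returns only the decision; the contextual sentence still has to be written by hand.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value (hypothetical helper)."""
    if p_value < alpha:
        return "reject H0"
    # A p-value equal to alpha falls on the "fail to reject" side.
    return "fail to reject H0"


print(decide(0.071))
print(decide(0.03))
```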

Exam tip: AP exam graders will deduct points if your conclusion contradicts your p-value comparison, so always double-check that your decision matches your p-value.

5. Additional AP-Style Worked Examples ★★★☆☆ ⏱ 2 min

📐 Worked Example

A candy company claims that 25% of all their candy boxes contain a prize. A group of customers suspects that the true proportion of boxes with prizes is less than 25%. They take a random sample of 120 candy boxes, and 23 boxes contain prizes. Use $\alpha = 0.05$ for all inference. State hypotheses, check conditions, calculate test statistic/p-value, and draw a conclusion.

Define the parameter and state hypotheses: Let $p$ = the true proportion of all candy boxes produced by the company that contain a prize.
$H_0: p = 0.25, \quad H_a: p < 0.25$
Check conditions: 1. Random: The sample is stated to be random, so the condition is satisfied. 2. 10% condition: It is reasonable to assume the company produces more than 1200 boxes (10 times the sample size), so independence is satisfied. 3. Large Counts:
$np_0 = 120(0.25) = 30 \geq 10, \quad n(1-p_0) = 120(0.75) = 90 \geq 10$
All conditions are satisfied. Calculate $\hat{p} = 23/120 \approx 0.192$, then calculate the test statistic:
$z = \frac{0.192 - 0.25}{\sqrt{\frac{(0.25)(0.75)}{120}}} \approx -1.47$
p-value = $P(Z < -1.47) \approx 0.071$. Compare to $\alpha = 0.05$: $0.071 > 0.05$, so we fail to reject $H_0$.
Conclusion: At the 0.05 significance level, there is not convincing statistical evidence that the true proportion of candy boxes with prizes is less than the 25% claimed by the company.
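
As a cross-check, the worked example can be reproduced with a short Python sketch. The function name `one_prop_z_test` is hypothetical and handles only the left-tailed case needed here. Because the code carries $\hat{p}$ unrounded, it gives z close to -1.48 and a p-value near 0.070 rather than the hand-calculated -1.47 and 0.071; the decision is the same.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(successes, n, p0):
    """Left-tailed one-proportion z-test (hypothetical helper): returns (z, p-value)."""
    p_hat = successes / n
    sd_null = sqrt(p0 * (1 - p0) / n)   # built from p0 because H0 is assumed true
    z = (p_hat - p0) / sd_null
    p_val = NormalDist().cdf(z)         # P(Z < z) for H_a: p < p0
    return z, p_val


# Candy example: 23 prizes in 120 boxes, H0: p = 0.25, Ha: p < 0.25.
z, p_val = one_prop_z_test(23, 120, 0.25)
alpha = 0.05
decision = "reject H0" if p_val < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_val:.3f}, decision: {decision}")
```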

Common Pitfalls

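Pitfall: Writing hypotheses about the sample proportion $\hat{p}$ (for example, $H_0: \hat{p} = 0.25$) instead of the population proportion $p$.
Why: Students confuse the known sample statistic with the unknown population parameter when setting up tests.

Pitfall: Checking the Large Counts condition with $n\hat{p}$ and $n(1-\hat{p})$ instead of $np_0$ and $n(1-p_0)$.
Why: Students memorize the Large Counts condition from confidence intervals and incorrectly apply it without adjustment.

Pitfall: Reporting a single tail area as the p-value for a two-sided test instead of doubling it.
Why: Students remember one-sided p-value calculation and overlook the "extreme in either direction" logic for two-sided tests.

Pitfall: Using $\hat{p}$ instead of $p_0$ in the denominator of the test statistic.
Why: Students carry over the confidence interval standard error formula to the hypothesis test setting.

Pitfall: Writing "accept $H_0$" or claiming that $H_0$ or $H_a$ has been proven true.
Why: Students think the binary decision means either hypothesis is proven true, but we start with the null as an unproven assumption.

Pitfall: Stopping at "reject $H_0$" or "fail to reject $H_0$" without stating the conclusion in the context of the problem.
Why: Students rush at the end of problems and overlook the AP requirement for contextual interpretation.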