Statistics · 14 min read · Updated 2026-05-11

Inference for a Difference in Two Proportions — AP Statistics

AP Statistics · AP Statistics CED Unit 6 · 14 min read

1. Core Concepts and Sampling Distribution ★★☆☆☆ ⏱ 3 min

Inference for a difference in two proportions is a set of statistical methods used to compare the proportion of successes (a binary outcome) between two independent populations or two experimental treatment groups. The core goal is to use data from two independent samples to make claims about the true difference between the two population proportions.

The sampling distribution of $\hat{p}_1 - \hat{p}_2$ has the following key properties:

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$$

When conditions are met, the sampling distribution is approximately Normal, which allows us to use z-based inference for the difference.
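The mean and standard deviation formulas above can be checked with a small simulation. The sketch below is illustrative only: the population proportions (0.60 and 0.50), the sample sizes, and the function names are assumptions chosen for the demonstration, not values from this guide.

```python
import math
import random


def diff_sampling_params(p1, n1, p2, n2):
    """Return the theoretical (mean, sd) of p1_hat - p2_hat."""
    mean = p1 - p2
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd


def simulate_differences(p1, n1, p2, n2, reps=2000, seed=0):
    """Draw two independent samples `reps` times and record p1_hat - p2_hat."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))  # successes in sample 1
        x2 = sum(rng.random() < p2 for _ in range(n2))  # successes in sample 2
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical population values, chosen only for illustration.
theory_mean, theory_sd = diff_sampling_params(0.60, 120, 0.50, 180)
diffs = simulate_differences(0.60, 120, 0.50, 180)
sim_mean = sum(diffs) / len(diffs)
sim_sd = math.sqrt(sum((d - sim_mean) ** 2 for d in diffs) / (len(diffs) - 1))

print(f"theory:    mean={theory_mean:.3f}, sd={theory_sd:.4f}")
print(f"simulated: mean={sim_mean:.3f}, sd={sim_sd:.4f}")
```

The simulated mean and standard deviation should land close to the theoretical values, and a histogram of `diffs` would look roughly bell-shaped when the large counts condition holds.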

2. Conditions for Valid Inference ★★☆☆☆ ⏱ 4 min

All valid inference requires four core conditions, with a variation for the normality check based on the type of inference:

**Random**: Both groups come from independent random samples or a randomized controlled experiment, ensuring unbiased estimates.
**10% Condition**: When sampling without replacement from a finite population, each sample size must be no more than 10% of its population to ensure within-group independence. Not required for randomized experiments.
**Independent Groups**: The two groups are independent of each other (no matching or paired data).
**Large Counts (Normality)**: For confidence intervals, all four observed counts $n_1\hat{p}_1, n_1(1-\hat{p}_1), n_2\hat{p}_2, n_2(1-\hat{p}_2)$ must be at least 10. For hypothesis tests of $H_0: p_1=p_2$, apply the pooled proportion to each group instead: $n_1\hat{p}_{pooled}, n_1(1-\hat{p}_{pooled}), n_2\hat{p}_{pooled}, n_2(1-\hat{p}_{pooled})$ must all be at least 10.

📐 Worked Example

A community organizer wants to compare voter turnout between two neighborhoods (populations of 1800 and 2500 voters). They take an SRS of 120 voters from Neighborhood 1 and an independent SRS of 180 voters from Neighborhood 2, and find that 78 and 99 of them voted, respectively. Check all conditions for a confidence interval for the difference in turnout.

1. **Random Condition**: The problem states both are independent simple random samples, so this condition is satisfied.
2. **10% Condition**: $n_1 = 120 < 0.1(1800) = 180$, and $n_2 = 180 < 0.1(2500) = 250$, so the 10% condition is satisfied.
3. **Independent Groups**: Samples from the two neighborhoods are independent, so this condition is met.
4. **Large Counts (Confidence Interval)**: The observed counts of successes and failures are 78 and 42 in Neighborhood 1 and 99 and 81 in Neighborhood 2. All four are ≥ 10, so the Normal approximation is reasonable.
All conditions for inference are met.
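The numeric parts of this check can be sketched in a few lines of Python. The function name and the returned dictionary are illustrative choices, not part of any standard library. The Random and Independent Groups conditions depend on the study design, so they cannot be verified from the numbers alone and are left out.

```python
def check_ci_conditions(x1, n1, pop1, x2, n2, pop2, threshold=10):
    """Check the 10% and large counts conditions for a two-proportion interval.

    x1, x2     : observed success counts
    n1, n2     : sample sizes
    pop1, pop2 : population sizes (needed only for the 10% condition)
    """
    # 10% condition: each sample is at most 10% of its population.
    ten_percent = 10 * n1 <= pop1 and 10 * n2 <= pop2
    # Large counts for an interval: observed successes and failures per group.
    counts = [x1, n1 - x1, x2, n2 - x2]
    large_counts = all(c >= threshold for c in counts)
    return {"ten_percent": ten_percent, "counts": counts, "large_counts": large_counts}


# Voter turnout data from the worked example.
result = check_ci_conditions(78, 120, 1800, 99, 180, 2500)
print(result)
```

On an exam the same information must still be written out condition by condition; the code only confirms the arithmetic.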

Exam tip: Always write out each condition explicitly when asked, and back it up with the numbers from the problem. Naming a condition without checking it typically does not earn credit, so never skip any.

3. Confidence Intervals for $p_1 - p_2$ ★★★☆☆ ⏱ 4 min

A confidence interval for a difference in two proportions gives a range of plausible values for the true difference $p_1 - p_2$. We never pool sample proportions for confidence intervals, because we do not assume $p_1 = p_2$ when estimating the difference. We always use unpooled standard error.

$$(\hat{p}_1 - \hat{p}_2) \pm z^*\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Here $z^*$ is the critical value for your desired confidence level (e.g., $z^* = 1.96$ for 95% confidence, $z^* = 1.645$ for 90% confidence). If 0 is inside the interval, 0 is a plausible value for the true difference, meaning there is no statistically significant evidence of a difference in a two-sided test at significance level $\alpha = 1 - \text{confidence level}$.

📐 Worked Example

Using the voting turnout example from the previous section: $n_1=120$, $\hat{p}_1 = 0.65$ (Neighborhood 1), $n_2=180$, $\hat{p}_2=0.55$ (Neighborhood 2). Conditions are already checked. Construct and interpret a 95% confidence interval for $p_1 - p_2$.

1. Calculate the difference in sample proportions:
$\hat{p}_1 - \hat{p}_2 = 0.65 - 0.55 = 0.10$
2. Find the 95% confidence critical value:
$z^* = 1.96$
3. Calculate unpooled standard error:
$SE = \sqrt{\frac{0.65(0.35)}{120} + \frac{0.55(0.45)}{180}} \approx 0.0572$
4. Calculate margin of error:
$ME = 1.96 \times 0.0572 \approx 0.112$
5. Construct the interval:
$0.10 \pm 0.112 = (-0.012, 0.212)$
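6. Interpretation: We are 95% confident that the true difference in the proportion of voters who turned out (Neighborhood 1 minus Neighborhood 2) is between -0.012 and 0.212. Because 0 is inside the interval, the data do not give convincing evidence that turnout differs between the two neighborhoods.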
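The six steps above can be reproduced with Python's standard library. This is a minimal sketch: the function name `two_prop_ci` is an illustrative choice, and `statistics.NormalDist().inv_cdf` is used to look up $z^*$ instead of a table.

```python
import math
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Unpooled z-interval for p1 - p2 from success counts and sample sizes."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Critical value z* leaving (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample keeps its own proportion.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_star * se
    return diff - margin, diff + margin


# Voter turnout data: 78 of 120 in Neighborhood 1, 99 of 180 in Neighborhood 2.
low, high = two_prop_ci(78, 120, 99, 180)
print(f"({low:.3f}, {high:.3f})")  # (-0.012, 0.212)

# Reversing the order of subtraction flips the signs of the bounds.
low_rev, high_rev = two_prop_ci(99, 180, 78, 120)
print(f"({low_rev:.3f}, {high_rev:.3f})")  # (-0.212, 0.012)
```

Passing success counts instead of rounded proportions avoids carrying rounding error into the standard error.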

Exam tip: If you swap the order of $p_1$ and $p_2$, the interval bounds flip sign but the conclusion about whether 0 is in the interval stays the same. Just ensure your interpretation matches your order of difference.

4. Hypothesis Tests for Difference in Proportions ★★★☆☆ ⏱ 5 min

We use hypothesis tests to evaluate a claim about whether two population proportions differ. The most common null hypothesis is $H_0: p_1 - p_2 = 0$ (no difference between proportions). Because the null assumes $p_1 = p_2$, we pool the two samples to get a single estimate of the common population proportion, $\hat{p}_{pooled} = \frac{x_1 + x_2}{n_1 + n_2}$, where $x_1$ and $x_2$ are the observed success counts. Using this pooled estimate gives a standard error that is consistent with the null hypothesis.

The pooled standard error and z-test statistic are:

$$SE_{pooled} = \sqrt{\hat{p}_{pooled}(1-\hat{p}_{pooled})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}}$$

📐 Worked Example

A college tests whether a new early registration process increases the proportion of students who register before the deadline. 250 students are randomly assigned to the new process (group 1) and 250 to the old process (group 2). 205 students in the new group registered early, compared to 180 in the old group. Test at $\alpha = 0.05$ whether the new process increases the proportion of early registration.

1. Define parameters and state hypotheses: Let $p_1$ = true proportion of early registration for the new process, $p_2$ = true proportion for the old process. $H_0: p_1 - p_2 = 0$, $H_a: p_1 - p_2 > 0$.
2. Check conditions: Random assignment meets the random condition, 10% condition is skipped for an experiment, groups are independent. Calculate pooled large counts:
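$\hat{p}_{pooled} = \frac{205 + 180}{250 + 250} = 0.77$. Applying the pooled proportion to each group of 250: $250(0.77) = 192.5 \geq 10$ and $250(0.23) = 57.5 \geq 10$. Both groups have the same size, so all four expected counts are at least 10.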
3. Calculate test statistic: $\hat{p}_1 = 0.82$, $\hat{p}_2 = 0.72$, difference = 0.10:
$SE_{pooled} = \sqrt{0.77(0.23)\left(\frac{1}{250} + \frac{1}{250}\right)} \approx 0.0376, z = \frac{0.10}{0.0376} \approx 2.66$
4. Find p-value: For a right-tailed test, $p$-value = $P(Z > 2.66) \approx 0.004$.
5. Conclusion: Since $0.004 < 0.05$, we reject $H_0$. There is sufficient evidence at the 0.05 significance level that the new early registration process increases the proportion of students who register before the deadline.
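The calculation in steps 3 and 4 can be sketched in Python as follows. The function name and the `alternative` argument are illustrative choices; the p-value comes from the standard Normal distribution in the `statistics` module.

```python
import math
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2, alternative="greater"):
    """Pooled two-proportion z-test of H0: p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pool the samples because H0 assumes a common proportion.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    cdf = NormalDist().cdf(z)
    if alternative == "greater":      # Ha: p1 - p2 > 0
        p_value = 1 - cdf
    elif alternative == "less":       # Ha: p1 - p2 < 0
        p_value = cdf
    elif alternative == "two-sided":  # Ha: p1 - p2 != 0
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Registration data: 205 of 250 (new process) vs. 180 of 250 (old process).
z, p_value = two_prop_z_test(205, 250, 180, 250, alternative="greater")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The p-value is well below 0.05, which agrees with the decision to reject $H_0$ in step 5.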

Exam tip: Only pool when your null hypothesis is $p_1 - p_2 = 0$. Non-zero null differences (extremely rare on AP) require unpooled standard error.
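For contrast, here is a minimal sketch of the test statistic when the null difference is not zero. The null value of 0.05 is purely hypothetical and is paired with the registration counts only to show the mechanics; the function name is also an illustrative choice.

```python
import math


def z_for_nonzero_null(x1, n1, x2, n2, null_diff):
    """z statistic for H0: p1 - p2 = null_diff, using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # No pooling: H0 does not claim the two proportions are equal.
    se_unpooled = math.sqrt(
        p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    )
    return ((p1_hat - p2_hat) - null_diff) / se_unpooled


# Hypothetical null: the new process raises early registration by 5 percentage points.
z_nonzero = z_for_nonzero_null(205, 250, 180, 250, null_diff=0.05)
print(f"z = {z_nonzero:.2f}")
```

The only changes from the pooled test are the standard error and the value subtracted in the numerator.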

Common Pitfalls

Why: Students confuse pooling rules for hypothesis tests and confidence intervals, incorrectly assuming pooling is always required

Why: Students forget that we use observed counts when population proportions are unknown

Why: Students automatically use two-sample methods for two proportions regardless of study design

Why: Students confuse the definition of confidence level with probability for a fixed interval

Why: Students memorize the 10% condition as required for all inference and forget it only applies to sampling without replacement from finite populations

Why: Students skip this step to save time, leading to unclear reasoning and lost points

Quick Reference Cheatsheet
