Statistics · Unit 7: Inference for Quantitative Data: Means · 14 min read · Updated 2026-05-11
Inference for a Mean Difference with Paired Data — AP Statistics
1. Paired Data and Study Design ★★☆☆☆ ⏱ 3 min
Paired data arise when each observation in one group is matched one-to-one with exactly one observation in the other group. Because inference is carried out on the within-pair differences, pairing removes between-pair variability from the comparison, which typically gives a smaller standard error and more powerful inference than treating the two groups as independent samples. Two common designs produce paired data:
**Repeated measures**: The same experimental unit is measured twice under two different conditions (e.g., heart rate before and after exercise)
**Matched pairs design**: Units are matched into pairs that are similar on potential confounding variables, then the two treatments are randomly assigned within each pair so that each unit in a pair receives a different treatment
Exam tip: When in doubt, ask: 'Does every observation in the first group have exactly one unique connected observation in the second group?' If yes, it is paired.
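The effect of pairing on the standard error can be sketched with a small repeated-measures data set. The heart-rate values below are hypothetical, invented only for illustration, and the names `before`, `after`, and `diffs` are assumptions rather than anything from the course materials.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical heart rates (beats per minute) for the same 8 people.
before = [72, 68, 75, 80, 66, 70, 77, 74]
after = [78, 71, 79, 88, 69, 75, 83, 79]

# One difference per pair, always subtracting in the same order (after - before).
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

# Standard error used by the paired procedure: based on the differences only.
se_paired = stdev(diffs) / sqrt(n)

# Standard error a two-sample procedure would use if the pairing were ignored.
se_two_sample = sqrt(stdev(before) ** 2 / n + stdev(after) ** 2 / n)

print("differences:", diffs)
print("mean difference:", mean(diffs))
print("paired SE:", round(se_paired, 3))
print("two-sample SE:", round(se_two_sample, 3))
```

In this made-up data set, people differ a lot from one another but change by similar amounts, so the differences vary much less than either column of raw measurements and the paired standard error is the smaller of the two.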
2. Conditions for Paired t-Inference ★★☆☆☆ ⏱ 3 min
All paired t-procedures require three conditions, all checked on the sample of differences (not the original two sets of measurements), analogous to one-sample t-procedures.
**Random**: Pairs are randomly selected from the population, or treatments are randomly assigned within pairs
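**Normal/Large Sample**: The sampling distribution of $\bar{d}$ is approximately normal if the number of pairs is large ($n \geq 30$), or, when $n < 30$, if a graph of the sample differences shows no strong skewness or outliers
**Independent**: Differences are independent of one another; when pairs are sampled without replacement, check the 10% condition ($N \geq 10n$, where $N$ is the number of pairs in the population and $n$ is the number of pairs sampled)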
Exam tip: Always check conditions on the differences, not the original two groups. The AP exam deducts points for checking conditions on unpaired original data.
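As a rough sketch, the numeric parts of these checks can be screened in code, always on the differences. The function name `check_paired_conditions`, the population size of 500 pairs, and the 1.5 × IQR outlier rule are assumptions made for illustration; randomness cannot be verified from the data and is passed in as a fact about the study design. A graph of the differences is still needed to judge skewness.

```python
from statistics import quantiles

def check_paired_conditions(diffs, population_pairs, randomized):
    """Return rough screens for the paired-t conditions, computed on the differences."""
    n = len(diffs)
    q1, _, q3 = quantiles(diffs, n=4)  # sample quartiles of the differences
    iqr = q3 - q1
    # Assumed convention: flag values more than 1.5 * IQR beyond the quartiles.
    outliers = [d for d in diffs if d < q1 - 1.5 * iqr or d > q3 + 1.5 * iqr]
    return {
        "random": randomized,                       # design fact, not computed
        "ten_percent": population_pairs >= 10 * n,  # N >= 10n
        "large_sample": n >= 30,
        "outliers": outliers,
    }

# Hypothetical differences (after - before) for 8 pairs.
diffs = [6, 3, 4, 8, 3, 5, 6, 5]
result = check_paired_conditions(diffs, population_pairs=500, randomized=True)
print(result)
```

Here the sample is small, so the Normal/Large Sample condition rests on the differences showing no strong skewness or outliers rather than on $n \geq 30$.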
3. Paired t-Test for a Population Mean Difference ★★★☆☆ ⏱ 4 min
A paired t-test is used to test a claim about the true population mean difference $\mu_d$. The null hypothesis is almost always $H_0: \mu_d = 0$, because the usual question is whether the paired measurements differ on average. The test statistic is:
$$t = \frac{\bar{d} - \mu_{d0}}{s_d / \sqrt{n}}$$
Where $\mu_{d0}$ is the null hypothesized difference (almost always 0), and degrees of freedom are $df = n - 1$.
Exam tip: Always define $\mu_d$ in context, specifying which measurement is subtracted from which, before writing hypotheses. This is required for full points.
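The test statistic can be computed directly from the differences. The sketch below uses hypothetical differences and a hypothetical helper name, `paired_t_statistic`. It stops at the statistic and degrees of freedom, since the P-value would come from a t table or technology.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(diffs, mu_d0=0.0):
    """Return (t, df) for a paired t-test of H0: mu_d = mu_d0."""
    n = len(diffs)
    d_bar = mean(diffs)   # sample mean difference
    s_d = stdev(diffs)    # sample standard deviation of the differences
    t = (d_bar - mu_d0) / (s_d / sqrt(n))
    return t, n - 1

# Hypothetical differences (after - before) for 8 pairs.
diffs = [6, 3, 4, 8, 3, 5, 6, 5]
t_stat, df = paired_t_statistic(diffs)
print(round(t_stat, 2), df)
```

For these made-up values, $\bar{d} = 5$ and $s_d \approx 1.69$, so $t \approx 8.37$ with $df = 7$.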
4. Paired t-Confidence Interval for a Population Mean Difference ★★★☆☆ ⏱ 3 min
A paired t-confidence interval estimates the true value of $\mu_d$, quantifying the size of the mean difference rather than just testing if it is non-zero. If the interval does not contain 0, we reject $H_0: \mu_d = 0$ at significance level $\alpha = 1-C$ for a two-sided test, where $C$ is the confidence level. The formula is:
$$\bar{d} \pm t^* \times \frac{s_d}{\sqrt{n}}$$
Where $t^*$ is the critical t-value for confidence level $C$ and $df = n-1$, and the second term is the margin of error.
Exam tip: When interpreting the interval, always specify the direction of the difference (which group minus which) in context. Ambiguous interpretations lose points.
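The interval formula can be sketched the same way. The data and the helper name `paired_t_interval` are hypothetical, and the critical value $t^* = 2.365$ is the standard table value for 95% confidence with $df = 7$, entered by hand rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_interval(diffs, t_star):
    """Return (lower, upper) for d_bar +/- t_star * s_d / sqrt(n)."""
    n = len(diffs)
    d_bar = mean(diffs)
    margin = t_star * stdev(diffs) / sqrt(n)  # margin of error
    return d_bar - margin, d_bar + margin

# Hypothetical differences (after - before) for 8 pairs.
diffs = [6, 3, 4, 8, 3, 5, 6, 5]

# Assumed input: t* for 95% confidence and df = 7, read from a t table.
T_STAR_95_DF7 = 2.365

lower, upper = paired_t_interval(diffs, T_STAR_95_DF7)
print(round(lower, 2), round(upper, 2))
```

For these made-up values the interval is roughly 3.59 to 6.41. Since it lies entirely above 0, it is consistent with rejecting $H_0: \mu_d = 0$ in a two-sided test at $\alpha = 0.05$, and an interpretation in context would state that the difference is after minus before.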
5. Concept Check ★★☆☆☆ ⏱ 1 min
Common Pitfalls
Why: Students see two groups of measurements and automatically select the two-sample procedure without checking for pairing
Why: Students forget inference is conducted on the differences, not the original measurements
Why: While $\bar{x}_1 - \bar{x}_2 = \bar{d}$ is true, using original standard deviations to calculate standard error is incorrect
Why: Students confuse the t/z distinction, incorrectly assuming large n justifies z-procedures
Why: Students confuse the distribution of individual differences with the confidence interval for the mean difference
Why: Students stop after rejecting/failing to reject $H_0$ and do not answer the original research question