Statistics · Exploring Two-Variable Data · 14 min read · Updated 2026-05-11

AP Statistics Residuals

1. Definition and Core Properties of Residuals ★★☆☆☆ ⏱ 3 min

A residual is the difference between the observed value of the response variable and the value predicted by a regression model. Residuals are the main tool for assessing how well a linear model fits bivariate data. This topic makes up 10-15% of AP Statistics Unit 2 and appears in both multiple-choice and free-response sections.

AP exam notation consistently uses $e$ for residuals, $y$ for the observed response, and $\hat{y}$ for the predicted response. A frequently tested property: the sum (and therefore the mean) of the residuals for a least-squares regression line (LSRL) is always zero, because the LSRL balances positive and negative errors. This gives you a built-in check for your calculations: if your residuals do not sum to zero, apart from rounding error, recheck your arithmetic.
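The zero-sum property can be checked numerically. The sketch below is a minimal illustration, not part of the AP curriculum: the helper `fit_lsrl` and the `hours`/`scores` data are made up for this example, and the slope and intercept come from the standard least-squares formulas.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the LSRL always passes through (x-bar, y-bar)
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

a, b = fit_lsrl(hours, scores)

# Residual = observed - predicted, one per data point
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]
residual_sum = sum(residuals)

print(f"LSRL: y-hat = {a:.1f} + {b:.1f}x")
print("residuals:", residuals)
print("sum of residuals:", round(residual_sum, 10))
```

For this data the fitted line is $\hat{y} = 46 + 6x$ and the residuals are 0, 2, -3, 0, 1, which sum to zero. With less tidy data, expect a sum that differs from zero only by rounding error.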

2. Calculating and Interpreting Individual Residuals ★★☆☆☆ ⏱ 4 min

To find the residual for a single data point, we use a simple formula that follows directly from the definition:

$$e = y - \hat{y}$$

Here $y$ is the observed response value from the original data, and $\hat{y}$ is the predicted response value, found by plugging the observed value of the explanatory variable $x$ into the LSRL equation $\hat{y} = a + bx$. A positive residual means the model underpredicted the response (observed > predicted), and a negative residual means the model overpredicted the response (observed < predicted). On the AP exam, you must both calculate the residual and interpret it in context to earn full credit.
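As a small worked example of the calculation and the sign interpretation, the sketch below assumes a hypothetical LSRL $\hat{y} = 46 + 6x$ (test score predicted from hours studied). The function names `residual` and `describe` and the two data points are illustrative only.

```python
def residual(x, y, a, b):
    """Residual e = y - y-hat, where y-hat = a + b*x."""
    y_hat = a + b * x
    return y - y_hat


def describe(e):
    """Translate the sign of a residual into plain language."""
    if e > 0:
        return "model underpredicted (observed > predicted)"
    if e < 0:
        return "model overpredicted (observed < predicted)"
    return "model predicted exactly"


# Hypothetical LSRL: predicted score = 46 + 6 * (hours studied)
a, b = 46.0, 6.0

# Student A studied 2 hours and scored 60; predicted score is 58
e_under = residual(2, 60, a, b)
# Student B studied 3 hours and scored 61; predicted score is 64
e_over = residual(3, 61, a, b)

print(e_under, describe(e_under))
print(e_over, describe(e_over))
```

In context, Student A's residual of 2 means the actual score was 2 points higher than the line predicted for 2 hours of study. Student B's residual of -3 means the actual score was 3 points lower than predicted.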

3. Interpreting Residual Plots to Assess Linear Model Fit ★★★☆☆ ⏱ 3 min

A residual plot is a scatterplot with residuals $e$ on the y-axis and either the explanatory variable $x$ or predicted value $\hat{y}$ on the x-axis. We use residual plots to check if a linear model is appropriate, because subtle patterns that are hard to see on the original $y$ vs $x$ scatterplot become very clear in a residual plot.

The core rule for assessment is simple: if the residual plot shows no clear, systematic pattern (the points scatter randomly around the horizontal line $e = 0$), a linear model is appropriate. If there is a clear systematic pattern, a linear model is not appropriate. Two patterns signal problems: (1) curved patterns (U-shape or inverted U-shape), which indicate that the true relationship is non-linear; (2) fanning or funnel patterns, where the spread of the residuals changes as $x$ increases, which indicate non-constant variance (heteroscedasticity), so predictions are less precise where the spread is larger.
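To see how a curved pattern arises, the sketch below fits a line to data that is deliberately non-linear (made-up values following $y = x^2$) and lists the residuals in order of $x$. The helper `fit_lsrl` is an illustrative name, and reading the pattern from the printed signs stands in for inspecting an actual residual plot.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical curved data: the true relationship is y = x^2, not a line
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

a, b = fit_lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Listing residuals in order of x shows what the residual plot would show
for x, e in zip(xs, residuals):
    print(f"x = {x}: residual = {e:.1f}")
```

The residuals are positive at both ends of the $x$ range and negative in the middle, which is the U-shape described above. The residuals still sum to zero, so that check alone cannot detect a poor choice of model.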

4. Standard Deviation of Residuals ★★★☆☆ ⏱ 4 min

The standard deviation of the residuals (written $s$ or $s_e$) measures the typical size of a residual: roughly how far observed values fall from the regression line, in the units of the response variable. It complements the graphical assessment from residual plots by giving a quantitative measure of model fit: a smaller $s$ means predictions are typically closer to the observed values, so the model fits better.

$$s = \sqrt{\frac{\sum e^2}{n-2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n-2}}$$

Here $n$ is the number of observations. We divide by $n-2$, the degrees of freedom for simple linear regression, because two quantities (the slope and the intercept) are estimated from the data. We square the residuals so that positive and negative values do not cancel (the raw residuals always sum to zero, so their average is useless), then take the square root to return to the original units of the response variable.
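The formula can be applied step by step as in the sketch below. It assumes the same kind of made-up data as a typical textbook exercise (hours studied and test scores) and a hypothetical LSRL $\hat{y} = 46 + 6x$; the function name `residual_sd` is illustrative.

```python
import math


def residual_sd(observed, predicted):
    """Standard deviation of residuals: sqrt(sum(e^2) / (n - 2))."""
    n = len(observed)
    if n < 3:
        raise ValueError("need at least 3 observations so that n - 2 > 0")
    # Sum of squared residuals, where each residual is observed - predicted
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return math.sqrt(sse / (n - 2))


# Hypothetical data and LSRL: predicted score = 46 + 6 * (hours studied)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
predicted = [46 + 6 * x for x in hours]

s = residual_sd(scores, predicted)
print(f"s = {s:.2f}")
```

Here the squared residuals add up to 14 and $n - 2 = 3$, so $s = \sqrt{14/3} \approx 2.16$. In context: actual test scores typically differ from the scores predicted by the line by about 2.16 points.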

Common Pitfalls

Why: Students confuse the order of subtraction when writing 'prediction error'.

Why: Subtle non-linear patterns are often invisible on the original scatterplot but clear in residuals.

Why: Students forget the sum of raw residuals is always zero for LSRL, so average raw residual is meaningless.

Why: Students confuse random sampling variation with a systematic pattern.

Why: Students confuse population standard deviation with regression residual standard deviation.

Why: Students think residuals should be zero for a good model.
