Statistics · Exploring Two-Variable Data · 14 min read · Updated 2026-05-11
Correlation for AP Statistics — AP Statistics
1. What Is Correlation?★★☆☆☆⏱ 3 min
Correlation measures the strength and direction of the linear relationship between two quantitative variables. The unit it belongs to, Exploring Two-Variable Data, makes up roughly 5-7% of the AP Statistics exam weight, and correlation appears in both multiple-choice (MCQ) and free-response (FRQ) questions. It is often tested as a standalone MCQ or as a foundation for longer regression FRQs.
Unlike the regression slope, the correlation coefficient $r$ has no units, so changing the units of either variable (for example, inches to centimeters) does not change its value. The AP Statistics Course and Exam Description (CED) focuses almost exclusively on the sample correlation $r$ for this topic.
2. Calculating the Correlation Coefficient★★★☆☆⏱ 4 min
There are two equivalent common formulas for the sample correlation coefficient $r$, both of which you may need to use on the AP exam.
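The z-score form gives clear intuition: correlation is approximately the average of the products of the two variables' z-scores (the sum is divided by $n-1$ rather than $n$), where $z_{x_i} = (x_i - \bar{x})/s_x$ and $z_{y_i} = (y_i - \bar{y})/s_y$.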
$$r = \frac{1}{n-1} \sum_{i=1}^n z_{x_i} z_{y_i}$$
The equivalent deviation form, which is easier for hand calculation on FRQs, is:
$$r = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{(n-1)s_x s_y}$$
Here $\bar{x}, \bar{y}$ are the sample means and $s_x, s_y$ are the sample standard deviations. The numerator captures whether $x$ and $y$ tend to deviate from their means in the same direction: if a point has both values above or both below their means, its product is positive, pulling $r$ up; if one value is above and the other below, its product is negative, pulling $r$ down.
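To check that the two formulas above agree, here is a minimal Python sketch that computes $r$ both ways. The function names `r_zscore` and `r_deviation` and the small data set (hours studied and quiz scores) are made up for illustration; they are not from the AP materials.

```python
from statistics import mean, stdev  # stdev is the sample SD (divides by n - 1)


def r_zscore(xs, ys):
    """z-score form: sum the products of paired z-scores, divide by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


def r_deviation(xs, ys):
    """Deviation form: sum of cross-deviations over (n - 1) * s_x * s_y."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return numerator / ((n - 1) * stdev(xs) * stdev(ys))


# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

print(round(r_zscore(hours, scores), 3))     # 0.775
print(round(r_deviation(hours, scores), 3))  # 0.775
```

By hand, the means are 3 and 4, the cross-deviation products sum to 6, and $(n-1)s_x s_y = \sqrt{10 \cdot 6} \approx 7.746$, which gives $r \approx 0.775$ in both forms.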
3. Key Properties of the Correlation Coefficient★★☆☆☆⏱ 3 min
Most AP Statistics MCQs on correlation test knowledge of these core properties, which are critical for earning full points:
**Bounds**: $r$ is always between -1 and 1 ($-1 \leq r \leq 1$). $r=1$ is a perfect positive linear relationship, $r=-1$ is a perfect negative linear relationship, and $r=0$ means no linear association.
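**Invariance to linear transformations**: $r$ is unaffected by adding a constant to either variable or multiplying either variable by a positive constant. (Multiplying one variable by a negative constant flips the sign of $r$ but leaves its magnitude unchanged.) Changing units will never change $r$.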
**Symmetry**: The correlation between $x$ and $y$ is identical to the correlation between $y$ and $x$. Unlike the regression slope, it does not matter which variable is treated as explanatory.
**Linear only**: $r$ only measures linear association. It can be close to 0 even if there is a strong non-linear relationship between the two variables.
**Sensitivity to outliers**: $r$ is very sensitive to extreme outliers. A single outlier can drastically shift $r$ toward or away from 0.
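The properties above can be checked numerically. The following is a minimal Python sketch, not an official AP procedure: the helper `corr` and every data set in it are invented for illustration.

```python
from statistics import mean, stdev  # stdev is the sample SD (divides by n - 1)


def corr(xs, ys):
    """Sample correlation r using the deviation form."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return numerator / ((n - 1) * stdev(xs) * stdev(ys))


# Hypothetical data used for the first three checks.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = corr(x, y)

# Invariance: rescale x by a positive factor and shift it (a unit change).
r_rescaled = corr([2.54 * v + 10 for v in x], y)

# A negative multiplier flips the sign of r but keeps its magnitude.
r_flipped = corr([-v for v in x], y)

# Symmetry: swapping the roles of x and y leaves r unchanged.
r_swapped = corr(y, x)

# Linear only: y = x^2 is a perfect curved relationship, yet r is 0 here
# because the x-values are symmetric about 0.
x_sym = [-2, -1, 0, 1, 2]
r_curve = corr(x_sym, [v ** 2 for v in x_sym])

# Outliers: five points with no linear trend, then one extreme point added.
x_flat, y_flat = [1, 2, 3, 4, 5], [3, 1, 3, 1, 3]
r_flat = corr(x_flat, y_flat)
r_outlier = corr(x_flat + [20], y_flat + [20])

print(r, r_rescaled, r_flipped, r_swapped)
print(r_curve, r_flat, r_outlier)
```

In the last check, the single added point moves $r$ from about 0 to above 0.9, which is why a scatterplot should always be inspected before trusting a correlation value.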
4. Interpreting Correlation in Context★★★☆☆⏱ 3 min
AP FRQs almost always require you to interpret the value of $r$ in context, and grading rubrics have strict requirements for full credit. To earn all points, your interpretation must include three core elements: (1) direction of the relationship, (2) strength of the *linear* relationship, (3) context that names both variables. You must explicitly mention linear association to avoid point deductions.
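A common rule of thumb for strength that is generally accepted on the AP exam is: $|r| > 0.7$ = strong, $0.3 < |r| < 0.7$ = moderate, $|r| < 0.3$ = weak. These cutoffs are guidelines rather than official rules, so a value on or near a cutoff can reasonably be described either way (for example, "moderately strong").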
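As a sketch of how the three required elements fit together, the Python snippet below builds an interpretation sentence from a value of $r$ and the two variable names. The functions `describe_strength` and `interpret_r`, the sentence template, and the example variables are all hypothetical; this is not an official rubric wording.

```python
def describe_strength(r):
    """Label strength with the rule-of-thumb cutoffs 0.3 and 0.7.

    Assumption made here: exactly 0.7 counts as strong and exactly 0.3
    counts as moderate. That boundary handling is an arbitrary choice.
    """
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.3:
        return "moderate"
    return "weak"


def interpret_r(r, x_name, y_name):
    """Build a sentence with direction, strength, 'linear', and context."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return f"There is no linear association between {x_name} and {y_name}."
    direction = "positive" if r > 0 else "negative"
    return (f"There is a {describe_strength(r)}, {direction}, linear "
            f"association between {x_name} and {y_name}.")


# Hypothetical context, for illustration only.
print(interpret_r(-0.82, "hours of TV watched per week", "GPA"))
# There is a strong, negative, linear association between hours of TV watched per week and GPA.
```

The template names both variables and uses the word "linear", the two elements most often missing from student responses.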
5. AP Style Concept Check★★★☆☆⏱ 3 min
Common Pitfalls
Why: Students confuse "no linear association" with "no association at all". Strong non-linear relationships can still have $r=0$.
Why: Correlation from observational data can be explained by lurking third variables.
Why: Students confuse correlation with measures of association for categorical data. Correlation is only defined for quantitative variables.
Why: Students confuse correlation with regression slope, which does change when you swap variables.
Why: Correlation is not a linear scale of strength; the distance from 0 to 0.4 is not equivalent to the distance from 0.4 to 0.8.
Why: Students forget that $r$ is highly sensitive to extreme outliers, which can drastically change its value.