Statistics · Exploring Two-Variable Data · 14 min read · Updated 2026-05-11

Representing Two-Variable Quantitative Data — AP Statistics

1. Bivariate Quantitative Data Overview ★☆☆☆☆ ⏱ 3 min

Two-variable (bivariate) quantitative data consists of paired measurements of two different quantitative variables collected on the same observational unit. The core goal of graphical representation is to visualize any association between the two variables: do changes in one variable tend to correspond to predictable changes in the other?

This foundational topic is part of AP Statistics Unit 2, which accounts for 5-7% of total AP exam weight, and it appears in both the MCQ and FRQ sections. Unlike univariate analysis, which focuses on the distribution of a single variable, bivariate representation focuses on the relationship between two variables.

2. Explanatory vs Response Variables ★★☆☆☆ ⏱ 3 min

The first critical step in representing bivariate quantitative data is correctly classifying the two variables by their role in the research question. Roles are determined by the research goal, not the variables themselves.

📐 Worked Example

A public health researcher studies 30 urban neighborhoods to examine how the number of fast-food restaurants per square mile predicts the rate of adult obesity (as a percent of the population). Identify which variable is explanatory, which is response, and state the correct axis for each in a scatterplot.

The research goal explicitly states that number of fast-food restaurants is used to predict obesity rate, so the variable used for prediction is the explanatory variable by definition.
Explanatory variable = number of fast-food restaurants per square mile.
The variable being predicted is the response variable: obesity rate (percent of adult population).
By plotting convention, explanatory variables go on the horizontal $x$-axis and response variables go on the vertical $y$-axis.
Final answer: Explanatory = number of fast-food restaurants (x-axis), Response = obesity rate (y-axis)

Exam tip: If you are unsure of roles, look for phrasing like 'use A to predict B' — A is always explanatory, B is always response.

3. Constructing and Interpreting Scatterplots ★★☆☆☆ ⏱ 4 min

A scatterplot is the standard graphical representation for two-variable quantitative data. Each observational unit is represented by a single point placed at the intersection of its $x$ (explanatory) and $y$ (response) values.

To construct a scatterplot correctly:

- Label both axes with the variable name and its units.
- Use a consistent, appropriate scale that fits all data points.
- Plot each point accurately.
- Never connect points with lines (connecting is only done for time series plots, not standard scatterplots of independent observational units).
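The construction rules above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `text_scatterplot`, the character-grid layout, and the six neighborhood values are all hypothetical, and on the exam you would draw the plot by hand or with a calculator. The sketch puts the explanatory variable on the horizontal axis, labels both axes with units, scales each axis to fit every point, and leaves the points unconnected.

```python
def text_scatterplot(points, x_label, y_label, width=40, height=12):
    """Render (x, y) pairs as a rough character grid; points are never connected."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero range when every value on an axis is the same.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot

    lines = [f"{y_label} (vertical axis, {y_min} to {y_max})"]
    lines += ["|" + "".join(row) for row in grid]
    lines.append("+" + "-" * width)
    lines.append(f"{x_label} (horizontal axis, {x_min} to {x_max})")
    return "\n".join(lines)


# Hypothetical data: (fast-food restaurants per square mile, adult obesity rate in %)
neighborhoods = [(1, 22.0), (2, 24.5), (3, 25.0), (4, 28.0), (5, 29.5), (6, 33.0)]
plot = text_scatterplot(
    neighborhoods,
    x_label="Fast-food restaurants per square mile",
    y_label="Adult obesity rate (%)",
)
print(plot)
```

Each `*` is one observational unit. Swapping the order of the values in each pair would swap the axes, which is why the roles of the two variables need to be settled before plotting.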

When describing the association shown in a scatterplot, address four key features:

- *Direction*: Positive = as $x$ increases, $y$ tends to increase; Negative = as $x$ increases, $y$ tends to decrease; No direction = no clear association.
- *Shape*: Most commonly linear or non-linear (curved).
- *Strength*: Strong = points lie close to the overall pattern; Weak = points are widely spread from the pattern.
- *Outliers*: Any point that falls far outside the overall pattern of the association.

📐 Worked Example

A real estate agent collects data on 12 recently sold homes, measuring square footage of the home (x, in hundreds of square feet) and sale price (y, in thousands of USD). The resulting scatterplot shows that sale price tends to increase as square footage increases, points lie close to a straight line, and one small 1,000 square foot home sold for \$800,000, far above the trend of other points of the same size. Describe all four key features of the association.

Direction: Sale price tends to increase as square footage increases, so the direction is positive.
Shape: Points follow a straight-line trend, so the shape is linear.
Strength: Points lie close to the linear trend, so the association is strong.
Outliers: There is one clear outlier: a small home that sold for a much higher price than the overall pattern predicts.
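As a rough numerical check on the description above, the sketch below uses the correlation coefficient $r$, a summary that is not part of this section and is introduced later in the unit; it is borrowed here only as an assumption to illustrate direction and strength. The 12 data values are hypothetical (the problem does not give the actual data), chosen so that 11 homes follow a straight-line trend and one 1,000 square foot home sells for \$800,000.

```python
from math import sqrt


def pearson_r(points):
    """Correlation coefficient r for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)


# Hypothetical homes: (square footage in hundreds, sale price in thousands of USD)
typical_homes = [
    (10, 205), (12, 225), (14, 265), (15, 270), (17, 310), (18, 315),
    (20, 355), (22, 375), (24, 415), (25, 420), (28, 475),
]
outlier = (10, 800)  # the 1,000 square foot home that sold for $800,000

r_without = pearson_r(typical_homes)
r_all = pearson_r(typical_homes + [outlier])

print("r without the unusual home:", round(r_without, 3))
print("r with the unusual home:   ", round(r_all, 3))
```

With these made-up values, the positive sign of `r_without` matches the positive direction, and its closeness to 1 matches a strong linear pattern. Including the unusual home pulls `r_all` well below `r_without`, which is one reason outliers are reported separately rather than folded into the overall description.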

Exam tip: Even if there are no outliers, you must explicitly state 'there are no clear outliers' to get full credit on an AP FRQ description question.

4. Linear vs Non-Linear Associations ★★★☆☆ ⏱ 4 min

One of the most important tasks when representing bivariate data is distinguishing between linear and non-linear associations, because the simple linear regression methods you will learn later are only appropriate when the association is linear.

Common non-linear patterns you may see on the exam include: increasing at an increasing rate (concave up, e.g., bacterial population growth over time), increasing at a decreasing rate (concave down, e.g., crop yield increasing with fertilizer use that levels off at high fertilizer amounts), and U-shaped or inverted U-shaped curves.

📐 Worked Example

An ecologist studies the relationship between elevation (in meters above sea level) and the number of native plant species found per 100 square meter plot, across 18 plots in a mountain range. The scatterplot shows that the number of species is low at low elevation, increases to a maximum at middle elevation, then decreases again at high elevation. Is this association linear or non-linear? Justify your answer.

A linear association requires the rate of change of species count with respect to elevation to be constant across all elevation values.
In this case, species count first increases with elevation, then decreases, so the rate of change changes from positive to negative as elevation increases, meaning it is not constant.
The overall pattern is a curved, inverted U-shape, not a straight line.
Therefore, the association is non-linear.
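The constant-rate-of-change argument can be checked numerically. The sketch below is an illustration under stated assumptions: the six (elevation, species count) pairs are hypothetical values shaped like the pattern described in the problem, and the helper `rates_of_change` is not a standard AP method. It computes the slope between consecutive points, which assumes distinct $x$ values and works cleanly only on smooth data like this; real scatterplots are noisy, so consecutive slopes would jump around even for a linear trend.

```python
def rates_of_change(points):
    """Slope between each pair of consecutive points, after sorting by x."""
    ordered = sorted(points)
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(ordered, ordered[1:])
    ]


# Hypothetical plots: (elevation in meters, native species per 100 square meter plot)
plots = [(200, 4), (600, 9), (1000, 15), (1400, 16), (1800, 10), (2200, 5)]

slopes = rates_of_change(plots)
has_positive = any(s > 0 for s in slopes)
has_negative = any(s < 0 for s in slopes)

print("Slopes between consecutive plots:", slopes)
print("Rate of change switches sign:", has_positive and has_negative)
```

For a linear association the slopes would all be roughly equal. Here they start positive and end negative, which is the inverted U-shape described in the solution.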

Exam tip: Don't assume an association is linear just because it is positive. Always check if the trend is straight, not just increasing or decreasing.

5. Concept Check ★★☆☆☆ ⏱ 2 min

Common Pitfalls

Why: Students assume any variable can go on any axis, and do not tie axis assignment to the research question.

Why: Students confuse scatterplots with algebra class line graphs or time series plots.

Why: Students remember direction and strength, but forget to address shape or explicitly note that there are no outliers.

Why: Students confuse extreme values with outliers from the association pattern.

Why: Students confuse "not linear" with "no relationship between variables".

Why: Students label variables but omit units, which is required for full credit.

Quick Reference Cheatsheet
