Statistics · Unit 8: Inference for Categorical Data: Chi-Square · 14 min read · Updated 2026-05-11

Selecting an Inference Procedure — AP Statistics

1. Overview of the Inference Selection Skill ★★☆☆☆ ⏱ 3 min

This AP Statistics skill requires you to match a given research question, study design, and type of categorical data to the correct inference procedure, rather than just calculating a test statistic or p-value. Per the AP CED, this topic contributes 2-5% of multiple-choice points and 1-2 points on nearly every chi-square-focused FRQ, with points awarded solely for correct selection.

Unlike calculation-focused problems, this topic tests conceptual understanding of how study design and research question drive inference choice, a core competency the AP exam prioritizes. Many students lose easy points here by mixing up the three chi-square procedures, so mastering this selection step is critical for full credit.

2. Identifying a Chi-Square Goodness-of-Fit Test ★★☆☆☆ ⏱ 3 min

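A chi-square goodness-of-fit (GOF) test checks whether the distribution of one categorical variable in a single population matches a hypothesized distribution. The null hypothesis for a GOF test always specifies a hypothesized proportion for each category: $H_0: p_1 = p_{1,0}, p_2 = p_{2,0}, \dots, p_k = p_{k,0}$, where $k$ is the number of categories. The alternative hypothesis is that at least one $p_i$ differs from its hypothesized value $p_{i,0}$.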

Features that identify a goodness-of-fit test:

- Only one group/sample
- One categorical variable
- A specific hypothesized distribution or set of proportions is provided
- Common contexts: testing claimed ratios, die fairness, expected demographic distributions

📐 Worked Example

A plant biologist expects that the ratio of purple-flowered to white-flowered to blue-flowered offspring from a cross will be 2:1:1. She collects a random sample of 100 offspring from the cross and counts how many plants have each flower color. She wants to test whether her expected ratio is correct. What is the appropriate inference procedure?

1. Count the number of samples and variables: we have one random sample of 100 offspring, and one categorical variable (flower color, 3 levels).
2. Identify the research goal: the biologist wants to test if the observed distribution of flower colors matches her hypothesized 2:1:1 ratio, which gives specific hypothesized proportions $0.5, 0.25, 0.25$ for each category.
3. There is no comparison of multiple groups, and no test of association between two variables.
4. Therefore, the appropriate procedure is a chi-square goodness-of-fit test.
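
To see what the 2:1:1 ratio implies numerically, here is a minimal Python sketch of the expected counts and the chi-square statistic. The observed counts are hypothetical (the problem does not report any data); only the ratio and the sample size of 100 come from the example.

```python
# Hypothetical observed counts, invented for illustration only.
# The worked example gives the ratio and n = 100 but no actual data.
observed = {"purple": 55, "white": 20, "blue": 25}

# Convert the hypothesized 2:1:1 ratio into proportions 0.5, 0.25, 0.25.
ratio = {"purple": 2, "white": 1, "blue": 1}
total_parts = sum(ratio.values())
proportions = {color: parts / total_parts for color, parts in ratio.items()}

# Expected count in each category is n times the hypothesized proportion.
n = sum(observed.values())
expected = {color: n * p for color, p in proportions.items()}

# Chi-square statistic: sum over categories of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
)

# Goodness-of-fit degrees of freedom: number of categories minus one.
degrees_of_freedom = len(observed) - 1

print(expected)
print(chi_square, degrees_of_freedom)
```

The single dictionary of hypothesized proportions is what marks this as goodness-of-fit: there is one sample, one variable, and one pre-specified distribution to compare against.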

Exam tip: If the problem gives you a pre-specified set of proportions or a ratio to test against, and there is only one sample and one categorical variable, it is almost certainly a goodness-of-fit test.

3. Identifying a Chi-Square Test for Homogeneity ★★★☆☆ ⏱ 3 min

A common source of confusion: tests for homogeneity produce two-way contingency tables, just like tests for independence, but the sampling design is the key difference. For homogeneity, you sample separately from each pre-defined group, so group sizes are fixed before data collection.

The research question for homogeneity is always: *Does the distribution of [response variable] differ across [multiple groups]?* The null hypothesis is that the distribution of the response variable is the same for all groups, and the alternative is that at least one group has a different distribution.

📐 Worked Example

A researcher wants to test whether the distribution of vaccine acceptance (fully accepting, hesitant, refusing) differs between three groups of adults: rural, suburban, and urban residents. The researcher selects independent random samples of 200 adults from each of the three residence types, then records each adult's vaccine acceptance category. What inference procedure is appropriate?

1. Count groups and variables: we have three independent groups (rural, suburban, urban), with sample sizes fixed in advance at 200 per group. We measure one categorical response variable: vaccine acceptance, with three levels.
2. Research goal: test whether the distribution of vaccine acceptance is the same (homogeneous) across the three residence groups.
3. We are not testing association between two variables from one single sample, and we are not fitting to a pre-specified distribution, so it cannot be GOF or independence.
4. Therefore, the appropriate procedure is a chi-square test for homogeneity.
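
The sketch below shows how the fixed group sizes appear in a two-way table for this design. The cell counts are hypothetical (the example reports only the 200-per-group sample sizes), and `expected_counts` is an illustrative helper name, not something defined in this guide.

```python
def expected_counts(table):
    """Expected counts under H0 for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Each expected count is (row total * column total) / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts, invented for illustration only.
# Rows: rural, suburban, urban. Columns: accepting, hesitant, refusing.
observed = [
    [110, 60, 30],
    [130, 50, 20],
    [150, 40, 10],
]

# Row totals are fixed by the design at 200 per residence group.
row_totals = [sum(row) for row in observed]

expected = expected_counts(observed)

# Chi-square statistic summed over every cell of the table.
chi_square = sum(
    (obs - exp) ** 2 / exp
    for obs_row, exp_row in zip(observed, expected)
    for obs, exp in zip(obs_row, exp_row)
)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(row_totals)
print(expected)
print(chi_square, degrees_of_freedom)
```

Because every group has the same sample size here, the expected counts are identical in each row, which is exactly what the null hypothesis of one common distribution predicts.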

Exam tip: If the problem explicitly states it took separate random samples from each of multiple groups and wants to compare distributions, it is always a test for homogeneity.

4. Identifying a Chi-Square Test for Independence ★★★☆☆ ⏱ 3 min

The null hypothesis is that the two variables are independent in the population; the alternative is that they are dependent (associated). Like the test for homogeneity, this uses a two-way contingency table, but the sampling design differs. For independence, you take one random sample and classify each individual on both variables. No group totals are fixed in advance, so both the row totals and the column totals are random.

Features that identify a test for independence:

- One random sample from the population
- Two categorical variables measured on each individual
- Research question asks if there is an association or relationship between the two variables

📐 Worked Example

A sociologist collects a random sample of 500 working adults in a large city. She records two variables for each adult: highest level of education completed (high school, bachelor’s, graduate degree) and whether they report being satisfied with their job (satisfied, dissatisfied). She wants to test whether job satisfaction is associated with education level. What is the appropriate inference procedure?

1. Count samples and variables: there is one random sample of 500 working adults, with two categorical variables measured on each individual (education level, job satisfaction).
2. No group sizes were fixed in advance: the number of adults with each education level is a random result of sampling, not set by the researcher.
3. The research question asks whether there is an association between the two variables, which matches the goal of a test for independence.
4. Therefore, the appropriate procedure is a chi-square test for independence.
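
Step 2 can be made concrete with a small Python sketch that tallies per-person records into cell and margin counts. The six records are hypothetical and far too few for a real test; they only illustrate that, in this design, the education totals come out of the tally rather than being set by the researcher.

```python
from collections import Counter

# Hypothetical records, invented for illustration only:
# one (education, satisfaction) pair per sampled adult.
records = [
    ("high school", "satisfied"),
    ("bachelor's", "satisfied"),
    ("graduate", "dissatisfied"),
    ("high school", "dissatisfied"),
    ("bachelor's", "satisfied"),
    ("high school", "satisfied"),
]

# Each cell of the two-way table counts individuals with that pair of values.
cell_counts = Counter(records)

# Both sets of margin totals are computed from the sample after the fact.
education_totals = Counter(education for education, _ in records)
satisfaction_totals = Counter(satisfaction for _, satisfaction in records)

print(cell_counts)
print(education_totals)
print(satisfaction_totals)
```

In a homogeneity design, by contrast, one set of totals would be chosen before any data were collected.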

Exam tip: If the research question asks 'is there an association between' two categorical variables and the data come from a single random sample, it is a test for independence.

5. Concept Check ★★★☆☆ ⏱ 5 min

Common Pitfalls

Why: Both produce the same test statistic calculation, so students assume they are interchangeable, but AP grading requires matching procedure to study design

Why: Students see 'distribution' and default to goodness-of-fit regardless of the number of groups

Why: Students see proportions and default to z-procedures, which are only for one or two proportions

Why: Any 2x2 table can use either procedure, but AP questions expect the procedure matching the research question

Why: Students only remember to check conditions for calculation, not when justifying procedure selection

Quick Reference Cheatsheet
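
The selection rules from sections 2 through 4 can be condensed into a small decision helper. This is a rough sketch, not an official rubric: the function name `select_chi_square_procedure` and its three parameters are hypothetical, and real exam questions still require reading the study design in context.

```python
def select_chi_square_procedure(num_samples, num_variables, has_hypothesized_distribution):
    """Suggest a chi-square procedure from three features of the study design.

    num_samples: number of independent random samples (or groups sampled separately)
    num_variables: number of categorical variables measured on each individual
    has_hypothesized_distribution: True if specific proportions or a ratio are given
    """
    # One sample, one variable, compared against pre-specified proportions.
    if num_samples == 1 and num_variables == 1 and has_hypothesized_distribution:
        return "goodness-of-fit"
    # Separate samples from several groups, one response variable compared across them.
    if num_samples >= 2 and num_variables == 1 and not has_hypothesized_distribution:
        return "homogeneity"
    # One sample, two variables measured on every individual.
    if num_samples == 1 and num_variables == 2 and not has_hypothesized_distribution:
        return "independence"
    return "no match: re-read the study design"


# The three worked examples from this guide.
flower_colors = select_chi_square_procedure(1, 1, True)
vaccine_acceptance = select_chi_square_procedure(3, 1, False)
job_satisfaction = select_chi_square_procedure(1, 2, False)

print(flower_colors, vaccine_acceptance, job_satisfaction)
```

The helper deliberately asks about the sampling design first, since that is the feature that separates homogeneity from independence when both produce the same kind of two-way table.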
