Statistics · Inference for Categorical Data: Chi-Square · 14 min read · Updated 2026-05-11

Introducing Chi-Square — AP Statistics


1. What is Chi-Square? ★☆☆☆☆ ⏱ 3 min

Chi-square (written $\chi^2$, pronounced "kigh-square") is the foundation for all inference on categorical data in AP Statistics, making up 6-10% of the total AP exam weight per the official College Board CED. Unlike z- or t-tests, which are designed for quantitative data, chi-square procedures are built specifically for count data from categorical variables.

The core idea unifying all chi-square inference is comparing the observed counts in our sample to the expected counts we would see if the null hypothesis were true. The $\chi^2$ distribution is a family of right-skewed, non-negative distributions whose shape depends only on degrees of freedom, rather than on sample size directly. This topic appears on both the MCQ and FRQ sections of the AP exam, and is a strict prerequisite for all more advanced chi-square procedures.
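To see how the shape depends on degrees of freedom, the short Python sketch below prints the mean, standard deviation, and skewness of a few $\chi^2$ distributions. It relies on the standard results mean $= df$, variance $= 2 \cdot df$, and skewness $= \sqrt{8/df}$. The function name `chi2_shape` is our own choice for illustration, and nothing here is required on the AP exam.

```python
import math


def chi2_shape(df):
    """Return (mean, standard deviation, skewness) of a chi-square distribution.

    Illustrative helper (hypothetical name), based on the standard formulas
    mean = df, variance = 2 * df, skewness = sqrt(8 / df).
    """
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    return df, math.sqrt(2 * df), math.sqrt(8 / df)


# Compare a few members of the chi-square family.
for df in (1, 3, 10, 50):
    mean, sd, skew = chi2_shape(df)
    print(f"df={df:>2}  mean={mean}  sd={sd:.2f}  skewness={skew:.2f}")
```

The skewness is positive for every $df$, so the distribution is always right-skewed, but it shrinks toward 0 as $df$ grows.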

2. The Chi-Square Test Statistic ★★☆☆☆ ⏱ 4 min

The $\chi^2$ test statistic is the core calculation for all chi-square inference, quantifying how far observed data deviates from the expected distribution under the null hypothesis. The formula for the test statistic is:

$\chi^2 = \sum \frac{(O - E)^2}{E}$

where $O$ is the observed count for a category and $E$ is the expected count for that category under the null hypothesis. Each part of the formula has an intuitive purpose. First, we subtract $E$ from $O$ to get the raw deviation of observed from expected. We square the deviation so that every term is non-negative, which means positive and negative deviations cannot cancel each other out. We divide by $E$ to scale the deviation by the size of the expected count: a difference of 10 between $O$ and $E$ is much more meaningful when $E=10$ than when $E=100$. The $\chi^2$ distribution is always right-skewed, and it becomes more symmetric (approaching a normal distribution) as degrees of freedom increase, a consequence of the Central Limit Theorem.
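The scaling by $E$ is easy to check numerically. The sketch below, with a helper name `contribution` chosen only for illustration, computes one category's term $\frac{(O-E)^2}{E}$ for the same raw difference of 10 at two different expected counts.

```python
def contribution(observed, expected):
    """One category's term in the chi-square sum: (O - E)^2 / E."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) ** 2 / expected


# Same raw deviation of 10, very different contributions.
print(contribution(20, 10))    # 10.0
print(contribution(110, 100))  # 1.0
```

The same raw gap contributes ten times as much when the expected count is 10 as when it is 100.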

📐 Worked Example

A coffee shop claims that their four sizes of iced coffee are equally popular: 25% small, 25% medium, 25% large, 25% extra-large. A barista takes a random sample of 80 orders during a week and gets: 17 small, 26 medium, 24 large, 13 extra-large. Calculate the chi-square test statistic for this data.

List observed counts for each category:
$O_{\text{small}}=17$, $O_{\text{medium}}=26$, $O_{\text{large}}=24$, $O_{\text{xl}}=13$. Total: $n=80$.
Calculate expected counts: the null proportion for each category is 0.25, so $E = 80 \times 0.25 = 20$ for all categories.
Calculate $\frac{(O-E)^2}{E}$ for each category:
Small: $\frac{(17-20)^2}{20} = 0.45$
Medium: $\frac{(26-20)^2}{20} = 1.8$
Large: $\frac{(24-20)^2}{20} = 0.8$
Extra-large: $\frac{(13-20)^2}{20} = 2.45$
Sum all terms to get the test statistic:
$\chi^2 = 0.45 + 1.8 + 0.8 + 2.45 = 5.5$
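The same calculation can be double-checked in a few lines of Python. This is a minimal sketch that uses the counts from the coffee shop example; the function name `chi_square_statistic` is our own, and on the exam you would still write out the steps by hand or use your calculator.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Coffee shop example: small, medium, large, extra-large.
observed = [17, 26, 24, 13]
n = sum(observed)                 # 80 orders in total
expected = [n * 0.25] * 4         # 20 expected in each size under the claim

stat = chi_square_statistic(observed, expected)
print(round(stat, 2))  # 5.5
```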

Exam tip: Always show your calculation steps to earn full credit on FRQs.

3. Conditions for Chi-Square Inference ★★☆☆☆ ⏱ 3 min

All inference relies on meeting conditions to ensure the p-value we calculate is reliable, and the AP exam almost always awards 1 point on FRQs for correctly stating and checking chi-square conditions. There are three required conditions:

**Random**: The data comes from a random sample from the population of interest, or a randomized experiment. This is identical to the random condition for all other inference procedures, and ensures we can generalize results to the population or establish causation (for experiments).
**Independence**: Individual observations are independent of each other. For sampling without replacement, this means the 10% condition holds: the sample size is less than 10% of the total population size.
**Large Counts**: The $\chi^2$ distribution is a continuous approximation to the discrete sampling distribution of the test statistic, so this condition ensures the approximation is accurate. The AP Stats CED accepts the rule: all expected counts are at least 1, and no more than 20% of expected counts are less than 5. A stricter rule (all expected counts ≥ 5) is also acceptable.
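The Large Counts check is mechanical enough to sketch in code. The helper below (the name `large_counts_ok` and its `strict` flag are our own, for illustration) applies either of the two rules stated above. It takes expected counts, not observed counts.

```python
def large_counts_ok(expected, strict=True):
    """Check the Large Counts condition on a list of EXPECTED counts.

    strict=True  -> every expected count is at least 5.
    strict=False -> every expected count is at least 1, and no more than
                    20% of the expected counts are below 5.
    """
    if strict:
        return all(e >= 5 for e in expected)
    below_five = sum(1 for e in expected if e < 5)
    return all(e >= 1 for e in expected) and below_five <= 0.2 * len(expected)


print(large_counts_ok([20, 20, 20, 20]))                   # True
print(large_counts_ok([20, 20, 20, 20, 4]))                # False
print(large_counts_ok([20, 20, 20, 20, 4], strict=False))  # True
```

In the last call one of five expected counts (exactly 20%) is below 5, so the looser rule passes while the stricter rule fails.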

4. Chi-Square Goodness-of-Fit Tests ★★★☆☆ ⏱ 5 min

A chi-square goodness-of-fit (GOF) test is the first full inference procedure introduced for chi-square, used to test whether the distribution of a single categorical variable matches a claimed null distribution.

The degrees of freedom for a GOF test are always $df = k - 1$, where $k$ is the number of categories of the variable. We lose 1 degree of freedom because the counts must sum to the total sample size: once $k - 1$ of the counts are known, the last one is determined. All chi-square tests are right-tailed: larger $\chi^2$ values mean larger deviations from the null, so the p-value is $P(\chi^2(df) \geq \text{calculated } \chi^2)$.
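On the exam this right-tail area comes from your calculator or a table, but it can also be sketched with only the Python standard library. The function below (named `chi2_sf` here as an illustration; it is not a library function) uses the closed forms for the upper tail when $df = 1$ and $df = 2$, then steps up two degrees of freedom at a time with a standard recurrence for the incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Right-tail probability P(chi-square(df) >= x) for whole-number df.

    Illustrative sketch: starts from the closed forms for df = 1 (erfc)
    and df = 2 (exponential), then adds one term per extra 2 degrees of freedom.
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(y))   # df = 1
    else:
        a, tail = 1.0, math.exp(-y)              # df = 2
    while a < df / 2.0:
        tail += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return tail


# 7.815 is the usual table critical value for df = 3 at the 0.05 level,
# so the tail area beyond it should be very close to 0.05.
print(chi2_sf(7.815, 3))
```

Because the test is right-tailed, this single upper-tail area is the whole p-value; there is no doubling as in a two-sided z- or t-test.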

📐 Worked Example

A geneticist claims that four possible phenotypes from a genetic cross occur in the proportions 9:3:3:1 (i.e., 9/16, 3/16, 3/16, 1/16). A random sample of 160 offspring gives the following observed counts: 86, 31, 29, 14. Conduct a chi-square GOF test at $\alpha=0.05$ to test the geneticist's claim.

**Hypotheses**: $H_0$: The phenotypic distribution matches the 9:3:3:1 genetic ratio. $H_a$: The phenotypic distribution differs from the claimed ratio.
**Check conditions**: A random sample is given. The population of offspring is far larger than $10 \times 160 = 1600$, so the 10% condition holds. Expected counts: $160 \times 9/16 = 90$, $160 \times 3/16 = 30$, $160 \times 3/16 = 30$, $160 \times 1/16 = 10$. All expected counts are ≥ 5, so the large counts condition is satisfied.
**Calculate test statistic and df**:
$\chi^2 = \frac{(86-90)^2}{90} + \frac{(31-30)^2}{30} + \frac{(29-30)^2}{30} + \frac{(14-10)^2}{10} \approx 1.844$
$df = 4 - 1 = 3$
**Conclusion**: $P(\chi^2(3) \geq 1.844) \approx 0.605$. Since $0.605 > 0.05$, we fail to reject $H_0$. There is not convincing evidence that the phenotypic distribution differs from the geneticist's claimed 9:3:3:1 ratio.
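The whole test can be reproduced in a short Python sketch as a check on the hand calculation. The variable names are our own, and the p-value line uses a closed form for the upper tail that is valid only for $df = 3$, which is an assumption specific to this example.

```python
import math
from fractions import Fraction

# Genetics example: observed counts and the claimed 9:3:3:1 ratio.
observed = [86, 31, 29, 14]
ratio = [9, 3, 3, 1]
alpha = 0.05

n = sum(observed)                                         # 160 offspring
expected = [Fraction(n * r, sum(ratio)) for r in ratio]   # 90, 30, 30, 10

# Conditions: every expected count is at least 5.
assert all(e >= 5 for e in expected)

# Test statistic and degrees of freedom.
stat = float(sum((o - e) ** 2 / e for o, e in zip(observed, expected)))
df = len(observed) - 1

# Right-tail p-value; this closed form holds for df = 3 only.
p_value = (math.erfc(math.sqrt(stat / 2))
           + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2))

print(round(stat, 3), df, round(p_value, 3))  # 1.844 3 0.605
print("reject H0" if p_value < alpha else "fail to reject H0")  # fail to reject H0
```

Failing to reject $H_0$ does not show that the 9:3:3:1 ratio is correct; it only means this sample does not give convincing evidence against it.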

Common Pitfalls

Why: Students confuse the null proportion for a category with the expected count, forgetting to multiply by total sample size

Why: Students forget that the total sample size uses up one degree of freedom, and confuse GOF degrees of freedom with other chi-square procedures

Why: Students are used to two-tailed tests for z and t, and forget only large deviations count as evidence against $H_0$

Why: Students mix up observed and expected counts when checking conditions

Why: Students carry over bad habits from earlier hypothesis testing, forgetting we can never prove the null is true

Quick Reference Cheatsheet
