Statistics · Unit 3: Collecting Data · 14 min read · Updated 2026-05-11

How to Experiment Well — AP Statistics


1. Core Principles of Experimental Design ★★☆☆☆ ⏱ 4 min

A well-designed experiment allows researchers to isolate the effect of an explanatory variable (called a factor) on a response variable, enabling valid causal inference that cannot be drawn from observational studies. Unlike observational studies, experiments impose treatments on experimental units (called subjects if they are human) to measure response. Per the AP Statistics CED, this topic makes up ~40% of Unit 3 and 4-6% of the total AP exam score.

**Control**: Include a comparison group and keep other conditions the same across groups, so that the treatment effect is not confounded (mixed up) with the effects of lurking variables. Control groups may receive no treatment, a placebo, or an existing standard treatment.
**Randomization**: Randomly assign experimental units to treatment groups, balancing both known and unknown lurking variables across groups on average.
**Replication**: Assign each treatment to multiple independent experimental units so that chance variation among units averages out, making it easier to detect real treatment effects.
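The effect of replication can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions: responses are drawn from a normal distribution with mean 20 and standard deviation 5 (hypothetical values), and the treatment has no real effect, so any difference between group means is due to chance alone. The function name `null_difference_spread` is illustrative only.

```python
import random
import statistics

def null_difference_spread(units_per_group, trials=2000, seed=1):
    """Standard deviation of (mean A - mean B) when the treatment has NO effect."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        # Both groups come from the same distribution (assumed mean 20, SD 5),
        # so every observed difference is pure chance variation.
        group_a = [rng.gauss(20, 5) for _ in range(units_per_group)]
        group_b = [rng.gauss(20, 5) for _ in range(units_per_group)]
        diffs.append(statistics.mean(group_a) - statistics.mean(group_b))
    return statistics.stdev(diffs)

spread_small = null_difference_spread(units_per_group=2)
spread_large = null_difference_spread(units_per_group=30)
print(f"2 per group:  chance differences have SD about {spread_small:.2f}")
print(f"30 per group: chance differences have SD about {spread_large:.2f}")
```

With more units per group, chance differences between group means shrink, so a real treatment effect of a given size stands out more clearly.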

📐 Worked Example

A researcher wants to test whether a new over-the-counter sleep aid reduces time to fall asleep compared to a current popular brand. She recruits 60 adult volunteers with occasional insomnia. Describe how she would implement all three core principles.

**Control**: The researcher will compare the new sleep aid to the existing popular brand (the standard treatment). Because both groups take a sleep aid, any placebo effect is present in both groups, allowing a valid comparison between the two products.
**Randomization**: Label each volunteer 1 to 60, then use a random number generator to select 30 unique numbers. Those volunteers go to the new sleep aid group, the remaining 30 to the existing brand. This balances pre-existing differences like baseline insomnia severity across groups.
**Replication**: 30 volunteers are assigned to each treatment, rather than one per treatment. Multiple units per treatment account for natural variability in time to fall asleep, making it less likely that an observed difference is due to chance alone.
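The randomization step in this example could be carried out as follows. This is one possible sketch using Python's standard `random` module; the function name `assign_two_groups` and the seed value are illustrative choices, not part of the original example.

```python
import random

def assign_two_groups(n_units=60, group_size=30, seed=None):
    """Randomly split units labeled 1..n_units into two treatment groups."""
    rng = random.Random(seed)
    labels = list(range(1, n_units + 1))
    # Select 30 unique labels at random for the new sleep aid.
    new_aid = sorted(rng.sample(labels, group_size))
    chosen = set(new_aid)
    # Everyone not selected receives the existing brand.
    existing = [label for label in labels if label not in chosen]
    return new_aid, existing

# The seed is fixed only so the assignment can be reproduced.
new_aid_group, existing_brand_group = assign_two_groups(seed=2026)
print("New sleep aid: ", new_aid_group)
print("Existing brand:", existing_brand_group)
```

Sampling without replacement guarantees that no volunteer is picked twice and that each group ends up with exactly 30 volunteers.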

2. Common Experimental Designs ★★★☆☆ ⏱ 4 min

Experiments are structured into three common designs based on whether researchers know of any nuisance variables (variables that affect response but are not of interest) that need to be accounted for.

**Completely Randomized Design (CRD)**: The simplest design, where all experimental units are randomly assigned to treatments with no pre-grouping. Used when no known systematic differences between units exist.
**Randomized Block Design (RBD)**: Units are grouped into blocks by a known nuisance variable, so all units in a block are similar on the nuisance variable. Random assignment of treatments happens *within each block*. Blocking reduces unwanted variability, making it easier to detect treatment effects.
**Matched Pairs Design**: A special case of RBD where each block has exactly two units matched on similar characteristics. One unit gets each treatment. Alternatively, each unit gets both treatments in random order (repeated measures matched pairs), with the unit acting as its own block.
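For a matched pairs design, randomization amounts to a coin flip within each pair. The sketch below assumes hypothetical unit names and two generic treatments labeled "A" and "B"; for the repeated-measures version, the same shuffle could instead decide the order in which a single unit receives the two treatments.

```python
import random

def assign_within_pairs(pairs, treatments=("A", "B"), seed=None):
    """For each matched pair, randomly decide which unit gets which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # the "coin flip" for this pair
        assignment[first] = order[0]
        assignment[second] = order[1]
    return assignment

# Hypothetical pairs, already matched on similar characteristics.
pairs = [("unit1", "unit2"), ("unit3", "unit4"), ("unit5", "unit6")]
pair_plan = assign_within_pairs(pairs, seed=7)
for first, second in pairs:
    print(first, "->", pair_plan[first], "|", second, "->", pair_plan[second])
```

Each pair always contains one unit per treatment; only which unit receives which treatment is left to chance.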

📐 Worked Example

A researcher tests three doses of a new allergy medication: 10mg, 20mg, and 30mg. She knows allergy symptoms are much more severe for people with pet allergies than people with only seasonal allergies. Should she use a completely randomized or randomized block design? Describe the appropriate design.

A randomized block design is appropriate here, because allergy type is a known nuisance variable that affects response. Blocking removes variability from allergy type, making it easier to detect a dose effect.
First, form two blocks: Block 1 contains all participants with pet allergies, Block 2 contains all participants with only seasonal allergies.
Within each block, randomly assign each participant to one of the three dose groups, so each dose gets an equal number of participants in each block.
After four weeks, measure allergy symptom severity for each participant, then compare the mean severity across the three doses within each block before combining results across blocks.
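The within-block random assignment described above might look like the following sketch. The block sizes (12 participants with pet allergies, 18 with only seasonal allergies) and the participant labels are hypothetical, since the example does not state how many participants were recruited.

```python
import random
from collections import Counter

def randomize_within_blocks(blocks, treatments, seed=None):
    """Shuffle the units inside each block, then deal them out to treatments."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomization happens separately in each block
        # Dealing in rotation keeps treatment group sizes within a block
        # equal (or within one of each other if sizes do not divide evenly).
        for position, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[position % len(treatments)])
    return assignment

# Hypothetical participants: P1..P12 have pet allergies, S1..S18 seasonal only.
blocks = {
    "pet": [f"P{i}" for i in range(1, 13)],
    "seasonal": [f"S{i}" for i in range(1, 19)],
}
doses = ["10mg", "20mg", "30mg"]
block_plan = randomize_within_blocks(blocks, doses, seed=3)

# Count how many participants each dose received in each block.
counts = Counter(block_plan.values())
for (block_name, dose), n in sorted(counts.items()):
    print(block_name, dose, n)
```

No participant is ever moved between blocks; chance decides only which dose each participant receives inside their own block.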

3. Scope of Inference ★★★★☆ ⏱ 3 min

A key AP Statistics skill is identifying what types of valid conclusions can be drawn from an experiment, based on its design. There are two separate questions to answer for any study.

**Can we conclude the treatment caused the difference in response?**: Causal conclusions are only valid if treatments were *randomly assigned* to units. Without random assignment, you can only conclude association, not causation.
**Can we generalize results to a larger population?**: Generalization is only valid if experimental units were *randomly sampled* from the population of interest. A convenience sample (like volunteer students) does not allow generalization.
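The two questions above are independent of each other, which gives four possible combinations. As a rough summary, they can be written as a small lookup function; the function name `scope_of_inference` and the wording of the returned strings are illustrative only.

```python
def scope_of_inference(random_assignment, random_sampling):
    """Return the (cause, reach) conclusions a study design can support."""
    # Random assignment of treatments is what licenses a causal conclusion.
    cause = "causation" if random_assignment else "association only"
    # Random sampling of units is what licenses generalization.
    reach = (
        "generalize to the population sampled"
        if random_sampling
        else "limited to units like those in the study"
    )
    return cause, reach

# Print all four combinations of the two design questions.
for assigned in (True, False):
    for sampled in (True, False):
        print(f"assignment={assigned}, sampling={sampled}:",
              scope_of_inference(assigned, sampled))
```

For instance, an experiment on volunteers with random assignment supports a causal conclusion, but only for individuals like those who took part.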

📐 Worked Example

A college professor tests whether standing during lectures improves student test scores. He has two sections: 8am and 1pm. He assigns the 8am section to stand, 1pm to sit. He finds 8am scores 7% higher on the final. Can he conclude standing caused the higher score? Name a confounding variable, and state the valid inference.

No, the professor cannot conclude causation, because there was no random assignment of treatments: sections were assigned by meeting time, not randomization.
A possible confounding variable is student motivation: students who choose 8am classes may be more motivated than those who choose 1pm classes. Time of day itself is another candidate, since students may simply be more alert at one time than the other. Each of these is confounded with standing, so we cannot tell which caused the higher score.
Because the professor used a convenience sample of his own students, he also cannot generalize results to all college students. The only valid inference is that there is an association between standing and higher scores in this specific group.

4. Concept Check ★★★☆☆ ⏱ 3 min

Common Pitfalls

Why: Students confuse random sampling (for generalization) with random assignment (the core requirement for a valid experiment that allows causation).

Why: The simplified definition of a control group as "the group that gets no treatment" ignores that control groups often get a standard existing treatment or a placebo.

Why: Students think blocks are another factor to test, when blocks are nuisance variables grouped to reduce unwanted variability, not variables of interest.

Why: Students confuse post-experiment confirmation with replication within the original experiment.

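Why: Students think matching removes the need for randomization, but treatments must still be randomly assigned within each pair (or given in random order to each unit), because order effects can otherwise confound results.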

Why: Students assume any study called an experiment can support causation, but only random assignment enables causal inference.

Quick Reference Cheatsheet
