The Sampling Distribution of a Sample Proportion — AP Statistics
1. Definition and Core Notation ★★☆☆☆ ⏱ 3 min
When we collect a random sample from a population with a categorical trait (e.g., voter support, defective products), we calculate a sample proportion $\hat{p}$ to estimate the true population proportion $p$. This subtopic makes up 5-8% of your total AP exam score, appearing on both multiple choice and free response, often as a prerequisite for inference questions.
- $p$: Fixed, unknown population proportion (parameter)
- $\hat{p}$: Sample proportion (statistic, varies per sample)
- $n$: Sample size
- $\mu_{\hat{p}}$: Mean of the sampling distribution
- $\sigma_{\hat{p}}$: Standard deviation (standard error) of the sampling distribution
2. Center and Spread of the Sampling Distribution of $\hat{p}$ ★★☆☆☆ ⏱ 4 min
For any sampling distribution of $\hat{p}$, the center is always equal to the true population proportion:
$$\mu_{\hat{p}} = p$$
This property means $\hat{p}$ is an unbiased estimator of $p$: the average of $\hat{p}$ across all possible random samples of the same size equals the true population proportion exactly. Individual samples still land above or below $p$, but there is no systematic over- or under-estimation.
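The claim $\mu_{\hat{p}} = p$ can be checked numerically. The sketch below is a minimal illustration, assuming independent draws so that the count of successes is Binomial($n$, $p$); the function names and the values $p = 0.3$ and $n \in \{10, 50, 200\}$ are hypothetical choices, not part of the course material.

```python
from math import comb

def phat_distribution(n, p):
    """List (p_hat, probability) pairs, assuming X ~ Binomial(n, p) and p_hat = X / n."""
    return [(k / n, comb(n, k) * p**k * (1 - p)**(n - k)) for k in range(n + 1)]

def mean_of_phat(n, p):
    """Exact mean of the sampling distribution: sum of value * probability."""
    return sum(value * prob for value, prob in phat_distribution(n, p))

# Hypothetical values chosen only for illustration.
for n in (10, 50, 200):
    print(n, round(mean_of_phat(n, 0.3), 6))
```

Whatever sample size is used, the computed mean should agree with $p$ up to floating-point rounding, which is what unbiasedness says.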
The standard deviation (standard error) of $\hat{p}$ follows the formula:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Intuition: Variability is highest when $p=0.5$, and decreases as $p$ approaches 0 or 1. Larger samples produce less variable estimates: to cut the standard error in half, you need a sample 4 times as large.
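Both parts of this intuition can be seen by plugging numbers into the formula. The following is a small sketch with hypothetical values ($n = 100$ and $n = 400$, and a handful of $p$ values); `sd_phat` is an illustrative helper name.

```python
from math import sqrt

def sd_phat(p, n):
    """Standard deviation of the sampling distribution of p_hat."""
    return sqrt(p * (1 - p) / n)

# Spread peaks at p = 0.5 and shrinks toward 0 or 1 (hypothetical n = 100).
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(sd_phat(p, 100), 4))

# Quadrupling the sample size halves the standard deviation.
ratio = sd_phat(0.5, 400) / sd_phat(0.5, 100)
print("ratio after quadrupling n:", ratio)
```

The ratio comes out to one half because $n$ sits under a square root: multiplying $n$ by 4 divides $\sigma_{\hat{p}}$ by $\sqrt{4} = 2$.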
Exam tip: Always distinguish between $\hat{p}$ (a single value from your sample) and $\mu_{\hat{p}}$ (the mean of all possible $\hat{p}$ values) — mixing up these two notations is a common point deduction on FRQs.
3. Conditions for a Normal Approximation ★★★☆☆ ⏱ 3 min
To use the Normal distribution to calculate probabilities for $\hat{p}$, two conditions must be satisfied, each for a different purpose:
- **10% Condition**: When sampling without replacement, $n \leq 0.1N$ (sample size no more than 10% of population size). This makes dependence from sampling without replacement negligible, so the standard deviation formula is valid.
- **Large Counts (Normal) Condition**: $np \geq 10$ and $n(1-p) \geq 10$ (expected successes and expected failures both at least 10). This ensures the sampling distribution is close enough to Normal for a Normal approximation to be reasonable. The AP Course and Exam Description (CED) requires 10, not the 5 used in some older textbooks.
Exam tip: On FRQs, you must explicitly name each condition and show your calculation for the check to earn full credit — just saying "conditions are met" earns zero points for the condition step.
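The two checks can be written as a pair of small functions. This is only a sketch: the function names, the floating-point tolerance, and the example numbers (a sample of 150 from a population of 5000 with $p = 0.6$) are assumptions made for illustration.

```python
def ten_percent_ok(n, population_size):
    """10% Condition: the sample is at most 10% of the population."""
    return n <= 0.10 * population_size

def large_counts_ok(n, p, minimum=10):
    """Large Counts Condition: expected successes and failures both reach the minimum."""
    tol = 1e-9  # guards against floating-point rounding right at the boundary
    return n * p >= minimum - tol and n * (1 - p) >= minimum - tol

# Hypothetical scenario: n = 150 drawn from N = 5000, with p = 0.6.
n, N, p = 150, 5000, 0.6
print("10% condition:", ten_percent_ok(n, N))         # 150 <= 500
print("Large counts:", large_counts_ok(n, p))         # 90 and 60 are both >= 10
print("Small sample:", large_counts_ok(40, 0.1))      # np = 4, so the check fails
```

On an FRQ the same arithmetic should be written out by hand next to the name of each condition.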
4. Calculating Probabilities for $\hat{p}$ ★★★☆☆ ⏱ 4 min
Once both conditions are satisfied, the sampling distribution of $\hat{p}$ is approximately Normal with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$. To find the probability that $\hat{p}$ falls in any range, convert $\hat{p}$ to a z-score, then use the standard Normal distribution:
$$z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$$
Exam tip: When $p$ is known (always true for sampling distribution problems before inference), always use $p$ to calculate $\sigma_{\hat{p}}$. Never use $\hat{p}$ here; substituting $\hat{p}$ is reserved for confidence intervals, where $p$ is unknown.
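A worked example of the z-score calculation, as a sketch under assumed numbers: suppose $p = 0.60$, $n = 150$, and we want $P(\hat{p} > 0.65)$. These values are hypothetical, and the Normal CDF is built from the standard library's `math.erf` rather than a statistics package.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def prob_phat_above(cutoff, p, n):
    """Approximate P(p_hat > cutoff), assuming both conditions have been checked."""
    sd = sqrt(p * (1 - p) / n)   # uses the known p, not p_hat
    z = (cutoff - p) / sd
    return 1 - normal_cdf(z)

# Hypothetical values: sd = sqrt(0.6 * 0.4 / 150) = 0.04, so z = 0.05 / 0.04 = 1.25.
prob = prob_phat_above(0.65, p=0.60, n=150)
print(round(prob, 4))
```

With these numbers the z-score is 1.25 and the upper-tail probability is about 0.106, matching what a standard Normal table gives for $P(Z > 1.25)$.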
Common Pitfalls
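Pitfall: Using $\hat{p}$ instead of $p$ in the standard deviation formula for a sampling distribution problem. Why: Students confuse this with confidence interval inference, where we do not know $p$ and so use $\hat{p}$ to estimate the standard error.
Pitfall: Reporting $\frac{p(1-p)}{n}$ as the standard deviation. Why: Students memorize the variance $\frac{p(1-p)}{n}$ but forget that the standard deviation is the square root of the variance.
Pitfall: Citing the wrong condition for the wrong purpose, such as using the 10% Condition to justify Normality. Why: Students memorize the two conditions but do not learn what each one checks.
Pitfall: Describing unbiasedness as "$\hat{p}$ equals $p$" rather than "$\mu_{\hat{p}} = p$". Why: Students memorize that random sampling gives unbiased estimators but do not state the definition correctly.
Pitfall: Checking the Large Counts Condition against 5 instead of 10. Why: Some older textbooks use 5, but the AP Statistics CED requires 10.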