Computer Science Principles · AP CSP CED Unit: Data · 16 min read · Updated 2026-05-11
Data — AP Computer Science Principles
1. Binary Numbers and Abstraction★★☆☆☆⏱ 5 min
In digital computing, all data is stored as sequences of binary digits (bits), which use the base-2 number system with only two values: 0 and 1. This matches the two stable states of electronic circuits (on/off), making binary the universal low-level encoding for all digital data.
Abstraction hides low-level binary implementation details, so users only interact with higher-level usable formats like images or text. To convert a binary number to decimal (base-10), multiply each bit by its corresponding power of 2, starting from $2^0$ for the rightmost bit, then sum all products:
$$\text{Decimal} = \sum_{i=0}^{n-1} \left(\text{bit}_i \times 2^i\right)$$
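The sum above can be sketched in a few lines of Python. This is a minimal illustration, not part of the AP CSP reference material; the function name `binary_to_decimal` and the choice to pass the bits as a string are assumptions made for the example.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to its decimal value."""
    total = 0
    # Walk from the rightmost bit (i = 0) to the leftmost bit (i = n - 1).
    for i, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        # Each bit contributes bit_i * 2^i to the sum.
        total += int(bit) * 2 ** i
    return total


print(binary_to_decimal("1011"))  # 1*8 + 0*4 + 1*2 + 1*1 = 11
```

Python's built-in `int("1011", 2)` performs the same conversion; the loop is written out here only to mirror the formula term by term.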
2. Data Compression: Lossy vs Lossless★★☆☆☆⏱ 3 min
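Data compression reduces file size to save storage space and speed up transmission over networks. Two core types are tested on the AP CSP exam, each with distinct use cases: lossless compression, which lets the original data be reconstructed exactly (suited to text and program files), and lossy compression, which permanently discards some data to produce smaller files (suited to photos, audio, and video, where a small loss of quality is acceptable).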
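The difference between the two types can be seen in a short Python sketch. The lossless half uses the standard-library `zlib` module; the lossy half is a toy stand-in (rounding values down to a coarser step) chosen only for illustration, and real lossy formats such as JPEG or MP3 use far more sophisticated methods. The sample data and the `quantize` helper are assumptions made for this example.

```python
import zlib

# Lossless: the decompressed bytes match the original exactly.
original = b"AAAAAAAABBBBBBBBCCCCCCCC" * 20
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
lossless_ok = restored == original

print(len(original), "bytes before,", len(compressed), "bytes after")
print("Exact round trip:", lossless_ok)


# Lossy (toy illustration): round each value down to a multiple of `step`.
# Fewer distinct values are easier to store, but the detail is gone for good.
def quantize(values, step=16):
    return [(v // step) * step for v in values]


samples = [3, 18, 47, 200, 201, 255]   # hypothetical brightness values
quantized = quantize(samples)
print(quantized)  # [0, 16, 32, 192, 192, 240]
```

Note that 200 and 201 both become 192 after quantizing, so nothing in the stored data can tell them apart again. That is the permanent loss that makes lossy compression unsuitable for text or program files.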
3. Data Analysis and Visualization★★★☆☆⏱ 4 min
Data analysis is the process of collecting, cleaning, processing, and interpreting data to extract actionable insights. The standard workflow is:
**Collect**: Gather data from surveys, sensors, or public datasets
**Clean**: Remove invalid, duplicate, or incomplete entries
**Process**: Calculate summary metrics and identify relationships between variables
**Communicate**: Use visualization to make patterns easy to interpret
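The four steps above can be sketched end to end in Python using only the standard library. The survey rows, the 0 to 24 hour validity rule, and the `clean` helper are all invented for this illustration; a real project would pick cleaning rules that fit its own data.

```python
from statistics import mean

# Collect: hypothetical survey rows of (name, hours of sleep as text).
raw_rows = [
    ("Ana", "7.5"),
    ("Ben", ""),       # incomplete entry
    ("Ana", "7.5"),    # duplicate entry
    ("Cal", "-2"),     # invalid entry (negative hours)
    ("Dee", "6.0"),
    ("Eli", "9.0"),
]


def clean(rows):
    """Clean: drop duplicate, incomplete, and out-of-range rows."""
    seen = set()
    cleaned = []
    for name, hours_text in rows:
        if (name, hours_text) in seen:
            continue                      # duplicate
        seen.add((name, hours_text))
        try:
            hours = float(hours_text)
        except ValueError:
            continue                      # incomplete or non-numeric
        if not 0 <= hours <= 24:
            continue                      # invalid value
        cleaned.append((name, hours))
    return cleaned


rows = clean(raw_rows)

# Process: calculate a summary metric.
average = mean(hours for _, hours in rows)

# Communicate: a simple text bar chart, one '#' per whole hour.
for name, hours in rows:
    print(f"{name:<4}{'#' * int(hours)} {hours}")
print(f"Average: {average:.1f} hours")
```

Three of the six raw rows survive cleaning, which shows why the Clean step comes before Process: averaging the raw data would either fail on the blank entry or be skewed by the negative one.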
Data visualization uses graphs to display data, making patterns, trends, and outliers easier to spot than in raw spreadsheets. The four most common types tested are:
**Bar charts**: Compare values across discrete categories
**Line charts**: Show changes in a variable over time
**Scatter plots**: Show the relationship between two continuous variables
**Pie charts**: Show proportional shares of a whole
4. Information Privacy and Security★★☆☆☆⏱ 3 min
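Information privacy refers to the right of individuals to control how their personal data is collected, used, and shared. The core concept in this area is Personally Identifiable Information (PII): any data that can be used to identify a specific individual, such as a full name, home address, or Social Security number.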
Several common measures are used to protect private data:
**Encryption**: Scrambles data so it can only be read with a secret decryption key
**Authentication**: Verifies user identity via passwords, two-factor authentication (2FA), or biometrics to block unauthorized access
**Anonymization**: Removes PII from shared datasets to prevent identification of individuals
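As a rough illustration of the anonymization measure, the Python sketch below strips a fixed set of PII fields from a record before it is shared. The field names, the sample record, and the `anonymize` helper are hypothetical; deciding which fields count as PII in a real dataset takes careful review.

```python
# Assumed list of PII field names for this example only.
PII_FIELDS = {"name", "email", "phone"}


def anonymize(record: dict) -> dict:
    """Return a copy of the record with the listed PII fields removed."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


record = {
    "name": "Jordan Lee",              # made-up person
    "email": "jordan@example.com",
    "zip": "60601",
    "birth_year": 2008,
    "score": 91,
}

shared = anonymize(record)
print(shared)  # {'zip': '60601', 'birth_year': 2008, 'score': 91}
```

The remaining fields are not harmless by default. A ZIP code combined with a birth year can still narrow a record down to a small group of people, so dropping the obvious identifiers is a starting point rather than a guarantee of anonymity.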
Common privacy risks include data breaches, phishing attacks, and unauthorized tracking by advertisers. AP CSP exam questions often ask you to evaluate tradeoffs between convenience and privacy: for example, free map apps that track your location provide useful real-time traffic data, but they may also retain a long-term record of your location history.
5. Concept Check★★★☆☆⏱ 4 min
Common Pitfalls
Why: Students learn binary as a number system and forget it encodes all digital data types
Why: Students prioritize smaller file size and forget lossy compression causes permanent data loss
Why: Students see a clear pattern in a visualization and assume a direct causal link
Why: Internet speeds are listed in megabits per second, while file sizes are listed in megabytes
Why: Students think removing obvious PII like names makes data fully anonymous
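The megabits versus megabytes pitfall is easiest to see with a worked calculation. The sketch below assumes the usual convention of 8 bits per byte and ignores real-world overhead such as network congestion; the function name `download_seconds` and the sample numbers are made up for the example.

```python
BITS_PER_BYTE = 8


def download_seconds(file_megabytes: float, speed_megabits_per_sec: float) -> float:
    """Estimate download time for a file sized in MB over a link rated in Mbps."""
    # Convert the file size to megabits so both quantities use the same unit.
    file_megabits = file_megabytes * BITS_PER_BYTE
    return file_megabits / speed_megabits_per_sec


# A 100 MB file on a 50 Mbps connection: 100 * 8 = 800 Mb, and 800 / 50 = 16.
print(download_seconds(100, 50))  # 16.0
```

Dividing 100 by 50 directly would give 2 seconds, which is off by a factor of 8. Converting to a common unit first avoids the mistake.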