US Census Data¶
The US Census Bureau is one of the most widely used data sources for US-based GIS work. It's free, authoritative, and updated on a regular schedule.
What's available¶
| Product | What it gives you | Vintage |
|---|---|---|
| Decennial Census | Population, housing units; basics every 10 years | 2020, 2010, 2000 |
| American Community Survey (ACS) | Income, race, education, language, commute, etc. | 1-year (areas with 65,000+ people) and 5-year (all areas) |
| TIGER/Line | Geographies — boundaries, roads, water | Yearly |
| Population estimates | Annual updates between Censuses | Yearly |
| Economic Census | Businesses, employment by industry | Every 5 years |
| CPS / SIPP | Specialized population/economic surveys | Various |
For most GIS analysis you'll combine TIGER/Line geographies with ACS attributes.
Geographies (smallest → largest)¶
| Geography | Average size | Use it for |
|---|---|---|
| Block | ~50 housing units | Decennial-only; very granular |
| Block group | ~600–3,000 people | Smallest ACS geography |
| Census tract | ~4,000 people | Most common analytical unit |
| County | Wide variation | Regional analysis |
| ZCTA (zip-code-like) | Postal proxy | Marketing / business |
| Place | City / town boundary | Municipal analysis |
Choose tract for most projects
Tracts are the sweet spot between spatial detail and statistical reliability.
How to get the data¶
Option A — data.census.gov (web UI)¶
- Go to https://data.census.gov.
- Search a topic (e.g., "median household income").
- Filter by Geography (e.g., "All census tracts in Texas").
- Pick a Survey (ACS 5-year, latest).
- Click Download → CSV.
This is the easiest path if you only need 1–3 variables.
Option B — Census API (programmatic)¶
Get an API key: https://api.census.gov/data/key_signup.html
import requests
import pandas as pd

API_KEY = "YOUR_KEY"
year = 2022
ds = "acs/acs5"
get_vars = "NAME,B19013_001E"    # name + median household income
geo = "for=tract:*&in=state:48"  # all tracts in Texas (FIPS 48)
url = f"https://api.census.gov/data/{year}/{ds}?get={get_vars}&{geo}&key={API_KEY}"

resp = requests.get(url, timeout=30)
resp.raise_for_status()          # raise on HTTP errors (e.g., a malformed query)
data = resp.json()               # list of rows; the first row is the header

df = pd.DataFrame(data[1:], columns=data[0])
df["GEOID"] = df["state"] + df["county"] + df["tract"]  # 11-character text key
df["B19013_001E"] = pd.to_numeric(df["B19013_001E"], errors="coerce")
The variable codes (e.g., B19013_001E) are documented at: https://api.census.gov/data/2022/acs/acs5/variables.html
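The API returns a JSON list of rows, header first, with every value as a string. The sketch below shows one way to parse that shape with only the standard library, build the GEOID, and blank out placeholder values. The names `parse_acs_rows` and `SENTINELS` are hypothetical, the sample payload is invented for illustration, and the sentinel list is an assumption based on the Census Bureau's large negative placeholder codes (such as -666666666); check it against current Census documentation before relying on it.

```python
# Assumed ACS placeholder codes: large negative numbers meaning "no estimate".
# Verify this list against current Census documentation.
SENTINELS = {-222222222, -333333333, -555555555,
             -666666666, -888888888, -999999999}


def parse_acs_rows(data, value_col):
    """Turn the API's list-of-lists (header row first) into dict records."""
    header, rows = data[0], data[1:]
    records = []
    for row in rows:
        rec = dict(zip(header, row))
        # GEOID is text: state (2) + county (3) + tract (6) = 11 characters.
        rec["GEOID"] = rec["state"] + rec["county"] + rec["tract"]
        raw = rec[value_col]
        value = float(raw) if raw not in (None, "") else None
        rec[value_col] = None if value in SENTINELS else value
        records.append(rec)
    return records


# Illustrative payload in the API's shape; the numbers are invented.
sample = [
    ["NAME", "B19013_001E", "state", "county", "tract"],
    ["Tract A (illustrative)", "61250", "48", "201", "100100"],
    ["Tract B (illustrative)", "-666666666", "48", "201", "100200"],
    ["Tract C (illustrative)", None, "48", "201", "100300"],
]

records = parse_acs_rows(sample, "B19013_001E")
for rec in records:
    print(rec["GEOID"], rec["B19013_001E"])
```

Leaving placeholders as `None` rather than as numbers keeps them out of averages and class breaks when the table is mapped.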
Option C — pygris / tidycensus (Python / R)¶
For a friendlier path:
- Python: pip install pygris (geometries) + pip install census (variables)
- R: install.packages("tidycensus"), extremely popular among researchers
tidycensus example:
library(tidycensus)
library(sf)
income_tx <- get_acs(geography = "tract",
variables = "B19013_001",
state = "TX",
geometry = TRUE,
year = 2022)
plot(income_tx["estimate"])
Common ACS variables¶
| Variable | Code | Notes |
|---|---|---|
| Total population | B01003_001 | "Universe" pop |
| Median household income | B19013_001 | Inflation-adjusted to the survey's final year |
| Population below poverty | B17001_002 | Below poverty count |
| Hispanic or Latino | B03003_003 | Origin |
| Race — White alone | B02001_002 | |
| Race — Black/African Am. | B02001_003 | |
| Race — Asian | B02001_005 | |
| Owner-occupied housing | B25003_002 | |
| Renter-occupied housing | B25003_003 | |
| No vehicle households | B25044_003 + B25044_010 | |
| Foreign-born population | B05002_013 | |
| Bachelor's degree or higher | B15003_022..025 | Sum of 022 through 025 (population 25+) |
| Median age | B01002_001 | |
| Median gross rent | B25064_001 | |
| Median home value | B25077_001 | |
| Commute by public transit | B08301_010 | |
| Commute time | B08303_001 | Table total (workers who commute); the travel-time bins are B08303_002..013 |
Full list: https://api.census.gov/data/2022/acs/acs5/variables.html
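Some rows in the table above are sums over a run of codes (for example B15003_022..025). A minimal sketch of expanding such a range into API variable names and summing them for one record follows; `expand_codes` and `sum_codes` are hypothetical helper names, and the counts are invented.

```python
def expand_codes(table, first, last, suffix="E"):
    """List variable codes table_NNN<suffix> for first..last inclusive."""
    return [f"{table}_{n:03d}{suffix}" for n in range(first, last + 1)]


def sum_codes(record, codes):
    """Sum the listed codes in one record; None if any value is missing."""
    values = [record.get(code) for code in codes]
    if any(v is None for v in values):
        return None
    return sum(values)


college_codes = expand_codes("B15003", 22, 25)
print(college_codes)

# Hypothetical record for one tract; the counts are invented.
tract = {"B15003_022E": 900, "B15003_023E": 400,
         "B15003_024E": 80, "B15003_025E": 60}
print(sum_codes(tract, college_codes))  # 1440
```

Returning `None` when a component is missing avoids silently reporting a partial sum as if it were the full total.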
Joining ACS to TIGER/Line¶
The GEOID is your join key:
| Geography | GEOID format | Length |
|---|---|---|
| State | 48 | 2 |
| County | 48201 | 5 |
| Tract | 48201100100 | 11 |
| Block group | 482011001001 | 12 |
Always treat GEOID as TEXT, not a number — leading zeros matter ('06037' not 6037).
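A small sketch of repairing GEOIDs that lost their leading zeros, using the lengths from the table above. `normalize_geoid` is a hypothetical helper. It also assumes that long-form identifiers in some Census downloads look like `1400000US48201100100`, with the GEOID after the `US`; confirm that against your own file.

```python
GEOID_LENGTHS = {"state": 2, "county": 5, "tract": 11, "block group": 12}


def normalize_geoid(value, geography):
    """Return a zero-padded text GEOID of the right length for the geography."""
    text = str(value).strip()
    # Assumption: long-form IDs look like '1400000US48201100100';
    # keep only the part after 'US'.
    if "US" in text:
        text = text.split("US", 1)[1]
    length = GEOID_LENGTHS[geography]
    text = text.zfill(length)
    if len(text) != length or not text.isdigit():
        raise ValueError(f"not a valid {geography} GEOID: {value!r}")
    return text


print(normalize_geoid(6037, "county"))                   # 06037
print(normalize_geoid("1400000US48201100100", "tract"))  # 48201100100
```

When loading a CSV, reading the GEOID column as text from the start is safer than padding afterwards, since padding cannot tell a dropped zero from a wrong key.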
In ArcGIS Pro¶
- Add the TIGER tract shapefile.
- Open the ACS CSV and make sure GEOID is TEXT.
- Right-click tract layer → Joins and Relates → Add Join.
- Join field: GEOID (left) ↔ GEOID (right).
- Validate: the count of matches should equal the number of tracts.
→ See How to Clean Data Before Joining for traps.
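The validation step can also be checked before the data reaches ArcGIS Pro. The sketch below compares the two key lists and reports what fails to match on either side; `check_join` is a hypothetical helper and the keys are illustrative.

```python
def check_join(tract_geoids, acs_geoids):
    """Compare join keys and report what fails to match on either side."""
    tracts, acs = set(tract_geoids), set(acs_geoids)
    return {
        "matched": len(tracts & acs),
        "tracts_without_data": sorted(tracts - acs),
        "data_without_tract": sorted(acs - tracts),
    }


# Illustrative keys; '6037101110' lost its leading zero somewhere upstream.
tract_keys = ["06037101110", "06037101122", "48201100100"]
acs_keys = ["6037101110", "06037101122", "48201100100"]

report = check_join(tract_keys, acs_keys)
print(report)
```

A non-empty list on both sides with near-identical values usually points to a formatting problem (a dropped zero or a number-typed column) rather than genuinely missing tracts.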
Margins of error (MOE)¶
Every ACS estimate (suffix E) comes with a margin of error (suffix M, e.g., B19013_001M), published at the 90% confidence level. For small geographies (block groups, small tracts), the MOE is often large relative to the estimate. Always:
- Include MOE next to estimate when reporting.
- Aggregate up if MOE > ~50% of estimate.
- Prefer 5-year ACS over 1-year for less-populated areas; 1-year estimates are published only for areas with 65,000+ people.
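The checks above can be sketched in a few lines. The helper names are hypothetical and the block-group counts are invented. The sketch combines MOEs as the root sum of squares, the usual approximation for a sum of count estimates; it does not apply to medians such as B19013_001, which cannot be added across areas.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def moe_share(estimate, moe):
    """MOE as a fraction of the estimate (0.5 means MOE is 50% of it)."""
    if not estimate:
        return math.inf
    return abs(moe / estimate)


def coefficient_of_variation(estimate, moe):
    """Standard error divided by the estimate, with SE = MOE / 1.645."""
    if not estimate:
        return math.inf
    return abs((moe / Z_90) / estimate)


def aggregate_counts(estimates, moes):
    """Sum count estimates; combine MOEs as the root sum of squares."""
    total = sum(estimates)
    total_moe = math.sqrt(sum(m ** 2 for m in moes))
    return total, total_moe


# Invented block-group counts, each with an MOE above 50% of the estimate.
estimates = [120, 95, 140]
moes = [70, 60, 80]

total, total_moe = aggregate_counts(estimates, moes)
print(total, round(total_moe, 1), round(moe_share(total, total_moe), 2))
```

In this made-up case each block group fails the 50% rule on its own, while the combined area passes it, which is the point of aggregating up.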
"Recipes" you'll run all the time¶
Median income choropleth (county level)¶
Pull B19013_001 by county, join to county TIGER on GEOID, then symbolize as a choropleth in ArcGIS Pro. → See Population Choropleth project.
% no-vehicle households (block group)¶
pct_no_veh = (B25044_003 + B25044_010) / B25044_001 * 100  # no-vehicle owners + renters, over all occupied housing units
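A minimal sketch of that calculation with a guard for empty block groups. `pct_no_vehicle` is a hypothetical helper, and the inputs stand for B25044_003E, B25044_010E, and B25044_001E with invented values.

```python
def pct_no_vehicle(owner_no_veh, renter_no_veh, total_households):
    """Share of occupied housing units with no vehicle, in percent."""
    if not total_households:
        return None  # empty block group: avoid dividing by zero
    return (owner_no_veh + renter_no_veh) / total_households * 100


# Invented values for B25044_003E, B25044_010E, B25044_001E.
print(pct_no_vehicle(25, 75, 400))  # 25.0
print(pct_no_vehicle(0, 0, 0))      # None
```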
→ See Transit Desert project.
→ Next: OpenStreetMap.