US Census Data¶

The US Census Bureau is one of the most widely used data sources for US-based GIS work. Its data is free, authoritative, and updated regularly.

What's available¶

Product	What it gives you	Vintage
Decennial Census	Population, housing units; basics every 10 years	2020, 2010, 2000
American Community Survey (ACS)	Income, race, education, language, commute, etc.	1-year (areas of 65,000+ people) and 5-year (all areas)
TIGER/Line	Geographies — boundaries, roads, water	Yearly
Population estimates	Annual updates between Censuses	Yearly
Economic Census	Businesses, employment by industry	Every 5 years
CPS / SIPP	Specialized population/economic surveys	Various

For most GIS analysis you'll combine TIGER/Line geographies with ACS attributes.

Geographies (smallest → largest)¶

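Block → Block Group → Tract → County → State → Nation

County subdivisions, places, and ZCTAs sit outside this nesting chain: they are built from blocks, but their boundaries can cut across tracts.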

Geography	Average size	Use it for
Block	~50 housing units	Decennial-only; very granular
Block group	~600–3,000 people	Smallest ACS geography
Census tract	~4,000 people	Most common analytical unit
County	Wide variation	Regional analysis
ZCTA (zip-code-like)	Postal proxy	Marketing / business
Place	City / town boundary	Municipal analysis

Choose tract for most projects

Tracts are the sweet spot between spatial detail and statistical reliability.

How to get the data¶

Option A — data.census.gov (web UI)¶

Go to https://data.census.gov.
Search a topic (e.g., "median household income").
Filter by Geography (e.g., "All census tracts in Texas").
Pick a Survey (ACS 5-year, latest).
Click Download → CSV.

This is the easiest path if you only need 1–3 variables.
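A CSV exported from data.census.gov typically carries a long `GEO_ID` (for example `1400000US48201100100`) and a second, descriptive header row, rather than the bare 11-digit GEOID used by TIGER/Line. The sketch below trims both, assuming the export follows that layout; the helper name `short_geoid` and the sample rows are made up for illustration.

```python
import csv
import io

def short_geoid(geo_id: str) -> str:
    """Keep only the part after 'US': '1400000US48201100100' -> '48201100100'."""
    return geo_id.split("US", 1)[-1]

# Illustrative stand-in for a downloaded file; the income values are invented.
sample_csv = io.StringIO(
    "GEO_ID,NAME,B19013_001E\n"
    "Geography,Geographic Area Name,Estimate!!Median household income\n"
    "1400000US48201100100,Census Tract 1001; Harris County; Texas,50000\n"
    "1400000US06037101110,Census Tract 1011.10; Los Angeles County; California,60000\n"
)

rows = []
for row in csv.DictReader(sample_csv):
    if not row["GEO_ID"][:1].isdigit():
        continue  # skip the descriptive second header row
    row["GEOID"] = short_geoid(row["GEO_ID"])  # stays a string, so leading zeros survive
    rows.append(row)

print([r["GEOID"] for r in rows])
```

If your export lacks the second header row or the `US` prefix, the loop simply passes the values through unchanged.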

Option B — Census API (programmatic)¶

Get an API key: https://api.census.gov/data/key_signup.html

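import requests
import pandas as pd

API_KEY = "YOUR_KEY"
year = 2022
ds = "acs/acs5"
get_vars = "NAME,B19013_001E"           # name + median household income
geo = "for=tract:*&in=state:48"          # all tracts in Texas (FIPS 48)

url = f"https://api.census.gov/data/{year}/{ds}?get={get_vars}&{geo}&key={API_KEY}"
resp = requests.get(url, timeout=60)
resp.raise_for_status()                  # fail loudly on a bad key or malformed query
data = resp.json()                       # list of rows; the first row is the header
df = pd.DataFrame(data[1:], columns=data[0])
df["GEOID"] = df["state"] + df["county"] + df["tract"]   # 2 + 3 + 6 digits, kept as text
income = pd.to_numeric(df["B19013_001E"], errors="coerce")
df["B19013_001E"] = income.where(income >= 0)   # large negative codes (e.g., -666666666) mean "no estimate"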

The variable codes (e.g., B19013_001E) are documented at: https://api.census.gov/data/2022/acs/acs5/variables.html
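The same request pattern works for other geographies by changing the `for` and `in` clauses. A minimal sketch of a URL builder using only the standard library; `build_census_url` is a hypothetical helper (not part of any Census package), and the block-group hierarchy shown is an assumption you should confirm on the dataset's geography examples page.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.census.gov/data"

def build_census_url(year, dataset, variables, for_clause, in_clause=None, key=None):
    """Assemble a Census API query URL; hypothetical helper for illustration."""
    params = [("get", ",".join(variables)), ("for", for_clause)]
    if in_clause:
        params.append(("in", in_clause))
    if key:
        params.append(("key", key))
    # Keep ':', '*' and ',' readable; encode spaces as %20 (e.g., "block group").
    query = urlencode(params, safe=":*,", quote_via=quote)
    return f"{BASE}/{year}/{dataset}?{query}"

# All tracts in Texas (same query as the example above, without the key)
tract_url = build_census_url(2022, "acs/acs5", ["NAME", "B19013_001E"],
                             "tract:*", "state:48")

# Block groups in Harris County, TX (assumed hierarchy: state, then county)
bg_url = build_census_url(2022, "acs/acs5", ["B25044_001E"],
                          "block group:*", "state:48 county:201")

print(tract_url)
print(bg_url)
```

Building the query from parts keeps multi-word geographies such as `block group` correctly encoded, which is easy to get wrong in a hand-written f-string.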

Option C — `pygris` / `tidycensus` (Python / R)¶

For a friendlier path:

Python: pip install pygris (geometries) + pip install census (variables)
R: install.packages("tidycensus") — extremely popular among researchers

tidycensus example:

library(tidycensus)
library(sf)
income_tx <- get_acs(geography = "tract",
                     variables = "B19013_001",
                     state = "TX",
                     geometry = TRUE,
                     year = 2022)
plot(income_tx["estimate"])

Common ACS variables¶

Variable	Code	Notes
Total population	B01003_001	Table universe: all residents
Median household income	B19013_001	Past 12 months, in inflation-adjusted dollars of the survey year
Population below poverty	B17001_002	Below poverty count
Hispanic or Latino	B03003_003	Origin
Race — White alone	B02001_002
Race — Black/African Am.	B02001_003
Race — Asian	B02001_005
Owner-occupied housing	B25003_002
Renter-occupied housing	B25003_003
No vehicle households	B25044_003 + B25044_010
Foreign-born population	B05002_013
Bachelor's degree or higher	B15003_022..025	Sum the four; divide by B15003_001 for a share
Median age	B01002_001
Median gross rent	B25064_001
Median home value	B25077_001
Commute by public transit	B08301_010
Travel time to work	B08303_001	Table total (workers who commute); time bins are B08303_002..013

Full list: https://api.census.gov/data/2022/acs/acs5/variables.html
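Several rows in the table are sums of multiple codes rather than a single variable. As an example, here is a sketch of the "bachelor's degree or higher" share computed from API-style rows, assuming the response was requested with `B15003_001E` (population 25 and over) plus the four degree variables. The numbers are invented and `pct_bachelors_or_higher` is a hypothetical helper.

```python
DEGREE_VARS = ["B15003_022E", "B15003_023E", "B15003_024E", "B15003_025E"]
TOTAL_VAR = "B15003_001E"  # population 25 years and over

def pct_bachelors_or_higher(record):
    """Share of the 25+ population with a bachelor's degree or higher, in percent."""
    total = float(record[TOTAL_VAR])
    if total <= 0:
        return None  # empty tract: no meaningful share
    degrees = sum(float(record[v]) for v in DEGREE_VARS)
    return degrees / total * 100

# Shaped like a Census API response (header row first); values are made up.
response = [
    [TOTAL_VAR, *DEGREE_VARS, "state", "county", "tract"],
    ["2000", "300", "120", "50", "30", "48", "201", "100100"],
    ["0", "0", "0", "0", "0", "48", "201", "100200"],
]

header, *data_rows = response
records = [dict(zip(header, r)) for r in data_rows]
shares = [pct_bachelors_or_higher(r) for r in records]
print(shares)
```

The API returns every value as a string, so the conversion to `float` happens inside the helper.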

Joining ACS to TIGER/Line¶

The GEOID is your join key:

Geography	GEOID format	Length
State	`48`	2
County	`48201`	5
Tract	`48201100100`	11
Block group	`482011001001`	12

Always treat GEOID as TEXT, not a number — leading zeros matter ('06037' not 6037).
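Because a GEOID is a concatenation of fixed-width codes, you can rebuild it from its parts and recover parent geographies by slicing. A small sketch using the lengths from the table above; both function names are hypothetical.

```python
def make_tract_geoid(state, county, tract):
    """Zero-pad state (2), county (3), and tract (6) codes into an 11-character GEOID."""
    return f"{str(state).zfill(2)}{str(county).zfill(3)}{str(tract).zfill(6)}"

def parent_geoids(geoid):
    """Slice a GEOID into the geographies it nests within."""
    levels = {"state": 2, "county": 5, "tract": 11, "block_group": 12}
    return {name: geoid[:n] for name, n in levels.items() if len(geoid) >= n}

# Codes that were read as integers lose their leading zeros; padding restores them.
print(make_tract_geoid(6, 37, 101110))
print(parent_geoids("482011001001"))
```

Slicing is handy when block-group results need to be rolled up to tracts or counties.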

In ArcGIS Pro¶

Add the TIGER tract shapefile.
Open the ACS CSV — make sure GEOID is TEXT.
Right-click tract layer → Joins and Relates → Add Join.
Join field: GEOID (left) ↔ GEOID (right).
Validate — count of matches should equal the number of tracts.

→ See How to Clean Data Before Joining for traps.
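If you prefer to check the join outside ArcGIS Pro, the same steps can be sketched in pandas. This assumes the ACS table has a `GEOID` column; the tract list here is a plain DataFrame standing in for the TIGER layer (in practice it would come from the shapefile), and all values are invented.

```python
import io
import pandas as pd

# Read GEOID as text so '06037...' keeps its leading zero.
acs_csv = io.StringIO("GEOID,B19013_001E\n06037101110,60000\n06037101122,75000\n")
acs = pd.read_csv(acs_csv, dtype={"GEOID": str})

# Stand-in for the tract layer's attribute table.
tracts = pd.DataFrame({"GEOID": ["06037101110", "06037101122", "06037101210"]})

# validate= raises if either side has duplicate keys; indicator= adds a _merge column.
joined = tracts.merge(acs, on="GEOID", how="left",
                      validate="one_to_one", indicator=True)

matched = int((joined["_merge"] == "both").sum())
unmatched = joined.loc[joined["_merge"] != "both", "GEOID"].tolist()
print(f"{matched} of {len(tracts)} tracts matched; unmatched: {unmatched}")
```

A non-empty `unmatched` list is the signal to look for type mismatches, stray whitespace, or a vintage difference between the boundaries and the table.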

Margins of error (MOE)¶

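ACS estimates come with a margin of error, published at the 90% confidence level (e.g., B19013_001M alongside B19013_001E). For small geographies (block groups, small tracts), the MOE is often large relative to the estimate. Always:

Report the MOE next to the estimate.
Aggregate to a larger geography or combine categories if the MOE exceeds ~50% of the estimate.
Prefer the 5-year ACS for less-populated areas: 1-year estimates are published only for areas of 65,000+ people, and 5-year estimates are more stable.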
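The rules of thumb above can be sketched as a few helpers. The aggregation uses the common root-sum-of-squares approximation for the MOE of a sum of count estimates, which is an approximation rather than an exact result; the function names and sample numbers are illustrative.

```python
import math

Z_90 = 1.645  # ACS margins of error correspond to a 90% confidence level

def moe_ratio(estimate, moe):
    """MOE as a fraction of the estimate (the ~50% rule of thumb)."""
    return moe / estimate

def coefficient_of_variation(estimate, moe):
    """Standard error divided by the estimate, in percent."""
    return (moe / Z_90) / estimate * 100

def aggregate(estimates, moes):
    """Sum count estimates; approximate the combined MOE by root-sum-of-squares."""
    return sum(estimates), math.sqrt(sum(m ** 2 for m in moes))

# Two small areas, each too noisy on its own (made-up counts)
print(moe_ratio(1000, 600), moe_ratio(1500, 800))

# Combined, the relative MOE drops below the 50% threshold
total, total_moe = aggregate([1000, 1500], [600, 800])
print(total, total_moe, moe_ratio(total, total_moe))
```

This approach applies to counts; medians such as B19013_001 cannot be summed across areas this way.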

"Recipes" you'll run all the time¶

Median income choropleth (county level)¶

get_vars = "NAME,B19013_001E"
geo = "for=county:*"

Join to county TIGER → choropleth in ArcGIS Pro. → See Population Choropleth project.

% no-vehicle households (block group)¶

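total = "B25044_001E"
no_veh_owners  = "B25044_003E"
no_veh_renters = "B25044_010E"

# Assumes df was fetched with these three variables (same pattern as the API example)
cols = [total, no_veh_owners, no_veh_renters]
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")
denom = df[total].where(df[total] > 0)   # avoid dividing by zero in empty block groups
df["pct_no_veh"] = (df[no_veh_owners] + df[no_veh_renters]) / denom * 100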

→ See Transit Desert project.

→ Next: OpenStreetMap.