SQL Case Study · Cost of Living · 3,222 Counties
The Cost-of-Living SQL Warehouse
Two county-grain cost indexes — monthly shelter and family costs (TSCI) and weekly consumable basket costs (TCBI) — loaded into one normalized SQLite warehouse, then interrogated with six business questions. JOINs, CTEs, window functions, and a correlated subquery, with every number below produced by a query you can rerun.
3,222
3,133 with weekly basket data
793
counties where infant care beats rent
$5,310
San Mateo County, CA · per month
+49.4%
NYC metro vs. New York state average
The Business Question
What does it actually cost to live in your county?
National cost-of-living averages hide the picture that matters: the county. A household’s biggest line items split into two rhythms — the monthly nut (rent or mortgage, insurance, childcare) and the weekly basket (gas, groceries, the family dinner plate, the occasional restaurant meal). Those two rhythms come from different federal data systems and almost never sit in the same table.
This project puts them in the same database. The county outputs of two tools I built and published — the True Shelter & Family Cost Index and the True Cost Basket Index — are loaded into a normalized SQLite warehouse and joined on county FIPS, so questions that cross the monthly/weekly boundary become one query instead of two spreadsheets.
The Warehouse
Five tables, normalized on FIPS
A Python loader (build_db.py) reads the two published JSON files, builds the reference tables, and enforces foreign keys. Total load: 9,680 rows.
states
- state_fips PK
- usps · 2-letter code
- name
52 rows
counties
- fips PK
- state_fips FK
- name, population
- median_income
3,222 rows
tsci_costs
- fips PK/FK
- rent, owner_cost
- childcare, insurance
- owner/renter_nut
- burden %, shelter_index
3,222 rows
tcbi_costs
- fips PK/FK
- gas, pantry, plate,
- restaurant (weekly)
- basket_weekly/annual
- income share, tcbi_index
3,133 rows
tcbi_state_costs
- usps PK/FK
- tcbi_index
- basket_weekly
- income share %
51 rows
Question 01 · CTEs + Multi-Table Joins
Where does shelter squeeze harder than the grocery run?
Two CTEs aggregate each index to the state level across three-table joins, then a final join computes the gap between the average renter shelter burden and the average weekly-basket income share.
WITH shelter AS (
SELECT s.usps, s.name AS state_name,
ROUND(AVG(t.renter_burden_pct), 1) AS avg_renter_burden_pct,
COUNT(*) AS n_counties
FROM tsci_costs t
JOIN counties c ON c.fips = t.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE t.renter_burden_pct IS NOT NULL
GROUP BY s.usps, s.name
),
basket AS (
SELECT s.usps,
ROUND(AVG(b.basket_income_share_pct), 1) AS avg_basket_share_pct
FROM tcbi_costs b
JOIN counties c ON c.fips = b.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE b.basket_income_share_pct IS NOT NULL
GROUP BY s.usps
)
SELECT sh.usps, sh.state_name, sh.avg_renter_burden_pct,
ba.avg_basket_share_pct,
ROUND(sh.avg_renter_burden_pct - ba.avg_basket_share_pct, 1)
AS shelter_minus_basket_gap
FROM shelter sh
JOIN basket ba ON ba.usps = sh.usps
ORDER BY shelter_minus_basket_gap DESC
LIMIT 10;
| State | Renter burden % | Basket share % | Gap (pts) |
|---|---|---|---|
| District of Columbia | 45.6 | 16.4 | 29.2 |
| Massachusetts | 41.1 | 19.3 | 21.8 |
| California | 43.5 | 22.6 | 20.9 |
| Hawaii | 39.1 | 20.8 | 18.3 |
| New Jersey | 35.5 | 18.9 | 16.6 |
| Maryland | 34.6 | 19.3 | 15.3 |
| Vermont | 39.6 | 24.4 | 15.2 |
| Washington | 40.2 | 25.0 | 15.2 |
| Rhode Island | 33.4 | 19.1 | 14.3 |
| New Hampshire | 35.5 | 21.4 | 14.1 |
In the District of Columbia the average county-level renter shelter burden runs 29.2 percentage points of income above the weekly-basket share (45.6% vs. 16.4%); Massachusetts (+21.8) and California (+20.9) follow. In high-cost coastal states it is housing, insurance, and childcare — not gas and groceries — doing the squeezing.
Question 02 · Window Aggregate
Where does infant childcare cost more than rent?
A filter finds every county where monthly infant childcare exceeds median gross rent; COUNT(*) OVER () carries the nationwide total on every row of the ranked output.
SELECT c.fips, c.name AS county, s.usps AS state,
t.childcare_monthly, t.rent,
ROUND(t.childcare_monthly * 1.0 / t.rent, 2) AS childcare_to_rent_ratio,
COUNT(*) OVER () AS total_counties_nationwide
FROM tsci_costs t
JOIN counties c ON c.fips = t.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE t.childcare_monthly > t.rent
ORDER BY childcare_to_rent_ratio DESC
LIMIT 15;
| County | State | Childcare /mo | Rent /mo | Ratio |
|---|---|---|---|---|
| Sheridan County | ND | $869 | $373 | 2.33 |
| Hardin County | IL | $726 | $342 | 2.12 |
| Issaquena County | MS | $516 | $253 | 2.04 |
| Monroe County | OH | $1,278 | $626 | 2.04 |
| Worth County | MO | $686 | $353 | 1.94 |
| Hayes County | NE | $811 | $419 | 1.94 |
| Chouteau County | MT | $925 | $485 | 1.91 |
| Knox County | IL | $1,347 | $725 | 1.86 |
| Sioux County | ND | $851 | $466 | 1.83 |
| Wayne County | IL | $1,145 | $632 | 1.81 |
In 793 of 3,222 counties — roughly one county in four — a month of infant childcare costs more than a month of median rent. The extremes are rural: in Sheridan County, North Dakota, childcare runs 2.33× the rent. Cheap housing does not mean cheap living when daycare is the second rent.
Question 03 · DENSE_RANK + LAG
Which county tops each state’s cost table — and by how much?
DENSE_RANK() orders counties within each state by total renter monthly nut; LAG() over the same partition fetches the runner-up’s figure, turning “most expensive” into “most expensive, by this margin.”
WITH ranked AS (
SELECT s.usps AS state, c.name AS county, t.renter_nut,
DENSE_RANK() OVER (PARTITION BY s.usps
ORDER BY t.renter_nut DESC) AS rank_in_state,
LAG(t.renter_nut) OVER (PARTITION BY s.usps
ORDER BY t.renter_nut) AS runner_up_nut
FROM tsci_costs t
JOIN counties c ON c.fips = t.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE t.renter_nut IS NOT NULL
)
SELECT state, county,
ROUND(renter_nut) AS renter_nut_monthly,
ROUND(runner_up_nut) AS runner_up_monthly,
ROUND(renter_nut - runner_up_nut) AS premium_over_runner_up
FROM ranked
WHERE rank_in_state = 1
ORDER BY renter_nut DESC;
| State | County | Nut /mo | #2 /mo | Premium |
|---|---|---|---|---|
| CA | San Mateo County | $5,310 | $5,112 | $198 |
| VA | Arlington County | $4,712 | $4,077 | $635 |
| MA | Suffolk County | $4,640 | $4,491 | $149 |
| WA | King County | $4,122 | $3,452 | $670 |
| DC | District of Columbia | $4,036 | — | — |
| CO | Douglas County | $3,847 | $3,656 | $191 |
| MD | Howard County | $3,845 | $3,837 | $8 |
| AK | Aleutians West Census Area | $3,810 | $3,400 | $410 |
| CT | Western Connecticut Planning Region | $3,726 | $2,685 | $1,041 |
| HI | Honolulu County | $3,653 | $3,175 | $478 |
San Mateo County, California carries the nation’s heaviest renter nut at $5,310 a month. The most lopsided state is Connecticut, where the Western Connecticut Planning Region out-costs the state’s #2 by $1,041 a month — while in Maryland, Howard County leads Anne Arundel-tier competition by just $8.
Question 04 · NTILE Quartiles
Do weekly baskets climb as fast as monthly shelter?
NTILE(4) sorts every joined county into renter-nut quartiles, then grouped aggregates compare how each cost family scales from the cheapest quarter of America to the most expensive.
WITH joined AS (
SELECT t.renter_nut, b.basket_weekly, c.median_income
FROM tsci_costs t
JOIN tcbi_costs b ON b.fips = t.fips
JOIN counties c ON c.fips = t.fips
WHERE t.renter_nut IS NOT NULL AND b.basket_weekly IS NOT NULL
),
quartiled AS (
SELECT NTILE(4) OVER (ORDER BY renter_nut) AS nut_quartile,
renter_nut, basket_weekly, median_income
FROM joined
)
SELECT nut_quartile, COUNT(*) AS n_counties,
ROUND(AVG(renter_nut)) AS avg_renter_nut_monthly,
ROUND(AVG(basket_weekly), 2) AS avg_basket_weekly,
ROUND(AVG(median_income)) AS avg_median_income
FROM quartiled
GROUP BY nut_quartile
ORDER BY nut_quartile;
| Quartile | Counties | Avg nut /mo | Avg basket /wk | Avg income |
|---|---|---|---|---|
| Q1 (cheapest) | 784 | $1,248 | $288.89 | $52,770 |
| Q2 | 783 | $1,498 | $316.60 | $60,996 |
| Q3 | 783 | $1,743 | $346.50 | $66,212 |
| Q4 (priciest) | 783 | $2,487 | $404.25 | $84,295 |
Shelter diverges roughly twice as fast as the basket. From the cheapest to the priciest quartile the monthly nut doubles ($1,248 → $2,487, +99%) while the weekly basket rises 40% ($288.89 → $404.25). Groceries and gas are sticky; rent and childcare are what separate expensive America from cheap America.
Question 05 · GROUP BY + HAVING
Where is childcare-over-rent a statewide pattern?
Aggregating the Question 02 population by state and keeping only states with 40 or more qualifying counties separates a statewide structural problem from a scattered handful of outliers.
SELECT s.usps, s.name AS state_name,
COUNT(*) AS counties_cc_over_rent,
ROUND(AVG(t.childcare_monthly - t.rent)) AS avg_monthly_excess,
ROUND(MAX(t.childcare_monthly - t.rent)) AS max_monthly_excess
FROM tsci_costs t
JOIN counties c ON c.fips = t.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE t.childcare_monthly > t.rent
GROUP BY s.usps, s.name
HAVING COUNT(*) >= 40
ORDER BY counties_cc_over_rent DESC;
| State | Counties | Avg excess /mo | Max excess /mo |
|---|---|---|---|
| Oklahoma | 63 | $107 | $272 |
| Nebraska | 60 | $108 | $392 |
| Ohio | 58 | $205 | $652 |
| Minnesota | 57 | $138 | $809 |
| Illinois | 53 | $184 | $622 |
| Montana | 47 | $186 | $440 |
| New York | 45 | $175 | $316 |
| North Dakota | 43 | $186 | $496 |
| Wisconsin | 40 | $134 | $346 |
Nine states have 40 or more counties where childcare beats rent, led by Oklahoma (63), Nebraska (60), and Ohio (58). It is a Plains-and-Midwest pattern: low rural rents collide with childcare prices that do not fall nearly as far, and in Minnesota the worst county gap reaches $809 a month.
Question 06 · Correlated Subquery
Which counties most out-spend their own state’s basket?
A correlated subquery recomputes the state’s average weekly basket for each outer county row, surfacing the counties that are priced like a different state than the one they sit in.
SELECT c.name AS county, s.usps AS state, b.basket_weekly,
(SELECT ROUND(AVG(b2.basket_weekly), 2)
FROM tcbi_costs b2
JOIN counties c2 ON c2.fips = b2.fips
WHERE c2.state_fips = c.state_fips) AS state_avg_basket,
ROUND(100.0 * b.basket_weekly /
(SELECT AVG(b3.basket_weekly)
FROM tcbi_costs b3
JOIN counties c3 ON c3.fips = b3.fips
WHERE c3.state_fips = c.state_fips) - 100, 1)
AS pct_above_state_avg
FROM tcbi_costs b
JOIN counties c ON c.fips = b.fips
JOIN states s ON s.state_fips = c.state_fips
WHERE b.basket_weekly IS NOT NULL
ORDER BY pct_above_state_avg DESC
LIMIT 10;
| County | State | Basket /wk | State avg /wk | Above avg |
|---|---|---|---|---|
| Brooklyn | NY | $521.90 | $349.32 | +49.4% |
| Nassau County | NY | $521.90 | $349.32 | +49.4% |
| Manhattan | NY | $521.90 | $349.32 | +49.4% |
| Putnam County | NY | $521.90 | $349.32 | +49.4% |
| Queens | NY | $521.90 | $349.32 | +49.4% |
| Rockland County | NY | $521.90 | $349.32 | +49.4% |
| Suffolk County | NY | $521.90 | $349.32 | +49.4% |
| Westchester County | NY | $521.90 | $349.32 | +49.4% |
| Staten Island | NY | $519.03 | $349.32 | +48.6% |
| Teton County | WY | $506.33 | $343.23 | +47.5% |
Nine of the ten biggest in-state basket outliers are the New York City metro, running 49.4% above the New York state average ($521.90 vs. $349.32 a week) — the tied values reflect the source index’s metro-level price allocation. The lone non-NY entry is Teton County, Wyoming (Jackson Hole), at +47.5%.
Methodology & Limits
How it was built, and where to be careful
Pipeline
build_db.py (Python 3 standard library only) loads the published TSCI county JSON and TCBI composite JSON, rebuilds cost_indexes.db with foreign keys and indexes, exports schema.sql, runs every tagged query in queries.sql, and writes each result to results/*.csv. One command reproduces everything on this page.
Source attribution
County figures are the published outputs of my TSCI and TCBI tools, which build on Census ACS 5-Year tables (rent, home value, owner cost B25088, income, population), DOL National Database of Childcare Prices, NAIC HO-3/HO-4 insurance averages, EIA gasoline prices, BLS average-price and CPI food series, and USDA food-plan benchmarks.
Limits
These are model indexes, not survey microdata. Several TCBI series are proxy-allocated below the state or PADD level — the tied NYC-metro values in Question 06 are that allocation showing through. Childcare uses the latest NDCP county vintage. Puerto Rico’s 78 municipios appear in shelter tables but not basket joins.
More SQL, more data
This is the second SQL case study in the portfolio. The first analyzes $96.5B in federal contracts to veteran-owned small businesses. The full project — loader, schema, queries, and results — is reproducible end to end.