Fix Supabase Disk IO bottleneck: drop unused `deleted` column + index hot queries by Hacksore · Pull Request #135 · Hacksore/oghunt

Hacksore · 2026-06-17T00:41:15Z

Diagnosis

The Disk IO budget alerts come from the Post table being sequentially scanned in full on nearly every page view.

The home page, /list, and /ai all set export const dynamic = "force-dynamic" and call getTodaysLaunches / getTodaysLaunchesPaginated (apps/web/src/app/lib/launches.ts) on every request. Those queries filter Post over a createdAt day range; /api/list, /api/stats, /slop, and sitemap.ts hit the same table the same way. Post had no indexes except its primary key, so every request read the whole table to return a few hundred rows.
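To make the query shape concrete, here is a minimal sketch of a day-range filter of the kind described above. The names utcDayRange, buildLaunchWhere, and LaunchWhere are hypothetical, and the use of UTC day boundaries is an assumption; the real code in launches.ts may compute the range differently.

```typescript
// Hypothetical filter shape; field names follow the PR text, not the real code.
interface LaunchWhere {
  createdAt: { gte: Date; lt: Date };
  hasAi?: boolean;
}

// Half-open day range [00:00 today, 00:00 tomorrow) containing `now`.
// Assumption: day boundaries are taken in UTC.
function utcDayRange(now: Date): { gte: Date; lt: Date } {
  const gte = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()),
  );
  const lt = new Date(gte.getTime() + 24 * 60 * 60 * 1000);
  return { gte, lt };
}

// Builds the filter; hasAi is only added when the caller asks for it.
function buildLaunchWhere(now: Date, hasAi?: boolean): LaunchWhere {
  const where: LaunchWhere = { createdAt: utcDayRange(now) };
  if (hasAi !== undefined) where.hasAi = hasAi;
  return where;
}

const launchWhere = buildLaunchWhere(new Date("2026-06-17T00:41:15Z"), false);
console.log(
  launchWhere.createdAt.gte.toISOString(),
  launchWhere.createdAt.lt.toISOString(),
);
```

A filter like this selects only one day of rows, but without an index on createdAt the database still has to read every row to find them.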

Two changes

1. Drop the `deleted` column (it was vestigial)

The deleted column dates back to when AI filtering ran every 15 minutes and removed posts mid-day. Ingestion now runs once a day, and the only writer that set it to true was the daily updateMany in ingest-posts, which flipped every post not in today's fetch to deleted=true. That state is already implied by the createdAt day-range filter every read path uses. The flag was also silently limiting the sitemap to today's posts only (deleted=false).

So this PR:

- removes the deleted column from Post,
- removes the daily updateMany (a large daily write, so also IO) and the deleted writes in ingest-posts,
- removes the now-redundant deleted filters from the launch queries,
- lets the sitemap include all posts (fixes the latent SEO bug).

A soft-delete flag can be reintroduced if/when we actually want soft deletes.
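The redundancy argument can be checked with a small in-memory sketch. It assumes that the daily fetch returns exactly the posts created today; the Post shape, the sample data, and markMissingAsDeleted are hypothetical stand-ins for the old ingest behaviour, not code from the repo.

```typescript
interface Post {
  id: number;
  createdAt: Date;
  deleted: boolean;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Stand-in for the old daily updateMany: posts missing from today's fetch
// are flagged deleted=true.
function markMissingAsDeleted(posts: Post[], fetchedIds: Set<number>): Post[] {
  return posts.map((p) => ({ ...p, deleted: !fetchedIds.has(p.id) }));
}

const dayStart = new Date("2026-06-17T00:00:00Z");
const dayEnd = new Date(dayStart.getTime() + DAY_MS);

const samplePosts: Post[] = [
  { id: 1, createdAt: new Date("2026-06-15T09:00:00Z"), deleted: false },
  { id: 2, createdAt: new Date("2026-06-16T12:00:00Z"), deleted: false },
  { id: 3, createdAt: new Date("2026-06-17T08:00:00Z"), deleted: false },
  { id: 4, createdAt: new Date("2026-06-17T10:30:00Z"), deleted: false },
];

// Assumption: today's fetch contains exactly today's posts (ids 3 and 4).
const afterIngest = markMissingAsDeleted(samplePosts, new Set([3, 4]));

const inDay = (p: Post): boolean =>
  p.createdAt.getTime() >= dayStart.getTime() &&
  p.createdAt.getTime() < dayEnd.getTime();

// Launch query with and without the deleted filter.
const withFlag = afterIngest.filter((p) => inDay(p) && !p.deleted).map((p) => p.id);
const rangeOnly = afterIngest.filter(inDay).map((p) => p.id);

// Sitemap has no day range, so deleted=false silently cut it to today's posts.
const sitemapOld = afterIngest.filter((p) => !p.deleted).map((p) => p.id);
const sitemapNew = afterIngest.map((p) => p.id);

console.log({ withFlag, rangeOnly, sitemapOld, sitemapNew });
```

Under that assumption the launch queries return the same rows with or without the flag, while the sitemap changes from today's posts to all posts.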

2. Index the hot read paths

- Post @@index([hasAi, createdAt]): the launch queries (filter on hasAi + createdAt range).
- Post @@index([createdAt]): createdAt-range/ordered reads without a hasAi filter (/api/list, /api/stats, sitemap).
- TopicPost @@index([postId]): the composite PK is keyed on topicId first, so include: { topics } couldn't use it.
- Metric @@index([timestamp]): /slop window + /api/stats/time ordering.

votesCount is intentionally left out of the indexes so the every-15-minutes update-vote-count cron doesn't pay index write amplification.
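The TopicPost point rests on the leading-column rule for composite keys. The sketch below models an index on (topicId, postId) as a sorted array and counts how many entries each lookup examines. It is a simplified illustration with made-up data, not how Postgres is implemented, and the function names are hypothetical.

```typescript
type Row = [number, number]; // [topicId, postId]

// Entries ordered the way a B-tree on (topicId, postId) would order them.
function buildIndex(rows: Row[]): Row[] {
  return [...rows].sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// Lookup by the leading key: binary search to the first match, then read
// the adjacent matching entries.
function findByTopic(index: Row[], topicId: number): { rows: Row[]; touched: number } {
  let lo = 0;
  let hi = index.length;
  let touched = 0;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    touched++;
    if (index[mid][0] < topicId) lo = mid + 1;
    else hi = mid;
  }
  const rows: Row[] = [];
  while (lo < index.length && index[lo][0] === topicId) {
    rows.push(index[lo]);
    lo++;
    touched++;
  }
  return { rows, touched };
}

// Lookup by the second key alone: the ordering gives no starting point,
// so every entry has to be examined.
function findByPost(index: Row[], postId: number): { rows: Row[]; touched: number } {
  let touched = 0;
  const rows = index.filter((r) => {
    touched++;
    return r[1] === postId;
  });
  return { rows, touched };
}

// Made-up data: 100 topics, each linked to posts 0..49.
const pairs: Row[] = [];
for (let t = 0; t < 100; t++) {
  for (let p = 0; p < 50; p++) pairs.push([t, p]);
}
const topicPostIndex = buildIndex(pairs);

const byTopic = findByTopic(topicPostIndex, 42);
const byPost = findByPost(topicPostIndex, 7);
console.log(byTopic.rows.length, byTopic.touched, byPost.rows.length, byPost.touched);
```

A separate index keyed on postId gives the second lookup the same cheap entry point the first one already has.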

Evidence

Reproduced on Postgres with a production-shaped dataset (~154k posts), measured with EXPLAIN (ANALYZE, BUFFERS). The shared-buffer counts are 8 KB pages touched, which approximates disk reads when the data is uncached:

| Hot query | Before | After | Page reads ↓ |
| --- | --- | --- | --- |
| Today's launches (`hasAi=false`) | Parallel Seq Scan, 4400 pages, 11ms | Index Scan `[hasAi, createdAt]`, 206 pages, 0.17ms | ~21× |
| Today's count | Parallel Seq Scan, 4400 pages, 9.5ms | Index Only Scan, 5 pages, 0.05ms | ~880× |
| Today's launches (no `hasAi` filter) | Parallel Seq Scan, 4400 pages, 8.9ms | Bitmap Index Scan `[createdAt]`, 14 pages, 0.17ms | ~310× |
| Sitemap | Seq Scan + temp-file sort, 26.7ms | Index Scan Backward `[createdAt]` (now returns all posts) | sort spill removed |

Full before/after EXPLAIN ANALYZE output attached.

disk_io_explain_no_deleted.log
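The reduction factors follow directly from the page counts reported above. As a quick arithmetic check, this sketch recomputes them and converts pages to KiB at 8 KiB per page; the Measurement type and variable names are illustrative only.

```typescript
const PAGE_KIB = 8; // one Postgres page is 8 KiB by default

interface Measurement {
  query: string;
  pagesBefore: number;
  pagesAfter: number;
}

// Page counts copied from the before/after comparison in this PR.
const measurements: Measurement[] = [
  { query: "Today's launches (hasAi=false)", pagesBefore: 4400, pagesAfter: 206 },
  { query: "Today's count", pagesBefore: 4400, pagesAfter: 5 },
  { query: "Today's launches (no hasAi filter)", pagesBefore: 4400, pagesAfter: 14 },
];

const summary = measurements.map((m) => ({
  query: m.query,
  kibBefore: m.pagesBefore * PAGE_KIB,
  kibAfter: m.pagesAfter * PAGE_KIB,
  reduction: m.pagesBefore / m.pagesAfter,
}));

for (const s of summary) {
  console.log(
    `${s.query}: ${s.kibBefore} KiB -> ${s.kibAfter} KiB (~${Math.round(s.reduction)}x)`,
  );
}
```

Each sequential scan touched 4400 pages, roughly 34 MiB, per request; the indexed plans touch between 40 KiB and about 1.6 MiB.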

App still works end-to-end

After applying the schema change to a seeded DB, home → /list → detail all render real content and the AI filter works (AI • 1.2K / 2.8K • No AI):

oghunt_after_removing_deleted.mp4

Home page after removing deleted
Post detail after removing deleted

Applying in production (important)

This repo uses prisma db push. A plain db push that drops the column and creates the indexes locks the table and needs --accept-data-loss. On the large production Post table, apply the change manually instead: build the indexes concurrently, then drop the column in a quick separate statement:

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS "Post_hasAi_createdAt_idx" ON "Post" ("hasAi", "createdAt");
CREATE INDEX CONCURRENTLY IF NOT EXISTS "Post_createdAt_idx"       ON "Post" ("createdAt");
CREATE INDEX CONCURRENTLY IF NOT EXISTS "TopicPost_postId_idx"     ON "TopicPost" ("postId");
CREATE INDEX CONCURRENTLY IF NOT EXISTS "Metric_timestamp_idx"     ON "Metric" ("timestamp");
ALTER TABLE "Post" DROP COLUMN IF EXISTS "deleted";
```
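If these statements are scripted rather than pasted into a SQL console, note that Postgres rejects CREATE INDEX CONCURRENTLY inside a transaction block, so each statement has to be sent on its own. The sketch below shows one way to do that. The Exec callback is a hypothetical stand-in for whatever database client is used, and applyManually is not part of this PR.

```typescript
// Hypothetical client hook: sends exactly one SQL statement, outside any
// explicit transaction. Swap in the real client call when using this.
type Exec = (sql: string) => Promise<void>;

// Same statements as in the PR description: indexes first, column drop last.
const statements: string[] = [
  'CREATE INDEX CONCURRENTLY IF NOT EXISTS "Post_hasAi_createdAt_idx" ON "Post" ("hasAi", "createdAt")',
  'CREATE INDEX CONCURRENTLY IF NOT EXISTS "Post_createdAt_idx" ON "Post" ("createdAt")',
  'CREATE INDEX CONCURRENTLY IF NOT EXISTS "TopicPost_postId_idx" ON "TopicPost" ("postId")',
  'CREATE INDEX CONCURRENTLY IF NOT EXISTS "Metric_timestamp_idx" ON "Metric" ("timestamp")',
  'ALTER TABLE "Post" DROP COLUMN IF EXISTS "deleted"',
];

// Runs the statements strictly in order and stops at the first failure,
// so the column is only dropped after every index build has succeeded.
async function applyManually(exec: Exec): Promise<string[]> {
  const done: string[] = [];
  for (const sql of statements) {
    await exec(sql); // one statement per call, no BEGIN/COMMIT around it
    done.push(sql);
  }
  return done;
}

// Dry run with a fake client that only logs what it would send.
applyManually(async (sql) => console.log("would run:", sql)).then((done) =>
  console.log(`${done.length} statements issued`),
);
```

A concurrent index build that fails can leave an invalid index behind, and IF NOT EXISTS would then skip it on a retry, so it is worth checking index validity before the column drop.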

Verification

✅ pnpm db:push applied the drop + 4 indexes (confirmed via pg_indexes).
✅ pnpm db:generate, pnpm check, pnpm build all pass (build now emits sitemap chunks 0–3, i.e. it includes all posts).
✅ App serves real content across /, /list, /ai, /api/list, /slop (see demo).
✅ EXPLAIN (ANALYZE, BUFFERS) before/after confirms seq scan → index scan.


The hot read paths (home, /list, /ai, /api/list, /api/stats) are force-dynamic and filter Post by deleted + hasAi over a createdAt day range, ordered by votesCount. With no matching index, Postgres did a sequential scan of the entire Post table on every request, which is the primary driver of Supabase Disk IO budget consumption as history grows.

Add covering indexes for the launch/sitemap queries on Post, an index on TopicPost.postId for the topics include, and an index on Metric.timestamp for the /slop and stats routes. votesCount is intentionally left out of the indexes so the every-15-minutes vote-count cron does not pay index write amplification.

Co-authored-by: Sean Boult <Hacksore@users.noreply.github.com>

vercel · 2026-06-17T00:41:21Z


| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| oghunt | Ready | Preview, Comment | Jun 17, 2026 1:14am |

The deleted flag was a holdover from when AI filtering ran every 15 minutes and removed posts mid-day. Now ingestion runs once a day and the only writer (the daily updateMany in ingest-posts) just flipped every post not in today's fetch to deleted=true, which is already implied by the createdAt day-range filter the read paths use. It also silently limited the sitemap to today's posts only.

Drop the column and the daily updateMany, remove the now-redundant deleted filters from the launch queries, and let the sitemap include all posts. Indexes become [hasAi, createdAt] and [createdAt] (no deleted), which still serve the hot read paths and avoid full table scans. We can reintroduce a deleted flag if/when we actually want soft deletes.

Co-authored-by: Sean Boult <Hacksore@users.noreply.github.com>

vercel Bot deployed to Preview June 17, 2026 00:42 View deployment

cursor Bot changed the title ~~Fix Supabase Disk IO bottleneck: add indexes for full-table-scan queries~~ Fix Supabase Disk IO bottleneck: drop unused deleted column + index hot queries Jun 17, 2026

vercel Bot deployed to Preview June 17, 2026 01:14 View deployment

Hacksore wants to merge 2 commits into main from cursor/diagnose-disk-io-bottleneck-9a3e
