Intervue is a production-oriented mock-interview marketplace: interviewees book 1:1 video sessions with verified interviewers, Stream records the call, and Google Gemini produces structured AI feedback from the transcript. The backend handles role-based auth, credit accounting, interviewer payouts, and idempotent webhook processing.
- What This Shows
- Overview
- Live Deployments
- Architecture
- AI Integration
- Tech Stack
- Project Structure
- Getting Started
- Configuration
- Testing
- CI and Deployment
- API Reference
- Reliability and Tradeoffs
- Maintainer
- Engineering Notes
This project was designed like a real product system, not a demo screen:
- Separated frontend and backend ownership so UI, auth verification, business logic, database writes, and third-party integrations are isolated behind clear boundaries.
- Stateless backend authentication using Clerk JWTs verified in FastAPI through JWKS — the API never trusts the frontend session blindly, and Cloud Run instances stay disposable.
- Async Python service layer with FastAPI, Prisma Client Python, and `httpx` for non-blocking API and database workflows.
- Transactional booking and credit accounting: interviewee deduction, booking creation, and Stream call creation are sequenced deliberately so that credits are never deducted without a matching booking record.
- Idempotent webhook handling for Stream recording and transcription events — the handler checks existing records before writing, so duplicate delivery produces no duplicate business events.
- End-to-end AI pipeline — transcript normalization, Gemini prompt construction, structured JSON output parsing, and DB persistence are all handled in a single service method, with no side effects leaking across layers and an idempotency guard upstream of the API call.
- Path-filtered CI/CD with independent frontend and backend pipelines, Docker smoke tests, Artifact Registry image tagging by git SHA, and Cloud Run deployments gated on smoke success.
- Explicit operational tradeoffs documented instead of hidden — known concurrency gaps, rate-limiter scope, and webhook observability gaps are all called out below.
Problem: Candidates lack access to realistic technical interview practice with structured feedback; experienced engineers have no structured channel to monetize their interview expertise.
Product slice: A marketplace where verified interviewers list available slots, interviewees purchase credits and book sessions, and AI generates role-specific questions before the call and structured written feedback after it.
Interviewee flow:
- Sign up with Clerk, complete onboarding as INTERVIEWEE.
- Browse interviewer profiles and select a time slot.
- Book the slot — credits are atomically deducted and a Stream call is created.
- Join the `/call/[callId]` room at session time; Stream records and transcribes.
- Receive AI-generated feedback (summary, technical, communication, problem-solving, rating) once the transcript is ready.
Interviewer flow:
- Onboard as INTERVIEWER, set bio, title, categories, and hourly credit rate.
- Publish availability slots on the dashboard.
- Conduct sessions via Stream video; earnings accumulate in a credit balance.
- Request a withdrawal — admin approves and processes payout via email.
| Surface | URL |
|---|---|
| Frontend | https://intervue-frontend-cxljs3igra-el.a.run.app |
| Backend API | https://intervue-backend-cxljs3igra-el.a.run.app |
| OpenAPI (Swagger) | https://intervue-backend-cxljs3igra-el.a.run.app/docs |
| Health check | https://intervue-backend-cxljs3igra-el.a.run.app/health |
The architecture is described at two levels: a high-level design (HLD) covering system context and the delivery pipeline, and a low-level design (LLD) covering the internal code structure of each service.
flowchart TB
subgraph people [People]
IV[Interviewee]
IR[Interviewer]
AD[Admin]
end
subgraph gcp [Google Cloud]
FE[Cloud Run — Next.js]
BE[Cloud Run — FastAPI]
AR[(Artifact Registry)]
end
subgraph external [External Services]
CL[Clerk Auth]
ST[Stream.io Video]
GM[Google Gemini]
RS[Resend Email]
SB[(Supabase PostgreSQL)]
end
subgraph gh [GitHub]
GA[Actions CI/CD]
end
IV & IR & AD -->|HTTPS UI| FE
FE -->|Bearer JWT| BE
BE -->|JWKS verify| CL
BE --> SB
BE -->|call token / webhook| ST
ST -->|recording + transcript webhook| BE
BE -->|generate feedback / questions| GM
BE -->|payout notification| RS
GA -->|OIDC WIF push image| AR
GA -->|deploy revision| FE & BE
Reading the diagram:
- Synchronous path: Browser → Next.js (Server Actions / proxy) → FastAPI → Supabase.
- Async path: Stream terminates the call → `POST /api/webhooks/stream` → FastAPI downloads transcript → Gemini generates feedback → DB transaction records feedback and interviewer earning.
- Auth: FastAPI verifies every protected route independently via Clerk JWKS. The frontend session is never trusted blindly.
- Supply chain: Images tagged by git SHA land in Artifact Registry; both services deploy via WIF (no long-lived GCP JSON keys in GitHub).
flowchart LR
subgraph repo [Monorepo]
BEsrc[backend/]
FEsrc[frontend/]
WF[.github/workflows/]
end
subgraph cd [CD — main, path-filtered]
D1[deploy-backend.yml]
D2[deploy-frontend.yml]
end
subgraph reg [asia-south1 Artifact Registry]
IMG1[intervue-backend image]
IMG2[intervue-frontend image]
end
BEsrc --> D1 --> IMG1 --> BEcr[Cloud Run backend]
FEsrc --> D2 --> IMG2 --> FEcr[Cloud Run frontend]
WF --> cd
Each deploy workflow: docker build → container smoke on runner → GET /health → push digest → gcloud run deploy. A failing smoke test blocks the deploy.
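The smoke gate amounts to polling the container's health endpoint until it answers or a retry budget runs out. The following is a minimal sketch of that idea, not the workflow's actual script: `wait_for_health`, `http_probe`, and the retry parameters are hypothetical names chosen for illustration.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> Callable[[], int]:
    """Build a probe that returns the HTTP status code for `url`."""
    def probe() -> int:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status
    return probe


def wait_for_health(probe: Callable[[], int], attempts: int = 10,
                    delay_seconds: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True as soon as the probe reports 200, False once attempts run out."""
    for attempt in range(attempts):
        try:
            if probe() == 200:
                return True
        except OSError:
            # Connection refused or an HTTP error: the container is not ready yet.
            pass
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A deploy step could call `wait_for_health(http_probe("http://localhost:8000/health"))` and fail the job when it returns `False`.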
Auth flow — every protected route verifies the JWT independently:
sequenceDiagram
participant Br as Browser
participant N as Next.js
participant B as FastAPI
participant CL as Clerk JWKS
Br->>N: Sign in
N->>Br: Clerk session token
Br->>N: Request with session
N->>B: Bearer <clerk_jwt>
B->>CL: Fetch / cache JWKS
B->>B: Verify RS256 signature
B->>B: Load or upsert internal User
B-->>N: Typed ClerkUser context in route handler
Booking flow — atomic credit deduction:
sequenceDiagram
participant U as Interviewee
participant N as Next.js
participant B as FastAPI
participant S as Stream.io
participant D as Supabase
U->>N: Select slot + confirm
N->>B: POST /api/bookings (Bearer JWT)
B->>B: Verify role is INTERVIEWEE
B->>B: Verify slot is free + credits sufficient
B->>S: Create Stream call
B->>D: Transaction — Booking + BOOKING_DEDUCT + decrement creditBalance
B-->>N: bookingId + streamCallId
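The transaction step in this flow can be illustrated with a conditional decrement, so that the balance check and the deduction happen in one statement. This sketch swaps in the standard-library `sqlite3` module for PostgreSQL and Prisma purely so it runs anywhere; the table layout, function names, and the negative `amount` convention are assumptions, not the project's schema.

```python
import sqlite3


class InsufficientCredits(Exception):
    """Raised when the interviewee cannot cover the slot's credit cost."""


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical, simplified tables; the real model lives in prisma/schema.prisma.
    conn.executescript(
        """
        CREATE TABLE users (id TEXT PRIMARY KEY, credits INTEGER NOT NULL);
        CREATE TABLE bookings (
            id INTEGER PRIMARY KEY,
            interviewee_id TEXT NOT NULL,
            stream_call_id TEXT NOT NULL,
            credits_charged INTEGER NOT NULL
        );
        CREATE TABLE credit_transactions (
            id INTEGER PRIMARY KEY,
            user_id TEXT NOT NULL,
            booking_id INTEGER NOT NULL,
            amount INTEGER NOT NULL,
            type TEXT NOT NULL
        );
        """
    )


def book_slot(conn: sqlite3.Connection, interviewee_id: str,
              stream_call_id: str, cost: int) -> int:
    with conn:  # commit if the block succeeds, roll back if it raises
        # The WHERE clause makes "check balance" and "deduct" a single statement.
        updated = conn.execute(
            "UPDATE users SET credits = credits - ? WHERE id = ? AND credits >= ?",
            (cost, interviewee_id, cost),
        )
        if updated.rowcount != 1:
            raise InsufficientCredits(interviewee_id)
        booking_id = conn.execute(
            "INSERT INTO bookings (interviewee_id, stream_call_id, credits_charged) "
            "VALUES (?, ?, ?)",
            (interviewee_id, stream_call_id, cost),
        ).lastrowid
        conn.execute(
            "INSERT INTO credit_transactions (user_id, booking_id, amount, type) "
            "VALUES (?, ?, ?, 'BOOKING_DEDUCT')",
            (interviewee_id, booking_id, -cost),
        )
    return booking_id
```

Since the Stream call is created before the transaction in the diagram, a failed transaction can leave an unused Stream call behind; the sketch does not attempt to clean that up.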
Webhook and AI feedback flow — idempotent, async:
sequenceDiagram
participant S as Stream.io
participant B as FastAPI
participant G as Gemini
participant D as Supabase
S-->>B: call.recording_ready webhook
B->>D: Store recording URL on Booking
S-->>B: call.transcription_ready webhook
B->>B: Resolve Booking from streamCallId
B->>B: Skip if Feedback already exists (idempotency guard)
B->>S: Download transcript
B->>B: Normalize speaker segments
B->>G: POST structured feedback prompt
G-->>B: JSON — summary, technical, communication, problemSolving, recommendation, strengths[], improvements[], overallRating
B->>D: Transaction — upsert Feedback + mark Booking COMPLETED + BOOKING_EARNING + increment interviewer creditBalance
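The idempotency guard in this flow is a check-before-work pattern: if feedback already exists for the booking, the handler returns before calling Gemini. The sketch below models that with an in-memory store standing in for the database. `InMemoryStore`, `handle_transcription_ready`, and the `generate_feedback` callable are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InMemoryStore:
    """Stand-in for the database, keyed by booking ID."""
    feedback: Dict[str, dict] = field(default_factory=dict)
    earnings: List[str] = field(default_factory=list)


def handle_transcription_ready(store: InMemoryStore, booking_id: str,
                               generate_feedback: Callable[[str], dict]) -> bool:
    """Return True if feedback was generated, False if the event was a duplicate."""
    # Guard first: a redelivered event must not reach the model or the ledger.
    if booking_id in store.feedback:
        return False
    result = generate_feedback(booking_id)
    # In the real service these two writes would share one DB transaction.
    store.feedback[booking_id] = result
    store.earnings.append(booking_id)
    return True
```

A check followed by a write only protects against sequential redelivery. Two deliveries processed at the same instant could both pass the guard, which is the kind of concurrency gap a database uniqueness constraint on the feedback row would close.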
Payout flow:
sequenceDiagram
participant IR as Interviewer
participant B as FastAPI
participant D as Supabase
participant RS as Resend
participant AD as Admin
IR->>B: POST withdrawal request
B->>B: Rate limit — 3 req/hour/user
B->>B: Validate creditBalance >= amount
B->>B: Apply platform fee
B->>D: Create Payout (PROCESSING)
B->>RS: Send payout request email to admin
AD->>B: GET /api/payouts/:id
AD->>B: POST /api/payouts/:id/approve (password)
B->>D: Mark Payout PROCESSED
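The validation and fee steps of the withdrawal request can be sketched as a pure function. The platform fee rate is not stated in this document, so `fee_rate` is a hypothetical parameter, as are the `PayoutQuote` type and the two-decimal rounding rule.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class PayoutQuote:
    credits: int
    platform_fee: Decimal
    net_amount: Decimal


def quote_payout(credit_balance: int, credits: int, fee_rate: Decimal) -> PayoutQuote:
    """Validate a withdrawal against the balance and compute fee and net amount."""
    if credits <= 0:
        raise ValueError("withdrawal amount must be positive")
    if credits > credit_balance:
        raise ValueError("withdrawal exceeds creditBalance")
    # fee_rate is an assumed fraction such as Decimal("0.20").
    fee = (Decimal(credits) * fee_rate).quantize(Decimal("0.01"),
                                                 rounding=ROUND_HALF_UP)
    return PayoutQuote(credits=credits, platform_fee=fee,
                       net_amount=Decimal(credits) - fee)
```

Using `Decimal` rather than floats keeps the fee arithmetic exact, which matters once these values are written to a ledger.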
Layer model enforced in code: router → service → database (Prisma). Cross-cutting auth and rate limiting live in dependencies.py.
flowchart TB
subgraph transport [Transport]
M[main.py + CORS + lifespan]
RU[routers/users]
RO[routers/onboarding]
RI[routers/interviewers]
RB[routers/bookings]
RC[routers/calls]
RA[routers/ai]
RP[routers/payouts]
RW[routers/webhooks]
end
subgraph svc [Services]
SU[UserService]
SB[BookingService]
SC[CallService]
SW[WebhookService]
SP[PayoutService]
end
subgraph infra [Infrastructure]
DB[database.py — Prisma client]
DEP[dependencies.py — auth + rate limiter]
CFG[config.py — typed settings]
end
M --> RU & RO & RI & RB & RC & RA & RP & RW
RU --> SU --> DB
RB --> SB --> DB
RC --> SC --> DB
RW --> SW --> DB
RP --> SP --> DB
DEP -.-> RU & RB & RC & RA & RP
| Path | Responsibility |
|---|---|
| `app/routers/` | HTTP contract, request validation, response shaping |
| `app/services/` | Business logic, third-party calls, DB writes |
| `app/dependencies.py` | Clerk JWKS verification, user context, token-bucket rate limiting |
| `app/config.py` | Typed Pydantic settings loaded from env |
| `app/database.py` | Prisma client lifecycle |
| `prisma/schema.prisma` | Canonical data model and migration history |
User
id, clerkUserId, email
role: UNASSIGNED | INTERVIEWEE | INTERVIEWER
interviewee fields: credits, currentPlan, creditsLastAllocatedAt
interviewer fields: bio, title, company, yearsExp, categories, creditRate, creditBalance
Availability
interviewerId → User
startTime, endTime
status: AVAILABLE | BOOKED | BLOCKED
Booking
intervieweeId → User
interviewerId → User
startTime, endTime
status: SCHEDULED | COMPLETED | CANCELLED
creditsCharged, streamCallId, recordingUrl
Feedback
bookingId → Booking
summary, technical, communication, problemSolving, recommendation
strengths[], improvements[]
overallRating, sessionRating
CreditTransaction
userId → User
bookingId → Booking (nullable)
amount
type: CREDIT_PURCHASE | BOOKING_DEDUCT | BOOKING_EARNING | ADMIN_ADJUST
Payout
interviewerId → User
credits, platformFee, netAmount
paymentMethod, paymentDetail
status: PROCESSING | PROCESSED
@@index([interviewerId, startTime]) // slot lookup per interviewer
@@index([status, createdAt]) // admin and status queues
@@index([interviewerId, status]) // interviewer dashboard queries
@@index([intervieweeId, status]) // interviewee appointments list
Two data planes: (1) Server Actions for auth-aware first-paint data loading. (2) `/api/proxy/[...path]` route handler so the browser never holds backend secrets or CORS credentials directly.
flowchart TB
subgraph rsc [Server layer]
Pg[app — RSC pages]
SA[actions/ — Server Actions]
BF[lib/ — typed fetch helpers]
end
subgraph proxy [API Proxy]
PX[app/api/proxy/path — forward + auth]
end
subgraph ui [Client layer]
Cmp[components/ — UI primitives]
HK[hooks/ — React Query wrappers]
end
Pg --> SA --> BF
Cmp --> HK -->|same-origin fetch| PX
PX -->|BACKEND_URL + JWT| API[FastAPI]
BF -->|BACKEND_URL + JWT| API
| Area | Responsibility |
|---|---|
| `app/(auth)/` | Clerk sign-in/sign-up pages and callbacks |
| `app/(main)/` | Protected product pages (explore, book, dashboard, call room) |
| `app/api/proxy/[...path]/` | BFF proxy: attaches Clerk session token, forwards to FastAPI |
| `actions/` | Server Actions for booking, onboarding, payout flows |
| `components/` | Accessible UI built on shadcn/ui and Radix primitives |
| `hooks/` | React Query wrappers with cache invalidation after mutations |
| `lib/` | Typed API fetch helpers, Stream client setup |
| `types/` | Shared TypeScript interfaces matching Prisma schema |
Google Gemini is used for two distinct tasks, each with a separate service method and prompt strategy.
After Stream delivers call.transcription_ready, the backend:
- Downloads the raw transcript from Stream.
- Normalizes speaker segments: maps Stream speaker labels to `Interviewer`/`Interviewee` roles.
- Sends the normalized transcript to Gemini with a structured prompt requesting JSON output across six dimensions.
- Parses and validates the response.
- Persists to `Feedback` in a single DB transaction alongside the booking status update and interviewer earning.
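The normalization step in the list above can be sketched as a mapping from speaker IDs to roles, with consecutive segments from the same speaker merged into one line. The segment shape (`speaker_id`, `text`) is an assumption for illustration and may not match Stream's actual transcript format; `normalize_segments` is a hypothetical name.

```python
from typing import Dict, Iterable, List, Tuple


def normalize_segments(segments: Iterable[Dict[str, str]],
                       interviewer_id: str, interviewee_id: str) -> str:
    """Turn raw segments into 'Role: text' lines suitable for a prompt."""
    roles = {interviewer_id: "Interviewer", interviewee_id: "Interviewee"}
    lines: List[Tuple[str, str]] = []
    for segment in segments:
        text = segment.get("text", "").strip()
        if not text:
            continue  # drop empty or whitespace-only segments
        role = roles.get(segment.get("speaker_id", ""), "Unknown")
        # Merge consecutive segments from the same speaker into one line.
        if lines and lines[-1][0] == role:
            lines[-1] = (role, lines[-1][1] + " " + text)
        else:
            lines.append((role, text))
    return "\n".join(f"{role}: {text}" for role, text in lines)
```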
Output schema sent to Gemini and stored verbatim:
{
"summary": "string",
"technical": "string",
"communication": "string",
"problemSolving": "string",
"recommendation": "STRONG_HIRE | HIRE | NO_HIRE",
"strengths": ["string"],
"improvements": ["string"],
"overallRating": "number (1–10)",
"sessionRating": "number (1–10)"
}
The idempotency guard (skip if Feedback already exists) runs before the Gemini call, so repeated webhook delivery never triggers a duplicate API request or DB write.
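Model output should be treated as untrusted input until it has been checked against the schema. The sketch below shows one way to parse and validate the feedback JSON with the standard library only. `parse_feedback` is a hypothetical name, and treating `sessionRating` as optional is an assumption, since the webhook flow lists it separately from the fields Gemini returns.

```python
import json

RECOMMENDATIONS = {"STRONG_HIRE", "HIRE", "NO_HIRE"}
TEXT_FIELDS = ("summary", "technical", "communication", "problemSolving")
LIST_FIELDS = ("strengths", "improvements")


def _check_rating(name: str, value: object) -> None:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name} must be a number")
    if not 1 <= value <= 10:
        raise ValueError(f"{name} must be between 1 and 10")


def parse_feedback(raw: str) -> dict:
    """Parse model output and raise ValueError if it does not match the schema."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("feedback must be a JSON object")
    for name in TEXT_FIELDS:
        if not isinstance(data.get(name), str) or not data[name].strip():
            raise ValueError(f"{name} must be a non-empty string")
    if data.get("recommendation") not in RECOMMENDATIONS:
        raise ValueError("recommendation must be STRONG_HIRE, HIRE, or NO_HIRE")
    for name in LIST_FIELDS:
        items = data.get(name)
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{name} must be a list of strings")
    _check_rating("overallRating", data.get("overallRating"))
    if "sessionRating" in data:  # assumed optional
        _check_rating("sessionRating", data["sessionRating"])
    return data
```

Raising on invalid output keeps a malformed response from being persisted; the caller can then decide whether to retry the model call or record a failure.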
Before a session, interviewers or interviewees can request role-specific questions via POST /api/ai/questions. The service constructs a prompt with the interviewer's declared categories and seniority, asks Gemini for a structured list, and returns it directly — no DB write, no side effects. This keeps the route stateless and easy to retry.
| Layer | Technology | Why it fits |
|---|---|---|
| Frontend | Next.js 16, React 19, TypeScript | App Router, Server Actions, typed UI throughout |
| Styling | Tailwind CSS v4, shadcn/ui, Radix UI | Accessible primitives and consistent design tokens |
| Auth | Clerk | Managed auth, JWT issuance, JWKS endpoint for backend verification |
| Video | Stream Video SDK | Calls, recording, transcription, and webhook delivery |
| Backend | FastAPI, Python 3.12, Uvicorn | Async I/O, strict Pydantic schemas, clean route/service layering |
| Database | PostgreSQL on Supabase | Managed relational data; Prisma as typed query client |
| ORM | Prisma Client Python | Shared schema with typed Python client and migration history |
| AI | Google Gemini | Question generation and post-call feedback summarization |
| Email | Resend | Transactional payout notification to admin |
| Containers | Docker multi-stage builds | Reproducible production runtime, small final image |
| CI/CD | GitHub Actions | Path-filtered independent service pipelines |
| Cloud | GCP Cloud Run, Artifact Registry | Managed containers, scale to zero, simple rollout/rollback |
ai-interview/
├── .github/
│ └── workflows/
│ ├── deploy-backend.yml
│ └── deploy-frontend.yml
├── scripts/
│ └── setup-gcp.sh
├── backend/
│ ├── app/
│ │ ├── main.py
│ │ ├── config.py
│ │ ├── database.py
│ │ ├── dependencies.py # auth verification + rate limiting
│ │ ├── routers/ # HTTP contract layer
│ │ └── services/ # business logic + third-party calls
│ ├── prisma/
│ │ ├── schema.prisma
│ │ └── migrations/
│ ├── Dockerfile
│ └── requirements.txt
└── frontend/
├── app/
│ ├── (auth)/ # Clerk pages
│ ├── (main)/ # protected product pages
│ └── api/proxy/[...path]/ # BFF proxy route
├── actions/ # Server Actions
├── components/ # UI primitives
├── hooks/ # React Query wrappers
├── lib/ # API helpers, Stream setup
├── types/ # Shared TypeScript interfaces
├── Dockerfile
└── package.json
- Node.js 22+
- Python 3.12
- Docker (optional, reproduces CI image locally)
- Accounts: Clerk, Stream.io, Resend, Google AI Studio, Supabase
cd backend
cp .env.example .env # fill from Configuration section below
python3.12 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
prisma generate
prisma db push
uvicorn app.main:app --reload --port 8000
- OpenAPI UI: http://localhost:8000/docs
- Liveness: `GET /health`
cd frontend
cp .env.example .env.local # fill NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY, etc.
npm install
npm run dev                    # http://localhost:3000
Stream delivers webhooks to a public URL. Use ngrok to expose the local backend:
ngrok http 8000
Set the Stream webhook URL in your Stream dashboard:
https://<ngrok-domain>/api/webhooks/stream
| Variable | Description |
|---|---|
| `DATABASE_URL` | Supabase pooler URL for runtime queries |
| `DIRECT_URL` | Supabase direct URL for migrations and schema pushes |
| `CLERK_SECRET_KEY` | Clerk secret key |
| `CLERK_PUBLISHABLE_KEY` | Clerk publishable key (optional backend reference) |
| `CLERK_ISSUER` | Clerk issuer URL for JWKS verification |
| `NEXT_PUBLIC_STREAM_API_KEY` | Stream API key |
| `STREAM_SECRET_KEY` | Stream secret key for server-side token generation |
| `GEMINI_API_KEY` | Google AI Studio key for question/feedback generation |
| `RESEND_API_KEY` | Resend key for payout email notifications |
| `ADMIN_EMAIL` | Email receiving payout requests |
| `ADMIN_PAYOUT_PASSWORD` | Password required for payout approval |
| `NEXT_PUBLIC_APP_URL` | Frontend URL used in generated links |
| `CORS_ORIGINS` | Comma-separated allowed frontend origins |
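Loading these variables into a typed object at startup means a missing value stops the process immediately instead of surfacing on the first request that needs it. The project uses Pydantic settings for this; the sketch below is a standard-library stand-in that shows the same fail-fast behaviour. Which variables count as required, and the `Settings` and `load_settings` names, are assumptions.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping

# Assumed subset of required variables, chosen for illustration.
REQUIRED = ("DATABASE_URL", "CLERK_ISSUER", "STREAM_SECRET_KEY", "GEMINI_API_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    clerk_issuer: str
    stream_secret_key: str
    gemini_api_key: str
    cors_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, raising if anything required is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required environment variables: "
                           + ", ".join(missing))
    # CORS_ORIGINS is comma-separated; tolerate spaces and empty entries.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        database_url=env["DATABASE_URL"],
        clerk_issuer=env["CLERK_ISSUER"],
        stream_secret_key=env["STREAM_SECRET_KEY"],
        gemini_api_key=env["GEMINI_API_KEY"],
        cors_origins=origins,
    )
```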
| Variable | Description |
|---|---|
| `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY` | Clerk publishable key baked into frontend bundle |
| `CLERK_SECRET_KEY` | Server-side Clerk secret for Server Actions |
| `NEXT_PUBLIC_STREAM_API_KEY` | Stream API key baked into frontend bundle |
| `NEXT_PUBLIC_APP_URL` | Public frontend base URL |
| `BACKEND_URL` | Server-side FastAPI base URL (never exposed to the client bundle) |
| Secret | Description |
|---|---|
| `GCP_PROJECT_ID` | GCP project ID |
| `GCP_SERVICE_ACCOUNT` | WIF-enabled deploy service account |
| `GCP_WORKLOAD_IDENTITY_PROVIDER` | WIF provider resource |
| `NEXT_PUBLIC_CLERK_PUBLISHABLE_KEY` | Clerk frontend key |
| `CLERK_SECRET_KEY` | Clerk backend key |
| `NEXT_PUBLIC_STREAM_API_KEY` | Stream API key |
| `STREAM_SECRET_KEY` | Stream secret |
| `NEXT_PUBLIC_APP_URL` | Cloud Run frontend URL |
| `BACKEND_URL` | Cloud Run backend URL |
| `DATABASE_URL` | Supabase pooler connection string |
| `DIRECT_URL` | Supabase direct connection string |
| `CLERK_ISSUER` | Clerk issuer URL |
| `RESEND_API_KEY` | Resend API key |
| `GEMINI_API_KEY` | Google AI Studio key |
| `ADMIN_EMAIL` | Admin payout email |
| `ADMIN_PAYOUT_PASSWORD` | Payout approval password |
| `CORS_ORIGINS` | Allowed frontend origins |
# Frontend: type check
cd frontend && npm run typecheck
# Frontend: full production build (catches import and build-time errors)
cd frontend && npm run build
# Backend: syntax check (byte-compiles every module; does not type-check or resolve imports)
cd backend && python3 -m compileall app
# Smoke test the live backend
curl https://intervue-backend-cxljs3igra-el.a.run.app/health
The CI pipeline runs npm run build and python3 -m compileall on every push. The deploy pipeline additionally runs a Docker container smoke test before any image reaches Cloud Run.
| Workflow | Path filter | Pipeline |
|---|---|---|
| `deploy-backend.yml` | `backend/**` | docker build → run container → GET /health smoke → push `asia-south1-docker.pkg.dev/.../intervue-backend:<sha>` → gcloud run deploy |
| `deploy-frontend.yml` | `frontend/**` | Same pattern → smoke → push `intervue-frontend:<sha>` → deploy |
Images are tagged with the git commit SHA:
asia-south1-docker.pkg.dev/<project>/intervue/intervue-backend:<sha>
asia-south1-docker.pkg.dev/<project>/intervue/intervue-frontend:<sha>
GCP auth uses Workload Identity Federation — GitHub does not store long-lived service account JSON keys.
# Edit PROJECT_ID, BILLING_ACCOUNT, and GITHUB_REPO first.
bash scripts/setup-gcp.sh
The script creates or configures: GCP project, billing link, required APIs, Artifact Registry, GitHub Actions service account, and WIF pool/provider.
Backend:
gcloud builds submit backend \
--tag asia-south1-docker.pkg.dev/<project>/intervue/intervue-backend:<tag>
gcloud run deploy intervue-backend \
--image asia-south1-docker.pkg.dev/<project>/intervue/intervue-backend:<tag> \
--region asia-south1 \
--platform managed \
--allow-unauthenticated \
--port 8000 \
--env-vars-file backend-env.yaml
Frontend:
gcloud builds submit frontend --config frontend-cloudbuild.yaml
gcloud run deploy intervue-frontend \
--image asia-south1-docker.pkg.dev/<project>/intervue/intervue-frontend:<tag> \
--region asia-south1 \
--platform managed \
--allow-unauthenticated \
--port 3000 \
--env-vars-file frontend-env.yaml
All protected routes require:
Authorization: Bearer <clerk_jwt>
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/health` | Liveness / CI smoke check |
| `GET` | `/api/user/me` | Load authenticated user profile |
| `POST` | `/api/user/sync` | Upsert user from Clerk payload |
| `POST` | `/api/onboarding` | Set role and complete onboarding |
| `GET` | `/api/interviewers` | Browse interviewer listings |
| `GET` | `/api/interviewers/:id` | Single interviewer profile |
| `POST` | `/api/bookings` | Book a slot (deducts credits atomically) |
| `GET` | `/api/appointments` | Upcoming appointments for the caller |
| `GET` | `/api/calls/:callId` | Authorize call access + issue Stream token |
| `POST` | `/api/ai/questions` | Generate AI questions for a session |
| `GET` | `/api/payouts/:id` | Payout detail (admin) |
| `POST` | `/api/payouts/:id/approve` | Approve payout (admin password required) |
| `POST` | `/api/webhooks/stream` | Stream recording/transcription webhook |
Error responses follow FastAPI's standard shape:
{ "detail": "message" }
Stateless auth: The backend verifies Clerk JWTs via JWKS on every request and keeps no server-side session state, making Cloud Run instances disposable and horizontally scalable.
Idempotent webhooks: Stream can redeliver the same event. The handler checks for existing feedback and earning records before writing — duplicate webhook delivery produces no duplicate business events.
Transactional booking: Credit deduction, booking creation, and Stream call creation are sequenced deliberately. At larger scale, a database-level uniqueness guard on (interviewerId, startTime) should replace the pre-transaction conflict check.
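A database-level guard moves the double-booking check into the insert itself, so two concurrent requests for the same slot cannot both succeed. The sketch below demonstrates the idea with a `UNIQUE` constraint in the standard-library `sqlite3` module, used here only as a runnable stand-in for PostgreSQL; in the Prisma schema the equivalent would be a `@@unique([interviewerId, startTime])` attribute. Table and function names are hypothetical.

```python
import sqlite3


def create_slot_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table: the constraint is the only part that matters here.
    conn.execute(
        """
        CREATE TABLE slot_bookings (
            id INTEGER PRIMARY KEY,
            interviewer_id TEXT NOT NULL,
            start_time TEXT NOT NULL,
            interviewee_id TEXT NOT NULL,
            UNIQUE (interviewer_id, start_time)
        )
        """
    )


def try_book(conn: sqlite3.Connection, interviewer_id: str,
             start_time: str, interviewee_id: str) -> bool:
    """Insert a booking; return False if the slot is already taken."""
    try:
        with conn:  # rolls back automatically when the insert fails
            conn.execute(
                "INSERT INTO slot_bookings (interviewer_id, start_time, interviewee_id) "
                "VALUES (?, ?, ?)",
                (interviewer_id, start_time, interviewee_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The database rejected the duplicate; no pre-check was needed.
        return False
```

A plain uniqueness constraint would also block rebooking a slot whose earlier booking was cancelled. PostgreSQL supports partial unique indexes that could exclude cancelled rows, though that would have to be added through a raw SQL migration.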
Token-bucket rate limiting:
booking_rate_limit = RateLimiter(capacity=5, refill_rate=5 / 3600)
withdrawal_rate_limit = RateLimiter(capacity=3, refill_rate=3 / 3600)
This is intentionally local-process based. Multi-region or high-traffic deployments should move to Redis-backed distributed rate limiting.
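For readers unfamiliar with token buckets, here is a minimal in-process sketch that accepts the same `capacity` and `refill_rate` arguments as the constructor calls shown here. The project's own `RateLimiter` is not reproduced in this document, so the `allow` method, the per-key dictionary, and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class RateLimiter:
    """Per-key token bucket: `capacity` burst size, `refill_rate` tokens per second."""

    def __init__(self, capacity: int, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        # key -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str) -> bool:
        """Consume one token for `key` if available; return whether the call may proceed."""
        now = self._clock()
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never beyond capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_rate)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

With `capacity=3` and `refill_rate=3 / 3600`, a user can make three requests in a burst and then regains one token every twenty minutes. Because the buckets live in one process, each Cloud Run instance enforces its own limit, which is the scope limitation noted above.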
Known gaps: webhook logging is application-level only — a production hardening pass should add structured logs, trace IDs, and alerting around failed transcript/feedback generation. Secrets currently flow through Cloud Run env vars; Secret Manager would improve rotation and access auditing.
Saurav Kumar. Contributions:
- Designed the stateless auth model: JWKS caching in FastAPI rather than session storage, so Cloud Run stays horizontally scalable.
- Chose atomic DB transactions for booking and earning to make the credit ledger auditable rather than eventually consistent.
- Built the idempotent webhook handler with a pre-write existence check so Stream retries are safe.
- Isolated the Gemini pipeline into a single service method (normalize → prompt → parse → persist) with the idempotency guard upstream of the API call.
- Structured the BFF proxy so the browser never holds backend secrets or CORS credentials.
- Set up path-filtered GitHub Actions with Docker smoke gates so a broken container never reaches Cloud Run.
- Documented known gaps honestly rather than hiding them.
Intervue is intentionally small enough to understand in a day, but structured like a service that can grow:
- Routes stay thin — they validate input and delegate to services.
- Services own business logic — no DB writes in routers, no HTTP calls in repositories.
- Environment config is typed — missing variables fail at startup, not at runtime.
- Deployments are independently smoke tested — a broken image never reaches production.
- External side effects are isolated behind backend services — the frontend never writes to the DB or calls Stream directly.
- Known tradeoffs are documented rather than hidden — this is the bar for a system the next engineer can own.
That is the engineering standard this project is built to.