AI-powered GitHub PR analytics tool that transforms your pull request history into professional markdown reports with intelligent project categorization, pattern analysis, and performance insights.
- Pipeline Commands: `fetch`, `by-project`, `technical`, and `perf-review` as separate steps
- Auto-fetch: Report commands re-run `fetch` if CSVs are missing
- AI-Powered Insights: Uses OpenAI to generate concise summaries and comprehensive pattern analysis
- Project Categorization: Extracts project names from PR titles (`[CS-1234] ProjectName: description`)
- Professional Reports: Generates markdown reports suited to performance reviews
- Comprehensive Analysis: 5-section analysis covering project focus, technical themes, development velocity, cross-project insights, and key accomplishments
- Smart Metrics: Tracks lines changed, PR distribution, and project priorities
- Performance Review Ready: Actionable insights for self-assessments and project planning
pr/
├── src/
│   ├── github_pr_fetcher.py    # Fetches PRs from GitHub
│   └── pr_summarizer.py        # AI-powered analysis & summarization
├── output/
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_detailed.csv      # Raw PR data
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_summarized.csv    # With AI summaries
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_by_project.md
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_technical_highlights.md
│   └── pr_YYYY-MM-DD_YYYY-MM-DD_perf_review.md
├── main.py                     # Complete workflow entry point
├── requirements.txt            # Dependencies
├── .env.example                # Configuration template
└── README.md
git clone <your-repo>
cd pr
pip install -r requirements.txt

Create a `.env` file with your configuration:

cp .env.example .env

Edit `.env` and add your credentials:
# GitHub Configuration
GITHUB_TOKEN=your_personal_access_token_here
GITHUB_REPO=owner/repository
GITHUB_USERNAME=your_username # Optional: specific user to search for
# Time range (pick one)
DAYS=14 # Past N days from today (default)
# START_DATE=2026-05-04 # Explicit range (both required; overrides DAYS)
# END_DATE=2026-05-08
# AI Configuration (for summarization)
OPENAI_API_KEY=your_openai_api_key_here
# OPENAI_MODEL=gpt-4o-mini # optional; gpt-5* supported automatically

To create a GitHub personal access token:
- Go to https://github.com/settings/tokens
- Click "Generate new token (classic)"
- Select scopes: `repo` (for private repos) or `public_repo` (for public repos)
- Copy the generated token

To create an OpenAI API key:
- Go to https://platform.openai.com/api-keys
- Create a new API key
- Copy the key (keep it secure!)
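The time-range settings in the `.env` template above can be read as a small precedence rule: an explicit `START_DATE`/`END_DATE` pair wins over `DAYS`, and `DAYS` defaults to 14. The sketch below is one way that rule might be implemented. The function name `resolve_date_range` and the choice to count back from today are assumptions for illustration, not the tool's actual code.

```python
from datetime import date, timedelta


def resolve_date_range(env, today=None):
    """Return (start, end) dates from a dict of environment-style settings.

    Assumed rules, taken from the comments in the .env template:
    START_DATE and END_DATE must be set together and override DAYS;
    otherwise the range is the past DAYS days ending today (default 14).
    """
    today = today or date.today()
    start_raw = env.get("START_DATE")
    end_raw = env.get("END_DATE")

    if start_raw or end_raw:
        # An explicit range needs both ends.
        if not (start_raw and end_raw):
            raise ValueError("START_DATE and END_DATE must both be set")
        start = date.fromisoformat(start_raw)
        end = date.fromisoformat(end_raw)
        if start > end:
            raise ValueError("START_DATE must not be after END_DATE")
        return start, end

    # Relative range: count back DAYS days from today.
    days = int(env.get("DAYS", "14"))
    return today - timedelta(days=days), today
```

In a script, `env` would typically be `os.environ` after the `.env` file has been loaded. Passing a plain dict keeps the function easy to test.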
# Default: fetch + by-project + technical + publish to pr-reports
python main.py
# same as:
python main.py ship
# Or run steps individually:
python main.py fetch # → _detailed.csv
python main.py summarize # → _summarized.csv (from latest or --csv detailed)
python main.py by-project # → _by_project.md
python main.py technical # → _technical_highlights.md
python main.py perf-review # → _perf_review.md
python main.py publish # → commit in pr-reports clone
python main.py all # fetch + all reports (no publish)

Report commands need both `_detailed.csv` and `_summarized.csv`. If either is missing, they run `fetch` and/or `summarize` as needed.
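The auto-run behaviour for report commands amounts to a check for the two CSVs before a report is generated. A minimal sketch of that check follows. The helper name `missing_steps` and the file layout (`{prefix}_detailed.csv` and `{prefix}_summarized.csv` inside one output directory) are assumptions based on the file names in this README, not the tool's real internals.

```python
from pathlib import Path


def missing_steps(output_dir, prefix):
    """List the pipeline steps a report command would need to run first.

    Hypothetical helper: each CSV is checked independently, so a missing
    detailed CSV asks for "fetch" and a missing summarized CSV asks for
    "summarize".
    """
    out = Path(output_dir)
    steps = []
    if not (out / f"{prefix}_detailed.csv").exists():
        steps.append("fetch")
    if not (out / f"{prefix}_summarized.csv").exists():
        steps.append("summarize")
    return steps
```

A report command could call this first and run the returned steps in order before writing its markdown file.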
# Past N days (relative to today)
DAYS=7 python main.py ship
# Explicit date range (inclusive)
python main.py ship --start 2026-05-04 --end 2026-05-08
# Same range via environment variables (these can also be set in .env)
START_DATE=2026-05-04 END_DATE=2026-05-08 python main.py fetch
# Publish and push
python main.py publish --push
# Re-run reports for an existing CSV (skips fetch)
python main.py by-project --csv output/pr_2026-05-04_2026-05-08_detailed.csv

| Command | What it does |
|---|---|
| `fetch` | GitHub → `_detailed.csv` |
| `summarize` | `_detailed.csv` → `_summarized.csv` (OpenAI) |
| `by-project` | `_summarized.csv` → `_by_project.md` |
| `technical` | `_detailed.csv` → `_technical_highlights.md` |
| `perf-review` | `_summarized.csv` → `_perf_review.md` |
| `publish` | Copy `output/{prefix}*` to `PR_REPORTS_DIR` and commit |
| `all` | fetch + all three reports (no publish) |
| `ship` | fetch + by-project + technical + publish |
The `_detailed.csv` file holds raw PR data with clean descriptions:
- `pr_url` - Direct link to the PR
- `title` - PR title
- `description` - Clean description (removes boilerplate/templates)
- `lines_of_code_changes` - Total lines changed
- `additions` - Lines added
- `deletions` - Lines deleted
- `created_at` - PR creation timestamp
- `state` - PR state (open/closed)
- `merged` - Whether PR was merged
- `attachments` - URLs of attachments found in PR

The `_summarized.csv` file holds all detailed data plus:
- `ai_summary` - Concise AI-generated summary of each PR
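The columns listed above are enough to reproduce the headline numbers the tool prints (total PRs, total lines changed, average lines per PR). The following is a rough sketch using only the standard library. The sample rows are made up for illustration, and `summarize_rows` is a hypothetical helper rather than a function from this project.

```python
import csv
import io

# Made-up rows using a subset of the documented column names.
SAMPLE = """pr_url,title,lines_of_code_changes,additions,deletions,merged
https://example.com/pr/1,[CS-1] Roles: add schema,120,100,20,True
https://example.com/pr/2,[CS-2] Docker: bump image,30,20,10,False
"""


def summarize_rows(rows):
    """Compute total PRs, total lines changed, and average lines per PR."""
    rows = list(rows)
    total = sum(int(row["lines_of_code_changes"]) for row in rows)
    average = total / len(rows) if rows else 0.0
    return {
        "total_prs": len(rows),
        "total_lines": total,
        "average_lines": round(average, 1),
    }


stats = summarize_rows(csv.DictReader(io.StringIO(SAMPLE)))
print(stats)
```

For a real run, replace the `io.StringIO(SAMPLE)` source with an open file handle on the `_detailed.csv` or `_summarized.csv` file.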
Professional markdown report with:
- Executive Summary: Period, totals, averages
- Project Focus & Impact: Which projects got the most attention
- Technical Themes & Patterns: Performance, security, infrastructure initiatives
- Development Velocity & Scale: Work distribution and iteration patterns
- Cross-Project Insights: Shared challenges and dependencies
- Key Accomplishments & Trends: Significant achievements and innovation
- Individual PR Details: Each PR with project categorization, status, and summary
The tool extracts project names from PR titles that follow this format:

`[TICKET-123] ProjectName: Summary of changes`

Examples:
- `[CS-6304] Roles: DB schema for RBAC` → Roles project
- `[CS-5916] Connector Catalog: update cypress tests` → Connector Catalog project
- `[INFRA-123] Docker: Update base images` → Docker project
PRs that don't match this pattern are categorized as "Uncategorized" and handled separately.
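The title convention above maps naturally onto a regular expression. This is a minimal sketch of how such extraction could work. The pattern and the `extract_project` name are assumptions for illustration, and the tool's real matching rules may be looser or stricter.

```python
import re

# Assumed shape: "[TICKET-123] ProjectName: Summary of changes".
# The ticket key is uppercase letters/digits, a hyphen, then a number.
TITLE_PATTERN = re.compile(r"^\[([A-Z][A-Z0-9]*-\d+)\]\s*([^:]+):\s*(.+)$")


def extract_project(title):
    """Return the project name from a PR title, or "Uncategorized"."""
    match = TITLE_PATTERN.match(title.strip())
    if not match:
        return "Uncategorized"
    return match.group(2).strip()


for title in (
    "[CS-6304] Roles: DB schema for RBAC",
    "[CS-5916] Connector Catalog: update cypress tests",
    "Fix typo in README",
):
    print(f"{title!r} -> {extract_project(title)}")
```

Project names may contain spaces (for example `Connector Catalog`), so the pattern captures everything between the closing bracket and the first colon.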
# Just fetch PR data
python src/github_pr_fetcher.py
# Same commands via pr_summarizer directly
python src/pr_summarizer.py fetch
python src/pr_summarizer.py summarize
python src/pr_summarizer.py ship
python src/pr_summarizer.py publish
python src/pr_summarizer.py all
# Publish the latest (or a specific) output run to pr-reports
python src/publish_reports.py
python src/publish_reports.py --prefix pr_2026-05-09_2026-05-16 --push

Set `PR_REPORTS_DIR` in `.env` to your local clone of pr-reports, then:

python main.py publish
python main.py ship # includes publish
python main.py publish --push

Publishing copies every file matching the run prefix (CSVs and all report markdown files).
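The copy step of publishing can be pictured as a prefix glob over the output directory. The sketch below covers only that copy and omits the git commit and push. `copy_run_files` is a hypothetical helper, and the real `publish_reports.py` may behave differently.

```python
import shutil
from pathlib import Path


def copy_run_files(output_dir, reports_dir, prefix):
    """Copy every file in output_dir whose name starts with prefix.

    Returns the sorted list of copied file names. The destination
    directory is created if it does not exist.
    """
    dest = Path(reports_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(output_dir).glob(f"{prefix}*")):
        if path.is_file():
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

Called as `copy_run_files("output", reports_dir, "pr_2026-05-09_2026-05-16")`, this would pick up the CSVs and the report markdown files for that run while leaving other runs untouched.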
# Last week
DAYS=7 python main.py
# Last month
DAYS=30 python main.py
# Last quarter
DAYS=90 python main.py
# Last year
DAYS=365 python main.py

Starting GitHub PR Analytics Suite
==================================================
Step 1: Fetching PRs from GitHub...
Searching for PRs created by: xindixu
Time range: Past 180 day(s)
Fetching PRs from instabase/instabase created by 'xindixu' after 2025-01-15...
Found 325 PRs matching criteria
Added PR: [CS-6454] DS UI: Disable indexing modal should list connected chatbots (165 lines changed)
Added PR: [CS-6435] DS UI: Indexing items over limit + use real feature flag (165 lines changed)
Added PR: [CS-0000] Roles: GA in 25.30 (11 lines changed)
Added PR: [CS-6452] Roles: gate mount backend for create/edit/delete mount points (50 lines changed)
Processed 5/325 PRs...
...
Exported 325 PRs to output/pr_2025-01-15_2025-07-14_detailed.csv
Summary:
Total PRs found: 325
Total lines changed: 165431
Average lines per PR: 509.0
PR data saved to: output/pr_2025-01-15_2025-07-14_detailed.csv
Step 2: Generating AI summaries...
Loaded 325 PRs from output/pr_2025-01-15_2025-07-14_detailed.csv
OpenAI client initialized with model: gpt-4o-mini
Generating AI summaries...
Processing PR 1/325: [CS-6454] DS UI: Disable indexing modal should lis...
Processing PR 2/325: [CS-6435] DS UI: Indexing items over limit + use r...
...
Analyzing patterns...
Saved summarized data to output/pr_2025-01-15_2025-07-14_summarized.csv
Saved pattern analysis to output/pr_2025-01-15_2025-07-14_summary.md
QUICK ANALYSIS
==================================================
...
WORKFLOW COMPLETE!
==================================================
Detailed PR data: output/pr_2025-01-15_2025-07-14_detailed.csv
AI summarized data: output/pr_2025-01-15_2025-07-14_summarized.csv
Pattern analysis: output/pr_2025-01-15_2025-07-14_summary.md
Check out the examples/ directory for complete sample output demonstrating:
- Detailed CSV: Raw PR data with 15 realistic pull requests
- Summarized CSV: Same data enhanced with AI summaries
- Analysis Report: Comprehensive markdown analysis