feat(api): add feature-flagged /rest/v1/health endpoint#936
feat(api): add feature-flagged /rest/v1/health endpoint#936skypank-coder wants to merge 1 commit into
Conversation
Adds a lightweight deploy/uptime health probe at GET /rest/v1/health, gated behind the CRE_ENABLE_HEALTH feature flag (off by default).
Behavior:
- Flag off (default): endpoint returns 404, as if it does not exist.
- Flag on, healthy: 200 with {ok, db_reachable, cre_count, standards_count} when the serving DB is reachable and holds a non-empty dataset.
- Flag on, unhealthy: 503 when the DB is unreachable or the dataset is empty/broken (reason explains which).
Node_collection.health_check() runs cheap COUNT queries over CRE and Node, never raises (connectivity errors are reported as ok=False), and treats a zero count for either as an empty dataset.
Scope is intentionally limited to DB reachability + data sanity. Deeper checks (gap-analysis completeness, mapping coverage, Neo4j, Redis) are deliberately excluded by design and belong in ops tooling.
Summary by CodeRabbitRelease Notes
WalkthroughAdds a feature-flagged ChangesHealth probe endpoint
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
application/database/db.py (1)
2292-2297: ⚡ Quick winConsider narrowing the broad Exception handler to SQLAlchemy exception types.
The bare
except Exception:on line 2292 is caught by static analysis (Ruff BLE001). While the defensive intent is clear from the comment, you could make this more precise by catchingsqlalchemy.exc.DatabaseError(which is the base class for all SQLAlchemy database exceptions, includingOperationalError) or a tuple of specific exceptions like(OperationalError, ProgrammingError, DatabaseError). This would maintain the defensive posture while being more explicit about what you're guarding against.♻️ Proposed refinement
+from sqlalchemy.exc import DatabaseError + def health_check(self) -> Dict[str, Any]: try: cre_count = self.session.query(func.count(CRE.id)).scalar() or 0 standards_count = self.session.query(func.count(Node.id)).scalar() or 0 except OperationalError: return { "ok": False, "db_reachable": False, "reason": "database unreachable", } - except Exception: # pragma: no cover - defensive, never fail open + except DatabaseError: # pragma: no cover - defensive, catch all DB errors return { "ok": False, "db_reachable": False, "reason": "database health query failed", }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@application/database/db.py` around lines 2292 - 2297, The broad `except Exception:` handler in the database health check function should be narrowed to catch specific SQLAlchemy exceptions instead. Replace the bare Exception catch with either `sqlalchemy.exc.DatabaseError` (which is the base class for all SQLAlchemy database exceptions) or a tuple of specific exceptions like `(OperationalError, ProgrammingError, DatabaseError)`. This maintains the defensive error handling while being more explicit about what database-related errors are being guarded against, which will satisfy the Ruff BLE001 static analysis check.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@application/database/db.py`:
- Around line 2292-2297: The broad `except Exception:` handler in the database
health check function should be narrowed to catch specific SQLAlchemy exceptions
instead. Replace the bare Exception catch with either
`sqlalchemy.exc.DatabaseError` (which is the base class for all SQLAlchemy
database exceptions) or a tuple of specific exceptions like `(OperationalError,
ProgrammingError, DatabaseError)`. This maintains the defensive error handling
while being more explicit about what database-related errors are being guarded
against, which will satisfy the Ruff BLE001 static analysis check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: dde30f7c-1450-4218-9931-ccaaa7a18f56
📒 Files selected for processing (4)
application/database/db.pyapplication/feature_flags.pyapplication/tests/web_main_test.pyapplication/web/web_main.py
Closes #935
What
Thin v1 deploy/uptime health probe at
GET /rest/v1/health, off by default behind theCRE_ENABLE_HEALTHfeature flag.Behavior
cre_count > 0andstandards_count > 0), returning{ok, db_reachable, cre_count, standards_count}.reasonexplains which).How
Node_collection.health_check()runs two cheapCOUNTqueries (overCREandNode), never raises — connectivity errors are reported asok=Falseso the route can return the right status code.Scope
DB reachability + data sanity only. No Neo4j / Redis. No GA / mapping completeness — those are deliberately excluded and stay in ops tooling (
verify_ga_completeness,monitor_ga_health, weekly automation). A heavier "deep health" can be added separately later.Testing
web_main_test.py: disabled→404, enabled+empty→503, enabled+populated→200, DB unreachable→503. All green.black==24.4.2 --checkclean on all changed files.Files changed
application/feature_flags.py—is_health_endpoint_enabled()application/database/db.py—Node_collection.health_check()application/web/web_main.py—/rest/v1/healthrouteapplication/tests/web_main_test.py— tests