Date: 2025-09-15T03:00:00Z Reporter: claude-parser v2.0.1 team Service: semantic-search MCP
Semantic search fails to find conceptually related code patterns, returning keyword-matched but semantically unrelated files.
- Query:
"black box test real data API test integration" - Expected: Test files, test patterns, testing utilities
- Actual: Random files with keywords but no semantic relevance
# Query: "black box test real data API test integration"
1. test_single.py - "Hello World" print statement (NOT a test)
2. main.py - Basic main function (NOT a test)
3. verify_spec.py - Empty file (0 bytes)
4. settings.py - Test credentials we're trying to REMOVE
5. analytics/__init__.py - Module init (unrelated)Should return files that are semantically related to testing:
- Actual test files (test_*.py)
- Test fixtures and utilities
- Testing patterns and frameworks
- Black box testing examples
The service appears to use literal keyword matching rather than semantic understanding:
- Matches "test" in filename regardless of content
- Doesn't understand "black box testing" as a concept
- Returns files with ANY matching keyword, not conceptual relevance
- @SEMANTIC_SEARCH_REQUIRED pattern becomes unreliable
- Cannot find existing patterns before writing code
- Violates @DRY_FIRST and @UTIL_FIRST enforcement
- Forces manual searching instead of automated discovery
- Use embeddings that understand testing concepts
- Weight actual test files higher (test_*.py pattern)
- Consider file structure context (tests/ directory)
- Understand testing terminology (black box, unit, integration)
Currently using grep/fd for literal searches instead of semantic search for testing patterns.
HIGH - Core LNCA pattern (@SEMANTIC_SEARCH_REQUIRED) is compromised
Detected: 2025-01-15T21:00:00Z Issue: find_violations() timing out after 5 seconds Fix Applied: Added 30s timeout to httpx.AsyncClient Status: RESOLVED
Detected: 2025-01-15T20:45:00Z Issue: MCP tools required explicit project parameter when already in project directory Fix Applied: Added os.getcwd() fallback to auto-detect project from CWD Status: RESOLVED Files Changed:
- mcp_server.py:115-138 (find_violations, check_architecture_compliance)
# This query should return test files, not random files:
results = discover_code_patterns("black box testing pattern example")
assert any("test_" in r['file_name'] for r in results)
assert not any("settings.py" in r['file_name'] for r in results)Filed per @IMMEDIATE_ALERT and @SEMANTIC_SEARCH_BUGS patterns