Evaluation: Update scoring semantics

**Is your feature request related to a problem?**  
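The TopicRelevanceOpenAI validator has hardcoded scoring instructions that cause false positives for forbidden-topic configurations. A score of 3 is defined as "clearly in scope", which conflicts with an exclusion-based system prompt (e.g. "do not answer queries about gender detection"), where a high score should mean "clearly NOT forbidden".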

**Describe the solution you'd like**  
- Update the scoring semantics so that, for exclusion-based prompts, a high score means "clearly NOT forbidden".
- Revise the TopicRelevanceOpenAI validator so its scoring instructions no longer conflict with the configured system prompt.
- Test forbidden-topic configurations to confirm the new scoring matches user intent.
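
One possible shape for the change, as a minimal sketch rather than the validator's actual code: select the scoring rubric from the prompt mode instead of hardcoding it. `TopicMode`, `build_scoring_instructions`, the rubric wording, and the 1 to 3 scale are all hypothetical; only the "score 3" meanings are taken from this issue.

```python
from enum import Enum


class TopicMode(Enum):
    """Hypothetical switch describing how the system prompt lists topics."""
    ALLOW = "allow"    # prompt lists the topics that are in scope
    FORBID = "forbid"  # prompt lists the topics that must be refused


# Assumed rubric text. Only the meaning of score 3 comes from the issue;
# the lower scores and the 1-3 range are illustrative assumptions.
_RUBRICS = {
    TopicMode.ALLOW: {
        1: "clearly out of scope",
        2: "ambiguous or partially in scope",
        3: "clearly in scope",
    },
    TopicMode.FORBID: {
        1: "clearly about a forbidden topic",
        2: "ambiguous or partially related to a forbidden topic",
        3: "clearly NOT forbidden",
    },
}


def build_scoring_instructions(mode: TopicMode) -> str:
    """Return the scoring section of the prompt for the given mode."""
    rubric = _RUBRICS[mode]
    lines = ["Score the user query on a scale of 1 to 3:"]
    # Emit the scores in ascending order so the prompt text is stable.
    lines += [f"{score} = {meaning}" for score, meaning in sorted(rubric.items())]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_scoring_instructions(TopicMode.FORBID))
```

With this layout a high score means "safe to answer" in both modes, so any downstream threshold on the score would not need to change between allow-list and exclusion-based prompts.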

<details><summary>Original issue</summary>

**Which feature or component needs enhancement?**
The TopicRelevanceOpenAI validator had hardcoded scoring instructions that caused false positives for forbidden-topic configurations. When a user configured the system prompt with forbidden topics (e.g. "do not answer queries about gender detection"), the scoring semantics misled the model: a score of 3 meant "clearly in scope", which conflicted with the intent of an exclusion-based prompt, where a high score should mean "clearly NOT forbidden".

</details>

Evaluation: Update scoring semantics #129