fix output for query cli #2017
Greptile Summary
This PR filters the output of the query CLI subcommand so that it prints only text, source, and page_number.
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/adapters/cli/sdk_workflow.py | Adds a projection step in query_documents to strip VDB-internal fields (_distance, etc.) and expose only text, source, and page_number. Uses .get() with safe defaults. One list comprehension line exceeds the 120-char limit enforced by Black/Flake8. |
| nemo_retriever/tests/test_root_cli_workflow.py | Test updated to distinguish raw_hits (full VDB response with _distance) from public_hits (filtered output), correctly asserting the CLI prints only the three public fields. |
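
The projection and the raw-versus-public distinction described in the table can be sketched as follows. This is a minimal illustration that assumes hits are plain dicts; the helper name `project_hits` and the sample data are hypothetical and are not taken from the PR.

```python
from typing import Any


def project_hits(raw_hits: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the public fields of each hit (hypothetical helper, not the PR's code)."""
    return [
        {
            # .get() with a default avoids a KeyError when a field is absent.
            "text": hit.get("text", ""),
            "source": hit.get("source", ""),
            # No default here, so a missing page number becomes None.
            "page_number": hit.get("page_number"),
        }
        for hit in raw_hits
    ]


# Sample data invented for illustration only.
raw_hits = [
    {"text": "alpha", "source": "a.pdf", "page_number": 1, "_distance": 0.12},
    {"text": "beta"},  # source and page_number are missing
]
public_hits = project_hits(raw_hits)
```

With this shape, a test can assert against `public_hits` while still feeding `raw_hits` (including `_distance`) into the code under test.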
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CLI: retriever query] --> B[query_documents]
B --> C[Retriever.query]
C --> D[LanceDB VDB]
D --> E["raw hits\n{text, source, page_number,\n_distance, ...}"]
E --> F["project hits\n{text, source, page_number}"]
F --> G[JSON output to stdout]
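
The flow in the chart can be approximated end to end with a stand-in for the retriever. This is only a sketch under stated assumptions: `FakeRetriever` and `query_documents_sketch` are hypothetical names, the canned hit is invented, and the real `query_documents` builds a `Retriever` backed by LanceDB rather than accepting one as an argument.

```python
import json


class FakeRetriever:
    """Hypothetical stand-in that returns a canned hit instead of querying LanceDB."""

    def query(self, query: str) -> list[dict]:
        return [
            {"text": "hello", "source": "doc.pdf", "page_number": 3, "_distance": 0.4}
        ]


def query_documents_sketch(retriever, query: str) -> list[dict]:
    """Fetch raw hits, then project them down to the three public fields."""
    hits = retriever.query(query)
    return [
        {
            "text": hit.get("text", ""),
            "source": hit.get("source", ""),
            "page_number": hit.get("page_number"),
        }
        for hit in hits
    ]


# Serialize the projected hits the way a CLI might before writing to stdout.
output = json.dumps(query_documents_sketch(FakeRetriever(), "greeting"), indent=2)
print(output)
```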
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
nemo_retriever/src/nemo_retriever/adapters/cli/sdk_workflow.py:85
The list comprehension on this line is ~130 characters, exceeding the project's enforced 120-character limit (Black + Flake8). The pre-commit hook will reject the file as-is, blocking CI. Black would reformat it to a multi-line style.
```suggestion
    hits = [
        {
            "text": hit.get("text", ""),
            "source": hit.get("source", ""),
            "page_number": hit.get("page_number"),
        }
        for hit in hits
    ]
```
Reviews (3): Last reviewed commit: "Merge branch 'clean-cli-skills' of https..."
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…gest into clean-cli-skills
| retriever = Retriever(top_k=top_k, vdb_kwargs={"uri": lancedb_uri, "table_name": table_name}) | ||
| return retriever.query(query) | ||
| hits = retriever.query(query) | ||
| hits = [{"text": hit.get("text", ""), "source": hit.get("source", ""), "page_number": hit.get("page_number")} for hit in hits] |
The list comprehension on this line is ~130 characters, exceeding the project's enforced 120-character limit (Black + Flake8). The pre-commit hook will reject the file as-is, blocking CI. Black would reformat it to a multi-line style.
randerzander
left a comment
returns are much cleaner, thanks!
Description
Cleans the output of the query subcommand so that it displays only source, page_number, and text.