Skip to content

DAO Pagination (GSI-2412)#231

Open
TheByronHimes wants to merge 7 commits into
mainfrom
feature/dao_pagination
Open

DAO Pagination (GSI-2412)#231
TheByronHimes wants to merge 7 commits into
mainfrom
feature/dao_pagination

Conversation

@TheByronHimes

@TheByronHimes TheByronHimes commented Jun 10, 2026

Copy link
Copy Markdown
Member

The DAO protocol and providers now expose pagination parameters in find_all(): sort, skip, and limit. Additionally, the result of find_all() exposes a .total_count() method that returns the total number of items matching the filter regardless of skip/limit. Together, these should enable both offset-based and cursor-based pagination.

Bumps the version to 8.3.0

@TheByronHimes TheByronHimes marked this pull request as ready for review June 10, 2026 14:52
@TheByronHimes TheByronHimes requested a review from Cito June 10, 2026 14:52
@coveralls

Copy link
Copy Markdown

Coverage Report for CI Build 27284487376

Coverage increased (+0.009%) to 93.308%

Details

  • Coverage increased (+0.009%) from the base build.
  • Patch coverage: 4 uncovered changes across 1 file (87 of 91 lines covered, 95.6%).
  • No coverage regressions found.

Uncovered Changes

File Changed Covered %
src/hexkit/providers/mongokafka/provider/daopub.py 26 22 84.62%
Total (4 files) 91 87 95.6%

Coverage Regressions

No coverage regressions found.


Coverage Stats

Coverage Status
Relevant Lines: 3915
Covered Lines: 3653
Line Coverage: 93.31%
Coverage Strength: 0.93 hits per line

💛 - Coveralls

@Cito Cito left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that looks useful.

I have pointed out two edge cases that need some refinement, and made one suggestions regarding the sort spec.

A FindResult that is async-iterable and also provides total_count().
"""
skip = skip or 0
limit = limit or 0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment that limit=0 means unbounded in mongo (not all db engines do this)

if sort:
cursor = cursor.sort(sort)
async for document in cursor.skip(skip).limit(limit):
if document.get("__metadata__", {}).get("deleted", False):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you filter after skipping and limiting - it should be the other way round

i guess the most performant way is to add a filter for (metadata missing or metadata.deleted false) upfront.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, yes, I don't know how I didn't catch that. Good spot

Number of matching resources to skip before yielding results.
Defaults to None (no skipping).
limit:
Maximum number of resources to yield. Defaults to None (no limit).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring should also define what limit=0 means - unbounded (as in mongo) or literally 0 (which I prefer since it#s much more intuitive). The implementation is currently for the former, the latter would need slight adjustment for mongo.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case for having a limit of literally 0?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing a value of 0 is not really useful (if you want unbound you can already just pass nothing/None). My main point is that the behavior should be well specified in the docstring. Options are 1) raise error, 2) do exactly what was requested (return nothing) or 3) re-interpret as "infinity". I think 2) is the most obvious one - Django ORM, Python slicing, Postgres and MySQL all work that way.

results = [x.title async for x in dao.find_all(mapping={}, skip=10, sort=asc_sort)]
assert results == []

# skip=0, limit=0 behaves like no pagination (limit=0 means no limit)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should test both, limit=0 and limit=None and check that they do what we say in the docstring (for all providers)


with pytest.raises(UniqueConstraintViolationError):
await dao.upsert(dto5)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should also add some pagination tests after adding raw documents with no metadata at all, and metadata.deleted = True

]

SortSpec = list[tuple[str, Literal[1, -1]]]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use simple simple strings with a "-" prefix for descending order - that's how the Django ORM does it, so it's a well-established pattern in Python.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants