Mango `match_failures/2` function by jcoglan · Pull Request #5858 · apache/couchdb

jcoglan · 2026-01-21T16:30:42Z

Overview

This PR represents work so far on a version of mango_selector:match/2 that can return a representation of how the input fails to match the selector, rather than just a boolean. A few commits develop this behaviour incrementally before everything "snaps into place", giving us code paths that can produce failure descriptions while retaining the original boolean behaviour, which avoids creating a lot of ephemeral failure lists and can short-circuit on compound operators.

The rough steps here are as follows; I'm retaining all these commits for now in case we want to compare different designs for code complexity and performance.

Add a lot of unit tests to mango_selector that check every match operator and its negation. This is necessary since various compound operators like $or and $allMatch have surprising edge case behaviour on empty lists and we need to make sure this is not broken. Mostly these tests check that if selector S returns true on an input then { "$not": S } returns false and vice versa. There are some exceptions to this due to how $or works and how $and and $or are normalised.
Replace the existing implementation with one where every operator returns a possibly-empty list of failures, instead of a boolean. match_int then converts this to a bool on the way out.
Replace the Cmp argument to everything with a ctx record that contains cmp, as well as other things needed for failure generation, e.g. the path to the current value, whether matching is currently negated, etc.
Try to fix $allMatch and $elemMatch normalisation to avoid the complexity that comes from not being able to apply DeMorgan to them, due to $allMatch being defined to return false for empty lists. This is a breaking change that is reverted later since we decided to retain existing behaviour above all.
Implement negation handling, where the presence of a $not has to be communicated to nodes lower down the tree in order to produce good failure messages. Not all negation can be normalised out of the tree and so all operators need to handle being negated during evaluation.
Finally, collapse all the complexity into an implementation that supports both the old and new behaviours.
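To illustrate why `$allMatch` resists De Morgan rewriting, here is a minimal sketch using a toy boolean matcher. The module name `demorgan_sketch`, the tuple-based selector shape, and the two-argument `match/2` are assumptions made for illustration and are not the real `mango_selector` code. It also assumes that `$elemMatch` returns false on an empty list.

```erlang
-module(demorgan_sketch).
-export([match/2]).

%% Toy selectors are {Operator, Argument} tuples (hypothetical shape).
%% $not simply inverts the boolean result of the inner selector.
match({'$not', S}, V) ->
    not match(S, V);
match({'$gt', A}, V) ->
    V > A;
%% $allMatch is defined to return false for an empty list,
%% which is the edge case the description refers to.
match({'$allMatch', _S}, []) ->
    false;
match({'$allMatch', S}, L) when is_list(L) ->
    lists:all(fun(E) -> match(S, E) end, L);
%% Assumption: $elemMatch needs at least one matching element,
%% so it is also false for an empty list.
match({'$elemMatch', S}, L) when is_list(L) ->
    lists:any(fun(E) -> match(S, E) end, L).
```

On the empty list, `match({'$not', {'$allMatch', {'$gt', 0}}}, [])` is `true`, while the De Morgan rewrite `match({'$elemMatch', {'$not', {'$gt', 0}}}, [])` is `false`. The two forms agree on non-empty lists, so the rewrite is only unsafe at this edge case, which is why the negation has to be carried through evaluation instead.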

The idea in the design I've ended up with that tries to minimise both complexity and runtime cost is:

Add #ctx.verbose which indicates whether a detailed failure description is wanted.
Keep the original implementations of all operators as the response when passed #ctx{verbose=false}, i.e. when only a boolean result is needed.
For all leaf operators, the #ctx{verbose=true} case can be implemented by calling the #ctx{verbose=false} case, and creating a #failure record if this returns false.
#ctx{verbose=false} cases do not need to deal with #ctx{negate=true}; they continue to return their original result and let $not invert it. We only need special handling of #ctx{negate=true} in verbose mode, where the $not operator passes its effect down via the #ctx. This reduces the number of cases each operator has to deal with to basically: non-verbose mode, and positive and negative verbose cases.
For compound operators, special code is needed to gather up the failures from internal selectors and deal with edge cases in a way that's consistent with the original implementation.
#ctx.path is only updated in #ctx{verbose=true} code paths so this expense is avoided in non-verbose mode. Path items are added to the front of this list as that's cheaper than doing Path ++ [Item]; we would reverse these before returning to a client.
#failure records retain a #ctx from where they can access the path and negation state, in order to generate a good human-readable error message later on.
The tests are updated to make sure that both verbose modes return consistent results, i.e. if verbose=false returns true, then verbose=true returns [], and if the former returns false, the latter gives a non-empty list. These are all passing.
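The design above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the module name `verbose_sketch`, the tuple-based selectors, the `new_ctx/1` helper, and the exact record fields are assumptions, and only two leaf operators plus `$not` are covered.

```erlang
-module(verbose_sketch).
-export([new_ctx/1, match/3]).

%% Hypothetical records: field names follow the description above,
%% not the real mango_selector source.
-record(ctx, {cmp, verbose = false, negate = false, path = []}).
-record(failure, {op, arg, ctx}).

%% Build a context with a plain term-order comparison function.
new_ctx(Verbose) ->
    Cmp = fun
        (A, B) when A < B -> -1;
        (A, B) when A > B -> 1;
        (_, _) -> 0
    end,
    #ctx{cmp = Cmp, verbose = Verbose}.

%% $not, boolean mode: evaluate the inner selector and invert the result.
match({'$not', Sel}, Value, #ctx{verbose = false} = Ctx) ->
    not match(Sel, Value, Ctx);
%% $not, verbose mode: pass the negation down via the context.
match({'$not', Sel}, Value, #ctx{verbose = true, negate = Neg} = Ctx) ->
    match(Sel, Value, Ctx#ctx{negate = not Neg});
%% Leaf operators, boolean mode: the original true/false behaviour.
match({'$gt', Arg}, Value, #ctx{verbose = false, cmp = Cmp}) ->
    Cmp(Value, Arg) > 0;
match({'$eq', Arg}, Value, #ctx{verbose = false, cmp = Cmp}) ->
    Cmp(Value, Arg) =:= 0;
%% Leaf operators, verbose mode: reuse the boolean clause, then build a
%% #failure record if the result disagrees with the negation state.
match({Op, Arg} = Sel, Value, #ctx{verbose = true, negate = Neg} = Ctx) ->
    Matched = match(Sel, Value, Ctx#ctx{verbose = false}),
    case Matched =/= Neg of
        true -> [];
        false -> [#failure{op = Op, arg = Arg, ctx = Ctx}]
    end.
```

With this sketch, `match({'$gt', 5}, 3, new_ctx(false))` returns `false` and the same call with `new_ctx(true)` returns a one-element failure list, which is the consistency property the tests check. Compound operators are left out because they need their own failure-gathering logic.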

Testing recommendations

We should benchmark this in its current version, and both verbose modes of this version, against a substantial indexing workload to look for performance regressions. Or, if performance is equivalent in both verbose modes, we can remove a lot of redundancy by removing the verbose flag entirely.

Related Issues or Pull Requests

RFC: proposal for declarative VDUs #5792

Checklist

Code is written and works correctly
Changes are covered by tests
Any new configurable parameters are documented in rel/overlay/etc/default.ini
Documentation changes were made in the src/docs folder
Documentation changes were backported (separated PR) to affected branches

nickva · 2026-01-21T16:56:53Z

That's a nice approach using a context record in place of the Cmp arg.

All the extra eunit tests are awesome. If you want, could even put them in a separate PR and we'd merge them right away. It would make it easier to review subsequent PRs because we can obviously see all the existing tests pass.

Rather than returning a boolean to indicate just success or failure, `mango_selector:match/2` now returns a list of "failures" describing the ways in which the selector failed to match the input. If this list is empty, the match was a success.

We will need to pass other things around between `match` calls as well as the current `Cmp` function, so here we replace this argument with a `#ctx` record that initially just contains a `cmp` field.

To give detailed feedback to the caller, the `#ctx` argument to `mango_selector:match/3` now records the path that was taken to reach each value, and this path is added to the `#failure` records. Each path segment is either a binary, if it represents an object property, or an integer if it represents an array index. Items are pushed on the front of `#ctx.path` as this is faster than pushing onto the back of a list. This list can then be reversed once the final list of failures has been generated, before the failures are presented to the caller.
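A small sketch of the path bookkeeping described above, under stated assumptions: the module `path_sketch` and its `check/1` function are hypothetical, documents are modelled as Erlang maps with binary keys, array indexes are assumed to start at 0, and a "failure" here is simply any leaf that is not an integer.

```erlang
-module(path_sketch).
-export([check/1]).

%% Return the path to every non-integer leaf. Paths are built in reverse
%% while walking and reversed once at the end, before being presented.
check(Value) ->
    [lists:reverse(P) || P <- walk(Value, [])].

%% Object: push the property name (a binary) on the front of the path.
walk(Map, Path) when is_map(Map) ->
    Pairs = lists:sort(maps:to_list(Map)),
    lists:append([walk(V, [K | Path]) || {K, V} <- Pairs]);
%% Array: push the integer index on the front of the path.
walk(List, Path) when is_list(List) ->
    Indexed = lists:zip(lists:seq(0, length(List) - 1), List),
    lists:append([walk(V, [I | Path]) || {I, V} <- Indexed]);
%% Integer leaf: nothing to report.
walk(N, _Path) when is_integer(N) ->
    [];
%% Any other leaf: report the (still reversed) path to it.
walk(_Other, Path) ->
    [Path].
```

For example, `path_sketch:check(#{<<"a">> => [1, <<"x">>]})` returns `[[<<"a">>, 1]]`. Pushing with `[Item | Path]` is a constant-time operation, whereas `Path ++ [Item]` copies the whole list each time.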

Collecting detailed `#failure` records rather than a boolean true/false when evaluating selectors imposes a performance penalty, so we would like to only do this when a selector is used for a VDU, not when it is used for indexing/filtering. To this end we introduce "verbose" mode signalled via the `#ctx.verbose` field, and each branch of `mango_selector:match/3` now has 3 distinct versions:

- `#ctx{verbose = false}`: this is the original version that returns true/false, taken when a selector is used for Mango queries.
- `#ctx{verbose = true, negate = false}`: verbose mode, when the operator is not negated by an enclosing `$not` operator. Returns a list of `#failure` records which may be empty.
- `#ctx{verbose = true, negate = true}`: verbose mode, when the operator is negated by an enclosing `$not` operator. Returns a list of `#failure` records.

The different negation modes are needed because, in order to generate meaningful failure messages, we need to record whether an operator was negated. The behaviour of combinators like `$and`, `$or`, `$allMatch` and `$elemMatch` means not all `$not` operators can be normalized out of the selector before evaluation. Instead, when we encounter a `$not` during evaluation, we flip the `#ctx.negate` field before evaluating the inner operator.

Until now, document updates rejected by a Mango VDU returned an opaque "forbidden" message to the client. This commit adds a detailed list of failures, obtained by converting the `#failure` records returned by `mango_selector:match/3` into human-readable messages.
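As a rough idea of what turning a failure into a readable message could look like, here is a sketch. The module `failure_fmt_sketch`, the `format/3` signature, and the message wording are all invented for illustration; the actual messages produced by this commit may differ.

```erlang
-module(failure_fmt_sketch).
-export([format/3]).

%% RevPath is the path as stored during matching (most recent segment
%% first), Op is the operator as an atom, Negated is the negation state.
format(RevPath, Op, Negated) ->
    Segments = [segment(S) || S <- lists:reverse(RevPath)],
    Path = lists:join(<<".">>, Segments),
    Verb =
        case Negated of
            true -> <<"must not match">>;
            false -> <<"must match">>
        end,
    iolist_to_binary([Path, <<": ">>, Verb, <<" ">>, atom_to_binary(Op, utf8)]).

%% Object properties are binaries, array indexes are integers.
segment(S) when is_binary(S) -> S;
segment(I) when is_integer(I) -> integer_to_binary(I).
```

For example, `failure_fmt_sketch:format([1, <<"a">>], '$gt', true)` returns `<<"a.1: must not match $gt">>`.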

jcoglan force-pushed the mango-match-failures branch 3 times, most recently from 5076677 to 7f8f999 Compare February 4, 2026 09:57

jcoglan force-pushed the mango-match-failures branch 2 times, most recently from 69fe09f to fa65b74 Compare February 11, 2026 14:12

jcoglan changed the base branch from main to 3.5.x February 13, 2026 09:38

jcoglan mentioned this pull request Feb 13, 2026

$data operator for VDUs #5889

Draft

6 tasks

jcoglan force-pushed the mango-match-failures branch from fa65b74 to 1c23788 Compare February 13, 2026 11:39

jcoglan changed the base branch from 3.5.x to main February 13, 2026 11:40

jcoglan mentioned this pull request Feb 20, 2026

Mango unit tests #5895

Merged

6 tasks

janl added this to the 3.8 milestone Feb 27, 2026

jcoglan force-pushed the mango-match-failures branch from 1c23788 to 82142aa Compare March 13, 2026 16:31

jcoglan added 7 commits March 23, 2026 14:12

chore: Replace Cmp argument to mango_selector:match/3 with a record

468b2e0

We will need to pass other things around between `match` calls as well as the current `Cmp` function, so here we replace this argument with a `#ctx` record that initially just contains a `cmp` field.

[wip] install erlperf

6c8e7ae

[wip] some benches

74d02e0

jcoglan force-pushed the mango-match-failures branch from 82142aa to 74d02e0 Compare March 23, 2026 14:28

jcoglan marked this pull request as ready for review March 23, 2026 14:31

jcoglan wants to merge 7 commits into apache:main from neighbourhoodie:mango-match-failures

3 participants
