
Karknu/tx undecision#5352

Draft
karknu wants to merge 80 commits into main from karknu/tx_undecision

@karknu karknu commented Apr 9, 2026

Contributor

Description

Txsubmission V2, but without a central decision thread.
WIP. WIP. WIP.

Checklist

Quality

  • Commit sequence makes sense and has useful messages, see ref.
  • New tests are added and existing tests are updated.
  • Self-reviewed the PR.

Maintenance

  • Linked an issue or added the PR to the current sprint of ouroboros-network project.
  • Added labels.
  • Updated changelog files.
  • The documentation has been properly updated, see ref.

Attempt to create a buildable version of what's released in cardano-node
10.7.0.
@github-project-automation github-project-automation Bot moved this to In Progress in Ouroboros Network Apr 9, 2026
@karknu karknu force-pushed the karknu/tx_undecision branch 2 times, most recently from e3189b8 to 47f2ed7 Compare April 14, 2026 07:50
@karknu karknu force-pushed the karknu/tx_undecision branch from b747b36 to 441a997 Compare April 17, 2026 08:48
karknu added 23 commits April 27, 2026 10:13
In case of duplicate txids, a second invalid tx could be counted as
valid because it was tracked by txid alongside the valid duplicate,
conflating the two.
Add a test for ensuring that a peer can download all valid TXs when
faced with two peers with conflicting tx order.
Replace the central decision thread by having server threads coordinate
between them by blocking on STM actions.
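As a rough illustration of coordinating by blocking on STM instead of going through a central thread, the sketch below has each server thread wait until a shared set of unclaimed txids is non-empty and then claim one. `Unclaimed`, `claimNext`, and the use of `Int` for txids are hypothetical and not taken from this branch.

```haskell
import Control.Concurrent.STM
import qualified Data.Set as Set

-- Hypothetical: txids that no peer thread has claimed yet.
type Unclaimed = TVar (Set.Set Int)

-- Block (via STM 'retry') until some txid is available, then claim the
-- smallest one. The runtime re-runs the transaction when the TVar changes,
-- so no central thread has to hand out work.
claimNext :: Unclaimed -> STM Int
claimNext var = do
  pending <- readTVar var
  case Set.minView pending of
    Nothing           -> retry
    Just (txid, rest) -> do
      writeTVar var rest
      pure txid
```

A server thread would call `atomically (claimNext var)` in its loop; it sleeps while the set is empty and wakes when another thread adds a txid.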
Use sharedGeneration to track if the sharedState really changed and only
write to the tvar if it changed.

This makes common operations like receiving and acking a txid that is
already retained something that only affects the peer local state.
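A minimal sketch of the generation idea, assuming the pure update bumps a counter only when it really changes something. Only the name `sharedGeneration` comes from the commit message; `Shared`, `retainTxId`, and `commitIfChanged` are hypothetical.

```haskell
import Control.Concurrent.STM
import qualified Data.Set as Set

-- Hypothetical shared state: a generation counter plus retained txids.
data Shared = Shared
  { sharedGeneration :: !Int
  , retainedTxIds    :: !(Set.Set Int)
  }

-- Pure update: bump the generation only when the set really changes.
retainTxId :: Int -> Shared -> Shared
retainTxId txid s
  | txid `Set.member` retainedTxIds s = s
  | otherwise =
      s { sharedGeneration = sharedGeneration s + 1
        , retainedTxIds    = Set.insert txid (retainedTxIds s)
        }

-- Write the TVar only if the generation moved; returns whether it wrote.
commitIfChanged :: TVar Shared -> (Shared -> Shared) -> STM Bool
commitIfChanged var f = do
  old <- readTVar var
  let new = f old
  if sharedGeneration new /= sharedGeneration old
    then writeTVar var new >> pure True
    else pure False
```

Skipping the write matters because transactions that only read the TVar are not invalidated or woken when nothing was written.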
Remove shared states that were only written to.
Have the peer update its score during phase change. This makes State's
idle calculation quicker since peers will drain back to 0.

Mark pacIdlePeerScores as lazy to avoid doing the calculations when it
isn't needed.
Add benchmarks for V2
Instead of traversing all peers only touch the peers that need to wake
up.
Avoid copying the map when encountering an existing advertiser.
We don't need to track the ack state, it can be derived by the time we
decide to ack a TX. So txAdvertisers can just be a Set member check.
Remove the tracking of advertisers in TxEntry.
This change also changes how scoring is used to rank peers.
A peer's score affects how long after the TX owner's lease expires
it can wake up and attempt to claim the TX. This means that
acknowledgement/downloading requires minimal coordination between peers.
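The score-based wake-up described above can be sketched as a pure function. Everything here is an assumption for illustration: the `Lease` type, millisecond `Int` times, and passing the score-to-delay mapping in as a parameter.

```haskell
-- All names are hypothetical; times and delays are in milliseconds.
newtype Lease = Lease { leaseExpiry :: Int }

-- Earliest time a non-owning peer may try to claim the TX: the owner's
-- lease expiry plus a delay derived from the peer's own score.
earliestClaim :: (Int -> Int) -> Lease -> Int -> Int
earliestClaim scoreDelay lease score = leaseExpiry lease + scoreDelay score

-- A peer only needs its own score and the lease to decide when to wake up.
mayClaim :: (Int -> Int) -> Lease -> Int -> Int -> Bool
mayClaim scoreDelay lease score now =
  now >= earliestClaim scoreDelay lease score
```

Because each peer computes its own wake-up time from the lease and its score, peers do not need to negotiate an order among themselves.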
The first peer that advertised a new txid always got a lease on it.
This is a problem since the peer may be at capacity and unable to
request the TX.
Update the behaviour so that any peer that advertises a txid can gain
the first claim.
Replace the local state tvar with V1's Stateful types.
Improve comment around deletion of retained txs.
The retained functions in Types.hs are just small wrappers around IntPSQ
functions.
Use nothunks to assert that there are no thunks after some property
based tests.
If there is at least one TX outstanding, don't ack the final txid in the
window.
We don't track peers that don't have an ongoing attempt any longer so
TxNoAttempt isn't needed.
karknu added 13 commits May 1, 2026 14:34
Convert the direct server benchmark into one that can run multiple peers
with async.
Avoid Double conversions, similar to mux's diffTimeToMicroseconds.
Use IntSet for faster lookups for retained keys.
IntPSQ is still needed for quick removal of expired entries.
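A sketch of the two-index layout: an `IntSet` answers membership queries, while an expiry-ordered structure handles removal of expired entries. To stay within the `containers` package, a `Set` of `(expiry, key)` pairs stands in for the IntPSQ here; the `Retained` type and all function names are hypothetical.

```haskell
import qualified Data.IntSet as IntSet
import qualified Data.Set as Set

-- Hypothetical retained-tx index. The Set of (expiry, key) pairs is a
-- stand-in for the IntPSQ used on the branch.
data Retained = Retained
  { retainedKeys     :: !IntSet.IntSet
  , retainedByExpiry :: !(Set.Set (Int, Int))
  }

emptyRetained :: Retained
emptyRetained = Retained IntSet.empty Set.empty

-- Insert a key with an expiry time; keys already present are left alone.
insertRetained :: Int -> Int -> Retained -> Retained
insertRetained key expiry r
  | IntSet.member key (retainedKeys r) = r
  | otherwise =
      Retained (IntSet.insert key (retainedKeys r))
               (Set.insert (expiry, key) (retainedByExpiry r))

-- Membership check that touches only the IntSet.
isRetained :: Int -> Retained -> Bool
isRetained key = IntSet.member key . retainedKeys

-- Drop every entry whose expiry is at or before 'now'.
expireRetained :: Int -> Retained -> Retained
expireRetained now r =
  let (expired, live) = Set.spanAntitone ((<= now) . fst) (retainedByExpiry r)
      gone            = IntSet.fromList (map snd (Set.toList expired))
  in  Retained (retainedKeys r `IntSet.difference` gone) live
```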
Reduce STM contention in V2 TxSubmission inbound by splitting per-peer
in-flight bookkeeping out of SharedTxState into a small per-peer
PeerTxInFlight TVar. SharedTxState is now only written when the shared
state updates. The common case of a new peer advertising an existing
txid is just a read operation for the shared state and a write operation
into the peer local TVar.
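The split can be pictured roughly as follows. The type names `SharedTxState` and `PeerTxInFlight` come from the commit message, but their fields, the `advertise` function, and `Int` txids are simplified assumptions.

```haskell
import Control.Concurrent.STM
import qualified Data.Set as Set

-- Hypothetical, heavily simplified stand-ins for the types on the branch.
newtype SharedTxState  = SharedTxState  { knownTxIds   :: Set.Set Int }
newtype PeerTxInFlight = PeerTxInFlight { advertisedBy :: Set.Set Int }

-- Record that this peer advertised a txid. The shared TVar is always read
-- but written only when the txid is new; the peer-local TVar is always
-- written. Returns True when the shared state was updated.
advertise :: TVar SharedTxState -> TVar PeerTxInFlight -> Int -> STM Bool
advertise sharedVar peerVar txid = do
  shared <- readTVar sharedVar
  modifyTVar' peerVar (PeerTxInFlight . Set.insert txid . advertisedBy)
  if txid `Set.member` knownTxIds shared
    then pure False
    else do
      writeTVar sharedVar (SharedTxState (Set.insert txid (knownTxIds shared)))
      pure True
```

In this sketch a second peer advertising the same txid only writes its own TVar, so it does not invalidate other peers' transactions that read the shared state.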
Break out mempoolGetSnapshot into its own atomic transaction.
Add the counter thread to the test cases.
Add meta tests for AppV2 generators/shrinkers
Add basic counter test.
Add property that checks for leaks in shared state.
Add scoreAcceptDecrement, which rewards peers that deliver TXs that get
accepted.
Reduce scoreRate to a slow trickle
Ensure that the delay penalty stings by mapping it to between 10ms and
interTxSpace for any peer with a positive score.
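One way to picture the penalty mapping is a clamped linear interpolation. The linear shape and `maxScore` are assumptions, not the branch's actual formula; the 250 ms value for interTxSpace is the one quoted later in this PR.

```haskell
-- Hypothetical constants, in milliseconds.
minPenaltyMs, interTxSpaceMs :: Double
minPenaltyMs   = 10
interTxSpaceMs = 250

-- Hypothetical upper bound on a peer's score.
maxScore :: Double
maxScore = 100

-- Peers with a non-positive score pay nothing. Any positive score maps
-- into [minPenaltyMs, interTxSpaceMs], so even a tiny score costs 10 ms.
delayPenaltyMs :: Double -> Double
delayPenaltyMs score
  | score <= 0 = 0
  | otherwise  = minPenaltyMs + frac * (interTxSpaceMs - minPenaltyMs)
  where
    frac = min 1 (score / maxScore)
```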
@coot coot added the tx-submission (Issues related to tx-submission protocol) and leios (Issues / PRs related to Leios) labels May 8, 2026
@karknu karknu force-pushed the karknu/tx_undecision branch from d6bf451 to f952b66 Compare May 18, 2026 07:48
karknu and others added 3 commits May 18, 2026 11:02
Trace TraceTxInboundCanRequestMoreTxs and
TraceTxInboundCannotRequestMoreTxs just as V1 does.
@karknu karknu force-pushed the karknu/tx_undecision branch from 90a3350 to fdbd89b Compare May 18, 2026 09:58
karknu added 3 commits May 18, 2026 13:44
50ms was too low in my mainnet relay. Most tx requests took longer than
that. 125ms should be long enough for 90% of all requests.
@karknu karknu force-pushed the karknu/tx_undecision branch from 421c6ca to 77c974c Compare May 26, 2026 06:34
@coot coot moved this from In Progress to Done in Ouroboros Network May 28, 2026
karknu added 2 commits May 29, 2026 13:06
Reasoning:
interTxSpace       = 250 ms   p90 target (Asia)
inflightTimeout    = 600 ms   2x interTxSpace + ~100 ms headroom

Labels

leios: Issues / PRs related to Leios
tx-submission: Issues related to tx-submission protocol

Projects

Status: Done

3 participants