Use separate tokio runtime for IO in datafusion-cli#21138

Closed
alamb wants to merge 2 commits into apache:main from alamb:alamb/use_separate_io_pool

@alamb
Contributor

@alamb alamb commented Mar 24, 2026

Which issue does this PR close?

Rationale for this change

While profiling #20820 I noticed that datafusion-cli is still not keeping its CPU entirely busy, for unclear reasons.

One interesting thing I noted is that there are no separate IO threads, even though there should be.

I want to test and see if using a separate IO runtime makes a difference

What changes are included in this PR?

Add a separate IO runtime to datafusion-cli and ensure that all IO happens on that runtime
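To illustrate the general idea of keeping IO on its own threads, here is a minimal std-only sketch. It is not the code in this PR, which uses a second tokio runtime; `IoPool`, `spawn`, and the `io-N` thread names are hypothetical names chosen for the example. A fixed number of named threads run IO closures, and the caller gets back a channel so it can keep doing CPU work until it needs the result.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical fixed-size pool of threads reserved for blocking IO.
struct IoPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl IoPool {
    fn new(threads: usize) -> Self {
        assert!(threads > 0, "an IO pool needs at least one thread");
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..threads)
            .map(|i| {
                let receiver = Arc::clone(&receiver);
                thread::Builder::new()
                    .name(format!("io-{i}"))
                    .spawn(move || loop {
                        // The lock is released at the end of this statement,
                        // so it is held only while waiting for the next job.
                        let job = receiver.lock().unwrap().recv();
                        match job {
                            Ok(job) => job(),
                            Err(_) => break, // pool dropped, no more jobs
                        }
                    })
                    .expect("failed to spawn IO thread")
            })
            .collect();
        IoPool { sender: Some(sender), workers }
    }

    /// Queues `io` on the pool and returns a receiver for its result,
    /// so the caller can continue with CPU work in the meantime.
    fn spawn<T, F>(&self, io: F) -> Receiver<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (tx, rx) = channel();
        let job: Job = Box::new(move || {
            // The caller may have dropped the receiver; ignore that case.
            let _ = tx.send(io());
        });
        self.sender
            .as_ref()
            .expect("IO pool already shut down")
            .send(job)
            .expect("IO worker threads have exited");
        rx
    }
}

impl Drop for IoPool {
    fn drop(&mut self) {
        // Dropping the sender closes the channel so the workers exit.
        drop(self.sender.take());
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}
```

A caller would do something like `let rx = pool.spawn(|| std::fs::read("data.bin"));`, continue with CPU work, and call `rx.recv()` when the bytes are needed (`data.bin` is a placeholder path). Unlike `spawn_blocking`, the number of IO threads here is fixed up front.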

Are these changes tested?

Are there any user-facing changes?

@alamb alamb marked this pull request as draft March 24, 2026 15:21
@alamb
Contributor Author

alamb commented Mar 24, 2026

This didn't seem to make a meaningful difference in performance -- I need to look more carefully at the IO pattern

@Dandandan
Contributor

> no separate IO threads

My current view: I think we are currently using spawn_blocking fine to keep IO off the main / query threads. That is not optimal (because we create too many, potentially unbounded, threads), but it is OK for separating IO calls from CPU work.

I think though that we:

  • Don't perfectly balance work - that's where morselization will help to balance work evenly across threads (i.e. all threads are 90% busy most of the time)
  • Don't interleave IO and CPU - that's where prefetching might help to get the last ~10-15% or so extra utilization
  • There are also some IO-level efficiency improvements, like avoiding a file open / close on each read, which might help a bit (although it will matter even less when the data is prefetched).
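As a rough illustration of the prefetching point above, the following std-only sketch overlaps IO and CPU work with a bounded channel. It is not DataFusion code: `read_chunk`, `process_chunk`, and `sum_with_prefetch` are hypothetical stand-ins. The IO thread reads up to `depth` chunks ahead while the calling thread processes the chunks that have already arrived.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for an IO call (e.g. reading one chunk of a file).
fn read_chunk(i: u64) -> Vec<u64> {
    // A real reader would block on disk or network here.
    (i * 4..i * 4 + 4).collect()
}

/// Hypothetical stand-in for CPU-bound work on one chunk.
fn process_chunk(chunk: &[u64]) -> u64 {
    chunk.iter().sum()
}

/// Reads `num_chunks` chunks on a dedicated IO thread while the calling
/// thread processes them; at most `depth` chunks are buffered ahead.
fn sum_with_prefetch(num_chunks: u64, depth: usize) -> u64 {
    // The bounded channel applies backpressure: the IO thread blocks
    // once `depth` chunks are waiting to be processed.
    let (tx, rx) = sync_channel::<Vec<u64>>(depth);
    let io = thread::spawn(move || {
        for i in 0..num_chunks {
            if tx.send(read_chunk(i)).is_err() {
                break; // the consumer went away, stop reading
            }
        }
        // `tx` is dropped here, which ends the consumer's iteration.
    });
    let total: u64 = rx.iter().map(|chunk| process_chunk(&chunk)).sum();
    io.join().expect("IO thread panicked");
    total
}
```

With `depth` set to 0 the channel becomes a rendezvous, so each read waits for the consumer to be ready; larger values let the reader run further ahead at the cost of more buffered memory.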

@alamb alamb closed this Apr 8, 2026
2 participants