@AdamGS AdamGS commented Jan 13, 2026

Make the underlying scan concurrency configurable through DataFusion.

@AdamGS AdamGS requested review from a10y and gatesn January 13, 2026 19:04
@AdamGS AdamGS added the feature label (Release label indicating a new feature or request) Jan 13, 2026
/// during footer parsing.
pub footer_initial_read_size_bytes: usize, default = DEFAULT_FOOTER_INITIAL_READ_SIZE_BYTES
/// The per-file Vortex scan concurrency.
pub scan_concurrency: Option<usize>, default = None
Contributor

Doesn't DF let you set this for an entire SessionContext? Do we want to override this on a per-source basis?

Contributor Author

I think that's the copy of VortexOptions that is part of VortexFormat, which propagates it downstream from there through either VortexFormatFactory::create or VortexFormat::file_source.

Contributor

I was wondering if it makes sense to use the target_partitions config off the environment, but I realize that's different.

Maybe we can make it clear in the doc comment that this is the intra-partition concurrency.

Contributor Author

intra-partition was the term I was looking for!
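
To make the distinction in this thread concrete, here is a minimal std-only sketch, not the code from this PR. It assumes that `target_partitions` controls how many partitions run in parallel, while `scan_concurrency` bounds the concurrent work inside each partition's file scan. `ScanOptions`, `ASSUMED_DEFAULT_SCAN_CONCURRENCY`, `effective_scan_concurrency`, and `max_in_flight_tasks` are hypothetical names for illustration; only the `scan_concurrency: Option<usize>` field comes from the diff, and the fallback value is invented.

```rust
/// Hypothetical stand-in for the options struct in the diff.
/// Only the new field is modeled here.
#[derive(Debug, Clone, Default)]
struct ScanOptions {
    /// The per-file Vortex scan concurrency. This is the intra-partition
    /// concurrency, separate from the number of partitions being scanned.
    /// `None` means "use the scan's own default".
    scan_concurrency: Option<usize>,
}

/// Assumed fallback when the option is unset. The value is illustrative
/// only and is not taken from Vortex.
const ASSUMED_DEFAULT_SCAN_CONCURRENCY: usize = 2;

impl ScanOptions {
    /// Resolve the configured value, falling back to the assumed default
    /// and never returning zero.
    fn effective_scan_concurrency(&self) -> usize {
        self.scan_concurrency
            .unwrap_or(ASSUMED_DEFAULT_SCAN_CONCURRENCY)
            .max(1)
    }
}

/// Rough upper bound on scan tasks running at once, assuming each of the
/// `target_partitions` partitions scans one file at a time.
fn max_in_flight_tasks(target_partitions: usize, opts: &ScanOptions) -> usize {
    target_partitions.saturating_mul(opts.effective_scan_concurrency())
}
```

Under these assumptions, 8 partitions with `scan_concurrency: Some(4)` give a bound of 32 concurrent scan tasks. This is why the per-source setting is a different knob from the session-wide partition count.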

codecov bot commented Jan 13, 2026

Codecov Report

❌ Patch coverage is 60.71429% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.39%. Comparing base (d757be1) to head (f1a5852).
⚠️ Report is 3 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vortex-datafusion/src/persistent/source.rs | 30.00% | 7 Missing ⚠️ |
| vortex-datafusion/src/persistent/format.rs | 62.50% | 3 Missing ⚠️ |
| vortex-datafusion/src/persistent/opener.rs | 90.00% | 1 Missing ⚠️ |

Signed-off-by: Adam Gutglick <[email protected]>
@AdamGS AdamGS force-pushed the adamg/add-scan-concurrency branch from 25c697a to f1a5852 Compare January 14, 2026 12:18