Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileSource and DataSource traits require deep copies #14939

Open
alamb opened this issue Feb 28, 2025 · 2 comments
Open

FileSource and DataSource traits require deep copies #14939

alamb opened this issue Feb 28, 2025 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@alamb
Copy link
Contributor

alamb commented Feb 28, 2025

Is your feature request related to a problem or challenge?

While working on various upgrade PRs and preparing for the DataFusion 46 release, I have noticed something I would like to change before we release

The FileSource and DataSource traits were introduced in the datasource refactor

They have APIs to update the underlying source in a few ways, but the APIs require cloning. For example, FileSource looks like this:

/// Common behaviors that every file format needs to implement.
///
/// See initialization examples on `ParquetSource`, `CsvSource`
pub trait FileSource: Send + Sync {
...
    /// Initialize new type with batch size configuration
    fn with_batch_size(&self, batch_size: usize) -> Arc<dyn FileSource>;
...
}

The only way to implement with_batch_size is to (deep) clone the object

    fn with_batch_size(&self, batch_size: usize) -> Arc<dyn FileSource> {
        let mut conf = self.clone();
        conf.batch_size = Some(batch_size);
        Arc::new(conf)
    }

fn with_batch_size(&self, batch_size: usize) -> Arc<dyn FileSource> {
let mut conf = self.clone();
conf.batch_size = Some(batch_size);
Arc::new(conf)
}

Describe the solution you'd like

I would like to avoid having to deep clone the object

Describe alternatives you've considered

No response

Additional context

No response

@alamb alamb added the enhancement New feature or request label Feb 28, 2025
@alamb alamb assigned alamb and unassigned alamb Feb 28, 2025
@alamb
Copy link
Contributor Author

alamb commented Feb 28, 2025

Maybe we can try a mutable API instead of a builder style API 🤔

@Standing-Man
Copy link
Contributor

take

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants