Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: REST Catalog implementation #965

Merged
merged 16 commits into from
Mar 10, 2025

Conversation

connortsui20
Copy link
Contributor

@connortsui20 connortsui20 commented Feb 12, 2025

Followup of #962

#962 Introduced a bug where it is not some of the methods allow for both StatusCode::OK and StatusCode::NO_CONTENT as success cases, when in reality it should be one or the other (this was me, sorry about that).

This PR attempts to unify the 3 different types of response helpers that essentially all do the exact same thing slightly differently. The main addition here is a function query_catalog:

    // Queries the Iceberg REST catalog with the given `Request` and a provided handler.
    pub async fn query_catalog<R, H, Fut>(&self, mut request: Request, handler: H) -> Result<R>
    where
        R: DeserializeOwned,
        H: FnOnce(Response) -> Fut,
        Fut: Future<Output = Result<R>>,
    {
        self.authenticate(&mut request).await?;

        let response = self.client.execute(request).await?;

        handler(response).await
    }

By allowing each Catalog method to specify how they want to handle the responses, it gets much finer control on the success/error cases as well as the error messages. Previously, there were 3 functions that all did similar things:

    pub async fn query<R: DeserializeOwned, E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
    ) -> Result<R> {

    pub async fn execute<E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
    ) -> Result<()> {

    pub async fn do_execute<R, E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
        handler: impl FnOnce(&Response) -> Option<R>,
    ) -> Result<R> {

I'm also somewhat using this as a chance to refactor some other parts of this crate, mainly documentation and examples.

@Xuanwo It would be great if I could get feedback on some of these proposed changes before I keep going!

@connortsui20 connortsui20 marked this pull request as ready for review February 16, 2025 03:13
liurenjie1024
liurenjie1024 previously approved these changes Feb 17, 2025
Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @connortsui20 for this pr, LGTM! Let's wait for a moment to have more people to take a review on this.

@connortsui20
Copy link
Contributor Author

connortsui20 commented Feb 24, 2025

@Xuanwo could you take a look at the latest commit and see if it makes sense?

@connortsui20 connortsui20 requested a review from Xuanwo February 28, 2025 21:56
@Xuanwo
Copy link
Member

Xuanwo commented Mar 1, 2025

@Xuanwo could you take a look at the latest commit and see if it makes sense?

Hi, I'm still not in favor of the handler style. The stack size of my brain is not sufficient to handle the extra handler (sorry for that). Maybe we could consider the following?

let resp: Response = client.query_catalog(request).await?;
if response.status() == StatusCode::OK {
    deserialize_catalog_response::<CatalogConfig>(response).await
} else {
    Err(deserialize_unexpected_catalog_error(response).await)
}

@connortsui20
Copy link
Contributor Author

connortsui20 commented Mar 1, 2025

@Xuanwo could you take a look at the latest commit and see if it makes sense?

Hi, I'm still not in favor of the handler style. The stack size of my brain is not sufficient to handle the extra handler (sorry for that). Maybe we could consider the following?

let resp: Response = client.query_catalog(request).await?;
if response.status() == StatusCode::OK {
    deserialize_catalog_response::<CatalogConfig>(response).await
} else {
    Err(deserialize_unexpected_catalog_error(response).await)
}

Okay, I'll make that change and then we can reevaluate if it makes more sense to delay any serialization errors + extra boilerplate. You're probably right by the way, I guess I just think about this weirdly.

@connortsui20
Copy link
Contributor Author

@Xuanwo ready for review! I think that this version is a lot more boilerplate but that should be fine as it is more or less linearly readable.

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @connortsui20 for working on this. I love this version.

@Xuanwo
Copy link
Member

Xuanwo commented Mar 3, 2025

Inviting @liurenjie1024 for another look.

@Xuanwo
Copy link
Member

Xuanwo commented Mar 6, 2025

I'll merge this PR in three days if there are no further objections. I believe it's on the right track, and additional polish can be made on top of it.

@Xuanwo Xuanwo merged commit 875349c into apache:main Mar 10, 2025
17 checks passed
@connortsui20 connortsui20 deleted the rest-catalog-cleanup branch March 10, 2025 05:21
ZENOTME pushed a commit to risingwavelabs/iceberg-rust that referenced this pull request Mar 12, 2025
Followup of apache#962

apache#962 Introduced a bug where it is not some of the methods allow for both
`StatusCode::OK` and `StatusCode::NO_CONTENT` as success cases, when in
reality it should be one or the other (this was me, sorry about that).

This PR attempts to unify the 3 different types of response helpers that
essentially all do the exact same thing slightly differently. The main
addition here is a function `query_catalog`:

```rust
    // Queries the Iceberg REST catalog with the given `Request` and a provided handler.
    pub async fn query_catalog<R, H, Fut>(&self, mut request: Request, handler: H) -> Result<R>
    where
        R: DeserializeOwned,
        H: FnOnce(Response) -> Fut,
        Fut: Future<Output = Result<R>>,
    {
        self.authenticate(&mut request).await?;

        let response = self.client.execute(request).await?;

        handler(response).await
    }
```

By allowing each `Catalog` method to specify how they want to handle the
responses, it gets much finer control on the success/error cases as well
as the error messages. Previously, there were 3 functions that all did
similar things:

```rust
    pub async fn query<R: DeserializeOwned, E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
    ) -> Result<R> {

    pub async fn execute<E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
    ) -> Result<()> {

    pub async fn do_execute<R, E: DeserializeOwned + Into<Error>>(
        &self,
        mut request: Request,
        handler: impl FnOnce(&Response) -> Option<R>,
    ) -> Result<R> {
```

I'm also somewhat using this as a chance to refactor some other parts of
this crate, mainly documentation and examples.

@Xuanwo It would be great if I could get feedback on some of these
proposed changes before I keep going!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants