
Commit 12d766f

Squashed commit of the following:
commit cda4a0c
Author: congyi wang <[email protected]>
Date:   Mon Sep 30 11:07:36 2024 +0800

    chore: fix typo in FileIO Schemes (apache#653)

    * fix typo
    * fix typo

commit af9609d
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 30 04:06:14 2024 +0100

    fix: page index evaluator min/max args inverted (apache#648)

    * fix: page index evaluator min/max args inverted
    * style: fix clippy lint in test

commit a6a3fd7
Author: Alon Agmon <[email protected]>
Date:   Sat Sep 28 10:10:08 2024 +0300

    test (datafusion): add test for table provider creation (apache#651)

    * add test for table provider creation
    * fix formatting
    * fixing yet another formatting issue
    * testing schema using data fusion

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit 87483b4
Author: Alon Agmon <[email protected]>
Date:   Fri Sep 27 04:40:08 2024 +0300

    making table provider pub (apache#650)

    Co-authored-by: Alon Agmon <[email protected]>

commit 984c91e
Author: ZENOTME <[email protected]>
Date:   Thu Sep 26 17:56:02 2024 +0800

    avoid to create memory schema operator every time (apache#635)

    Co-authored-by: ZENOTME <[email protected]>

commit 4171275
Author: Matheus Alcantara <[email protected]>
Date:   Wed Sep 25 08:28:42 2024 -0300

    scan: change ErrorKind when table dont have spanshots (apache#608)

commit ab51355
Author: xxchan <[email protected]>
Date:   Tue Sep 24 21:25:45 2024 +0800

    fix: compile error due to merge stale PR (apache#646)

    Signed-off-by: xxchan <[email protected]>

commit 420b4e2
Author: Scott Donnelly <[email protected]>
Date:   Tue Sep 24 08:20:23 2024 +0100

    Table Scan: Add Row Selection Filtering (apache#565)

    * feat(scan): add row selection capability via PageIndexEvaluator
    * test(row-selection): add first few row selection tests
    * feat(scan): add more tests, fix bug where min/max args swapped
    * fix: ad test and fix for logic bug in PageIndexEvaluator in-clause handler
    * feat: changes suggested from PR review

commit b3709ba
Author: Christian <[email protected]>
Date:   Tue Sep 24 04:47:04 2024 +0200

    feat: Add NamespaceIdent.parent() (apache#641)

    * Add NamespaceIdent.parent()
    * Use split_last

commit 1533c43
Author: Alon Agmon <[email protected]>
Date:   Mon Sep 23 13:39:46 2024 +0300

    feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate (apache#588)

    * adding main function and tests
    * adding tests, removing integration test for now
    * fixing typos and lints
    * fixing typing issue
    * - added support in schmema to convert Date32 to correct arrow type
      - refactored scan to use new predicate converter as visitor and seperated it to a new mod
      - added support for simple predicates with column cast expressions
      - added testing, mostly around date functions
    * fixing format and lic
    * reducing number of tests (17 -> 7)
    * fix formats
    * fix naming
    * refactoring to use TreeNodeVisitor
    * fixing fmt
    * small refactor
    * adding swapped op and fixing CR comments

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit e967deb
Author: xxchan <[email protected]>
Date:   Mon Sep 23 18:34:59 2024 +0800

    feat: expose remove_all in FileIO (apache#643)

    Signed-off-by: xxchan <[email protected]>

commit d03c4f8
Author: Scott Donnelly <[email protected]>
Date:   Mon Sep 23 08:28:52 2024 +0100

    Migrate to arrow-* v53 (apache#626)

    * chore: migrate to arrow-* v53
    * chore: update datafusion to 42
    * test: fix incorrect test assertion
    * chore: update python bindings to arrow 53

commit 88e5e4a
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Sep 23 15:26:18 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.5 to 1.24.6 (apache#640)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.5 to 1.24.6.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.5...v1.24.6)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit c354983
Author: xxchan <[email protected]>
Date:   Mon Sep 23 14:50:18 2024 +0800

    doc: improve FileIO doc (apache#642)

    Signed-off-by: xxchan <[email protected]>

commit 12e12e2
Author: xxchan <[email protected]>
Date:   Fri Sep 20 19:59:55 2024 +0800

    feat: expose arrow type <-> iceberg type (apache#637)

    * feat: expose arrow type <-> iceberg type

      Previously we only exposed the schema conversion.

      Signed-off-by: xxchan <[email protected]>
    * add tests

      Signed-off-by: xxchan <[email protected]>

    ---------

    Signed-off-by: xxchan <[email protected]>

commit 3b27c9e
Author: xxchan <[email protected]>
Date:   Fri Sep 20 18:32:31 2024 +0800

    feat: add Sync to TransformFunction (apache#638)

    Signed-off-by: xxchan <[email protected]>

commit 34cb81c
Author: Xuanwo <[email protected]>
Date:   Wed Sep 18 20:18:40 2024 +0800

    chore: Bump opendal to 0.50 (apache#634)

commit cde35ab
Author: FANNG <[email protected]>
Date:   Fri Sep 13 10:01:16 2024 +0800

    feat: support projection pushdown for datafusion iceberg (apache#594)

    * support projection pushdown for datafusion iceberg
    * support projection pushdown for datafusion iceberg
    * fix ci
    * fix field id
    * remove depencences
    * remove depencences

commit eae9464
Author: Xuanwo <[email protected]>
Date:   Thu Sep 12 02:06:31 2024 +0800

    refactor(python): Expose transform as a submodule for pyiceberg_core (apache#628)

commit 8a3de4e
Author: Christian <[email protected]>
Date:   Mon Sep 9 14:45:16 2024 +0200

    Feat: Normalize TableMetadata (apache#611)

    * Normalize Table Metadata
    * Improve readability & comments

commit e08c0e5
Author: Renjie Liu <[email protected]>
Date:   Mon Sep 9 11:57:22 2024 +0800

    fix: Correctly calculate highest_field_id in schema (apache#590)

commit f78c59b
Author: Jack <[email protected]>
Date:   Mon Sep 9 03:35:16 2024 +0100

    feat: add `client.region` (apache#623)

commit a5aba9a
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:36:05 2024 +0200

    feat: SortOrder methods should take schema ref if possible (apache#613)

    * SortOrder methods should take schema ref if possible
    * Fix test type
    * with_order_id should not take reference

commit 5812399
Author: Christian <[email protected]>
Date:   Sun Sep 8 18:18:41 2024 +0200

    feat: partition compatibility (apache#612)

    * Partition compatability
    * Partition compatability
    * Rename compatible_with -> is_compatible_with

commit ede4720
Author: Christian <[email protected]>
Date:   Sun Sep 8 16:49:39 2024 +0200

    fix: Less Panics for Snapshot timestamps (apache#614)

commit ced661f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Sun Sep 8 22:43:38 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.3 to 1.24.5 (apache#616)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.3 to 1.24.5.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.3...v1.24.5)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit cbbd086
Author: Xuanwo <[email protected]>
Date:   Sun Sep 8 10:29:31 2024 +0800

    feat: Add more fields in FileScanTask (apache#609)

    Signed-off-by: Xuanwo <[email protected]>

commit 620d58e
Author: Callum Ryan <[email protected]>
Date:   Thu Sep 5 03:44:55 2024 +0100

    feat: SQL Catalog - namespaces (apache#534)

    * feat: SQL Catalog - namespaces
      Signed-off-by: callum-ryan <[email protected]>
    * feat: use transaction for updates and creates
      Signed-off-by: callum-ryan <[email protected]>
    * fix: pull out query param builder to fn
      Signed-off-by: callum-ryan <[email protected]>
    * feat: add drop and tests
      Signed-off-by: callum-ryan <[email protected]>
    * fix: String to str, remove pub and optimise query builder
      Signed-off-by: callum-ryan <[email protected]>
    * fix: nested match, remove ok()
      Signed-off-by: callum-ryan <[email protected]>
    * fix: remove pub, add set, add comments
      Signed-off-by: callum-ryan <[email protected]>
    * fix: refactor list_namespaces slightly
      Signed-off-by: callum-ryan <[email protected]>
    * fix: add default properties to all new namespaces
      Signed-off-by: callum-ryan <[email protected]>
    * fix: remove check for nested namespace
      Signed-off-by: callum-ryan <[email protected]>
    * chore: add more comments to the CatalogConfig to explain bind styles
      Signed-off-by: callum-ryan <[email protected]>
    * fix: edit test for nested namespaces
      Signed-off-by: callum-ryan <[email protected]>

    ---------

    Signed-off-by: callum-ryan <[email protected]>

commit ae75f96
Author: Søren Dalby Larsen <[email protected]>
Date:   Tue Sep 3 13:46:48 2024 +0200

    chore: bump crate-ci/typos to 1.24.3 (apache#598)

commit 7aa8bdd
Author: Scott Donnelly <[email protected]>
Date:   Thu Aug 29 04:37:48 2024 +0100

    Table Scan: Add Row Group Skipping (apache#558)

    * feat(scan): add row group and page index row selection filtering
    * fix(row selection): off-by-one error
    * feat: remove row selection to defer to a second PR
    * feat: better min/max val conversion in RowGroupMetricsEvaluator
    * test(row_group_filtering): first three tests
    * test(row_group_filtering): next few tests
    * test: add more tests for RowGroupMetricsEvaluator
    * chore: refactor test assertions to silence clippy lints
    * refactor: consolidate parquet stat min/max parsing in one place

commit da08e8d
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Wed Aug 28 14:35:55 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.23.6 to 1.24.1 (apache#583)

commit ecbb4c3
Author: Sung Yun <[email protected]>
Date:   Mon Aug 26 23:57:01 2024 -0400

    Expose Transforms to Python Binding (apache#556)

    * bucket transform rust binding
    * format
    * poetry x maturin
    * ignore poetry.lock in license check
    * update bindings_python_ci to use makefile
    * newline
    * python-poetry/poetry#9135
    * use hatch instead of poetry
    * refactor
    * revert licenserc change
    * adopt review feedback
    * comments
    * unused dependency
    * adopt review comment
    * newline
    * I like this approach a lot better
    * more tests

commit 905ebd2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Mon Aug 26 20:49:07 2024 +0800

    chore(deps): Update typed-builder requirement from 0.19 to 0.20 (apache#582)

    ---
    updated-dependencies:
    - dependency-name: typed-builder
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f9c92b7
Author: FANNG <[email protected]>
Date:   Sun Aug 25 22:31:36 2024 +0800

    fix: Update sqlx from 0.8.0 to 0.8.1 (apache#584)

commit ba66665
Author: FANNG <[email protected]>
Date:   Sat Aug 24 12:35:36 2024 +0800

    fix: correct partition-id to field-id in UnboundPartitionField (apache#576)

    * correct partition-id to field id in PartitionSpec
    * correct partition-id to field id in PartitionSpec
    * correct partition-id to field id in PartitionSpec
    * xx
1 parent 72d797c commit 12d766f


49 files changed, 7211 additions and 618 deletions

.github/workflows/bindings_python_ci.yml

+1-1
@@ -80,4 +80,4 @@ jobs:
 set -e
 pip install hatch==1.12.0
 hatch run dev:pip install dist/pyiceberg_core-*.whl --force-reinstall
-hatch run dev:test
+hatch run dev:test

.github/workflows/ci_typos.yml

+1-1
@@ -42,4 +42,4 @@ jobs:
 steps:
   - uses: actions/checkout@v4
   - name: Check typos
-    uses: crate-ci/typos@v1.23.6
+    uses: crate-ci/typos@v1.24.6

.gitignore

+2
@@ -24,3 +24,5 @@ dist/*
 **/venv
 *.so
 *.pyc
+*.whl
+*.tar.gz

Cargo.toml

+11-8
@@ -39,12 +39,12 @@ rust-version = "1.77.1"
 anyhow = "1.0.72"
 apache-avro = "0.17"
 array-init = "2"
-arrow-arith = { version = "52" }
-arrow-array = { version = "52" }
-arrow-ord = { version = "52" }
-arrow-schema = { version = "52" }
-arrow-select = { version = "52" }
-arrow-string = { version = "52" }
+arrow-arith = { version = "53" }
+arrow-array = { version = "53" }
+arrow-ord = { version = "53" }
+arrow-schema = { version = "53" }
+arrow-select = { version = "53" }
+arrow-string = { version = "53" }
 async-stream = "0.3.5"
 async-trait = "0.1"
 async-std = "1.12"
@@ -64,17 +64,20 @@ iceberg = { version = "0.3.0", path = "./crates/iceberg" }
 iceberg-catalog-rest = { version = "0.3.0", path = "./crates/catalog/rest" }
 iceberg-catalog-hms = { version = "0.3.0", path = "./crates/catalog/hms" }
 iceberg-catalog-memory = { version = "0.3.0", path = "./crates/catalog/memory" }
+iceberg-datafusion = { version = "0.3.0", path = "./crates/integrations/datafusion" }
 itertools = "0.13"
 log = "0.4"
 mockito = "1"
 murmur3 = "0.5.2"
 once_cell = "1"
 opendal = { git = "https://github.com/twuebi/opendal.git", rev = "a9e3d88e97" }
 ordered-float = "4"
-parquet = "52"
+parquet = "53"
+paste = "1"
 pilota = "0.11.2"
 pretty_assertions = "1.4"
 port_scanner = "0.1.5"
+rand = "0.8"
 regex = "1.10.5"
 reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
 rust_decimal = "1.31"
@@ -87,7 +90,7 @@ serde_with = "3.4"
 strum = "0.26.3"
 tempfile = "3.8"
 tokio = { version = "1", default-features = false }
-typed-builder = "0.19"
+typed-builder = "0.20"
 url = "2"
 urlencoding = "2"
 uuid = { version = "1.6.1", features = ["v7"] }

bindings/python/Cargo.toml

+2-1
@@ -32,4 +32,5 @@ crate-type = ["cdylib"]

 [dependencies]
 iceberg = { path = "../../crates/iceberg" }
-pyo3 = { version = "0.22", features = ["extension-module"] }
+pyo3 = { version = "0.22.3", features = ["extension-module"] }
+arrow = { version = "53", features = ["pyarrow"] }

bindings/python/pyproject.toml

+1
@@ -43,6 +43,7 @@ ignore = ["F403", "F405"]
 dependencies = [
     "maturin>=1.0,<2.0",
     "pytest>=8.3.2",
+    "pyarrow>=17.0.0",
 ]

 [tool.hatch.envs.dev.scripts]

bindings/python/src/error.rs

+24
@@ -0,0 +1,24 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use pyo3::exceptions::PyValueError;
+use pyo3::PyErr;
+
+/// Convert an iceberg error to a python error
+pub fn to_py_err(err: iceberg::Error) -> PyErr {
+    PyValueError::new_err(err.to_string())
+}

bindings/python/src/lib.rs

+4-8
@@ -15,17 +15,13 @@
 // specific language governing permissions and limitations
 // under the License.

-use iceberg::io::FileIOBuilder;
 use pyo3::prelude::*;

-#[pyfunction]
-fn hello_world() -> PyResult<String> {
-    let _ = FileIOBuilder::new_fs_io().build().unwrap();
-    Ok("Hello, world!".to_string())
-}
+mod error;
+mod transform;

 #[pymodule]
-fn pyiceberg_core_rust(m: &Bound<'_, PyModule>) -> PyResult<()> {
-    m.add_function(wrap_pyfunction!(hello_world, m)?)?;
+fn pyiceberg_core_rust(py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
+    transform::register_module(py, m)?;
     Ok(())
 }

bindings/python/src/transform.rs

+93
@@ -0,0 +1,93 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use arrow::array::{make_array, Array, ArrayData};
+use arrow::pyarrow::{FromPyArrow, ToPyArrow};
+use iceberg::spec::Transform;
+use iceberg::transform::create_transform_function;
+use pyo3::prelude::*;
+
+use crate::error::to_py_err;
+
+#[pyfunction]
+pub fn identity(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Identity)
+}
+
+#[pyfunction]
+pub fn void(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Void)
+}
+
+#[pyfunction]
+pub fn year(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Year)
+}
+
+#[pyfunction]
+pub fn month(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Month)
+}
+
+#[pyfunction]
+pub fn day(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Day)
+}
+
+#[pyfunction]
+pub fn hour(py: Python, array: PyObject) -> PyResult<PyObject> {
+    apply(py, array, Transform::Hour)
+}
+
+#[pyfunction]
+pub fn bucket(py: Python, array: PyObject, num_buckets: u32) -> PyResult<PyObject> {
+    apply(py, array, Transform::Bucket(num_buckets))
+}
+
+#[pyfunction]
+pub fn truncate(py: Python, array: PyObject, width: u32) -> PyResult<PyObject> {
+    apply(py, array, Transform::Truncate(width))
+}
+
+fn apply(py: Python, array: PyObject, transform: Transform) -> PyResult<PyObject> {
+    // import
+    let array = ArrayData::from_pyarrow_bound(array.bind(py))?;
+    let array = make_array(array);
+    let transform_function = create_transform_function(&transform).map_err(to_py_err)?;
+    let array = transform_function.transform(array).map_err(to_py_err)?;
+    // export
+    let array = array.into_data();
+    array.to_pyarrow(py)
+}
+
+pub fn register_module(py: Python<'_>, m: &Bound<'_, PyModule>) -> PyResult<()> {
+    let this = PyModule::new_bound(py, "transform")?;
+
+    this.add_function(wrap_pyfunction!(identity, &this)?)?;
+    this.add_function(wrap_pyfunction!(void, &this)?)?;
+    this.add_function(wrap_pyfunction!(year, &this)?)?;
+    this.add_function(wrap_pyfunction!(month, &this)?)?;
+    this.add_function(wrap_pyfunction!(day, &this)?)?;
+    this.add_function(wrap_pyfunction!(hour, &this)?)?;
+    this.add_function(wrap_pyfunction!(bucket, &this)?)?;
+    this.add_function(wrap_pyfunction!(truncate, &this)?)?;
+
+    m.add_submodule(&this)?;
+    py.import_bound("sys")?
+        .getattr("modules")?
+        .set_item("pyiceberg_core.transform", this)
+}
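
Note on `register_module`: besides `add_submodule`, it stores the new module in `sys.modules` under the dotted name `pyiceberg_core.transform`. The likely reason is that attaching a module object as an attribute of its parent does not, by itself, make the dotted name importable. The following is a minimal pure-Python sketch of that behaviour, not code from this commit; `demo_core`, `build_package` and `can_import_submodule` are hypothetical names chosen for illustration.

```python
import importlib
import sys
import types


def build_package(register_in_sys_modules: bool) -> types.ModuleType:
    """Build a fake parent module with a `transform` attribute.

    `demo_core` stands in for an extension module; it is not a real
    package (it has no `__path__`), just like a module created in native code.
    """
    parent = types.ModuleType("demo_core")
    child = types.ModuleType("demo_core.transform")
    child.identity = lambda x: x

    # Equivalent in spirit to `m.add_submodule(&this)`: attribute access works.
    parent.transform = child

    sys.modules["demo_core"] = parent
    sys.modules.pop("demo_core.transform", None)
    if register_in_sys_modules:
        # Equivalent in spirit to the `sys.modules` `set_item` call.
        sys.modules["demo_core.transform"] = child
    return parent


def can_import_submodule() -> bool:
    """Return True if the dotted name resolves through the import system."""
    try:
        importlib.import_module("demo_core.transform")
        return True
    except ImportError:
        return False


if __name__ == "__main__":
    build_package(register_in_sys_modules=False)
    print("without sys.modules entry:", can_import_submodule())
    build_package(register_in_sys_modules=True)
    print("with sys.modules entry:", can_import_submodule())
```

Without the `sys.modules` entry, attribute access such as `demo_core.transform.identity` still works, but the import system cannot resolve `demo_core.transform` because the parent has no `__path__` to search.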

bindings/python/tests/test_basic.py

-22
This file was deleted.
+99
@@ -0,0 +1,99 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from datetime import date, datetime
+
+import pyarrow as pa
+import pytest
+from pyiceberg_core import transform
+
+
+def test_identity_transform():
+    arr = pa.array([1, 2])
+    result = transform.identity(arr)
+    assert result == arr
+
+
+def test_bucket_transform():
+    arr = pa.array([1, 2])
+    result = transform.bucket(arr, 10)
+    expected = pa.array([6, 2], type=pa.int32())
+    assert result == expected
+
+
+def test_bucket_transform_fails_for_list_type_input():
+    arr = pa.array([[1, 2], [3, 4]])
+    with pytest.raises(
+        ValueError,
+        match=r"FeatureUnsupported => Unsupported data type for bucket transform",
+    ):
+        transform.bucket(arr, 10)
+
+
+def test_bucket_chunked_array():
+    chunked = pa.chunked_array([pa.array([1, 2]), pa.array([3, 4])])
+    result_chunks = []
+    for arr in chunked.iterchunks():
+        result_chunks.append(transform.bucket(arr, 10))
+
+    expected = pa.chunked_array(
+        [pa.array([6, 2], type=pa.int32()), pa.array([5, 0], type=pa.int32())]
+    )
+    assert pa.chunked_array(result_chunks).equals(expected)
+
+
+def test_year_transform():
+    arr = pa.array([date(1970, 1, 1), date(2000, 1, 1)])
+    result = transform.year(arr)
+    expected = pa.array([0, 30], type=pa.int32())
+    assert result == expected
+
+
+def test_month_transform():
+    arr = pa.array([date(1970, 1, 1), date(2000, 4, 1)])
+    result = transform.month(arr)
+    expected = pa.array([0, 30 * 12 + 3], type=pa.int32())
+    assert result == expected
+
+
+def test_day_transform():
+    arr = pa.array([date(1970, 1, 1), date(2000, 4, 1)])
+    result = transform.day(arr)
+    expected = pa.array([0, 11048], type=pa.int32())
+    assert result == expected
+
+
+def test_hour_transform():
+    arr = pa.array([datetime(1970, 1, 1, 19, 1, 23), datetime(2000, 3, 1, 12, 1, 23)])
+    result = transform.hour(arr)
+    expected = pa.array([19, 264420], type=pa.int32())
+    assert result == expected
+
+
+def test_truncate_transform():
+    arr = pa.array(["this is a long string", "hi my name is sung"])
+    result = transform.truncate(arr, 5)
+    expected = pa.array(["this ", "hi my"])
+    assert result == expected
+
+
+def test_identity_transform_with_direct_import():
+    from pyiceberg_core.transform import identity
+
+    arr = pa.array([1, 2])
+    result = identity(arr)
+    assert result == arr
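
Note on the expected values in these tests: they appear to follow the transform definitions in the Iceberg table spec (bucket as a 32-bit Murmur3 hash of the value, masked to non-negative and taken modulo the bucket count; year, month, day and hour as offsets from the 1970-01-01 epoch; truncate as a prefix of the given width). The sketch below recomputes them in pure Python as a cross-check. It is an independent illustration under those assumptions, not the `iceberg-rust` implementation, and the helper names are made up for this note. It assumes integers are hashed as 8-byte little-endian values and handles only naive `datetime` inputs.

```python
import struct
from datetime import date, datetime, timedelta

MASK32 = 0xFFFFFFFF
EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1)


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit unsigned integer left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit MurmurHash3 (x86 variant), returned as an unsigned integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n_blocks = len(data) // 4
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i : 4 * i + 4], "little")
        k = _rotl32((k * c1) & MASK32, 15)
        k = (k * c2) & MASK32
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[4 * n_blocks :]
    if tail:
        k = int.from_bytes(tail, "little")
        k = _rotl32((k * c1) & MASK32, 15)
        h ^= (k * c2) & MASK32
    # Finalization: mix in the length, then avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def bucket(value: int, num_buckets: int) -> int:
    """Bucket an integer: hash its 8-byte little-endian form, mask the sign bit, take the modulus."""
    hashed = murmur3_x86_32(struct.pack("<q", value))
    return (hashed & 0x7FFFFFFF) % num_buckets


def year(d: date) -> int:
    return d.year - EPOCH_DATE.year


def month(d: date) -> int:
    return (d.year - EPOCH_DATE.year) * 12 + (d.month - 1)


def day(d: date) -> int:
    return (d - EPOCH_DATE).days


def hour(ts: datetime) -> int:
    # Floor division keeps the result an integer number of whole hours.
    return (ts - EPOCH_DATETIME) // timedelta(hours=1)


def truncate(text: str, width: int) -> str:
    return text[:width]


if __name__ == "__main__":
    print([bucket(v, 10) for v in (1, 2, 3, 4)])
    print(year(date(2000, 1, 1)), month(date(2000, 4, 1)), day(date(2000, 4, 1)))
    print(hour(datetime(2000, 3, 1, 12, 1, 23)), repr(truncate("hi my name is sung", 5)))
```

For the month case this matches the test's own `30 * 12 + 3`: thirty full years plus the three months that precede April.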

crates/catalog/memory/src/catalog.rs

+1-1
@@ -371,7 +371,7 @@ mod tests {
         let expected_sorted_order = SortOrder::builder()
             .with_order_id(0)
             .with_fields(vec![])
-            .build(expected_schema.clone())
+            .build(expected_schema)
             .unwrap();

         assert_eq!(

crates/catalog/rest/tests/rest_catalog_test.rs

+2-6
@@ -293,12 +293,8 @@ async fn test_create_table() {
     assert_eq!(table.metadata().format_version(), FormatVersion::V2);
     assert!(table.metadata().current_snapshot().is_none());
     assert!(table.metadata().history().is_empty());
-    assert!(table.metadata().default_sort_order().unwrap().is_unsorted());
-    assert!(table
-        .metadata()
-        .default_partition_spec()
-        .unwrap()
-        .is_unpartitioned());
+    assert!(table.metadata().default_sort_order().is_unsorted());
+    assert!(table.metadata().default_partition_spec().is_unpartitioned());
 }

 #[tokio::test]

crates/catalog/sql/Cargo.toml

+1-1
@@ -31,7 +31,7 @@ keywords = ["iceberg", "sql", "catalog"]
 [dependencies]
 async-trait = { workspace = true }
 iceberg = { workspace = true }
-sqlx = { version = "0.8.0", features = ["any"], default-features = false }
+sqlx = { version = "0.8.1", features = ["any"], default-features = false }
 typed-builder = { workspace = true }

 [dev-dependencies]
