Commit 7b44d31

Merge Builder
commit 2c9d7e4
Author: Christian Thiel <[email protected]>
Date: Wed Oct 2 07:51:50 2024 +0200

    New builder

commit fea1817
Author: Christian Thiel <[email protected]>
Date: Tue Oct 1 12:56:10 2024 +0200

    Merge branch 'feat/safe-partition-spec'

commit 1f49c95
Author: Christian Thiel <[email protected]>
Date: Tue Oct 1 12:55:02 2024 +0200

    Merge branch 'feat/schema-reassign-field-ids'

commit cda4a0c
Author: congyi wang <[email protected]>
Date: Mon Sep 30 11:07:36 2024 +0800

    chore: fix typo in FileIO Schemes (apache#653)

    * fix typo
    * fix typo

commit af9609d
Author: Scott Donnelly <[email protected]>
Date: Mon Sep 30 04:06:14 2024 +0100

    fix: page index evaluator min/max args inverted (apache#648)

    * fix: page index evaluator min/max args inverted
    * style: fix clippy lint in test

commit a6a3fd7
Author: Alon Agmon <[email protected]>
Date: Sat Sep 28 10:10:08 2024 +0300

    test (datafusion): add test for table provider creation (apache#651)

    * add test for table provider creation
    * fix formatting
    * fixing yet another formatting issue
    * testing schema using data fusion

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit 87483b4
Author: Alon Agmon <[email protected]>
Date: Fri Sep 27 04:40:08 2024 +0300

    making table provider pub (apache#650)

    Co-authored-by: Alon Agmon <[email protected]>

commit 984c91e
Author: ZENOTME <[email protected]>
Date: Thu Sep 26 17:56:02 2024 +0800

    avoid to create memory schema operator every time (apache#635)

    Co-authored-by: ZENOTME <[email protected]>

commit 4171275
Author: Matheus Alcantara <[email protected]>
Date: Wed Sep 25 08:28:42 2024 -0300

    scan: change ErrorKind when table dont have spanshots (apache#608)

commit ab51355
Author: xxchan <[email protected]>
Date: Tue Sep 24 21:25:45 2024 +0800

    fix: compile error due to merge stale PR (apache#646)

    Signed-off-by: xxchan <[email protected]>

commit 420b4e2
Author: Scott Donnelly <[email protected]>
Date: Tue Sep 24 08:20:23 2024 +0100

    Table Scan: Add Row Selection Filtering (apache#565)

    * feat(scan): add row selection capability via PageIndexEvaluator
    * test(row-selection): add first few row selection tests
    * feat(scan): add more tests, fix bug where min/max args swapped
    * fix: ad test and fix for logic bug in PageIndexEvaluator in-clause handler
    * feat: changes suggested from PR review

commit b3709ba
Author: Christian <[email protected]>
Date: Tue Sep 24 04:47:04 2024 +0200

    feat: Add NamespaceIdent.parent() (apache#641)

    * Add NamespaceIdent.parent()
    * Use split_last

commit 1533c43
Author: Alon Agmon <[email protected]>
Date: Mon Sep 23 13:39:46 2024 +0300

    feat (datafusion integration): convert datafusion expr filters to Iceberg Predicate (apache#588)

    * adding main function and tests
    * adding tests, removing integration test for now
    * fixing typos and lints
    * fixing typing issue
    * - added support in schmema to convert Date32 to correct arrow type
      - refactored scan to use new predicate converter as visitor and seperated it to a new mod
      - added support for simple predicates with column cast expressions
      - added testing, mostly around date functions
    * fixing format and lic
    * reducing number of tests (17 -> 7)
    * fix formats
    * fix naming
    * refactoring to use TreeNodeVisitor
    * fixing fmt
    * small refactor
    * adding swapped op and fixing CR comments

    ---------

    Co-authored-by: Alon Agmon <[email protected]>

commit e967deb
Author: xxchan <[email protected]>
Date: Mon Sep 23 18:34:59 2024 +0800

    feat: expose remove_all in FileIO (apache#643)

    Signed-off-by: xxchan <[email protected]>

commit d03c4f8
Author: Scott Donnelly <[email protected]>
Date: Mon Sep 23 08:28:52 2024 +0100

    Migrate to arrow-* v53 (apache#626)

    * chore: migrate to arrow-* v53
    * chore: update datafusion to 42
    * test: fix incorrect test assertion
    * chore: update python bindings to arrow 53

commit 88e5e4a
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Sep 23 15:26:18 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.5 to 1.24.6 (apache#640)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.5 to 1.24.6.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.5...v1.24.6)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit c354983
Author: xxchan <[email protected]>
Date: Mon Sep 23 14:50:18 2024 +0800

    doc: improve FileIO doc (apache#642)

    Signed-off-by: xxchan <[email protected]>

commit 12e12e2
Author: xxchan <[email protected]>
Date: Fri Sep 20 19:59:55 2024 +0800

    feat: expose arrow type <-> iceberg type (apache#637)

    * feat: expose arrow type <-> iceberg type

      Previously we only exposed the schema conversion.

      Signed-off-by: xxchan <[email protected]>
    * add tests

      Signed-off-by: xxchan <[email protected]>

    ---------

    Signed-off-by: xxchan <[email protected]>

commit 3b27c9e
Author: xxchan <[email protected]>
Date: Fri Sep 20 18:32:31 2024 +0800

    feat: add Sync to TransformFunction (apache#638)

    Signed-off-by: xxchan <[email protected]>

commit 34cb81c
Author: Xuanwo <[email protected]>
Date: Wed Sep 18 20:18:40 2024 +0800

    chore: Bump opendal to 0.50 (apache#634)

commit cde35ab
Author: FANNG <[email protected]>
Date: Fri Sep 13 10:01:16 2024 +0800

    feat: support projection pushdown for datafusion iceberg (apache#594)

    * support projection pushdown for datafusion iceberg
    * support projection pushdown for datafusion iceberg
    * fix ci
    * fix field id
    * remove depencences
    * remove depencences

commit eae9464
Author: Xuanwo <[email protected]>
Date: Thu Sep 12 02:06:31 2024 +0800

    refactor(python): Expose transform as a submodule for pyiceberg_core (apache#628)

commit 8a3de4e
Author: Christian <[email protected]>
Date: Mon Sep 9 14:45:16 2024 +0200

    Feat: Normalize TableMetadata (apache#611)

    * Normalize Table Metadata
    * Improve readability & comments

commit e08c0e5
Author: Renjie Liu <[email protected]>
Date: Mon Sep 9 11:57:22 2024 +0800

    fix: Correctly calculate highest_field_id in schema (apache#590)

commit f78c59b
Author: Jack <[email protected]>
Date: Mon Sep 9 03:35:16 2024 +0100

    feat: add `client.region` (apache#623)

commit a5aba9a
Author: Christian <[email protected]>
Date: Sun Sep 8 18:36:05 2024 +0200

    feat: SortOrder methods should take schema ref if possible (apache#613)

    * SortOrder methods should take schema ref if possible
    * Fix test type
    * with_order_id should not take reference

commit 5812399
Author: Christian <[email protected]>
Date: Sun Sep 8 18:18:41 2024 +0200

    feat: partition compatibility (apache#612)

    * Partition compatability
    * Partition compatability
    * Rename compatible_with -> is_compatible_with

commit ede4720
Author: Christian <[email protected]>
Date: Sun Sep 8 16:49:39 2024 +0200

    fix: Less Panics for Snapshot timestamps (apache#614)

commit ced661f
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Sun Sep 8 22:43:38 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.24.3 to 1.24.5 (apache#616)

    Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.24.3 to 1.24.5.
    - [Release notes](https://github.com/crate-ci/typos/releases)
    - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
    - [Commits](crate-ci/typos@v1.24.3...v1.24.5)

    ---
    updated-dependencies:
    - dependency-name: crate-ci/typos
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit cbbd086
Author: Xuanwo <[email protected]>
Date: Sun Sep 8 10:29:31 2024 +0800

    feat: Add more fields in FileScanTask (apache#609)

    Signed-off-by: Xuanwo <[email protected]>

commit 620d58e
Author: Callum Ryan <[email protected]>
Date: Thu Sep 5 03:44:55 2024 +0100

    feat: SQL Catalog - namespaces (apache#534)

    * feat: SQL Catalog - namespaces

      Signed-off-by: callum-ryan <[email protected]>
    * feat: use transaction for updates and creates

      Signed-off-by: callum-ryan <[email protected]>
    * fix: pull out query param builder to fn

      Signed-off-by: callum-ryan <[email protected]>
    * feat: add drop and tests

      Signed-off-by: callum-ryan <[email protected]>
    * fix: String to str, remove pub and optimise query builder

      Signed-off-by: callum-ryan <[email protected]>
    * fix: nested match, remove ok()

      Signed-off-by: callum-ryan <[email protected]>
    * fix: remove pub, add set, add comments

      Signed-off-by: callum-ryan <[email protected]>
    * fix: refactor list_namespaces slightly

      Signed-off-by: callum-ryan <[email protected]>
    * fix: add default properties to all new namespaces

      Signed-off-by: callum-ryan <[email protected]>
    * fix: remove check for nested namespace

      Signed-off-by: callum-ryan <[email protected]>
    * chore: add more comments to the CatalogConfig to explain bind styles

      Signed-off-by: callum-ryan <[email protected]>
    * fix: edit test for nested namespaces

      Signed-off-by: callum-ryan <[email protected]>

    ---------

    Signed-off-by: callum-ryan <[email protected]>

commit ae75f96
Author: Søren Dalby Larsen <[email protected]>
Date: Tue Sep 3 13:46:48 2024 +0200

    chore: bump crate-ci/typos to 1.24.3 (apache#598)

commit 7aa8bdd
Author: Scott Donnelly <[email protected]>
Date: Thu Aug 29 04:37:48 2024 +0100

    Table Scan: Add Row Group Skipping (apache#558)

    * feat(scan): add row group and page index row selection filtering
    * fix(row selection): off-by-one error
    * feat: remove row selection to defer to a second PR
    * feat: better min/max val conversion in RowGroupMetricsEvaluator
    * test(row_group_filtering): first three tests
    * test(row_group_filtering): next few tests
    * test: add more tests for RowGroupMetricsEvaluator
    * chore: refactor test assertions to silence clippy lints
    * refactor: consolidate parquet stat min/max parsing in one place

commit da08e8d
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Wed Aug 28 14:35:55 2024 +0800

    chore(deps): Bump crate-ci/typos from 1.23.6 to 1.24.1 (apache#583)

commit ecbb4c3
Author: Sung Yun <[email protected]>
Date: Mon Aug 26 23:57:01 2024 -0400

    Expose Transforms to Python Binding (apache#556)

    * bucket transform rust binding
    * format
    * poetry x maturin
    * ignore poetry.lock in license check
    * update bindings_python_ci to use makefile
    * newline
    * python-poetry/poetry#9135
    * use hatch instead of poetry
    * refactor
    * revert licenserc change
    * adopt review feedback
    * comments
    * unused dependency
    * adopt review comment
    * newline
    * I like this approach a lot better
    * more tests

commit 905ebd2
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Aug 26 20:49:07 2024 +0800

    chore(deps): Update typed-builder requirement from 0.19 to 0.20 (apache#582)

    ---
    updated-dependencies:
    - dependency-name: typed-builder
      dependency-type: direct:production
    ...

    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

commit f9c92b7
Author: FANNG <[email protected]>
Date: Sun Aug 25 22:31:36 2024 +0800

    fix: Update sqlx from 0.8.0 to 0.8.1 (apache#584)

commit ba66665
Author: FANNG <[email protected]>
Date: Sat Aug 24 12:35:36 2024 +0800

    fix: correct partition-id to field-id in UnboundPartitionField (apache#576)

    * correct partition-id to field id in PartitionSpec
    * correct partition-id to field id in PartitionSpec
    * correct partition-id to field id in PartitionSpec
    * xx
1 parent 12d766f commit 7b44d31

21 files changed: +3496 -745 lines changed

crates/catalog/glue/src/catalog.rs

+3 -1

@@ -355,7 +355,9 @@ impl Catalog for GlueCatalog {
             }
         };

-        let metadata = TableMetadataBuilder::from_table_creation(creation)?.build()?;
+        let metadata = TableMetadataBuilder::from_table_creation(creation)?
+            .build()?
+            .metadata;
         let metadata_location = create_metadata_location(&location, 0)?;

         self.file_io
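
The same three-line change recurs in every catalog crate below: the call sites now read a `.metadata` field from whatever `build()` returns, which suggests that `build()` yields a wrapper value and no longer the bare metadata. A minimal sketch of that shape follows. The names `BuildResult` and `changes`, and the single `location` field, are hypothetical simplifications for illustration and are not taken from the iceberg crate.

```rust
// Hypothetical, simplified stand-ins; not the iceberg crate's real types.
#[derive(Debug, Clone, PartialEq)]
pub struct TableMetadata {
    pub location: String,
}

/// Assumed shape of the value returned by `build()`: the finished metadata
/// plus extra bookkeeping about how it was produced.
#[derive(Debug)]
pub struct BuildResult {
    pub metadata: TableMetadata,
    /// Hypothetical log of the changes the builder applied.
    pub changes: Vec<String>,
}

#[derive(Debug, Default)]
pub struct TableMetadataBuilder {
    location: Option<String>,
    changes: Vec<String>,
}

impl TableMetadataBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records the table location and notes the change.
    pub fn set_location(mut self, location: impl Into<String>) -> Self {
        let location = location.into();
        self.changes.push(format!("set-location: {location}"));
        self.location = Some(location);
        self
    }

    /// Fails if no location was set; otherwise returns metadata and changes.
    pub fn build(self) -> Result<BuildResult, String> {
        let location = self
            .location
            .ok_or_else(|| "location is required".to_string())?;
        Ok(BuildResult {
            metadata: TableMetadata { location },
            changes: self.changes,
        })
    }
}

/// Callers that only need the metadata project the field out, mirroring
/// the `.build()?.metadata` pattern in the diff.
pub fn create_metadata(location: &str) -> Result<TableMetadata, String> {
    let metadata = TableMetadataBuilder::new()
        .set_location(location)
        .build()?
        .metadata;
    Ok(metadata)
}
```

Returning a struct keeps existing callers a one-field projection away from the old behaviour while leaving room for the builder to report additional results.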

crates/catalog/glue/src/schema.rs

+3 -1

@@ -198,7 +198,9 @@ mod tests {
             .location("my_location".to_string())
             .schema(schema)
             .build();
-        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
+        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
+            .build()?
+            .metadata;

         Ok(metadata)
     }

crates/catalog/glue/src/utils.rs

+3 -1

@@ -299,7 +299,9 @@ mod tests {
             .location("my_location".to_string())
             .schema(schema)
             .build();
-        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
+        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
+            .build()?
+            .metadata;

         Ok(metadata)
     }

crates/catalog/hms/src/catalog.rs

+3 -1

@@ -346,7 +346,9 @@ impl Catalog for HmsCatalog {
             }
         };

-        let metadata = TableMetadataBuilder::from_table_creation(creation)?.build()?;
+        let metadata = TableMetadataBuilder::from_table_creation(creation)?
+            .build()?
+            .metadata;
         let metadata_location = create_metadata_location(&location, 0)?;

         self.file_io

crates/catalog/memory/src/catalog.rs

+5 -3

@@ -194,7 +194,9 @@ impl Catalog for MemoryCatalog {
             }
         };

-        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?.build()?;
+        let metadata = TableMetadataBuilder::from_table_creation(table_creation)?
+            .build()?
+            .metadata;
         let metadata_location = format!(
             "{}/metadata/{}-{}.metadata.json",
             &location,
@@ -355,7 +357,7 @@ mod tests {

         assert_eq!(metadata.current_schema().as_ref(), expected_schema);

-        let expected_partition_spec = PartitionSpec::builder(expected_schema)
+        let expected_partition_spec = PartitionSpec::builder((*expected_schema).clone())
             .with_spec_id(0)
             .build()
             .unwrap();
@@ -365,7 +367,7 @@ mod tests {
                 .partition_specs_iter()
                 .map(|p| p.as_ref())
                 .collect_vec(),
-            vec![&expected_partition_spec]
+            vec![&expected_partition_spec.into_schemaless()]
         );

         let expected_sorted_order = SortOrder::builder()

crates/iceberg/src/arrow/schema.rs

+10 -10

@@ -826,8 +826,8 @@ mod tests {

     fn arrow_schema_for_arrow_schema_to_schema_test() -> ArrowSchema {
         let fields = Fields::from(vec![
-            simple_field("key", DataType::Int32, false, "17"),
-            simple_field("value", DataType::Utf8, true, "18"),
+            simple_field("key", DataType::Int32, false, "28"),
+            simple_field("value", DataType::Utf8, true, "29"),
         ]);

         let r#struct = DataType::Struct(fields);
@@ -1057,9 +1057,9 @@ mod tests {
                 "required": true,
                 "type": {
                     "type": "map",
-                    "key-id": 17,
+                    "key-id": 28,
                     "key": "int",
-                    "value-id": 18,
+                    "value-id": 29,
                     "value-required": false,
                     "value": "string"
                 }
@@ -1110,8 +1110,8 @@ mod tests {

     fn arrow_schema_for_schema_to_arrow_schema_test() -> ArrowSchema {
         let fields = Fields::from(vec![
-            simple_field("key", DataType::Int32, false, "17"),
-            simple_field("value", DataType::Utf8, true, "18"),
+            simple_field("key", DataType::Int32, false, "28"),
+            simple_field("value", DataType::Utf8, true, "29"),
         ]);

         let r#struct = DataType::Struct(fields);
@@ -1200,7 +1200,7 @@ mod tests {
             ),
             simple_field("map", map, false, "16"),
             simple_field("struct", r#struct, false, "17"),
-            simple_field("uuid", DataType::FixedSizeBinary(16), false, "26"),
+            simple_field("uuid", DataType::FixedSizeBinary(16), false, "30"),
         ])
     }

@@ -1344,9 +1344,9 @@ mod tests {
                 "required": true,
                 "type": {
                     "type": "map",
-                    "key-id": 17,
+                    "key-id": 28,
                     "key": "int",
-                    "value-id": 18,
+                    "value-id": 29,
                     "value-required": false,
                     "value": "string"
                 }
@@ -1380,7 +1380,7 @@ mod tests {
                 }
             },
             {
-                "id":26,
+                "id":30,
                 "name":"uuid",
                 "required":true,
                 "type":"uuid"

crates/iceberg/src/catalog/mod.rs

+48 -5

@@ -445,8 +445,46 @@ impl TableUpdate {
     /// Applies the update to the table metadata builder.
     pub fn apply(self, builder: TableMetadataBuilder) -> Result<TableMetadataBuilder> {
         match self {
-            TableUpdate::AssignUuid { uuid } => builder.assign_uuid(uuid),
-            _ => unimplemented!(),
+            TableUpdate::AssignUuid { uuid } => Ok(builder.assign_uuid(uuid)),
+            TableUpdate::AddSchema {
+                schema,
+                last_column_id,
+            } => {
+                if let Some(last_column_id) = last_column_id {
+                    if builder.last_column_id() < last_column_id {
+                        return Err(Error::new(
+                            ErrorKind::DataInvalid,
+                            format!(
+                                "Invalid last column ID: {last_column_id} < {} (previous last column ID)",
+                                builder.last_column_id()
+                            ),
+                        ));
+                    }
+                };
+                Ok(builder.add_schema(schema))
+            }
+            TableUpdate::SetCurrentSchema { schema_id } => builder.set_current_schema(schema_id),
+            TableUpdate::AddSpec { spec } => builder.add_partition_spec(spec),
+            TableUpdate::SetDefaultSpec { spec_id } => builder.set_default_partition_spec(spec_id),
+            TableUpdate::AddSortOrder { sort_order } => builder.add_sort_order(sort_order),
+            TableUpdate::SetDefaultSortOrder { sort_order_id } => {
+                builder.set_default_sort_order(sort_order_id)
+            }
+            TableUpdate::AddSnapshot { snapshot } => builder.add_snapshot(snapshot),
+            TableUpdate::SetSnapshotRef {
+                ref_name,
+                reference,
+            } => builder.set_ref(&ref_name, reference),
+            TableUpdate::RemoveSnapshots { snapshot_ids } => {
+                Ok(builder.remove_snapshots(&snapshot_ids))
+            }
+            TableUpdate::RemoveSnapshotRef { ref_name } => Ok(builder.remove_ref(&ref_name)),
+            TableUpdate::SetLocation { location } => Ok(builder.set_location(location)),
+            TableUpdate::SetProperties { updates } => builder.set_properties(updates),
+            TableUpdate::RemoveProperties { removals } => Ok(builder.remove_properties(&removals)),
+            TableUpdate::UpgradeFormatVersion { format_version } => {
+                builder.upgrade_format_version(format_version)
+            }
         }
     }
 }
@@ -1183,16 +1221,21 @@ mod tests {
         let table_metadata = TableMetadataBuilder::from_table_creation(table_creation)
             .unwrap()
             .build()
-            .unwrap();
-        let table_metadata_builder = TableMetadataBuilder::new(table_metadata);
+            .unwrap()
+            .metadata;
+        let table_metadata_builder = TableMetadataBuilder::new_from_metadata(
+            table_metadata,
+            "s3://db/table/metadata/metadata1.gz.json",
+        );

         let uuid = uuid::Uuid::new_v4();
         let update = TableUpdate::AssignUuid { uuid };
         let updated_metadata = update
             .apply(table_metadata_builder)
             .unwrap()
             .build()
-            .unwrap();
+            .unwrap()
+            .metadata;
         assert_eq!(updated_metadata.uuid(), uuid);
     }
 }
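
The new `apply` replaces the `unimplemented!()` fallback with an exhaustive match in which infallible builder methods are wrapped in `Ok(...)` and fallible ones return their `Result` directly. The pattern can be sketched in isolation as below. `Builder`, `Update`, the integer schema IDs, and the `String` error type are hypothetical simplifications, not the crate's API. One deliberate difference: the sketch rejects a proposed last column ID that is lower than the builder's current one, which is what the error message in the diff describes; the comparison in the diff itself is written the other way round.

```rust
// Hypothetical, simplified stand-ins for illustration; these are not the
// iceberg crate's real types or signatures.
#[derive(Debug, Default)]
pub struct Builder {
    last_column_id: i32,
    schema_ids: Vec<i32>,
    current_schema_id: Option<i32>,
    location: String,
}

impl Builder {
    pub fn last_column_id(&self) -> i32 {
        self.last_column_id
    }

    pub fn current_schema_id(&self) -> Option<i32> {
        self.current_schema_id
    }

    pub fn location(&self) -> &str {
        &self.location
    }

    /// Infallible: registers a schema and raises the column-ID high-water mark.
    pub fn add_schema(mut self, schema_id: i32, highest_field_id: i32) -> Self {
        self.schema_ids.push(schema_id);
        self.last_column_id = self.last_column_id.max(highest_field_id);
        self
    }

    /// Infallible: any string is accepted as a location in this sketch.
    pub fn set_location(mut self, location: String) -> Self {
        self.location = location;
        self
    }

    /// Fallible: the schema must have been added before it can become current.
    pub fn set_current_schema(mut self, schema_id: i32) -> Result<Self, String> {
        if !self.schema_ids.contains(&schema_id) {
            return Err(format!("unknown schema id {schema_id}"));
        }
        self.current_schema_id = Some(schema_id);
        Ok(self)
    }
}

pub enum Update {
    AddSchema {
        schema_id: i32,
        highest_field_id: i32,
        last_column_id: Option<i32>,
    },
    SetCurrentSchema {
        schema_id: i32,
    },
    SetLocation {
        location: String,
    },
}

impl Update {
    /// Consumes the update and the builder, returning the updated builder.
    pub fn apply(self, builder: Builder) -> Result<Builder, String> {
        match self {
            Update::AddSchema {
                schema_id,
                highest_field_id,
                last_column_id,
            } => {
                if let Some(last_column_id) = last_column_id {
                    // Assumption: a proposed ID below the current one is stale.
                    if last_column_id < builder.last_column_id() {
                        return Err(format!(
                            "Invalid last column ID: {last_column_id} < {} (previous last column ID)",
                            builder.last_column_id()
                        ));
                    }
                }
                Ok(builder.add_schema(schema_id, highest_field_id))
            }
            Update::SetCurrentSchema { schema_id } => builder.set_current_schema(schema_id),
            Update::SetLocation { location } => Ok(builder.set_location(location)),
        }
    }
}
```

Because the match has no wildcard arm, adding a variant to the enum becomes a compile error until `apply` handles it, instead of a runtime panic.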
