Background:
There is demand for real-time data widening, that is, enriching a stream with columns from a dimension table. Hive currently supports lookup joins, but that approach is not production-ready: the Hive table must be loaded into memory, so large tables are prone to OOM errors. In addition, neither Iceberg nor Hudi supports lookup joins.
Proposal summary:
Flink provides an event-time temporal join. The right-hand table is used as a versioned table, and its data can be kept in RocksDB rather than in memory.
-- Create the left table, using LOCALTIMESTAMP as the event-time attribute.
create table source (
...,
arcitc_process_time AS LOCALTIMESTAMP,
WATERMARK FOR arcitc_process_time AS arcitc_process_time
) with (...);
-- Create the right table, with the dimension-table option enabled.
create table arctic_dim (...) with ('connector'='arctic', 'dim-table.enabled'='true');
-- Event-time temporal join: each source row is matched against arctic_dim as of the row's timestamp.
select * from source as O left join arctic_dim FOR SYSTEM_TIME AS OF O.arcitc_process_time as P on O.id = P.id;
When 'dim-table.enabled' is set to 'true', the arctic source automatically creates a custom watermark strategy.
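To make the intended join semantics concrete, here is a minimal Python sketch of what "FOR SYSTEM_TIME AS OF" means for a versioned right table: each left row is matched with the latest right-table version whose timestamp is not later than the left row's event time. This is only an in-memory illustration under assumed names (VersionedTable, upsert, as_of, and left_temporal_join are all hypothetical); it is not how Flink or the arctic connector implements the join, where the versions would live in the state backend and be driven by watermarks.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedTable:
    """Toy stand-in for the right-hand (versioned) table of a temporal join.

    Hypothetical helper for illustration only; not part of Flink or arctic.
    """

    def __init__(self):
        # key -> parallel, time-sorted lists of version timestamps and rows
        self._times = defaultdict(list)
        self._rows = defaultdict(list)

    def upsert(self, key, version_time, row):
        """Record that `row` is the value for `key` from `version_time` onward."""
        times, rows = self._times[key], self._rows[key]
        pos = bisect_right(times, version_time)
        times.insert(pos, version_time)
        rows.insert(pos, row)

    def as_of(self, key, event_time):
        """Return the latest row for `key` with version_time <= event_time, else None."""
        times = self._times.get(key)
        if not times:
            return None
        pos = bisect_right(times, event_time)
        return self._rows[key][pos - 1] if pos > 0 else None


def left_temporal_join(left_rows, dim):
    """LEFT JOIN each (key, event_time, payload) row against `dim` as of its event time."""
    return [(key, payload, dim.as_of(key, ts)) for key, ts, payload in left_rows]


dim = VersionedTable()
dim.upsert(1, 10, "v1")
dim.upsert(1, 20, "v2")
joined = left_temporal_join(
    [(1, 5, "a"), (1, 15, "b"), (1, 25, "c"), (2, 15, "d")], dim
)
```

In this sketch, a left row that arrives before any version exists, or whose key is absent from the right table, keeps a null right side, which mirrors the left join in the query above.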