-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support snapshots configurations in stats collector, fix bugs where snapshot retention didn't support past DAY format #291
base: main
Are you sure you want to change the base?
Conversation
…napshot retention didn't support past DAY format
apps/spark/src/test/java/com/linkedin/openhouse/jobs/spark/OperationsTest.java
Show resolved
Hide resolved
@@ -336,6 +343,9 @@ private static Map<String, Object> getTablePolicies(Table table) { | |||
policyMap.put( | |||
"sharingEnabled", Boolean.valueOf(policiesObject.get("sharingEnabled").getAsString())); | |||
} | |||
if (policiesObject.get("history") != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this edge case tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup it's tested as part of this test which sets up the history policy json on the backend. Also tested manually with a table with a configured history policy.
openhouse/apps/spark/src/test/java/com/linkedin/openhouse/jobs/spark/OperationsTest.java
Line 895 in cf211ed
public void testCollectHistoryPolicyStatsWithSnapshots() throws Exception { |
private static HistoryPolicyStatsSchema buildHistoryPolicy( | ||
Map<String, Object> historyPolicy, Long currentSnapshotTimestamp) { | ||
String granularity = (String) historyPolicy.getOrDefault("granularity", null); | ||
Integer maxAge = Integer.valueOf((String) historyPolicy.getOrDefault("maxAge", "0")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these defaults accurately represent the table?
aren't there global defaults that apply in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good callout, I will change it to the current defaults of 3 days. I was also thinking of representing unconfigured history tables as null but in practice they'd have 3 days snapshot retention
@@ -30,6 +30,8 @@ public class IcebergTableStats extends BaseTableMetadata { | |||
|
|||
private Long oldestSnapshotTimestamp; | |||
|
|||
private Long numSnapshots; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm. this sounds like a good time to also add secondOldestSnapshotTimestamp. wdyt?
this will address the LONG standing and very noisy bug of false positive snapshot expiration alerts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain more about the false positives that we are seeing? Currently we keep track of the oldest snapshot and the newest snapshot, wondering what value the secondOldestSnapshot would bring. I can add it though if needed.
apps/spark/src/main/java/com/linkedin/openhouse/jobs/spark/Operations.java
Show resolved
Hide resolved
apps/spark/src/main/java/com/linkedin/openhouse/jobs/spark/Operations.java
Show resolved
Hide resolved
Assertions.assertEquals(stats.getSharingEnabled(), false); | ||
Assertions.assertEquals(stats.getHistoryPolicy().getMaxAge(), 2); | ||
Assertions.assertEquals(stats.getHistoryPolicy().getNumVersions(), 20); | ||
Assertions.assertEquals(stats.getHistoryPolicy().getDateGranularity(), "DAY"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we added on test for HOURS as well?
Summary
Issue Briefly discuss the summary of the changes made in this
pull request in 2-3 lines.
We want to be able to emit events with user tables that have the history configuration for them so that monitoring systems can accurately detect when snapshot expiration fails for configured tables.
This PR also fixes a bug where not all date granularities were supported in retention, since
TimeUnit
is maxed granularity at DAYS when technically the snapshot policy can be kept at higher granularities of MONTH and YEAR, although we currently still limit it to 3 DAYS.Changes
For all the boxes checked, please include additional details of the changes made in this pull request.
Testing Done
For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.
Unit tests, tested snapshot version and ran tablestatscollector job in a cluster on Spark, did not see any errors.
Additional Information
For all the boxes checked, include additional details of the changes made in this pull request.