Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SEDONA-713] add OSM PBF reader #1823

Merged
merged 5 commits into from
Feb 26, 2025
Merged

[SEDONA-713] add OSM PBF reader #1823

merged 5 commits into from
Feb 26, 2025

Conversation

Imbruced
Copy link
Member

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

  • Yes, the URL of the associated JIRA ticket is https://issues.apache.org/jira/browse/SEDONA-713. The PR name follows the format [SEDONA-713] my subject.

  • No:

    • this is a documentation update. The PR name follows the format [DOCS] my subject
    • this is a CI update. The PR name follows the format [CI] my subject

What changes were proposed in this PR?

How was this patch tested?

Did this PR include necessary documentation updates?

  • Yes, I am adding a new API. I am using the current SNAPSHOT version number in vX.Y.Z format.
  • Yes, I have updated the documentation.
  • No, this PR does not affect any public API so no need to change the documentation.

@Imbruced Imbruced changed the title Sedona 713 add osm reader Sedona 713 add OSM PBF reader Feb 20, 2025
@Imbruced Imbruced force-pushed the SEDONA-713-add-osm-reader branch from 091a0e7 to 4530942 Compare February 20, 2025 20:13
@Imbruced Imbruced marked this pull request as ready for review February 20, 2025 21:58
@Imbruced Imbruced requested a review from jiayuasu as a code owner February 20, 2025 21:58
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the code put in spark-3.5 folder? instead of common? Does it only work with spark-3.5?

@Imbruced
Copy link
Member Author

Hi, I added an example to discuss the correctness of the code. We need to either copy the code for spark 3.4 and 3.3 or abstract the

  val path = new Path(new URI(file.filePath.toString()))

As I think it might fail for spark 3.4 and 3.3 due to this, I need to confirm it

@jiayuasu jiayuasu changed the title Sedona 713 add OSM PBF reader [SEDONA-713] add OSM PBF reader Feb 24, 2025
@jiayuasu
Copy link
Member

@zhangfengcdt @Kontinuation I think we probably ran into this val path = new Path(new URI(file.filePath.toString())) issue?

@zhangfengcdt
Copy link
Collaborator

@zhangfengcdt @Kontinuation I think we probably ran into this val path = new Path(new URI(file.filePath.toString())) issue?

I think 3.3 and >3.4.0 should be good bacause the changes are:

https://javadoc.io/static/org.apache.spark/spark-sql_2.12/3.3.2/org/apache/spark/sql/execution/datasources/PartitionedFile.html
https://javadoc.io/static/org.apache.spark/spark-sql_2.12/3.5.0/org/apache/spark/sql/execution/datasources/PartitionedFile.html

The filePath.toString() shild all return the current URI string.

Add documentation.

Add documentation.

Add documentation.

Add documentation.

Add documentation.

SEDONA-713 moving to common.
@Imbruced Imbruced force-pushed the SEDONA-713-add-osm-reader branch from d9c0d99 to ef1cbba Compare February 25, 2025 22:39
@@ -0,0 +1,167 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the same OSM reader code works for all versions (i.e., in spark/common), we should move the test to spark/common as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah let's what happens :D

@Imbruced
Copy link
Member Author

@jiayuasu looks like everything passed

@jiayuasu jiayuasu added this to the sedona-1.7.1 milestone Feb 26, 2025
@jiayuasu jiayuasu merged commit fba7d75 into master Feb 26, 2025
39 checks passed
@jiayuasu jiayuasu deleted the SEDONA-713-add-osm-reader branch February 28, 2025 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants