[AMORO-2009]: Add aws/s3/hadoop-aws and iceberg-aws as default dependencies. #2080

Merged: baiyangtx merged 8 commits into apache:master from the hadoop-aws branch on Oct 12, 2023

Conversation

baiyangtx
Contributor

Why are the changes needed?

Close #2009.

Brief change log

  • Add iceberg-aws as a default dependency for all modules.
  • Add hadoop-aws and awssdk to the AMS module dependencies.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (no)

@github-actions github-actions bot added module:core Core module type:build module:ams-dashboard Ams dashboard module labels Oct 11, 2023
@baiyangtx baiyangtx marked this pull request as ready for review October 11, 2023 12:28
@github-actions github-actions bot added the module:mixed-spark Spark module for Mixed Format label Oct 11, 2023
@codecov

codecov bot commented Oct 11, 2023

Codecov Report

All modified lines are covered by tests ✅

see 3 files with indirect coverage changes


<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
</dependency>
Contributor

@XBaith XBaith Oct 11, 2023

In which scenarios do we need the hadoop-aws dependency?

Contributor Author

To support the s3a:// file system with Hive.

</dependency>
</dependencies>
</profile>
</profiles>
Contributor

One thing I'm not sure about: amoro-core is used by multiple modules, so will moving to the aws-server module affect the service?

Contributor Author

The AWS SDK should be controlled by the compute engines. Amoro connectors should not shade it.
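
One way to express this idea in Maven, as a sketch rather than the change made in this PR: a connector module can declare the AWS SDK with `provided` scope, so it compiles against the SDK while the compute engine supplies the runtime copy and the shade plugin leaves it out of the bundled jar. The coordinates `software.amazon.awssdk:s3` are the public AWS SDK v2 S3 artifact; the `${awssdk.version}` property name is a hypothetical placeholder.

```xml
<!-- Sketch only, not copied from this PR's diff. -->
<!-- ${awssdk.version} is a hypothetical property name. -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>${awssdk.version}</version>
    <!-- provided: on the compile classpath, expected from the engine at runtime,
         and not included by maven-shade-plugin in the shaded artifact -->
    <scope>provided</scope>
</dependency>
```

With this scope, the engine's own SDK version is the one loaded at runtime, which avoids two conflicting copies on the classpath.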

@baiyangtx
Contributor Author

Works locally and on a standalone MinIO cluster.

[Image]

[Image]
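
For readers who want to try a similar check, here is a minimal sketch of a Hadoop `core-site.xml` fragment that points the s3a:// file system at a MinIO endpoint. The property names are standard Hadoop S3A settings; the endpoint URL and the credentials are placeholder assumptions, not values from this PR.

```xml
<!-- Sketch only: endpoint and credentials below are placeholders. -->
<configuration>
    <property>
        <name>fs.s3a.endpoint</name>
        <!-- Assumed local MinIO address -->
        <value>http://localhost:9000</value>
    </property>
    <property>
        <name>fs.s3a.access.key</name>
        <value>YOUR_ACCESS_KEY</value>
    </property>
    <property>
        <name>fs.s3a.secret.key</name>
        <value>YOUR_SECRET_KEY</value>
    </property>
    <property>
        <!-- MinIO is typically addressed path-style rather than virtual-host style -->
        <name>fs.s3a.path.style.access</name>
        <value>true</value>
    </property>
    <property>
        <!-- Plain HTTP for a local test endpoint -->
        <name>fs.s3a.connection.ssl.enabled</name>
        <value>false</value>
    </property>
</configuration>
```

These settings only take effect when hadoop-aws and a matching AWS SDK are on the classpath, which is the gap this PR addresses for the default distribution.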

Contributor

@zhoujinsong zhoujinsong left a comment

LGTM.

Contributor

@wangtaohz wangtaohz left a comment

LGTM. 👍

We can merge it ASAP, which would be helpful for my local testing.

@baiyangtx baiyangtx merged commit 0702c64 into apache:master Oct 12, 2023
@baiyangtx baiyangtx deleted the hadoop-aws branch October 12, 2023 08:50
ShawHee pushed a commit to ShawHee/arctic that referenced this pull request Dec 29, 2023
…enies. (apache#2080)

* hadoop-aws pom

* enable aws profile

* add aws sdk and s3 dependencies .

* include iceberg-aws for runtime

* iceberg-aws scope to compile

* fix spark run

---------

Co-authored-by: ZhouJinsong <[email protected]>
Labels
module:ams-dashboard Ams dashboard module module:core Core module module:mixed-spark Spark module for Mixed Format type:build
Development

Successfully merging this pull request may close these issues.

[Improvement]: Support s3:// and s3a:// protocols in the default distribution.
4 participants