Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Container Load Balancer - Network Security Group (NSG) #8054

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

georgeedward2000
Copy link

@georgeedward2000 georgeedward2000 commented Jan 17, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

For a container load balancer (CLB) with floating IP disabled, the Network Security Group (NSG) must allow traffic to dynamic pod IPs. To manage this:

  • Allow Rule for Pod Subnet: Configure the NSG to allow traffic from the entire pod subnet of the managed cluster, accommodating dynamic pod IPs.
  • Backend Pool Configuration: Ensure the NSG rules permit traffic to the pod IPs in the backend pool.
    This setup ensures secure and efficient communication with the pod backend pool.
  • Unit Tests

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added the ability to set the type of backend pool to PodIP. 
This action allows the Load balancer attached to the service to route traffic directly to the containers, 
instead of routing to AKS nodes and then AKS node routing it to the PODs hosting the service.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Jan 17, 2025
Copy link

linux-foundation-easycla bot commented Jan 17, 2025

CLA Not Signed

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/needs-kind labels Jan 17, 2025
@k8s-ci-robot
Copy link
Contributor

Welcome @georgeedward2000!

It looks like this is your first PR to kubernetes-sigs/cloud-provider-azure 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cloud-provider-azure has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @georgeedward2000. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 17, 2025
@georgeedward2000 georgeedward2000 changed the title Update NSG - CLB Load Balancer Update NSG - Container Load Balancer Jan 17, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 17, 2025
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 20, 2025
}
if wantLb && !lbFound {
logger.Error(err, "Failed to get load balancer")
return nil, fmt.Errorf("unable to get lb %s", lbName)
}
var backendIPv4List, backendIPv6List []string
if lbFound {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For POD IP based backendpool, we are not populating the POD IPs in the NSG. Rather, the POD subnet. So, we could skip GetBackendPrivateIPs for POD IP based Backend pool.

Copy link
Author

@georgeedward2000 georgeedward2000 Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the initial flow - node ip based backendpool

It is part of the block that is on the else branch of :
if az.IsLBBackendPoolTypePodIP() {
} else {
// HERE
}

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: georgeedward2000, kartickmsft
Once this PR has been reviewed and has the lgtm label, please assign nilo19 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

az.PodCidrIPv6 = prefix
}
}
az.RetrievedClusterPodCidr = true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can podcidrs change during cluster operations (eg. adding new node pool)? If yes, then do we need to handle this differently to account for potential new cidrs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, the pod subnet (hence the cidr) is set during cluster's deployment and cannot be updated during its lifetime. Please confirm @kartickmsft

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that can happen when completely new node pool is added: https://learn.microsoft.com/en-us/azure/aks/create-node-pools#add-a-node-pool-with-a-unique-subnet

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POD subnet can be newly configured for a new nodepool for Azure CNI Dynamic IP allocation and Enhanced subnet option (https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni-dynamic-ip-allocation#adding-node-pool).
So, yeah, we need to handle it differently. As per my understanding, cloud-provider doesn't get any notification about nodepool addition/deletion. If this understanding is correct, maybe reading PODCIDRs each time maybe the only option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it may change. We'd need to figure out a way to get notified on such changes

Copy link
Member

@feiskyer feiskyer Feb 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this understanding is correct, maybe reading PODCIDRs each time maybe the only option.

This may be not enough as there may be down times when new IPs are picked while they are blocked by NSG.

@@ -47,6 +47,18 @@ func (f *AzureLoadBalancerFixture) IPv4Addresses() []string {
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some details about the change in the PR description?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the PR description

@georgeedward2000 georgeedward2000 changed the title Update NSG - Container Load Balancer Container Load Balancer - Network Security Group (NSG) Feb 4, 2025
@feiskyer
Copy link
Member

@georgeedward2000 Thanks for the contribution. Could you sign CLA?

@feiskyer
Copy link
Member

And could you use the template (the template would show when you open the PR) and fill the template contents (including release notes information together with what this PR does)?

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 18, 2025
Copy link
Member

@feiskyer feiskyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, there is a security limitation of current design, please also add it as comments and TODOs.

{

if az.IsLBBackendPoolTypePodIP() {
if !az.RetrievedClusterPodCidr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, this would block the security boundary of the Kubernetes cluster. The cluster identity should not have the permission to manage itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's use new cloud config options and ask the provisioning service (e.g. AKS or capz) to setup the CIDRs on changing.

// If the pod subnet prefix is not set, the value is false.
// If the pod subnet prefix is set, the value is true.
// Note: Multiple rules per subnet with different protocols and destination ports can exist.
RetrievedClusterPodCidr bool `json:"retrievedClusterPodCidr" yaml:"retrievedClusterPodCidr"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented, let's remove this option and ask provisioning service to set the PodCIDRs

// PodCidrIPv6 is the IPv6 pod subnet prefix for the cluster.
// The pod subnet prefix is used to configure the NSG for the pod subnet.
PodCidrIPv4 netip.Prefix `json:"podCidrIPv4" yaml:"podCidrIPv4"`
PodCidrIPv6 netip.Prefix `json:"podCidrIPv6" yaml:"podCidrIPv6"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For extensibility, let's use LIST here?


// PodCidrIPv4 is the IPv4 pod subnet prefix for the cluster.
// PodCidrIPv6 is the IPv6 pod subnet prefix for the cluster.
// The pod subnet prefix is used to configure the NSG for the pod subnet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add a message to say all Pod CIDR would be opened to internet by default?

and add a TODO comment to improve the security later?

@feiskyer
Copy link
Member

Let's also bake some tests.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 18, 2025
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Feb 18, 2025
@k8s-ci-robot
Copy link
Contributor

Adding label do-not-merge/contains-merge-commits because PR contains merge commits, which are not allowed in this repository.
Use git rebase to reapply your commits on top of the target branch. Detailed instructions for doing so can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@nilo19
Copy link
Contributor

nilo19 commented Feb 20, 2025

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. do-not-merge/contains-merge-commits kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants