Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add containerd config options #11080

Merged
merged 2 commits into from
Aug 21, 2024

Conversation

spnngl
Copy link
Contributor

@spnngl spnngl commented Apr 12, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

Get more containerd configuration options, notably SElinux and debug socket.

Which issue(s) this PR fixes:

Fixes #11081

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

[containerd] added debug config variables `containerd_debug_address`, `containerd_debug_level`, `containerd_debug_format` `containerd_debug_uid`, `containerd_debug_gid`; it is disabled by default.
[containerd] added "io.containerd.grpc.v1.cri" config variables `containerd_enable_selinux`, `containerd_disable_apparmor`, `containerd_tolerate_missing_hugetlb_controller`, `containerd_disable_hugetlb_controller`, `containerd_image_pull_progress_timeout`

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 12, 2024
@k8s-ci-robot k8s-ci-robot requested review from VannTen and yankay April 12, 2024 14:13
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 12, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @spnngl. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Apr 12, 2024
level = "{{ containerd_debug_level }}"
format = "{{ containerd_debug_format }}"

Copy link
Member

@yankay yankay Apr 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @spnngl

When using containerd config default, it outputs:

[debug]
  address = ""
  format = ""
  gid = 0
  level = ""
  uid = 0

So, the default config should not be changed :-)

If we want to customize the config, we can use the code:

{% if containerd_debug_address %}
  address = "{{ containerd_debug_address }}"
{% else %}

And the code in roles/container-engine/containerd/defaults/main.yml

# some comments...
# containerd_debug_address: "/var/run/containerd/debug.sock"

The same as the other variables.
Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it in last push

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I really disagree with that. It's of little use to match the output of containerd config default, because it's not like we can use that directly in our jinja template, so we have to update mostly manually anyway if containerd defaults change.

I'd rather have explicit defaults and a more straightforward template, with as little branching as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put containerd_debug_level back to its previous value and the others will be used only if defined by the user

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that we should not have to use is defined everywhere, and keeping the same structure that the output of containerd config default buy us nothing, so we should just have something like:

{% if containerd_debug_enabled %}
[debug]
  setting_1 = {{ containerd_default_value_for_setting_1 }}
  ...
{% endif %}

@yankay
Copy link
Member

yankay commented Apr 14, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 14, 2024
@spnngl spnngl force-pushed the chore/containerd/config branch from 12ed2f1 to 10ec72f Compare April 15, 2024 14:02
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 15, 2024
@spnngl spnngl requested a review from yankay April 15, 2024 14:03
@spnngl spnngl force-pushed the chore/containerd/config branch 3 times, most recently from 72f332a to 66ca060 Compare April 17, 2024 13:01
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 17, 2024
@spnngl spnngl force-pushed the chore/containerd/config branch from 66ca060 to b2a8f7c Compare April 17, 2024 17:14
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 17, 2024
@mzaian
Copy link
Contributor

mzaian commented May 6, 2024

@spnngl please add a proper release note.

@spnngl
Copy link
Contributor Author

spnngl commented May 7, 2024

I just added it to the release-note section of this PR

@mzaian
Copy link
Contributor

mzaian commented May 13, 2024

@VannTen @yankay Could you please do a final review here.

@@ -8,7 +8,19 @@ oom_score = {{ containerd_oom_score }}
max_send_message_size = {{ containerd_grpc_max_send_message_size }}

[debug]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, the debug socket is not enabled if this section is not present.
IMO, this whole section should be guarded with something like containerd_enabled_debug_socket, defaulting to false.
Production setup should presumably not have a debug socket lying around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug log level is not the same thing that the normal log level, is it not ?

I know that section is already present. What I mean is that it should not be enabled by default, as kubespray by default target production use cases, and having a debug socket enabled in production is not optimal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It set the log level for everything, not just debug ones.
Same for the log format inside it.

Do you still want to disable this section entirely with a containerd_enabled_debug bool ? And no if defined inside it ?

See: https://github.com/containerd/containerd/blob/83031836b2cf55637d7abf847b17134c51b38e53/cmd/containerd/command/main.go#L348

It takes the config.Debug.Level if no cli params is used, cli flags have higher priorities but we do not use them in kubespray.

func setLogLevel(context *cli.Context, config *srvconfig.Config) error {
	l := context.GlobalString("log-level")
	if l == "" {
		l = config.Debug.Level
	}
	if l != "" {
		return log.SetLevel(l)
	}
	return nil
}

Config struct is found here and the debug section looks like this:

type Debug struct {
	Address string `toml:"address"`
	UID     int    `toml:"uid"`
	GID     int    `toml:"gid"`
	Level   string `toml:"level"`
	// Format represents the logging format. Supported values are 'text' and 'json'.
	Format string `toml:"format"`
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is so weird (or I totally misunderstood something). Does that mean there is no way to change the log level from the config without enabling the debug socket ?
I mean, I assumed that the "normal" log stream (which I presume goes to standard output) and the "debug" log stream (to the debug socket) where different, but are they in fact the same ?

I've tried to check containerd documentation but couldn't find explanation about this.


That said, I still thinks we should not enable the debug socket in production, (unless the debug socket is an unfortunate naming and it's in fact required ?). Since we handle the generation of the systemd unit for containerd we could inject the log level as flags... 🤔

That does seems a bit weird from containerd though, I feel like I'm missing something.

Copy link
Contributor Author

@spnngl spnngl May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the [debug] section in the toml config does not mean we activate the debug socket.
It is only activated if config.Debug.Address != "", you can see the code here: https://github.com/containerd/containerd/blob/v1.7.16/cmd/containerd/command/main.go#L232

By default debug socket is always disabled.

		if config.Debug.Address != "" {
			var l net.Listener
			if isLocalAddress(config.Debug.Address) {
				if l, err = sys.GetLocalListener(config.Debug.Address, config.Debug.UID, config.Debug.GID); err != nil {
					return fmt.Errorf("failed to get listener for debug endpoint: %w", err)
				}
			} else {
				if l, err = net.Listen("tcp", config.Debug.Address); err != nil {
					return fmt.Errorf("failed to get listener for debug endpoint: %w", err)
				}
			}
			serve(ctx, l, server.ServeDebug)
		}

https://github.com/containerd/containerd/blob/v1.7.16/cmd/containerd/command/main_unix.go#L76

func isLocalAddress(path string) bool {
	return filepath.IsAbs(path)
}

It is exactly the same for the metrics one: https://github.com/containerd/containerd/blob/v1.7.16/cmd/containerd/command/main.go#L245

		if config.Metrics.Address != "" {
			l, err := net.Listen("tcp", config.Metrics.Address)
			if err != nil {
				return fmt.Errorf("failed to get listener for metrics endpoint: %w", err)
			}
			serve(ctx, l, server.ServeMetrics)
		}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I got it, thanks for bearing with me (https://github.com/containerd/containerd/blob/main/docs/man/containerd-config.toml.5.md is not super helpful here !).

So, defining the address is what activate the debug socket, got it.
But according to the link above, it has a default value of /run/containerd/debug.sock, so it would be activated by default, is that correct ?

What I would do in that case is have in our defaults/main.yml something like that.

containerd_debug_enable: false
contaienrd_debug_address: '/run/containerd/debug.sock
# the rest of the value with default matching containerd 

And in the template something likg

[debug]
     address = "{{ containerd_debug_enable | ternary(containerd_debug_address, '') }}"
     # the rest of the values

wdyt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc is misleading, it says it is enabled by default but it is not.

Here the config from a newly created node in my cluster

version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
oom_score = 0

[grpc]
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[debug]
  level = "info"

[metrics]
  address = ""
  grpc_histogram = false

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.k8s.io/pause:3.9"
    max_container_log_line_size = -1
    enable_unprivileged_ports = false
    enable_unprivileged_icmp = false
  [plugins."io.containerd.grpc.v1.cri".containerd]
    default_runtime_name = "runc"
    snapshotter = "overlayfs"
  
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
        runtime_type = "io.containerd.runc.v2"
        runtime_engine = ""
        runtime_root = ""
        base_runtime_spec = "/etc/containerd/cri-base.json"
    
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            systemdCgroup = true
            binaryName = "/opt/bin/runc"

No debug.sock is setup nor present on the node, it seems to default to "", we can check with /opt/bin/containerd config dump which gives us

[debug]
  address = ""
  format = ""
  gid = 0
  level = "info"
  uid = 0

We can setup it explicitely like you said, it may be less confusing for us and users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[debug]
  address = ""
  format = ""
  gid = 0
  level = "info"
  uid = 0

I think this would be an okay output when running with kubespray defaults, so yeah, let's go the explicit way (minus the enable which is not needed if we just default to "" for the address.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed my modifications

Comment on lines 49 to 53
# debug socket location: unix or tcp format
containerd_debug_address: ""
containerd_debug_level: ""
# logs format, supported values: text, json
containerd_debug_format: ""
containerd_debug_uid: 0
containerd_debug_gid: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have some consistency here : either put the default for all settings, or for none.
And I'd lean towards documenting the defaults explicitely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put it to the containerd config default as requested previously, I changed it back to what it was for containerd_debug_level and add the others only if they are defined by the user

level = "{{ containerd_debug_level }}"
format = "{{ containerd_debug_format }}"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I really disagree with that. It's of little use to match the output of containerd config default, because it's not like we can use that directly in our jinja template, so we have to update mostly manually anyway if containerd defaults change.

I'd rather have explicit defaults and a more straightforward template, with as little branching as possible.

@spnngl spnngl force-pushed the chore/containerd/config branch from b2a8f7c to 715f28e Compare May 14, 2024 15:45
@VannTen
Copy link
Contributor

VannTen commented May 21, 2024

Also, could you make the release note a bit more concise ? It's supposed to be a single line entry in a Releases Notes (example

@spnngl spnngl force-pushed the chore/containerd/config branch from 715f28e to 5c2f11a Compare May 22, 2024 15:37
@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label May 22, 2024
@spnngl spnngl force-pushed the chore/containerd/config branch from 5c2f11a to a7b8474 Compare June 27, 2024 14:50
@spnngl spnngl force-pushed the chore/containerd/config branch from a7b8474 to 964fc23 Compare July 19, 2024 13:20
@spnngl
Copy link
Contributor Author

spnngl commented Jul 19, 2024

@VannTen Hi, anything to add before merging ?

@spnngl spnngl requested a review from VannTen August 6, 2024 06:58
@yankay
Copy link
Member

yankay commented Aug 15, 2024

Thanks @spnngl

A great PR. /approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 15, 2024
@yankay
Copy link
Member

yankay commented Aug 15, 2024

HI @VannTen @mzaian
Would you please help to review it :-)

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mzaian, spnngl, yankay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yankay yankay mentioned this pull request Aug 15, 2024
@tico88612
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 21, 2024
@k8s-ci-robot k8s-ci-robot merged commit 2f84567 into kubernetes-sigs:master Aug 21, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Containerd: add debug and some CRI configuration options
6 participants