feat: add custom topologySpreadConstraints support to coredns #16983

Open; wants to merge 2 commits into base: master

Conversation

thiagoluiznunes

Context on the change

Recently at WildlifeStudios, we had a brief CoreDNS outage in one of our clusters: a single node crashed, and all CoreDNS pods were running on it.

To avoid this, we would like to set our own topologySpreadConstraints parameters to suit our use case.

What does this PR do?

This PR adds support for customizing the field topologySpreadConstraints in the template of the CoreDNS add-on.


linux-foundation-easycla bot commented Dec 6, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot requested a review from zetaab December 6, 2024 19:03
@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 6, 2024
@k8s-ci-robot
Contributor

Hi @thiagoluiznunes. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 6, 2024
@thiagoluiznunes thiagoluiznunes force-pushed the thiagonunes/coredns-topology-spread branch from 1ad08a7 to b6bcf4b Compare December 6, 2024 19:30
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Dec 6, 2024
@hakman
Member

hakman commented Dec 6, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 6, 2024
@thiagoluiznunes thiagoluiznunes changed the title feat: add custom topologySpreadConstraints support to coredns [WIP] feat: add custom topologySpreadConstraints support to coredns Dec 10, 2024
@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/contains-merge-commits cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/api and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 10, 2024
@thiagoluiznunes thiagoluiznunes force-pushed the thiagonunes/coredns-topology-spread branch 3 times, most recently from 314ad8c to 4d729d5 Compare December 10, 2024 14:20
@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed do-not-merge/contains-merge-commits cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 10, 2024
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Dec 10, 2024
@thiagoluiznunes thiagoluiznunes force-pushed the thiagonunes/coredns-topology-spread branch from c4ba484 to d1681a4 Compare December 10, 2024 19:02
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 10, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign zetaab for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 10, 2024
@thiagoluiznunes thiagoluiznunes changed the title [WIP] feat: add custom topologySpreadConstraints support to coredns feat: add custom topologySpreadConstraints support to coredns Dec 11, 2024
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@thiagoluiznunes
Author

Hey folks, could you review the PR?
cc @hakman @justinsb

@hakman
Member

hakman commented Dec 16, 2024

@justinsb Could you please take a look?
/lgtm
/assign @justinsb

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 16, 2024
@thiagoluiznunes
Author

/retest

@hakman
Member

hakman commented Jan 13, 2025

/retest

@thiagoluiznunes Don't worry, it's a known issue.
/override pull-kops-e2e-cni-flannel

Could you also share what value you would like to use in your case and why?

@k8s-ci-robot
Contributor

@hakman: Overrode contexts on behalf of hakman: pull-kops-e2e-cni-flannel

In response to this:

/retest

@thiagoluiznunes Don't worry, it's a known issue.
/override pull-kops-e2e-cni-flannel

Could you also share what value you would like to use in your case and why?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@thiagoluiznunes
Author

/retest

@thiagoluiznunes Don't worry, it's a known issue. /override pull-kops-e2e-cni-flannel

Could you also share what value you would like to use in your case and why?

@hakman, when you ask about my case, do you mean the /retest or the feature?
Customizing topologySpreadConstraints is necessary because we have some small clusters at Wildlife with only 2 or 3 workload nodes. With so few nodes, the CoreDNS pods can end up scheduled on the same node, which is dangerous if that node fails. Does that make sense?
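
To illustrate the scenario, here is a minimal Python sketch of how a per-node spread constraint would filter placements. It is not the code from this PR. The constraint values, the `k8s-app: kube-dns` selector, and the node names are assumptions chosen for illustration, and `allowed_nodes` is a hypothetical helper that only approximates the documented Kubernetes rule (matching pods on the candidate node, plus the new pod, minus the global minimum, must not exceed `maxSkew`). It ignores `minDomains`, taints, and affinity.

```python
# Illustrative constraint in the shape of a Kubernetes topologySpreadConstraints
# entry. The values and the label selector are assumptions, not taken from the PR.
constraint = {
    "maxSkew": 1,
    "topologyKey": "kubernetes.io/hostname",  # spread across individual nodes
    "whenUnsatisfiable": "DoNotSchedule",
    "labelSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
}


def allowed_nodes(pods_per_node, max_skew):
    """Return the nodes where one more matching pod keeps the skew within max_skew.

    pods_per_node maps a node name to the number of matching pods already on it.
    Simplified rule: (count on node + 1) - (global minimum count) <= max_skew.
    """
    global_min = min(pods_per_node.values())
    return sorted(
        node
        for node, count in pods_per_node.items()
        if count + 1 - global_min <= max_skew
    )


if __name__ == "__main__":
    # Hypothetical 3-node cluster with one CoreDNS pod already on node-a.
    current = {"node-a": 1, "node-b": 0, "node-c": 0}
    print(allowed_nodes(current, constraint["maxSkew"]))  # ['node-b', 'node-c']
```

With `maxSkew: 1` and `DoNotSchedule`, a second CoreDNS pod could not land on the node that already runs one, which is the failure mode described above.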

@k8s-ci-robot
Contributor

@thiagoluiznunes: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kops-e2e-gce-cni-calico 824192b link unknown /test pull-kops-e2e-gce-cni-calico
pull-kops-e2e-gce-cni-kindnet 824192b link unknown /test pull-kops-e2e-gce-cni-kindnet

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@thiagoluiznunes
Author

Hi @hakman, I wanted to follow up and see if you have any updates on this discussion.

Labels
area/addons area/api area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/office-hours lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
4 participants