Description
Greetings Team!
We are repeatedly hitting the following issue: the first Tekton pipeline started each day fails to get scheduled and stays stuck in the Pending state for 20 minutes. After that, the pods are scheduled onto the nodes, and the pipelines either complete or time out.
Also, we use dedicated node pools (default) for scheduling the Tekton pipelines. The nodes are already in the running state, so the delay is not caused by nodes spinning up to schedule the pods.
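For anyone trying to reproduce this, here is a minimal sketch of how the pending pods and their scheduling events could be captured while a run is stuck. The function name `pending_pod_report` and the `KUBECTL` override variable are hypothetical helpers introduced only for illustration, and the namespace passed in is a placeholder; the output of these commands is not included in this report.

```shell
#!/usr/bin/env bash
# Sketch only: collect scheduling details for Pending pods in one namespace.
# KUBECTL is a hypothetical override so a different binary can be substituted.
KUBECTL="${KUBECTL:-kubectl}"

pending_pod_report() {
  local ns="$1"   # namespace where the PipelineRun pods are created (placeholder)

  # Pods in the Pending phase: not yet scheduled, or scheduled but with
  # containers that have not started yet.
  "$KUBECTL" get pods -n "$ns" --field-selector=status.phase=Pending -o wide

  # Events in the namespace ordered by timestamp, which is where scheduler
  # messages such as FailedScheduling would normally appear.
  "$KUBECTL" get events -n "$ns" --sort-by=.lastTimestamp
}
```

Running `pending_pod_report <namespace>` during the 20-minute window should show whether the pod has a node assigned and which events were recorded for it.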
We run Tekton on our EKS cluster (1.33) hosted on AWS, with Karpenter as the autoscaler. The pipeline controller and other components are at the following versions:
Tekton pipelines : v1.1.0
Tekton Triggers: v0.32.0
Tekton Dashboard: v0.58.0
pipeline.tekton.dev/release=v1.1.0
And we are using the following settings in the feature flags:
C:\Users\AL41560>kubectl get cm feature-flags -n tekton-pipelines -o yaml
apiVersion: v1
data:
  await-sidecar-readiness: "true"
  coschedule: disabled
  disable-creds-init: "false"
  disable-inline-spec: ""
  enable-api-fields: beta
  enable-artifacts: "false"
  enable-cel-in-whenexpression: "false"
  enable-concise-resolver-syntax: "false"
  enable-kubernetes-sidecar: "false"
  enable-param-enum: "false"
  enable-provenance-in-status: "true"
  enable-step-actions: "true"
  enable-tekton-oci-bundles: "false"
  enforce-nonfalsifiability: none
  keep-pod-on-cancel: "false"
  require-git-ssh-secret-known-hosts: "false"
  results-from: termination-message
  running-in-environment-with-injected-sidecars: "true"
  send-cloudevents-for-runs: "false"
  set-security-context: "false"
  set-security-context-read-only-root-filesystem: "false"
  trusted-resources-verification-no-match-policy: ignore
Also, in our TriggerTemplate we are using the following config:
resourcetemplates:
  - apiVersion: tekton.dev/v1beta1
    kind: PipelineRun
    metadata:
      generateName: dev-deploy-$(tt.params.bitbucket-repository-name)-pipeline-run-
    spec:
      pipelineRef:
        name: dev-sit-deploy-pipeline
      podTemplate:
        nodeSelector:
          karpenter.sh/nodepool: "default"
          node.kubernetes.io/instance-type: "m5.4xlarge"
How can we fix this issue?