Motivation: Why do you think this is important?
Today, the flyteadmin pod is blocked from starting until the OIDC provider is healthy and available (the pod gets stuck in an Error state). In some Kubernetes configurations, this failing pod can cause deployment-wide issues. The current behavior could be made more resilient.
(Note that this applies to configurations using useAuth=true.)
Goal: What should the final outcome look like, ideally?
A better approach in these configurations is to allow Flyte to start up even if the OIDC provider is unavailable, and then try to re-initialize the OIDC provider later in the deployment's lifespan. This is more resilient, and it can be made configurable.
Describe alternatives you've considered
A workaround is to disable Flyte until the OIDC provider is available.
FlyteAdmin being blocked from starting until the OIDC provider is healthy is a known issue: per Flyte's documentation, FlyteAdmin requires an OIDC provider for authentication when useAuth=true, so an unavailable provider can cause deployment issues. Allowing FlyteAdmin to start anyway and re-initialize the OIDC provider later would improve resiliency, and the behavior could be made configurable. Currently, the workaround is to disable Flyte until the OIDC provider is available.
A better approach in these configurations is to allow Flyte to start up even if the OIDC provider is unavailable, and then try to re-initialize the OIDC provider later in the deployment's lifespan. This is more resilient, and it can be made configurable.
This is a double-edged sword. If I recall correctly, there was some work done recently to cache something OIDC-related indefinitely at boot. If the OIDC provider is down and you do a normal rolling deployment, you could end up in a worse state: you previously had working pods, but now you have broken ones.
Propose: Link/Inline OR Additional context
Proposed fix here: #5702