Flyte multi-cluster deployment - Allow deferring cluster label selection to external service #6240
fg91 started this conversation in RFC Incubator
Replies: 1 comment
-
Update contributors' sync: |
Flyte can be deployed with multiple data plane clusters on which executions can be scheduled.
The cluster used for a given execution can be selected in various ways:
pyflyte run
As a platform engineer maintaining Flyte, I would like to 1) relieve platform users from having to decide on the execution cluster and 2) take real-time resource consumption into account in an automated way when deciding which cluster to use.
The external service, which platform engineers would implement themselves, could for example monitor the utilization of the available clusters and the current cloud provider resource quotas in order to decide on an execution cluster label.
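To make the idea concrete, here is a minimal sketch of the decision logic such a service might contain. Everything in it is hypothetical and not part of Flyte: the `ClusterStatus` type, its fields, the `choose_cluster_label` function, and the 0.9 utilization threshold are assumptions for illustration only. How the service would collect these numbers and how flyteadmin would call it is left open.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ClusterStatus:
    """Hypothetical snapshot of one data plane cluster (not a Flyte type)."""
    label: str            # execution cluster label served by this cluster
    utilization: float    # fraction of capacity in use, 0.0 to 1.0
    quota_remaining: int  # remaining cloud provider quota, arbitrary units


def choose_cluster_label(clusters: Iterable[ClusterStatus],
                         max_utilization: float = 0.9) -> Optional[str]:
    """Return the label of the least utilized cluster that still has quota.

    Returns None when no cluster qualifies, so the caller can decide
    how to fall back (for example to a statically configured label).
    """
    candidates = [
        c for c in clusters
        if c.quota_remaining > 0 and c.utilization < max_utilization
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.utilization).label
```

Returning `None` rather than raising keeps the fallback policy outside the selection logic, which seems preferable if flyteadmin is expected to keep working when the external service has no opinion.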
As a bonus feature, flyteadmin could provide this external service with a summary of which resource types (in particular accelerator types) the workflow execution will use, and in what quantity. This would be very helpful but is not a requirement.
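As a rough illustration of how such a summary could be used, the sketch below assumes the summary arrives as a mapping from accelerator type to requested device count. This shape, the `choose_label_for_summary` function, and the example accelerator names are assumptions made for this sketch; the proposal does not define a format.

```python
from typing import Dict, Optional

# Hypothetical summary shape: accelerator type -> number of devices requested.
ResourceSummary = Dict[str, int]


def choose_label_for_summary(
    summary: ResourceSummary,
    free_accelerators: Dict[str, Dict[str, int]],
) -> Optional[str]:
    """Return the first cluster label (in sorted order) that can fit the request.

    `free_accelerators` maps a cluster label to its currently free devices
    per accelerator type. Returns None if no cluster can satisfy the summary.
    """
    for label in sorted(free_accelerators):
        free = free_accelerators[label]
        if all(free.get(acc, 0) >= count for acc, count in summary.items()):
            return label
    return None


# Usage with made-up numbers:
free = {
    "cluster-a": {"nvidia-tesla-t4": 16},
    "cluster-b": {"nvidia-tesla-a100": 8},
}
label = choose_label_for_summary({"nvidia-tesla-a100": 8}, free)
```

With the made-up numbers above, only `cluster-b` has enough free A100 devices, so it is the label selected.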