[DSIP-63][k8s] Support User-customized K8s YAML Task #16478

Closed
2 of 3 tasks
Tracked by #14102
Mighten opened this issue Aug 18, 2024 · 13 comments
@Mighten (Contributor) commented Aug 18, 2024

Action List: Extension of operations for the k8s YAML task:

Search before asking

  • I had searched in the DSIP and found no similar DSIP.

Motivation

Supporting user-customized K8s YAML tasks has the following benefits:

  • Flexibility: Unlike the existing K8s low-code job with limited functionality, YAML tasks give users the flexibility to define sophisticated task instances in DolphinScheduler, similar to what the custom JSON template provides in the DataX task.

  • Workflow Customization: Users can integrate operational and maintenance processes into DolphinScheduler using YAML for complex workflows.

  • Configuration Requirements: The current K8s low-code job does not meet users' in-depth needs, particularly for tasks involving multiple pods or specific configurations like environment variables and tolerations; in contrast, K8s YAML tasks do.

In short, by enabling user-customized YAML tasks, DolphinScheduler can better support a wide range of Kubernetes-based workflows and operational requirements.

Design Detail

2.1 Design Overview

The following is a Swimlane Diagram showing how this k8s YAML task is embedded into Apache DolphinScheduler:

Figure 2-1(1). Design Overview

  1. The user opens a Web page to edit and save a K8s YAML workflow.
  2. The UI provides an editor for the user to input YAML in Custom Template mode.
  3. The API Server encapsulates the command and hands it over to the Master.
  4. The Master splits the workflow DAG and dispatches tasks to the Worker.
  5. The Worker picks the appropriate task executor and operation. E.g., for a k8s Pod YAML, the Worker picks the YAML Task Executor, and then picks the Pod Operation.
  6. The Worker reports status to the Master.
  7. The user reviews the k8s YAML task log in the Task Instance window.

2.2 Frontend Design

The frontend adds support for user-customized k8s YAML tasks while remaining compatible with the original k8s low-code jobs.

Figure 2-2(1). Frontend Design

  1. The Web UI layout

    When the user switches on the Custom Template, the low-code k8s job fields should be hidden and the YAML editor should appear (or vice versa), similar to the JSON Custom Template in the DataX plugin.

    This feature, as shown in Figure 2-2(1), is implemented using the Vue components' span property, controlled by reactive variables (such as yamlEditorSpan) in the file dolphinscheduler-ui/src/views/projects/task/components/node/fields/use-k8s.ts.

  2. The Request body

    When the user switches to Custom Template mode, the request body should include only YAML-related fields (customConfig and yamlContent), and all previously hidden fields should not be sent.

    This feature is implemented using the taskParams in the file dolphinscheduler-ui/src/views/projects/task/components/node/format-data.ts.

  3. i18n/locales

    Apache DolphinScheduler is international software and should support multiple languages.

    The text on the Web UI is retrieved from variables defined in the file dolphinscheduler-ui/src/locales/{en_US, zh_CN}/project.ts. For user-customized k8s YAML tasks, there are three key variables to consider:

    • k8s_custom_template: the label for the switch to enable user-customized k8s YAML tasks.
    • k8s_yaml_template: the label for the text editor used to input user YAML.
    • k8s_yaml_empty_tips: the warning message displayed when a user tries to submit empty YAML.

    This feature is implemented by invoking t('project.node.${variable_name}') (such as t('project.node.k8s_yaml_template')) in the file dolphinscheduler-ui/src/views/projects/task/components/node/fields/use-k8s.ts.

2.3 Backend Design

The backend design describes how the worker executes user-customized k8s YAML tasks. Figure 2-3(1) shows how user-customized k8s YAML Pod tasks relate to the original k8s low-code jobs.

Figure 2-3(1). Backend Design Overview

After the worker checks the parameters, K8sYamlTaskExecutor is loaded for the current user-customized k8s YAML Pod task. Once the YAML is parsed into HasMetadata, its kind field is used to assign abstractK8sOperation as K8sPodOperation for executing the YAML Pod task.
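
For reference, here is a minimal sketch of the parse step, assuming the fabric8 kubernetes-client that DolphinScheduler already uses; the class YamlKindProbe and its method are illustrative placeholders, not the DSIP's actual code:

    import io.fabric8.kubernetes.api.model.HasMetadata;
    import io.fabric8.kubernetes.client.utils.Serialization;

    public class YamlKindProbe {

        // Parse user YAML into a generic HasMetadata; the executor then uses
        // the kind field to pick the matching operation handler.
        public static String resolveKind(String yamlContent) {
            HasMetadata metadata = Serialization.unmarshal(yamlContent);
            return metadata.getKind(); // e.g. "Pod" or "ConfigMap"
        }

        public static void main(String[] args) {
            String yaml = "apiVersion: v1\n"
                    + "kind: Pod\n"
                    + "metadata:\n"
                    + "  name: demo\n";
            System.out.println(resolveKind(yaml)); // prints "Pod"
        }
    }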

  1. K8s Task Executors

    Figure 2-3(2). K8s Task Executors

    Three k8s task executors are involved, as shown in Figure 2-3(2):

    • AbstractK8sTaskExecutor is an abstract class that represents a k8s task executor.
    • K8sTaskExecutor is a concrete class that extends AbstractK8sTaskExecutor to represent the low-code k8s job executor.
    • K8sYamlTaskExecutor is a concrete class that extends AbstractK8sTaskExecutor to represent the user-customized k8s YAML task executor.
  2. K8s Operation Handlers

    Figure 2-3(3). K8s Operation Handlers

    Two operation handlers are involved, as shown in Figure 2-3(3); a combined sketch of both hierarchies follows this list:

    • AbstractK8sOperation is an interface representing all k8s resource operations.
    • K8sPodOperation is a concrete class that implements AbstractK8sOperation to handle Pod operations.
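
To make the relationship concrete, here is a skeletal sketch of how the two hierarchies could fit together. The method names (run, createOrReplace) are assumptions for illustration; the fabric8 calls are real, but the actual DolphinScheduler signatures may differ:

    import io.fabric8.kubernetes.api.model.HasMetadata;
    import io.fabric8.kubernetes.client.KubernetesClient;

    // Executor side: one abstract base, a low-code subclass, a YAML subclass.
    abstract class AbstractK8sTaskExecutor {
        abstract void run() throws Exception;
    }

    class K8sTaskExecutor extends AbstractK8sTaskExecutor {
        @Override
        void run() { /* build a Job from the low-code form fields */ }
    }

    class K8sYamlTaskExecutor extends AbstractK8sTaskExecutor {
        private AbstractK8sOperation abstractK8sOperation; // chosen by kind

        @Override
        void run() { /* parse YAML, pick the operation, delegate to it */ }
    }

    // Operation side: one interface, with Pod as the first concrete handler.
    interface AbstractK8sOperation {
        void createOrReplace(HasMetadata resource);
    }

    class K8sPodOperation implements AbstractK8sOperation {
        private final KubernetesClient client;

        K8sPodOperation(KubernetesClient client) {
            this.client = client;
        }

        @Override
        public void createOrReplace(HasMetadata resource) {
            client.resource(resource).createOrReplace();
        }
    }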

2.4 Usecase Design

A typical use case for a k8s YAML task includes editing YAML, bringing the workflow online, and starting the workflow, just as with k8s low-code jobs, except that the user switches on the Custom Template option to fill in YAML.

Figure 2-4(1). Usecase Design

  1. The user edits a k8s YAML node in a workflow.
  2. If the Custom Template is activated and the YAML content is not blank, the user may bring the whole workflow online.
  3. If the workflow is online, the user may start the workflow and review the logs generated during its execution.

Compatibility, Deprecation, and Migration Plan

3.1 Compatibility Plan

The user-customized k8s YAML feature requires only customConfig to be activated. By default, its value is 0, which applies to the existing k8s low-code jobs.

The remainder of this section will demonstrate the flexibility and compatibility of this design by using the example of introducing ConfigMap support:

    this.k8sYamlType = K8sYamlType.valueOf(this.metadata.getKind());
    generateOperation();

After parsing with YamlUtils::load, the kind field acquired by this.metadata.getKind() will be ConfigMap. Then, this.k8sYamlType is determined and used to generate the corresponding operation:

    private void generateOperation() {
        switch (k8sYamlType) {
            case Pod:
                abstractK8sOperation = new K8sPodOperation(k8sUtils.getClient());
                break;
            case ConfigMap:
                abstractK8sOperation = new K8sConfigmapsOperation(k8sUtils.getClient());
                break;
            default:
                throw new TaskException(
                        String.format("K8sYamlTaskExecutor does not support type %s", k8sYamlType.name()));
        }
    }

Consequently, generateOperation() will set this.abstractK8sOperation to a new instance of K8sConfigmapsOperation. Next, we can implement K8sConfigmapsOperation to handle the ConfigMap operations.
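
A minimal sketch of that handler, assuming the hypothetical AbstractK8sOperation.createOrReplace contract from the sketch in section 2.3 and the fabric8 client:

    import io.fabric8.kubernetes.api.model.ConfigMap;
    import io.fabric8.kubernetes.api.model.HasMetadata;
    import io.fabric8.kubernetes.client.KubernetesClient;

    public class K8sConfigmapsOperation implements AbstractK8sOperation {

        private final KubernetesClient client;

        public K8sConfigmapsOperation(KubernetesClient client) {
            this.client = client;
        }

        @Override
        public void createOrReplace(HasMetadata resource) {
            // resource was parsed from user YAML whose kind is "ConfigMap"
            ConfigMap configMap = (ConfigMap) resource;
            client.configMaps().resource(configMap).createOrReplace();
        }
    }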

3.2 Deprecation Plan

N/A for now, waiting for community opinions.

3.3 Migration Plan

N/A for now, waiting for community opinions.

Test Plan

4.1 Overview

The user-customized k8s YAML task feature allows users to submit YAML tasks to k8s, including Pod, ConfigMap, and other resources.

This test plan aims to ensure that the feature functions as expected and meets user requirements.

4.2 Scope

  1. YAML Pod
| Test Case # | Name | Action | Expectation |
|---|---|---|---|
| 1 | UI Display | Edit YAML, save, and reopen | The YAML content stays up-to-date. |
| 2 | UI Validation | Try to submit empty YAML | The UI modal dialog intercepts the empty YAML. |
| 3 | Online Workflow | Save the workflow and bring it online | The user successfully brings the workflow online. |
| 4 | Dryrun Workflow | Run the workflow in dry-run mode | The Master successfully dry-runs this task. |
| 5 | Test Workflow | Run the workflow in test mode | The Worker successfully tests this task. |
| 6 | Run Workflow | Run the workflow | The Worker successfully runs this task. |

Code of Conduct

  • I agree to follow this project's Code of Conduct

Mighten added the DSIP and Waiting for reply labels on Aug 18, 2024
SbloodyS removed the Waiting for reply label on Aug 18, 2024
@SbloodyS (Member) commented

cc @Gallardot @ruanwenjun

@fuchanghai (Member) commented

@caishunfeng pls help to add this issue to #14102

SbloodyS mentioned this issue on Aug 18, 2024
SbloodyS changed the title from [DSIP][k8s] Support User-customized K8s YAML Task to [DSIP-63][k8s] Support User-customized K8s YAML Task on Aug 18, 2024
@SbloodyS (Member) commented

> @caishunfeng pls help to add this issue to #14102

Done. You're also a DS Committer and have permission to add to it.

@Gallardot (Member) commented Aug 18, 2024

Before discussing this DSIP, I hope everyone can reach a basic consensus. Supporting customization can indeed meet more demand scenarios, but excessive customization can bring more problems.

I see in the design that it supports users directly creating Pods and ConfigMaps, and even supports creating multiple Pods.

Regarding the support for configmap, I have some questions:

  1. Why support configmap? For the same workflow, does it create a configmap for each task instance? Is the content of the configmap different each time?
    If it is the same, why create it each time? As a configuration resource in k8s, shouldn't a configmap be static? As a way to obtain configuration, besides configmap, should secret also be supported?
  2. Should the configmap be mounted to the pod as a file? If so, should PV and PVC be supported?
  3. If it is just to reference the configuration in the configmap, can it be directly referenced through env?

Regarding the support for pod, I have some questions:

  1. How is the name of the pod defined? How can different workflows in the same namespace ensure that pod names do not duplicate? This is also the case with configmaps.
  2. How is the lifecycle of the pod managed? Will DS delete it after the task ends? How to ensure that DS can definitely delete it?
  3. If the execution strategy of the workflow is parallel, how should the pod be handled?
  4. If multiple pods are created at the same time, are these pods related? Or is it just to run multiple pods concurrently? If it is concurrent, does it support Deployments? Does it support StatefulSets? Should DS manage them as a controller of k8s resources? I am afraid this is not what DS should do.
  5. Or more broadly, do you want to support the task of creating helm charts?
  6. How to retrieve the logs of a pod? How to retrieve the logs of multiple pods? If there are multiple containers in a pod, how to retrieve the logs of multiple containers?

If the issues are not adequately addressed, I am afraid I will vote -1 on this DSIP.

@fuchanghai (Member) commented Aug 18, 2024

  1. For each type, we can set a strategy: the first is to ignore the resource if it already exists, and the second is to delete it first and then add it, to meet various scenarios.
  2. Add labels to the pod according to the strategy type: if it is an ignore-if-exists strategy, use taskCode as the label; if it is a delete-first-then-add strategy, use taskInstanceId.
  3. Delete the pod according to the label, via ObjectMeta.setLabel (see the sketch below).
  4. We can also use the taskInstance to replace the user-defined pod name, via ObjectMeta.setName.
  5. Perhaps this issue can be targeted at a single pod, without considering multiple pods. In fact, if there are multiple pods in a node, we can give them a label with the value of processInstanceId+taskInstanceId, obtain the pods through the processInstance, and fetch their logs separately.
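
A rough fabric8 sketch of the label-based cleanup described above; the namespace, label key, and value are illustrative assumptions:

    import io.fabric8.kubernetes.client.KubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClientBuilder;

    public class PodCleanupSketch {
        public static void main(String[] args) {
            try (KubernetesClient client = new KubernetesClientBuilder().build()) {
                // Delete every pod this task instance created, whatever its
                // name, by selecting on the label written at creation time.
                client.pods()
                        .inNamespace("default")
                        .withLabel("dolphinscheduler/taskInstanceId", "4217")
                        .delete();
            }
        }
    }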

@Gallardot cc @EricGao888 @Mighten @SbloodyS WDYT?

@SbloodyS (Member) commented

Totally agreed with @Gallardot

From my personal perspective, since DS is a scheduling system, the current k8s task is mainly used to replace k8s cron-jobs, and we have no plans to support k8s deployment scheduling management since that maintenance would involve a huge amount of work. So we need to reach a basic consensus.

@fuchanghai (Member) commented Aug 19, 2024

This is indeed too big. At present, the most commonly used types in our company are ConfigMap and Pod; Deployments are only used with Flink. For SaaS-type products, ConfigMap changes are usually initiated by users or made when users modify their own configurations. In most cases, the Pod type is the most commonly used. We can open an issue only for Pods and first discuss how to complete the Pod type.
@Mighten cc @Gallardot @SbloodyS

@fuchanghai (Member) commented

For the scenario of a pod with multiple containers, I think it is necessary to divide the logs by container. When querying the logs, the frontend needs to pass the container name to check the logs of the specific container, and it needs a table to switch between containers to view the logs. This change is a bit much; I hope that this issue will only consider a single pod with a single container.

@Gallardot (Member) commented

> This is indeed too big. At present, the most commonly used types in our company are ConfigMap and Pod; Deployments are only used with Flink. For SaaS-type products, ConfigMap changes are usually initiated by users or made when users modify their own configurations. In most cases, the Pod type is the most commonly used. We can open an issue only for Pods and first discuss how to complete the Pod type.
>
> @Mighten cc @Gallardot @SbloodyS

I'm sorry, but I don't agree with this view. Pods are the most commonly used because they are the basic unit of a service workload. But they are also the least used directly, since only early versions of Kubernetes used bare pods; that's why more advanced workloads like Deployments and StatefulSets were introduced later. Managing the lifecycle of pods is an important task in Kubernetes, not just creating a pod.

@fuchanghai (Member) commented

Judging from the current low-code functions of the k8s task, it puts a pod in a Job-type task, which is not much different from a single-pod task.

@qingwli (Member) commented Sep 24, 2024

I agree with @Gallardot's thinking. And for what @fuchanghai said about supporting the pod level, a few questions:

    1. How do we limit users to creating only pod jobs? If a user wants to create something like a Deployment, do we need to parse the user's YAML and check it?
    2. I agree it's not just pod creation; it's more like pod management. For now, if a user starts a Spark job or another k8s job, we can add some limits to these pods. But user-defined YAML can bypass this policy, which can cause lots of chained questions.
    3. In which scenario does a user need to define pods? We have a k8s pod task now; if our k8s job can't support some functions, such as specific configurations like environment variables and tolerations, we can enhance it to support them.

Overall, vote -1 for this DSIP.

@davidzollo (Contributor) commented

My suggestion is to use a single pod with a single container; this scenario is suitable for rapid testing and development.

Regarding the concerns raised above, my responses are as follows:
1. How is the name of the Pod defined? How can different workflows in the same namespace ensure that Pod names do not duplicate?

Pod names must be unique within the same namespace in Kubernetes. DS can generate unique names programmatically using the Kubernetes Java Client API by appending identifiers like the workflow name, task ID, and a timestamp or UUID; this is not difficult.
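
A quick illustrative sketch of such a naming scheme; the exact format is an assumption, not an actual DS convention:

    import java.util.Locale;
    import java.util.UUID;

    public class PodNameSketch {

        // Build a unique, DNS-1123-safe pod name from task identifiers.
        static String podName(String workflowName, long taskInstanceId) {
            String base = workflowName.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9-]", "-");
            String suffix = taskInstanceId + "-" + UUID.randomUUID().toString().substring(0, 8);
            // K8s object names are DNS-1123 labels, at most 63 characters.
            int max = 63 - suffix.length() - 1;
            return base.substring(0, Math.min(base.length(), max)) + "-" + suffix;
        }

        public static void main(String[] args) {
            System.out.println(podName("My Flink ETL", 4217)); // my-flink-etl-4217-xxxxxxxx
        }
    }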

2. How is the Pod lifecycle managed? Will DS delete it after the task ends? How to ensure that DS can definitely delete it?
DS can control the creation and deletion of Pods. After task completion, DS can delete the pods through the API, e.g. the deleteNamespacedPod method; retry mechanisms or manual cleanup calls can be used to guarantee deletion.

3. How are Pods handled if the workflow execution strategy is parallel?
For parallel execution, DS can create multiple Pods simultaneously. Each Pod runs independently, managed by unique configurations and labels. Each task can create its own Pod, ensuring resource separation and independent execution.

4. If multiple Pods are created simultaneously, are these Pods related, or do they just run concurrently? Does it support Deployments and StatefulSets? Should DS manage it as a controller of Kubernetes resources?
Multiple Pods created for parallel tasks are independent of each other. DS manages these Pods individually using the Kubernetes API rather than acting as a controller. For more complex scenarios like continuously running services (Deployments, StatefulSets), it is recommended to let Kubernetes native controllers manage these resources instead of DS; DS focuses on task scheduling.

5. Does DS support tasks for creating Helm Charts?
DS doesn’t natively include task types for directly deploying Helm Charts.

6. How to retrieve the logs of a Pod? How to retrieve logs of multiple Pods? If a Pod contains multiple containers, how to retrieve the logs of multiple containers?
DS can retrieve logs by calling the API of K8s.
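
For reference, a fabric8-based sketch of per-pod and per-container log retrieval; the namespace, pod, and container names are illustrative:

    import io.fabric8.kubernetes.client.KubernetesClient;
    import io.fabric8.kubernetes.client.KubernetesClientBuilder;

    public class PodLogSketch {
        public static void main(String[] args) {
            try (KubernetesClient client = new KubernetesClientBuilder().build()) {
                // Logs of a single-container pod.
                String log = client.pods()
                        .inNamespace("default")
                        .withName("ds-task-4217")
                        .getLog();

                // For a multi-container pod, select each container explicitly.
                String sidecarLog = client.pods()
                        .inNamespace("default")
                        .withName("ds-task-4217")
                        .inContainer("sidecar")
                        .getLog();

                System.out.println(log);
                System.out.println(sidecarLog);
            }
        }
    }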

@SbloodyS (Member) commented

Closing, as there are no plans to do this.
