
Merge 1.1.0 to main branch #47

Merged · 26 commits · Jul 8, 2022
- 57c9a43 Modify Exchangis deployment url. (zqburde, Jul 5, 2022)
- ee4ce24 Merge pull request #42 from zqburde/1.1.0 (zqburde, Jul 5, 2022)
- b924ef7 Documentation translated into English (yuankang134, Jul 6, 2022)
- 0d96dc2 Documentation translated into English (yuankang134, Jul 6, 2022)
- c0fbcbf Documentation translated into English (yuankang134, Jul 6, 2022)
- f69aaf0 Documentation translated into English (yuankang134, Jul 6, 2022)
- fddfba2 Merge pull request #43 from yuankang134/1.1.0 (zqburde, Jul 6, 2022)
- 1d4132c Merge pull request #1 from WeBankFinTech/1.1.0 (yuankang134, Jul 6, 2022)
- 32c75e6 Documentation translated into English (yuankang134, Jul 6, 2022)
- af58358 Merge pull request #44 from yuankang134/1.1.0 (zqburde, Jul 6, 2022)
- f4e2f17 Modify Exchangis deployment url. (zqburde, Jul 6, 2022)
- 2687302 update link to english (yuankang134, Jul 6, 2022)
- 239c404 update link to english (yuankang134, Jul 6, 2022)
- fe83496 Optimize DolphinScheduler appconn doc. (zqburde, Jul 7, 2022)
- b530b70 Merge pull request #46 from zqburde/1.1.0 (zqburde, Jul 7, 2022)
- 9740b14 Merge pull request #45 from yuankang134/1.1.0 (zqburde, Jul 7, 2022)
- df1716a Merge pull request #2 from WeBankFinTech/1.1.0 (yuankang134, Jul 7, 2022)
- 1909587 translate readme file (yuankang134, Jul 7, 2022)
- 98940fc update upgrade doc for project-server config modify. (HmhWz, Jul 7, 2022)
- 1f3c5e0 Merge pull request #49 from HmhWz/1.1.0 (zqburde, Jul 7, 2022)
- e0b660d update readme file (yuankang134, Jul 8, 2022)
- 4ef19a7 Optimize README-ZH doc. (zqburde, Jul 8, 2022)
- 31ce0f0 Merge pull request #50 from zqburde/1.1.0 (zqburde, Jul 8, 2022)
- 039c649 Merge branch '1.1.0' into 1.1.0 (yuankang134, Jul 8, 2022)
- 005d013 update readme file (yuankang134, Jul 8, 2022)
- e7d9db8 Merge pull request #48 from yuankang134/1.1.0 (zqburde, Jul 8, 2022)
3 changes: 1 addition & 2 deletions README-ZH.md
@@ -75,12 +75,11 @@ DataSphereStudio

- [DSS 的 Exchangis AppConn 插件安装指南](https://github.com/WeDataSphere/Exchangis/blob/master/docs/zh_CN/ch1/exchangis_appconn_deploy_cn.md)

- [DSS 的 Qualitis AppConn 插件安装指南](https://github.com/WeBankFinTech/Qualitis/blob/master/docs/zh_CN/ch1/%E6%8E%A5%E5%85%A5%E5%B7%A5%E4%BD%9C%E6%B5%81%E6%8C%87%E5%8D%97.md)

- [DSS 的 Streamis AppConn 插件安装指南](https://github.com/WeBankFinTech/Streamis/blob/main/docs/zh_CN/0.2.0/development/StreamisAppConn%E5%AE%89%E8%A3%85%E6%96%87%E6%A1%A3.md)

- [DSS 的 Prophecis AppConn 插件安装指南](https://github.com/WeBankFinTech/Prophecis/blob/master/docs/zh_CN/Deployment_Documents/Prophecis%20Appconn%E5%AE%89%E8%A3%85%E6%96%87%E6%A1%A3.md)

- [DSS 的 Dolphinscheduler AppConn 插件安装指南](zh_CN/安装部署/DolphinScheduler插件安装文档.md)

## 谁在使用 DataSphere Studio

105 changes: 50 additions & 55 deletions README.md


92 changes: 92 additions & 0 deletions en_US/Design_Documentation/FlowExecution/README.md
@@ -0,0 +1,92 @@
FlowExecution
-------------------------
FlowExecution is the real-time workflow execution module. It provides interface services for workflow execution, reuses the Entrance module of linkis, and adapts it by inheriting several linkis-entrance classes.
For example, it inherits PersistenceEngine to persist dss workflow tasks, and overrides the EntranceExecutionJob class to implement workflow node execution, state transitions, kill, and other operations. Finally, the parsed and processed tasks are submitted to the linkis service through linkis-computation-client.


### 1. Business Architecture

User-facing function points:

| Component name | First-level module | Second-level module | Function point |
|---------------------|------------------|-----------------|-----------------|
| DataSphereStudio | Workflow | Workflow execution | Execute |
| | | | Selected execution |
| | | | Rerun on failure |
| | | | View execution history |

![](images/workflow_execution_uml.png)

### 2. FlowExecution core interfaces and classes:

| Core Interface/Class | Core Function |
|---------------------------|------------------------------|
| FlowEntranceRestfulApi | Provides restful interfaces for workflow execution, such as task execution and status queries |
| WorkflowPersistenceEngine | Overrides the persist method of the linkis PersistenceEngine, converting the jobRequest into a workflowTask and persisting it to the dss workflow_task table |
| WorkflowQueryService | Provides interface services such as workflow task creation and status updates |
| WorkflowExecutionInfoService | Provides services such as creating and querying workflow execution information |
| FlowEntranceEngine | An Executor inherited from linkis; its execute method calls flowParser to parse the flow and uses the result as the entry of the runJob method |
| FlowExecutionExecutorManagerImpl | Inherits the linkis ExecutorManager and overrides the createExecutor method so that the created executor is the dss FlowEntranceEngine |
| FlowExecutionParser | A CommonEntranceParser inherited from linkis; overrides the parseToJob method to return the dss FlowEntranceJob |
| DefaultFlowExecution | Provides the runJob() method, which converts all scheduled nodes of a FlowEntranceJob to the runnable state and adds the running nodes to a task queue. A timed thread pool polls the linkis task status of each node; completed nodes are removed from the queue |
| FlowEntranceJobParser | Defines the parse() method; its subclasses are the various parsing implementations, such as parsing the workflow from the jobRequest or parsing the params attribute of workflow nodes |
| FlowEntranceJob | An EntranceExecutionJob inherited from linkis; overrides run(), kill(), onStatusChanged() and other methods to provide the job execution entry and status callback handling |
| DefaultNodeRunner | The node task running thread; converts a node task into a LinkisJob, submits it to linkis, and provides methods for querying task status and cancelling tasks on linkis |
| NodeSkipStrategy | The strategy interface for deciding whether a node is skipped, with three implementation strategies: execution, rerun on failure, and selected execution |
| FlowContext | Holds the workflow's context status and provides methods such as getRunningNodes and getSucceedNodes to obtain the workflow's running and succeeded nodes |



### 3. Workflow execution flow:
![](images/flowexecution.drawio.png)

### 4. Data Structure/Storage Design
Workflow execution information table:
```sql
CREATE TABLE `dss_workflow_execute_info` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`task_id` bigint(20) NOT NULL COMMENT 'task id',
`status` int(1) DEFAULT NULL COMMENT 'status, 0: failed, 1: success',
`flow_id` bigint(20) NOT NULL COMMENT 'flowId',
`version` varchar(200) DEFAULT NULL COMMENT 'Workflow bml version number',
`failed_jobs` text COMMENT 'execution failed node',
`pending_jobs` text COMMENT 'Nodes not yet executed',
`skipped_jobs` text COMMENT 'execute skip node',
`succeed_jobs` text COMMENT 'Execute successful node',
`createtime` datetime NOT NULL COMMENT 'create time',
`running_jobs` text COMMENT 'running jobs',
`updatetime` datetime DEFAULT NULL COMMENT 'update time',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```
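As a rough illustration of how this table is used after a run completes, the sketch below records and queries one execution record. It is adapted to SQLite so it is self-contained and runnable; the production schema above is MySQL, and the column subset is trimmed for brevity:

```python
import sqlite3
import datetime

conn = sqlite3.connect(":memory:")
# SQLite rendering of dss_workflow_execute_info, trimmed to the columns used here
conn.execute("""
CREATE TABLE dss_workflow_execute_info (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id INTEGER NOT NULL,
    status INTEGER,              -- 0: failed, 1: success
    flow_id INTEGER NOT NULL,
    failed_jobs TEXT,
    pending_jobs TEXT,
    succeed_jobs TEXT,
    createtime TEXT NOT NULL
)""")

# Record one finished execution: node_c failed, node_a and node_b succeeded
conn.execute(
    "INSERT INTO dss_workflow_execute_info "
    "(task_id, status, flow_id, failed_jobs, pending_jobs, succeed_jobs, createtime) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1001, 0, 7, "node_c", "", "node_a,node_b",
     datetime.datetime.now().isoformat()),
)

# A rerun-on-failure pass would read back the failed node list like this
failed = conn.execute(
    "SELECT failed_jobs FROM dss_workflow_execute_info "
    "WHERE flow_id = 7 AND status = 0"
).fetchone()[0]
```

The failed/pending/succeed job columns are what make the rerun-on-failure and execution-history views possible without re-parsing the workflow.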
Workflow task information table:
```sql
CREATE TABLE `dss_workflow_task` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Primary Key, auto increment',
`instance` varchar(50) DEFAULT NULL COMMENT 'An instance of Entrance, consists of IP address of the entrance server and port',
`exec_id` varchar(50) DEFAULT NULL COMMENT 'execution ID, consists of jobID(generated by scheduler), executeApplicationName , creator and instance',
`um_user` varchar(50) DEFAULT NULL COMMENT 'User name',
`submit_user` varchar(50) DEFAULT NULL COMMENT 'submitUser name',
`execution_code` text COMMENT 'Run script. When exceeding 6000 lines, script would be stored in HDFS and its file path would be stored in database',
`progress` float DEFAULT NULL COMMENT 'Script execution progress, between zero and one',
`log_path` varchar(200) DEFAULT NULL COMMENT 'File path of the log files',
`result_location` varchar(200) DEFAULT NULL COMMENT 'File path of the result',
`status` varchar(50) DEFAULT NULL COMMENT 'Script execution status, must be one of the following: Inited, WaitForRetry, Scheduled, Running, Succeed, Failed, Cancelled, Timeout',
`created_time` datetime DEFAULT NULL COMMENT 'Creation time',
`updated_time` datetime DEFAULT NULL COMMENT 'Update time',
`run_type` varchar(50) DEFAULT NULL COMMENT 'Further refinement of execution_application_time, e.g, specifying whether to run pySpark or SparkR',
`err_code` int(11) DEFAULT NULL COMMENT 'Error code. Generated when the execution of the script fails',
`err_desc` text COMMENT 'Execution description. Generated when the execution of script fails',
`execute_application_name` varchar(200) DEFAULT NULL COMMENT 'The service a user selects, e.g, Spark, Python, R, etc',
`request_application_name` varchar(200) DEFAULT NULL COMMENT 'Parameter name for creator',
`script_path` varchar(200) DEFAULT NULL COMMENT 'Path of the script in workspace',
`params` text COMMENT 'Configuration item of the parameters',
`engine_instance` varchar(50) DEFAULT NULL COMMENT 'An instance of engine, consists of IP address of the engine server and port',
`task_resource` varchar(1024) DEFAULT NULL,
`engine_start_time` time DEFAULT NULL,
`label_json` varchar(200) DEFAULT NULL COMMENT 'label json',
PRIMARY KEY (`id`),
KEY `created_time` (`created_time`),
KEY `um_user` (`um_user`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```
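The status column above is constrained to a fixed lifecycle. A small sketch of a guard that FlowExecution-style code might apply before persisting or dequeuing a task (hypothetical helper, not the dss implementation; the status names come from the column comment):

```python
# Valid task states, taken from the status column comment above
VALID_STATUSES = {"Inited", "WaitForRetry", "Scheduled", "Running",
                  "Succeed", "Failed", "Cancelled", "Timeout"}

# Terminal states: once reached, the node is removed from the polling queue
TERMINAL_STATUSES = {"Succeed", "Failed", "Cancelled", "Timeout"}

def is_completed(status: str) -> bool:
    """Return True when a task has reached a terminal state."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown task status: {status}")
    return status in TERMINAL_STATUSES
```

This mirrors how DefaultFlowExecution decides when to drop a node from its polling queue.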
156 changes: 156 additions & 0 deletions en_US/Design_Documentation/Orchestrator/README.md
@@ -0,0 +1,156 @@
Orchestrator Architecture Design
-------------------------
Orchestrator is the orchestration module. It provides interface services such as creating, deleting, modifying, querying, importing, and exporting orchestrations under a project, and serves as the unified entry for each orchestration implementation (such as workflow). It connects to the project service above and to the concrete orchestration implementation (such as the workflow service) below.

### 1. Business Architecture

User-facing function points:


| Component name | First-level module | Second-level module | Function point |
|---------------------|------------------|-----------------|-----------------|
| DataSphereStudio | Orchestration mode | Create orchestration mode | Create a new orchestration |
| | | Edit orchestration mode | Edit the orchestration's field information |
| | | Delete orchestration mode | Delete an orchestration |
| | | Open orchestration mode | Open an orchestration for drag-and-drop development of its nodes |
| | | View the orchestration version list | View an orchestration's historical versions; a version can be opened and viewed, or rolled back to |
| | | Orchestration mode rollback | Roll back to a historical version of the orchestration (a new version is added each time the orchestration is published) |

![](images/orchestrator_uml.png)

### 2. Orchestrator Architecture:
![](images/orchestrator_arch.png)

### 3. Orchestrator Module Design:
Core classes of each second-level module:

**dss-orchestrator-core**

The core module of Orchestrator defines top-level interfaces such as DSSOrchestrator, DSSOrchestratorContext, and DSSOrchestratorPlugin.

| Core top-level interface/class | Core functionality |
|---------------------------|------------------------------|
| DSSOrchestrator | Defines methods for obtaining the orchestration's properties, such as its name, associated appconn, and context information |
| DSSOrchestratorContext | Defines the orchestrator's context information and provides methods such as obtaining orchestration plugin classes |
| DSSOrchestratorPlugin | The top-level interface of orchestration plugins; defines the init method, with subclasses including the import and export plugin implementations |
| DSSOrchestratorRelation | Defines methods for obtaining the orchestration's associated properties, such as the orchestration mode and the appconn associated with the orchestration |
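A minimal sketch of the DSSOrchestratorContext idea, a context that initializes plugins once and hands them out by type (illustrative Python with hypothetical names; the real interfaces are Java):

```python
class DSSOrchestratorPlugin:
    """Top-level plugin interface: subclasses implement init()."""
    def init(self):
        raise NotImplementedError

class ExportPlugin(DSSOrchestratorPlugin):
    """Toy stand-in for an export plugin implementation."""
    def init(self):
        self.ready = True

class OrchestratorContext:
    """Initializes the given plugins and returns them by class on request."""
    def __init__(self, plugins):
        self._plugins = {}
        for plugin in plugins:
            plugin.init()                      # one-time initialization
            self._plugins[type(plugin)] = plugin

    def get_plugin(self, cls):
        return self._plugins.get(cls)
```

Callers such as the publish module can then ask the context for the plugin class they need instead of constructing plugins themselves.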

**dss-orchestrator-db**

Defines the unified entry of the dao layer method of orchestration.

**dss-orchestrator-conversion-standard**

Defines the interface specification for converting orchestrations to third-party systems, including top-level interfaces such as ConversionOperation, ConversionRequestRef, and ConversionService.

| Core Interface/Class | Core Function |
|--------------------- |------------------------------------------|
| ConversionOperation | Defines the core convert method; its input parameter is a ConversionRequestRef and it returns a ResponseRef |
| DSSToRelConversionRequestRef | Defines the basic parameters of a conversion request, such as userName, workspace, and dssProject |
| ConversionIntegrationStandard | Defines core methods such as getDSSToRelConversionService (used to convert a DSS orchestration into a scheduling system workflow) |
| ConversionService | Defines methods for obtaining labels and the ConversionIntegrationStandard |
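The conversion specification can be pictured as a single convert entry point that takes a request ref and returns a response ref. A toy sketch under those assumptions (field and class names are illustrative, not the dss signatures):

```python
from dataclasses import dataclass, field

@dataclass
class ConversionRequestRef:
    """Basic parameters of a conversion request."""
    user_name: str
    workspace: str
    orchestrations: list = field(default_factory=list)  # DSS-side workflows

@dataclass
class ResponseRef:
    """Result of the conversion."""
    status: int      # 0 = success
    ref_ids: list    # ids assigned by the target scheduling system

class ConversionOperation:
    """Converts DSS orchestrations into the target system's workflows."""
    def convert(self, ref: ConversionRequestRef) -> ResponseRef:
        # Toy conversion: pretend each orchestration maps to one remote id
        ids = [f"sched-{i}" for i, _ in enumerate(ref.orchestrations)]
        return ResponseRef(status=0, ref_ids=ids)
```

A real implementation would call the scheduling system's API here and record the returned ids in the relation table described later in this document.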


**dss-orchestrator-loader**

Used to load orchestration-related appconns, such as workflow-appconn, and subclasses of DSSOrchestratorPlugin, such as ExportDSSOrchestratorPlugin.

| Core Interface/Class | Core Function |
|--------------------- |---------------------------------------------|
| OrchestratorManager | Defines the getOrCreateOrchestrator method for loading the appconn associated with an orchestration; the appconn is cached after the first load to avoid repeated loading |
| LinkedAppConnResolver | Defines the interface for obtaining appconns by user |
| SpringDSSOrchestratorContext | Its initialization loads all subclasses of DSSOrchestratorPlugin and caches them in memory |
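The load-once-then-cache behavior of getOrCreateOrchestrator is essentially memoization. A hypothetical sketch (the loader function stands in for the expensive appconn loading the real code performs):

```python
class OrchestratorManager:
    """Caches orchestrator/appconn instances after the first load."""
    def __init__(self, loader):
        self._loader = loader    # expensive appconn-loading function
        self._cache = {}
        self.load_count = 0      # exposed here only to demonstrate caching

    def get_or_create_orchestrator(self, name):
        # First call loads the associated appconn; later calls hit the cache
        if name not in self._cache:
            self._cache[name] = self._loader(name)
            self.load_count += 1
        return self._cache[name]
```

Repeated lookups of the same orchestration therefore pay the loading cost only once per process.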

**dss-framework-orchestrator-server**

The Orchestrator framework service. It provides front-end interfaces for creating, deleting, modifying, querying, and rolling back orchestrations, as well as rpc services such as orchestration import and export.

**dss-framework-orchestrator-publish**

Provides publishing-related plugins, such as the orchestration import and export implementations, and the classes that generate and parse orchestration archive packages.

| Core Interface/Class | Core Function |
|--------------------- |---------------------------------------------|
| ExportDSSOrchestratorPlugin | Defines the orchestration export interface |
| ImportDSSOrchestratorPlugin | Defines the orchestration import interface |
| MetaWriter | Writes the orchestration's table field information to a metadata file in a specific format |
| MetaReader | Parses the orchestration metadata file to generate the table field content |
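The MetaWriter/MetaReader pairing is a serialize/parse round trip over table field metadata. A rough sketch of the idea (the real dss metadata format is its own text layout; JSON is used here purely so the example is self-contained):

```python
import json

def write_meta(fields):
    """MetaWriter sketch: serialize table-field metadata to a metadata string."""
    return json.dumps({"fields": fields}, sort_keys=True)

def read_meta(text):
    """MetaReader sketch: parse a metadata string back into field content."""
    return json.loads(text)["fields"]
```

During export, the writer's output goes into the orchestration archive package; during import, the reader reconstructs the same field definitions on the target environment.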

#### Sequence diagram for creating an orchestration (delete and edit operations are similar):

![](images/Create_an_orchestration_sequence_diagram.png)

#### Sequence diagram for importing an orchestration (export is similar):

![](images/Import_Orchestration_Sequence_Diagram.png)

### 4. Data Structure/Storage Design
Orchestrator information table:
```sql
CREATE TABLE `dss_orchestrator_info` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL COMMENT 'orchestrator name',
`type` varchar(255) NOT NULL COMMENT 'orchestrator type,E.g:workflow',
`desc` varchar(1024) DEFAULT NULL COMMENT 'description',
`creator` varchar(100) NOT NULL COMMENT 'creator',
`create_time` datetime DEFAULT NULL COMMENT 'create time',
`project_id` bigint(20) DEFAULT NULL COMMENT 'project id',
`uses` varchar(500) DEFAULT NULL COMMENT 'uses',
`appconn_name` varchar(1024) NOT NULL COMMENT 'Orchestrate the associated appconn,E.g:workflow',
`uuid` varchar(180) NOT NULL COMMENT 'uuid',
`secondary_type` varchar(500) DEFAULT NULL COMMENT 'Orchestrate of the second type,E.g:workflow-DAG',
`is_published` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Is it published',
`workspace_id` int(11) DEFAULT NULL COMMENT 'workspace id',
`orchestrator_mode` varchar(100) DEFAULT NULL COMMENT 'orchestrator mode,The value obtained is dic_key(parent_key=p_arrangement_mode) in dss_dictionary',
`orchestrator_way` varchar(256) DEFAULT NULL COMMENT 'orchestrator way',
`orchestrator_level` varchar(32) DEFAULT NULL COMMENT 'orchestrator level',
`update_user` varchar(100) DEFAULT NULL COMMENT 'update user',
`update_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
PRIMARY KEY (`id`) USING BTREE,
UNIQUE KEY `unique_idx_uuid` (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```

Orchestrator version information table:
```sql
CREATE TABLE `dss_orchestrator_version_info` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`orchestrator_id` bigint(20) NOT NULL COMMENT 'associated orchestration id',
`app_id` bigint(20) DEFAULT NULL COMMENT 'The id of the orchestration implementation, such as flowId',
`source` varchar(255) DEFAULT NULL COMMENT 'source',
`version` varchar(255) DEFAULT NULL COMMENT 'version',
`comment` varchar(255) DEFAULT NULL COMMENT 'description',
`update_time` datetime DEFAULT NULL COMMENT 'update time',
`updater` varchar(32) DEFAULT NULL COMMENT 'updater',
`project_id` bigint(20) DEFAULT NULL COMMENT 'project id',
`content` varchar(255) DEFAULT NULL COMMENT '',
`context_id` varchar(200) DEFAULT NULL COMMENT 'context id',
`valid_flag` INT(1) DEFAULT '1' COMMENT 'Version valid flag, 0: invalid; 1: valid',
PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```
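Rollback reads the latest valid historical version from this table via the valid_flag column. An illustrative query, run against an SQLite in-memory rendering of a trimmed version of the schema so the example is runnable (production is MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed SQLite rendering of dss_orchestrator_version_info
conn.execute("""
CREATE TABLE dss_orchestrator_version_info (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    orchestrator_id INTEGER NOT NULL,
    version TEXT,
    valid_flag INTEGER DEFAULT 1   -- 0: invalid, 1: valid
)""")
conn.executemany(
    "INSERT INTO dss_orchestrator_version_info "
    "(orchestrator_id, version, valid_flag) VALUES (?, ?, ?)",
    [(1, "v000001", 1), (1, "v000002", 1), (1, "v000003", 0)],  # v000003 invalidated
)

# Latest valid version for orchestrator 1 -- the candidate to roll back to
latest = conn.execute(
    "SELECT version FROM dss_orchestrator_version_info "
    "WHERE orchestrator_id = 1 AND valid_flag = 1 "
    "ORDER BY id DESC LIMIT 1"
).fetchone()[0]
```

Because publishing appends a new row rather than overwriting, every historical version stays addressable for rollback.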

Scheduling system orchestration association table:
```sql
CREATE TABLE `dss_orchestrator_ref_orchestration_relation` (
`id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key ID',
`orchestrator_id` bigint(20) NOT NULL COMMENT 'The orchestration mode id of dss',
`ref_project_id` bigint(20) DEFAULT NULL COMMENT 'The project ID associated with the scheduling system',
`ref_orchestration_id` int(11) DEFAULT NULL COMMENT 'The id of the scheduling system workflow (the orchestrationId returned by calling the OrchestrationOperation service of SchedulerAppConn)',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```

### 5. Interface Design


### 6. Non-functional Design
#### 6.1 Security
Users are identified by a special ID carried in the cookie, which the GateWay decrypts with a dedicated algorithm.
#### 6.2 Performance
Meets performance requirements.
#### 6.3 Capacity
Not involved.
#### 6.4 High Availability
Supports multi-active deployment.
