docs: udpated github and gitlab config doc (#606)
yumengwang03 authored Jul 26, 2023
1 parent e49d69a commit 5d6439d
Showing 6 changed files with 114 additions and 64 deletions.
90 changes: 57 additions & 33 deletions docs/Configuration/GitHub.md
@@ -8,21 +8,24 @@ Visit Config UI at: `http://localhost:4000`.

## Step 1 - Add Data Connections

On the Connections page, you can select GitHub and create a new connection for it.

### Step 1.1 - Authentication
![github-add-data-connections](images/github-create-a-connection.png)

### Connection Name
#### Connection Name

Give your connection a unique name to help you identify it in the future.

### Endpoint URL
#### Endpoint URL

This should be a valid REST API endpoint, e.g. `https://api.github.com/`. The URL should end with `/`.
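
As a minimal sketch of the trailing-slash rule above, the following Python helper normalizes an endpoint before you paste it into Config UI. The function name `normalize_endpoint` is hypothetical and is not part of DevLake.

```python
from urllib.parse import urlsplit


def normalize_endpoint(url: str) -> str:
    """Return the endpoint with exactly one trailing slash.

    Hypothetical helper for illustration only; DevLake does not ship it.
    """
    cleaned = url.strip()
    parts = urlsplit(cleaned)
    # Reject anything that is not an absolute HTTP(S) URL.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a valid HTTP(S) endpoint: {url!r}")
    return cleaned.rstrip("/") + "/"


print(normalize_endpoint("https://api.github.com"))
```

The helper only checks the URL shape; it does not contact the server, so `Test Connection` in Config UI remains the real check.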

### Personal Access Token(s)
#### Personal Access Token(s)

You can use one of the following GitHub tokens: personal access tokens (PATs) or fine-grained personal access tokens.

#### GitHub Personal Access Tokens (Recommended)
##### GitHub Personal Access Tokens (Recommended)

Learn about [how to create a GitHub personal access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token). The following permissions are required to collect data from repositories:

@@ -41,7 +44,7 @@ The difference is that you have to give full permission for `repos`, not just `r

The data collection speed is restricted by the **rate limit of [5,000 requests](https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting) per hour per token** (15,000 requests/hour if you pay for GitHub Enterprise). You can accelerate data collection by configuring _multiple_ personal access tokens. Please note that multiple tokens should be created by different GitHub accounts. Tokens belonging to the same GitHub account share the rate limit.
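
To get a feel for what the rate limit means in practice, here is a rough back-of-the-envelope sketch in Python. The request count of 60,000 is a made-up example, and the function `estimated_hours` is hypothetical; the real number of requests depends on the size of your repositories.

```python
REQUESTS_PER_HOUR_PER_TOKEN = 5_000  # per-token limit quoted above


def estimated_hours(total_requests: int, accounts_with_tokens: int) -> float:
    """Lower bound on collection time, assuming one token per GitHub account.

    Tokens from the same account share one limit, so only distinct
    accounts increase the hourly budget.
    """
    if accounts_with_tokens < 1:
        raise ValueError("at least one token is required")
    hourly_budget = REQUESTS_PER_HOUR_PER_TOKEN * accounts_with_tokens
    return total_requests / hourly_budget


# Hypothetical workload of 60,000 REST requests.
print(estimated_hours(60_000, 1))
print(estimated_hours(60_000, 3))
```

This ignores network latency and retries, so treat the result as an optimistic estimate.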

#### Fine-grained Personal Access Tokens
##### Fine-grained Personal Access Tokens

Note: this token doesn't support GraphQL APIs. You have to disable `Use GraphQL APIs` on the connection page if you want to use it. However, this will significantly increase the data collection time.

@@ -54,55 +57,54 @@ The token should be granted read-only permission for the following entities.
- `Metadata`
- `Pull requests`

### Use GraphQL APIs
#### Use GraphQL APIs

If you are using `github.com` or your on-premise GitHub version supports GraphQL APIs, toggle on this setting to collect data more quickly.

- GraphQL APIs are 10+ times faster than REST APIs, but they may not be supported in GitHub on-premise versions.
- Instead of using multiple tokens to collect data, you can use ONLY ONE token because GraphQL APIs are quick enough.

### Proxy URL (Optional)
#### Proxy URL (Optional)

If you are behind a corporate firewall or VPN, you may need to use a proxy server. Enter a valid proxy server address on your network, e.g. `http://your-proxy-server.com:1080`.

### Fixed Rate Limit (Optional)
#### Fixed Rate Limit (Optional)

DevLake uses a dynamic rate limit to collect GitHub data. You can adjust the rate limit if you want to increase or lower the speed.

The maximum rate limit for GitHub is **[5,000 requests/hour](https://docs.github.com/en/rest/overview/resources-in-the-rest-api#rate-limiting)** (15,000 requests/hour if you pay for GitHub Enterprise). Please do not use a rate that exceeds this number.

### Test and Save Connection
#### Test and Save Connection

Click `Test Connection`. If the connection is successful, click `Save Connection` to add the connection.

## Step 2 - Setting Data Scope

![github-set-data-scope](/img/ConfigUI/github-set-data-scope.png)
### Step 1.2 - Add Data Scopes

### Repositories
#### Repositories

Enter the GitHub repositories to collect. If you want to collect more than one repo, please separate the repos with commas. For example, "apache/incubator-devlake,apache/incubator-devlake-website".
Choose the GitHub repositories you wish to collect either by finding them in the Miller columns or by searching for them. You can only add public repositories through the search box.

### Data Entities

Usually, you don't have to modify this part. However, if you don't want to collect certain GitHub entities, you can unselect some entities to accelerate the collection speed.
![github-set-data-scope](images/github-add-data-scopes.png)

- Issue Tracking: GitHub issues, issue comments, issue labels, etc.
- Source Code Management: GitHub repos, refs, commits, etc.
- Code Review: GitHub PRs, PR comments and reviews, etc.
- CI/CD: GitHub Workflow runs, GitHub Workflow jobs, etc.
- Cross Domain: GitHub accounts, etc.

## Step 3 - Adding Transformation Rules (Optional)
### Step 1.3 - Add Scope Config (Optional)
Scope config contains two parts:
- The domain entities you wish to collect: Usually, you don't have to modify this part. However, if you don't want to collect certain GitHub entities, you can unselect some entities to accelerate the collection speed.
  - Issue Tracking: GitHub issues, issue comments, issue labels, etc.
  - Source Code Management: GitHub repos, refs, commits, etc.
  - Code Review: GitHub PRs, PR comments and reviews, etc.
  - CI/CD: GitHub Workflow runs, GitHub Workflow jobs, etc.
  - Cross Domain: GitHub accounts, etc.
- The transformations on the GitHub data you are going to collect.
  - The details of the transformations will be explained below.
  - Without adding transformation rules, you can still view the "[GitHub Metrics](/livedemo/DataSources/GitHub)" dashboard. However, if you want to view "[Weekly Bug Retro](/livedemo/QAEngineers/WeeklyBugRetro)", "[Weekly Community Retro](/livedemo/OSSMaintainers/WeeklyCommunityRetro)" or other pre-built dashboards, the following transformation rules, especially "Type/Bug", should be added.
  - Each GitHub repo has at most ONE set of transformations.

![github-add-transformation-rules-list](images/github-set-transformation1.png)
![github-add-transformation-rules-list](images/github-scope-config.png)
![github-add-transformation-rules](images/github-set-transformation2.png)

Without adding transformation rules, you can still view the "[GitHub Metrics](/livedemo/DataSources/GitHub)" dashboard. However, if you want to view "[Weekly Bug Retro](/livedemo/QAEngineers/WeeklyBugRetro)", "[Weekly Community Retro](/livedemo/OSSMaintainers/WeeklyCommunityRetro)" or other pre-built dashboards, the following transformation rules, especially "Type/Bug", should be added.<br/>

Each GitHub repo has at most ONE set of transformation rules.

### Issue Tracking
#### Issue Tracking

- Severity: Parse the value of `severity` from issue labels.

@@ -119,7 +121,7 @@ Each GitHub repo has at most ONE set of transformation rules.

- Type/Incident: Same as "Type/Requirement", with `type` setting to "INCIDENT".
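
The label-based rules above can be pictured with a short Python sketch. All regular expressions here are hypothetical examples, not DevLake defaults, and the helper names `issue_type` and `severity` are invented for illustration; DevLake's own matching logic may differ.

```python
import re
from typing import Iterable, Optional

# Hypothetical patterns; replace them with the ones you enter in Config UI.
TYPE_PATTERNS = {
    "REQUIREMENT": re.compile(r"^(feat|feature|proposal|requirement)$", re.IGNORECASE),
    "BUG": re.compile(r"^(bug|failure|error)$", re.IGNORECASE),
    "INCIDENT": re.compile(r"^(incident|p0|p1|p2)$", re.IGNORECASE),
}
SEVERITY_PATTERN = re.compile(r"^severity/(.+)$", re.IGNORECASE)


def issue_type(labels: Iterable[str]) -> Optional[str]:
    """Return the first type whose pattern matches any label, else None."""
    for label in labels:
        for type_name, pattern in TYPE_PATTERNS.items():
            if pattern.search(label):
                return type_name
    return None


def severity(labels: Iterable[str]) -> Optional[str]:
    """Return the captured severity value from the first matching label."""
    for label in labels:
        match = SEVERITY_PATTERN.match(label)
        if match:
            return match.group(1)
    return None


print(issue_type(["docs", "bug"]), severity(["severity/high"]))
```

Trying your own patterns against a handful of real label names in this way can reveal a regex that is too broad or too narrow before you run a full collection.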

### CI/CD
#### CI/CD

This set of configurations is used for calculating [DORA metrics](../DORA.md).

@@ -137,7 +139,7 @@ The deployment and production regex is always applied to the records in the cicd

You can also select "Not using Jobs in GitHub Action as Deployments" if you're not using GitHub action to conduct deployments.
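
As a rough illustration of how a deployment regex and a production regex can work together, consider the Python sketch below. The two patterns and the job names are hypothetical, and `classify_job` is not a DevLake function; it only mirrors the idea that a job counts as a production deployment when it matches both patterns.

```python
import re
from typing import Tuple

# Hypothetical patterns; use whatever matches your own job names.
DEPLOYMENT_PATTERN = re.compile(r"deploy", re.IGNORECASE)
PRODUCTION_PATTERN = re.compile(r"prod", re.IGNORECASE)


def classify_job(name: str) -> Tuple[bool, bool]:
    """Return (is_deployment, is_production_deployment) for a job name."""
    is_deployment = bool(DEPLOYMENT_PATTERN.search(name))
    # Production is only considered for jobs already marked as deployments.
    is_production = is_deployment and bool(PRODUCTION_PATTERN.search(name))
    return is_deployment, is_production


for job in ("deploy-to-prod", "deploy-staging", "unit-tests"):
    print(job, classify_job(job))
```

Note that a job such as `prod-smoke-test` matches only the production pattern, so this sketch does not treat it as a deployment.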

### Code Review
#### Code Review

- Type: The `type` of pull requests will be parsed from PR labels by given regular expression. For example:

@@ -146,7 +148,7 @@ You can also select "Not using Jobs in GitHub Action as Deployments" if you're n

- Component: The `component` of pull requests will be parsed from PR labels by given regular expression.

### Additional Settings (Optional)
#### Additional Settings (Optional)

- Tags Limit: It'll compare the last N pairs of tags to get the "commit diff" and "issue diff" between tags. N defaults to 10.

@@ -159,9 +161,31 @@ You can also select "Not using Jobs in GitHub Action as Deployments" if you're n

Please click `Save` to save the transformation rules for the repo. In the data scope list, click `Next Step` to continue configuring.

## Step 4 - Setting Sync Policy

You can choose how often you would like to sync your data in this step by selecting a sync frequency option or entering a cron code to specify your preferred schedule.
## Step 2 - Collect Data in a Project
### Step 2.1 - Create a Project
Collecting GitHub data requires creating a project first. You can visit the Project page from the side menu and create a new project by following the instructions on the user interface.

![create-a-project](images/create-a-project.png)

### Step 2.2 - Add a GitHub Connection
You can add a previously configured GitHub connection to the project and select the repositories for which you wish to collect data.
Please note: if you don't see the repositories you are looking for, please check if you have added them to the connection first.

![add-a-connection](images/add-a-connection-project.png)

### Step 2.3 - Set the Sync Policy
There are three settings for Sync Policy:
- Data Time Range: You can select the time range of the data you wish to collect. The default is set to the past six months.
- Sync Frequency: You can choose how often you would like to sync your data by selecting a sync frequency option or entering a cron code to specify your preferred schedule.
- Skip Failed Tasks: Sometimes a few tasks may fail in a long pipeline; you can choose to skip them to avoid spending more time running the pipeline all over again.
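
If you enter a cron code for the sync frequency, a quick shape check can catch typos before you save. The Python sketch below assumes the standard five-field cron format (minute, hour, day of month, month, day of week); whether DevLake accepts other variants is not stated here, and `is_five_field_cron` is a hypothetical helper.

```python
def is_five_field_cron(expr: str) -> bool:
    """Shallow check: five whitespace-separated fields made of cron symbols.

    It does not validate numeric ranges or month/day names.
    """
    fields = expr.split()
    if len(fields) != 5:
        return False
    allowed = set("0123456789*/,-")
    return all(set(field) <= allowed for field in fields)


print(is_five_field_cron("0 0 * * *"))  # daily at midnight
print(is_five_field_cron("0 0 * *"))    # one field missing
```

This is only a syntax-level guard; it will still accept out-of-range values such as minute `99`.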

![sync-policy](images/sync-policy.png)

### Step 2.4 - Start Data Collection
Click on "Collect Data" to start collecting data for the whole project. You can check the status in the Status tab on the same page.
![collect-data](images/collect-data.png)


## Troubleshooting

79 changes: 52 additions & 27 deletions docs/Configuration/GitLab.md
@@ -8,29 +8,33 @@ Visit Config UI: `http://localhost:4000`.

## Step 1 - Add Data Connections

On the Connections page, you can select GitLab and create a new connection for it.

### Step 1.1 - Authentication

![gitlab-add-data-connections](images/gitlab-create-a-connection.png)

### Connection Name
#### Connection Name

Give your connection a unique name to help you identify it in the future.

### GitLab Version
#### GitLab Version

Select if you use GitLab Cloud or GitLab Server (v11+).

### Endpoint URL
#### Endpoint URL

This should be a valid REST API endpoint.

- If you use GitLab Cloud, you do not need to enter the endpoint, which is always `https://gitlab.com/api/v4/`.
- If you use GitLab Server (v11+), the endpoint will look like `https://gitlab.example.com/api/v4/`.
Please note: the endpoint URL should end with `/`.

### Personal Access Token
#### Personal Access Token

Your GitLab personal access token (PAT) is required to add a connection. Learn about [how to create a GitLab personal access token](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html).
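
Before adding the connection, you may want to confirm that the token is accepted. The Python sketch below only builds a request for GitLab's `GET /user` REST endpoint, passing the token in the `PRIVATE-TOKEN` header; the helper name `build_user_request` and the placeholder token are illustrative assumptions, and nothing is sent over the network.

```python
from urllib.request import Request


def build_user_request(endpoint: str, token: str) -> Request:
    """Build (but do not send) a request for the current GitLab user.

    `endpoint` is expected to end with "/", e.g. https://gitlab.com/api/v4/
    """
    if not endpoint.endswith("/"):
        raise ValueError("endpoint URL should end with '/'")
    return Request(endpoint + "user", headers={"PRIVATE-TOKEN": token})


# "<your-token>" is a placeholder, not a real credential.
req = build_user_request("https://gitlab.com/api/v4/", "<your-token>")
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; a successful response should indicate that GitLab accepts the token, while the permissions and role checks described in this section still apply.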

#### Personal Access Token Permissions
##### Personal Access Token Permissions

At least one of the following permissions is required to collect data from repositories:

@@ -42,48 +46,49 @@ You also have to double-check your GitLab user permission settings.
1. Go to the Project information -> Members page of the GitLab projects you wish to collect.
2. Check your role in this project from the Max role column. Make sure your role is not Guest; otherwise, you will not be able to collect data from this project.

### Proxy URL (Optional)
#### Proxy URL (Optional)

If you are behind a corporate firewall or VPN, you may need to use a proxy server. Enter a valid proxy server address on your network, e.g. `http://your-proxy-server.com:1080`.

### Fixed Rate Limit (Optional)
#### Fixed Rate Limit (Optional)

DevLake uses a dynamic rate limit at around 12,000 requests/hour to collect GitLab data. You can adjust the rate limit if you want to increase or lower the speed.

The maximum rate limit for GitLab Cloud is **[120,000 requests/hour](https://docs.gitlab.com/ee/user/gitlab_com/index.html#gitlabcom-specific-rate-limits)**. Tokens under the same IP address share the rate limit, so the actual rate limit for your token will be lower than this number.

For self-managed GitLab rate limiting, please contact your GitLab admin to [get or set the maximum rate limit](https://repository.prace-ri.eu/git/help/security/rate_limits.md) of your GitLab instance. Please do not use a rate that exceeds this number.
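
To relate an hourly rate limit to request pacing, the small Python sketch below converts requests per hour into the average gap between requests. It is plain arithmetic on the figures quoted above; `seconds_between_requests` is a hypothetical helper and says nothing about how DevLake schedules its calls internally.

```python
def seconds_between_requests(requests_per_hour: int) -> float:
    """Average spacing between requests for a given hourly budget."""
    if requests_per_hour <= 0:
        raise ValueError("rate limit must be positive")
    return 3600 / requests_per_hour


# Figures quoted in this section: the dynamic default and the GitLab Cloud cap.
for rate in (12_000, 120_000):
    print(rate, "requests/hour ->", seconds_between_requests(rate), "seconds apart")
```

A lower fixed rate limit lengthens this gap and slows collection, while a higher one shortens it up to the limit your GitLab instance allows.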

### Test and Save Connection
#### Test and Save Connection

Click `Test Connection`. If the connection is successful, click `Save Connection` to add the connection.

## Step 2 - Setting Data Scope

![gitlab-set-data-scope](/img/ConfigUI/gitlab-set-data-scope.png)

### Projects
### Step 1.2 - Add Data Scopes

Choose the GitLab projects to collect. Due to a limitation of the GitLab API, you need to type more than 2 characters to search.
#### Repositories

- If you want to collect public repositories in GitLab, please uncheck "Only search my repositories" to search all repositories.
![gitlab-set-data-scope](/img/ConfigUI/gitlab-set-data-scope.png)

### Data Entities

Usually, you don't have to modify this part. However, if you don't want to collect certain GitLab entities, you can unselect some entities to accelerate the collection speed.
Choose the GitLab repositories you wish to collect either by finding them in the Miller columns or by searching for them. Due to a limitation of the GitLab API, you need to type more than 2 characters to search. Repositories for which you only have guest permissions are not shown in the list.

- Issue Tracking: GitLab issues, issue comments, issue labels, etc.
- Source Code Management: GitLab repos, refs, commits, etc.
- Code Review: GitLab MRs, MR comments and reviews, etc.
- CI/CD: GitLab pipelines, jobs, etc.
- Cross Domain: GitLab accounts, etc.

## Step 3 - Adding Transformation Rules (Optional)
### Step 1.3 - Add Scope Config (Optional)
Scope config contains two parts:
- The domain entities you wish to collect: Usually, you don't have to modify this part. However, if you don't want to collect certain GitLab entities, you can unselect some entities to accelerate the collection speed.
  - Issue Tracking: GitLab issues, issue comments, issue labels, etc.
  - Source Code Management: GitLab repos, refs, commits, etc.
  - Code Review: GitLab MRs, MR comments and reviews, etc.
  - CI/CD: GitLab pipelines, jobs, etc.
  - Cross Domain: GitLab accounts, etc.
- The transformations on the GitLab data you are going to collect.
  - The details of the transformations will be explained below.
  - Without adding transformation rules, you can still view some of the dashboards.
  - Each GitLab repo has at most ONE set of transformations.

![gitlab-set-transformation1](images/gitlab-set-transformation1.png)
![gitlab-set-transformation1](images/gitlab-scope-config.png)
![gitlab-set-transformation2](images/gitlab-set-transformation2.png)

### CI/CD
#### CI/CD

This set of configurations is used for calculating [DORA metrics](../DORA.md).

@@ -96,9 +101,29 @@ By the above two fields, DevLake can identify a production deployment among mass

You can also select "Not using Jobs in GitLab CI as Deployments" if you're not using GitLab CI to conduct deployments.

## Step 4 - Setting Sync Policy
## Step 2 - Collect Data in a Project
### Step 2.1 - Create a Project
Collecting GitLab data requires creating a project first. You can visit the Project page from the side menu and create a new project by following the instructions on the user interface.

![create-a-project](images/create-a-project.png)

### Step 2.2 - Add a GitLab Connection
You can add a previously configured GitLab connection to the project and select the repositories for which you wish to collect data.
Please note: if you don't see the repositories you are looking for, please check if you have added them to the connection first.

![add-a-connection](images/add-a-connection-project.png)

### Step 2.3 - Set the Sync Policy
There are three settings for Sync Policy:
- Data Time Range: You can select the time range of the data you wish to collect. The default is set to the past six months.
- Sync Frequency: You can choose how often you would like to sync your data by selecting a sync frequency option or entering a cron code to specify your preferred schedule.
- Skip Failed Tasks: Sometimes a few tasks may fail in a long pipeline; you can choose to skip them to avoid spending more time running the pipeline all over again.
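
To see which start date the default Data Time Range of six months roughly corresponds to, you can use a sketch like the one below. It assumes calendar-month arithmetic with the day clamped to the end of shorter months; how DevLake itself computes the boundary is not specified here, and `months_ago` is a hypothetical helper.

```python
import calendar
from datetime import date


def months_ago(today: date, months: int) -> date:
    """Return the date `months` calendar months before `today`.

    The day is clamped when the target month is shorter (e.g. Aug 31 -> Feb 28).
    """
    total = today.year * 12 + (today.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(today.day, last_day))


print(months_ago(date.today(), 6))
```

Comparing this date with the age of your oldest relevant issues or merge requests can help you decide whether to widen the range.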

![sync-policy](images/sync-policy.png)

You can choose how often you would like to sync your data in this step by selecting a sync frequency option or entering a cron code to specify your preferred schedule.
### Step 2.4 - Start Data Collection
Click on "Collect Data" to start collecting data for the whole project. You can check the status in the Status tab on the same page.
![collect-data](images/collect-data.png)

## Troubleshooting

9 changes: 5 additions & 4 deletions docs/Configuration/Jira.md
@@ -104,15 +104,16 @@ Scope config contains two parts:

Although this configuration is optional, some of the above transformations are required to measure metrics such as [Requirement Lead Time](https://devlake.apache.org/docs/Metrics/RequirementLeadTime), [Bug Age](https://devlake.apache.org/docs/Metrics/BugAge) or [DORA - Median Time to Restore Service](https://devlake.apache.org/docs/Metrics/MTTR) in the built-in Grafana dashboards.

Without adding transformation rules, you cannot view all charts in "Jira" or "Engineering Throughput and Cycle Time" dashboards.<br/>

Each Jira board has at most ONE set of transformations.


![jira-add-transformation-1](images/jira-set-transformation1.png)
![jira-add-transformation-2](images/jira-set-transformation2.png)
![jira-add-transformation-3](images/jira-set-transformation3.png)


Without adding transformation rules, you cannot view all charts in "Jira" or "Engineering Throughput and Cycle Time" dashboards.<br/>

Each Jira board has at most ONE set of transformation rules.


#### Issue Tracking

Binary file added docs/Configuration/images/gitlab-scope-config.png
