Commit 9ece92f

Bug/string agg limit (#133)
* bug/string-agg-limit
* update changelog & regen docs
* Update int_jira__pivot_daily_field_history.sql
* update incremental compatible and regen docs
* regen docs
* update yml
* add tests
* update conversation disablement
* regen docs
* Update int_jira__issue_comments.sql
* update readme
* release review updates
* update readme
* Apply suggestions from code review

Co-authored-by: Jamie Rodriguez <[email protected]>

---------

Co-authored-by: Jamie Rodriguez <[email protected]>
1 parent 9474332 commit 9ece92f

16 files changed, +163 -25 lines changed

CHANGELOG.md

Lines changed: 15 additions & 0 deletions
@@ -1,3 +1,18 @@
+# dbt_jira v0.19.0
+[PR #133](https://github.com/fivetran/dbt_jira/pull/133) contains the following updates:
+
+## Breaking Changes
+- This change is marked as breaking due to its impact on Redshift configurations.
+- For Redshift users, comment data aggregated under the `conversations` field in the `jira__issue_enhanced` table is now disabled by default to prevent consistent errors related to Redshift's varchar length limits.
+- If you wish to re-enable `conversations` on Redshift, set the `jira_include_conversations` variable to `true` in your `dbt_project.yml`.
+
+## Under the Hood
+- Updated the `comment` seed data to ensure conversations are correctly disabled for Redshift by default.
+- Renamed the `jira_is_databricks_sql_warehouse` macro to `jira_is_incremental_compatible`, which was updated to return `true` if the Databricks runtime is an all-purpose cluster (previously it checked only for a SQL warehouse runtime) or if the target is any other non-Databricks-supported destination.
+- This update addresses Databricks runtimes (e.g., endpoints and external runtimes) that do not support the `insert_overwrite` incremental strategy used in the `jira__daily_issue_field_history` and `int_jira__pivot_daily_field_history` models.
+- For Databricks users, the `jira__daily_issue_field_history` and `int_jira__pivot_daily_field_history` models will now apply the incremental strategy only if running on an all-purpose cluster. All other Databricks runtimes will not utilize an incremental strategy.
+- Added consistency tests for the `jira__project_enhanced` and `jira__user_enhanced` models.
+
 # dbt_jira v0.18.0
 [PR #131](https://github.com/fivetran/dbt_jira/pull/131) contains the following updates:
 ## Breaking Changes
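
As a hedged illustration of the Redshift note in the v0.19.0 changelog entry, re-enabling the `conversations` aggregation might look like the snippet below. It is a sketch based only on the variable name given in the changelog. Placing it under the top-level `vars:` key of the root `dbt_project.yml` is an assumption drawn from standard dbt project layout, and large comment threads may still run into Redshift's varchar length limit once the field is switched back on.

```yml
# dbt_project.yml (root project) -- illustrative sketch, not part of this commit
vars:
  jira_include_conversations: true  # opt back in on Redshift, where it is disabled by default as of v0.19.0
```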

README.md

Lines changed: 22 additions & 6 deletions
@@ -66,7 +66,7 @@ Include the following jira package version in your `packages.yml` file:
 ```yaml
 packages:
   - package: fivetran/jira
-    version: [">=0.18.0", "<0.19.0"]
+    version: [">=0.19.0", "<0.20.0"]
 
 ```
 ### Step 3: Define database and schema variables
@@ -82,14 +82,30 @@ vars:
 Your Jira connector may not sync every table that this package expects. If you do not have the `SPRINT`, `COMPONENT`, or `VERSION` tables synced, add the respective variables to your root `dbt_project.yml` file. Additionally, if you want to remove comment aggregations from your `jira__issue_enhanced` model, add the `jira_include_comments` variable to your root `dbt_project.yml`:
 ```yml
 vars:
-  jira_using_sprints: false # Disable if you do not have the sprint table or do not want sprint-related metrics reported
-  jira_using_components: false # Disable if you do not have the component table or do not want component-related metrics reported
-  jira_using_versions: false # Disable if you do not have the versions table or do not want versions-related metrics reported
-  jira_using_priorities: false # disable if you are not using priorities in Jira
-  jira_include_comments: false # This package aggregates issue comments so that you have a single view of all your comments in the jira__issue_enhanced table. This can cause limit errors if you have a large dataset. Disable to remove this functionality.
+  jira_using_sprints: false # Enabled by default. Disable if you do not have the sprint table or do not want sprint-related metrics reported.
+  jira_using_components: false # Enabled by default. Disable if you do not have the component table or do not want component-related metrics reported.
+  jira_using_versions: false # Enabled by default. Disable if you do not have the versions table or do not want versions-related metrics reported.
+  jira_using_priorities: false # Enabled by default. Disable if you are not using priorities in Jira.
+  jira_include_comments: false # Enabled by default. Disabling will remove the aggregation of comments via the `count_comments` and `conversations` columns in the `jira__issue_enhanced` table.
 ```
+
 ### (Optional) Step 5: Additional configurations
 
+#### Controlling conversation aggregations in `jira__issue_enhanced`
+
+The `dbt_jira` package offers variables to enable or disable conversation aggregations in the `jira__issue_enhanced` table. These settings allow you to manage the amount of data processed and avoid potential performance or limit issues with large datasets.
+
+- `jira_include_conversations`: Controls only the `conversation` [column](https://github.com/fivetran/dbt_jira/blob/main/models/jira.yml#L125-L127) in the `jira__issue_enhanced` table.
+  - Default: Disabled for Redshift due to string size constraints; enabled for other supported warehouses.
+  - Setting this to `false` removes the `conversation` column but retains the `count_comments` field if `jira_include_comments` is still enabled. This is useful if you want a comment count without the full conversation details.
+
+In your `dbt_project.yml` file:
+
+```yml
+vars:
+  jira_include_conversations: false/true # Disabled by default for Redshift; enabled for other supported warehouses.
+```
+
 #### Define daily issue field history columns
 The `jira__daily_issue_field_history` model generates historical data for the columns specified by the `issue_field_history_columns` variable. By default, the only columns tracked are `status`, `status_id`, and `sprint`, but all fields found in the Jira `FIELD` table's `field_name` column can be included in this model. The most recent value of any tracked column is also captured in `jira__issue_enhanced`.
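
As a sketch of the combination the new README section describes, the settings below would keep the comment count while dropping the aggregated conversation text. It assumes both variables sit under the top-level `vars:` key of the root `dbt_project.yml`. The README hunk spells the column `conversation`, while the changelog and the `jira_include_comments` comment use `conversations`; the comments here follow the latter.

```yml
# dbt_project.yml (root project) -- illustrative sketch, not part of this commit
vars:
  jira_include_comments: true        # keep comment aggregation, so count_comments is still produced
  jira_include_conversations: false  # drop only the aggregated conversations text
```

Setting `jira_include_comments: false` instead would, per the README comment above, remove both `count_comments` and `conversations`.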

dbt_project.yml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 name: 'jira'
-version: '0.18.0'
+version: '0.19.0'
 config-version: 2
 require-dbt-version: [">=1.3.0", "<2.0.0"]
 vars:

docs/catalog.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

docs/manifest.json

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

integration_tests/dbt_project.yml

Lines changed: 3 additions & 4 deletions
@@ -1,9 +1,11 @@
 name: 'jira_integration_tests'
-version: '0.18.0'
+version: '0.19.0'
 config-version: 2
 profile: 'integration_tests'
 
 vars:
+  # Comment out the below when generating docs
+  issue_field_history_columns: ['summary', 'story points', 'components']
   jira_source:
     jira_schema: jira_integrations_tests_41
     jira_comment_identifier: "comment"
@@ -28,9 +30,6 @@ vars:
     jira_user_identifier: "user"
     jira_version_identifier: "version"
 
-  # Comment out the below when generating docs
-  issue_field_history_columns: ['summary', 'story points', 'components']
-
 models:
   jira:
     +schema: "{{ 'jira_integrations_tests_sqlw' if target.name == 'databricks-sql' else 'jira' }}"

integration_tests/seeds/comment.csv

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

integration_tests/tests/consistency/consistency_issue_enhanced.sql

Lines changed: 4 additions & 2 deletions
@@ -4,13 +4,15 @@
     enabled=var('fivetran_validation_tests_enabled', false)
 ) }}
 
+{# Exclude columns that depend on calculations involving the current time in seconds or aggregate strings in a random order, as they will differ between runs. #}
+{% set exclude_columns = ['open_duration_seconds', 'any_assignment_duration_seconds', 'last_assignment_duration_seconds'] %}
 with prod as (
-    select *
+    select {{ dbt_utils.star(from=ref('jira__issue_enhanced'), except=exclude_columns) }}
     from {{ target.schema }}_jira_prod.jira__issue_enhanced
 ),
 
 dev as (
-    select *
+    select {{ dbt_utils.star(from=ref('jira__issue_enhanced'), except=exclude_columns) }}
     from {{ target.schema }}_jira_dev.jira__issue_enhanced
 ),

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+
+{{ config(
+    tags="fivetran_validations",
+    enabled=var('fivetran_validation_tests_enabled', false)
+) }}
+
+{# Exclude columns that depend on calculations involving the current time in seconds or aggregate strings in a random order, as they will differ between runs. #}
+{% set exclude_columns = ['avg_age_currently_open_seconds', 'avg_age_currently_open_assigned_seconds', 'median_age_currently_open_seconds', 'median_age_currently_open_assigned_seconds', 'epics', 'components'] %}
+
+with prod as (
+    select {{ dbt_utils.star(from=ref('jira__project_enhanced'), except=exclude_columns) }}
+    from {{ target.schema }}_jira_prod.jira__project_enhanced
+),
+
+dev as (
+    select {{ dbt_utils.star(from=ref('jira__project_enhanced'), except=exclude_columns) }}
+    from {{ target.schema }}_jira_dev.jira__project_enhanced
+),
+
+prod_not_in_dev as (
+    -- rows from prod not found in dev
+    select * from prod
+    except distinct
+    select * from dev
+),
+
+dev_not_in_prod as (
+    -- rows from dev not found in prod
+    select * from dev
+    except distinct
+    select * from prod
+),
+
+final as (
+    select
+        *,
+        'from prod' as source
+    from prod_not_in_dev
+
+    union all -- union since we only care if rows are produced
+
+    select
+        *,
+        'from dev' as source
+    from dev_not_in_prod
+)
+
+select *
+from final

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+
+{{ config(
+    tags="fivetran_validations",
+    enabled=var('fivetran_validation_tests_enabled', false)
+) }}
+
+{# Exclude columns that depend on calculations involving the current time in seconds or aggregate strings in a random order, as they will differ between runs. #}
+{% set exclude_columns = ['avg_age_currently_open_seconds', 'median_age_currently_open_seconds', 'projects'] %}
+
+with prod as (
+    select {{ dbt_utils.star(from=ref('jira__user_enhanced'), except=exclude_columns) }}
+    from {{ target.schema }}_jira_prod.jira__user_enhanced
+),
+
+dev as (
+    select {{ dbt_utils.star(from=ref('jira__user_enhanced'), except=exclude_columns) }}
+    from {{ target.schema }}_jira_dev.jira__user_enhanced
+),
+
+prod_not_in_dev as (
+    -- rows from prod not found in dev
+    select * from prod
+    except distinct
+    select * from dev
+),
+
+dev_not_in_prod as (
+    -- rows from dev not found in prod
+    select * from dev
+    except distinct
+    select * from prod
+),
+
+final as (
+    select
+        *,
+        'from prod' as source
+    from prod_not_in_dev
+
+    union all -- union since we only care if rows are produced
+
+    select
+        *,
+        'from dev' as source
+    from dev_not_in_prod
+)
+
+select *
+from final
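
The consistency tests in this commit are gated by `enabled=var('fivetran_validation_tests_enabled', false)`, so they are skipped unless that variable is set. A minimal sketch of switching them on follows. It assumes the variable is set in the `vars:` block of `integration_tests/dbt_project.yml`; passing it on the command line with dbt's `--vars` flag should work equally well.

```yml
# integration_tests/dbt_project.yml -- illustrative sketch, not part of this commit
vars:
  fivetran_validation_tests_enabled: true  # defaults to false, which leaves the consistency tests disabled
```

Each test compares the same model in two schemas named `<target.schema>_jira_prod` and `<target.schema>_jira_dev`, so both schemas need to be built before the tests run. A test returns rows, and therefore fails, only when a row appears in one schema but not in the other after the excluded columns are dropped.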
