Skip to content

Conversation

emirkmo
Copy link

@emirkmo emirkmo commented Aug 14, 2025

Summary

Fix persist docs for replicated cluster setups. (Solves #295)

Split persist docs macro into two alter statements to handle cases where on cluster wouldn't persist comments on other replicas.

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

Note for maintainers

Note that current github integrations tests do not use the cluster test_replica and hence didn't see the issue. The added tests only fail & pass using test_replica cluster (i.e. replicated (and sharded if needed) cluster setups).

Explanation

The previous version of this macro blended the MODIFY COMMENT command for the table comment with the COMMENT COLUMN commands for the column comments, inside the same ALTER TABLE.

This combined syntax does not work with ON CLUSTER for replicated setups. (This was thought to be a clickhouse bug, but that bug is fixed and the behavior remains.)

The revised macro now sends a separate ALTER TABLE for the table comment MODIFY COMMENT, and a separate one for all the COMMENT COLUMN commands.

I've split it up into 2 helper macros that accomplish each task for maintainability. This also works on non-cluster setups so there is no need to further complicate it by combining table and column comment alterations into a single ALTER.

clickhouse__persist_docs now runs the relation (table comment) helper macro, if persist_relations_docs() is set and descriptions exist for the relation, and runs the column comment helper macro if persist_column_docs() is set and columns are defined in the model.

Additionally, the formatting was cleaned up for multiple column comments so that comments for a different column are not appearing on the same line. Just makes it easier to read in clickhouse/dbt logs.

Example:

Previous macro produced comments like:

alter table <table>
[ON CLUSTER] modify comment
$dbt_comment_literal_block$<table comment>,
$dbt_comment_literal_block$, comment column `col1_name`
$dbt_comment_literal_block$<column comment>
$dbt_comment_lieteral_block$, comment column `col2_name`

...

A bit messy, but works on a single node.

Although ON CLUSTER is correctly added, this syntax is incorrect in a replicated setup and ONLY produces table and column comments for the initiator node. Now it does:

alter table <table>
[ON CLUSTER] modify comment
$dbt_comment_literal_block$ <table comment>
$dbt_comment_literal_block$

alter table <table>
[ON CLUSTER]
comment column `col1_name`
 $dbt_comment_literal_block$<column multi
linecomment>$dbt_comment_literal_block$,
comment column `col2_name`
 $dbt_comment_literal_block$ <column comment>$dbt_comment_literal_block$,

...

Edit: Removed dev test leftovers.
Fixes: #295

@CLAassistant
Copy link

CLAassistant commented Aug 14, 2025

CLA assistant check
All committers have signed the CLA.

@emirkmo emirkmo changed the title Fix persist docs for Replicated cluster setups Fix: persist docs for Replicated cluster setups Aug 14, 2025
@emirkmo
Copy link
Author

emirkmo commented Aug 14, 2025

Note: for the added test to catch regressions in CI you should add the following to test_matrix.yaml:

      - name: Run HTTP tests on replicated cluster
        env:
          DBT_CH_TEST_CLUSTER: test_replica
          DBT_CH_TEST_CLUSTER_MODE: True

        run: |
          PYTHONPATH="${PYTHONPATH}:dbt"
          pytest tests/  
        # or just run the comment test: 
        # pytest tests/integration/adapter/clickhouse/test_clickhouse_comments.py

failure requires `test_replica` cluster to be used.
The previous version of this macro blended the
`MODIFY COMMENT` command for the table comment
with the `COMMENT COLUMN` commands for the column comments,
inside the same `ALTER TABLE`.

This combined syntax **does not work** with `ON CLUSTER`.
(This was thought to be a clickhouse bug, but that bug is fixed
and the behavior remains.)

The revised macro now sends a separate `ALTER TABLE` for
the table comment `MODIFY COMMENT`, and a separate one for
all the `COMMENT COLUMN` commands.

I've split it up into 2 helper macros that accomplish each
task for maintainability. This also works on non-cluster
setups so there is no need to further complicate it by combining
table and column comment alterations into a single ALTER.

`clickhouse__persist_docs` now runs the relation (table comment)
helper macro if `persist_relations_docs()` is set and descriptions exist
for the relation and the column comment helper macro if
`persist_column_docs()` and columns are defined in the model.

Additionally, the formatting was cleaned up for multiple column
comments so that comments for a different column are not appearing
on the same line. Just makes it easier to read in clickhouse/dbt logs.

Example:

It produced comments like:

```sql
alter table <table>
[ON CLUSTER] modify comment
$dbt_comment_literal_block$<table comment>
$dbt_comment_literal_block$, comment column `col1_name`
$dbt_comment_literal_block$<column comment>
$dbt_comment_lieteral_block$, comment column `col2_name`

...
```

A bit messy, but works on a single node.

Although `ON CLUSTER` is correctly added, this syntax is
incorrect and ONLY produces table and column comments for
the initiator node. Now it does:

```sql
alter table <table>
[ON CLUSTER] modify comment
$dbt_comment_literal_block$ <table comment>
$dbt_comment_literal_block$

alter table <table>
[ON CLUSTER]
comment column `col1_name`
 $dbt_comment_literal_block$<column multi
linecomment>$dbt_comment_literal_block$,
comment column `col2_name`
 $dbt_comment$ <column comment>$dbt_comment$,

...
```
@emirkmo emirkmo force-pushed the fix-persist-docs-for-cluster branch from 4841279 to 671f7d0 Compare August 14, 2025 16:21
@emirkmo emirkmo marked this pull request as draft August 28, 2025 11:56
@emirkmo emirkmo marked this pull request as ready for review August 28, 2025 11:56
@emirkmo
Copy link
Author

emirkmo commented Aug 28, 2025

Tried to retrigger it for attention. Could I please get the workflows approved or a review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Table and Column Comments set up misbehaviour
2 participants