Skip to content

Conversation

canbekley
Copy link
Contributor

@canbekley canbekley commented Apr 22, 2025

Adds support for microbatch distributed incremental materialization, by adding a new model config is_distributed. Distributed models that run in microbatches would need to be created as following:

{{
    config(
        materialized='incremental',
        is_distributed=True,
        incremental_strategy='microbatch',
        ...
    )
}}
...

related issue: #439

@canbekley canbekley marked this pull request as ready for review April 22, 2025 08:37
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for distributed incremental microbatches by introducing a new model config is_distributed and updating the underlying adapter logic accordingly.

  • Updates integration tests to include distributed and non-distributed cases
  • Modifies ClickHouseRelation.get_on_cluster and ClickHouseConnectionManager methods to account for the new is_distributed flag
  • Adds a get_materialization function in the ClickHouse adapter to properly alias materialization types for distributed models

Reviewed Changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 2 comments.

File Description
tests/integration/adapter/incremental/test_incremental_microbatch.py Updates tests to pass is_distributed values and use dynamic materialized settings for both distributed and non-distributed scenarios
dbt/adapters/clickhouse/relation.py Updates get_on_cluster signature and usage to include the is_distributed flag derived from model configs
dbt/adapters/clickhouse/impl.py Propagates the is_distributed flag in the should_on_cluster method
dbt/adapters/clickhouse/init.py Introduces get_materialization to transform materialized and is_distributed settings for compatibility
Files not reviewed (2)
  • dbt/include/clickhouse/macros/adapters/relation.sql: Language not supported
  • dbt/include/clickhouse/macros/materializations/incremental/distributed_incremental.sql: Language not supported

"input_model.sql": _input_model_sql,
"microbatch_model.sql": _microbatch_model_sql,
"input_model.sql": _input_model_sql % "table",
"microbatch_model.sql": _microbatch_model_sql % "False", # `is_distributed` param
Copy link
Preview

Copilot AI Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is_distributed parameter is provided as a string ('False' or 'True') in this test, but the adapter functions expect a boolean value. Consider converting the string to a boolean to avoid misinterpretation of truthy values.

Suggested change
"microbatch_model.sql": _microbatch_model_sql % "False", # `is_distributed` param
"microbatch_model.sql": _microbatch_model_sql % False, # `is_distributed` param

Copilot uses AI. Check for mistakes.

Comment on lines +150 to 151
is_distributed = relation_config.config.get('extra', {}).get('is_distributed')
engine = relation_config.config.get('engine') or ''
Copy link
Preview

Copilot AI Apr 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value for is_distributed may be passed as a string instead of a boolean, which could lead to unintended behavior in get_on_cluster. Consider adding an explicit conversion to boolean.

Suggested change
is_distributed = relation_config.config.get('extra', {}).get('is_distributed')
engine = relation_config.config.get('engine') or ''
is_distributed = relation_config.config.get('extra', {}).get('is_distributed')
# Ensure is_distributed is a boolean
is_distributed = str(is_distributed).lower() == "true" if is_distributed is not None else False

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant