Commit 114923b

authored Jan 31, 2023
feat: clone with new owner (#27)
* feat: clone with new owner
* chore: add more doc and comments
* feat: simplify the functions
* fix: update ownership model
* rename macro
* style
* qualify macros
* typos
1 parent 37f03fc commit 114923b

9 files changed: +247 -20 lines changed
 

‎2-step_cloning_pattern.md

+35
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
+# 2-Step dbt Cloning Pattern
+
+_Credit: this cloning pattern is inspired by [Dan Gooden’s article on the Airtasker Tribe blog](https://medium.com/airtribe/test-sql-pipelines-against-production-clones-using-dbt-and-snowflake-2f8293722dd4)._
+
+Cloning is a cost- and time-efficient way to develop dbt models on Snowflake, but it becomes challenging when the clone has to cross environments with different access controls, for example when you want to clone a production database for use in development.
+
+A solution for this is to run a 2-step cloning pattern:
+
+1. A production role clones the production database or schema and then transfers ownership of its sub-objects to a developer role, creating a developer clone of production. The clone itself is still owned by the production role (which preserves the privilege to drop or replace it), but the developer role now has full access to its sub-objects.
+2. Developers use the developer role to clone that developer clone (database or schema), creating a personal clone for development. The developer role fully owns this personal clone and all its sub-objects.
+
+This pattern can be used for cloning a schema or a database. If all the dbt models are stored within a single schema, schema-level cloning is a good option. When dbt is configured to write data to multiple schemata, database-level cloning is a good, more production-like option.
+
+This pattern optimizes for the following:
+
+- **Access Control:** no need to compromise on your access control system, for example by granting your developer role extensive access to production. This pattern takes environmental separation as a given.
+- **Flexible Availability:** step 1 can be run on any preferred schedule: the developer clone could be updated hourly, daily, weekly, or at any other cadence. This first clone is ideally run after a complete execution of dbt so that the data is as fresh as possible.
+- **Developer Flexibility:** developers can take personal clones whenever they need to and can even take multiple clones if they need more than one concurrent development environment. These developer clones should ideally be rotated frequently to keep data fresh and production-like.
+
+## Setup:
+
+1. Update one of your production jobs to include step 1 of the cloning pattern. Here is an example implementation for database-level cloning from production to production_clone:
+
+```bash
+dbt build &&
+dbt run-operation clone_database \
+  --args "{'source_database': 'production', 'destination_database': 'production_clone', 'new_owner_role': 'developer_role'}"
+```
+
+2. As needed, locally run step 2 of the cloning pattern to create or update personal development clones. Here is an example implementation for database-level cloning from production_clone to an ephemeral database called developer_clone_me:
+
+```bash
+dbt run-operation clone_database \
+  --args "{'source_database': 'production_clone', 'destination_database': 'developer_clone_me'}"
+```
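Step 2 of the pattern can be wrapped in a small helper so that each developer gets a predictably named personal clone. The following is a minimal sketch, not part of this commit: the function names, the `developer_clone_<user>` naming scheme, and the `DRY_RUN` switch are all assumptions made for illustration. By default it only prints the command it would run.

```bash
#!/usr/bin/env bash
# Sketch only: helper names, naming scheme, and DRY_RUN are hypothetical.

# Build a database name such as developer_clone_jane_doe from a user name.
personal_clone_name() {
  local user="$1"
  local safe
  # Lowercase the name, then replace every character outside [a-z0-9_]
  # with an underscore so the result is a valid unquoted identifier.
  safe="$(printf '%s' "$user" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_' '_')"
  printf 'developer_clone_%s' "$safe"
}

# Print (default) or run step 2 of the cloning pattern for one developer.
clone_personal_database() {
  local source_db="$1" user="$2"
  local dest_db args
  dest_db="$(personal_clone_name "$user")"
  args="{'source_database': '${source_db}', 'destination_database': '${dest_db}'}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: show the command instead of executing it.
    printf 'dbt run-operation clone_database --args "%s"\n' "$args"
  else
    dbt run-operation clone_database --args "$args"
  fi
}
```

For example, `clone_personal_database production_clone "$USER"` prints the command for the current user, and `DRY_RUN=0 clone_personal_database production_clone "$USER"` would actually run it with the developer role configured in the active dbt profile.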

‎README.md

+58-11
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ This [dbt](https://github.com/dbt-labs/dbt-core) package contains Snowflake-spec
 Check [dbt Hub](https://hub.getdbt.com/montreal-analytics/snowflake_utils/latest/) for the latest installation instructions, or [read the docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.

 ## Prerequisites
-Snowflake Utils is compatible with dbt 0.20.0 and later.
+Snowflake Utils is compatible with dbt 1.1.0 and later.

 ----

@@ -68,28 +68,60 @@ When a variable is configured for a conditon _and_ that condition is matched whe
 When compiling or generating docs, the console reports that dbt is using the incremental run warehouse. It isn't actually so. During these operations, only the target warehouse is activated.

 ### snowflake_utils.clone_schema ([source](macros/clone_schema.sql))
-This macro clones the source schema into the destination schema.
+This macro is a part of the recommended 2-step Cloning Pattern for dbt development, explained in detail [here](2-step_cloning_pattern.md).
+
+This macro clones the source schema into the destination schema and optionally grants ownership over its tables and views to a new owner.
+
+Note: the owner of the schema is the role that executed the command, but if `new_owner_role` is configured, that role becomes the owner of the schema's tables and views. This is important for maintaining and replacing clones and is explained in more detail [here](2-step_cloning_pattern.md).

 #### Arguments
 * `source_schema` (required): The source schema name
 * `destination_schema` (required): The destination schema name
-* `source_database` (optional): The source database name
-* `destination_database` (optional): The destination database name
+* `source_database` (optional): The source database name; default value is your profile's target database.
+* `destination_database` (optional): The destination database name; default value is your profile's target database.
+* `new_owner_role` (optional): The new ownership role name. If no value is passed, the ownership will remain unchanged.
+
+#### Usage
+
+Call the macro as an [operation](https://docs.getdbt.com/docs/using-operations):
+
+```
+dbt run-operation clone_schema \
+  --args "{'source_schema': 'analytics', 'destination_schema': 'ci_schema'}"
+
+# set the databases and new_owner_role
+dbt run-operation clone_schema \
+  --args "{'source_schema': 'analytics', 'destination_schema': 'ci_schema', 'source_database': 'production', 'destination_database': 'temp_database', 'new_owner_role': 'developer_role'}"
+```
+
+
+### snowflake_utils.clone_database ([source](macros/clone_database.sql))
+This macro is a part of the recommended 2-step Cloning Pattern for dbt development, explained in detail [here](2-step_cloning_pattern.md).
+
+This macro clones the source database into the destination database and optionally grants a new owner role all privileges on the clone and its schemata, plus ownership of the tables and views in those schemata.
+
+Note: the owner of the database is the role that executed the command, but if `new_owner_role` is configured, that role becomes the owner of the tables and views in the database's schemata. This is important for maintaining and replacing clones and is explained in more detail [here](2-step_cloning_pattern.md).
+
+#### Arguments
+* `source_database` (required): The source database name
+* `destination_database` (required): The destination database name
+* `new_owner_role` (optional): The new ownership role name. If no value is passed, the ownership will remain unchanged.

 #### Usage

 Call the macro as an [operation](https://docs.getdbt.com/docs/using-operations):

 ```
-# for multiple arguments, use the dict syntax
-dbt run-operation clone_schema --args "{'source_schema': 'analytics', 'destination_schema': 'ci_schema'}"
+dbt run-operation clone_database \
+  --args "{'source_database': 'production_clone', 'destination_database': 'developer_clone'}"

-# set the databases
-dbt run-operation clone_schema --args "{'source_schema': 'analytics', 'destination_schema': 'ci_schema', 'source_database': 'production', 'destination_database': 'temp_database'}"
+# set the new_owner_role
+dbt run-operation clone_database \
+  --args "{'source_database': 'production_clone', 'destination_database': 'developer_clone', 'new_owner_role': 'developer_role'}"
 ```

 ### snowflake_utils.drop_schema ([source](macros/drop_schema.sql))
-This macro drops a schema in the selected database (defaults to target database if no database is selected).
+This macro drops a schema in the selected database (defaults to target database if no database is selected). A schema can only be dropped by the role that owns it.

 #### Arguments
 * `schema_name` (required): The schema to drop
@@ -100,8 +132,23 @@ This macro drops a schema in the selected database (defaults to target database
 Call the macro as an [operation](https://docs.getdbt.com/docs/using-operations):

 ```
-# for multiple arguments, use the dict syntax
-dbt run-operation drop_schema --args "{'schema_name': 'customers_temp', 'database': 'production'}"
+dbt run-operation drop_schema \
+  --args "{'schema_name': 'customers_temp', 'database_name': 'production'}"
+```
+
+### snowflake_utils.drop_database ([source](macros/drop_database.sql))
+This macro drops a database. A database can only be dropped by the role that owns it.
+
+#### Arguments
+* `database_name` (required): The database name
+
+#### Usage
+
+Call the macro as an [operation](https://docs.getdbt.com/docs/using-operations):
+
+```
+dbt run-operation drop_database \
+  --args "{'database_name': 'production_clone'}"
 ```

 ### snowflake_utils.apply_meta_as_tags ([source](macros/apply_meta_as_tags.sql))
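
The `clone_schema` and `drop_schema` operations documented in this README diff can be chained to give a CI run a throwaway schema. The sketch below only prints the two commands. The `ci_pr_<number>` naming scheme, the `analytics` source schema, and both function names are assumptions for illustration, not something this package defines.

```bash
#!/usr/bin/env bash
# Sketch only: the ci_pr_<number> naming and the 'analytics' source schema
# are hypothetical; adjust them to your project.

# Derive the CI schema name from a pull request number.
ci_schema_name() {
  printf 'ci_pr_%s' "$1"
}

# Print the clone and drop commands for one pull request.
print_ci_clone_commands() {
  local schema
  schema="$(ci_schema_name "$1")"
  # 1. Clone the source schema into the CI schema (both databases default
  #    to the profile's target database).
  printf 'dbt run-operation clone_schema --args "%s"\n' \
    "{'source_schema': 'analytics', 'destination_schema': '${schema}'}"
  # 2. Drop the CI schema again once the checks have finished.
  printf 'dbt run-operation drop_schema --args "%s"\n' \
    "{'schema_name': '${schema}'}"
}
```

Calling `print_ci_clone_commands 42` prints one `clone_schema` command and one `drop_schema` command for the schema `ci_pr_42`. Because a schema can only be dropped by the role that owns it, both commands would need to run under the same role.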

‎dbt_project.yml

+2-3
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,10 @@
 name: 'snowflake_utils'
-version: '0.3.0'
+version: '0.4.0'

 config-version: 2

-require-dbt-version: ">=0.17.0"
+require-dbt-version: ">=1.1.0"

-source-paths: ["models"]
 target-path: "target"
 clean-targets: ["target", "dbt_modules"]
 test-paths: ["test"]

‎macros/clone_database.sql

+64
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
+{#
+-- This macro clones the source database into the destination database and
+-- optionally grants a new owner role all privileges on the clone and its
+-- schemata, plus ownership of the tables and views in those schemata.
+#}
+{% macro clone_database(
+    source_database,
+    destination_database,
+    new_owner_role=''
+) %}
+
+{% if source_database and destination_database %}
+
+    {{ (log("Cloning existing database " ~ source_database ~
+        " into database " ~ destination_database, info=True)) }}
+
+    {% call statement('clone_database', fetch_result=True, auto_begin=False) -%}
+        CREATE OR REPLACE DATABASE {{ destination_database }}
+            CLONE {{ source_database }};
+    {%- endcall %}
+
+    {%- set result = load_result('clone_database') -%}
+    {{ log(result['data'][0][0], info=True)}}
+
+{% else %}
+
+    {{ exceptions.raise_compiler_error("Invalid arguments. Missing source database and/or destination database") }}
+
+{% endif %}
+
+{% if new_owner_role != '' %}
+
+    {% set list_schemas_query %}
+    -- list all schemata in the cloned database so that privileges and
+    -- ownership can be granted on each of them
+    SELECT schema_name
+    FROM {{ destination_database }}.information_schema.schemata
+    WHERE schema_name != 'INFORMATION_SCHEMA'
+    {% endset %}
+
+    {% set results = run_query(list_schemas_query) %}
+
+    {% if execute %}
+        {# Return the first column #}
+        {% set schemata_list = results.columns[0].values() %}
+    {% else %}
+        {% set schemata_list = [] %}
+    {% endif %}
+
+    {% for schema_name in schemata_list %}
+
+        {{ snowflake_utils.grant_ownership_on_schema_objects(new_owner_role, schema_name, destination_database) }}
+
+    {% endfor %}
+
+    {{ log("Granting all privileges on database " ~ destination_database ~ " to " ~ new_owner_role, info=True)}}
+
+    {% call statement('grant_database_privileges', fetch_result=True, auto_begin=False) -%}
+        GRANT ALL PRIVILEGES ON DATABASE {{ destination_database }} TO {{ new_owner_role }};
+    {%- endcall %}
+
+{% endif %}
+
+{% endmacro %}

‎macros/clone_schema.sql

+17-1
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,14 @@
-{% macro clone_schema(source_schema, destination_schema, source_database=target.database, destination_database=target.database) %}
+{#
+-- This macro clones the source schema into the destination schema and
+-- optionally grants ownership over its tables and views to a new owner.
+#}
+{% macro clone_schema(
+    source_schema,
+    destination_schema,
+    source_database=target.database,
+    destination_database=target.database,
+    new_owner_role=''
+) %}

 {% if source_schema and destination_schema %}

@@ -19,4 +29,10 @@

 {% endif %}

+{% if new_owner_role != '' %}
+
+    {{ snowflake_utils.grant_ownership_on_schema_objects(new_owner_role, destination_schema, destination_database) }}
+
+{% endif %}
+
 {% endmacro %}

‎macros/drop_database.sql

+23
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
+{#
+-- This macro drops a database.
+#}
+{% macro drop_database(database_name) %}
+
+{% if database_name %}
+
+    {{ log("Dropping database " ~ database_name ~ "...", info=True) }}
+
+    {% call statement('drop_database', fetch_result=True, auto_begin=False) -%}
+        DROP DATABASE {{ database_name }}
+    {%- endcall %}
+
+    {%- set result = load_result('drop_database') -%}
+    {{ log(result['data'][0][0], info=True)}}
+
+{% else %}
+
+    {{ exceptions.raise_compiler_error("Invalid arguments. Missing database name") }}
+
+{% endif %}
+
+{% endmacro %}

‎macros/drop_schema.sql

+11-4
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,18 @@
-{% macro drop_schema(schema_name, database=target.database) %}
+{#
+-- This macro drops a schema in the selected database (defaults to target
+-- database if no database is selected).
+#}
+{% macro drop_schema(
+    schema_name,
+    database_name=target.database
+) %}

 {% if schema_name %}

-    {{ log("Dropping schema " ~ database ~ "." ~ schema_name ~ "...", info=True) }}
+    {{ log("Dropping schema " ~ database_name ~ "." ~ schema_name ~ "...", info=True) }}

     {% call statement('drop_schema', fetch_result=True, auto_begin=False) -%}
-        DROP SCHEMA {{ database }}.{{ schema_name }}
+        DROP SCHEMA {{ database_name }}.{{ schema_name }}
     {%- endcall %}

     {%- set result = load_result('drop_schema') -%}
@@ -17,4 +24,4 @@

 {% endif %}

-{% endmacro %}
+{% endmacro %}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
+{#
+-- This macro grants ownership over a schema's tables and views and is
+-- optionally called by the clone_schema and clone_database macros.
+#}
+{% macro grant_ownership_on_schema_objects(
+    new_owner_role,
+    destination_schema,
+    destination_database=target.database
+) %}
+
+{% if new_owner_role and destination_schema %}
+
+    {{ (log("Granting privileges on " ~ destination_database ~ "." ~ destination_schema ~
+        " and ownership of its tables and views to " ~ new_owner_role, info=True)) }}
+
+    {% call statement('grant_ownership_on_schema_objects', fetch_result=True, auto_begin=False) -%}
+        GRANT USAGE ON SCHEMA {{ destination_database }}.{{ destination_schema }}
+            TO {{ new_owner_role }};
+        GRANT OWNERSHIP ON ALL TABLES IN SCHEMA {{ destination_database }}.{{ destination_schema }}
+            TO {{ new_owner_role }} REVOKE CURRENT GRANTS;
+        GRANT OWNERSHIP ON ALL VIEWS IN SCHEMA {{ destination_database }}.{{ destination_schema }}
+            TO {{ new_owner_role }} REVOKE CURRENT GRANTS;
+        GRANT ALL PRIVILEGES ON SCHEMA {{ destination_database }}.{{ destination_schema }}
+            TO {{ new_owner_role }};
+    {%- endcall %}
+
+    {%- set result = load_result('grant_ownership_on_schema_objects') -%}
+    {{ log(result['data'][0][0], info=True)}}
+
+{% else %}
+
+    {{ exceptions.raise_compiler_error("Invalid arguments. Missing new owner role and/or destination schema") }}
+
+{% endif %}
+
+{% endmacro %}
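
To make the effect of `grant_ownership_on_schema_objects` concrete, the sketch below prints the four statements the macro above would render for a given role, schema, and database, without connecting to Snowflake. The helper name `preview_schema_grants_sql` and the example argument values are hypothetical; the statements themselves are copied from the macro body.

```bash
#!/usr/bin/env bash
# Sketch only: prints the rendered SQL and runs nothing. The helper name
# is hypothetical.

preview_schema_grants_sql() {
  local role="$1" schema="$2" database="$3"
  # Fully qualified schema name, as the macro builds it.
  local fqn="${database}.${schema}"
  cat <<SQL
GRANT USAGE ON SCHEMA ${fqn} TO ${role};
GRANT OWNERSHIP ON ALL TABLES IN SCHEMA ${fqn} TO ${role} REVOKE CURRENT GRANTS;
GRANT OWNERSHIP ON ALL VIEWS IN SCHEMA ${fqn} TO ${role} REVOKE CURRENT GRANTS;
GRANT ALL PRIVILEGES ON SCHEMA ${fqn} TO ${role};
SQL
}
```

Running `preview_schema_grants_sql developer_role analytics production_clone` shows that only tables and views change owner. The schema itself receives usage and all privileges, so the role that created the clone keeps ownership of the schema and can still drop or replace it.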

‎packages.yml

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,3 @@
 packages:
   - package: dbt-labs/dbt_utils
-    version: ">=0.7.0"
+    version: [">=0.7.0", "<1.1.0"]
