Add macro to generate row counts for all tables in a schema by BindusekharGorintla · Pull Request #923 · elementary-data/dbt-data-reliability

BindusekharGorintla · 2026-02-03T17:54:51Z

This macro generates a row count summary for all tables in a given schema. It queries information_schema.tables to list the tables, then builds a UNION ALL query that returns the row count of each table in that schema.

Benefits
- Automates row count checks across all tables in a schema.
- Useful for data quality monitoring and schema validation.
- Provides a quick snapshot of table sizes during dbt runs.
- Logs the number of tables processed for transparency.
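The approach in the description (list the tables, then stitch one COUNT(*) per table into a single UNION ALL query) can be sketched in plain Python. This is an illustration under stated assumptions, not the macro from the PR: it uses an in-memory SQLite database, `sqlite_master` stands in for `information_schema.tables`, and the names `quote_ident` and `build_row_count_sql` are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def quote_ident(name: str) -> str:
    """Double-quote an identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_row_count_sql(table_names, run_started_at):
    """Return one UNION ALL query with a COUNT(*) branch per table."""
    branches = [
        "select '{ts}' as date_time, '{label}' as table_name, "
        "count(*) as row_count from {ident}".format(
            ts=run_started_at,
            label=name.replace("'", "''"),  # escape quotes inside the string literal
            ident=quote_ident(name),
        )
        for name in table_names
    ]
    return "\nunion all\n".join(branches)


# Two small tables to count (assumed example data).
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer)")
conn.execute("create table customers (id integer)")
conn.executemany("insert into orders values (?)", [(1,), (2,), (3,)])
conn.execute("insert into customers values (1)")

# Assumption: sqlite_master plays the role of information_schema.tables here.
tables = [
    row[0]
    for row in conn.execute(
        "select name from sqlite_master where type = 'table' order by name"
    )
]

run_started_at = datetime(2026, 2, 3, 17, 54, 51, tzinfo=timezone.utc).strftime(
    "%Y-%m-%d %H:%M:%S"
)
sql = build_row_count_sql(tables, run_started_at)
counts = {name: n for _, name, n in conn.execute(sql)}
print(f"processed {len(tables)} tables: {counts}")
```

Each branch scans its table in full, so on a real warehouse the cost of such a query grows with the number and size of the tables in the schema.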

Summary by CodeRabbit

New Features
- Added schema-wide row-count collection for metadata tracking and monitoring; automatically enumerates all tables in a schema and produces per-table row counts.
- Each record includes query timestamp, catalog, schema and table identifiers to aid auditing, trend analysis, and alerting for data volume changes.


github-actions · 2026-02-03T17:55:04Z

👋 @BindusekharGorintla
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

coderabbitai · 2026-02-03T17:55:12Z

Warning

Rate limit exceeded

@BindusekharGorintla has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 11 minutes and 26 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough


Adds a new dbt Jinja macro that queries the information_schema for a given schema and emits a UNION ALL SQL query which returns run timestamp, catalog, schema, table names, and row counts for every table in the computed target schema.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New Metadata Collection Macro `macros/edr/metadata_collection/get_all_table_counts_from_schema.sql` | Adds `get_all_table_counts_from_schema(schema_name, catalog_name = target.catalog \| default(target.database, true))`. Computes target schema from `target.schema` + `_` + `schema_name`, queries `information_schema.tables` for tables, logs total table count, and generates a UNION ALL of per-table SELECTs returning `run_started_at`, `table_catalog`, `table_schema`, `table_name`, and `COUNT(*)` for each table. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I hop through schemas, one by one,
I ask the catalog what tables live there,
I stitch their counts into a single run,
A tiny carrot of data I share,
Row by row — tally, union, and flair. 🍃

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately and concisely summarizes the main change: adding a macro to generate row counts for tables in a schema, which matches the file addition and objectives. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@macros/edr/metadata_collection/get_all_table_counts_from_schema.sql`:
- Around line 13-26: The macro currently emits nothing when
results['table_name'] is empty which yields invalid SQL; update the template to
check if results['table_name'] is defined and has length before the for-loop
and, if empty, emit a safe placeholder SELECT (for example: select
'{{run_started_at}}' as date_time, null as catalog_name, null as schema_name,
'NO_TABLES_FOUND' as table_name, 0 as count) so callers always receive valid
SQL; keep the existing for-loop that iterates over results['table_name'] and its
use of results['table_catalog'][loop.index-1],
results['table_schema'][loop.index-1], and results['table_name'][loop.index-1]
when the list is non-empty.
- Line 1: The macro get_all_table_counts_from_schema currently defaults
catalog_name to target.catalog which doesn't exist on some adapters (e.g.,
Postgres/Redshift) and will raise at runtime; change the parameter default to
use adapter-agnostic target.database or make catalog_name optional and fallback
to target.database when target.catalog is undefined inside the macro
(referencing get_all_table_counts_from_schema, the schema_name and catalog_name
parameters) so the macro works across adapters or explicitly guard/branch on
adapter type before using target.catalog.
- Around line 5-9: The SQL directly interpolates schema_to_use into the
catalog_sql string, which risks SQL injection and syntax errors; update the
catalog_sql construction to quote/escape the schema identifier instead of raw
interpolation (use the adapter's quoting/identifier functions or dbt-utils
quoting helper and apply lower() if needed) so that schema_to_use is safely
quoted when used with catalog_name and information_schema.tables; adjust the
template where catalog_sql is defined to call the quoting helper for
schema_to_use and ensure the rest of the query (the FROM {{ catalog_name
}}.information_schema.tables and where clause) uses the quoted value.
- Around line 19-20: Replace the direct interpolations of catalog/schema/table
with adapter-safe quoted identifiers and switch to loop.index0; specifically,
update the SELECT and FROM to use expressions like {{
adapter.quote(results['table_catalog'][loop.index0]) }}, {{
adapter.quote(results['table_schema'][loop.index0]) }}, and {{
adapter.quote(results['table_name'][loop.index0]) }} (keep the visible column
aliases like date_time and relation unchanged) so identifiers are properly
quoted and protected from spaces/reserved words.
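Two of the points above, the placeholder SELECT for an empty table list and quoting every identifier, can be sketched in Python. This is a minimal illustration, not the macro's code: `render_counts_sql`, `quote_ident`, and `sql_literal` are hypothetical names, and ANSI double-quote quoting stands in for whatever `adapter.quote` produces on a given adapter.

```python
def quote_ident(name):
    """ANSI-style identifier quoting (assumed stand-in for adapter.quote)."""
    return '"' + name.replace('"', '""') + '"'


def sql_literal(value):
    """Render a Python string as a single-quoted SQL string literal."""
    return "'" + value.replace("'", "''") + "'"


def render_counts_sql(rows, run_started_at):
    """Build the UNION ALL query; rows are (catalog, schema, table) tuples."""
    ts = sql_literal(run_started_at)
    if not rows:
        # Empty schema: emit a placeholder row so callers still get valid SQL.
        return (
            f"select {ts} as date_time, null as catalog_name, "
            "null as schema_name, 'NO_TABLES_FOUND' as table_name, 0 as count"
        )
    branches = []
    # Iterating over tuples avoids the loop.index-1 bookkeeping entirely.
    for catalog, schema, table in rows:
        relation = ".".join(quote_ident(part) for part in (catalog, schema, table))
        branches.append(
            f"select {ts} as date_time, {sql_literal(catalog)} as catalog_name, "
            f"{sql_literal(schema)} as schema_name, "
            f"{sql_literal(table)} as table_name, "
            f"count(*) as count from {relation}"
        )
    return "\nunion all\n".join(branches)


empty_sql = render_counts_sql([], "2026-02-03 17:54:51")
two_tables_sql = render_counts_sql(
    [("dev", "analytics_raw", "orders"), ("dev", "analytics_raw", "order items")],
    "2026-02-03 17:54:51",
)
print(empty_sql)
print(two_tables_sql)
```

The catalog, schema, and table values appear twice in each branch on purpose: once as string literals for the output columns and once as quoted identifiers in the FROM clause, since the two positions need different escaping.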

🧹 Nitpick comments (2)

macros/edr/metadata_collection/get_all_table_counts_from_schema.sql (2)
19-19: run_started_at timestamp format may need casting.

The run_started_at variable is a Python datetime object. When interpolated as a string, the format may not be compatible with all database timestamp types. Consider explicit casting or formatting.
♻️ Proposed improvement for timestamp handling
-    select '{{run_started_at}}' as date_time, ...
+    select cast('{{run_started_at.strftime("%Y-%m-%d %H:%M:%S")}}' as timestamp) as date_time, ...
Or use the adapter's timestamp literal format for better cross-database compatibility.
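The formatting concern can be seen with a plain Python datetime. This is a small sketch with an assumed example value, on the assumption that `run_started_at` behaves like a timezone-aware `datetime`; the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Assumed example value; dbt supplies the real run start time.
run_started_at = datetime(2026, 2, 3, 17, 54, 51, 123456, tzinfo=timezone.utc)

# Default string form keeps microseconds and the UTC offset.
raw = str(run_started_at)

# Explicit formatting yields a plain timestamp string.
formatted = run_started_at.strftime("%Y-%m-%d %H:%M:%S")
literal = f"cast('{formatted}' as timestamp)"

print(raw)
print(literal)
```

Note that the explicit format drops the UTC offset and the fractional seconds, so it is only appropriate when the target column is a timestamp without time zone or when the run time is known to be in UTC.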
3-3: Hardcoded schema naming convention may not fit all use cases.

The schema construction target.schema ~ '_' ~ schema_name assumes a specific naming convention. Consider making the full schema name passable directly, or document this convention clearly.
💡 Suggested enhancement for flexibility
-{%- macro get_all_table_counts_from_schema(schema_name, catalog_name = target.catalog) -%}
-
-{%- set schema_to_use = target.schema ~ '_' ~ schema_name -%}
+{%- macro get_all_table_counts_from_schema(schema_name, catalog_name = target.catalog | default(target.database, true), use_target_prefix = true) -%}
+
+{%- set schema_to_use = (target.schema ~ '_' ~ schema_name) if use_target_prefix else schema_name -%}
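The intended behavior of the suggested signature can be modeled in Python. This is a sketch under assumptions: `resolve_schema` and `resolve_catalog` are hypothetical helpers, `target` is represented as a plain dict, and the `or` fallback approximates Jinja's `default(..., true)`, which also replaces falsy values such as an empty string.

```python
def resolve_schema(target_schema, schema_name, use_target_prefix=True):
    """Return the schema to scan, optionally prefixed with the target schema."""
    if use_target_prefix:
        return f"{target_schema}_{schema_name}"
    return schema_name


def resolve_catalog(target):
    """Prefer target['catalog']; fall back to target['database'] when it is missing or falsy."""
    return target.get("catalog") or target["database"]


# Adapter that exposes a catalog (assumed example values).
print(resolve_catalog({"catalog": "main", "database": "analytics"}))
# Adapter without a catalog attribute falls back to the database.
print(resolve_catalog({"database": "analytics"}))
print(resolve_schema("dbt_dev", "staging"))
print(resolve_schema("dbt_dev", "staging", use_target_prefix=False))
```

Keeping the prefix on by default preserves the convention from the original macro, while the flag lets callers pass a fully qualified schema name when their project does not follow it.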


BindusekharGorintla requested a deployment to elementary_test_env February 3, 2026 17:55 — with GitHub Actions Waiting

coderabbitai bot reviewed Feb 3, 2026


Apply suggestion from @coderabbitai[bot]

525a19f

Added database Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

BindusekharGorintla requested a deployment to elementary_test_env February 3, 2026 18:12 — with GitHub Actions Waiting

Apply suggestion from @coderabbitai[bot]

2ee9580

Table and schema names Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

BindusekharGorintla requested a deployment to elementary_test_env February 3, 2026 18:13 — with GitHub Actions Waiting

BindusekharGorintla closed this by deleting the head repository Feb 9, 2026

BindusekharGorintla wants to merge 3 commits into elementary-data:master from BindusekharGorintla:patch-1
