TobikoData · Lewis-Lyons · Sep 11, 2025 · Sep 11, 2025 · Sep 11, 2025 · Sep 12, 2025
diff --git a/docs/concepts/audits.md b/docs/concepts/audits.md
@@ -722,3 +722,104 @@ MODEL (
   )
 );
 ```
+
+### AUDIT_ONLY Models
+
+In addition to traditional audits, SQLMesh provides a special model kind called `AUDIT_ONLY` for validating data relationships across multiple tables without materializing any results.
+
+#### When to Use AUDIT_ONLY Models
+
+Use `AUDIT_ONLY` models when you need to:
+- Validate relationships between multiple tables (e.g., referential integrity)
+- Run complex validation queries that don't belong to a single model
+- Create validation logic that participates in the model DAG with proper dependencies
+- Avoid creating unnecessary materialized tables just for validation
+
+Unlike traditional audits that are scoped to a single model, `AUDIT_ONLY` models can depend on multiple models and validate relationships between them.
+
+#### Creating AUDIT_ONLY Models
+
+AUDIT_ONLY models are defined like regular models but with `kind AUDIT_ONLY`:
+
+```sql
+MODEL (
+    name data_quality.order_validation,
+    kind AUDIT_ONLY (
+        blocking TRUE,         -- Fail pipeline if validation fails (default)
+        max_failing_rows 20   -- Number of sample rows to show in error (default: 10)
+    ),
+    depends_on [orders, customers],
+    cron '@daily',
+    owner 'data_quality_team'
+);
+
+-- Query should return 0 rows for success
+-- Any returned rows indicate validation failures
+SELECT 
+    o.order_id,
+    o.customer_id,
+    'Missing customer record' as issue_type
+FROM orders o
+LEFT JOIN customers c ON o.customer_id = c.customer_id
+WHERE c.customer_id IS NULL;
+```
+
+#### Key Differences from Regular Audits
+
+| Feature | Traditional Audits | AUDIT_ONLY Models |
+|---------|-------------------|-------------------|
+| **Scope** | Single model | Multiple models |
+| **Dependencies** | Implicit (via @this_model) | Explicit (via depends_on) |
+| **Materialization** | N/A | Never materializes |
+| **Location** | `audits/` directory or inline | `models/` directory |
+| **Scheduling** | With parent model | Independent cron schedule |
+| **DAG Participation** | Attached to model | Full model in DAG |
+
+#### Configuration Options
+
+AUDIT_ONLY models support these configuration options:
+
+- **`blocking`** (default: `TRUE`): Whether validation failures should stop the pipeline
+- **`max_failing_rows`** (default: `10`): Maximum number of failing rows to show in error messages
+
+Example with non-blocking configuration:
+
+```sql
+MODEL (
+    name data_quality.revenue_anomalies,
+    kind AUDIT_ONLY (
+        blocking FALSE,        -- Log warnings but don't stop pipeline
+        max_failing_rows 50   -- Show up to 50 failing rows
+    ),
+    depends_on [revenue_by_day]
+);
+
+-- Detect revenue anomalies
+WITH stats AS (
+    SELECT AVG(revenue) as avg_rev, STDDEV(revenue) as stddev_rev
+    FROM revenue_by_day
+)
+SELECT 
+    day,
+    revenue,
+    'Anomaly: >3 standard deviations' as issue
+FROM revenue_by_day
+CROSS JOIN stats
+WHERE revenue > avg_rev + (3 * stddev_rev)
+   OR revenue < 0;
+```
+
+#### How AUDIT_ONLY Models Work
+
+1. **No Table Creation**: The model's query executes but doesn't create or update any tables
+2. **Validation Logic**: The model fails if the query returns any rows (similar to audits)
+3. **Error Reporting**: Shows a sample of failing rows in the error message
+4. **Pipeline Integration**: Participates in plan/apply workflow with proper dependency ordering
+
+#### Best Practices
+
+1. **Use descriptive names**: Name your AUDIT_ONLY models clearly (e.g., `audit_order_integrity`, `validate_user_consistency`)
+2. **Set appropriate blocking**: Use `blocking TRUE` for critical validations, `FALSE` for warnings
+3. **Include context in output**: Return columns that help identify and debug issues
+4. **Group related validations**: Consider combining related checks in a single AUDIT_ONLY model
+5. **Document validation logic**: Use model descriptions to explain what's being validated and why
diff --git a/docs/concepts/models/model_kinds.md b/docs/concepts/models/model_kinds.md
@@ -860,6 +860,115 @@ SELECT DISTINCT
 FROM db.employees;
 ```
 
+## AUDIT_ONLY
+
+The `AUDIT_ONLY` model kind is designed for data validation across multiple tables without materializing any results. These models execute validation queries and fail if any rows are returned, similar to [audits](../audits.md#audit_only-models) but with the ability to participate as full models in the DAG.
+
+### Purpose
+
+`AUDIT_ONLY` models are ideal for:
+- Validating referential integrity between multiple tables
+- Detecting data quality issues across different models
+- Running complex validation queries that don't belong to a single model
+- Avoiding unnecessary table materialization for validation purposes
+
+### Configuration
+
+The `AUDIT_ONLY` kind supports two configuration parameters:
+
+- **`blocking`** (default: `TRUE`): Determines whether validation failures stop the pipeline
+- **`max_failing_rows`** (default: `10`): Maximum number of failing rows to display in error messages
+
+### Example: Referential Integrity Check
+
+This example validates that all orders reference existing customers:
+
+```sql linenums="1"
+MODEL (
+  name data_quality.order_integrity,
+  kind AUDIT_ONLY (
+    blocking TRUE,
+    max_failing_rows 20
+  ),
+  depends_on [orders, customers],
+  cron '@daily',
+  owner 'data_quality_team'
+);
+
+-- Query should return 0 rows for validation to pass
+SELECT 
+  o.order_id,
+  o.customer_id,
+  o.order_date,
+  'Missing customer record' as issue_type
+FROM orders o
+LEFT JOIN customers c ON o.customer_id = c.customer_id
+WHERE c.customer_id IS NULL;
+```
+
+### Example: Non-Blocking Anomaly Detection
+
+This example detects revenue anomalies but doesn't stop the pipeline:
+
+```sql linenums="1"
+MODEL (
+  name data_quality.revenue_anomalies,
+  kind AUDIT_ONLY (
+    blocking FALSE,        -- Log warnings but continue
+    max_failing_rows 100
+  ),
+  depends_on [daily_revenue],
+  cron '@hourly'
+);
+
+WITH stats AS (
+  SELECT 
+    AVG(revenue) as avg_revenue,
+    STDDEV(revenue) as stddev_revenue
+  FROM daily_revenue
+  WHERE revenue > 0
+)
+SELECT 
+  date,
+  revenue,
+  CASE 
+    WHEN revenue < 0 THEN 'Negative revenue'
+    WHEN revenue > avg_revenue + (5 * stddev_revenue) THEN 'Extreme outlier'
+  END as anomaly_type
+FROM daily_revenue
+CROSS JOIN stats
+WHERE revenue < 0 
+   OR revenue > avg_revenue + (5 * stddev_revenue);
+```
+
+### Behavior
+
+1. **No Materialization**: AUDIT_ONLY models never create or update tables
+2. **Validation Logic**: The model succeeds if the query returns 0 rows, fails otherwise
+3. **Error Reporting**: When validation fails, shows a sample of failing rows (up to `max_failing_rows`)
+4. **DAG Integration**: Fully participates in the model DAG with proper dependency tracking
+5. **Scheduling**: Can be scheduled independently using cron expressions
+
+### Best Practices
+
+- **Naming Convention**: Use descriptive names like `audit_*` or `validate_*` to clearly indicate the model's purpose
+- **Include Context**: Add columns that describe what validation failed for easier debugging
+- **Optimize Performance**: These queries run during every plan/apply, so ensure they're efficient
+- **Set Appropriate Blocking**: Use `blocking TRUE` for critical validations, `FALSE` for monitoring
+- **Document Purpose**: Use the `description` field to explain what the validation checks
+
+### Comparison with Traditional Audits
+
+While both AUDIT_ONLY models and traditional audits validate data, they serve different purposes:
+
+| Aspect | Traditional Audits | AUDIT_ONLY Models |
+|--------|-------------------|-------------------|
+| **Scope** | Single model | Multiple models |
+| **Location** | `audits/` directory or inline | `models/` directory |
+| **Dependencies** | Implicit via parent model | Explicit via `depends_on` |
+| **Scheduling** | With parent model | Independent cron |
+| **Use Case** | Validate model output | Validate cross-model relationships |
+
 ## SEED
 The `SEED` model kind is used to specify [seed models](./seed_models.md) for using static CSV datasets in your SQLMesh project.
 

diff --git a/docs/reference/model_configuration.md b/docs/reference/model_configuration.md
@@ -306,6 +306,17 @@ Configuration options for [`SCD_TYPE_2_BY_COLUMN` models](../concepts/models/mod
 
 Python model kind `name` enum value: [ModelKindName.SCD_TYPE_2_BY_COLUMN](https://sqlmesh.readthedocs.io/en/stable/_readthedocs/html/sqlmesh/core/model/kind.html#ModelKindName)
 
+### `AUDIT_ONLY` models
+
+Configuration options for [`AUDIT_ONLY` models](../concepts/models/model_kinds.md#audit_only) (in addition to [general model properties](#general-model-properties)).
+
+| Option              | Description                                                                                                                                                       | Type | Required |
+| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | :--: | :------: |
+| `blocking`          | If set to true, the pipeline will fail when the validation query returns any rows. If false, only warnings are logged. (Default: `True`)                        | bool |    N     |
+| `max_failing_rows`  | Maximum number of failing rows to display in error messages when a validation fails. (Default: `10`)                                                             | int  |    N     |
+
+Python model kind `name` enum value: [ModelKindName.AUDIT_ONLY](https://sqlmesh.readthedocs.io/en/stable/_readthedocs/html/sqlmesh/core/model/kind.html#ModelKindName)
+
 ### `SEED` models
 
 Configuration options for [`SEED` models](../concepts/models/model_kinds.md#seed). `SEED` models do not support all the general properties supported by other models; they only support the properties listed in this table.

diff --git a/examples/sushi/models/audit_duplicate_orders.sql b/examples/sushi/models/audit_duplicate_orders.sql
@@ -0,0 +1,43 @@
+MODEL (
+    name sushi.audit_duplicate_orders,
+    kind AUDIT_ONLY (
+        blocking FALSE,
+        max_failing_rows 100
+    ),
+    depends_on [sushi.orders],
+    cron '@hourly',
+    owner 'data_engineering',
+    tags ['validation', 'duplicates', 'data_quality'],
+    description 'Detects potential duplicate orders based on customer, waiter, and timing'
+);
+
+-- Find potential duplicate orders
+-- Orders from the same customer to the same waiter within 5 minutes might be duplicates
+WITH potential_duplicates AS (
+    SELECT 
+        o1.id as order_id_1,
+        o2.id as order_id_2,
+        o1.customer_id,
+        o1.waiter_id,
+        o1.start_ts as order_1_time,
+        o2.start_ts as order_2_time,
+        ABS(o1.start_ts - o2.start_ts) as seconds_apart
+    FROM sushi.orders o1
+    INNER JOIN sushi.orders o2
+        ON o1.customer_id = o2.customer_id
+        AND o1.waiter_id = o2.waiter_id
+        AND o1.id < o2.id  -- Avoid comparing order with itself and duplicating pairs
+        AND o1.event_date = o2.event_date  -- Same day
+    WHERE ABS(o1.start_ts - o2.start_ts) <= 300  -- Within 5 minutes (300 seconds)
+)
+SELECT 
+    order_id_1,
+    order_id_2,
+    customer_id,
+    waiter_id,
+    seconds_apart,
+    CONCAT('Orders ', order_id_1::TEXT, ' and ', order_id_2::TEXT, 
+           ' from customer ', customer_id::TEXT, 
+           ' are only ', seconds_apart::TEXT, ' seconds apart') as issue_description
+FROM potential_duplicates
+ORDER BY seconds_apart, order_id_1
diff --git a/examples/sushi/models/audit_order_integrity.sql b/examples/sushi/models/audit_order_integrity.sql
@@ -0,0 +1,25 @@
+MODEL (
+    name sushi.audit_order_integrity,
+    kind AUDIT_ONLY (
+        blocking FALSE,  -- Set to non-blocking for example/demo purposes
+        max_failing_rows 20
+    ),
+    depends_on [sushi.orders, sushi.customers],
+    cron '@daily',
+    owner 'data_quality_team',
+    tags ['validation', 'referential_integrity', 'critical'],
+    description 'Validates referential integrity between orders and customers tables'
+);
+
+-- Check for orders with non-existent customer IDs
+-- This should return no rows if all orders have valid customers
+SELECT 
+    o.id as order_id,
+    o.customer_id,
+    o.event_date,
+    'Missing customer record' as issue_type,
+    CONCAT('Order ', o.id::TEXT, ' references non-existent customer ', o.customer_id::TEXT) as issue_description
+FROM sushi.orders o
+LEFT JOIN sushi.customers c 
+    ON o.customer_id = c.customer_id
+WHERE c.customer_id IS NULL
diff --git a/examples/sushi/models/audit_waiter_revenue_anomalies.sql b/examples/sushi/models/audit_waiter_revenue_anomalies.sql
@@ -0,0 +1,47 @@
+MODEL (
+    name sushi.audit_waiter_revenue_anomalies,
+    kind AUDIT_ONLY (
+        blocking FALSE,
+        max_failing_rows 50
+    ),
+    depends_on [sushi.waiter_revenue_by_day],
+    cron '@daily',
+    owner 'analytics_team',
+    tags ['validation', 'revenue', 'daily'],
+    description 'Detects anomalies in daily waiter revenue that may indicate data quality issues'
+);
+
+-- Detect anomalies in waiter daily revenue
+-- Only flag extreme outliers (>5 std dev) or negative revenue
+WITH revenue_stats AS (
+    SELECT 
+        AVG(revenue) as avg_revenue,
+        STDDEV(revenue) as stddev_revenue
+    FROM sushi.waiter_revenue_by_day
+    WHERE revenue > 0  -- Exclude zeros from stats calculation
+),
+anomalies AS (
+    SELECT 
+        w.waiter_id,
+        w.event_date,
+        w.revenue,
+        r.avg_revenue,
+        r.stddev_revenue,
+        CASE 
+            WHEN w.revenue < 0 THEN 'Negative revenue'
+            WHEN w.revenue > r.avg_revenue + (5 * r.stddev_revenue) THEN 'Extremely high revenue (>5 std dev)'
+        END as anomaly_type
+    FROM sushi.waiter_revenue_by_day w
+    CROSS JOIN revenue_stats r
+    WHERE 
+        w.revenue < 0 
+        OR w.revenue > r.avg_revenue + (5 * r.stddev_revenue)  -- Only flag extreme outliers
+)
+SELECT 
+    waiter_id,
+    event_date,
+    revenue,
+    anomaly_type,
+    CONCAT('Waiter ', waiter_id::TEXT, ' has ', anomaly_type, ' on ', event_date::TEXT) as issue_description
+FROM anomalies
+ORDER BY event_date DESC, waiter_id
diff --git a/sqlmesh/core/dialect.py b/sqlmesh/core/dialect.py
@@ -621,6 +621,7 @@ def parse(self: Parser) -> t.Optional[exp.Expression]:
                         ModelKindName.SCD_TYPE_2_BY_TIME,
                         ModelKindName.SCD_TYPE_2_BY_COLUMN,
                         ModelKindName.CUSTOM,
+                        ModelKindName.AUDIT_ONLY,
                     ) and self._match(TokenType.L_PAREN, advance=False):
                         props = self._parse_wrapped_csv(functools.partial(_parse_props, self))
                     else: