Skip to main content
A model is a SQL file that defines one transformation step. Each model produces one table or view in the warehouse.

MODEL() header

Every model file starts with a MODEL() block that declares its materialization, configuration, and schema metadata:
MODEL (
  materialized table,
  tags [marts],
  description "Order fact table with waffle and payment details.",
  columns (
    order_id (audits [not_null]),
  ),
);

SELECT
  o.order_id,
  o.customer_id,
  w.waffle_name,
  w.price_cents * o.quantity AS line_total_cents
FROM __ref("stg_orders") o
LEFT JOIN __seed("waffle_types") w ON o.waffle_type_id = w.waffle_type_id

Materialization types

view

Creates a database view. Rebuilt on every run.
MODEL (
  materialized view,
  tags [staging],
);

SELECT id AS order_id, customer_id, status
FROM __source("raw__orders")

table

Creates a table via CREATE TABLE AS. SQLBuild materializes into a staging table first, runs audits, then promotes to the target. Fully rebuilt each time.
MODEL (
  materialized table,
  tags [marts],
);

SELECT customer_id, COUNT(*) AS total_orders
FROM __ref("stg_orders")
GROUP BY customer_id

incremental

Inserts or updates into an existing table using a cursor-based strategy. See Incremental for full configuration.
MODEL (
  materialized incremental,
  incremental_strategy delete_insert,
  cursor activity_hour,
  cursor_type timestamp,
  cursor_grain hour,
  cursor_inputs (
    fact_orders ordered_at,
  ),
  incremental_mode microbatch,
  batch_size 1d,
  tags [marts],
);

SELECT
  DATE_TRUNC('hour', o.ordered_at) AS activity_hour,
  COUNT(*) AS orders_placed,
  SUM(o.quantity) AS waffles_ordered
FROM __ref("fact_orders") o
GROUP BY DATE_TRUNC('hour', o.ordered_at)

snapshot

Maintains historical row versions with SCD Type 2 semantics. Supports timestamp-based and value-check-based change detection, historical source inputs, hard delete invalidation, and configurable full-refresh safety policies.
MODEL (
  materialized snapshot,
  unique_key [customer_id],
  snapshot_strategy timestamp,
  updated_at updated_at,
);

SELECT
  customer_id,
  name,
  plan,
  status,
  updated_at
FROM __source("customers")
See Snapshots for full configuration, historical input modes, and querying patterns.

custom

User-defined Python materialization function. Custom materializations get full access to the framework including adapter, schema change signals, query change detection, and audit hooks.
MODEL (
  materialized partition_tracked,
  tags [marts],
  placeholders (
    partition_start "'2026-04-01'",
    partition_end "'2026-04-05'",
  ),
  config (
    tracking_table partition_state,
    partition_column order_date,
    date_range_start 2026-04-01,
    date_range_end 2026-04-05,
  ),
  description "Partition-tracked daily order summary using custom materialization.",
  columns (
    order_date (audits [not_null]),
  ),
  audits [
    expression_is_true (
      name "waffles ordered is positive",
      expression "waffles_ordered > 0",
    ),
  ],
);

SELECT
  CAST(o.ordered_at AS DATE) AS order_date,
  COUNT(DISTINCT o.order_id) AS order_count,
  SUM(o.quantity) AS waffles_ordered,
  COUNT(DISTINCT o.customer_id) AS unique_customers
FROM __ref("stg_orders") o
WHERE CAST(o.ordered_at AS DATE) >= CAST(@@@partition_start AS DATE)
  AND CAST(o.ordered_at AS DATE) < CAST(@@@partition_end AS DATE)
GROUP BY CAST(o.ordered_at AS DATE)
Custom materializations use @@@placeholder syntax for values substituted at runtime. These deferred placeholders are preserved through compilation and resolved by the materialization at execution time. The config block passes arbitrary key-value pairs to the Python materialize() function via ctx.config.

References

Models use typed reference calls that SQLBuild resolves to qualified warehouse relation names during compilation:
ReferenceSyntaxResolves to
Model__ref("name")Another model
Seed__seed("name")A seed CSV table
Source__source("name")An external source
Scalar UDF__udf("name")A user-defined function
SELECT
  o.order_id,
  o.customer_id,
  w.waffle_name,
  w.price_cents * o.quantity AS line_total_cents,
  __udf("udf__is_completed_order")(o.status) AS is_completed
FROM __ref("stg_orders") o
LEFT JOIN __seed("waffle_types") w ON o.waffle_type_id = w.waffle_type_id
Seeds use __seed(), not __ref(). Using __ref() with a seed name raises a compile error with a helpful message pointing you to __seed(). See Functions for UDF and table function details.

DAG ordering

SQLBuild automatically discovers the dependency graph from reference calls, then executes models in topological order. Upstream models are always built before their downstream dependents.

Schema declarations

Model metadata - description, columns, audits, and type information - lives directly in the MODEL() header. There is no separate schema.yml for models.
MODEL (
  materialized view,
  tags [staging],
  description "Cleaned order records.",
  columns (
    order_id (audits [not_null, unique]),
    customer_id (audits [not_null]),
    status (
      audits [
        accepted_values (values ["placed", "preparing", "ready", "completed", "cancelled"]),
      ],
    ),
  ),
);

SELECT
  id AS order_id,
  customer_id,
  status
FROM __source("raw__orders")

Column-level audits

Attach audits to individual columns inside the columns block. Simple audits like not_null and unique are listed by name. Parameterized audits like accepted_values pass arguments inline:
columns (
  order_id (audits [not_null, unique]),
  status (
    audits [
      accepted_values (values ["placed", "preparing", "completed", "cancelled"]),
    ],
  ),
),

Model-level audits

Attach audits to the model itself for multi-column or expression-based checks:
MODEL (
  materialized table,
  audits [
    expression_is_true (
      name "revenue is non-negative",
      expression "total_revenue_cents >= 0",
    ),
  ],
);

Type enforcement

Type enforcement is implicit. If any column in the MODEL() header declares a type, type enforcement is automatically enabled for that model:
MODEL (
  materialized table,
  columns (
    order_id (type INTEGER, audits [not_null]),
    amount_cents (type INTEGER),
  ),
);
When enabled, SQLBuild casts columns to declared types and uses them for schema-change detection. There is no need to set type_enforcement: true explicitly.

Contracts

Contracts enforce that a model’s output matches its declared column schema exactly - column names, column count, and column types. When contract enforced is set, the declared columns become the authoritative output contract.
MODEL (
  materialized table,
  contract enforced,
  columns (
    order_id (type INTEGER, audits [not_null]),
    customer_id (type INTEGER, audits [not_null]),
    amount_cents (type INTEGER),
    status (type VARCHAR),
  ),
);
Contract enforcement happens at two levels: Compile time - config fields that reference columns (unique_key, cursor, updated_at, check_columns) are validated against the declared column names. If a referenced column is not in the contract, compilation fails. Runtime - after materialization into the staging table, SQLBuild inspects the actual output columns and validates them against the contract before promotion:
  • Missing declared columns fail with code K010
  • Extra undeclared columns fail with code K011
  • Type mismatches (e.g. VARCHAR where INTEGER was declared) fail with code K013
If any validation fails, the production table is untouched. Types are compared using adapter-aware normalization, so equivalent types across dialects are handled correctly. Contract values:
ValueBehavior
enforcedDeclared columns are the complete, authoritative output schema
noneNo contract enforcement (default)
Contracts interact with schema change policies. For snapshot models, snapshot_schema_change append_new_columns is incompatible with contract enforced because appending columns would violate the contract.

Audit run scope

Audits on incremental models can specify run_scope to control when they execute:
MODEL (
  materialized incremental,
  incremental_strategy delete_insert,
  cursor activity_hour,
  cursor_type timestamp,
  cursor_grain hour,
  columns (
    activity_hour (audits [not_null (run_scope delta_and_final)]),
  ),
  audits [
    expression_is_true (
      name "orders placed is non-negative",
      expression "orders_placed >= 0",
      run_scope delta_and_final,
    ),
  ],
);
delta_and_final runs the audit against each delta batch before DML and again against the target after all batches complete. See Audits for details.

Hooks

Pre-hooks and post-hooks run before and after materialization. Each entry is either a sql("...") hook that executes SQL, or a python("hook_name") hook that calls a Python function from the hooks/ directory.

SQL hooks

MODEL (
  materialized table,
  post_hooks [sql('GRANT SELECT ON @@CTX:destination.qualified TO analyst_role')],
);
SQL hooks support macro expansion (@macro()), project variables (@@name), environment variables (@@ENV:NAME), and context variables (@@CTX:). SQL is validated at compile time when SQL analysis is enabled. Available context variables in hooks:
VariableValue
@@CTX:destination.qualifiedFully qualified destination relation name
@@CTX:destination.schemaDestination schema
@@CTX:destination.tableDestination relation name
@@CTX:model.nameModel name
@@CTX:run.targetActive target name
@@CTX:run.idCurrent run ID

Python hooks

Python hooks call @hook-decorated functions discovered from the hooks/ directory:
# hooks/permissions.py
from sqlbuild.hooks import hook

@hook
def grant_analyst(ctx):
    ctx.execute_sql(f"GRANT SELECT ON {ctx.destination.qualified} TO analyst_role")
Reference a Python hook in the MODEL() header by name, with optional keyword arguments:
MODEL (
  materialized table,
  post_hooks [python("grant_analyst")],
);
MODEL (
  materialized table,
  post_hooks [python("grant_analyst", role: "reader_role")],
);
You can mix SQL and Python hooks in the same list:
MODEL (
  materialized table,
  pre_hooks [sql('SET search_path TO analytics')],
  post_hooks [
    python("grant_analyst"),
    sql('ANALYZE @@CTX:destination.qualified'),
  ],
);

Hook context

Python hooks receive a HookContext as their first parameter (named ctx, context, or hook_context):
FieldDescription
ctx.model_nameName of the model being built
ctx.phasepre_hooks or post_hooks
ctx.hook_nameName of the hook being invoked
ctx.run_idCurrent run ID
ctx.targetActive target name
ctx.varsProject variables
ctx.destination.qualifiedFully qualified destination relation name
ctx.destination.schemaDestination schema
ctx.destination.nameDestination relation name
ctx.destination.databaseDestination database
ctx.adapterAdapter instance
ctx.connectionLive connection
ctx.execute_sql(sql)Execute SQL on the connection
ctx.query(sql)Execute SQL and return rows
ctx.log(message)Log to the run output
ctx.skip(reason, mode=...)Skip the model’s materialization. mode accepts "soft" (default) or "hard" (blocks downstream models).
ctx.providersAccess discovered providers by name
Pre-hooks can return ctx.skip(...) to skip the model’s materialization entirely. A soft skip skips only this model; a hard skip also blocks downstream models. Providers can also be injected directly as hook function parameters by name. See Providers.

Hook decorator

The @hook decorator accepts optional metadata:
from sqlbuild.hooks import hook

@hook
def grant_analyst(ctx):
    """Grant analyst role on the destination table."""
    ctx.execute_sql(f"GRANT SELECT ON {ctx.destination.qualified} TO analyst_role")

@hook(name="custom_name", description="Custom hook with explicit name")
def my_hook(ctx, role="analyst_role"):
    ctx.execute_sql(f"GRANT SELECT ON {ctx.destination.qualified} TO {role}")
ArgumentDescription
nameOverride the hook name (defaults to the function name)
descriptionHuman-readable description (defaults to the function docstring)

Discovery rules

  • Hook functions are discovered from .py files under hooks/ recursively
  • Files named __init__.py or starting with _ are skipped
  • Each function decorated with @hook is registered by name
  • Hook names must be unique across all hook files
  • Python hook references in MODEL() headers are validated at compile time: unknown names, unknown kwargs, and missing required parameters all raise compile errors

Validation

At compile time, SQLBuild validates every python("hook_name") reference:
  • The hook name must match a discovered @hook function
  • Any keyword arguments passed in the MODEL() header must match parameters on the function signature
  • If the function does not accept **kwargs, unknown arguments raise a compile error

Config reference

Common config

FieldDescription
materializedview, table, incremental, or a custom materialization name
tagsList of tags for selector filtering
descriptionHuman-readable description of the model
columnsColumn declarations with optional types, audits, and descriptions
auditsModel-level audit instances
schemaOverride target schema
databaseOverride target database
aliasOverride target relation name
pre_hooksLifecycle hooks to run before materialization: sql("...") and/or python("hook_name") entries
post_hooksLifecycle hooks to run after materialization: sql("...") and/or python("hook_name") entries
enabledSet to false to skip the model
contractenforced or none. When enforced, declared columns are the authoritative output schema.

Incremental config

FieldDescription
incremental_strategyappend, delete_insert, or merge
cursorOutput column used to track incremental position
cursor_typetimestamp or integer
cursor_grainTime grain for timestamp cursors: second, minute, hour, day, month, year
cursor_startLower bound floor for the cursor
cursor_inputsMap of upstream ref/source names to their cursor columns
unique_keyColumn(s) used for merge and delete_insert matching
incremental_modeSet to microbatch to enable batched execution
batch_sizeBatch window size (e.g. 1d, 1h, or an integer)
lookbackExtend the replay window backwards to re-process recent data
on_schema_changeappend_new_columns, sync_all_columns, ignore, or fail
replay_on_changeforward (default), full, or bounded-<duration> (e.g. bounded-14d)
run_despite_unchangedForce periodic rebuilds: always or a duration (e.g. 24h, 30d). Table materializations only.
See Incremental for detailed usage.

Custom materialization config

FieldDescription
configArbitrary key-value pairs passed to ctx.config in the Python function
placeholdersDefault values for @@@placeholder tokens in the SQL

Diff config

FieldDescription
row_diff_exclude_columnsColumns to exclude from row-level diff comparisons
row_diff_tolerancesTolerance rules for numeric diff comparisons