Skip to main content
Sources declare external data that your models depend on. They are defined in YAML files under sources/ in your project directory.

Table sources

Point at an existing table or view in your warehouse:
sources:
  - name: raw_events
    database: analytics
    schema: raw
    table: events
Reference it in models with __source("raw_events").

Expression sources

Define source data inline as a SQL expression. No external tables or setup scripts needed:
sources:
  - name: raw__customers
    expression: |
      SELECT * FROM (VALUES
        (1, 'Leslie', 'Knope', 'leslie@pawnee.gov', TIMESTAMP '2026-01-15 09:00:00'),
        (2, 'Ron', 'Swanson', 'ron@pawnee.gov', TIMESTAMP '2026-02-01 08:00:00')
      ) AS raw__customers(id, first_name, last_name, email, created_at)
Expression sources are resolved at compile time. They’re the escape hatch for anything the framework doesn’t natively model: external tables, warehouse-specific syntax, function calls, or any other relation type that doesn’t fit a standard table reference.

Source audits

Sources support the same audit system as models. Audits attached to sources run before any dependent model is built:
sources:
  - name: raw_orders
    columns:
      - name: id
        audits:
          - not_null
          - unique
    audits:
      - expression_is_true:
          name: no future orders
          expression: "ordered_at <= CURRENT_TIMESTAMP"
If a source audit with error severity fails, all downstream models that depend on that source are blocked.

Type enforcement

Type enforcement is implicit for sources, the same as for models. If any column declares a type, SQLBuild automatically casts that column and uses declared types for schema-change detection:
sources:
  - name: raw__customers
    expression: |
      SELECT 1 AS id, 'Leslie' AS first_name, 'Knope' AS last_name
    columns:
      - name: id
        type: INTEGER
      - name: first_name
      - name: last_name
In this example, only id has a type declared, so only id is cast. Columns without a type are passed through unchanged. For expression sources, SQLBuild probes the expression’s output columns and builds a projection that casts typed columns while preserving the rest. For table sources, it uses warehouse metadata to validate that declared column names exist and applies casts accordingly. You can explicitly set type_enforcement: false on a source to disable casting even when column types are declared.

Managed sources (loaders)

Sources can be loaded by Python functions instead of pointing at existing tables or inline expressions. Set managed: true to bind a source to the @loader function of the same name, and SQLBuild will call it to populate the source table:
sources:
  - name: raw_customers
    managed: true
    write_strategy: table
    columns:
      - name: id
        type: INTEGER
      - name: name
        type: VARCHAR
Managed sources support incremental write strategies (table, append, delete_insert, merge), cursor-based loading, and concurrent execution. See Loaders for the full guide on writing loader functions, write strategies, the loader context API, and auto-load behavior during builds.

Source freshness

Source freshness lets SQLBuild observe whether a source’s data has changed between runs. This feeds into planning and change detection: models downstream of unchanged sources are skipped automatically. Configure freshness per source with a freshness: block:
sources:
  - name: raw_events
    schema: raw
    table: events
    freshness:
      strategy: column
      column: updated_at
      type: timestamp
      lag_tolerance: 15m

Strategies

StrategyDescriptionRequired fields
adapterUses warehouse metadata (e.g. Snowflake LAST_ALTERED). No query against the source table.None
columnReads MAX(column) from the source table.column, type
sqlRuns a custom query that returns a single scalar value.query, type

adapter

freshness:
  strategy: adapter
Uses adapter-level table metadata. Supported on Snowflake, BigQuery, Databricks, PostgreSQL, DuckDB, and SQL Server. Does not support type, column, or query.

column

freshness:
  strategy: column
  column: updated_at
  type: timestamp
Queries MAX(column) from the source table. The column must be a plain column name (no expressions; use sql strategy for those). Requires type.

sql

freshness:
  strategy: sql
  query: "SELECT MAX(version_id) FROM raw.events"
  type: integer
Runs an arbitrary SQL query that returns a single scalar. Requires type. Does not support column.

Type

The type field declares the value kind for comparison:
TypeDescription
timestampDatetime value. Supports lag_tolerance.
integerInteger value. Change detected by exact comparison.
stringString value. Change detected by exact comparison.

Lag tolerance

lag_tolerance is optional and only valid with type: timestamp. It declares how much the observed timestamp can drift from the previous observation before being treated as a change:
freshness:
  strategy: column
  column: updated_at
  type: timestamp
  lag_tolerance: 2h
Accepts positive durations: 15m (minutes), 2h (hours), 1d (days). If the current observation is within the tolerance of the previous one, the source is treated as unchanged.

Auto-observation

Sources without an explicit freshness: block are auto-observed using the adapter strategy if:
  • The source has a physical table (not an expression source)
  • The source is not managed
  • The adapter supports table freshness metadata
This means most unmanaged table sources get freshness tracking automatically on adapters that support it, with no configuration needed. Use sqb freshness to observe source freshness on demand without triggering a build. See Planning and Change Detection for how freshness feeds into change-aware builds.

Freshness config reference

FieldDescription
strategyObservation strategy: adapter, column, or sql
typeValue kind: timestamp, integer, or string
columnColumn name for column strategy
querySQL query for sql strategy
lag_toleranceDuration tolerance for timestamp comparisons (e.g. 15m, 2h, 1d). Only valid with type: timestamp.

Config reference

FieldDescription
nameSource name, used in __source("name") references
databaseTarget database (optional)
schemaTarget schema (optional)
tableTarget table name (defaults to name if omitted)
expressionInline SQL expression (alternative to table reference)
managedSet to true to bind the source to the @loader function of the same name (see Loaders)
write_strategyHow the loader writes data: table, append, delete_insert, or merge (requires managed: true)
cursor_columnColumn for incremental cursor tracking (required for delete_insert and merge)
unique_keyMerge key column(s) (required for merge)
freshnessSource freshness observation config (see Source freshness)
descriptionHuman-readable description
type_enforcementOverride implicit type enforcement (true/false). Defaults to true when any column declares a type.
contractenforced or none. When enforced, downstream models validate configured column references against source columns.
columnsColumn declarations with optional types and audits
auditsSource-level audits