How it works
- Write a Python function under
loaders/decorated with@loader - Declare a managed source in
sources/*.ymlwithmanaged: trueand the same name as the loader function - SQLBuild calls the function, writes returned rows to a staging table, then applies the configured write strategy to the target
sqb build runs, managed sources are loaded before any dependent model is materialized.
Defining a loader
Place Python files underloaders/ in your project directory. Each file can contain one or more loader functions:
LoaderContext and returns rows as a list of dicts, an iterator of dicts, or None for self-managed loaders.
Binding to a source
Declare a managed source insources/*.yml. A managed source is bound to the loader function with the same name - there is no separate loader field:
managed: true makes this a managed source - SQLBuild owns both the loading and the schema. The binding is by name: the source raw_customers is populated by the @loader function named raw_customers. SQLBuild raises an error if a managed source has no loader function of the same name.
Models reference managed sources the same way as any other source:
Write strategies
Thewrite_strategy field controls how returned rows are written to the target table.
table
Full replace. The target is dropped and recreated from the loader output on every run.append
Insert all returned rows into the target. No deduplication.delete_insert
Delete rows in the cursor range, then insert replacements. Requirescursor_column.
ctx.current_cursor_value with the current MAX(cursor_column) from the target, so it can fetch only new or updated data. Its function name matches the source name (raw_order_events):
merge
Upsert based onunique_key. Requires both unique_key and cursor_column.
Self-managed loaders
If a loader returnsNone, SQLBuild skips its row-writing pipeline. The loader is responsible for writing data to the target itself, using whatever approach makes sense - ctx.execute_sql(), an external library, a subprocess, or anything else:
write_strategy:
write_strategy. They are useful when you want to use adapter-specific SQL (e.g. COPY INTO, external tables), call an external ingestion tool like dlt, or handle writes in a way that doesn’t fit the dict-return pattern.
Loader context
Every loader function receives aLoaderContext as its first argument. It provides access to the destination relation, cursor state, active target, and helper methods.
Properties
| Property | Type | Description |
|---|---|---|
destination | str | Fully-qualified destination relation name (where rows are written) |
destination_database | str | None | Destination database |
destination_schema | str | None | Destination schema |
destination_name | str | Unqualified destination table name |
current_cursor_value | object | None | Current MAX(cursor_column) from the destination, or None if the table does not exist or has no cursor column |
run_id | str | Unique identifier for this execution run |
target | str | None | Active target name (e.g. dev, prod) |
vars | dict | Project variables (merged from project, target, and local config) |
is_reload | bool | True when --reload was passed |
start_cursor_ts | datetime | None | Timestamp cursor start override from --start-cursor-ts |
end_cursor_ts | datetime | None | Timestamp cursor end override from --end-cursor-ts |
start_cursor_int | int | None | Integer cursor start override from --start-cursor-int |
end_cursor_int | int | None | Integer cursor end override from --end-cursor-int |
adapter | BaseAdapter | The database adapter instance |
connection | object | The active database connection |
logger | Logger | Python logger scoped to the loader |
Methods
| Method | Description |
|---|---|
execute_sql(sql) | Execute a SQL statement against the connection |
query(sql) | Execute a SQL query and return the cursor |
log(message) | Log a message to the execution lifecycle output |
qualify_name(name) | Return a fully-qualified relation name in the destination database/schema |
skip(reason, mode=...) | Skip this loader. mode is "soft" (default, skip only this loader) or "hard" (also block dependents) |
result(payload=, metadata=, materialized=) | Return a structured result for a self-managed loader |
result_of(node_fn) | Read the latest persisted result of an upstream node (current or previous run) |
results_of(node_fn, limit=N) | Read the last N successful results of an upstream node, newest first |
loader(loader_fn) | Return a LoaderRelationRef for an upstream loader dependency |
source(source_name) | Return a LoaderRelationRef for a project source by YAML name |
LoaderRelationRef
Returned byctx.loader() and ctx.source(). Provides access to an upstream relation:
| Property / Method | Description |
|---|---|
destination | Fully-qualified relation name |
current_cursor_value | Current MAX(cursor_column) from the relation |
max(column) | Return the MAX of any column from the relation |
Loader dependencies
Loaders can depend on other loaders usingdepends_on. Dependencies are executed first, and their destination relations are available via ctx.loader():
--concurrency is set.
Intermediate loaders (those referenced only via depends_on, with no managed source of the same name) are given synthetic source entries and write to __loader__<name> tables by default. Only the terminal loader - the one whose name matches a managed source - populates that source; intermediate loaders feed it. Use the destination parameter on the decorator to override the intermediate relation:
Decorator parameters
The@loader decorator accepts optional parameters that can also be set in the source YAML. When both are specified, the YAML takes precedence.
| Parameter | Description |
|---|---|
depends_on | List of loader functions this loader depends on |
destination | Override the destination relation name (can include schema or database) |
write_strategy | table, append, delete_insert, or merge |
cursor_column | Column used for incremental cursor tracking |
unique_key | Column(s) used as the merge key (string or list of strings) |
columns | Column specifications with name, type, nullable, and description |
contract | enforced or none |
Auto-load during builds
By default,sqb build automatically loads managed sources before building dependent models. This is controlled by the auto_load_sources setting:
--reload is passed, ctx.is_reload is True in the loader function. This lets loaders implement different behavior for full reloads versus normal incremental loads.
Source deferral
When using multiple targets, loaders write data into the active target. But models may need to read source data from a different target (e.g. reading production data while developing in dev). Thedefer_sources_to field controls this:
dev target read managed source data from prod schema, even though sqb load writes to dev. This prevents accidentally reading empty or partial source tables during development.
If a target uses managed sources but does not declare defer_sources_to, SQLBuild raises an error rather than guessing.
Schema evolution
When a loader returns rows with columns not present in the existing target table, SQLBuild detects the schema change and adds the new columns automatically. Type mismatches between the staging table and the existing target raise an error.Project structure
.py files under loaders/ recursively (excluding __init__.py and files starting with _). Each file is scanned for functions decorated with @loader.
Config reference
Source YAML fields for managed sources
| Field | Description |
|---|---|
managed | Set to true to bind the source to the @loader function of the same name |
write_strategy | table, append, delete_insert, or merge (requires managed: true) |
cursor_column | Column for incremental cursor tracking (required for delete_insert and merge) |
unique_key | Merge key column(s) (required for merge) |
columns | Column declarations with types |
contract | enforced or none |
Validation rules
appendcannot haveunique_keymergerequiresunique_keytablecannot havecursor_columnorunique_keydelete_insertrequirescursor_columnand cannot haveunique_keycursor_columnrequires one ofappend,delete_insert, ormerge

