ingestr is an open-source CLI tool that copies data between supported sources and destinations with a single command. SQLBuild integrates with ingestr as a declarative source loader: you configure the ingestion directly in your source YAML, and SQLBuild handles execution as part of the build lifecycle.

Install

pip install 'sqlbuild[ingestr]'
# or
uv pip install 'sqlbuild[ingestr]'
This installs ingestr alongside SQLBuild. The ingestr CLI must be available on PATH.
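
Because the CLI has to be discoverable on PATH, a quick check after installation can save a confusing failure later. The snippet below is a minimal sketch using only the Python standard library; the helper name `ingestr_available` is hypothetical and not part of SQLBuild.

```python
import shutil


def ingestr_available(executable: str = "ingestr") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ingestr_available():
        print("ingestr found at", shutil.which("ingestr"))
    else:
        print("ingestr not found on PATH; check the 'sqlbuild[ingestr]' install")
```

Run it inside the same virtual environment you use for SQLBuild, since PATH usually differs between environments.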

How it works

  1. Declare an ingestr block on a source in sources/*.yml
  2. SQLBuild generates a synthetic loader that calls ingestr ingest as a subprocess
  3. ingestr reads from the configured source and writes directly to the SQLBuild adapter’s database
  4. The destination URI is built automatically from your SQLBuild connection config, so no manual destination setup is needed

No Python loader code is needed. The YAML declaration is the entire configuration.
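
The steps above can be pictured as a small translation from the YAML block to an argument list for `ingestr ingest`. The sketch below is an assumption about what such a loader could look like, not SQLBuild's actual implementation: the function name, the field-to-flag mapping, and the destination table name `raw.orders` are all illustrative. The flags themselves (`--source-uri`, `--dest-uri`, `--incremental-strategy`, and so on) are ones the ingestr CLI accepts.

```python
import shlex


def build_ingestr_command(block: dict, dest_uri: str, dest_table: str) -> list[str]:
    """Translate an ingestr YAML block into an argv list (hypothetical mapping)."""
    cmd = [
        "ingestr", "ingest",
        "--source-uri", block["source_uri"],
        "--source-table", block["source_table"],
        "--dest-uri", dest_uri,
        "--dest-table", dest_table,
        "--yes",  # skip ingestr's interactive confirmation prompt
    ]
    if "strategy" in block:
        cmd += ["--incremental-strategy", block["strategy"]]
    if "incremental_key" in block:
        cmd += ["--incremental-key", block["incremental_key"]]
    if "primary_key" in block:
        keys = block["primary_key"]
        # primary_key may be a single string or a list of strings
        keys = [keys] if isinstance(keys, str) else list(keys)
        for key in keys:
            cmd += ["--primary-key", key]
    # extra_args are appended verbatim
    cmd += list(block.get("extra_args", []))
    return cmd


example_block = {
    "source_uri": "postgresql://user:pass@host:5432/mydb",
    "source_table": "public.orders",
}
# Print the command instead of running it; a loader would hand the list
# to subprocess.run(cmd, check=True).
print(shlex.join(build_ingestr_command(example_block, "duckdb:///warehouse.duckdb", "raw.orders")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a URI contains characters such as `&` or `?`.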

Example: PostgreSQL to DuckDB

Replicate a table from PostgreSQL into your local DuckDB project:
# sources/raw.yml
sources:
  - name: raw_orders
    table: orders
    ingestr:
      source_uri: "postgresql://user:pass@host:5432/mydb"
      source_table: "public.orders"
That’s it. Run sqb load or sqb build and ingestr copies the orders table into your project.

Example: Stripe with incremental merge

Load Stripe charges with incremental merge on a primary key:
sources:
  - name: raw_stripe_charges
    table: charges
    ingestr:
      source_uri: "stripe://?api_key=${stripe_api_key}"
      source_table: "charges"
      strategy: merge
      primary_key: id
      incremental_key: created
On subsequent runs, ingestr merges new and updated records based on the id column, using created to detect changes.
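
To make the merge behavior concrete, here is a conceptual in-memory model. It is only a sketch: ingestr performs the merge in the destination database, and the high-water-mark rule used here (consider rows whose incremental key is at or above the largest value already loaded) is an assumption made for illustration. The sample rows are invented.

```python
def merge_rows(dest, incoming, primary_key="id", incremental_key="created"):
    """Upsert incoming rows that are at or past the destination's high-water mark."""
    # Largest incremental-key value already present in the destination
    high_water = max((row[incremental_key] for row in dest), default=None)
    by_key = {row[primary_key]: row for row in dest}
    for row in incoming:
        if high_water is None or row[incremental_key] >= high_water:
            by_key[row[primary_key]] = row  # insert new key or overwrite existing
    return list(by_key.values())


dest = [
    {"id": "ch_1", "created": 100, "amount": 500},
    {"id": "ch_2", "created": 200, "amount": 700},
]
incoming = [
    {"id": "ch_0", "created": 50, "amount": 1},     # older than high-water mark: skipped
    {"id": "ch_2", "created": 200, "amount": 750},  # existing key: updated
    {"id": "ch_3", "created": 300, "amount": 900},  # new key: inserted
]
merged = merge_rows(dest, incoming)
print([row["id"] for row in merged])
```

In this model a row is matched on `id`, so re-running the same load does not create duplicates.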

Configuration reference

The ingestr block on a source supports the following fields:
| Field | Required | Description |
| --- | --- | --- |
| source_uri | Yes | ingestr source connection URI (e.g. postgresql://..., stripe://..., shopify://...) |
| source_table | Yes | Source table or resource name to ingest |
| strategy | No | Ingestion strategy: replace, append, merge, delete+insert, or truncate+insert |
| incremental_key | No | Column used for incremental change detection |
| primary_key | No | Primary key column(s) for merge strategy (string or list) |
| columns | No | Comma-separated column list to select from the source |
| extra_args | No | Additional CLI arguments passed to ingestr ingest (list of strings) |
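
The field rules in the table can be expressed as a small validator. This is a hypothetical helper for checking a block before committing it, assuming the block has already been parsed from YAML into a dict; SQLBuild's own validation may differ in both rules and messages.

```python
VALID_STRATEGIES = {"replace", "append", "merge", "delete+insert", "truncate+insert"}
REQUIRED_FIELDS = ("source_uri", "source_table")


def _is_str_list(value) -> bool:
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def validate_ingestr_block(block: dict) -> list[str]:
    """Return a list of problems found in an ingestr block (empty if none)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not block.get(field):
            errors.append(f"missing required field: {field}")
    strategy = block.get("strategy")
    if strategy is not None and strategy not in VALID_STRATEGIES:
        errors.append(f"unknown strategy: {strategy}")
    primary_key = block.get("primary_key")
    if primary_key is not None and not (isinstance(primary_key, str) or _is_str_list(primary_key)):
        errors.append("primary_key must be a string or a list of strings")
    extra_args = block.get("extra_args")
    if extra_args is not None and not _is_str_list(extra_args):
        errors.append("extra_args must be a list of strings")
    return errors


print(validate_ingestr_block({"source_table": "charges", "strategy": "upsert"}))
```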

Strategy mapping

| Strategy | Behavior |
| --- | --- |
| replace | Drop and recreate the destination table (default when no strategy is set) |
| append | Insert all rows without deduplication |
| merge | Upsert based on primary_key, using incremental_key for change detection |
| delete+insert | Delete matching rows by incremental_key range, then insert replacements |
| truncate+insert | Truncate destination table, then insert all rows |
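
The row-level effect of several strategies can be compared with a toy model on lists of dicts. This is a sketch, not ingestr's implementation: merge is left out for brevity, and the reading of "incremental_key range" as the minimum-to-maximum span of the incoming batch is an assumption. At the row level, replace and truncate+insert give the same result; they differ in whether the table itself is dropped.

```python
def apply_strategy(strategy, dest, incoming, incremental_key=None):
    """Return the destination rows after loading `incoming` (toy model)."""
    if strategy in ("replace", "truncate+insert"):
        # Same rows either way; only the handling of the table object differs
        return list(incoming)
    if strategy == "append":
        return list(dest) + list(incoming)
    if strategy == "delete+insert":
        if not incoming:
            return list(dest)
        low = min(row[incremental_key] for row in incoming)
        high = max(row[incremental_key] for row in incoming)
        # Assumed rule: drop destination rows inside the incoming key range
        kept = [row for row in dest if not (low <= row[incremental_key] <= high)]
        return kept + list(incoming)
    raise ValueError(f"strategy not modelled here: {strategy}")


dest = [{"day": 1, "v": "a"}, {"day": 2, "v": "b"}, {"day": 3, "v": "c"}]
incoming = [{"day": 2, "v": "B"}, {"day": 3, "v": "C"}, {"day": 4, "v": "D"}]

for name in ("replace", "append", "delete+insert"):
    rows = apply_strategy(name, dest, incoming, incremental_key="day")
    print(name, [row["v"] for row in rows])
```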

Template variables

All string fields in the ingestr block support SQLBuild project variable substitution and context templates:
sources:
  - name: raw_orders
    ingestr:
      source_uri: "postgresql://${pg_user}:${pg_password}@${pg_host}:5432/${pg_database}"
      source_table: "public.orders"
Variables are resolved from the project’s merged variable config (project + environment + local). Set sensitive values in sqlbuild_local.toml (gitignored):
[vars]
pg_user = "readonly"
pg_password = "secret"
pg_host = "prod-db.example.com"
pg_database = "analytics"
stripe_api_key = "sk_live_..."
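
The layering and substitution described above can be approximated with the standard library. This is a minimal sketch under two assumptions: later layers override earlier ones (project, then environment, then local), and `${name}` placeholders behave like Python's `string.Template`. SQLBuild's actual resolver and its context templates may work differently; `merge_vars` and `render` are hypothetical names.

```python
from string import Template


def merge_vars(project: dict, environment: dict, local: dict) -> dict:
    """Merge variable layers; later layers win (assumed precedence)."""
    merged = {}
    for layer in (project, environment, local):
        merged.update(layer)
    return merged


def render(value: str, variables: dict) -> str:
    """Substitute ${name} placeholders; raises KeyError for an undefined name."""
    return Template(value).substitute(variables)


variables = merge_vars(
    project={"pg_host": "localhost", "pg_database": "analytics"},
    environment={"pg_host": "prod-db.example.com"},
    local={"pg_user": "readonly", "pg_password": "secret"},
)
uri = render(
    "postgresql://${pg_user}:${pg_password}@${pg_host}:5432/${pg_database}",
    variables,
)
print(uri)
```

Failing loudly on an undefined variable is a deliberate choice in this sketch: a half-rendered connection URI is harder to debug than an immediate error.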

Destination URI

SQLBuild automatically builds the ingestr destination URI from your adapter connection config. All supported adapters work without manual destination configuration:
| Adapter | Destination URI format |
| --- | --- |
| DuckDB | duckdb:///path/to/db.duckdb |
| MotherDuck | motherduck://database |
| PostgreSQL | postgresql://user:pass@host:port/db |
| Snowflake | snowflake://user:pass@account/db/schema |
| BigQuery | bigquery://project |
| Databricks | databricks://token:...@host |
| SQL Server | mssql://user:pass@host:port/db |
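
As an illustration of how a destination URI could be assembled from connection settings, the sketch below covers two of the formats in the table. The config keys (`path`, `user`, `password`, `host`, `port`, `database`) and the function name are assumptions, not SQLBuild's real config schema. Credentials are percent-encoded so that characters such as `@` or `:` in a password do not break the URI.

```python
from urllib.parse import quote


def build_destination_uri(adapter: str, conn: dict) -> str:
    """Build a destination URI for a subset of adapters (illustrative only)."""
    if adapter == "duckdb":
        return f"duckdb:///{conn['path']}"
    if adapter == "postgresql":
        user = quote(conn["user"], safe="")
        password = quote(conn["password"], safe="")
        return (
            f"postgresql://{user}:{password}"
            f"@{conn['host']}:{conn['port']}/{conn['database']}"
        )
    raise ValueError(f"adapter not covered by this sketch: {adapter}")


print(build_destination_uri("duckdb", {"path": "warehouse/dev.duckdb"}))
print(
    build_destination_uri(
        "postgresql",
        {
            "user": "loader",
            "password": "p@ss:word",
            "host": "localhost",
            "port": 5432,
            "database": "analytics",
        },
    )
)
```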

Reload

When --reload is passed, ingestr runs with --full-refresh, forcing a complete reload regardless of the configured strategy:
sqb load --reload
sqb build --reload
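
In terms of the underlying command, a reload amounts to adding one flag to the ingestr invocation. The helper below is a hypothetical sketch of that step; it assumes the command is held as an argument list and does not reflect SQLBuild's internal code.

```python
def with_reload(cmd: list[str], reload: bool) -> list[str]:
    """Return a copy of the command, adding --full-refresh when reload is set."""
    if reload and "--full-refresh" not in cmd:
        return [*cmd, "--full-refresh"]
    return list(cmd)


base = ["ingestr", "ingest", "--incremental-strategy", "merge"]
print(with_reload(base, reload=True))
```

The configured strategy stays in the command; per the description above, `--full-refresh` takes precedence over it.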

Build integration

ingestr sources are managed sources and participate in the same lifecycle as Python loaders:
# ingestr runs automatically before dependent models
sqb build

# run ingestr sources standalone
sqb load

# skip loading
sqb build --no-load
See Loaders for details on auto-load behavior, source deferral, and the --load / --no-load / --reload flags.

Supported sources

ingestr supports 50+ sources including databases, SaaS APIs, and file systems. See the ingestr documentation for the full list of supported sources and their URI formats.