Overview

Data warehouse pipeline that ETLs SheepCRM MongoDB data into MotherDuck (cloud DuckDB) for analytical queries.

Multi-Tenant Architecture

Each tenant (flock) gets its own MotherDuck database (warehouse_{bucket}). Tables use uid as primary key. There is no bucket column — tenant isolation is structural. Queries run against a single tenant database (switched via USE) so no bucket filtering is needed in SQL.
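As a rough illustration of the structural isolation described above, the sketch below derives the per-tenant database name and pairs a USE statement with the query. The helper names and the bucket-name validation rule are assumptions made for this example, not part of the pipeline's documented code.

```python
import re

# Assumption: bucket names are simple identifiers (letters, digits, underscores).
# Validating them keeps the name safe to interpolate into a USE statement.
_BUCKET_RE = re.compile(r"[A-Za-z0-9_]+")


def warehouse_database(bucket: str) -> str:
    """Return the tenant database name, following the warehouse_{bucket} pattern."""
    if not _BUCKET_RE.fullmatch(bucket):
        raise ValueError(f"unexpected bucket name: {bucket!r}")
    return f"warehouse_{bucket}"


def tenant_statements(bucket: str, sql: str) -> list[str]:
    """Statements for one tenant: switch database first, then run the query.

    The query itself carries no bucket filter, because the selected
    database already contains only that tenant's rows.
    """
    return [f"USE {warehouse_database(bucket)}", sql]


if __name__ == "__main__":
    for statement in tenant_statements("example", "SELECT uid FROM person LIMIT 10"):
        print(statement)
```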

Authentication

Admin endpoints (API key only)

/schema, /migrate, /query, /query-all — require X-API-Key header.

API v1 endpoints (auth required)

/api/v1/* endpoints require one of:

  • Bearer token: Authorization: Bearer <oauth_access_token>, validated against auth.sheepcrm.com
  • API key: X-API-Key: <key>, where the key is stored in SSM

All API v1 endpoints are scoped to a bucket and validate that the caller has access to that bucket.
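The two accepted credentials map onto request headers roughly as follows. This is a minimal sketch: the function name is hypothetical, and preferring the Bearer token when both are supplied is an assumption of the example rather than documented server behaviour.

```python
from typing import Dict, Optional


def auth_headers(
    api_key: Optional[str] = None,
    bearer_token: Optional[str] = None,
) -> Dict[str, str]:
    """Build the header for an /api/v1/* request from one credential."""
    if bearer_token:
        # OAuth access token, sent in the Authorization header.
        return {"Authorization": f"Bearer {bearer_token}"}
    if api_key:
        # API key, sent in the X-API-Key header.
        return {"X-API-Key": api_key}
    raise ValueError("either api_key or bearer_token is required")
```

Admin endpoints accept only the API key, so for those the X-API-Key form is the one to use.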

Quick Start

  1. Apply schema: POST /schema
  2. Run incremental sync: POST /migrate with {"flocks": ["example"]}
  3. Run a full reload: POST /migrate with {"flocks": ["example"], "mode": "full"}
  4. Query data: POST /query with {"sql": "SELECT * FROM person LIMIT 10", "database": "warehouse_example"}
  5. Authenticated API: GET /api/v1/example/tables with X-API-Key header
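The five steps above could be expressed as HTTP requests along these lines. The sketch only builds the requests with the standard library and does not send them. BASE_URL is a placeholder, the helper names are hypothetical, and sending POST /schema without a body is an assumption.

```python
import json
import urllib.request
from typing import List, Optional

# Placeholder host: substitute the real deployment URL.
BASE_URL = "https://warehouse.example.invalid"


def build_request(
    method: str, path: str, api_key: str, payload: Optional[dict] = None
) -> urllib.request.Request:
    """Build (but do not send) a request authenticated with X-API-Key."""
    headers = {"X-API-Key": api_key}
    body = None
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=body, method=method, headers=headers
    )


def quick_start_requests(api_key: str, flock: str = "example") -> List[urllib.request.Request]:
    """Return the Quick Start steps, in order, as request objects."""
    return [
        # 1. Apply schema (assumed to need no request body).
        build_request("POST", "/schema", api_key),
        # 2. Incremental sync.
        build_request("POST", "/migrate", api_key, {"flocks": [flock]}),
        # 3. Full reload.
        build_request("POST", "/migrate", api_key, {"flocks": [flock], "mode": "full"}),
        # 4. Ad hoc query against the tenant database.
        build_request(
            "POST",
            "/query",
            api_key,
            {"sql": "SELECT * FROM person LIMIT 10", "database": f"warehouse_{flock}"},
        ),
        # 5. Authenticated API v1 call.
        build_request("GET", f"/api/v1/{flock}/tables", api_key),
    ]
```

Each request can then be sent with urllib.request.urlopen(request) once BASE_URL points at a real deployment.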

Response Format

Admin endpoints (/schema, /migrate, /query, /query-all) return operation-specific JSON.

API v1 endpoints use a structured envelope:

{"data": ..., "meta": ...}

API v1 errors use:

{"error": {"code": "...", "message": "..."}}
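A client might unwrap the two API v1 shapes as sketched below. The exception class and function name are hypothetical, and the fallback values for a missing code or message are assumptions of the example.

```python
from typing import Any, Tuple


class WarehouseAPIError(Exception):
    """Raised when an API v1 response carries the error envelope."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def unwrap(payload: dict) -> Tuple[Any, Any]:
    """Return (data, meta) from a success envelope, or raise on an error envelope."""
    if "error" in payload:
        error = payload["error"]
        raise WarehouseAPIError(
            error.get("code", "unknown"), error.get("message", "")
        )
    return payload.get("data"), payload.get("meta")
```

Raising on the error shape lets calling code handle the success case as a plain tuple, with failures going through ordinary exception handling.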

Security Schemes

  • Bearer token: OAuth Bearer token validated against auth.sheepcrm.com/userinfo/. Security scheme type: http.
  • API key: stored in SSM at /sheepcrm/{stage}/warehouse/admin-api-key. Security scheme type: apiKey. Header parameter name: X-API-Key.
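For operators who need to look up the admin key, the SSM parameter name can be built from the stage as sketched here. The function names are hypothetical. The client argument is assumed to be a boto3 SSM client, and passing WithDecryption=True assumes the key may be stored as a SecureString (the flag is ignored for plain String parameters).

```python
def admin_key_parameter_name(stage: str) -> str:
    """Return the SSM parameter name for the given deployment stage."""
    return f"/sheepcrm/{stage}/warehouse/admin-api-key"


def fetch_admin_api_key(ssm_client, stage: str) -> str:
    """Read the admin API key using an SSM client such as boto3.client("ssm")."""
    response = ssm_client.get_parameter(
        Name=admin_key_parameter_name(stage),
        WithDecryption=True,
    )
    return response["Parameter"]["Value"]
```

The returned value is what goes into the X-API-Key header.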