Pipe Webhook Events Into Snowflake, BigQuery, and Databricks
Route webhook events from Stripe, GitHub, Shopify, or any provider straight into your data warehouse. Hookbase batches events into S3, R2, GCS, or Azure Blob — ready to load into Snowflake, BigQuery, and Databricks, or query in place with Athena and Redshift Spectrum.
Your data team keeps asking for the events
Payments, signups, orders, subscription changes — the events your analysts want most arrive as webhooks. But a webhook is an HTTP POST, not a row in a table. So the data team files a ticket, and someone spins up yet another ingestion service: a server to catch the POST, verify the signature, batch the records, and write them somewhere the warehouse can read.
That service is pure plumbing. Here is how to delete it.
Object storage is the universal loading dock
Hookbase warehouse destinations batch your webhook events and write them as files to cloud object storage — Amazon S3, Cloudflare R2, Google Cloud Storage, or Azure Blob Storage. That matters because every major warehouse already knows how to read a bucket. Object storage is the one integration point they all share, which makes it the simplest path into any of them.
| Object storage | Loads natively into |
|---|---|
| Amazon S3 | Snowflake, Databricks, Athena, Redshift Spectrum |
| Google Cloud Storage | BigQuery, Snowflake |
| Azure Blob Storage | Snowflake, Synapse, Databricks |
| Cloudflare R2 | Anything that speaks the S3 API (zero egress fees) |
Hookbase handles the extract and load to the bucket. Your warehouse handles the transform and query. No ingestion service in the middle.
What lands in your bucket
Events are batched — up to 100 events or every 30 seconds, whichever comes first — and written as newline-delimited JSON. Each line is one event:
{"event_id":"evt_abc123","received_at":"2026-02-21T14:30:00Z","payload":{"type":"payment_intent.succeeded","data":{"amount":2500}}}
So the natural landing schema is three columns: event_id, received_at, and a semi-structured payload. Files are written under date-, hour-, or source-partitioned prefixes, so your warehouse can prune by time range and only scan what it needs.
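To make the batching rule and file format concrete, here is a minimal Python sketch of a size-or-age batcher that emits newline-delimited JSON. It is an illustration, not Hookbase's implementation: the `Batcher` class, the choice to start the 30-second clock at the first event in a batch, and the `date=`/`hour=` key layout in `object_key` are all assumptions made for the example.

```python
import json
from datetime import datetime, timezone

MAX_EVENTS = 100       # from the text: flush at 100 events...
MAX_AGE_SECONDS = 30   # ...or after 30 seconds, whichever comes first


class Batcher:
    """Collects events and hands back a batch once a size or age limit is hit."""

    def __init__(self, now):
        self.events = []
        self.opened_at = now

    def add(self, event, now):
        """Add one event; return a finished batch if a limit was hit, else None."""
        if not self.events:
            # Assumption: the age clock starts when the first event arrives.
            self.opened_at = now
        self.events.append(event)
        return self.flush_if_due(now)

    def flush_if_due(self, now):
        """Called on every add and, in a real service, by a periodic timer."""
        full = len(self.events) >= MAX_EVENTS
        stale = bool(self.events) and (now - self.opened_at) >= MAX_AGE_SECONDS
        if full or stale:
            batch, self.events = self.events, []
            return batch
        return None


def to_ndjson(batch):
    """Serialize a batch as one compact JSON object per line."""
    return "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in batch)


def object_key(source, first_received_at):
    """Hypothetical source/date/hour-partitioned key; the real layout may differ."""
    ts = datetime.fromisoformat(first_received_at.replace("Z", "+00:00"))
    ts = ts.astimezone(timezone.utc)
    return f"{source}/date={ts:%Y-%m-%d}/hour={ts:%H}/batch-{ts:%Y%m%dT%H%M%S}.jsonl"
```

Passing `now` in explicitly keeps the sketch deterministic and easy to test. Whatever the actual key layout is, a time-based prefix is what lets an engine skip whole directories when a query filters on a date range.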
Prefer flat, typed columns over a nested blob? Turn on field mapping and Hookbase projects payload fields into top-level columns with explicit types — handy when you point Athena or Redshift Spectrum at the files.
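As a rough picture of what field mapping does, the Python sketch below projects nested payload fields into flat, typed columns. The `FIELD_MAP` structure, the dotted-path syntax, and the column names `event_type` and `amount` are hypothetical, chosen to match the sample event above; they are not Hookbase's configuration format.

```python
import json
from datetime import datetime

# Hypothetical mapping: output column -> (dotted path into the event, type caster).
FIELD_MAP = {
    "event_id": ("event_id", str),
    "received_at": (
        "received_at",
        lambda s: datetime.fromisoformat(s.replace("Z", "+00:00")),
    ),
    "event_type": ("payload.type", str),
    "amount": ("payload.data.amount", int),
}


def get_path(obj, dotted):
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    for key in dotted.split("."):
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def project(event, field_map=FIELD_MAP):
    """Flatten one event into typed top-level columns; missing fields become None."""
    row = {}
    for column, (path, cast) in field_map.items():
        value = get_path(event, path)
        row[column] = None if value is None else cast(value)
    return row


line = (
    '{"event_id":"evt_abc123","received_at":"2026-02-21T14:30:00Z",'
    '"payload":{"type":"payment_intent.succeeded","data":{"amount":2500}}}'
)
row = project(json.loads(line))
```

Here `row` carries `event_type` and `amount` as ordinary top-level columns, which is the shape an external table with a fixed schema expects.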
From bucket to warehouse
The load step is a few lines of SQL in each engine. In Snowflake, point a stage at the bucket and COPY INTO a table with a VARIANT column:
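-- Load each NDJSON line into three columns; payload stays semi-structured (VARIANT).
COPY INTO webhook_events
FROM (
  SELECT $1:event_id::string, $1:received_at::timestamp_ntz, $1:payload
  FROM @hookbase_stage
)
PATTERN = '.*[.]jsonl'
-- Not needed if the stage already declares a JSON file format.
FILE_FORMAT = (TYPE = 'JSON');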
Wrap it in a Snowpipe and new files load automatically as they land. The other engines are just as short:
- BigQuery: LOAD DATA into a table with a native JSON column, or define an external table to query the GCS prefix in place.
- Databricks: COPY INTO a Delta table, or use Auto Loader to stream new files incrementally.
- Athena / Redshift Spectrum: create an external table over the S3 prefix and query the files where they sit, no load step at all.
The full, copy-paste SQL for all four is in the Data Warehouses guide.
Only warehouse what you actually want
Because this runs through the normal Hookbase pipeline, your filters and transforms apply before anything hits the bucket. That means you can:
- Drop noisy or irrelevant events so you are not paying to store and scan them
- Reshape the payload to match your table schema
- Flatten, rename, or strip fields — including PII — before they ever land at rest
Your analytics tables stay clean and your storage bill stays small.
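The effect of a filter plus a transform can be sketched in a few lines of Python. This is not Hookbase's filter or transform syntax; it only shows the outcome. `ALLOWED_TYPES` and `PII_KEYS` are hypothetical lists, so substitute whatever your own events and privacy rules call for.

```python
# Hypothetical allow-list of event types worth warehousing.
ALLOWED_TYPES = {"payment_intent.succeeded", "customer.subscription.updated"}
# Hypothetical field names treated as PII, removed at any nesting depth.
PII_KEYS = {"email", "name", "phone", "address"}


def strip_keys(value, blocked):
    """Recursively copy a JSON-like value, omitting any dict key in `blocked`."""
    if isinstance(value, dict):
        return {k: strip_keys(v, blocked) for k, v in value.items() if k not in blocked}
    if isinstance(value, list):
        return [strip_keys(v, blocked) for v in value]
    return value


def prepare(event):
    """Return the cleaned event, or None if it should never reach the bucket."""
    if event.get("payload", {}).get("type") not in ALLOWED_TYPES:
        return None  # filter: drop noisy or irrelevant events
    # transform: same envelope, payload with PII removed
    return {**event, "payload": strip_keys(event["payload"], PII_KEYS)}
```

Because `prepare` returns a new dict rather than mutating its input, the original event remains available to any other route that still needs the full payload.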
Treat event_id as the idempotency key
Webhook providers retry, and batches can occasionally overlap, so deduplicate on event_id when you build downstream tables:
SELECT *
FROM webhook_events
QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id ORDER BY received_at DESC) = 1;
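If you process the files outside the warehouse, the same rule is easy to express in Python. The sketch below mirrors the query above: one row per event_id, keeping the latest received_at. It assumes received_at values are ISO 8601 UTC strings in a single consistent format, as in the sample event, so that plain string comparison orders them chronologically; `dedupe_latest` is a name made up for this example.

```python
def dedupe_latest(rows):
    """Keep one row per event_id, preferring the row with the latest received_at."""
    latest = {}
    for row in rows:
        kept = latest.get(row["event_id"])
        # Assumes uniformly formatted UTC timestamps, which sort correctly as strings.
        if kept is None or row["received_at"] > kept["received_at"]:
            latest[row["event_id"]] = row
    return list(latest.values())


rows = [
    {"event_id": "evt_1", "received_at": "2026-02-21T14:30:00Z", "payload": {"attempt": 1}},
    {"event_id": "evt_2", "received_at": "2026-02-21T14:30:05Z", "payload": {"attempt": 1}},
    {"event_id": "evt_1", "received_at": "2026-02-21T14:31:00Z", "payload": {"attempt": 2}},
]
unique = dedupe_latest(rows)
```

In this example `unique` holds two rows, and the `evt_1` row is the retried delivery received at 14:31:00.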
Getting started
Warehouse destinations are available on Pro and Business plans. To set one up:
1. Go to Destinations → Add Destination
2. Choose the Data Warehouse category
3. Pick your storage (S3, R2, GCS, or Azure Blob) and enter credentials
4. Point a route from your source to the new destination
Your first batch lands in the bucket as soon as 100 events arrive or 30 seconds pass, whichever comes first, ready to load.
Wherever your data lives
Warehouse destinations join HTTP endpoints and direct queue delivery — SQS, EventBridge, Pub/Sub, and more. Between HTTP, queues, and object storage, Hookbase delivers your webhooks wherever your stack expects them: your API, your message bus, or your data warehouse.