Smart Monitoring — Catch Provider Changes and Traffic Anomalies Before They Break You
Anomaly detection learns each source's normal traffic pattern and alerts when volume spikes or drops relative to that baseline. Schema drift detection auto-infers a JSON schema from observed payloads and tells you the moment a provider adds, removes, or retypes a field. Both ship today on Pro and Business.
The two failure modes that hurt webhook integrations the most are also the two least well covered by traditional alerting:
- Traffic anomalies that don't look like errors. Stripe is suddenly sending you 12× more events than usual. Your delivery rate is fine. Your latency is fine. But something upstream broke — a retry loop, a misconfigured test mode, a customer migration — and you'll find out when your DB fills up or your costs spike.
- Schema drift you didn't ask for. A provider quietly adds a new field. Or worse, retypes amount from integer to floating-point. Your JSONata transform throws on the next event. Your downstream service starts logging cryptic errors. By the time you trace it, hours have passed.
Today we're shipping Smart Monitoring: two new alert rule types that close both gaps. Both are available on Pro and Business plans.
1. Anomaly Detection — adaptive volume alerting
Hookbase already had volume_spike and volume_drop alert rules, but they used absolute thresholds. You had to know the right number. "Alert if more than 1000 events in 10 minutes" is fine until your traffic grows 5× over six months and you forget to update the threshold.
The new anomaly_volume rule type uses a statistical baseline. Hookbase rolls up per-source per-minute event counts every minute, computes the rolling mean and standard deviation over your chosen window (15 min to 6 hours), and fires when the most recent minute's volume is more than N standard deviations away from that baseline.
📈 Volume spike on "Stripe Live" — 1,247 events in the last minute
(baseline mean 312, σ 48, z-score 19.5)
📉 Volume drop on "Shopify Orders" — 0 events in the last minute
(baseline mean 18, σ 4)
Configurable knobs:
- Direction — spike, drop, or both
- Z-score threshold — default 3 (3σ), tune lower for more sensitivity
- Baseline window — 15 / 60 / 180 / 360 minutes
- Minimum baseline samples — defaults to 30 minutes of history before the rule can fire (cold-start defense)
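To make the knobs concrete, here is a minimal sketch of how a rule like this could decide whether to fire. The names AnomalyRule and evaluate are hypothetical and not Hookbase's API, and the handling of a perfectly flat baseline (σ = 0) is an assumption, since the post does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Direction = Literal["spike", "drop", "both"]


@dataclass
class AnomalyRule:
    """Hypothetical container for the knobs listed above."""
    direction: Direction = "both"
    z_threshold: float = 3.0
    min_baseline_samples: int = 30  # minutes of history required before firing


def evaluate(rule: AnomalyRule, latest_count: float, mean: float,
             std: float, samples: int) -> Optional[str]:
    """Return "spike", "drop", or None for the most recent minute."""
    if samples < rule.min_baseline_samples:
        return None  # cold start: not enough history yet, stay silent
    if std == 0:
        return None  # assumption: a flat baseline has no defined z-score
    z = (latest_count - mean) / std
    if z > rule.z_threshold and rule.direction in ("spike", "both"):
        return "spike"
    if z < -rule.z_threshold and rule.direction in ("drop", "both"):
        return "drop"
    return None


rule = AnomalyRule(direction="both", z_threshold=3.0)
print(evaluate(rule, latest_count=1247, mean=312, std=48, samples=60))  # spike (z is about 19.5)
print(evaluate(rule, latest_count=0, mean=18, std=4, samples=60))       # drop (z = -4.5)
```

The two calls reuse the numbers from the example alerts above: (1247 - 312) / 48 is roughly 19.5, and (0 - 18) / 4 is -4.5, so both clear the default 3σ threshold.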
The math is Welford's online algorithm — single-pass, numerically stable, and computed against pre-aggregated minute buckets so the cron path is cheap regardless of your event volume.
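For readers who haven't met Welford's algorithm, the sketch below shows the general technique applied to a list of per-minute counts. It is an illustration under assumptions, not Hookbase's code: RunningStats and baseline are hypothetical names, and the choice of population (rather than sample) standard deviation is a guess.

```python
import math
from typing import Iterable, Tuple


class RunningStats:
    """Welford's online mean/variance: one pass, no stored samples."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the mean both before and after the update; this is what
        # keeps the computation numerically stable.
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Assumption: population standard deviation (divide by n).
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0


def baseline(minute_counts: Iterable[float]) -> Tuple[float, float]:
    """Compute (mean, std) over the minute buckets in the baseline window."""
    stats = RunningStats()
    for count in minute_counts:
        stats.update(count)
    return stats.mean, stats.std


mean, std = baseline([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, std)  # mean is 5 and std is 2 for this input, up to float rounding
```

Because the input is one number per minute rather than one per event, a 6-hour window is at most 360 updates no matter how many events arrived.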
What about Black Friday-style traffic shifts?
Z-score over a 60-minute window doesn't know about hour-of-week patterns. If your traffic genuinely grows 5× during business hours, the rule will eventually re-baseline (every minute, the rolling window slides forward). For known traffic events you don't want to be paged on, set the baseline window wider (3–6 hours), which makes the rule less reactive to short spikes that are part of normal weekly cadence.
2. Schema Drift Detection — know when payloads change shape
Every Pro+ source now has an auto-inferred JSON schema maintained by Hookbase. Once an hour, we sample up to 100 of your most recent payloads, infer their JSON schema (types, formats, required fields, nested structure), and compare against the prior baseline. If the diff is non-empty — fields added, fields removed, or types changed — we record a drift event.
🧬 Schema drift on "Stripe Live"
2 new fields: data.invoice_id, data.metadata.region
1 type change: data.amount integer → number
Sampled 100 payloads · compared against inferred baseline
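The inference-and-compare step can be sketched roughly as follows. This is a simplified illustration, not Hookbase's implementation: it tracks only a dotted path and a type name per field, ignores formats and required-field tracking, and treats arrays as opaque. All function names here are hypothetical.

```python
from typing import Any, Dict, Iterable


def json_type(value: Any) -> str:
    """Map a decoded JSON value to a JSON Schema style type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    if isinstance(value, list):
        return "array"
    return "object"


def flatten(payload: Dict[str, Any], prefix: str = "") -> Dict[str, str]:
    """Turn nested objects into {"data.amount": "integer", ...}."""
    out: Dict[str, str] = {}
    for key, value in payload.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict) and value:
            out.update(flatten(value, path + "."))
        else:
            out[path] = json_type(value)
    return out


def infer_schema(payloads: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Merge the field types seen across a sample of payloads."""
    schema: Dict[str, str] = {}
    for payload in payloads:
        for path, seen_type in flatten(payload).items():
            current = schema.get(path)
            if current is None or current == seen_type:
                schema[path] = seen_type
            elif {current, seen_type} <= {"integer", "number"}:
                schema[path] = "number"  # integer widens to number
            else:
                schema[path] = "mixed"   # simplification for conflicting types
    return schema


def diff_schemas(baseline: Dict[str, str], observed: Dict[str, str]) -> Dict[str, Any]:
    """Report added fields, removed fields, and type changes."""
    return {
        "added": sorted(p for p in observed if p not in baseline),
        "removed": sorted(p for p in baseline if p not in observed),
        "retyped": {p: (baseline[p], observed[p])
                    for p in baseline
                    if p in observed and baseline[p] != observed[p]},
    }


baseline = {"data.amount": "integer", "data.currency": "string"}
sample = [{"data": {"amount": 12.5, "currency": "usd",
                    "invoice_id": "in_1", "metadata": {"region": "eu"}}}]
drift = diff_schemas(baseline, infer_schema(sample))
print(drift["added"])    # ['data.invoice_id', 'data.metadata.region']
print(drift["retyped"])  # {'data.amount': ('integer', 'number')}
```

The sample payload is made up to reproduce the shape of the alert above: two new fields and one integer to number change.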
The drift history lives in the new Schema tab on every source detail page. You can:
- See the current inferred schema as pretty-printed JSON
- Refresh the baseline manually after intentional changes
- Browse drift events with full diffs
- Toggle drift detection on or off per source
Auto-inferred or declared baseline
Two ways drift detection finds its baseline:
- Auto-inferred (default) — Hookbase learns each source's payload shape from the first 10+ samples it sees, then watches for changes against that baseline. Zero setup.
- Declared schema (override) — If a route has a schema_id attached, drift compares observed payloads against your declared schema instead. The drift event records baseline_source = "declared" so you know which path triggered it.
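The precedence between the two baselines could look something like the sketch below. The function name pick_baseline is hypothetical, and the "inferred" label is an assumption; the only value confirmed above is "declared".

```python
from typing import Dict, Optional, Tuple

Schema = Dict[str, str]


def pick_baseline(declared: Optional[Schema],
                  inferred: Optional[Schema],
                  sample_count: int,
                  min_samples: int = 10) -> Tuple[Optional[Schema], Optional[str]]:
    """Return (baseline, baseline_source); (None, None) if there is no baseline yet."""
    if declared is not None:
        # A declared schema attached to the route overrides inference.
        return declared, "declared"
    if inferred is not None and sample_count >= min_samples:
        # Assumption: the auto-inferred path is labeled "inferred".
        return inferred, "inferred"
    return None, None  # still learning: too few samples to compare against


print(pick_baseline({"id": "string"}, {"id": "string"}, sample_count=50))  # ({'id': 'string'}, 'declared')
print(pick_baseline(None, {"id": "string"}, sample_count=3))               # (None, None)
```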
Defense against slow drift
Here's a subtle attack vector we wanted to defend against: if Hookbase blindly merged every observation into the baseline, a malicious upstream could slowly add new fields over weeks until they're indistinguishable from "normal", and remove a sensitive field along the way without anyone noticing. So we apply an asymmetric policy:
- Additive drift (only new fields): auto-merged into the baseline. The source learned a new field; future events with it won't fire.
- Removed fields or type changes: drift event recorded, baseline left alone. You'll keep getting alerts until you explicitly refresh the baseline. This forces a human to acknowledge that "yes, this regression is intentional."
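The asymmetric policy can be sketched as a single function. This is an illustration under assumptions: schemas are modeled as flat path-to-type dicts, and apply_drift is a hypothetical name, not part of Hookbase.

```python
from typing import Dict, Tuple

Schema = Dict[str, str]


def apply_drift(baseline: Schema, observed: Schema) -> Tuple[Schema, bool]:
    """Return (next_baseline, needs_acknowledgement)."""
    added = {p: t for p, t in observed.items() if p not in baseline}
    removed = [p for p in baseline if p not in observed]
    retyped = [p for p in baseline
               if p in observed and baseline[p] != observed[p]]

    if removed or retyped:
        # Removed fields or type changes: leave the baseline untouched so the
        # drift keeps being reported until someone refreshes it by hand.
        return baseline, True

    # Only new fields (or nothing at all): fold them into the baseline.
    return {**baseline, **added}, False


base = {"data.amount": "integer"}

# Additive only: merged, no acknowledgement needed.
print(apply_drift(base, {"data.amount": "integer", "data.invoice_id": "string"}))
# ({'data.amount': 'integer', 'data.invoice_id': 'string'}, False)

# A type change alongside a new field: nothing is merged.
print(apply_drift(base, {"data.amount": "number", "data.invoice_id": "string"}))
# ({'data.amount': 'integer'}, True)
```

Note that in this sketch a diff mixing additions with a removal or retype merges nothing, matching the "only new fields" wording above.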
How to set it up
Anomaly Detection
- Settings → Alert Rules → Add Rule
- Trigger Type: Anomaly Detection (Pro+)
- Pick a source, direction, Z-threshold (3 is sensible), and baseline window
- Pick your notification channels (Slack, Teams, PagerDuty, Discord, email, webhook)
The rule becomes eligible to fire once the source has 30 minutes of baseline history. Cold start is silent.
Schema Drift
Schema drift detection is automatic for every Pro+ source — there's no setup required for inference. To get alerts when drift is detected:
- Settings → Alert Rules → Add Rule
- Trigger Type: Schema Drift (Pro+)
- Pick a source and which categories you care about (added fields, removed fields, type changes)
- Pick your notification channels
You can also browse drift history directly on each source's Schema tab without configuring an alert rule.
Why both, why now
Webhook integrations are loud when they fail and quiet when they degrade. The two failure modes Smart Monitoring covers, silent traffic shifts and silent schema changes, are both cases where catching the change late costs hours of debugging and catching it early costs 30 seconds of "oh, Stripe added a field." That difference is enough for the feature to pay for itself.
We've been running Smart Monitoring against our own production webhooks for the last few weeks. It's already caught one provider (no names) silently changing a field type from string to number mid-week. Without drift detection, we'd have shipped a transform regression to a customer.