Funnel Ingestion API
One API for metrics, traces, and logs. OpenTelemetry-compatible (OTLP/HTTP/JSON today; OTLP/gRPC and OTLP/Protobuf scaffolded), with a simplified native JSON schema for hand-rolled clients.
Base URL: http://localhost:4000
Auth: Bearer st_…
Content-Type: application/json
Quickstart
Send your first metric. Replace st_YOUR_API_KEY with a key from your project's API Keys page.
curl -X POST http://localhost:4000/v1/metrics \
-H "authorization: Bearer st_YOUR_API_KEY" \
-H "content-type: application/json" \
-d '{"metrics":[{"attributes":{"method":"GET","route":"/users","service":"api"},"kind":"gauge","name":"http.server.duration_ms","time":"2026-05-11T17:00:00Z","value":182.3},{"attributes":{"service":"api"},"kind":"counter","name":"http.server.requests","time":"2026-05-11T17:00:00Z","value":1.0}]}'
Successful response:
{
"accepted": 2
}
Authentication
Every request requires a project API key as a Bearer token:
Authorization: Bearer st_AbCdEf1234...
Content-Type: application/json
- Keys are scoped to a single project. All three pillars (metrics/traces/logs) share the same key.
- Keys are bcrypt-hashed at rest; the plaintext is shown exactly once on creation.
- RUM beacons may use ?token=… in the URL instead, which is required for navigator.sendBeacon.
- Revoke a key from the API Keys page; future requests return 401.
Metrics ingestion
/v1/metrics
Numeric time-series. Counters, gauges, histograms, and summaries.
Native JSON shape
Recommended for hand-rolled clients:
{
"metrics": [
{
"attributes": {
"method": "GET",
"route": "/users",
"service": "api"
},
"kind": "gauge",
"name": "http.server.duration_ms",
"time": "2026-05-11T17:00:00Z",
"value": 182.3
},
{
"attributes": {
"service": "api"
},
"kind": "counter",
"name": "http.server.requests",
"time": "2026-05-11T17:00:00Z",
"value": 1.0
}
]
}
OTLP/JSON shape
Matches OTLP ExportMetricsServiceRequest:
{
"resourceMetrics": [
{
"resource": {
"attributes": [
{
"key": "service.name",
"value": {
"stringValue": "api"
}
}
]
},
"scopeMetrics": [
{
"metrics": [
{
"histogram": {
"dataPoints": [
{
"attributes": [
{
"key": "http.route",
"value": {
"stringValue": "/users"
}
}
],
"bucketCounts": [
0,
2,
5,
4,
1,
0,
0
],
"count": 12,
"explicitBounds": [
10,
50,
100,
250,
500,
1000
],
"sum": 1840.5,
"timeUnixNano": "1715451600000000000"
}
]
},
"name": "http.server.duration",
"unit": "ms"
}
],
"scope": {
"name": "my-app",
"version": "1.0.0"
}
}
]
}
]
}
Histogram data points are expanded into name_count, name_sum, and name_bucket series (one row per explicit bound, with the bound in bucket_le). Query them with agg=p95 via the Metrics Explorer.
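To make the percentile query more concrete, here is a minimal sketch of how a p95 can be estimated from explicit bounds and bucket counts. It uses linear interpolation inside the bucket that contains the target rank, which is a common technique; whether Funnel's agg=p95 works this way is not stated here. The function name estimateQuantile and the choice of 0 as the lower edge of the first bucket are assumptions.

```javascript
// Estimate a quantile from explicit-bound histogram buckets.
// Assumptions: bucketCounts are per-bucket (not cumulative), as in OTLP,
// and the first bucket starts at 0.
function estimateQuantile(q, explicitBounds, bucketCounts) {
  const total = bucketCounts.reduce((a, b) => a + b, 0);
  if (total === 0) return NaN;
  const rank = q * total;
  let cumulative = 0;
  for (let i = 0; i < bucketCounts.length; i++) {
    const inBucket = bucketCounts[i];
    if (inBucket > 0 && cumulative + inBucket >= rank) {
      // The overflow bucket has no upper bound; report the last bound.
      if (i >= explicitBounds.length) return explicitBounds[explicitBounds.length - 1];
      const lower = i === 0 ? 0 : explicitBounds[i - 1];
      const upper = explicitBounds[i];
      // Interpolate linearly between the bucket edges.
      return lower + ((rank - cumulative) / inBucket) * (upper - lower);
    }
    cumulative += inBucket;
  }
  return explicitBounds[explicitBounds.length - 1];
}

// Data point from the OTLP example above.
const bounds = [10, 50, 100, 250, 500, 1000];
const counts = [0, 2, 5, 4, 1, 0, 0];
const p95 = estimateQuantile(0.95, bounds, counts);
console.log(p95.toFixed(1));
```

With the example data the target rank falls in the 250–500 bucket, so the estimate lands at roughly 350 ms.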
Traces ingestion
/v1/traces
Distributed spans. Reconstructed into trace trees by trace_id/parent_span_id.
Native JSON shape
{
"spans": [
{
"attributes": {
"http.method": "GET",
"http.route": "/users"
},
"duration_ms": 250.0,
"end_time": "2026-05-11T17:00:00.250Z",
"kind": "server",
"operation_name": "GET /users",
"parent_span_id": null,
"service_name": "api",
"span_id": "0123456789abcdef",
"start_time": "2026-05-11T17:00:00.000Z",
"status": "ok",
"trace_id": "a1f3c4d5b6e7890123456789abcdef00"
},
{
"duration_ms": 160.0,
"end_time": "2026-05-11T17:00:00.180Z",
"kind": "client",
"operation_name": "SELECT users",
"parent_span_id": "0123456789abcdef",
"service_name": "db",
"span_id": "fedcba9876543210",
"start_time": "2026-05-11T17:00:00.020Z",
"status": "ok",
"trace_id": "a1f3c4d5b6e7890123456789abcdef00"
}
]
}
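As an illustration of how a flat span list can be turned back into a tree by trace_id and parent_span_id, the sketch below links each span to its parent. It is not Funnel's implementation; buildTraceTrees is a hypothetical helper, and treating spans with an unknown parent as roots is an assumption.

```javascript
// Rebuild trace trees from native-shape spans.
// Spans whose parent is missing from the input are treated as roots (assumption).
function buildTraceTrees(spans) {
  const key = (traceId, spanId) => `${traceId}:${spanId}`;
  const nodes = new Map();
  for (const span of spans) {
    nodes.set(key(span.trace_id, span.span_id), { span, children: [] });
  }
  const roots = [];
  for (const node of nodes.values()) {
    const { trace_id, parent_span_id } = node.span;
    const parent = parent_span_id ? nodes.get(key(trace_id, parent_span_id)) : undefined;
    if (parent) parent.children.push(node);
    else roots.push(node);
  }
  return roots;
}

// The two spans from the native JSON example, reduced to the linking fields.
const exampleSpans = [
  { trace_id: "a1f3c4d5b6e7890123456789abcdef00", span_id: "0123456789abcdef",
    parent_span_id: null, operation_name: "GET /users" },
  { trace_id: "a1f3c4d5b6e7890123456789abcdef00", span_id: "fedcba9876543210",
    parent_span_id: "0123456789abcdef", operation_name: "SELECT users" },
];
const trees = buildTraceTrees(exampleSpans);
console.log(trees[0].span.operation_name, "->", trees[0].children[0].span.operation_name);
```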
OTLP/JSON shape
{
"resourceSpans": [
{
"resource": {
"attributes": [
{
"key": "service.name",
"value": {
"stringValue": "api"
}
}
]
},
"scopeSpans": [
{
"spans": [
{
"attributes": [
{
"key": "http.method",
"value": {
"stringValue": "GET"
}
}
],
"endTimeUnixNano": "1715451600250000000",
"kind": 2,
"name": "GET /users",
"spanId": "0123456789abcdef",
"startTimeUnixNano": "1715451600000000000",
"status": {
"code": 1
},
"traceId": "a1f3c4d5b6e7890123456789abcdef00"
}
]
}
]
}
]
}
- code: 1 → ok
- code: 2 → error
- code: 0 or missing → ok
Use the W3C traceparent header to link spans across services. The Funnel agent reads it automatically.
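For hand-rolled clients that need to read the header themselves, here is a small sketch of parsing a version 00 traceparent value (version-traceid-parentid-flags). The helper name parseTraceparent is hypothetical, and the parser is stricter than the spec requires for future versions.

```javascript
// Parse a W3C traceparent header of the form version-traceid-parentid-flags.
// Returns null for anything that does not look like a valid version-00 value.
function parseTraceparent(header) {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(
    String(header).trim().toLowerCase()
  );
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // The spec treats version ff and all-zero IDs as invalid.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const parsed = parseTraceparent("00-a1f3c4d5b6e7890123456789abcdef00-0123456789abcdef-01");
console.log(parsed.traceId, parsed.parentId, parsed.sampled);
```

The traceId and parentId fields map directly onto trace_id and parent_span_id in the native span shape.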
Logs ingestion
/v1/logs
Structured log records. Full-text indexed for fast search.
Native JSON shape
{
"logs": [
{
"attributes": {
"env": "prod"
},
"message": "handled GET /users in 182ms",
"service_name": "api",
"severity": "info",
"time": "2026-05-11T17:00:00Z"
},
{
"attributes": {
"customer_id": "cus_abc"
},
"message": "charge failed: card_declined",
"service_name": "billing",
"severity": "error",
"time": "2026-05-11T17:00:01Z",
"trace_id": "a1f3c4d5b6e7890123456789abcdef00"
}
]
}
OTLP/JSON shape
{
"resourceLogs": [
{
"resource": {
"attributes": [
{
"key": "service.name",
"value": {
"stringValue": "api"
}
}
]
},
"scopeLogs": [
{
"logRecords": [
{
"attributes": [
{
"key": "http.method",
"value": {
"stringValue": "GET"
}
}
],
"body": {
"stringValue": "handled GET /users in 182ms"
},
"severityNumber": 9,
"severityText": "INFO",
"timeUnixNano": "1715451600000000000"
}
]
}
]
}
]
}
severityNumber ranges map to native severities: … → debug, 9–12 → info, 13–16 → warn, 17–20 → error, 21–24 → fatal. Text severities (case-insensitive) work too.
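The range mapping can be sketched as a small lookup. The ranges from 9 upward are taken from the text; mapping everything from 1 to 8 to debug is an assumption, since the start of the list is not shown here. severityFromNumber is a hypothetical helper name.

```javascript
// Map an OTLP severityNumber (1..24) to a native severity string.
// Assumption: 1..8 all collapse to "debug"; only 9..24 is confirmed by the text.
function severityFromNumber(n) {
  if (!Number.isInteger(n) || n < 1 || n > 24) return null;
  if (n >= 21) return "fatal";
  if (n >= 17) return "error";
  if (n >= 13) return "warn";
  if (n >= 9) return "info";
  return "debug";
}

console.log(severityFromNumber(9)); // severityNumber from the OTLP example
```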
RUM beacons
/v1/rum/events
Production-grade browser RUM: structured UA, geo enrichment, release tagging, source-map symbolication, origin allowlist, server-side sample floor, SDS redaction, GDPR purge.
Drop the Funnel RUM SDK into your site and beacons start flowing automatically. The SDK captures the full Core Web Vitals suite (LCP, FCP, CLS, INP, TTFB), uncaught errors with stack traces, SPA route changes, and any custom events you emit.
<script
src="https://YOUR_FUNNEL/sdk/funnel-rum.js"
data-funnel-endpoint="https://YOUR_FUNNEL"
data-funnel-key="st_public_xxxxxxxxxxx"
data-funnel-release="abc1234"
data-funnel-sample="1.0"
async></script>
Production-grade properties
- When a key has an allowed_origins array, the request's Origin header must match one of the entries (exact host or *.example.com wildcard). A scraped public key cannot be used from an attacker-controlled origin.
- RumOriginAuth delegates to the shared 60 s auth cache. Bcrypt runs at most once per key per minute regardless of beacon rate.
- Attribute keys are never passed to String.to_atom/1. Non-string keys reject with HTTP 400. Regression-tested with a 60-key attack payload that must add < 10 atoms.
- The burst limit returns 429 with Retry-After; daily cap returns 429 with the used/limit in the message.
- Events run through Funnel.Sds.redact_rum/2 against url, error_message, and every attribute string leaf before insert. Matches are rewritten to [REDACTED:<rule>]; raw secrets never reach Postgres.
- Telemetry is emitted as [:funnel, :rum, :ingest, :stop] with measurements {count, bytes, errors, duration_us} and a disposition status (:ok / :rate_limited / :quota_exceeded / :invalid_* / :overloaded).
- Funnel.Rum.purge_user/3 permanently deletes RUM events for a set of session_ids in a single transaction with an audit-log row. It is used to fulfil GDPR Article 17 / DSAR requests.
- A user-agent parser (Funnel.Rum.UserAgent) extracts browser_name, browser_version, os_name, os_version, device_type, and is_bot into dedicated indexed columns. Filter on "Safari 17 on iPhone" without UA-string regex at query time.
- country / region / city are extracted from cf-ipcountry, cloudfront-viewer-country, etc. on ingest. Raw IPs are never persisted.
- Upload .map files via POST /v1/sourcemaps or the bundled mix funnel.upload_sourcemaps CLI. A pure-Elixir VLQ parser rewrites app.min.js:1:48211 → src/PaymentForm.tsx:189:24 on the read path. Parsed maps are cached in :persistent_term.
- Events carry a release column (typically a git SHA) for "errors that started after deploy X" queries, and for matching against uploaded source maps.
- A max_sample_rate setting (0.0..1.0) downsamples server-side. A misbehaving client that bumped its sampleRate to 1.0 still gets capped: defence in depth above the quota system.
- Events with a trace_id in their attributes are upserted into session_trace_links so the Sessions UI can surface "RUM errors observed in this session" without an extra query.
Source maps
Run once per release as part of CI, after the build but before the deploy:
mix funnel.upload_sourcemaps \
--endpoint https://YOUR_FUNNEL \
--key st_xxxxxxxxxx \
--release $(git rev-parse HEAD) \
dist/
Walks the directory for *.map files and POSTs each to /v1/sourcemaps. Idempotent (re-uploads for the same release replace). Exits non-zero on any upload failure so CI fails loudly. Source maps are stored gzipped per (project, release, file) and never served; Funnel only uses them to rewrite stack frames on the read path.
Hard limits
| Limit | Cap | On violation |
|---|---|---|
| events per batch | 500 | HTTP 413 |
| total payload (decompressed) | 5 MB | HTTP 413 |
| string fields (URL, UA, stack, …) | 4 KB | silently clamped |
| attribute keys per event | 64 | HTTP 400 |
| attribute nesting depth | 8 | HTTP 400 |
| attribute value size | 2 KB | silently clamped |
| session_id charset | [A-Za-z0-9_:.-]{1,128} | HTTP 400 |
| burst (tier developer) | 500 events / sec | HTTP 429 with Retry-After |
| daily quota | plan-tier dependent | HTTP 429 |
| pipeline buffer (high-water mark) | 10 000 events | HTTP 503 with Retry-After |
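A client can avoid the 400 and 413 responses in the table by checking the session_id and splitting large event lists before sending. The sketch below assumes the payload fields session_id, user_agent, and events used by the beacon payload in this section; toBatches is a hypothetical helper and does not check the 5 MB payload cap.

```javascript
// Client-side pre-flight checks mirroring two of the hard limits.
const SESSION_ID_RE = /^[A-Za-z0-9_:.-]{1,128}$/;
const MAX_EVENTS_PER_BATCH = 500;

// Split events into payloads of at most 500 events each.
function toBatches(sessionId, userAgent, events) {
  if (!SESSION_ID_RE.test(sessionId)) {
    throw new Error("invalid session_id: " + sessionId);
  }
  const batches = [];
  for (let i = 0; i < events.length; i += MAX_EVENTS_PER_BATCH) {
    batches.push({
      session_id: sessionId,
      user_agent: userAgent,
      events: events.slice(i, i + MAX_EVENTS_PER_BATCH),
    });
  }
  return batches;
}

const manyEvents = Array.from({ length: 1201 }, (_, i) => ({ kind: "pageview", time: 1715451600000 + i }));
const batches = toBatches("s_abc123xyz", "Mozilla/5.0", manyEvents);
console.log(batches.map((b) => b.events.length));
```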
Beacon payload
{
"events": [
{
"fcp_ms": 410,
"kind": "pageview",
"time": 1715451600000,
"ttfb_ms": 88,
"url": "https://example.com/pricing"
},
{
"cls": 0.04,
"inp_ms": 95,
"kind": "vitals",
"lcp_ms": 1240,
"time": 1715451600500
},
{
"error_message": "TypeError: cannot read 'x' of undefined",
"error_stack": "at PaymentForm…",
"kind": "error",
"time": 1715451601000
}
],
"session_id": "s_abc123xyz",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/..."
}
CORS & Origin allowlist
The endpoint sets Access-Control-Allow-Origin: * and accepts the API key via the ?api_key= query parameter (legacy ?token= is still accepted). This is required for navigator.sendBeacon, which can't attach custom headers.
On its own, open CORS plus a public key is not safe: anyone who scrapes the key from your site can flood your ingest. The per-key allowed_origins column closes that gap. When set, the browser's Origin header must match one of the entries:
# Restrict a public RUM key to your production domains:
Funnel.Accounts.update_api_key(key, %{
"allowed_origins" => [
"https://app.example.com",
"https://*.staging.example.com"
]
})
Wildcards follow cookie-domain semantics: *.example.com matches any subdomain but NOT the bare apex. Scheme and port must match exactly; there's no implicit upgrade.
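The matching rules can be sketched as follows. This is an illustration of the stated semantics (exact scheme and port, wildcard matches subdomains only), not the server's actual code; originAllowed is a hypothetical name, and normalising default ports (443 for https, 80 for http) is an assumption.

```javascript
// Check an Origin header value against an allowed_origins list.
function originAllowed(origin, allowedOrigins) {
  let candidate;
  try {
    candidate = new URL(origin);
  } catch {
    return false; // not a parseable origin (e.g. "null")
  }
  return allowedOrigins.some((entry) => {
    const m = /^(https?):\/\/(\*\.)?([^/:]+)(?::(\d+))?$/.exec(entry);
    if (!m) return false;
    const [, scheme, wildcard, host, port] = m;
    // Scheme must match exactly: no implicit http -> https upgrade.
    if (candidate.protocol !== scheme + ":") return false;
    // URL.port is "" for the scheme's default port, so normalise the entry too.
    const defaultPort = scheme === "https" ? "443" : "80";
    const entryPort = port && port !== defaultPort ? port : "";
    if (candidate.port !== entryPort) return false;
    const hostname = candidate.hostname;
    return wildcard
      ? hostname.endsWith("." + host.toLowerCase()) // subdomains only, never the apex
      : hostname === host.toLowerCase();
  });
}

const allowed = ["https://app.example.com", "https://*.staging.example.com"];
console.log(originAllowed("https://pr-1.staging.example.com", allowed));
console.log(originAllowed("https://staging.example.com", allowed));
```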
Right to erasure (GDPR Article 17)
RUM events are identified by session_id, not by a user identifier. To fulfil a DSAR, gather the user's session IDs from session_recordings (or wherever your app tracks the user → session mapping) and call:
{:ok, %{events: n}} =
Funnel.Rum.purge_user(project, ["s_abc123", "s_def456"], actor: "admin:42")
Runs in a single transaction; a security_audit_log row is written before COMMIT. There is no undo.
Session Replay
/v1/sessions
rrweb-style session capture + replay. Drop-in browser SDK, atomic counters, SDS-aware redaction, cross-pillar trace linking, GDPR-grade purge.
Funnel records browser sessions as rrweb-style event streams, replays them visually with the bundled rrweb-player hook, and ties each session back to your traces / logs / errors via a shared trace_id. The ingest endpoint is CORS-enabled so the SDK can post directly from the browser. The endpoint at /v1/sessions is canonical; /v1/rum/sessions is kept as an alias for legacy clients.
Quick start — the Funnel Replay SDK
Drop one script tag on your page. The SDK lazy-loads rrweb-record from jsDelivr the first time a session is sampled, batches events, masks inputs by default, and flushes via navigator.sendBeacon on pagehide.
<script
src="https://YOUR_FUNNEL/sdk/funnel-replay.js"
data-funnel-endpoint="https://YOUR_FUNNEL"
data-funnel-key="st_public_xxxxxxxxxxx"
data-funnel-sample="1.0"
data-funnel-mask-inputs="true"
data-funnel-block-selectors=".funnel-private,.no-record"
data-funnel-consent="required"
async></script>
Programmatic init
FunnelReplay.init({
endpoint: "https://YOUR_FUNNEL",
key: "st_public_xxxxxxxxxxx",
sampleRate: 1.0,
maskAllInputs: true,
blockSelectors: [".funnel-private"],
maskSelectors: [".funnel-pii"],
user: { id: "u-42", email: "user@example.com" },
consentRequired: true, // hold until optIn()
})
// Public API
FunnelReplay.optIn() // start recording
FunnelReplay.optOut() // stop + persistent flag
FunnelReplay.identify({ id: "u-42", plan: "team" }) // attach user
FunnelReplay.addEvent("checkout_clicked", { ... }) // custom marker
FunnelReplay.linkTrace(traceId) // pin a backend trace
- <input>, <textarea>, <select> values masked
- data-funnel-mask attribute → text masked
- data-funnel-block attribute → removed from capture
- CSS allow / block selector lists at init
- consentRequired: true blocks until optIn()
- optOut() flushes a sentinel and sets a per-key localStorage flag
- Future page loads honour the opt-out without asking
- Buffer flushes every 5 s or at 100 events
- visibilitychange / pagehide use sendBeacon
- Bounded buffer: oldest drops past 1 000 frames
- Sampling sticky per sessionStorage-bound session
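The buffering behaviour in the list above (flush every 5 s or at 100 events, drop the oldest frame past 1 000) can be sketched like this. It is not the SDK's source; the ReplayBuffer class, its option names, and the paused flag (standing in for "waiting for consent") are all hypothetical.

```javascript
// Illustrative bounded buffer with size- and time-based flushing.
class ReplayBuffer {
  constructor(send, { maxBatch = 100, maxFrames = 1000, flushMs = 5000 } = {}) {
    this.send = send; // callback that receives an array of events
    this.maxBatch = maxBatch;
    this.maxFrames = maxFrames;
    this.flushMs = flushMs;
    this.frames = [];
    this.paused = false; // e.g. while consent is pending
    this.timer = null;
  }

  push(event) {
    this.frames.push(event);
    // Bounded buffer: drop the oldest frame once the cap is exceeded.
    if (this.frames.length > this.maxFrames) this.frames.shift();
    if (this.frames.length >= this.maxBatch) this.flush();
  }

  flush() {
    if (this.paused) return;
    // Send in chunks of at most maxBatch events.
    while (this.frames.length > 0) {
      this.send(this.frames.splice(0, this.maxBatch));
    }
  }

  start() {
    this.timer = setInterval(() => this.flush(), this.flushMs);
  }

  stop() {
    clearInterval(this.timer);
    this.timer = null;
    this.flush();
  }
}

const sentBatches = [];
const buffer = new ReplayBuffer((batch) => sentBatches.push(batch));
buffer.paused = true;
for (let i = 0; i < 1005; i++) buffer.push({ time: i });
const heldWhilePaused = buffer.frames.length;
const oldestKept = buffer.frames[0].time;
buffer.paused = false;
buffer.flush();
console.log(heldWhilePaused, oldestKept, sentBatches.length);
```

In a browser, send would typically wrap navigator.sendBeacon, and start() would be called once recording begins.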
Demo page: /sdk/demo.html (served by your Funnel instance).
Source: priv/static/sdk/funnel-replay.js.
Request body
{
"events": [
{
"data": {
"height": 900,
"href": "https://example.com/checkout",
"width": 1440
},
"kind": "meta",
"time": 1715793721000
},
{
"data": {
"initialOffset": {
"left": 0,
"top": 0
},
"node": {
"childNodes": [
"…rrweb tree…"
],
"type": 0
}
},
"kind": "full_snapshot",
"time": 1715793721100
},
{
"data": {
"adds": [
"…"
],
"source": 2,
"type": "mutation"
},
"kind": "incremental_snapshot",
"time": 1715793722400
},
{
"data": {
"id": 87,
"source": 5,
"text": "•••• •••• •••• 4242"
},
"kind": "input",
"time": 1715793723900
},
{
"data": {
"message": "TypeError: Cannot read 'amount' of undefined",
"stack": "at PaymentForm.submit (/static/js/app.js:42:11)",
"trace_id": "abc123def4567890abc123def4567890"
},
"kind": "error",
"time": 1715793725100
}
],
"session_id": "s_2pK3xA9bF7d",
"start_url": "https://example.com/checkout",
"started_at": "2026-05-15T17:42:01.000Z",
"trace_id": "abc123def4567890abc123def4567890",
"user_agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/...",
"user_identifier": "u_482"
}
Event kinds
The first seven match rrweb's numeric type enum (0..6) and are decoded for the player. The remaining four are non-rrweb but surface as marks on the player's timeline:
- dom_content_loaded / load: lifecycle pings (rrweb 0 / 1)
- full_snapshot: initial DOM tree (rrweb 2). Required for replay.
- incremental_snapshot: mutation records (rrweb 3). Bulk of the stream.
- meta: viewport + href (rrweb 4). Required for replay.
- custom: SDK custom events, slow-XHR marks (rrweb 5)
- plugin: rrweb plugin payload (rrweb 6)
- error: uncaught error / unhandled rejection. Bumps error_count; red mark on the scrubber.
- navigation: pushState / hashchange / popstate. Blue mark.
- console: console.error / console.warn. Orange / yellow mark.
- input / mouse: masked input + pointer events.
Unknown kinds are coerced to custom rather than rejected, so forward-compat with new rrweb subtypes is automatic.
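For a hand-rolled client that receives raw rrweb events, the kind names can be derived from rrweb's numeric type with a lookup like the one below. The fallback to custom mirrors the coercion described above; normalizeKind is a hypothetical client-side helper, and whether the endpoint itself accepts numeric types is not stated here.

```javascript
// rrweb EventType values 0..6, in enum order.
const RRWEB_KINDS = [
  "dom_content_loaded",
  "load",
  "full_snapshot",
  "incremental_snapshot",
  "meta",
  "custom",
  "plugin",
];
// Non-rrweb kinds listed in this section.
const EXTRA_KINDS = new Set(["error", "navigation", "console", "input", "mouse"]);

// Accept either an rrweb numeric type or a kind string; fall back to "custom".
function normalizeKind(value) {
  if (Number.isInteger(value)) return RRWEB_KINDS[value] ?? "custom";
  if (typeof value === "string" && (RRWEB_KINDS.includes(value) || EXTRA_KINDS.has(value))) {
    return value;
  }
  return "custom";
}

console.log(normalizeKind(2), normalizeKind("navigation"), normalizeKind("brand_new_subtype"));
```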
Hard limits
| Limit | Cap | On violation |
|---|---|---|
| events per batch | 500 | HTTP 413 |
| total payload (decompressed) | 5 MB | HTTP 413 |
| per-event data size | 512 KB | HTTP 413 |
| jsonb nesting depth | 16 | HTTP 400 |
| header strings (UA, URL, user) | 250 bytes | silently clamped |
| session_id charset | [A-Za-z0-9_:.-]{1,128} | HTTP 400 |
| burst (tier developer) | 500 events / sec | HTTP 429 with Retry-After |
| daily quota | plan-tier dependent | HTTP 429 |
Cross-pillar correlation (sessions ↔ traces ↔ logs)
Send a trace_id (W3C trace-context format, 16 or 32 hex chars) at the top of the batch or inside any event's data. Funnel persists each (session_id, trace_id) tuple into session_trace_links with an idempotent upsert. Both directions are then indexed:
- Session → traces: the recording drawer surfaces each linked trace as a clickable link to the Traces explorer.
- Trace → sessions: the trace-detail page renders a "Session replays" panel listing every recording that touched it.
Malformed trace IDs (anything outside [a-fA-F0-9]{16,32}) are silently dropped — they don't fail the ingest.
The SDK exposes FunnelReplay.linkTrace(traceId) for per-action correlation (e.g. emit a span on the server, return its traceparent to the client, call linkTrace).
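To show which trace IDs in a batch would survive the validation described above, the sketch below collects the top-level trace_id and any trace_id inside event data, keeping only values that match [a-fA-F0-9]{16,32}. collectTraceIds is a hypothetical helper; note that the pattern as written also accepts lengths between 16 and 32.

```javascript
// Pattern quoted in the text for acceptable trace IDs.
const TRACE_ID_RE = /^[a-fA-F0-9]{16,32}$/;

// Gather the distinct, well-formed trace IDs referenced by a session batch.
function collectTraceIds(batch) {
  const ids = new Set();
  const consider = (value) => {
    if (typeof value === "string" && TRACE_ID_RE.test(value)) ids.add(value);
  };
  consider(batch.trace_id);
  for (const event of batch.events || []) {
    consider(event.data && event.data.trace_id);
  }
  return [...ids];
}

const linked = collectTraceIds({
  session_id: "s_2pK3xA9bF7d",
  trace_id: "abc123def4567890abc123def4567890",
  events: [
    { kind: "error", data: { trace_id: "abc123def4567890abc123def4567890" } },
    { kind: "custom", data: { trace_id: "not-a-trace-id" } },
    { kind: "meta", data: {} },
  ],
});
console.log(linked);
```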
SDS — server-side secret redaction
Every string leaf in event data is scanned against enabled RUM-scoped Sensitive Data Scanner rules before persistence. Matches with action = "redact" are rewritten in place to [REDACTED:<rule>]; a sds_findings row is recorded with source_kind = "sessions" so operators have a paper trail.
The response payload includes a redacted count alongside accepted so client telemetry can flag pages where SDS keeps firing.
Example: curl
curl -X POST http://localhost:4000/v1/sessions \
-H 'authorization: Bearer st_YOUR_API_KEY' \
-H 'content-type: application/json' \
-d '{"events":[{"data":{"height":900,"href":"https://example.com/checkout","width":1440},"kind":"meta","time":1715793721000},{"data":{"initialOffset":{"left":0,"top":0},"node":{"childNodes":["…rrweb tree…"],"type":0}},"kind":"full_snapshot","time":1715793721100},{"data":{"adds":["…"],"source":2,"type":"mutation"},"kind":"incremental_snapshot","time":1715793722400},{"data":{"id":87,"source":5,"text":"•••• •••• •••• 4242"},"kind":"input","time":1715793723900},{"data":{"message":"TypeError: Cannot read 'amount' of undefined","stack":"at PaymentForm.submit (/static/js/app.js:42:11)","trace_id":"abc123def4567890abc123def4567890"},"kind":"error","time":1715793725100}],"session_id":"s_2pK3xA9bF7d","start_url":"https://example.com/checkout","started_at":"2026-05-15T17:42:01.000Z","trace_id":"abc123def4567890abc123def4567890","user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 14_5) AppleWebKit/...","user_identifier":"u_482"}'
# Response:
# { "accepted": 5, "redacted": 1 }
Right to erasure (GDPR Article 17)
Funnel.Sessions.purge_user/3 permanently deletes every recording, event, and trace-link row for a given user_identifier within a project in a single transaction. A security_audit_log entry with the actor and per-table counts is written before the transaction commits, so the deletion itself is auditable even though the underlying rows are gone.
# From an admin LiveView or operator IEx session:
{:ok, %{recordings: r, events: e, links: l}} =
Funnel.Sessions.purge_user(project, "user@example.com", actor: "admin:42")
There is no undo — once the COMMIT lands, the data is gone. Authenticate the DSAR (data-subject-access request) out of band before invoking this.
Production-grade properties
- Event keys are never passed to String.to_atom/1. Adversarial 100-unique-key payloads add < 5 atoms to the table (regression-tested).
- event_count / error_count upsert via a single SQL ON CONFLICT DO UPDATE. Concurrent batches for the same session both succeed without lost updates.
- Telemetry is emitted as [:funnel, :sessions, :ingest, :stop] with measurements {count, errors, bytes, duration_us} and a disposition status (:ok / :rate_limited / :quota_exceeded / :invalid_session_id / :persist_failed).
- session_events is monthly RANGE-partitioned; the daily Oban cron creates next month's partition. Oban.Peers.Postgres ensures cron fires exactly once per cluster.
- Key revocations are broadcast over Funnel.CacheBus; every node drops its local ETS row within milliseconds rather than waiting up to 60 s for TTL.
- Prefer blockSelectors / maskSelectors / data-funnel-block attributes: masking at the source is always cheaper and safer than redacting after the fact.
- Storage uses three tables: session_recordings (one row per session_id, header data + counters), partitioned session_events (one row per rrweb event, (time, id) composite PK, monthly RANGE partitions with BRIN index on time), and session_trace_links (set-typed (project, session, trace) for cross-pillar correlation). The visual player is the ReplayPlayer LiveView hook in assets/js/hooks/replay_player.js; it loads rrweb-player from jsDelivr on demand and renders inside a sandboxed iframe.
Edge Workers
User-deployed JavaScript handlers served at /w/<org_slug>/<project_slug>/<worker_slug>. No API key on the public URL: these are your published endpoints.
Two runtimes are available:
- Node JS: real JavaScript in a per-invocation V8 sandbox (vm.runInContext) with no globals beyond request, env, console, and (when an outbound allow-list is set) funnel. Spawned with --max-old-space-size=<mb> and --disallow-code-generation-from-strings. Wall-clock kill enforced from the BEAM side via Port close.
- Safe templates: a tiny non-Turing-complete DSL (status:, header:, body:) with {{request.query.name}} placeholders. Cannot infinite-loop. Useful for static / redirect / health-check endpoints when JS isn't justified.
Hello world
// Available in scope: request, env, console, funnel.
// Return a Response-shaped object or a JSON-able value.
if (request.method === "GET") {
return {
status: 200,
headers: { "content-type": "application/json" },
body: JSON.stringify({
hello: request.query.name || "world",
path: request.path
})
};
}
return { status: 405, body: "method not allowed" };
Request shape
| Field | Description |
|---|---|
| request.method | HTTP verb (GET, POST, …). |
| request.path | Full request path including the /w/… prefix. |
| request.url | Full URL including scheme, host, query string. |
| request.headers | Header map (frozen). cookie and authorization are stripped before reaching your code. traceparent passes through. |
| request.query | Parsed query string as { key: value }. |
| request.body | Parsed body for JSON/form, raw string otherwise, null for GET/HEAD/OPTIONS. Bodies over 1MB are rejected with 413 before reaching your code. |
| env | Your worker's env vars (decrypted at request time, frozen for the lifetime of one invocation). |
| console.log/info/warn/error/debug | Output goes into the invocation's stored logs column. First 200 lines per call. |
| funnel.fetch(url, opts) | Allow-list-gated outbound HTTP. Throws if no outbound_allow_list is configured. See Outbound below. |
Response shape
Return a Response-like object: { status, headers, body }. Or return a string (becomes plain-text 200), or any JSON-serializable value (becomes JSON 200). Throwing yields a 500 with the error message captured.
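The three return forms can be pictured as a normalisation step. The sketch below is only a guess at the shape of that step: normalizeResult is a hypothetical name, detecting a Response-like object by a numeric status field is an assumption, and so are the exact content-type values.

```javascript
// Turn a worker's return value into a { status, headers, body } triple.
// Assumption: an object with a numeric `status` is treated as Response-like.
function normalizeResult(result) {
  if (typeof result === "string") {
    return { status: 200, headers: { "content-type": "text/plain" }, body: result };
  }
  if (result !== null && typeof result === "object" && typeof result.status === "number") {
    return { status: result.status, headers: result.headers || {}, body: result.body ?? "" };
  }
  // Any other JSON-serializable value becomes a JSON 200.
  return {
    status: 200,
    headers: { "content-type": "application/json" },
    body: JSON.stringify(result),
  };
}

console.log(normalizeResult("pong"));
console.log(normalizeResult({ hello: "world" }));
console.log(normalizeResult({ status: 405, body: "method not allowed" }));
```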
Auth
Workers are public by default. Toggle auth_required and rotate a token from the dashboard's Settings → Authentication tab. Callers must then send Authorization: Bearer <token>; the plaintext is shown to you exactly once on rotation and never persisted in cleartext (only a SHA-256 hash).
Env vars (secrets at rest)
Set per-worker secrets from Settings → Environment variables. Each value is encrypted with AES-256-GCM under a key derived from the application secret via HKDF before being written to the DB. The dashboard only ever shows a masked fingerprint; the worker sees the decrypted value at request time. Names must match [A-Z_][A-Z0-9_]*; max 64 keys, 8KB per value.
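The naming and size rules can be checked before submitting a configuration. This is a sketch under assumptions: validateEnvVars is a hypothetical helper, and "8KB" is read as 8192 bytes of UTF-8.

```javascript
// Validate a map of env var names to values against the stated limits.
const ENV_NAME_RE = /^[A-Z_][A-Z0-9_]*$/;
const MAX_ENV_KEYS = 64;
const MAX_ENV_VALUE_BYTES = 8 * 1024; // assumption: 8KB means 8192 bytes

function validateEnvVars(vars) {
  const errors = [];
  const names = Object.keys(vars);
  if (names.length > MAX_ENV_KEYS) errors.push(`too many keys: ${names.length}`);
  for (const name of names) {
    if (!ENV_NAME_RE.test(name)) errors.push(`invalid name: ${name}`);
    const bytes = new TextEncoder().encode(String(vars[name])).length;
    if (bytes > MAX_ENV_VALUE_BYTES) errors.push(`value too large: ${name}`);
  }
  return errors;
}

console.log(validateEnvVars({ STRIPE_KEY: "sk_test_123", api_base: "https://example.com" }));
```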
CORS
Set cors_origins from Settings → CORS allow-list. Without an entry, browser callers are blocked (preflight returns 403). Matching origin → response carries Access-Control-Allow-Origin + Vary: Origin; OPTIONS preflight returns 204 with the Access-Control-Allow-Methods/Headers/Max-Age echo. Wildcard * mirrors the request origin so the worker stays credential-compatible.
Staging environment
Every worker has two version slots: prod (active_version_id) and staging (staging_version_id). The dashboard's Deploy to staging button writes a new version to the staging slot without touching prod; Promote → prod swaps it in. Staging is reachable via a hidden URL segment:
# Prod
curl https://YOUR-FUNNEL/w/acme/api/hello
# Staging — same worker, separate version
curl https://YOUR-FUNNEL/w/acme/api/hello/_staging
The _staging segment is consumed by the dispatcher; your code sees the remaining path as request.path. Invocations carry an env attribute ("prod"/"staging") for filtering.
Outbound HTTP via funnel.fetch
User code has no fetch / http / net. The single egress primitive is funnel.fetch(url, opts), which round-trips through an Elixir-side proxy that enforces three defenses on every call:
- Allow-list. The URL's host[:port] must appear in outbound_allow_list, or the list must include *. Empty list = throws immediately.
- SSRF guard. After DNS resolution, refuses private (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), loopback (127.0.0.0/8), link-local (169.254/16, which blocks the cloud metadata service), and multicast addresses.
- Resource caps. 1MB response cap, 5s default timeout (override via timeoutMs, clamped 50–30,000ms), Host / Connection / Transfer-Encoding etc. headers blocked.
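As an illustration of the SSRF guard and the timeout clamp, the sketch below checks a resolved IPv4 address against the listed ranges and clamps timeoutMs to 50–30,000 ms. The real proxy runs on the Elixir side, so this is only a model of the rules: the function names are hypothetical, 224.0.0.0/4 is the standard IPv4 multicast block rather than a value from this page, IPv6 is not handled, and unparseable input is treated as blocked.

```javascript
// Convert dotted-quad IPv4 to a number, or null if malformed.
function ipv4ToInt(ip) {
  const parts = String(ip).split(".");
  if (parts.length !== 4) return null;
  let n = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    n = n * 256 + octet;
  }
  return n;
}

// [network, prefix length] pairs for the refused ranges.
const BLOCKED_RANGES = [
  ["10.0.0.0", 8], // private
  ["172.16.0.0", 12], // private
  ["192.168.0.0", 16], // private
  ["127.0.0.0", 8], // loopback
  ["169.254.0.0", 16], // link-local (cloud metadata service)
  ["224.0.0.0", 4], // multicast (standard block, assumed)
];

function isBlockedIPv4(ip) {
  const addr = ipv4ToInt(ip);
  if (addr === null) return true; // fail closed on anything unparseable
  return BLOCKED_RANGES.some(([network, bits]) => {
    const start = ipv4ToInt(network);
    const size = 2 ** (32 - bits);
    return addr >= start && addr < start + size;
  });
}

// 5s default, clamped to the 50..30000 ms window.
function clampTimeout(timeoutMs) {
  return Math.min(30000, Math.max(50, timeoutMs ?? 5000));
}

console.log(isBlockedIPv4("169.254.169.254"), isBlockedIPv4("8.8.8.8"), clampTimeout(60000));
```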
// Configure outbound_allow_list = ["api.stripe.com"] in the dashboard.
const resp = await funnel.fetch("https://api.stripe.com/v1/charges", {
method: "POST",
headers: {
authorization: `Bearer ${env.STRIPE_KEY}`,
"content-type": "application/x-www-form-urlencoded"
},
body: "amount=1000&currency=usd",
timeoutMs: 3000
});
return {
status: resp.status,
headers: { "content-type": "application/json" },
body: resp.body
};
Security model
- No ambient capabilities. fetch, require, process, setTimeout, the filesystem, the network: all undefined. typeof require returns the string "undefined". The only egress primitive is the allow-list-gated funnel.fetch.
- Per-project warm pool. Sandboxes are reused within the same project (skipping ~30–60ms Node start-up per request) but never across projects. Each invocation creates a fresh vm.createContext, so two back-to-back requests can't share V8 state. Pool caps: 32 in-flight global, 8 per project, 4 warm per project. Idle sandboxes self-terminate after 30s.
- Replace-on-failure. Timeout, OOM, or crash kills the sandbox; the next acquire spawns fresh. The Pool traps exits so abnormal sandbox deaths never crash the host.
- Wall-clock kill. timeout_ms is enforced on the BEAM side by closing the Port; V8's own vm timeout is a second layer.
- Memory cap. --max-old-space-size=<memory_mb>. OOM kills the Node process cleanly; the invocation is recorded as crashed.
- Quotas + per-project concurrency. daily_invocation_cap per worker, accounted via an ETS-backed atomic counter (seeded once per day from SQL), plus a per-project sandbox cap. Over-cap returns 429.
- Isolation driver. The Node spawn goes through a pluggable Funnel.Workers.Runtime.Driver. The default driver is in-process. On Linux you can opt in to the Hardened driver (FUNNEL_WORKERS_DRIVER=Elixir.Funnel.Workers.Runtime.Driver.Hardened), which wraps Node in firejail (caps drop, seccomp, --net=none) or unshare (user/PID/net/mount namespaces) when available.
… Runtime.Driver that spawns Node inside Firecracker microVMs or gVisor. The behaviour is just spawn_command/1; adding a driver doesn't touch the Sandbox, Pool, or controller.
HTTP status codes
| Status | Meaning |
|---|---|
| 200 | Whatever your code returns. |
| 204 | CORS preflight (OPTIONS with matching origin). |
| 401 | auth_required and the Bearer token is missing/wrong. |
| 403 | CORS preflight from a non-allow-listed origin. |
| 404 | Org / project / worker slug doesn't exist, or staging route hit a worker with no staging deploy. |
| 410 | Worker is disabled via the kill switch. |
| 413 | Request body exceeds the 1MB cap. |
| 429 | daily_invocation_cap reached or per-project concurrency cap reached. Carries Retry-After. |
| 500 | Runtime error in user code (exception, syntax error). |
| 503 | Pool exhausted across all projects, or Node runtime not available. |
| 504 | Wall-clock timeout. |
Observability
Each invocation writes a row to worker_invocations (partitioned by month, BRIN-indexed on time) and emits two metric points into the regular pipeline:
- workers.invocations: counter, tags worker, status, warmth.
- workers.latency_ms: gauge, same tags.
The invocation row also stores a JSONB attributes column. Today it carries:
- trace_id + parent_span_id: extracted from W3C traceparent, when the caller sent one. Lets the invocation row join to upstream traces.
- warmth: "cold" (fresh sandbox spawn) or "warm" (reused from pool). Useful for tuning the warm-pool sizing.
- env: "prod" or "staging".
Invocation writes are batched + retried by Funnel.Workers.InvocationWriter (5 attempts with exponential backoff, then a structured log line per row so an external collector can recover the data). The Pool drains gracefully on app shutdown (FUNNEL_WORKERS_DRAIN_MS, default 5s).
Safe templates
A non-Turing alternative to the JS runtime. Pick this when you only need to return a static response, a JSON envelope built from request fields, a redirect, or an env-driven config — and you want a strong guarantee that the worker cannot spin a CPU, eat memory, or call out anywhere.
Syntax
Source is line-oriented. Empty lines and lines starting with # are ignored. Three directives are recognised:
| Directive | Effect |
|---|---|
| status: 200 | HTTP status code. Defaults to 200. Last status: line wins. |
| header: name: value | Response header. Name is lowercased. Multiple header: lines accumulate; same-name lines overwrite. |
| body: … | Sets the response body. Last body: line wins. Placeholders interpolate. |
| # comment | Ignored. |
| other | Appended to the body as a literal line (after interpolation). Useful for multi-line bodies; see the HTML example. |
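To make the directive rules concrete, here is a minimal sketch of a parser for this syntax. It leaves out placeholder interpolation, and it is not the actual template engine: parseTemplate is a hypothetical name, and details such as trimming one space after body: and joining body lines with a newline are assumptions.

```javascript
// Parse safe-template source into { status, headers, body }.
function parseTemplate(source) {
  let status = 200; // default when no status: line is present
  const headers = {};
  let bodyLines = [];
  for (const line of source.split(/\r?\n/)) {
    // Empty lines and comments are ignored.
    if (line.trim() === "" || line.startsWith("#")) continue;
    let m;
    if ((m = /^status:\s*(\d{3})\s*$/.exec(line))) {
      status = Number(m[1]); // last status: line wins
    } else if ((m = /^header:\s*([^:\s]+)\s*:\s*(.*)$/.exec(line))) {
      headers[m[1].toLowerCase()] = m[2]; // same-name lines overwrite
    } else if ((m = /^body:\s?(.*)$/.exec(line))) {
      bodyLines = [m[1]]; // last body: line wins and resets the body
    } else {
      bodyLines.push(line); // any other line is appended to the body
    }
  }
  return { status, headers, body: bodyLines.join("\n") };
}

const parsedTemplate = parseTemplate(
  [
    "# redirect example",
    "status: 302",
    "header: Location: https://example.com/docs",
    "header: cache-control: no-store",
  ].join("\n")
);
console.log(parsedTemplate);
```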
Placeholders
Anything inside {{…}} on any line is replaced before the line is written. Unknown placeholders render as the empty string (never throw).
| Placeholder | Resolves to |
|---|---|
| {{request.method}} | HTTP verb: GET, POST, etc. |
| {{request.path}} | Request path including the /w/… prefix. |
| {{request.url}} | Full URL. |
| {{request.headers.NAME}} | Specific request header (case-insensitive name). |
| {{request.query.NAME}} | Query-string value. |
| {{env.NAME}} | Env var from the worker's env_vars map (decrypted at request time). |
| {{now}} | Current UTC time, ISO 8601. |
| {{raw NAME}} | Bypass auto-escaping for trusted interpolations; see below. |
Auto-escaping
Interpolated values are escaped by default based on the response content-type you set with a header: directive. This is the only thing standing between you and a reflected XSS when echoing query strings into a page.
| Content-type | Escape applied |
|---|---|
| application/json (and application/*+json) | JSON string-escape (quotes, backslashes, newlines, control characters). Output is safe to drop inside a JSON string literal. |
| text/html, application/xhtml+xml, anything containing xml | HTML attribute-safe escape (&, <, >, quotes). |
| Everything else (incl. no header) | No escape (raw interpolation). |
Header values are not escaped. CRLF (\r / \n) is stripped instead, so a malicious {{request.query.x}} can't smuggle a Set-Cookie via response-splitting.
To opt out of escaping for a specific value (you've already validated it, you're emitting pre-rendered HTML, etc.), prefix the placeholder name with raw:
header: content-type: text/html
# {{request.query.name}}: escaped, so <b>Bob</b> renders as &lt;b&gt;Bob&lt;/b&gt;
# {{raw request.query.name}}: raw, so <b>Bob</b> renders as <b>Bob</b>
body: <h1>Hi {{request.query.name}}</h1>
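The escaping table can be modelled with a small function that picks an escape by content-type. This is a sketch of the described behaviour, not the engine's code: escapeFor and sanitizeHeaderValue are hypothetical names, and the exact entity forms used for quotes are an assumption.

```javascript
// Escape an interpolated value according to the response content-type.
function escapeFor(contentType, value) {
  const type = (contentType || "").split(";")[0].trim().toLowerCase();
  const text = String(value);
  if (type === "application/json" || (type.startsWith("application/") && type.endsWith("+json"))) {
    // JSON string-escape: stringify, then drop the surrounding quotes.
    return JSON.stringify(text).slice(1, -1);
  }
  if (type === "text/html" || type.includes("xml")) {
    return text
      .replace(/&/g, "&amp;")
      .replace(/</g, "&lt;")
      .replace(/>/g, "&gt;")
      .replace(/"/g, "&quot;")
      .replace(/'/g, "&#39;");
  }
  return text; // everything else, including no content-type: raw interpolation
}

// Header values are not escaped; CR and LF are stripped instead.
function sanitizeHeaderValue(value) {
  return String(value).replace(/[\r\n]/g, "");
}

console.log(escapeFor("text/html", "<b>Bob</b>"));
console.log(escapeFor("application/json", 'say "hi"'));
console.log(sanitizeHeaderValue("ok\r\nSet-Cookie: a=b"));
```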
Examples
JSON response that echoes a query param
status: 200
header: content-type: application/json
body: {"hello":"{{request.query.name}}","served_at":"{{now}}"}
Redirect
status: 302
header: location: https://example.com/{{request.query.to}}
header: cache-control: no-store
Health check with current time
# Minimal health endpoint — perfect for uptime monitors.
status: 200
header: content-type: text/plain
body: ok {{now}}
Public app config (env-driven)
# Set env_vars on the worker: API_BASE, FEATURE_FLAG
status: 200
header: content-type: application/json
header: cache-control: public, max-age=60
body: {"api":"{{env.API_BASE}}","new_ui":"{{env.FEATURE_FLAG}}"}
Multi-line HTML body
Lines that don't start with a directive become body lines.
status: 200
header: content-type: text/html
body: <!DOCTYPE html>
<html>
<head><title>Hi {{request.query.name}}</title></head>
<body><h1>Hello, {{request.query.name}}</h1>
<p>Served at {{now}}</p></body>
</html>
Limits & behaviour
- No loops, no branching. Templates are pure substitution. To pick a value based on the request, do it on the client (or upstream).
- Single body field. Either set body: on a single line, or build a multi-line body by writing literal lines beneath it. The output response has exactly one body.
- No nested placeholders. {{…}} cannot reference each other. Each interpolation is a single lookup.
- Headers and body still respect platform caps: the workers_public rate limit, daily invocation cap, and HTTP body size limits all apply just like the Node runtime.
- Every call still records an invocation row and emits workers.invocations & workers.latency_ms metrics, so safe-template workers show up next to JS workers in the dashboard and Metrics Explorer.
Host / agent heartbeat #
/v1/hosts/heartbeat
Infrastructure auto-discovery. Upsert host status every 30–60s. Required scope: ingest:hosts.
Single endpoint for hosts, containers, lambdas, and GPU nodes. The row
is upserted on (project_id, hostname). Send a heartbeat
every 30–60 seconds; the dashboard marks the host stale
after two missed intervals.
Request body
{
"arch": "arm64",
"cloud_provider": "aws",
"cpu_pct": 32.1,
"disk_pct": 41.2,
"hostname": "web-prod-03",
"instance_type": "c7g.xlarge",
"kind": "host",
"memory_pct": 64.8,
"os": "linux",
"region": "us-east-1",
"status": "healthy",
"tags": {
"env": "prod",
"role": "web"
}
}
kind is free-form. Cloud attribution fields (cloud_provider,
region, instance_type) are optional but
power cost-attribution joins against /v1/cost/records.
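A minimal agent-side sketch in Python, using only the standard library. The helper names `build_heartbeat_request` and `is_stale` are hypothetical, and the staleness rule is one assumed reading of "two missed intervals" (the dashboard's exact cutoff is not specified here).

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_heartbeat_request(base_url, api_key, payload):
    """Build (but do not send) a POST to /v1/hosts/heartbeat."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/hosts/heartbeat",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
    )


def is_stale(last_seen, now, interval_seconds=60, missed_intervals=2):
    """Assumed rule: stale once more than two full intervals have passed."""
    return now - last_seen > timedelta(seconds=interval_seconds * missed_intervals)
```

To send the heartbeat, pass the request to `urllib.request.urlopen(req, timeout=5)` from a loop that sleeps 30 to 60 seconds between iterations.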
Open an incident #
/v1/incidents
Forward incidents from PagerDuty, Statuspage, or any external system. Required scope: ingest:incidents.
Use this when an external alerting system (PagerDuty, OpsGenie, custom
CI) needs to open an incident in Funnel's incident-management view.
For Funnel's own alert rules with
auto_create_incident=true, incidents are opened
internally — no HTTP call needed.
Request body
{
"description": "Forwarded from PagerDuty incident PD-9182.",
"severity": "critical",
"tags": [
"checkout",
"pagerduty"
],
"title": "Checkout API: elevated 5xx rate",
"triggered_at": "2026-05-15T17:42:00Z"
}
The created incident's source is set to manual unless an
alert_rule_id is provided, in which case it inherits the rule's kind
(alert, anomaly, synthetic).
Cloud cost rollups #
/v1/cost/records
Bulk daily-grain cost line items. Auto-attributed to catalog services. Required scope: ingest:costs.
One row per (day, cloud_provider, resource_kind, service)
tuple. Funnel's daily CostAttributorJob joins these against
the Software Catalog on service.name and stamps
tier / owner_team
into the row's tags jsonb — so cost dashboards can group
by team without a manual mapping table.
Request body
{
"records": [
{
"amount_cents": 4287,
"cloud_provider": "aws",
"currency": "USD",
"day": "2026-05-15",
"environment": "prod",
"region": "us-east-1",
"resource_kind": "compute",
"service": "checkout-api",
"tags": {
"cost-category": "ec2"
},
"team": "payments"
},
{
"amount_cents": 1832,
"cloud_provider": "aws",
"day": "2026-05-15",
"environment": "prod",
"region": "us-east-1",
"resource_kind": "database",
"service": "checkout-api",
"team": "payments"
}
]
}
amount_cents is always an integer in the smallest unit of currency:
$12.34 → 1234. This avoids floating-point rounding when
aggregating across millions of rows.
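When a billing export gives amounts as decimal strings, the conversion is best done without passing through a float. The Python sketch below shows one way; `to_amount_cents` and its `minor_units` parameter are hypothetical names for illustration, not part of the Funnel API.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_amount_cents(amount, minor_units=2):
    """Convert a decimal amount string to an integer in the smallest currency unit."""
    scaled = Decimal(amount) * (Decimal(10) ** minor_units)
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# The float route silently loses a cent: 0.29 * 100 is 28.999999999999996.
float_route = int(0.29 * 100)
decimal_route = to_amount_cents("0.29")
```

Here `to_amount_cents("12.34")` gives 1234, matching the example above, while the naive float route turns 0.29 into 28 rather than 29.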
Security findings #
/v1/findings
Unified SIEM / SCA / secret / vuln findings, with automatic dedupe. Required scope: ingest:findings.
Single endpoint for all four security pillars. Funnel routes by source:
| source | Typical sender |
|---|---|
| siem | Cloud SIEM: CloudTrail, GuardDuty, Defender |
| workload | eBPF / container runtime sensors |
| secret_scanner | gitleaks, trufflehog, GitHub Secret Protection |
| vuln_scanner | trivy, snyk, Dependabot |
Request body
{
"findings": [
{
"external_id": "CVE-2026-12345",
"kind": "vuln.cve",
"metadata": {
"package": "openssl",
"version_affected": "<3.2.1"
},
"repository": "smor/funnel",
"severity": "high",
"source": "vuln_scanner",
"title": "CVE-2026-12345: openssl heap overflow"
},
{
"external_id": "gitleaks/aws-key-abc123",
"file_path": "config/runtime.exs",
"kind": "leaked_secret",
"line": 42,
"repository": "smor/funnel",
"severity": "critical",
"source": "secret_scanner",
"title": "AWS access key in commit"
}
]
}
Dedupe semantics
Findings carrying an external_id upsert on
(project_id, source, external_id). On a repeat:
- last_seen_at bumps to the current time
- seen_count increments by 1
- severity escalates if higher than the stored value (never de-escalates)
Findings without an external_id always insert. The response counts inserted vs deduped rows:
{ accepted: 3, deduped: 5, errors: 0 }.
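The dedupe rules can be modelled with an in-memory dictionary. This Python sketch is illustrative only: the severity ordering in `SEVERITY_RANK`, the function name `ingest_findings`, and the reading of `accepted` as "newly inserted" are assumptions, and the real upsert happens in the database.

```python
# Assumed ordering, lowest to highest.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def ingest_findings(rows, anonymous, project_id, findings, now):
    """Apply the dedupe rules to in-memory stores and return response counts."""
    accepted = deduped = 0
    for finding in findings:
        external_id = finding.get("external_id")
        if external_id is None:
            # No external_id: always insert.
            anonymous.append(dict(finding, last_seen_at=now, seen_count=1))
            accepted += 1
            continue
        key = (project_id, finding["source"], external_id)
        row = rows.get(key)
        if row is None:
            rows[key] = {"severity": finding["severity"], "seen_count": 1, "last_seen_at": now}
            accepted += 1
            continue
        # Repeat: bump last_seen_at, count it, escalate severity only upward.
        row["last_seen_at"] = now
        row["seen_count"] += 1
        if SEVERITY_RANK[finding["severity"]] > SEVERITY_RANK[row["severity"]]:
            row["severity"] = finding["severity"]
        deduped += 1
    return {"accepted": accepted, "deduped": deduped, "errors": 0}
```

Sending the same `external_id` first at medium, then at high, then at low leaves the stored row at high with a `seen_count` of 3.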
Deploy markers #
/v1/deployments
Record a service deploy. Rendered as vertical lines on metric charts. Required scope: ingest:deployments.
POST from CI when a service ships. The marker becomes a vertical dashed line on every chart that overlaps its timestamp — so you can correlate latency / error spikes with deploys at a glance.
Request body
{
"commit_sha": "9a8b7c6d5e4f3a2b1c0d",
"environment": "prod",
"link_url": "https://github.com/acme/checkout/actions/runs/123456",
"metadata": {
"pipeline": "github-actions",
"run_id": "123456"
},
"previous_version": "v2.3.7",
"service": "checkout-api",
"source": "ci",
"time": "2026-05-15T17:42:01Z",
"version": "v2.4.0"
}
When ingested telemetry carries a deployment.version
resource attribute, Funnel writes a marker with
source: "detected" automatically. Manual posts get source: "ci"
(or "manual") for provenance.
OTLP / HTTP #
The HTTP transport accepts both OTLP encodings: application/json
and application/x-protobuf. Either works with the same
three endpoints: no separate URL, no separate port. The decoder
dispatches on Content-Type.
SDK configuration
Point any OpenTelemetry SDK at Funnel via the standard env vars:
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4000
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer st_YOUR_API_KEY"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf # or http/json
export OTEL_EXPORTER_OTLP_COMPRESSION=gzip # optional but recommended
SDKs append /v1/metrics, /v1/traces, and /v1/logs automatically.
Spec compliance
| Feature | Status | Notes |
|---|---|---|
| application/json | ✓ | All three signals |
| application/x-protobuf | ✓ | Pure-Elixir wire-format decoder, zero deps, depth + length capped |
| OTLP/gRPC on :4317 | ✓ | Real gRPC server, opt-in via FUNNEL_GRPC_ENABLED=1 |
| Content-Encoding: gzip | ✓ | Transparent for both encodings; capped at 50 MB inflated to prevent zip-bomb DoS |
| PartialSuccess responses | ✓ | Returns rejectedDataPoints / rejectedSpans / rejectedLogRecords |
| Status codes | ✓ | 200 / 400 / 401 / 403 / 413 / 415 / 429 / 503 |
| Retry-After | ✓ | Sent on 429 (burst), 429 (daily cap), 503 (shedding) |
| Max body size | 5 MB | Pre-decompression; configurable via :funnel, :otlp_max_body_bytes |
| Instrumentation Scope | ✓ | Surfaced as otel.library.name/version attributes |
| Span events & links | ✓ | Normalized into JSON arrays |
| Trace + span ID validation | ✓ | Rejects non-hex or zero-only IDs; reflects in PartialSuccess |
| Histogram exponential buckets | partial | Field decoded; aggregation TODO |
| Exemplars | ✓ | Captured per data point |
| Sharded pipeline | ✓ | One GenServer per pillar; a slow metrics flush can't stall traces |
| Graceful shutdown | ✓ | 30-second drain on SIGTERM; buffered items flush to Postgres before exit |
| Auth cache | 60s ETS | bcrypt verify runs at most once per minute per key (~10× lower request latency) |
Backpressure
Each pillar has its own GenServer with a bounded buffer.
When a buffer crosses the 10k high-water mark, the server returns
503 Service Unavailable with a Retry-After header.
Recovery uses hysteresis: shedding clears once the buffer
drops below 5k. Buffer state is exposed via
Funnel.Ingestion.Pipeline.stats/0.
# Server is shedding — example response
HTTP/1.1 503 Service Unavailable
Retry-After: 12
Content-Type: application/json
{"error":"service overloaded","retry_after_ms":12000}
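The hysteresis described above means the shedding flag turns on at one threshold and off at a lower one, so the server does not flap when the buffer hovers near 10k. A Python sketch of that logic follows. The class name `SheddingGate` is hypothetical, and treating "crosses" as "reaches or exceeds" is an assumption.

```python
class SheddingGate:
    """Hysteresis between a high-water and a low-water mark (sketch)."""

    def __init__(self, high_water=10_000, low_water=5_000):
        self.high_water = high_water
        self.low_water = low_water
        self.shedding = False

    def update(self, buffer_size):
        """Record the current buffer size and return True while shedding."""
        if not self.shedding and buffer_size >= self.high_water:
            self.shedding = True
        elif self.shedding and buffer_size < self.low_water:
            self.shedding = False
        return self.shedding


gate = SheddingGate()
states = [gate.update(n) for n in (9_999, 10_000, 7_000, 5_000, 4_999, 9_000)]
```

In this run the gate starts shedding at 10,000, keeps shedding at 7,000 and 5,000, and only clears at 4,999. A later reading of 9,000 does not re-trigger it.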
Verified throughput
Sustained 30,900 events / sec
for 30 seconds with full SDS regex redaction, Software Catalog
auto-tagging, partition-aware inserts, and PartialSuccess
validation — zero drops, zero 5xx. Single-node laptop, single
Postgres instance. Run your own benchmark with the included
mix funnel.load task:
# 500 req/s × 100 events/req = 50k events/sec target
mix funnel.load \
--rps 500 --duration 30 --batch 100 --senders 16 \
--endpoint http://localhost:4000/v1/metrics \
--token st_xxx --gzip
OTLP / gRPC #
Real gRPC server on the OTLP-standard port 4317.
Implements the three canonical services:
- opentelemetry.proto.collector.metrics.v1.MetricsService/Export
- opentelemetry.proto.collector.trace.v1.TraceService/Export
- opentelemetry.proto.collector.logs.v1.LogsService/Export
Enable it
Boot-gated so dev installs don't pay the listener overhead. Set
FUNNEL_GRPC_ENABLED=1
before starting:
export FUNNEL_GRPC_ENABLED=1
export FUNNEL_GRPC_PORT=4317 # optional, this is the default
mix phx.server
SDK configuration
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer st_YOUR_API_KEY"
gRPC status mapping
| HTTP equivalent | gRPC status | Reason |
|---|---|---|
| 200 | OK | Accepted |
| 401 | UNAUTHENTICATED | Missing or invalid authorization metadata |
| 429 | RESOURCE_EXHAUSTED | Burst rate limit or daily quota |
| 503 | UNAVAILABLE | Pipeline shedding load |
| 500 | INTERNAL | Server-side decode failure |
The gRPC transport uses the same decoder (Funnel.Ingestion.OtlpProto),
the same Pipeline (Funnel.Ingestion.Pipeline.try_ingest/3),
and the same quota machinery as the HTTP transport. The only
differences are framing (HTTP/2 trailers) and the protobuf
message structs — Funnel.Otlp.Proto.{Metrics,Trace,Logs}Service
treat nested OTLP messages as opaque bytes and dispatch to the
wire-format decoder, so no protoc tooling or codegen is required.
OTLP / HTTP + Protobuf #
POST your ExportMetricsServiceRequest /
ExportTraceServiceRequest /
ExportLogsServiceRequest
protobuf bytes to the regular endpoint — no separate URL, no
separate scope:
curl -X POST http://localhost:4000/v1/metrics \
-H "authorization: Bearer st_YOUR_API_KEY" \
-H "content-type: application/x-protobuf" \
--data-binary @export-metrics-request.pb
Decoded by a pure-Elixir protobuf wire-format parser
(Funnel.Ingestion.OtlpProto). No
protoc, no codegen, no extra deps — the decoder
targets the OTLP messages directly. Hardened with
@max_depth=32 recursion limit and
@max_field_bytes=16MB length-prefix cap to prevent
OOM via hostile payloads. Combined with gzip, this is the fastest
transport Funnel offers.
Quotas & tiers #
Every project belongs to a plan tier with two layers of limits: a daily budget (bytes + events; resets at 00:00 UTC) and a burst rate (token-bucket events/sec; protects against runaway senders consuming the daily budget in a few seconds).
Tier defaults
| Tier | Ingest/day | Events/day | Burst rate | Burst cap | Retention | Workers |
|---|---|---|---|---|---|---|
| Developer | 100 MB | 100k | 50 /s | 500 | 7d | 5 |
| Team | 5 GB | 5M | 1,000 /s | 10,000 | 30d | 50 |
| Business | 50 GB | 50M | 10,000 /s | 100,000 | 90d | 500 |
| Enterprise | 1 TB | 1B | 100k /s | 1M | 365d | 5,000 |
Check order on every request
- Burst check: token-bucket scoped to the project. Sub-microsecond ETS lookup. If there are insufficient tokens for the batch: 429 with Retry-After; the daily budget is not charged.
- Daily quota check: bytes + events vs the tier cap. If over: 429 with Retry-After: <seconds-to-midnight-UTC> and a JSON body describing what was exceeded.
- Pipeline acceptance: buffer headroom check. If shedding: 503 with Retry-After.
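The three checks can be sketched in Python as a token bucket followed by a daily-budget comparison and a shedding flag. Everything here is illustrative: `TokenBucket` and `admit` are hypothetical names, the defaults mirror the Developer tier, and whether tokens are returned when a later check rejects the batch is not specified above (this sketch keeps them spent).

```python
class TokenBucket:
    """Refills at `rate_per_sec` up to `capacity` tokens."""

    def __init__(self, rate_per_sec, capacity, now=0.0):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated_at = now

    def try_take(self, n, now):
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if n > self.tokens:
            return False
        self.tokens -= n
        return True


def admit(bucket, usage, limits, batch_events, batch_bytes, shedding, now):
    """Return the HTTP status the three checks would produce, in order."""
    if not bucket.try_take(batch_events, now):
        return 429  # burst limit; the daily budget is not charged
    over_events = usage["events"] + batch_events > limits["events"]
    over_bytes = usage["bytes"] + batch_bytes > limits["bytes"]
    if over_events or over_bytes:
        return 429  # daily quota exceeded
    if shedding:
        return 503  # pipeline buffer has no headroom
    usage["events"] += batch_events
    usage["bytes"] += batch_bytes
    return 200
```

A batch of 500 events drains a fresh Developer-tier bucket (50/s, cap 500). One second later only 50 tokens have refilled, so a 100-event batch is rejected with 429 and the daily counters stay unchanged.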
Response shapes
Burst rate-limited (429)
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json
{"error":"burst rate limit","retry_after_ms":1000}
Daily quota exceeded (429)
HTTP/1.1 429 Too Many Requests
Retry-After: 14400
Content-Type: application/json
{
"error":"quota exceeded",
"kind":"bytes",
"used":104857600,
"limit":104857600,
"resets_at_utc":"2026-05-18T00:00:00Z"
}
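The Retry-After value on a daily-quota 429 is the number of seconds until the next 00:00 UTC. As a worked check in Python: a request rejected at 20:00 UTC on 2026-05-17 is four hours from the reset, which gives the 14400 shown above. The function name is hypothetical, and rounding up to a whole second is an assumption.

```python
import math
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now):
    """Seconds from `now` (timezone-aware) to the next 00:00 UTC, rounded up."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return math.ceil((next_midnight - now_utc).total_seconds())


retry_after = seconds_until_utc_midnight(
    datetime(2026, 5, 17, 20, 0, 0, tzinfo=timezone.utc)
)
```

A client can use the same calculation to decide whether to buffer locally until the reset or drop low-priority telemetry.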
Per-project overrides
Operator dial for paying customers that need more headroom than the
tier default. Set any of these projects.*_override
columns to NULL to inherit the tier value:
- ingest_bytes_per_day_override: override the daily byte cap
- ingest_events_per_day_override: override the daily event cap
- retention_days_override: extend retention without upgrading the tier
- max_workers_override: extra Edge Worker headroom
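The NULL-inherits-tier rule amounts to a simple fallback. In this Python sketch the `*_override` names come from the columns above, while `effective_limit`, the dictionary layout, and the tier key names are hypothetical.

```python
# Developer tier defaults from the table above (100 MB taken as 104857600 bytes).
DEVELOPER_TIER = {
    "ingest_bytes_per_day": 104_857_600,
    "ingest_events_per_day": 100_000,
    "retention_days": 7,
    "max_workers": 5,
}


def effective_limit(project, tier_defaults, name):
    """Use the project's override when set; None (NULL) inherits the tier value."""
    override = project.get(name + "_override")
    return tier_defaults[name] if override is None else override


project = {"retention_days_override": 30, "max_workers_override": None}
retention = effective_limit(project, DEVELOPER_TIER, "retention_days")
workers = effective_limit(project, DEVELOPER_TIER, "max_workers")
```

Here the project keeps the Developer worker limit but gets 30 days of retention.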
Live dashboard
The Quotas & limits page (sidebar → Security group) shows live usage, rejection counts, and pipeline buffer headroom. Refreshes every 3 seconds.
Alerts #
Alert rules are evaluated by an Oban cron worker once a minute. When a rule
breaches, it transitions to firing, writes an alert_events
row, and enqueues one webhook delivery per configured destination.
When the rule recovers, it transitions back to ok
and resolves any open events.
Rule kinds
Each rule has a kind discriminator and a free-form config jsonb.
The same shape is accepted by the dashboard form and the underlying schema.
metric_threshold
Aggregates a metric over a window and compares it against a threshold. Use for "p95 latency > 500ms" or "error counter > 10 / 5min".
| key | type | notes |
|---|---|---|
| metric | string | metric name |
| agg | enum | avg · sum · max · min · p50 · p95 · p99 · rate · count |
| operator | enum | > · >= · < · <= · == |
| threshold | number | |
| window_seconds | int | default 300 |
| filters | object | attribute key/value to scope the query |
{
"config": {
"agg": "p95",
"filters": {
"service": "api"
},
"metric": "http.server.duration_ms",
"operator": ">",
"threshold": 500,
"window_seconds": 300
},
"kind": "metric_threshold",
"name": "API p95 latency",
"severity": "warning"
}
error_rate
Computes the number of spans where status != 'ok'
divided by the total spans for the named service in the window.
Threshold is a fraction (e.g. 0.05 = 5%).
{
"config": {
"service": "api",
"threshold": 0.05,
"window_seconds": 300
},
"kind": "error_rate",
"name": "API error rate",
"severity": "critical"
}
log_pattern
Fires when at least threshold log entries match the pattern
(full-text search) at severity severity_min or higher within the window.
{
"config": {
"pattern": "timeout connecting to upstream",
"severity_min": "error",
"threshold": 10,
"window_seconds": 300
},
"kind": "log_pattern",
"name": "Repeated DB timeout",
"severity": "critical"
}
anomaly
Funnel continuously updates a per-metric baseline (EWMA
mean + variance, α = 0.1). When the latest aggregate satisfies
|x − μ| > k · σ, the rule fires. k
defaults to 3, roughly a 1-in-370 false-positive rate under
Gaussian assumptions.
Baselines need ≥10 samples to become live, so anomaly alerts don't fire on cold start.
{
"config": {
"k": 3,
"metric": "http.server.duration_ms",
"window_seconds": 300
},
"kind": "anomaly",
"name": "Login latency anomaly",
"severity": "warning"
}
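A Python sketch of the anomaly test, assuming one common incremental formulation of the EWMA mean and variance (the exact update Funnel uses is not given here). The class name `EwmaBaseline` and its methods are hypothetical; α, k, and the 10-sample warm-up come from the description above.

```python
import math


class EwmaBaseline:
    """EWMA mean/variance baseline with a k-sigma test (sketch)."""

    def __init__(self, alpha=0.1, k=3.0, min_samples=10):
        self.alpha = alpha
        self.k = k
        self.min_samples = min_samples
        self.mean = 0.0
        self.var = 0.0
        self.samples = 0

    def is_anomalous(self, x):
        """True when the baseline is live and |x - mean| > k * sigma."""
        live = self.samples >= self.min_samples
        sigma = math.sqrt(self.var)
        return live and sigma > 0 and abs(x - self.mean) > self.k * sigma

    def update(self, x):
        """Fold x into the baseline."""
        if self.samples == 0:
            self.mean = float(x)
        else:
            delta = x - self.mean
            self.mean += self.alpha * delta
            self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.samples += 1

    def observe(self, x):
        """Test x against the current baseline, then fold it in."""
        fired = self.is_anomalous(x)
        self.update(x)
        return fired
```

Feeding twelve readings that alternate between 100 and 102 never fires; a following reading of 150 does, because it sits far outside three standard deviations of the learned baseline.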
synthetic_failure
Fires when the synthetic check identified by check_id
has failed at least consecutive_failures
times in a row (default 3). Latency / body / status
assertions are configured on the check itself.
{
"config": {
"check_id": 42,
"consecutive_failures": 3
},
"kind": "synthetic_failure",
"name": "Public homepage down",
"severity": "critical"
}
State machine
Every minute the evaluator transitions a rule between three states.
on firing: insert alert_events row · enqueue WebhookDeliveryWorker per destination
on resolve: stamp resolved_at on open events
Synthetics #
Synthetic checks are scheduled HTTP probes that run from your
Funnel deployment. Each run writes a synthetic.latency_ms
metric and a synthetic.success counter through the normal
ingestion pipeline, plus a log entry,
so synthetic data shows up in the Metrics Explorer, Logs, and
Alerts pages just like any other telemetry.
Check configuration
| field | type | notes |
|---|---|---|
| name | string | human-readable, becomes the check attribute |
| method | enum | GET · POST · PUT · DELETE · HEAD |
| url | string | https:// or http:// |
| headers | object | optional request headers |
| body | string | raw request body |
| interval_seconds | int | how often to run, min 30 |
| timeout_ms | int | request & connect timeout |
| expected_status | int | success if response code == this |
| assert_body_regex | string | optional — body must match |
| max_latency_ms | int | optional — fail if exceeded |
| enabled | bool | pause without deleting |
{
"assert_body_regex": "\"status\":\\s*\"ok\"",
"enabled": true,
"expected_status": 200,
"headers": {
"x-api-key": "..."
},
"interval_seconds": 60,
"max_latency_ms": 1500,
"method": "GET",
"name": "Homepage health",
"timeout_ms": 10000,
"url": "https://www.example.com/health"
}
Pass/fail logic
A check is considered successful when all the following hold:
- HTTP status equals expected_status
- Response received before timeout_ms elapsed
- max_latency_ms (if set) is not exceeded
- assert_body_regex (if set) matches the response body
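The four conditions can be written as a single predicate. This Python sketch uses a hypothetical `check_passed` function and assumes the body assertion is an unanchored search. Python's `re` module stands in for the probe's regex engine, which may differ.

```python
import re


def check_passed(check, status, body, latency_ms, timed_out):
    """Evaluate one probe result against the check's configuration (sketch)."""
    if timed_out:
        return False  # no response before timeout_ms elapsed
    if status != check["expected_status"]:
        return False
    max_latency = check.get("max_latency_ms")
    if max_latency is not None and latency_ms > max_latency:
        return False
    pattern = check.get("assert_body_regex")
    if pattern is not None and re.search(pattern, body) is None:
        return False
    return True


check = {
    "expected_status": 200,
    "timeout_ms": 10000,
    "max_latency_ms": 1500,
    "assert_body_regex": r'"status":\s*"ok"',
}
```

With the example configuration, a 200 response containing `{"status": "ok"}` in 120 ms passes, while the same response at 1501 ms fails on latency alone.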
What gets emitted per run
- synthetic.latency_ms: gauge metric, attributes check and success=true|false
- synthetic.success: counter, value 1 on success / 0 on failure
- log entry: service funnel.synthetics, severity info / error
- check row updated: last_run_at, last_status, last_latency_ms, consecutive_failures
Pairing with alerts
The cleanest way to alert on a synthetic check is the
synthetic_failure rule kind: it tracks consecutive_failures
on the check row directly, avoiding flap. For ratio-based alerting
across many checks you can also evaluate the emitted
synthetic.success counter via a metric_threshold rule with
agg = "avg".
Sensitive Data Scanner (SDS) #
The Sensitive Data Scanner inspects every piece of telemetry you ingest
— logs, traces, RUM events, session replay strings, and metric
attributes — and either alerts, redacts,
hashes, or drops matches before they
land in storage. SDS sits inline in the ingestion pipeline shard, so
there is no batch job to wait for: a credit card in a log line
becomes [REDACTED:credit_card] in the same flush that
would have persisted it.
Three additional production-grade safeguards run on every scan:
- ReDoS protection: user-supplied regexes are validated by a static heuristic + a 100 ms stress test at config time, then run with a per-call 50 ms wall-clock ceiling. The classic (a+)+$ class of pathological patterns is rejected at the form.
- Per-project CPU budget: token-bucket limit on scan-microseconds per second. A noisy project can't starve scanning for other projects on the same shard.
- Cluster-wide kill switch: operators can pause SDS for a project instantly via the sds_enabled column or the LiveView toggle. Phoenix.PubSub broadcasts the flip; every node updates within milliseconds.
All operations on SDS — rule CRUD, kill-switch flips, bulk category
toggles — are mirrored to
/v1/findings
as sds.rule.* events so the audit trail is unified with the
rest of the security pipeline.
Pattern catalogue (170 built-ins) #
The full curated catalogue, grouped by category. Each row shows
the rule's stable builtin_id (used when seeding /
referencing rules), its default action, and severity. Patterns
tagged noisy are seeded disabled by default; operators opt in deliberately.
Live-loaded from Funnel.Sds.Library.all/0 at request
time, so what you see here is exactly what runs in production,
not marketing copy.
credentials (136 patterns)
| builtin_id | Name | Action | Severity | Notes |
|---|---|---|---|---|
| adyen_api_key | Adyen API key | redact | critical | |
| airtable_api_key | Airtable API key / PAT | redact | critical | |
| alchemy_api_key | Alchemy API key | redact | high | |
| anthropic_api_key | Anthropic API key | redact | critical | |
| api_key_in_url | API key in URL query parameter | redact | high | |
| asana_pat | Asana personal access token | redact | critical | |
| atlassian_api_token | Atlassian (Jira/Confluence) API token | redact | critical | |
| auth0_api_token | Auth0 management API token | redact | critical | noisy |
| aws_access_key | AWS access key ID | redact | critical | |
| aws_mws_auth_token | AWS MWS auth token | redact | critical | |
| aws_secret_key | AWS secret access key | redact | critical | ✓ validated |
| aws_session_token | AWS session token | redact | critical | |
| azure_sas_token | Azure SAS token | redact | high | noisy |
| azure_storage_key | Azure storage account key (88-char base64) | redact | critical | |
| azure_subscription_id | Azure subscription / tenant GUID | alert | medium | |
| basic_auth_header | HTTP Authorization Basic header | redact | high | |
| basic_auth_in_url | HTTP basic-auth credentials in URL | redact | high | |
| bearer_auth_header | HTTP Authorization Bearer header | redact | high | |
| bitbucket_oauth_secret | Bitbucket OAuth client secret | redact | critical | |
| bitcoin_wif_key | Bitcoin private key (WIF format) | redact | critical | noisy |
| bugsnag_api_key | Bugsnag API key | redact | high | |
| buildkite_agent_token | Buildkite agent token | redact | critical | |
| calendly_token | Calendly personal access token | alert | medium | noisy |
| circleci_token | CircleCI personal API token | redact | critical | |
| clickup_api_token | ClickUp personal API token | redact | critical | |
| cloudflare_api_token | Cloudflare API token | redact | critical | |
| cloudflare_global_api_key | Cloudflare global API key | redact | critical | |
| codecov_upload_token | Codecov upload token | redact | high | |
| cohere_api_key | Cohere API key | redact | critical | |
| crates_io_token | crates.io API token | redact | critical | noisy |
| databricks_pat | Databricks personal access token | redact | critical | |
| datadog_api_key | Datadog API key | redact | critical | |
| datadog_app_key | Datadog application key | redact | critical | |
| deepl_api_key | DeepL API key | redact | critical | |
| discord_bot_token | Discord bot token | redact | critical | |
| discord_webhook_url | Discord webhook URL | redact | high | |
| do_api_token | DigitalOcean personal-access token | redact | critical | |
| docker_registry_auth | Docker registry auth config | redact | high | |
| dockerhub_pat | Docker Hub personal access token | redact | critical | |
| doppler_token | Doppler token | redact | critical | |
| elevenlabs_api_key | ElevenLabs API key | redact | high | |
| email_password | Email + password combo (leak format) | redact | critical | |
| ethereum_private_key | Ethereum private key (32 bytes hex) | redact | critical | |
| etherscan_api_key | Etherscan API key | redact | high | |
| fastly_api_token | Fastly API token | redact | critical | |
| firebase_fcm_server_key | Firebase Cloud Messaging legacy server key | redact | critical | |
| fly_io_token | Fly.io access token | redact | critical | |
| gcp_api_key | GCP API key | redact | critical | |
| gcp_oauth_client_id | GCP OAuth 2.0 client ID | alert | medium | |
| gcp_service_account | GCP service-account key (JSON private_key field) | redact | critical | |
| generic_secret | Generic high-entropy secret | alert | high | ✓ validated |
| github_app_token | GitHub app installation token (ghs_…) | redact | critical | |
| github_pat | GitHub personal access token | redact | critical | |
| github_refresh_token | GitHub refresh token (ghr_…) | redact | critical | |
| gitlab_pat | GitLab personal access token | redact | critical | |
| google_api_key | Google API key | redact | critical | |
| groq_api_key | Groq API key | redact | critical | |
| heroku_api_key | Heroku API key | redact | critical | |
| honeycomb_api_key | Honeycomb API key | redact | critical | |
| huggingface_token | HuggingFace access token | redact | critical | |
| infura_project_id | Infura project ID | redact | high | |
| jfrog_api_key | JFrog Artifactory API key | redact | critical | |
| jwt | JSON Web Token | redact | high | ✓ validated |
| k8s_service_account_token | Kubernetes service-account JWT (kube-apiserver) | redact | critical | |
| launchdarkly_access_token | LaunchDarkly access token | redact | critical | |
| launchdarkly_mobile_key | LaunchDarkly mobile key | redact | critical | |
| launchdarkly_sdk_key | LaunchDarkly SDK key | redact | critical | |
| linear_api_key | Linear API key | redact | critical | |
| linode_pat | Linode personal access token | redact | critical | |
| loggly_customer_token | Loggly customer token | redact | high | |
| mailchimp_api_key | Mailchimp API key | redact | critical | |
| mailgun_api_key | Mailgun API key | redact | critical | |
| mailjet_api_key | Mailjet API key | redact | high | |
| mistral_api_key | Mistral API key | redact | critical | |
| mongodb_uri_with_password | MongoDB connection URI with password | redact | critical | |
| mssql_connection_string | Microsoft SQL Server connection string | redact | critical | |
| mysql_uri_with_password | MySQL connection URI with password | redact | critical | |
| netlify_token | Netlify personal access token | redact | critical | |
| new_relic_insert_key | New Relic insights insert key | redact | critical | |
| new_relic_license_key | New Relic license key | redact | critical | |
| new_relic_user_key | New Relic user API key | redact | critical | |
| ngrok_auth_token | ngrok auth token | redact | high | |
| notion_integration_token | Notion integration token | redact | critical | |
| notion_ntn_token | Notion ntn token | redact | critical | |
| npm_token | NPM access token | redact | critical | |
| okta_api_token | Okta API token | redact | critical | noisy |
| onepassword_service_account_token | 1Password service-account token | redact | critical | |
| onesignal_api_key | OneSignal REST API key | redact | high | |
| openai_api_key | OpenAI API key | redact | critical | |
| opsgenie_api_key | Opsgenie API key | redact | high | |
| pagerduty_integration_key | PagerDuty integration key | redact | high | |
| paypal_braintree_token | PayPal / Braintree access token | redact | critical | |
| pinecone_api_key | Pinecone API key | redact | critical | |
| plaid_client_id | Plaid client ID + secret | redact | critical | |
| postgres_uri_with_password | PostgreSQL connection URI with password | redact | critical | |
| postman_api_key | Postman API key | redact | critical | |
| postmark_token | Postmark server API token | redact | high | |
| private_key_pem | Private key (PEM block) | redact | critical | |
| pusher_secret | Pusher channel secret | redact | high | |
| pypi_token | PyPI API token | redact | critical | |
| razorpay_key_id | Razorpay key ID | alert | medium | |
| redis_uri_with_password | Redis connection URI with password | redact | critical | |
| render_api_key | Render API key | redact | critical | |
| replicate_api_token | Replicate API token | redact | critical | |
| rollbar_token | Rollbar access token | redact | high | |
| rubygems_api_key | RubyGems API key | redact | critical | |
| segment_write_key | Segment write key | redact | high | |
| sendgrid_api_key | SendGrid API key | redact | critical | |
| sentry_auth_token | Sentry auth token | redact | critical | |
| sentry_dsn | Sentry DSN | redact | high | |
| slack_legacy_token | Slack legacy API token | redact | critical | |
| slack_token | Slack token | redact | critical | |
| slack_webhook_url | Slack incoming webhook URL | redact | critical | |
| snowflake_password | Snowflake account password | redact | critical | |
| sonarqube_token | SonarQube auth token | redact | critical | |
| sparkpost_api_key | SparkPost API key | redact | high | |
| splunk_hec_token | Splunk HEC token | redact | critical | |
| square_access_token | Square access token | redact | critical | |
| square_oauth_secret | Square OAuth secret | redact | critical | |
| ssh_authorized_key | SSH public key (authorized_keys line) | alert | medium | |
| stripe_publishable | Stripe publishable key | alert | medium | |
| stripe_restricted_key | Stripe restricted key | redact | critical | |
| stripe_secret | Stripe secret key | redact | critical | |
| stripe_webhook_secret | Stripe webhook signing secret | redact | critical | |
| supabase_service_role_key | Supabase service-role key (anon/service JWT) | redact | critical | noisy |
| tailscale_auth_key | Tailscale auth key | redact | critical | |
| teams_webhook_url | Microsoft Teams webhook URL | redact | high | |
| telegram_bot_token | Telegram bot token | redact | critical | |
| travis_ci_token | Travis CI token | redact | high | |
| twilio_account_sid | Twilio account SID (AC…) | alert | medium | |
| twilio_api_key | Twilio API key (SK…) | redact | critical | |
| twilio_auth_token | Twilio auth token | redact | critical | |
| vault_batch_token | HashiCorp Vault batch token | redact | critical | |
| vault_service_token | HashiCorp Vault service token | redact | critical | |
| vultr_api_key | Vultr API key | alert | medium | noisy |
| zoom_jwt | Zoom JWT API key | redact | high | |
financial (6 patterns)
| builtin_id | Name | Action | Severity | Notes |
|---|---|---|---|---|
| aba_routing | US ABA routing number | alert | high | ✓ validated noisy |
| bic_swift_code | BIC / SWIFT code | alert | medium | |
| bvn_nigeria | Nigeria Bank Verification Number | alert | high | noisy |
| credit_card | Credit Card Number | redact | critical | ✓ validated |
| iban | IBAN bank account | alert | high | ✓ validated |
| isin_code | ISIN security identifier | alert | low | noisy |
pii (22 patterns)
| builtin_id | Name | Action | Severity | Notes |
|---|---|---|---|---|
| australia_tfn | Australia Tax File Number | alert | high | noisy |
| date_of_birth | Date of birth (ISO/EU format) | alert | low | noisy |
| dea_number | US DEA registration number | alert | high | noisy |
| ein_us | US Employer Identification Number (EIN) | redact | high | |
| | Email address | alert | medium | |
| ethereum_address | Ethereum address | alert | low | noisy |
| icd10_code | ICD-10 medical diagnosis code | alert | high | noisy |
| india_aadhaar | India Aadhaar number | redact | critical | noisy |
| india_pan | India PAN (Permanent Account Number) | redact | high | |
| ipv6_address | IPv6 address (full form) | alert | low | |
| itin_us | US Individual Taxpayer Identification Number (ITIN) | redact | critical | |
| nigerian_phone | Nigerian phone number | alert | medium | |
| nin_nigeria | Nigeria National Identification Number | alert | high | noisy |
| npi_us | US National Provider Identifier (NPI) | alert | medium | |
| sin_canada | Canadian Social Insurance Number | alert | high | noisy |
| south_africa_id | South Africa national ID | alert | high | noisy |
| ssn_us | US Social Security Number | redact | critical | |
| uk_nin | UK National Insurance Number | redact | critical | |
| uk_phone | UK mobile number | alert | medium | |
| us_drivers_license_state | US driver's license (state-prefixed) | alert | high | |
| us_phone | US phone number | alert | medium | |
| vehicle_vin | Vehicle identification number (VIN) | alert | medium | noisy |
infra (6 patterns)
| builtin_id | Name | Action | Severity | Notes |
|---|---|---|---|---|
| aws_imds_url | AWS IMDS endpoint reference | alert | medium | |
| firebase_database_url | Firebase Realtime Database URL | alert | low | |
| ipv4_link_local | IPv4 link-local (169.254.0.0/16) | alert | low | |
| ipv4_loopback | IPv4 loopback address (127.0.0.0/8) | alert | info | noisy |
| ipv4_private | Internal IPv4 (RFC1918) | alert | low | |
| mac_address | MAC address | alert | low | noisy |
Plus two NER kinds
(person_name, address) wired through
the pluggable adapter for
free-form PII the regex layer doesn't see.
SDS rules & kinds #
A rule has a kind that decides how the pattern is
executed and an action that decides what happens
on match.
Rule kinds
| kind | pattern field holds… | Use when |
|---|---|---|
| builtin | empty (the rule is looked up by builtin_id) | You want one of the 170 curated patterns: AWS keys, Stripe secrets, OpenAI/Anthropic, MongoDB URIs, ITIN/Aadhaar, etc. |
| regex | your custom PCRE regex | Internal-format secrets that aren't in the catalogue (account IDs, internal tokens). |
| ner | person_name · address · any | Free-form PII that regex misses: names, postal addresses. |
Actions
| action | Effect on the matched substring |
|---|---|
| alert | Recorded as a finding; text is left untouched. |
| redact | Replaced with [REDACTED:rule_id] before persistence. |
| hash | Replaced with [HASH:rule_id:<12-hex>]; the SHA-256 prefix lets you count distinct secrets without storing them. |
| drop | The entire event is removed from the batch. Finding still recorded. |
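To make the in-place actions concrete, here is a minimal Python sketch of how a matched substring could be rewritten. It is an illustration only: the function name apply_action is hypothetical, and it assumes the 12 hex characters are the leading characters of an unsalted SHA-256 digest of the match, which the table implies but does not spell out. The drop action is left out because it removes the whole event rather than rewriting text.

```python
import hashlib


def apply_action(text: str, start: int, end: int, action: str, rule_id: str) -> str:
    """Rewrite text[start:end] according to an SDS action (illustrative sketch)."""
    matched = text[start:end]
    if action == "alert":
        # The finding is recorded elsewhere; the text itself is left untouched.
        return text
    if action == "redact":
        replacement = f"[REDACTED:{rule_id}]"
    elif action == "hash":
        # Assumption: prefix of an unsalted SHA-256 over the UTF-8 bytes.
        digest = hashlib.sha256(matched.encode("utf-8")).hexdigest()[:12]
        replacement = f"[HASH:{rule_id}:{digest}]"
    else:
        raise ValueError(f"unsupported action: {action}")
    return text[:start] + replacement + text[end:]
```

Because the same input always produces the same digest prefix, two events carrying the same secret end up with the same [HASH:…] token, which is what makes distinct-value counting possible.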
Validators (built-in only)
A built-in pattern can attach a checksum validator. Matches that pass the regex but fail the validator are silently filtered out — they don't fire a finding AND they aren't redacted. Eliminates the false-positive class where a random 16-digit number looks like a credit card.
- Luhn — credit cards
- ISO 7064 mod-97 — IBAN
- ABA mod-10 — US routing numbers
- Shannon entropy ≥ 3.0 bits/char — generic secrets, AWS secret keys
- JWT shape — three base64-url segments whose first two decode to JSON
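Two of the validators above are easy to sketch in Python. The snippet below shows a standard Luhn check and a per-character Shannon entropy calculation; the function names are illustrative and this is not Funnel's own code, only the well-known algorithms the list refers to.

```python
import math
from collections import Counter


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walking from the right, double every second digit; subtract 9 if > 9.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def shannon_entropy(s: str) -> float:
    """Shannon entropy of `s` in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_like_secret(candidate: str, threshold: float = 3.0) -> bool:
    # Mirrors the documented ">= 3.0 bits/char" cut-off.
    return shannon_entropy(candidate) >= threshold
```

A regex hit such as a 16-digit order number would fail luhn_valid and therefore never become a finding, which is the false-positive class the validators are meant to remove.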
Scopes
Every rule has four scope booleans —
scope_logs, scope_traces,
scope_metrics, scope_rum — controlling which
pillars the scanner applies the rule to. At least one must be true. Per-rule
scope toggles are also exposed on the SDS LiveView so you can flip them
without leaving the dashboard.
Example: create a custom regex rule
# Create a custom regex rule that redacts internal account IDs.
# Pattern is validated by PatternSafety on save — evil regexes
# (`(a+)+$`, ambiguous alternation under quantifier, …) are
# rejected before reaching the hot path.
curl -X POST $FUNNEL/api/graphql \
-H "Authorization: Bearer st_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"query": "mutation { ... }"
}'
# In Elixir code (e.g. via mix task or seed file):
Funnel.Sds.create_rule(project, %{
"name" => "Internal account ID",
"kind" => "regex",
"pattern" => "acct-[a-z0-9]{12}",
"action" => "redact",
"severity" => "high",
"scope_logs" => true,
"scope_traces"=> true,
"scope_rum" => true
}, actor: "[email protected]")
Sample masking
Findings store a matched_sample that's aggressively
masked for critical/high severity rules
(***6f (40 chars)) and partially revealed for lower
severity (bil...com). The plaintext never leaves the
shard — only the mask survives into the database.
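The two sample formats can be reproduced with a few lines of Python. This sketch is inferred purely from the examples shown here (last two characters plus length for critical/high, first and last three characters otherwise); the function name mask_sample and the exact cut-offs are assumptions, not the shipped masking code.

```python
def mask_sample(value: str, severity: str) -> str:
    """Mask a matched value for storage, tuned by severity (illustrative)."""
    if severity in ("critical", "high"):
        # Aggressive: keep only the last two characters and the length.
        return f"***{value[-2:]} ({len(value)} chars)"
    # Partial reveal: first three and last three characters.
    return f"{value[:3]}...{value[-3:]}"
```

A real implementation would also need a rule for very short values, where showing three leading and three trailing characters would reveal the whole match.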
NER — free-form PII detection #
Regex catches structured secrets — credit cards, JWTs, API keys. It misses free-form PII: someone's name in a support log, a customer address in a checkout error. The NER (Named Entity Recognition) layer adds a second detector for those cases.
The default detector is a pure-Elixir heuristic engine combining Census gazetteers, structural anchors (street suffix + state code + ZIP), and confidence scoring. No model weights to download, no inference latency. Findings only fire above a configurable threshold (default 0.5).
Confidence model — person names
| Signal | Contribution |
|---|---|
| Both tokens hit gazetteers | +0.40 |
| One token hits a gazetteer | +0.10 |
| Title prefix (Mr/Dr/Prof) | +0.30 |
| Mixed-case context (not shouty) | +0.15 |
| "from/to/by <Name>" frame | +0.05 |
| URL / email context | -0.20 |
Confidence model — addresses
| Signal | Contribution |
|---|---|
| USPS street suffix (St, Ave, Blvd, …) | +0.45 |
| US state code within 30 chars | +0.30 |
| 5-digit ZIP within 50 chars | +0.20 |
| Comma-separated city before state | +0.10 |
| Code-block context (= { nearby) | -0.15 |
Worked example
Input: Approved by Dr. Sarah Johnson at 1234 Main Street, Springfield, IL 62701
- person_name: Sarah Johnson, conf=0.85 (both_gazetteers · title_prefix)
- address: 1234 Main Street, Springfield, IL 62701, conf=1.05 (street_suffix · state_code · zip_code · city_token)
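The confidence tables describe an additive model, which can be sketched in Python as a weighted sum over the signals that fired. The signal keys both_gazetteers, title_prefix, street_suffix, state_code, zip_code and city_token come from the worked example; the remaining keys are hypothetical labels for the other table rows, and the helper names are illustrative.

```python
PERSON_WEIGHTS = {
    "both_gazetteers": 0.40,
    "one_gazetteer": 0.10,          # hypothetical label
    "title_prefix": 0.30,
    "mixed_case": 0.15,             # hypothetical label
    "name_frame": 0.05,             # hypothetical label
    "url_or_email_context": -0.20,  # hypothetical label
}

ADDRESS_WEIGHTS = {
    "street_suffix": 0.45,
    "state_code": 0.30,
    "zip_code": 0.20,
    "city_token": 0.10,
    "code_block_context": -0.15,    # hypothetical label
}


def confidence(signals, weights):
    """Sum the contributions of the signals that fired, rounded to 2 places."""
    return round(sum(weights[s] for s in signals), 2)


def fires(score, threshold=0.5):
    """A finding is emitted only at or above the configured threshold."""
    return score >= threshold
```

With the four address signals from the worked example the sum is 1.05. For the person name, the two listed reasons add up to 0.70; the reported 0.85 is consistent with the mixed-case context signal also contributing, though that is an inference from the table rather than something the example states.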
Creating a NER rule
Funnel.Sds.create_rule(project, %{
"name" => "Customer PII (names + addresses)",
"kind" => "ner",
# entity filter: "person_name" | "address" | "any"
"pattern" => "any",
"action" => "redact",
"severity" => "high",
"scope_logs" => true,
"scope_rum" => true
}, actor: "[email protected]")
# Hot-path result on: "Approved by Dr. Sarah Johnson"
# → "Approved by Dr. [REDACTED:rule_42:person_name]"
The pattern field on a NER rule carries the entity kind:
person_name, address, or any to
match both. Action and severity work the same as on regex rules.
Pluggable ML backend
NER detection goes through Funnel.Sds.NerAdapter, a behaviour with one
callback, detect/2. The default adapter is the heuristic engine. For
projects that need true ML recall (free-form addresses, non-US names,
multi-language PII), implement the behaviour against a Bumblebee NER
model and point the :sds_ner_adapter config key at your module:
defmodule MyApp.BumblebeeNer do
@behaviour Funnel.Sds.NerAdapter
@impl true
def detect(text, _opts) do
# Run Bumblebee.Text.token_classification against a BERT NER
# checkpoint and return entities in the adapter's shape.
results = Nx.Serving.batched_run(BertNer, text)
Enum.map(results.entities, fn e ->
%{
kind: map_label(e.label), # :person_name | :address | :other
text: e.phrase,
offset: e.start,
length: String.length(e.phrase),
confidence: e.score,
reasons: [:ml]
}
end)
end
defp map_label("PER"), do: :person_name
defp map_label("LOC"), do: :address
defp map_label(_), do: :other
end
# config/runtime.exs
config :funnel, :sds_ner_adapter, MyApp.BumblebeeNer
Funnel's hot path stays unchanged — every other component (rule storage, finding writer, sample masking, webhook fan-out) works identically with either adapter.
SDS findings & webhooks #
Every match is recorded as one row in sds_findings. The
row carries snapshots of the rule's name, pattern, and action
at match time, so deleting a rule later doesn't orphan history. A
stable 16-hex-char fingerprint, derived from the rule and the normalised
match, identifies the matched value, letting you count distinct
secrets without storing them.
Finding row shape
| column | notes |
|---|---|
| id | UUID. Part of the composite PK with time. |
| time | timestamptz. Range-partition key. |
| rule_id | FK to sds_rules.id. Nullable after soft-delete. |
| builtin_id | e.g. credit_card, aws_access_key; empty for custom rules. |
| rule_name | Snapshot — survives rule deletion. |
| pattern_snapshot | Snapshot of the regex source, or ner:<kind> for NER rules. |
| action_snapshot | Rule action at match time (alert/redact/hash/drop). |
| severity | critical/high/medium/low/info. |
| source_kind | logs/spans/rum/metrics/sessions. |
| source_id | The originating event id when one exists. |
| service_name | Service that produced the event, if known. |
| field_name | Field path inside the source event (e.g. message, attributes.user.email). |
| matched_sample | Masked sample, severity-tuned. |
| fingerprint | 16 hex chars. Same value → same fingerprint. |
| hits_count | N matches collapsed into one row. |
| attributes | jsonb. NER findings include ner_kind, ner_confidence, ner_reasons. |
Webhook fan-out
Critical and high-severity findings are pushed to every
webhook_destinations row configured on the project
(the same table that alerts use — one URL, two event types). The
delivery worker retries 5× with exponential backoff and HMAC-signs
the body when the destination has a secret.
{
"type": "sds.finding.created",
"delivered_at": "2026-05-17T13:43:40.417Z",
"finding": {
"id": "abc123de-...",
"time": "2026-05-17T13:43:40Z",
"project_id": 42,
"rule_id": 7,
"builtin_id": "credit_card",
"rule_name": "Built-in: Credit Card Number",
"severity": "critical",
"source_kind": "logs",
"source_id": "msg-001",
"action_taken": "redact",
"matched_sample": "***11 (19 chars)",
"service_name": "checkout",
"field_name": "message",
"hits_count": 1,
"fingerprint": "129a20e901e6e335"
}
}
The x-funnel-event-type header is set to
sds.finding.created so receivers can route SDS events
separately from alert.fired.
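On the receiving side, routing on the event-type header might look like the Python sketch below. The handler labels and the function name route_event are made up for illustration, and the sketch assumes alert deliveries carry alert.fired in the same header, as the sentence above suggests.

```python
import json


def route_event(headers: dict, raw_body: bytes):
    """Return a (kind, key) pair describing how to handle a Funnel webhook."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {k.lower(): v for k, v in headers.items()}
    event_type = normalised.get("x-funnel-event-type")
    payload = json.loads(raw_body)
    if event_type == "sds.finding.created":
        # De-duplicate on the fingerprint rather than the finding id.
        return ("sds", payload["finding"]["fingerprint"])
    if event_type == "alert.fired":
        return ("alert", payload["rule"]["id"])
    return ("unknown", None)
```

Since the delivery worker retries, a receiver should treat repeated deliveries of the same finding as idempotent; keying on the fingerprint is one way to do that.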
Retention
Funnel.Sds.RetentionJob runs at 04:15 UTC daily. For each
project it (a) drops any whole monthly partition whose range falls
fully below the max-retention across all projects, then
(b) batch-DELETEs older rows in the boundary partition using
LIMIT 50_000 chunks to avoid long-running locks. Per-tier retention
applies; see Quotas & tiers.
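The partition logic in step (a) can be pictured with a small Python sketch. It assumes partitions are half-open date ranges and that the cutoff is simply today minus the longest retention in days; plan_retention and its arguments are hypothetical names, and the chunked DELETE of step (b) is not shown.

```python
from datetime import date, timedelta


def plan_retention(partitions, today, max_retention_days):
    """Split partitions into droppable ones and the boundary partition.

    partitions: list of (start, end) dates, with `end` exclusive.
    """
    cutoff = today - timedelta(days=max_retention_days)
    # Entirely older than the cutoff: safe to drop as a whole.
    droppable = [p for p in partitions if p[1] <= cutoff]
    # Straddles the cutoff: needs row-level deletes instead.
    boundary = [p for p in partitions if p[0] < cutoff < p[1]]
    return cutoff, droppable, boundary
```

Dropping a whole partition is a metadata operation, so only the single boundary partition ever needs row-by-row deletion.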
API Gateway #
A unified read API for everything you've ingested. Same Bearer token as the write paths, same rate-limit primitive (token bucket per project), but exposes the underlying query layer over plain JSON HTTP so you can build custom dashboards, ML pipelines, incident-response runbooks, or exports.
Endpoints
/api/v1/metrics/query
Bucketed metric aggregation. Query params:
name, from, to,
agg (avg/sum/max/min/p50/p95/p99/rate/count),
bucket seconds, group_by CSV,
filters as k=v,k=v.
/api/v1/metrics/names
List distinct metric names seen recently. lookback_hours defaults to 24.
/api/v1/traces
Recent traces. service, operation,
only_errors, since_seconds,
limit (max 500).
/api/v1/traces/:trace_id
All spans for one trace, in start-time order.
/api/v1/logs/search
Full-text log search. q, severity_min,
service, trace_id,
since_seconds, limit (max 1000).
/api/v1/services
Service map snapshot — nodes + cross-service edges with call/avg/error_rate.
Example: rolling p95 latency
# p95 of http.server.duration_ms over the last 5m, in 10s buckets,
# grouped by service:
curl -s "$URL/api/v1/metrics/query?\
name=http.server.duration_ms&\
from=2026-05-13T17:00:00Z&\
to=2026-05-13T17:05:00Z&\
agg=p95&\
bucket=10&\
group_by=service" \
-H "authorization: Bearer st_YOUR_API_KEY"
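The same query can be assembled programmatically. The Python sketch below only builds the URL from the documented parameters; the helper name metrics_query_url is illustrative, and sending the request (with the Bearer header shown above) is left to whichever HTTP client you use.

```python
from urllib.parse import urlencode


def metrics_query_url(base_url, name, start, end, agg, bucket,
                      group_by=(), filters=None):
    """Build a /api/v1/metrics/query URL from the documented query params."""
    params = {"name": name, "from": start, "to": end,
              "agg": agg, "bucket": bucket}
    if group_by:
        params["group_by"] = ",".join(group_by)       # CSV, as documented
    if filters:
        params["filters"] = ",".join(f"{k}={v}" for k, v in filters.items())
    # urlencode percent-escapes ':' in timestamps and '=' / ',' in filters.
    return f"{base_url}/api/v1/metrics/query?{urlencode(params)}"
```

Letting urlencode do the escaping avoids the hand-built query string breaking when a filter value contains characters such as `/` or `&`.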
Rate-limit headers
Every response includes:
- X-RateLimit-Limit: bucket capacity
- X-RateLimit-Policy: capacity;w=window (in seconds)
- Retry-After: only on 429, seconds to wait
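A client can use these headers to back off politely. The Python sketch below parses the policy value and decides how long to wait after a response; the helper names and the one-second fallback are assumptions for illustration, not part of the API.

```python
def parse_policy(value: str):
    """Parse 'capacity;w=window' into (capacity, window_seconds)."""
    capacity, _, rest = value.partition(";")
    window = None
    for part in rest.split(";"):
        key, _, val = part.strip().partition("=")
        if key == "w" and val:
            window = int(val)
    return int(capacity), window


def retry_delay(status: int, headers: dict, default: float = 1.0) -> float:
    """Seconds to wait before retrying; 0 when the request was not limited."""
    if status != 429:
        return 0.0
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return float(lowered["retry-after"])
    except (KeyError, ValueError):
        # Header missing or malformed: fall back to a fixed pause.
        return default
```

A typical loop would call retry_delay after every response and sleep for the returned value before re-issuing the same request.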
AI Gateway #
Project-scoped LLM gateway. A single endpoint sits in front of
Anthropic, OpenAI, and a built-in echo provider (for dev/tests).
Every call is rate-limited, cost-tracked, recorded in ai_requests,
and, uniquely to Funnel, emitted back into the project's own metrics
as ai.request.latency_ms, ai.request.cost_cents, and token counters,
so your AI usage shows up in the Metrics Explorer next to your app
telemetry.
Endpoint
/v1/ai/chat
Authenticated with the project Bearer key. Body is OpenAI-style
(messages, model, max_tokens). Set the provider field to route to a
specific backend; otherwise the project default applies.
Request
curl -s -X POST $URL/v1/ai/chat \
-H "authorization: Bearer st_YOUR_API_KEY" \
-H "content-type: application/json" \
-d '{
"messages": [
{"role": "system", "content": "You are a senior SRE."},
{"role": "user", "content": "Summarize the last hour of error logs for service=api."}
],
"max_tokens": 512
}'
Response shape
{
"content": "Over the last hour there were 47 errors across 'api' …",
"cost_cents": 0.299,
"latency_ms": 1240,
"model": "claude-3-5-sonnet-latest",
"provider": "anthropic",
"usage": {
"completion_tokens": 174,
"prompt_tokens": 132,
"total_tokens": 306
}
}
Error responses
| Status | Body error code | Meaning |
|---|---|---|
| 402 | budget_exhausted | Monthly budget hit; raise it in AI Settings. |
| 429 | rate_limited | Token bucket empty; honor Retry-After. |
| 503 | ai_disabled | AI gateway disabled for this project. |
| 400 | unknown_provider | Provider name is not anthropic, openai, or echo. |
| 502 | provider_error | Upstream LLM provider returned an error (network / 4xx / 5xx). |
Provider matrix
Pricing is approximate, used for budget tracking only — always verify against the provider's current rates.
| Provider | Models | Input ¢/1k tok | Output ¢/1k tok |
|---|---|---|---|
| anthropic | claude-3-5-sonnet-latest | 0.30 | 1.50 |
| anthropic | claude-3-5-haiku-latest | 0.08 | 0.40 |
| anthropic | claude-3-opus-latest | 1.50 | 7.50 |
| openai | gpt-4o | 0.50 | 1.50 |
| openai | gpt-4o-mini | 0.015 | 0.06 |
| echo | echo-1 | free | free |
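For budgeting, the matrix translates directly into a per-request estimate. The Python sketch below copies the rates from the table and applies them to the usage block of a response; estimate_cost_cents is a hypothetical helper, and Funnel's own rounding or rate table may differ slightly from this arithmetic.

```python
# (input, output) price in cents per 1k tokens, copied from the matrix above.
PRICES_CENTS_PER_1K = {
    ("anthropic", "claude-3-5-sonnet-latest"): (0.30, 1.50),
    ("anthropic", "claude-3-5-haiku-latest"): (0.08, 0.40),
    ("anthropic", "claude-3-opus-latest"): (1.50, 7.50),
    ("openai", "gpt-4o"): (0.50, 1.50),
    ("openai", "gpt-4o-mini"): (0.015, 0.06),
    ("echo", "echo-1"): (0.0, 0.0),
}


def estimate_cost_cents(provider, model, prompt_tokens, completion_tokens):
    """Approximate request cost in cents from token counts."""
    price_in, price_out = PRICES_CENTS_PER_1K[(provider, model)]
    return prompt_tokens / 1000 * price_in + completion_tokens / 1000 * price_out
```

Applied to the sample response earlier in this section (132 prompt tokens, 174 completion tokens on claude-3-5-sonnet-latest), this gives roughly 0.30 cents, in line with the reported cost_cents.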
Load Balancer & Health #
Funnel's load-balancing strategy is straightforward: run multiple
BEAM nodes connected via libcluster, fronted by any
standard L4/L7 load balancer (nginx, Caddy, AWS ALB, fly.io edge).
Phoenix.PubSub
is cluster-aware out of the box, so dashboards on one node see
telemetry ingested on another within milliseconds.
For workloads that benefit from cache locality (the ETS rate-limit
buckets, in-memory anomaly baselines), the
Funnel.LoadBalancer
module exposes a deterministic consistent-hash router that maps a
project_id to a "home node":
# Decide whether to handle this project locally or RPC to its home node:
case Funnel.LoadBalancer.where_should_handle(project_id) do
:local ->
handle_locally(project_id, payload)
{:remote, node} ->
:rpc.call(node, MyModule, :handle, [project_id, payload])
end
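The idea of a deterministic home node can be sketched in Python with rendezvous (highest-random-weight) hashing, one common form of consistent hashing. This is not necessarily the algorithm Funnel.LoadBalancer uses; the node names and both function names below are illustrative.

```python
import hashlib


def home_node(project_id, nodes):
    """Pick the node with the highest hash score for this project."""
    def score(node):
        digest = hashlib.sha256(f"{node}:{project_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    # Sorting first makes the result independent of input order on ties.
    return max(sorted(nodes), key=score)


def where_should_handle(project_id, nodes, self_node):
    """Mirror the :local / {:remote, node} decision from the Elixir example."""
    target = home_node(project_id, nodes)
    return ("local", None) if target == self_node else ("remote", target)
```

Every node computes the same answer without coordination, and removing a node only moves the projects that were homed on it, which keeps ETS caches on the surviving nodes warm.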
Probe endpoints
Three endpoints, intentionally unauthenticated, designed for load balancers and orchestration systems.
/health
Liveness. Always 200 if the VM is up. Cheap — does no I/O.
/ready
Readiness. 200 only if DB + PubSub + ingestion pipeline are all
healthy; 503 otherwise. Use this for your LB's "remove from rotation"
check.
/status
Detailed JSON snapshot: cluster membership, DB latency, PubSub
reachability, uptime, process count, app version. Suitable for
ops dashboards.
Suggested LB config (nginx)
upstream funnel_app {
server funnel-1:4000 max_fails=2 fail_timeout=15s;
server funnel-2:4000 max_fails=2 fail_timeout=15s;
keepalive 32;
}
server {
listen 443 ssl http2;
# Health checks the upstreams
location = /lb-check { proxy_pass http://funnel_app/ready; }
location / {
proxy_pass http://funnel_app;
proxy_http_version 1.1;
proxy_set_header Connection "";
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
}
# WebSockets (LiveView)
location /live {
proxy_pass http://funnel_app;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
}
}
Kubernetes probes
livenessProbe:
  httpGet: { path: /health, port: 4000 }
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet: { path: /ready, port: 4000 }
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
Webhook payloads (alerts → external) #
When an alert rule fires, Funnel POSTs this body to every enabled
webhook destination for the project. Signed with HMAC-SHA256 via
X-Funnel-Signature: sha256=…
(when a webhook secret is configured).
{
"fired_at": "2026-05-11T17:03:12.483Z",
"message": "error_rate for api = 0.084 (threshold 0.05)",
"project_id": 1,
"rule": {
"id": 42,
"kind": "error_rate",
"name": "API error rate"
},
"severity": "critical",
"type": "alert.fired",
"value": 0.084
}
Verifying the signature (Node.js)
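const crypto = require("crypto");
// rawBody must be the exact bytes received, before any JSON parsing.
function verify(rawBody, signatureHeader, secret) {
  const expected = Buffer.from(
    "sha256=" +
      crypto.createHmac("sha256", secret).update(rawBody).digest("hex")
  );
  const received = Buffer.from(signatureHeader || "");
  // timingSafeEqual throws if the lengths differ, so check that first.
  return (
    received.length === expected.length &&
    crypto.timingSafeEqual(received, expected)
  );
}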
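For Python receivers, an equivalent check can be written with the standard library. This sketch assumes the same scheme as the Node.js sample above (hex-encoded HMAC-SHA256 over the raw body, prefixed with sha256=); the function name verify_signature is illustrative.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, secret: str) -> bool:
    """Check an X-Funnel-Signature header against the raw request body."""
    expected = "sha256=" + hmac.new(
        secret.encode("utf-8"), raw_body, hashlib.sha256
    ).hexdigest()
    # compare_digest runs in constant time and tolerates unequal lengths.
    return hmac.compare_digest(
        expected.encode("utf-8"), (signature_header or "").encode("utf-8")
    )
```

As with the Node.js version, the check has to run on the bytes exactly as received; re-serialising parsed JSON can change whitespace or key order and break the comparison.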
SDKs & client samples #
curl -X POST http://localhost:4000/v1/traces \
-H "authorization: Bearer st_YOUR_API_KEY" \
-H "content-type: application/json" \
-d '{"spans":[{"attributes":{"http.method":"GET","http.route":"/users"},"duration_ms":250.0,"end_time":"2026-05-11T17:00:00.250Z","kind":"server","operation_name":"GET /users","parent_span_id":null,"service_name":"api","span_id":"0123456789abcdef","start_time":"2026-05-11T17:00:00.000Z","status":"ok","trace_id":"a1f3c4d5b6e7890123456789abcdef00"},{"duration_ms":160.0,"end_time":"2026-05-11T17:00:00.180Z","kind":"client","operation_name":"SELECT users","parent_span_id":"0123456789abcdef","service_name":"db","span_id":"fedcba9876543210","start_time":"2026-05-11T17:00:00.020Z","status":"ok","trace_id":"a1f3c4d5b6e7890123456789abcdef00"}]}'
body =
Jason.encode!(%{
"metrics" => [
%{
"name" => "http.server.duration_ms",
"kind" => "gauge",
"value" => 182.3,
"attributes" => %{"service" => "api"}
}
]
})
Finch.build(
:post,
"http://localhost:4000/v1/metrics",
[
{"authorization", "Bearer " <> System.fetch_env!("FUNNEL_API_KEY")},
{"content-type", "application/json"}
],
body
)
|> Finch.request(MyApp.Finch)
await fetch("http://localhost:4000/v1/logs", {
method: "POST",
headers: {
"authorization": `Bearer ${process.env.FUNNEL_API_KEY}`,
"content-type": "application/json"
},
body: JSON.stringify({
logs: [
{
time: new Date().toISOString(),
severity: "info",
service_name: "web",
message: "user signed in",
attributes: { user_id: 42 }
}
]
})
});
import os, requests
requests.post(
"http://localhost:4000/v1/traces",
headers={
"authorization": f"Bearer {os.environ['FUNNEL_API_KEY']}",
"content-type": "application/json",
},
json={
"spans": [
{
"start_time": "2026-05-11T17:00:00Z",
"end_time": "2026-05-11T17:00:00.250Z",
"trace_id": "a1f3c4d5b6e7890123456789abcdef00",
"span_id": "0123456789abcdef",
"service_name": "api",
"operation_name": "GET /users",
"duration_ms": 250.0,
"status": "ok",
}
]
},
timeout=5,
)
Connectors (33 ready-to-copy recipes) #
Funnel speaks OpenTelemetry, so anything that can emit OTLP works out of
the box. Below is a copy-and-paste recipe for every one of the 33
curated connectors. In every example, replace YOUR-FUNNEL with your host
and st_YOUR_KEY with an API key created in API keys.
https://YOUR-FUNNEL
Authorization: Bearer st_YOUR_KEY
OTLP/HTTP · OTLP/gRPC · native JSON
Infrastructure
6 connectors
Kubernetes
OTel Collector DaemonSet via Helm.
# values.yaml for the open-telemetry/opentelemetry-collector chart
mode: daemonset
config:
  receivers:
    otlp: { protocols: { http: {}, grpc: {} } }
    kubeletstats: { collection_interval: 30s, auth_type: serviceAccount }
    filelog: { include: [/var/log/pods/*/*/*.log] }
  exporters:
    otlphttp:
      # Base URL only: the exporter appends /v1/metrics, /v1/traces, /v1/logs.
      endpoint: https://YOUR-FUNNEL
      headers:
        authorization: "Bearer st_YOUR_KEY"
  service:
    pipelines:
      metrics: { receivers: [otlp, kubeletstats], exporters: [otlphttp] }
      traces: { receivers: [otlp], exporters: [otlphttp] }
      logs: { receivers: [otlp, filelog], exporters: [otlphttp] }
# Install:
# helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
# helm install otel open-telemetry/opentelemetry-collector -f values.yaml
Docker
Run the OTel Collector container, scrape Docker stats.
# otel.yaml
receivers:
  docker_stats: { collection_interval: 30s }
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [docker_stats], exporters: [otlphttp] }
# Run (the docker_stats receiver ships in the contrib distribution):
docker run -d --name otel-collector \
  -v "$PWD/otel.yaml":/etc/otelcol-contrib/config.yaml \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  otel/opentelemetry-collector-contrib:latest
AWS CloudWatch
AWS Distro for OpenTelemetry (ADOT).
# adot-config.yaml
receivers:
  awscloudwatchmetrics:
    region: us-east-1
    metrics:
      namespaces: [AWS/EC2, AWS/RDS, AWS/Lambda]
    collection_interval: 60s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [awscloudwatchmetrics], exporters: [otlphttp] }
# Run via ECS task definition or:
# aws-otel-collector --config adot-config.yaml
Azure Monitor
Pull metrics from Azure subscription.
receivers:
  azuremonitor:
    subscription_id: ${AZURE_SUB}
    tenant_id: ${AZURE_TENANT}
    client_id: ${AZURE_CLIENT}
    client_secret: ${AZURE_SECRET}
    collection_interval: 60s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [azuremonitor], exporters: [otlphttp] }
GCP Cloud Operations
Pull from Cloud Monitoring API.
receivers:
  googlecloudmonitoring:
    project_id: my-gcp-project
    collection_interval: 60s
    metrics_list:
      - metric_name: compute.googleapis.com/instance/cpu/utilization
      - metric_name: cloudsql.googleapis.com/database/cpu/utilization
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [googlecloudmonitoring], exporters: [otlphttp] }
Linux host
Lightweight heartbeat shell agent via cron.
# /etc/cron.d/funnel-heartbeat — runs every minute
* * * * * root /usr/local/bin/funnel-heartbeat.sh
# /usr/local/bin/funnel-heartbeat.sh
#!/usr/bin/env bash
set -eu
FUNNEL=https://YOUR-FUNNEL
KEY=st_YOUR_KEY
load=$(awk '{print $1}' /proc/loadavg)
mem=$(free | awk '/Mem:/ {printf "%.1f", $3/$2*100}')
curl -sS -X POST "$FUNNEL/v1/hosts/heartbeat" \
-H "Authorization: Bearer $KEY" \
-H "Content-Type: application/json" \
-d "{\"hostname\":\"$(hostname)\",\"load\":$load,\"mem_used_pct\":$mem}"
APM
7 connectors
Elixir
FunnelAgent attaches :telemetry handlers.
# mix.exs
defp deps, do: [
{:funnel_agent, "~> 0.1"}
]
# lib/my_app/application.ex
def start(_type, _args) do
FunnelAgent.start(
endpoint: "https://YOUR-FUNNEL/v1",
api_key: System.fetch_env!("FUNNEL_KEY"),
service_name: "my-app"
)
children = [...]
Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
end
# Auto-instruments :phoenix, :ecto, :oban, :bandit.
Node.js
@opentelemetry/sdk-node with auto-instrumentations.
// npm i @opentelemetry/sdk-node \
// @opentelemetry/auto-instrumentations-node \
// @opentelemetry/exporter-trace-otlp-http
// instrumentation.js
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
new NodeSDK({
serviceName: 'my-app',
traceExporter: new OTLPTraceExporter({
url: 'https://YOUR-FUNNEL/v1/traces',
headers: { authorization: 'Bearer st_YOUR_KEY' }
}),
instrumentations: [getNodeAutoInstrumentations()]
}).start();
// Run: node -r ./instrumentation.js app.js
Python
opentelemetry-instrument zero-code agent.
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install
OTEL_SERVICE_NAME=my-app \
OTEL_EXPORTER_OTLP_ENDPOINT=https://YOUR-FUNNEL \
OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer st_YOUR_KEY" \
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
opentelemetry-instrument python app.py
Go
go.opentelemetry.io/otel + OTLP/HTTP exporter.
// go get go.opentelemetry.io/otel \
// go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp \
// go.opentelemetry.io/otel/sdk/trace
import (
"context"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
)
func initTracer(ctx context.Context) (*sdktrace.TracerProvider, error) {
exp, err := otlptracehttp.New(ctx,
otlptracehttp.WithEndpoint("YOUR-FUNNEL"), // no scheme
otlptracehttp.WithURLPath("/v1/traces"),
otlptracehttp.WithHeaders(map[string]string{
"authorization": "Bearer st_YOUR_KEY",
}),
)
if err != nil { return nil, err }
tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
otel.SetTracerProvider(tp)
return tp, nil
}
Java / JVM
Drop-in -javaagent, no code changes.
# Download opentelemetry-javaagent.jar from
# https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases
java -javaagent:opentelemetry-javaagent.jar \
-Dotel.service.name=my-app \
-Dotel.exporter.otlp.endpoint=https://YOUR-FUNNEL \
-Dotel.exporter.otlp.headers="authorization=Bearer st_YOUR_KEY" \
-Dotel.exporter.otlp.protocol=http/protobuf \
-jar app.jar
Ruby
opentelemetry-ruby SDK + auto-instrument.
# Gemfile
gem 'opentelemetry-sdk'
gem 'opentelemetry-exporter-otlp'
gem 'opentelemetry-instrumentation-all'
# config/initializers/otel.rb (Rails) or boot.rb
require 'opentelemetry/sdk'
require 'opentelemetry/exporter/otlp'
require 'opentelemetry/instrumentation/all'
ENV['OTEL_SERVICE_NAME'] ||= 'my-app'
ENV['OTEL_EXPORTER_OTLP_ENDPOINT'] ||= 'https://YOUR-FUNNEL'
ENV['OTEL_EXPORTER_OTLP_HEADERS'] ||= 'authorization=Bearer st_YOUR_KEY'
OpenTelemetry::SDK.configure { |c| c.use_all }
Rust
opentelemetry-otlp + reqwest client.
# Cargo.toml
[dependencies]
opentelemetry = "0.21"
opentelemetry_sdk = { version = "0.21", features = ["rt-tokio"] }
opentelemetry-otlp = { version = "0.14", features = ["http-proto", "reqwest-client"] }
// src/main.rs
use opentelemetry_otlp::WithExportConfig;
use std::collections::HashMap;
fn init_tracer() -> Result<(), Box<dyn std::error::Error>> {
let mut headers = HashMap::new();
headers.insert("authorization".into(), "Bearer st_YOUR_KEY".into());
opentelemetry_otlp::new_pipeline()
.tracing()
.with_exporter(
opentelemetry_otlp::new_exporter()
.http()
.with_endpoint("https://YOUR-FUNNEL/v1/traces")
.with_headers(headers),
)
.install_batch(opentelemetry_sdk::runtime::Tokio)?;
Ok(())
}
CI/CD
4 connectors
GitHub Actions
Per-job duration metric via curl step.
# .github/workflows/build.yml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - id: timer
        run: echo "started=$(date +%s)" >> $GITHUB_OUTPUT
      - run: ./build.sh
      - name: Report to Funnel
        if: always()
        env:
          FUNNEL_KEY: ${{ secrets.FUNNEL_KEY }}
        run: |
          elapsed=$(( $(date +%s) - ${{ steps.timer.outputs.started }} ))
          curl -sS -X POST https://YOUR-FUNNEL/v1/metrics \
            -H "Authorization: Bearer $FUNNEL_KEY" \
            -H "Content-Type: application/json" \
            -d @- <<JSON
          { "metrics": [{
            "name": "ci.github.job.duration_s",
            "kind": "gauge",
            "value": $elapsed,
            "attributes": {
              "workflow": "${{ github.workflow }}",
              "job": "${{ github.job }}",
              "status": "${{ job.status }}",
              "repo": "${{ github.repository }}"
            }
          }]}
          JSON
GitLab CI
after_script reports CI_JOB_DURATION.
# .gitlab-ci.yml
.report_to_funnel: &report_to_funnel
  after_script:
    - |
      curl -sS -X POST https://YOUR-FUNNEL/v1/metrics \
        -H "Authorization: Bearer $FUNNEL_KEY" \
        -H "Content-Type: application/json" \
        -d "{
          \"metrics\": [{
            \"name\": \"ci.gitlab.job.duration_s\",
            \"kind\": \"gauge\",
            \"value\": $CI_JOB_DURATION,
            \"attributes\": {
              \"job\": \"$CI_JOB_NAME\",
              \"status\": \"$CI_JOB_STATUS\",
              \"ref\": \"$CI_COMMIT_REF_NAME\"
            }
          }]
        }"
build:
  <<: *report_to_funnel
  script: ./build.sh
Jenkins
OpenTelemetry plugin, UI configuration.
1. Manage Jenkins → Plugins → install "OpenTelemetry".
2. Manage Jenkins → Configuration → OpenTelemetry.
3. Endpoint: https://YOUR-FUNNEL
4. Headers: authorization=Bearer st_YOUR_KEY
5. Protocol: OTLP HTTP/protobuf (or gRPC if FUNNEL_GRPC_ENABLED=1, port 4317)
6. Save.
Every build now exports a root span "ci.pipeline.run" with
child spans per stage/step, plus build duration and outcome metrics.
Argo CD
Sync hook posts to /v1/logs.
# Add a notification webhook to argocd-notifications-cm ConfigMap:
data:
  service.webhook.funnel: |
    url: https://YOUR-FUNNEL/v1/logs
    headers:
      - name: Authorization
        value: "Bearer st_YOUR_KEY"
      - name: Content-Type
        value: application/json
  template.app-sync-status: |
    webhook:
      funnel:
        method: POST
        body: |
          { "logs": [{
            "severity": "info",
            "service_name": "argocd",
            "message": "{{.app.metadata.name}} sync {{.app.status.sync.status}}",
            "attributes": {
              "app": "{{.app.metadata.name}}",
              "revision": "{{.app.status.sync.revision}}"
            }
          }]}
  trigger.on-sync-status-change: |
    - when: app.status.sync.status in ['Synced', 'OutOfSync']
      send: [app-sync-status]
Messaging
2 connectors
Kafka
JMX → OTel Collector jmx receiver.
# Make sure Kafka has JMX exposed (KAFKA_JMX_PORT=9999).
receivers:
  jmx:
    jar_path: /opt/opentelemetry-jmx-metrics.jar
    endpoint: kafka:9999
    target_system: kafka
    collection_interval: 30s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [jmx], exporters: [otlphttp] }
# Exposes: kafka.message.count, kafka.partition.lag, kafka.consumer.lag, ...
RabbitMQ
rabbitmq receiver scrapes Management API.
# Enable plugin: rabbitmq-plugins enable rabbitmq_management
receivers:
  rabbitmq:
    endpoint: http://rabbitmq:15672
    username: ${RABBIT_USER}
    password: ${RABBIT_PASS}
    collection_interval: 30s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [rabbitmq], exporters: [otlphttp] }
# Exposes: rabbitmq.consumer.count, rabbitmq.message.published, queue.size, ...
Database
4 connectors
Redis
redis receiver runs INFO every 30s.
receivers:
  redis:
    endpoint: redis:6379
    password: ${REDIS_PASS} # omit if disabled
    collection_interval: 30s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [redis], exporters: [otlphttp] }
PostgreSQL
postgresql receiver scrapes pg_stat_* views.
# Grant the read-only monitoring user pg_monitor first (run in psql):
#   CREATE USER monitoring WITH PASSWORD 'xxx';
#   GRANT pg_monitor TO monitoring;
receivers:
  postgresql:
    endpoint: postgres:5432
    transport: tcp
    username: monitoring
    password: ${PG_PASS}
    databases: [mydb]
    collection_interval: 30s
    tls: { insecure: true }
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [postgresql], exporters: [otlphttp] }
MySQL
mysql receiver scrapes performance_schema.
# Grant the monitoring user (run in the mysql client):
#   CREATE USER 'monitoring'@'%' IDENTIFIED BY 'xxx';
#   GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'monitoring'@'%';
receivers:
  mysql:
    endpoint: mysql:3306
    username: monitoring
    password: ${MYSQL_PASS}
    collection_interval: 30s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [mysql], exporters: [otlphttp] }
MongoDB
mongodb receiver: serverStatus + dbStats.
# Create a user with clusterMonitor role:
# use admin
# db.createUser({ user: "monitoring", pwd: "xxx",
#   roles: [{ role: "clusterMonitor", db: "admin" }] })
receivers:
  mongodb:
    hosts:
      - endpoint: mongo:27017
    username: monitoring
    password: ${MONGO_PASS}
    collection_interval: 30s
exporters:
  otlphttp:
    # Base URL only: the exporter appends /v1/metrics itself.
    endpoint: https://YOUR-FUNNEL
    headers: { authorization: "Bearer st_YOUR_KEY" }
service:
  pipelines:
    metrics: { receivers: [mongodb], exporters: [otlphttp] }
Security
4 connectors
GitHub Secret Scanning
Forward secret alerts to /v1/findings.
# .github/workflows/secret-alert-forward.yml
on:
  secret_scanning_alert: { types: [created, resolved] }
jobs:
  forward:
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -sS -X POST https://YOUR-FUNNEL/v1/findings \
            -H "Authorization: Bearer ${{ secrets.FUNNEL_KEY }}" \
            -H "Content-Type: application/json" \
            -d @- <<JSON
          {
            "source": "secret_scanner",
            "severity": "high",
            "title": "${{ github.event.alert.secret_type_display_name }}",
            "target": "${{ github.repository }}",
            "detail": "${{ github.event.alert.html_url }}"
          }
          JSON
Trivy
Scan output piped to /v1/findings.
# Scan, then post each vulnerability:
trivy fs --format json --output trivy.json .
jq -c '.Results[]?.Vulnerabilities[]? | {
source: "vuln_scanner",
severity: (.Severity | ascii_downcase),
title: (.VulnerabilityID + " in " + .PkgName),
target: .PkgName,
detail: (.Title // .Description // "")
}' trivy.json | while read -r body; do
curl -sS -X POST https://YOUR-FUNNEL/v1/findings \
-H "Authorization: Bearer $FUNNEL_KEY" \
-H "Content-Type: application/json" \
-d "$body"
done
Falco
http_output sends every event as JSON.
# falco.yaml
json_output: true
json_include_output_property: true
http_output:
  enabled: true
  url: "https://YOUR-FUNNEL/v1/findings"
  user_agent: "falco"
  insecure: false
  headers:
    - "Authorization: Bearer st_YOUR_KEY"
    - "Content-Type: application/json"
# Funnel accepts Falco's native envelope and maps:
# priority -> severity
# rule -> title
# output -> detail
# output_fields.container.name -> target
AWS CloudTrail
Lambda subscribes to CloudTrail S3 PUTs.
# Lambda triggered by S3 PUT on the CloudTrail log bucket.
# Forwards every event with errorCode set as a finding.
import boto3, gzip, json, os, urllib3
http = urllib3.PoolManager()
def lambda_handler(event, _):
    s3 = boto3.client('s3')
    for rec in event['Records']:
        obj = s3.get_object(
            Bucket=rec['s3']['bucket']['name'],
            Key=rec['s3']['object']['key'])
        # CloudTrail log files are gzip-compressed JSON.
        body = json.loads(gzip.decompress(obj['Body'].read()))
        for e in body.get('Records', []):
            if not e.get('errorCode'):
                continue
            http.request('POST',
                os.environ['FUNNEL'] + '/v1/findings',
                headers={
                    'Authorization': 'Bearer ' + os.environ['FUNNEL_KEY'],
                    'Content-Type': 'application/json',
                },
                body=json.dumps({
                    'source': 'cloudtrail',
                    'severity': 'medium',
                    'title': e['errorCode'],
                    'target': e.get('userIdentity', {}).get('arn', 'unknown'),
                    'detail': e.get('errorMessage', ''),
                }))
Cost
2 connectors
AWS Cost Explorer
Daily Lambda → /v1/cost/records.
# Run on EventBridge cron(0 6 * * *)
# IAM: ce:GetCostAndUsage
import boto3, requests, os, datetime as dt
yesterday = (dt.date.today() - dt.timedelta(days=1)).isoformat()
today = dt.date.today().isoformat()
ce = boto3.client('ce')
resp = ce.get_cost_and_usage(
    TimePeriod={'Start': yesterday, 'End': today},
    Granularity='DAILY',
    Metrics=['UnblendedCost'],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'SERVICE'}])
records = [{
    'date': yesterday,
    'service': g['Keys'][0],
    'cost_usd': float(g['Metrics']['UnblendedCost']['Amount']),
    'provider': 'aws'
} for g in resp['ResultsByTime'][0]['Groups']]
requests.post(
    os.environ['FUNNEL'] + '/v1/cost/records',
    headers={'Authorization': 'Bearer ' + os.environ['FUNNEL_KEY']},
    json={'records': records})
GCP Billing
BigQuery billing export → /v1/cost/records.
# Prereq: enable Billing Export to BigQuery.
# Run on Cloud Scheduler daily.
from google.cloud import bigquery
import requests, os
bq = bigquery.Client()
query = '''
    SELECT service.description AS service,
           SUM(cost) AS cost_usd,
           DATE(usage_start_time) AS day
    FROM `proj.billing.gcp_billing_export_v1_XXXXXX_*`
    WHERE DATE(usage_start_time) = CURRENT_DATE() - 1
    GROUP BY service, day
'''
records = [{
    'date': str(r['day']),
    'service': r['service'],
    'cost_usd': float(r['cost_usd']),
    'provider': 'gcp'
} for r in bq.query(query).result()]
requests.post(
    os.environ['FUNNEL'] + '/v1/cost/records',
    headers={'Authorization': 'Bearer ' + os.environ['FUNNEL_KEY']},
    json={'records': records})
Notifications
4 connectors
Slack
Incoming webhook URL — Funnel formats automatically.
1. In Slack: Apps → Incoming Webhooks → Add to Workspace.
Pick a channel, copy the URL:
https://hooks.slack.com/services/T.../B.../...
2. In Funnel: Alerts → New rule → Destination → Webhook.
Paste the URL. Save.
Funnel POSTs a Slack-shaped payload:
{ "text": "Alert: error_rate > 0.05 on service=api",
"attachments": [{ "color": "danger", "fields": [...] }] }
No transformation required.
PagerDuty
Events API v2 — paste integration key.
1. PagerDuty: Service → Integrations → + Events API v2.
Copy the Integration Key (32 hex chars).
2. Funnel: Alerts → New rule → Destination → PagerDuty.
Paste the Integration Key. Save.
Funnel POSTs to https://events.pagerduty.com/v2/enqueue:
{ "routing_key": "<INTEGRATION_KEY>",
"event_action": "trigger" | "resolve",
"payload": {
"summary": "<alert name> fired",
"severity": "critical" | "warning" | "info",
"source": "funnel"
}
}
Resolution events fire automatically when the rule un-breaches.
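The trigger/resolve payload above could be assembled along these lines. This is a minimal sketch, not Funnel's code: `pagerduty_event` is a hypothetical helper, the "resolved" summary wording is an assumption, and the optional `dedup_key` is included because PagerDuty's Events API v2 uses it to match a resolve to the incident it closes (how Funnel derives that key is not described here).

```python
from typing import Optional

ALLOWED_SEVERITIES = {"critical", "warning", "info"}  # the values listed above


def pagerduty_event(routing_key: str, alert_name: str, severity: str,
                    resolved: bool = False,
                    dedup_key: Optional[str] = None) -> dict:
    """Build an Events API v2 body shaped like the payload shown above."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    event = {
        "routing_key": routing_key,
        "event_action": "resolve" if resolved else "trigger",
        "payload": {
            # "resolved" wording is an assumption; the source only shows "fired".
            "summary": f"{alert_name} {'resolved' if resolved else 'fired'}",
            "severity": severity,
            "source": "funnel",
        },
    }
    if dedup_key is not None:
        event["dedup_key"] = dedup_key
    return event


trigger = pagerduty_event("<INTEGRATION_KEY>", "error_rate", "critical",
                          dedup_key="rule-42")
resolve = pagerduty_event("<INTEGRATION_KEY>", "error_rate", "critical",
                          resolved=True, dedup_key="rule-42")
```

Reusing the same `dedup_key` for the trigger and the later resolve is what lets PagerDuty close the right incident.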
Microsoft Teams
Incoming Webhook connector URL.
1. In Teams: Channel → ⋯ → Connectors → Incoming Webhook.
Name it "Funnel alerts". Copy the webhook URL.
2. In Funnel: Alerts → New rule → Destination → Webhook.
Paste the URL. Funnel detects the
outlook.office.com host and posts a
MessageCard-formatted body automatically.
No transformation required.
Email (Swoosh)
Use the project Mailer; no extra setup per alert.
# In Funnel UI:
#   Alerts → New rule → Destination → Email.
#   Enter one or more recipient addresses.
# SMTP is configured once, project-wide, in config/runtime.exs:
config :funnel, Funnel.Mailer,
  adapter: Swoosh.Adapters.SMTP,
  relay: System.fetch_env!("SMTP_HOST"),
  port: 587,
  username: System.fetch_env!("SMTP_USER"),
  password: System.fetch_env!("SMTP_PASS"),
  tls: :always,
  auth: :always,
  retries: 2
# Works out of the box with Mailtrap, SES, Postmark, SendGrid.
If it speaks OTLP, it works. Set OTEL_EXPORTER_OTLP_ENDPOINT=https://YOUR-FUNNEL and OTEL_EXPORTER_OTLP_HEADERS=authorization=Bearer st_YOUR_KEY. For anything custom, hand-roll JSON to POST /v1/{metrics,traces,logs}; see SDKs & samples.
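For the hand-rolled route, a request to /v1/metrics can be built with nothing but the standard library. The sketch below mirrors the Quickstart payload; `build_metrics_request` is a hypothetical helper name, and the base URL and key are the placeholders used elsewhere on this page.

```python
import json
import urllib.request


def build_metrics_request(base_url: str, api_key: str,
                          metrics: list) -> urllib.request.Request:
    """Prepare (but do not send) a native-JSON POST to /v1/metrics."""
    body = json.dumps({"metrics": metrics}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/metrics",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_metrics_request(
    "http://localhost:4000",
    "st_YOUR_API_KEY",
    [{
        "attributes": {"service": "api"},
        "kind": "counter",
        "name": "http.server.requests",
        "time": "2026-05-11T17:00:00Z",
        "value": 1.0,
    }],
)
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any traffic reaches Funnel.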
GraphQL — /api/graphql #
Funnel exposes a project-scoped GraphQL API for everything SDS-related: findings (with cursor pagination + multi-filter), rule statistics, per-pillar breakdowns, writer health, CPU budget, and live subscriptions for newly-created findings. Same Bearer-token auth as the REST API; rate-limited at 8 req/sec per project.
POST /api/graphql
GET /api/graphiql
Backed by Absinthe 1.7 with a real introspectable schema,
enums (uppercase names: CRITICAL, LOGS, etc.),
custom DateTime + JSON scalars, and cursor-paginated
connections. Subscription delivery uses Phoenix.PubSub over the existing
/live socket — no extra server to run.
Authentication
The same Authorization: Bearer st_… header that gates the
REST API gates GraphQL. The API key resolves to a project; the
FunnelWeb.GraphQL.ContextPlug puts that project on the
Absinthe context and every resolver reads it from there. The client
cannot pass a projectId argument to query a different
project — the API key is the only project selector.
Worked example
# Paginated finding list with multi-filter.
curl -X POST "$FUNNEL/api/graphql" \
  -H "Authorization: Bearer st_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query Findings($first: Int!, $sev: Severity) { findings(first: $first, filter: { severity: $sev }) { nodes { id time severity ruleName matchedSample sourceKind serviceName hitsCount fingerprint } pageInfo { hasNextPage endCursor } totalCount } }",
    "variables": { "first": 25, "sev": "CRITICAL" }
  }'
Errors come back in the standard
{ data: …, errors: [{ message }] }
shape. The most common errors are
"unauthenticated", returned when the API key resolves to no project,
and field-level validation errors, returned when an enum value is misspelled
(use CRITICAL, not critical).
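Because GraphQL reports failures inside the response body, a client usually needs a small step that inspects `errors` before touching `data`. The following sketch shows one way to do that; `graphql_body`, `unwrap`, and `GraphQLError` are hypothetical names, and only the `{ data, errors: [{ message }] }` shape comes from the description above.

```python
import json
from typing import Optional


class GraphQLError(Exception):
    """Raised when a response carries a non-empty "errors" list."""


def graphql_body(query: str, variables: Optional[dict] = None) -> str:
    """Serialize a query; json.dumps escapes newlines inside the query string."""
    return json.dumps({"query": query, "variables": variables or {}})


def unwrap(response: dict) -> dict:
    """Return response["data"], or raise with the joined error messages."""
    errors = response.get("errors") or []
    if errors:
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise GraphQLError(messages)
    return response.get("data") or {}


body = graphql_body("query {\n  health\n}")
data = unwrap({"data": {"health": "ok"}})
```

Serializing through `json.dumps` also sidesteps hand-escaping a multi-line query when moving from curl to a real client.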
GraphQL queries #
| Query | Returns | Use for |
|---|---|---|
| findings | FindingConnection | Paginated finding list with severity / source-kind / service / rule / fingerprint filters. |
| finding | Finding | Single finding by UUID. |
| topRules | [RuleStat] | Noisiest rules in the window — diagnose over-broad patterns. |
| bySourceKind | [SourceKindStat] | Distribution across logs / spans / rum / metrics / sessions. |
| topServices | [ServiceStat] | Distribution across emitting services. |
| budget | Budget | Current CPU token-bucket state for this project. |
| writerStats | WriterStats | Sharded FindingWriter health (per-shard breakdown). |
| health | String | Liveness ping. Always "ok". |
Single dashboard query
GraphQL's point is composing many queries in one HTTP round-trip. The SDS dashboard fetches everything it needs with one query:
# One round-trip for a custom SDS dashboard.
{
summary: findings(first: 1) { totalCount }
byKind: bySourceKind(hours: 24) { sourceKind count }
services: topServices(hours: 24, limit: 5) { service count }
rules: topRules(hours: 24, limit: 5) {
ruleName count severity
}
budget { tokens deniedCount }
writerStats {
written buffered dropped
perShard { shard written batches }
}
}
Cursor pagination
Pagination is connection-style and uses opaque cursors. Each cursor is a
base64-url-encoded (time, uuid) tuple that only the server decodes;
clients should treat it as an opaque string. Pass the
previous page's pageInfo.endCursor back as
after to fetch the next page.
{
topRules(hours: 24, limit: 10) {
ruleName
builtinId
severity
actionTaken
count
uniqueFingerprints
firstSeen
lastSeen
}
}
For windows up to 30 days the resolver returns a
totalCount; for longer windows the count is
null to avoid a slow COUNT(*) against the
partitioned table.
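The endCursor/after handshake can be sketched as a generator that keeps requesting pages until hasNextPage is false. This is an illustration under assumptions: `iter_findings` and the injected `execute(query, variables)` callable are hypothetical, and `execute` is expected to return the `data` object of a GraphQL response.

```python
from typing import Callable, Iterator

PAGE_QUERY = """
query Page($first: Int!, $after: String) {
  findings(first: $first, after: $after) {
    nodes { id time severity }
    pageInfo { hasNextPage endCursor }
  }
}
"""


def iter_findings(execute: Callable[[str, dict], dict],
                  page_size: int = 100) -> Iterator[dict]:
    """Yield every finding, following pageInfo.endCursor page by page."""
    after = None  # no cursor on the first request
    while True:
        data = execute(PAGE_QUERY, {"first": page_size, "after": after})
        connection = data["findings"]
        yield from connection["nodes"]
        info = connection["pageInfo"]
        if not info["hasNextPage"]:
            return
        after = info["endCursor"]  # opaque; passed back unchanged
```

Injecting `execute` keeps the loop independent of the HTTP client, so the same generator works with any transport.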
GraphQL subscriptions #
GraphQL subscription { findingCreated } delivers every
newly-persisted finding to subscribed clients in real time. Each shard
flush publishes to Phoenix.PubSub topic
sds:findings:<project_id> (plus a per-severity
variant), and Absinthe's subscription router fans the message out to
every matching client.
Topics are keyed by project_id, so a client of project A
cannot receive project B's findings even if it crafts the
socket message itself: Absinthe rejects the subscription at config
time when there is no matching project in context.
JavaScript example
// Connect via the Phoenix Socket. Channel topic
// "__absinthe__:control" handles GraphQL subscription frames.
import { Socket } from "phoenix";
import * as AbsintheSocket from "@absinthe/socket";
const phx = new Socket("/socket", {
  params: { token: "st_YOUR_KEY" }
});
const absintheSocket = AbsintheSocket.create(phx);
// send() registers the operation and returns a notifier;
// observe() attaches callbacks to that notifier.
const notifier = AbsintheSocket.send(absintheSocket, {
  operation: `subscription {
    findingCreated(severity: CRITICAL) {
      ruleName matchedSample serviceName sourceKind
    }
  }`,
  variables: {}
});
AbsintheSocket.observe(absintheSocket, notifier, {
  onResult: (data) => console.log("finding:", data)
});
Subscriptions can filter by severity at subscription-time:
findingCreated(severity: CRITICAL). Filtering happens at
the topic level, so a CRITICAL-only subscriber doesn't receive
INFO findings over the wire.
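Topic-level filtering can be pictured with a toy in-memory bus: each finding is published to a project-wide topic and to a per-severity topic, and a subscriber only ever sees the topic it joined. This is a conceptual sketch, not Funnel's PubSub code; the `Bus` class is hypothetical, and the per-severity topic format `sds:findings:<project_id>:<severity>` is an assumption (only the project-wide form is documented above).

```python
from collections import defaultdict
from typing import Callable, Optional


def topic_for(project_id: int, severity: Optional[str] = None) -> str:
    base = f"sds:findings:{project_id}"
    # The ":<severity>" suffix is a hypothetical format for the per-severity variant.
    return base if severity is None else f"{base}:{severity.lower()}"


class Bus:
    """Minimal in-memory publish/subscribe used only to illustrate fan-out."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish_finding(self, project_id: int, finding: dict) -> None:
        # Publish once to the project topic and once to the severity topic.
        for topic in (topic_for(project_id),
                      topic_for(project_id, finding["severity"])):
            for callback in self._subscribers[topic]:
                callback(finding)


bus = Bus()
critical_only = []
bus.subscribe(topic_for(1, "CRITICAL"), critical_only.append)
bus.publish_finding(1, {"severity": "INFO", "ruleName": "email"})
bus.publish_finding(1, {"severity": "CRITICAL", "ruleName": "aws_key"})
bus.publish_finding(2, {"severity": "CRITICAL", "ruleName": "aws_key"})
```

Because the filter is part of the topic name, the INFO finding and the other project's finding are never handed to the CRITICAL-only subscriber at all.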
GraphQL schema reference #
Condensed SDL view of the schema. Introspection is enabled, so tools like Insomnia, Apollo DevTools, and GraphiQL can autocomplete fields and validate queries before sending them.
type Query {
findings(filter: FindingFilter, first: Int, after: String): FindingConnection!
finding(id: ID!): Finding
topRules(hours: Int = 24, limit: Int = 10): [RuleStat!]!
bySourceKind(hours: Int = 24): [SourceKindStat!]!
topServices(hours: Int = 24, limit: Int = 10): [ServiceStat!]!
budget: Budget!
writerStats: WriterStats!
health: String!
}
type Subscription {
findingCreated(severity: Severity): Finding!
}
input FindingFilter {
severity: Severity
sourceKind: SourceKind
service: String
ruleId: Int
builtinId: String
fingerprint: String
since: DateTime
until: DateTime
}
enum Severity { CRITICAL HIGH MEDIUM LOW INFO }
enum SourceKind { LOGS SPANS RUM METRICS SESSIONS }
enum ActionKind { ALERT REDACT HASH DROP }
scalar DateTime
scalar JSON
type Finding {
id: ID!
time: DateTime!
ruleId: Int
builtinId: String
ruleName: String
patternSnapshot: String
severity: Severity!
sourceKind: SourceKind!
sourceId: String
actionTaken: ActionKind!
matchedSample: String
serviceName: String
fieldName: String
hitsCount: Int!
fingerprint: String
attributes: JSON
}
type FindingConnection {
edges: [FindingEdge!]!
nodes: [Finding!]!
pageInfo: PageInfo!
totalCount: Int
}
Type details
topRules
- ruleId: Int
- builtinId: String
- ruleName: String
- severity: Severity
- actionTaken: ActionKind
- count: Int!
- uniqueFingerprints: Int! (distinct secrets matched)
- firstSeen / lastSeen: DateTime
budget
- projectId: Int!
- tokens: Int! (microseconds remaining in the token bucket)
- deniedCount: Int! (times scanning was throttled since boot)
writerStats
- buffered / written / dropped / batches: Int! (aggregated totals)
- perShard: [WriterShardStats!]! (per-shard breakdown; 4 shards by default)
pageInfo
- hasNextPage: Boolean!
- endCursor: String (opaque; pass back as after)
Operational caveats
- Enum input form: uppercase names. Use severity: CRITICAL, not "critical".
- Date input: ISO-8601 with a Z timezone, for example since: "2026-05-17T00:00:00Z".
- Max per-page: first: 200; larger values are silently clamped.
- GraphiQL: only mounted in dev/test. Production never exposes the introspection-heavy playground at the edge; use a desktop client.
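The first three caveats can be handled once on the client, before variables are sent. The helper below is a sketch with hypothetical names (`to_graphql_datetime`, `findings_variables`); the enum values, the 200-item ceiling, and the Z-suffixed timestamp format are taken from this page.

```python
from datetime import datetime, timezone

MAX_PAGE_SIZE = 200
SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"}


def to_graphql_datetime(value: datetime) -> str:
    """Format an aware datetime as ISO-8601 UTC with a trailing Z."""
    if value.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def findings_variables(severity: str, since: datetime, first: int) -> dict:
    """Normalize inputs so the request matches the caveats listed above."""
    name = severity.upper()  # enums are uppercase names
    if name not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    return {
        # Clamp locally so the page size actually used is never a surprise.
        "first": min(first, MAX_PAGE_SIZE),
        "filter": {"severity": name, "since": to_graphql_datetime(since)},
    }


variables = findings_variables(
    "critical", datetime(2026, 5, 17, tzinfo=timezone.utc), first=500)
```

Clamping on the client makes the effective page size explicit instead of relying on the server's silent clamp.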
Error responses #
| Status | Meaning | When |
|---|---|---|
| 200 | OK | Accepted into the buffer. |
| 204 | No Content | RUM beacon accepted. |
| 401 | Unauthorized | Missing/invalid/revoked key. |
| 413 | Payload Too Large | Body exceeds 5 MB. |
| 415 | Unsupported Media | Non-JSON body to JSON endpoint. |
| 422 | Unprocessable | Malformed body — partial decode possible. |
| 500 | Server Error | Pipeline failure — safe to retry. |
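The table marks only 500 as safe to retry, which suggests a client policy like the one sketched here. It is an assumption-laden example rather than a prescribed algorithm: `send_with_retry` and `backoff_delays` are hypothetical names, and the backoff base, cap, and attempt count are illustrative values, not Funnel recommendations.

```python
import time
from typing import Callable, List

RETRYABLE_STATUSES = {500}  # per the table: pipeline failure, safe to retry


def should_retry(status: int) -> bool:
    return status in RETRYABLE_STATUSES


def backoff_delays(count: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Exponential delays: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(count)]


def send_with_retry(send: Callable[[], int], max_attempts: int = 4,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for delay in backoff_delays(max_attempts - 1):
        if not should_retry(status):
            break
        sleep(delay)
        status = send()
    return status
```

A 401, 413, 415, or 422 is returned to the caller immediately, since resending the same request would fail the same way.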