# Yorker Documentation (Full) Complete documentation for Yorker synthetic monitoring platform. For a curated summary, see https://yorkermonitoring.com/llms.txt # Getting Started ## Quick Start > Create your first monitor and start monitoring in under 5 minutes. # Quick Start This guide walks you through creating your first synthetic monitor with Yorker. You can use the Web UI for a visual approach or the CLI for infrastructure-as-code workflows. ## Sign up To get started, create an account at [yorkermonitoring.com](https://yorkermonitoring.com). After signing in, you land on the dashboard where you can create monitors, view results, and configure alerts. ## Pick a monitor type Yorker supports three monitor types: - **HTTP** — sends an HTTP request and asserts on the response (status, body, headers, timing, SSL, OpenAPI conformance). - **Browser** — runs a real Chromium session, either against a URL with automatic Core Web Vitals capture, or scripted with Playwright. - **MCP** — exercises a Model Context Protocol server over Streamable HTTP, validates the advertised tools, and (optionally) calls them with expected output. You can create monitors through the Web UI, the CLI, or via natural language. ## Create your first monitor (Web UI) 1. Open the dashboard and click **Create Monitor**. 2. Select **HTTP**, **Browser**, or **MCP**. 3. Enter the target URL (or MCP endpoint). 4. Choose one or more check locations. 5. Set the check frequency (default: every 5 minutes). 6. Click **Create**. The monitor starts running immediately. Results appear on the dashboard within one check interval. For a richer walkthrough — assertions, auth, browser scripts, MCP tool calls — see [Create a Monitor](/docs/guides/create-monitor). ## Create your first monitor (CLI) To manage monitors as code, use the Yorker CLI. 
### Install the CLI ```bash npm install -g @yorker/cli ``` ### Set your API key Generate an API key from **Settings > API Keys** in the dashboard, then export it: ```bash export YORKER_API_KEY=sk_... ``` ### Scaffold a config file ```bash yorker init ``` The interactive wizard creates a `yorker.config.yaml` with your project name, first monitor URL, type, and frequency. You can also pass flags to skip the prompts: ```bash yorker init --name my-app --url https://example.com --type http --frequency 5m ``` ### Deploy ```bash yorker deploy ``` The CLI validates your config, computes a diff against the current remote state, and applies changes. You will see a Terraform-style plan showing what will be created, updated, or deleted before any changes are applied. ### View results Check the status of your monitors: ```bash yorker status ``` Or stream results in real time: ```bash yorker results tail "Homepage" ``` Open the dashboard to see check results, response times, and status history. Each check result includes timing breakdowns (DNS, TLS, TTFB) and, for browser monitors, a filmstrip screenshot view. ## Next steps - [Create a Monitor](/docs/guides/create-monitor) — HTTP, browser, and MCP monitors, assertions, auth, labels. - [Set Up Alerts](/docs/guides/set-up-alerts) — consecutive failures, multi-location correlation, SSL alerts, burn-rate alerts. - [Define SLOs](/docs/guides/define-slos) — availability and performance objectives with error budgets and burn rate alerts. - [Deploy with CLI](/docs/guides/monitoring-as-code) — manage your full monitoring config as code with `yorker deploy`. - [Install Dashboards](/docs/guides/install-dashboards) — one-click provisioning of pre-built HyperDX / ClickStack dashboards. --- # Guides ## Create a Monitor > How to create HTTP, browser, and MCP monitors using the Web UI, CLI, or natural language. 
# Create a Monitor Yorker supports three monitor types: - **HTTP** — sends an HTTP request and asserts on the response (status code, body, headers, timing, SSL expiry, OpenAPI conformance). - **Browser** — runs a real Chromium session, either against a URL (automatic Core Web Vitals + screenshots) or scripted with Playwright TypeScript. - **MCP** — exercises a Model Context Protocol server over Streamable HTTP, validates the advertised tool list, and (optionally) calls tools with expected output. There are three ways to create monitors: the Web UI, the CLI (monitoring as code), and natural-language generation. ## Web UI To create a monitor through the dashboard: 1. Navigate to the dashboard and click **Create Monitor**. 2. Select the monitor type (**HTTP**, **Browser**, or **MCP**). 3. For HTTP monitors, fill in the URL, method, headers, authentication, and assertions. 4. For browser monitors, choose **URL mode** (just a URL) or **Scripted mode** and write a Playwright TypeScript script in the built-in editor. 5. For MCP monitors, enter the endpoint and optionally list expected tools and test calls. 6. Choose check locations and frequency. 7. Click **Create**. ## CLI / Monitoring as Code To manage monitors as code, define them in `yorker.config.yaml` and deploy with `yorker deploy`. ### HTTP monitor ```yaml project: my-app monitors: - name: API Health type: http url: https://api.example.com/health method: GET frequency: 1m locations: - loc_us_east - loc_eu_central timeoutMs: 10000 followRedirects: true headers: Accept: application/json labels: - env:production - service:api assertions: - type: status_code operator: equals value: 200 - type: response_time max: 2000 - type: body_contains value: '"status":"ok"' - type: body_json_path path: $.version operator: exists - type: header_value header: content-type operator: contains value: application/json - type: ssl_expiry daysBeforeExpiry: 14 ``` ### HTTP authentication Three auth types are supported. 
Add an `auth` block to any HTTP monitor: ```yaml # Basic auth auth: type: basic username: "{{secrets.API_USER}}" password: "{{secrets.API_PASS}}" # Bearer token auth: type: bearer token: "{{secrets.AUTH_TOKEN}}" # API key header (defaults to X-API-Key) auth: type: api-key header: X-API-Key value: "{{secrets.API_KEY}}" ``` ### Assertion types | Type | Fields | Description | |------|--------|-------------| | `status_code` | `operator` (equals, not_equals, less_than, greater_than), `value` | Assert on HTTP status code. Operator defaults to `equals`. | | `response_time` | `max` | Fail if response takes longer than `max` milliseconds. | | `body_contains` | `value` | Fail if response body does not contain the string. | | `body_matches` | `pattern` | Fail if response body does not match the regex pattern. | | `body_json_path` | `path`, `operator` (equals, not_equals, contains, exists), `value` | Assert on a JSONPath expression. Operator defaults to `equals`. | | `header_value` | `header`, `operator` (equals, contains, exists), `value` | Assert on a response header. Operator defaults to `equals`. | | `ssl_expiry` | `daysBeforeExpiry` | Fail if SSL certificate expires within the given number of days. Defaults to 14. | | `openapi_conformance` | `specId`, `operationPath` (optional), `validateHeaders` (optional) | Validate the response against an OpenAPI spec registered in Yorker. See [Assertions](/docs/reference/assertions#openapi_conformance). | See the [Assertions reference](/docs/reference/assertions) for full details. ### Browser monitor — URL mode The simplest browser monitor: Yorker navigates to the URL, captures Core Web Vitals, takes screenshots, and runs assertions. No script to write. URL-mode browser monitors are currently created via the Web UI or the REST API. `yorker deploy` is scripted-only for browser monitors — create URL-mode monitors from the dashboard, or via `POST /api/checks` with `browserConfig.browserMode: "url"`. 
See the [REST API reference](/docs/reference/api#create-check). ### Browser monitor — scripted mode Scripted browser monitors run full Playwright TypeScript scripts. Reference the script file from your config: ```yaml monitors: - name: Checkout Flow type: browser script: ./monitors/checkout.ts frequency: 5m locations: - loc_us_east viewport: width: 1280 height: 720 screenshotMode: every_step videoEnabled: false ``` The script file (`./monitors/checkout.ts`) is a **Playwright script body**, not a full test file. The Yorker runner wraps your script in an async function and injects `page` and `context` (both standard Playwright objects) for you to use. That means: - **Do not** write `import { test } from "@playwright/test"` or any other `import`/`require` statements — the script has no module scope. - **Do not** wrap the code in `test(...)` or `describe(...)` — Yorker doesn't run the Playwright test runner. - **Do** write the body of your check directly, referencing `page` (a `Page`) and `context` (a `BrowserContext`) as if they were already in scope. - **Do** use `// @step: Name` comments to mark steps. Yorker captures a screenshot at each step (when `screenshotMode: every_step`) and surfaces step timing in the filmstrip view. ```typescript // @step: Go to shop await page.goto("https://shop.example.com"); // @step: Add to cart await page.click("text=Add to Cart"); // @step: Checkout await page.click("text=Checkout"); await page.waitForSelector(".order-confirmation"); ``` #### Browser configuration options (scripted mode, YAML) | Field | Default | Description | |-------|---------|-------------| | `script` | *(required)* | Path to the Playwright TypeScript script file, relative to the config file. | | `viewport` | `{ width: 1280, height: 720 }` | Browser viewport dimensions. | | `device` | *(none)* | Playwright device name for emulation (e.g., `"iPhone 14"`). | | `screenshotMode` | `every_step` | When to capture screenshots: `every_step`, `failure_only`, or `disabled`. 
| | `videoEnabled` | `false` | Whether to record video of the browser session. | | `timeoutMs` | `30000` | Maximum script execution time (5000-120000 ms). | ### MCP monitor MCP monitors check the health of a Model Context Protocol server over Streamable HTTP. On each run, Yorker: 1. Connects to the `endpoint` and calls `tools/list`. 2. Verifies every tool in `expectedTools` is present (if configured). 3. Runs each `testCalls` entry: invokes the tool and checks the result contains `expectedOutputContains` (if provided). 4. Optionally detects schema drift — tools that appeared, disappeared, or changed signatures since the last successful run. ```yaml monitors: - name: Docs MCP Server type: mcp endpoint: https://mcp.example.com/sse frequency: 5m timeoutMs: 30000 locations: - loc_us_east auth: type: bearer token: "{{secrets.MCP_TOKEN}}" expectedTools: - search_docs - fetch_page testCalls: - toolName: search_docs arguments: query: "pricing" expectedOutputContains: "Plans" detectSchemaDrift: true ``` #### MCP configuration options | Field | Default | Description | |-------|---------|-------------| | `endpoint` | *(required)* | Streamable HTTP endpoint URL of the MCP server. | | `timeoutMs` | `30000` | Request timeout (5000-120000 ms). | | `auth` | *(none)* | Same auth block shape as HTTP monitors (`basic`, `bearer`, `api-key`). | | `expectedTools` | *(none)* | Array of tool names that MUST be present. Missing tools fail the check. | | `testCalls` | *(none)* | Array of tool invocations to run. Each entry has `toolName`, optional `arguments` (plain object), and optional `expectedOutputContains`. | | `detectSchemaDrift` | `true` | Emit schema-drift events when the tool list or tool signatures change. | > Note: MCP monitors cannot be executed locally with `yorker test` — deploy them and watch results via `yorker results tail` or the dashboard. ### Labels Attach labels to any monitor. 
Labels serve two purposes: filtering and grouping in the dashboard, and emission as OTel resource attributes so you can slice telemetry by label in your observability backend. ```yaml monitors: - name: Payments API type: http url: https://api.example.com/payments labels: - env:production # key:value label - service:payments - critical # boolean label (becomes yorker.label.critical="true") ``` Labels must match the pattern `[a-zA-Z0-9][a-zA-Z0-9_.:-]*`, max 128 characters. Plain labels (no colon) emit as `yorker.label.<name>="true"`. Key-value labels emit as `yorker.label.<key>="<value>"`. Omitting the `labels` field leaves labels unmanaged by config — the CLI preserves whatever labels exist on the remote. Setting `labels: []` explicitly clears all labels. ### Defaults and groups To avoid repeating configuration across monitors, use `defaults` and `groups`. **Defaults** apply to all monitors unless overridden: ```yaml defaults: frequency: 5m locations: - loc_us_east - loc_eu_central http: timeoutMs: 15000 followRedirects: true assertions: - type: status_code value: 200 browser: viewport: width: 1280 height: 720 screenshotMode: every_step ``` **Groups** apply shared settings to a subset of monitors: ```yaml groups: - name: API Endpoints frequency: 1m locations: - loc_us_east - loc_us_west - loc_eu_central monitors: - name: Users API type: http url: https://api.example.com/users - name: Orders API type: http url: https://api.example.com/orders ``` The cascade order is: **defaults -> group -> monitor**. Each level overrides the previous. Per-monitor assertions **replace** defaults entirely (they do not merge). To clear inherited assertions, set `assertions: []` on the monitor. ### Frequency format Frequencies use a shorthand: `30s` (seconds), `5m` (minutes), `1h` (hours). Valid range: 10 seconds to 24 hours.
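As an illustration of the shorthand rules above, here is a minimal parser sketch (a hypothetical helper, not part of the Yorker CLI):

```typescript
// Hypothetical sketch of the frequency shorthand described above:
// `30s`, `5m`, `1h`, valid from 10 seconds to 24 hours.
function parseFrequencySeconds(input: string): number {
  const match = /^(\d+)([smh])$/.exec(input);
  if (!match) throw new Error(`invalid frequency: ${input}`);
  const unitSeconds: Record<string, number> = { s: 1, m: 60, h: 3600 };
  const seconds = Number(match[1]) * unitSeconds[match[2]];
  if (seconds < 10 || seconds > 24 * 3600) {
    throw new Error(`frequency out of range (10s to 24h): ${input}`);
  }
  return seconds;
}
```

For example, `parseFrequencySeconds("5m")` yields `300`, while `"5s"` is rejected as below the 10-second floor.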
## CLI imperative commands For quick one-off monitors, use the `monitors` commands instead of a config file: ### Create a monitor ```bash yorker monitors create --name "API Health" --type http --url https://api.example.com/health --frequency 1m ``` ### List monitors ```bash yorker monitors list yorker monitors list --type http --status enabled ``` ### View monitor details ```bash yorker monitors get "API Health" ``` ### Edit, pause, and resume ```bash yorker monitors edit "API Health" --frequency 30s --add-location loc_eu_central yorker monitors pause "API Health" yorker monitors resume "API Health" ``` ### Delete a monitor ```bash yorker monitors delete "Old Endpoint" --yes ``` See the [CLI reference](/docs/reference/cli) for the full list of monitor commands and flags. ## Natural language To create a monitor using natural language, use either the Web UI or the API. **Web UI:** Click **Describe in plain English** on the create monitor page and type a description like "Monitor our checkout flow every 2 minutes from US and EU, alert if it takes longer than 3 seconds." Yorker generates a Playwright script you can edit before saving. **API:** Send a POST request to `/api/checks/generate` with a description. The endpoint returns a generated Playwright script you can review, optionally refine, and save as a browser monitor: ```bash curl -X POST https://app.yorkermonitoring.com/api/checks/generate \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "description": "Navigate to https://shop.example.com, add the first product to cart, and verify the cart shows it.", "targetUrl": "https://shop.example.com" }' ``` You can also pass `previousScript` and `refinement` to iteratively improve an existing script. See the [REST API reference](/docs/reference/api#generate-playwright-script). ### Generate HTTP checks from an OpenAPI spec `/api/checks/generate` also accepts a `spec` field. 
Pass an existing spec ID, an OpenAPI URL Yorker should fetch, or the name of a spec already on your team — Yorker creates one HTTP check per operation, skips operations that already have a check, and returns the full list. This is the API equivalent of the spec import flow: ```bash curl -X POST https://app.yorkermonitoring.com/api/checks/generate \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "spec": { "source": "url", "specUrl": "https://api.example.com/openapi.json" }, "locations": ["us-east", "eu-west"], "frequencySeconds": 300 }' ``` The `spec` field is a discriminated union: pass `{"source": "id", "specId": "spec_..."}` to use an existing spec, `{"source": "url", "specUrl": "https://..."}` to fetch and (if needed) import a new one, or `{"source": "name", "specName": "..."}` to look up by name. If the spec yields more than 50 *new* operations the first call returns `409` with `requiresConfirmation: true` — re-submit with `"confirm": true` to proceed. See the [Generate Checks From OpenAPI Spec](/docs/reference/api#generate-checks-from-openapi-spec) reference for the full request/response shape and error codes. --- ## Set Up Alerts > How to configure alert rules — consecutive failures, multi-location correlation, SSL alerts, and SLO burn-rate alerts. # Set Up Alerts Alerts notify you when monitors detect problems. Each alert rule combines one or more **conditions** (what triggers the alert) with one or more **channels** (where the notification goes). ## Define alert channels To send notifications, first define your channels in the `alertChannels` block at the top of `yorker.config.yaml`. Each channel has a name (the key) and a type-specific configuration. 
```yaml alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" on-call-email: type: email addresses: - oncall@example.com - sre-team@example.com pagerduty: type: webhook url: "{{secrets.PAGERDUTY_WEBHOOK_URL}}" method: POST headers: Authorization: "Token token={{secrets.PD_TOKEN}}" ``` ### Channel types | Type | Required fields | Description | |------|----------------|-------------| | `slack` | `webhookUrl` | Posts to a Slack incoming webhook. | | `email` | `addresses` (array, at least one) | Sends email to the listed addresses. | | `webhook` | `url` | Sends an HTTP request. `method` defaults to `POST`. Optional `headers` for auth. | ## Reference channels in alerts To attach a channel to an alert, reference it with the `@channel-name` syntax: ```yaml monitors: - name: API Health type: http url: https://api.example.com/health alerts: - conditions: - type: consecutive_failures count: 3 channels: - "@ops-slack" - "@on-call-email" ``` ## Alert conditions Each alert must have at least one condition. Multiple conditions on the same alert are combined with AND logic — all conditions must be met for the alert to trigger. ### consecutive_failures Triggers after a monitor fails a specified number of times in a row. ```yaml - type: consecutive_failures count: 3 # default: 2, min: 1 ``` ### response_time_threshold Triggers when response time exceeds a threshold. ```yaml - type: response_time_threshold maxMs: 5000 # milliseconds ``` ### multi_location_failure Triggers when a monitor fails from multiple locations within a time window. This reduces false positives from localized network issues. ```yaml - type: multi_location_failure minLocations: 2 # default: 2, min: 2 windowSeconds: 300 # default: 300 (5 minutes) ``` ### ssl_expiry Triggers when an SSL certificate is approaching expiration. 
```yaml - type: ssl_expiry daysBeforeExpiry: 14 # default: 14, min: 1 severity: warning # optional: critical | warning | info ``` ### ssl_certificate_changed Triggers when the leaf certificate's fingerprint changes between runs — useful for catching unexpected cert rotations and possible man-in-the-middle conditions. ```yaml - type: ssl_certificate_changed severity: critical ``` ### ssl_self_signed Triggers when the endpoint presents a self-signed (or otherwise untrusted-root) certificate. ```yaml - type: ssl_self_signed severity: critical ``` ### ssl_protocol_deprecated Triggers when the TLS handshake negotiates a protocol older than `minProtocol`. ```yaml - type: ssl_protocol_deprecated minProtocol: TLSv1.2 # default: TLSv1.2 (allowed: TLSv1.2, TLSv1.3) severity: warning ``` ### burn_rate Triggers when an SLO's error budget is burning faster than a threshold across a short window AND a long window (the Google SRE multi-window burn-rate alerting pattern). Requires an existing SLO — reference it by ID. ```yaml - type: burn_rate sloId: slo_abc123 burnRateThreshold: 14.4 # burn rate multiple (e.g. 14.4 = budget exhausted in ~2 days at a 30d SLO) longWindowMinutes: 60 # minimum 60 shortWindowMinutes: 5 # minimum 5, MUST be less than longWindowMinutes ``` Burn-rate alerts are automatically wired up when you set `burnRateAlerts: true` on an SLO (the default). Use a manual `burn_rate` condition only if you need custom threshold/window combinations beyond the built-in ones. See [Define SLOs](/docs/guides/define-slos) for the simpler path. ### baseline_anomaly Triggers when a performance metric drifts away from its learned baseline for several consecutive runs. Baselines are stored per `(check, location, hour-of-day, day-of-week)` bucket so a monitor that's slower on Monday mornings doesn't trip the alert every Monday. 
```yaml - type: baseline_anomaly metric: response_time # required sigmaThreshold: 3 # default: 3 (min: 2, max: 10) consecutiveCount: 3 # default: 3 (min: 2, max: 20, integer) direction: above # default: above (allowed: above | below | both) severity: warning # default: warning ``` **Supported metrics.** HTTP: `response_time`, `dns_lookup`, `tls_handshake`, `ttfb`, `content_transfer`. Browser: `lcp`, `fcp`, `cls`. **How the chain works.** On each result ingestion the engine reads the last N runs for this check+location, regardless of status. The alert fires only if all N are successful AND each deviates by more than `sigmaThreshold`·σ from its own time-bucketed baseline in the configured direction. Any non-success run inside the window breaks the chain, so this alert stays scoped to drift-style regressions rather than outages. Failures are not skipped over to reach earlier successes: the window simply slides forward until it again contains N successes. **Pick a reasonable threshold.** 3σ / 3 consecutive is a conservative starting point: under the normal assumption (and assuming run-to-run independence), the per-run false-positive rate at 3σ is ≈1-in-740 for one-sided checks (`direction: above` or `below`, the default) and ≈1-in-370 for two-sided (`direction: both`). Across 3 consecutive runs that compounds to ≈1-in-400-million one-sided or ≈1-in-50-million two-sided. In practice runs sharing a time bucket carry correlated noise (network conditions, regional perturbations), so treat the compounded figure as a theoretical ceiling. Tightening to 4σ / 5 consecutive buys near-zero false positives; loosening to 2σ / 2 consecutive is effectively a point-anomaly detector. **Direction.** `above` catches slowdowns (the common case for response-time metrics). `below` catches suspiciously-fast responses, which often indicate the runner short-circuiting past the real work (stale cache hits, 304 storms, redirect chains being skipped). 
`both` is useful for CLS-style vitals where either side is a UX regression. ### Severity All SSL-related conditions (including `ssl_expiry`), `mcp_schema_drift`, and `baseline_anomaly` accept an optional `severity` field with value `critical`, `warning`, or `info`. Severity is stored on the resulting alert instance and surfaces in the alerts dashboard: use it to distinguish "nice to know" rotations from genuine outages. `mcp_schema_drift` and `baseline_anomaly` default to `warning` (set by the shared schema); SSL conditions have no schema default and fall back to `critical` via the evaluator. ## Cascading alerts Alerts follow the same cascade as other monitor settings: **defaults -> group -> monitor**. Define alerts at any level: ```yaml defaults: alerts: - conditions: - type: consecutive_failures count: 2 channels: - "@ops-slack" groups: - name: Critical APIs alerts: - conditions: - type: consecutive_failures count: 1 channels: - "@ops-slack" - "@pagerduty" monitors: - name: Payments API type: http url: https://api.example.com/payments ``` When a monitor defines its own `alerts`, those **replace** the inherited alerts entirely. To clear inherited alerts, set `alerts: []` on the monitor. 
## Multi-tier alerting To escalate alerts based on severity, define multiple alert rules with different conditions and channels: ```yaml monitors: - name: Checkout Flow type: browser script: ./monitors/checkout.ts alerts: # Tier 1: Slack for initial failures - name: checkout-warning conditions: - type: consecutive_failures count: 2 channels: - "@ops-slack" # Tier 2: PagerDuty for persistent multi-location failures - name: checkout-critical conditions: - type: consecutive_failures count: 5 - type: multi_location_failure minLocations: 3 channels: - "@pagerduty" - "@on-call-email" # SSL expiry: early warning - name: checkout-ssl conditions: - type: ssl_expiry daysBeforeExpiry: 30 severity: warning channels: - "@ops-slack" # SSL rotation detection - name: checkout-ssl-rotation conditions: - type: ssl_certificate_changed severity: info channels: - "@ops-slack" ``` ## OTel trace linking When an alert fires, Yorker includes the OpenTelemetry trace ID in the notification payload. If your application propagates the W3C `traceparent` header, you can jump directly from an alert to the distributed trace in your observability backend (e.g., HyperDX, Jaeger, Grafana Tempo) to identify root cause. ## Web UI To create alerts through the dashboard: 1. Navigate to a monitor's detail page. 2. Click **Add Alert Rule**. 3. Select one or more conditions and configure thresholds. 4. Choose notification channels (create them in **Settings > Notification Channels** if needed). 5. Click **Save**. Alert rules created in the Web UI and the CLI are the same underlying resource. The CLI's `yorker deploy` command will detect and diff against rules created through the UI, and abort on drift unless you pass `--force` or `--accept-remote`. You can also view all alerts across monitors from the **Alerts** page in the dashboard. ## CLI alert management In addition to defining alerts in `yorker.config.yaml`, you can manage alert instances directly from the command line. 
### List active alerts ```bash yorker alerts list ``` Include resolved and recovered alerts with `--all`, or filter by monitor: ```bash yorker alerts list --monitor "Homepage" --all ``` ### Acknowledge and resolve ```bash yorker alerts ack ainst_abc123 yorker alerts resolve ainst_abc123 ``` ### View alert history ```bash yorker alerts history --since 7d ``` ### Create alert rules imperatively ```bash yorker alerts rules create \ --monitor "Homepage" \ --condition "consecutive_failures >= 3" \ --channel nch_abc123 \ --name "homepage-down" ``` Baseline-deviation rules use `baseline_anomaly:<metric>` (defaults to 3σ, 3 consecutive, above) or the explicit `baseline_anomaly:<metric>@<sigma>σ:<count>[:above|below|both]` form: ```bash yorker alerts rules create \ --monitor "Checkout API" \ --condition "baseline_anomaly:response_time" \ --channel nch_abc123 \ --severity warning yorker alerts rules create \ --monitor "Marketing site" \ --condition "baseline_anomaly:lcp@4σ:5:above" \ --channel nch_pagerduty \ --severity critical ``` See the [CLI reference](/docs/reference/cli) for the full list of alert commands and condition formats. --- ## Define SLOs > How to set Service Level Objectives — availability and performance targets with error budgets and burn rate alerts. # Define SLOs Service Level Objectives (SLOs) let you define a target for a monitor and track how much error budget remains. When a monitor's success rate drops below the target, Yorker calculates how fast you are consuming error budget and can alert before the budget runs out. Yorker supports two SLI types: - **Availability SLIs** — the ratio of successful check runs to total runs. Use this for uptime objectives. - **Performance SLIs** — the ratio of runs where response time is at or below a threshold. Use this for latency objectives (e.g. "p95 < 500 ms"). ## Why SLOs matter Raw uptime percentages hide how much room you have for incidents. An SLO of 99.9% over 30 days gives you an error budget of roughly 43 minutes of downtime.
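That 43-minute figure is plain arithmetic, which you can check directly (simple math, not Yorker code):

```typescript
// Error budget for a 99.9% availability target over a 30-day window.
const windowMinutes = 30 * 24 * 60;         // 43,200 minutes in 30 days
const allowedFailureFraction = 1 - 0.999;   // 0.1% of the window may fail
const errorBudgetMinutes = windowMinutes * allowedFailureFraction;
console.log(errorBudgetMinutes.toFixed(1)); // "43.2" minutes of downtime
```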
Burn rate alerts tell you when you are spending that budget faster than expected, so you can respond before it is exhausted — rather than reacting after a hard threshold is crossed. ## Define an availability SLO Add an `slos` block to your `yorker.config.yaml`. Each SLO references a monitor by name: ```yaml project: my-app alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" monitors: - name: API Health type: http url: https://api.example.com/health frequency: 1m slos: - name: API Availability monitor: API Health target: "99.9%" window: 30d burnRateAlerts: true channels: - "@ops-slack" ``` ### SLO fields | Field | Required | Description | |-------|----------|-------------| | `name` | Yes | A unique name for the SLO. | | `monitor` | Yes | The name of the monitor this SLO tracks. Must match a monitor defined in the same config. | | `target` | Yes | Target. Accepts `"99.9%"` (string with percent sign) or `99.9` (number treated as percentage). | | `window` | Yes | Evaluation window. Must be `7d`, `14d`, or `30d`. | | `burnRateAlerts` | No | Enable burn rate alerting. Defaults to `true`. | | `channels` | No | Notification channels for burn rate alerts. Uses `@channel-name` references to `alertChannels`. | | `enabled` | No | Whether the SLO is active. Defaults to `true`. | ### Target parsing The `target` field accepts two formats: - **String with percent:** `"99.9%"` — parsed as 99.9%, stored as 9990 basis points. - **Number:** `99.9` — treated as a percentage, stored as 9990 basis points. Both formats produce the same result. Basis points range from 1 to 9999 (0.01% to 99.99%). ### Window The evaluation window determines how far back Yorker looks when calculating availability and error budget. Only three values are accepted: `7d`, `14d`, or `30d` (or the equivalent integers `7`, `14`, `30`). ## Performance SLOs Performance SLOs track the fraction of runs whose response time meets a latency threshold. 
They are available through the REST API and dashboard today — not through `yorker.config.yaml`. Create one via the API: ```bash curl -X POST https://app.yorkermonitoring.com/api/slos \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "sloType": "check", "checkId": "chk_abc123", "name": "API Latency", "targetBasisPoints": 9500, "windowDays": 30, "sliType": "performance", "perfThresholdMs": 500, "burnRateAlertsEnabled": true }' ``` | Field | Description | |-------|-------------| | `sliType: "performance"` | Use a latency-based SLI. Default is `availability`. | | `perfThresholdMs` | Required for performance SLIs. Runs at or below this duration are considered compliant. | Performance SLOs share everything else with availability SLOs — burn rate alerts, error budgets, and evaluation windows. ## Third-party SLOs Yorker can also track SLOs against metrics emitted by services you do not own — external APIs, vendors, dependencies. When the runner observes an outbound call to a third-party host, Yorker extracts metrics and can roll them up into an SLO. Third-party SLOs are discovered and configured through the dashboard (**SLOs > New > Third-party**). See the [REST API reference](/docs/reference/api#slos) for the underlying `sloType: "third_party"` fields. ## Burn rate alerts When `burnRateAlerts` is enabled (the default), Yorker continuously calculates how fast error budget is being consumed. The system tracks burn rates over three windows: - **1-hour burn rate** — detects sudden spikes in failures. - **6-hour burn rate** — catches sustained degradation. - **24-hour burn rate** — identifies slow-burn issues. If the burn rate indicates the error budget will be exhausted before the window ends, an alert fires through the configured channels. Advanced users can also configure custom `burn_rate` alert conditions directly — see [Set Up Alerts](/docs/guides/set-up-alerts#burn_rate). 
## View SLO status To view SLO status in the dashboard, navigate to the **SLOs** section. Each SLO shows: - Current availability or performance ratio - Error budget total, consumed, and remaining - Burn rates at 1h, 6h, and 24h windows - Whether the SLO is currently met To check SLO status via the API: ```bash curl https://app.yorkermonitoring.com/api/slos/slo_abc123/status \ -H "Authorization: Bearer $YORKER_API_KEY" ``` ## Complete example ```yaml project: my-app alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" pagerduty: type: webhook url: "{{secrets.PAGERDUTY_WEBHOOK_URL}}" monitors: - name: Homepage type: http url: https://example.com frequency: 1m - name: Checkout Flow type: browser script: ./monitors/checkout.ts frequency: 5m slos: - name: Homepage Availability monitor: Homepage target: "99.95%" window: 30d burnRateAlerts: true channels: - "@ops-slack" - name: Checkout Availability monitor: Checkout Flow target: "99.9%" window: 30d burnRateAlerts: true channels: - "@ops-slack" - "@pagerduty" ``` --- ## Notification Channels > How to create and manage notification channels for Slack, email, and webhook alerts. # Notification Channels Notification channels define where alerts are sent. Yorker supports three channel types: **Slack**, **email**, and **webhook**. Channels are shared across your team and can be referenced by any alert rule or SLO. ## Channel Types ### Slack Posts alert notifications to a Slack channel via an incoming webhook. ```yaml alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" ``` To set up a Slack webhook, create an [Incoming Webhook](https://api.slack.com/messaging/webhooks) in your Slack workspace and use the generated URL. ### Email Sends alert notifications to one or more email addresses. ```yaml alertChannels: on-call-email: type: email addresses: - oncall@example.com - sre-team@example.com ``` At least one address is required. 
### Webhook Sends an HTTP request to any URL. Use this for PagerDuty, Opsgenie, custom integrations, or any service that accepts webhooks. ```yaml alertChannels: pagerduty: type: webhook url: "{{secrets.PAGERDUTY_WEBHOOK_URL}}" method: POST headers: Authorization: "Token token={{secrets.PD_TOKEN}}" ``` | Field | Default | Description | |---|---|---| | `url` | *(required)* | Webhook endpoint URL. | | `method` | `POST` | HTTP method. | | `headers` | *(none)* | Optional headers for authentication. | ## Create channels in the Web UI 1. Navigate to **Settings > Notification Channels**. 2. Click **Create Channel**. 3. Select the channel type (Slack, Email, or Webhook). 4. Fill in the required fields. 5. Click **Save**. Channels created in the UI and via the CLI are the same underlying resource. The CLI's `yorker deploy` detects and diffs against channels created through the UI. ## Define channels in YAML Add an `alertChannels` block at the top level of `yorker.config.yaml`: ```yaml alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" on-call-email: type: email addresses: - oncall@example.com pagerduty: type: webhook url: "{{secrets.PAGERDUTY_WEBHOOK_URL}}" method: POST headers: Authorization: "Token token={{secrets.PD_TOKEN}}" ``` ### Reference channels in alerts Use the `@channel-name` syntax to attach a channel to an alert rule: ```yaml monitors: - name: API Health type: http url: https://api.example.com/health alerts: - conditions: - type: consecutive_failures count: 3 channels: - "@ops-slack" - "@on-call-email" ``` ### Reference channels in SLOs SLOs can also reference channels for burn rate alerts: ```yaml slos: - name: API Availability monitor: API Health target: "99.9%" window: 30d channels: - "@ops-slack" ``` ## Deploy ordering When you run `yorker deploy`, channels are created/updated **before** alerts and SLOs (phase 0), and deleted **after** all references are removed (phase Z). 
This ensures channels always exist when alerts or SLOs reference them. ## Customize incident notification templates Slack, email, and webhook channels support per-event template overrides that let you replace Yorker's default payload with your own Handlebars-rendered markup. The in-app editor at **Settings > Notification Channels > Templates** gives you: - A Handlebars editor with channel-appropriate syntax highlighting (JSON for Slack/webhook, HTML for email) - A live preview rendered against six canonical incident fixtures (single HTTP failure, multi-location burst, browser check, MCP, and more) - A library of starters and curated examples you can apply with one click, plus a diff view against the currently saved template - **Send test**: dispatches the current saved template to the real channel with the selected fixture (60-second cooldown per channel) See the [Incidents concepts page](/docs/concepts/incidents#user-editable-templates) for the render context and fallback semantics. Per-channel examples and helper reference live on each integration page: [Slack](/docs/integrations/slack#template-overrides), [Email](/docs/integrations/email#template-overrides), [Webhook](/docs/integrations/webhook#template-overrides). PagerDuty and ServiceNow template overrides are authored through the REST API only; UI editor support for those channel types is planned for a future release. ## Manage channels via API Channels can also be managed through the REST API. See the [REST API reference](/docs/reference/api) for endpoints. --- ## Deploy with CLI > How to deploy monitors, alerts, and SLOs from yorker.config.yaml using the Yorker CLI. # Deploy with CLI The Yorker CLI lets you define monitors, alerts, notification channels, and SLOs in a `yorker.config.yaml` file and deploy them with a single command. Changes are computed as a diff against the current remote state and applied in the correct order. 
## Install ```bash npm install -g @yorker/cli ``` ## Authenticate Generate an API key from **Settings > API Keys** in the dashboard. Export it as an environment variable: ```bash export YORKER_API_KEY=sk_... ``` The CLI also accepts `YORKER_API_URL` to point at a different control plane (defaults to `https://app.yorkermonitoring.com`). ## Scaffold a config To create a starter `yorker.config.yaml`, run: ```bash yorker init ``` The interactive wizard walks you through project name, first monitor URL, type, and frequency. Pass flags to skip prompts: ```bash yorker init --name my-app --url https://example.com --type http --frequency 5m ``` ## Config file structure The `yorker.config.yaml` file has these top-level sections: ```yaml project: my-app # Required. Project identifier. alertChannels: # Notification channel definitions. ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" defaults: # Default settings for all monitors. frequency: 5m locations: - loc_us_east - loc_eu_central groups: # Groups of monitors with shared settings. - name: API Endpoints frequency: 1m monitors: - name: Users API type: http url: https://api.example.com/users monitors: # Top-level monitor definitions. - name: Homepage type: http url: https://example.com slos: # Service Level Objectives. - name: Homepage Availability monitor: Homepage target: "99.9%" window: 30d maintenanceWindows: # Scheduled silences / pauses. - name: Weekly DB maintenance checks: all mode: pause # pause | continue startsAt: "2026-04-12T02:00:00Z" endsAt: "2026-04-12T03:00:00Z" recurring: true recurrenceRule: "FREQ=WEEKLY;BYDAY=SU" ``` See the [Configuration reference](/docs/reference/configuration#maintenancewindows) for the full `maintenanceWindows` schema. ## Commands ### yorker validate Validates the config file without deploying. 
Checks YAML syntax, Zod schema validation, script file existence, secret interpolation, and cross-references (e.g., SLOs referencing valid monitors, alerts referencing valid channels). ```bash yorker validate ``` Exit code 0 means the config is valid. Non-zero means errors were found — they are printed to stderr. ### yorker diff Shows what would change between your local config and the remote state without applying anything: ```bash yorker diff ``` Output shows each resource as CREATE, UPDATE, DELETE, or UNCHANGED, with field-level diffs for updates: ``` Yorker deploy plan for "my-app" Checks: + CREATE http "API Health" (60s, 2 locations) ~ UPDATE http "Homepage" ~ configJson.timeoutMs 30000 -> 15000 = UNCHANGED browser "Checkout Flow" Summary: 1 to create, 1 to update, 1 unchanged (dry run — no changes applied) ``` ### yorker deploy Applies changes to the remote state: ```bash yorker deploy ``` Resources are applied in dependency order: channels first, then checks (with label sync), then alerts, SLOs, and maintenance windows. See the [CLI reference](/docs/reference/cli#deploy-phases) for the full phase table. Remote resources that exist but are not in your config are reported but **not deleted** unless you pass `--prune`. ### yorker deploy --prune Deletes remote resources that are not present in the config file: ```bash yorker deploy --prune ``` This is useful for keeping the remote state in exact sync with the config. Prune deletions happen in the correct dependency order as part of the normal deploy phases — SLOs and alerts are removed early (phases A and B, before their parent checks in phase C), maintenance windows at phase I, and channels at phase Z. See the [CLI reference for `yorker deploy`](/docs/reference/cli#deploy-phases) for the full phase table. ### yorker deploy --force / --accept-remote If someone edits a YAML-managed resource via the web UI, the next `yorker deploy` detects the change and aborts with a drift report. 
Two flags control how to resolve: ```bash yorker deploy --force # local config wins — overwrite remote yorker deploy --accept-remote # skip drifted resources — keep remote as-is ``` See [Drift detection](#drift-detection) below for details. ### yorker status Displays the current state of all monitors: ```bash yorker status ``` ### yorker results tail Live-stream check results as they arrive: ```bash yorker results tail "Homepage" --interval 30s ``` ### yorker test Runs HTTP monitors locally against their configured URLs. Useful for validating URLs and auth before deploying: ```bash yorker test ``` Browser monitors are listed but not executed locally (use Playwright directly for local browser tests). ## Drift detection The CLI tracks the state of each deployed resource in `.yorker/.deploy-state.json` (gitignored, per-machine). After every successful deploy, it saves a config hash and the remote `updatedAt` timestamp for each resource. On the next deploy, it compares: - **Local changed?** — current config hash differs from the stored hash. - **Remote changed?** — remote `updatedAt` is newer than the stored timestamp, and the resource has `managedBy: "yaml"`. This produces four possible outcomes: | | Remote unchanged | Remote changed | |---|---|---| | **Local unchanged** | Skip | **Drift** (remote-only edit) | | **Local changed** | Normal update | **Conflict** (both sides changed) | When drift or conflicts are detected, the deploy aborts with a report showing which resources were affected. You have three options: 1. **Review and choose** — inspect the remote changes in the dashboard, then update your config to match or intentionally overwrite. 2. **`--force`** — local config wins, remote changes are overwritten. 3. **`--accept-remote`** — drifted resources are skipped, keeping their remote state. ### First deploy On the first deploy (no state file exists), drift detection is skipped entirely — the CLI creates the state file after a successful apply. 
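The four-outcome drift table above can be sketched as a small decision function. This is an illustrative sketch of the documented logic only, not the CLI's actual implementation:

```bash
# Illustrative sketch of the drift decision table (not the CLI's code).
drift_outcome() {
  local local_changed=$1 remote_changed=$2
  case "$local_changed,$remote_changed" in
    no,no)   echo "skip (unchanged)" ;;
    yes,no)  echo "normal update" ;;
    no,yes)  echo "drift (remote-only edit)" ;;
    yes,yes) echo "conflict (both sides changed)" ;;
  esac
}

drift_outcome no yes   # a remote-only edit aborts the deploy with a drift report
```

Only the two right-hand outcomes abort the deploy; `--force` and `--accept-remote` decide which side wins when they occur.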
### After `yorker pull` Every successful `yorker pull` rewrites `.yorker/.deploy-state.json` with a fresh snapshot of remote state, so the next deploy treats everything as a clean baseline with no drift. ### `.yorker/.deploy-state.json` This file is per-machine state and should not be committed. Add it to your `.gitignore`: ```text .yorker/.deploy-state.json ``` --- ## Secret interpolation To keep secrets out of your config file, use placeholder syntax. The CLI resolves these at deploy time from environment variables. ### `{{secrets.NAME}}` Looks up `YORKER_SECRET_NAME` first, then falls back to `NAME`: ```yaml auth: type: bearer token: "{{secrets.AUTH_TOKEN}}" ``` Set with `export YORKER_SECRET_AUTH_TOKEN=...` or `export AUTH_TOKEN=...`. ### `{{env.NAME}}` Looks up the environment variable directly: ```yaml monitors: - name: Staging API type: http url: "{{env.STAGING_URL}}/health" ``` ### `${NAME}` Legacy shorthand, equivalent to `{{env.NAME}}`. Not applied inside browser script files to avoid conflicts with JavaScript template literals. The CLI fails with a clear error if any placeholder is unresolved after interpolation. ## CI/CD integration Validate on push, preview changes on PRs with `yorker diff`, and deploy on merge. See the full [CI/CD Integration guide](/docs/guides/ci-cd) for complete GitHub Actions and GitLab CI workflows. ```yaml # GitHub Actions — quick start - run: npm install -g @yorker/cli - run: yorker validate - run: yorker diff - run: yorker deploy --force # CI owns the config if: github.ref == 'refs/heads/main' ``` Use `--force` in CI pipelines where the config file is the source of truth. If you want CI to preserve manual edits made via the dashboard, use `--accept-remote` instead. Set `YORKER_API_KEY` and any `YORKER_SECRET_*` variables at the job or workflow level so all steps — including `validate` — can resolve `{{secrets.*}}` placeholders. 
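The `{{secrets.NAME}}` lookup order described above can be sketched in bash. This is an approximation of the documented resolution rules, not the CLI's implementation, and `resolve_secret` is a hypothetical helper name:

```bash
# Approximation of the documented lookup order for {{secrets.NAME}}:
# try YORKER_SECRET_<NAME> first, then fall back to plain <NAME>.
# resolve_secret is a hypothetical name, not part of the CLI.
resolve_secret() {
  local name=$1
  local prefixed="YORKER_SECRET_${name}"
  if [ -n "${!prefixed-}" ]; then
    echo "${!prefixed}"
  else
    echo "${!name-}"
  fi
}

export AUTH_TOKEN=plain-value
export YORKER_SECRET_AUTH_TOKEN=prefixed-value
resolve_secret AUTH_TOKEN   # the prefixed variable wins
```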
## Complete example ```yaml project: my-app alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" on-call-email: type: email addresses: - oncall@example.com pagerduty: type: webhook url: "{{secrets.PAGERDUTY_WEBHOOK_URL}}" method: POST headers: Authorization: "Token token={{secrets.PD_TOKEN}}" defaults: frequency: 5m locations: - loc_us_east - loc_eu_central http: timeoutMs: 15000 followRedirects: true assertions: - type: status_code value: 200 browser: viewport: width: 1280 height: 720 screenshotMode: every_step alerts: - conditions: - type: consecutive_failures count: 2 channels: - "@ops-slack" groups: - name: Critical APIs frequency: 1m locations: - loc_us_east - loc_us_west - loc_eu_central alerts: - conditions: - type: consecutive_failures count: 1 - type: multi_location_failure minLocations: 2 channels: - "@ops-slack" - "@pagerduty" monitors: - name: Payments API type: http url: https://api.example.com/payments assertions: - type: status_code value: 200 - type: response_time max: 1000 - name: Auth API type: http url: https://api.example.com/auth/health monitors: - name: Homepage type: http url: https://example.com - name: Checkout Flow type: browser script: ./monitors/checkout.ts frequency: 5m alerts: - name: checkout-warning conditions: - type: consecutive_failures count: 2 channels: - "@ops-slack" - name: checkout-critical conditions: - type: consecutive_failures count: 5 channels: - "@pagerduty" - "@on-call-email" slos: - name: Homepage Availability monitor: Homepage target: "99.95%" window: 30d channels: - "@ops-slack" - name: Checkout Availability monitor: Checkout Flow target: "99.9%" window: 30d channels: - "@ops-slack" - "@pagerduty" ``` --- ## CI/CD Integration > How to validate, diff, and deploy monitors automatically with GitHub Actions or GitLab CI. 
# CI/CD Integration Wire `yorker validate`, `yorker diff`, and `yorker deploy` into your CI pipeline to get config validation on every push, change previews on pull requests, and automatic deploys on merge. ## Prerequisites 1. **API key** -- generate one from **Settings > API Keys** in the dashboard. 2. **Store as a secret** -- add it as `YORKER_API_KEY` in your CI provider's secret store. 3. **Config committed** -- your `yorker.config.yaml` and any `monitors/` script files must be in version control. --- ## GitHub Actions Create `.github/workflows/yorker.yml` in your repository: ```yaml name: Yorker Monitoring as Code on: push: paths: - "yorker.config.yaml" - "monitors/**" pull_request: paths: - "yorker.config.yaml" - "monitors/**" env: YORKER_API_KEY: ${{ secrets.YORKER_API_KEY }} # Add YORKER_SECRET_* vars here if your config uses {{secrets.*}} interpolation jobs: validate: name: Validate config runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-node@v4 with: node-version: 20 - run: npm install -g @yorker/cli - run: yorker validate diff: name: Preview changes if: github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository needs: validate runs-on: ubuntu-latest permissions: contents: read pull-requests: write issues: write steps: - uses: actions/checkout@v4 - uses: actions/setup-node@v4 with: node-version: 20 - run: npm install -g @yorker/cli - name: Run diff id: diff run: | set +e OUTPUT=$(yorker diff --json 2>/dev/null) EXIT_CODE=$? 
set -e EOF_MARKER=$(dd if=/dev/urandom bs=15 count=1 2>/dev/null | base64) echo "json<<$EOF_MARKER" >> "$GITHUB_OUTPUT" echo "$OUTPUT" >> "$GITHUB_OUTPUT" echo "$EOF_MARKER" >> "$GITHUB_OUTPUT" echo "exit_code=$EXIT_CODE" >> "$GITHUB_OUTPUT" - name: Comment on PR if: always() uses: actions/github-script@v7 env: DIFF_JSON: ${{ steps.diff.outputs.json }} with: script: | const esc = (s) => s.replace(/[|\\`*_{}[\]<>()#+\-!~@\n\r]/g, (ch) => ch === '\n' || ch === '\r' ? ' ' : ch === '@' ? '&#64;' : `\\${ch}`); const raw = process.env.DIFF_JSON ?? ''; let body; try { const result = JSON.parse(raw); if (!result.ok) { body = `### Yorker Diff\n\n:x: Error: ${result.error?.message ?? 'Unknown error'}`; } else { const changes = result.data?.changes ?? []; const actionable = changes.filter(c => c.type !== 'unchanged'); if (actionable.length === 0) { body = '### Yorker Diff\n\n:white_check_mark: No changes. Remote state matches local config.'; } else { const symbols = { create: '+', update: '~', delete: '-' }; const rows = actionable .map(c => `| ${symbols[c.type] ?? '?'} ${c.type} | ${esc(c.kind)} | ${esc(c.name)} |`) .join('\n'); body = `### Yorker Diff\n\n| Action | Type | Name |\n|---|---|---|\n${rows}\n\n${actionable.length} change(s) will be applied on merge.`; } } } catch { body = `### Yorker Diff\n\n:warning: Could not parse diff output.\n\n
<details><summary>Raw output</summary>\n\n\`\`\`\n${raw}\n\`\`\`\n</details>\n
`; } const comments = await github.paginate(github.rest.issues.listComments, { owner: context.repo.owner, repo: context.repo.repo, issue_number: context.issue.number, }); const existing = comments.find(c => c.user?.type === 'Bot' && c.body?.startsWith('### Yorker Diff') ); if (existing) { await github.rest.issues.updateComment({ owner: context.repo.owner, repo: context.repo.repo, comment_id: existing.id, body, }); } else { await github.rest.issues.createComment({ owner: context.repo.owner, repo: context.repo.repo, issue_number: context.issue.number, body, }); } - name: Fail on diff errors if: steps.diff.outputs.exit_code != '0' run: exit ${{ steps.diff.outputs.exit_code }} deploy: name: Deploy monitors if: github.event_name == 'push' && github.ref == 'refs/heads/main' needs: validate runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: actions/setup-node@v4 with: node-version: 20 - run: npm install -g @yorker/cli - run: yorker deploy ``` ### How it works | Trigger | Job | What it does | |---|---|---| | Push or PR touching config files | **validate** | Validates YAML syntax and schema. Blocks the pipeline on errors. | | Same-repo pull request | **diff** | Runs `yorker diff --json`, parses the output, and posts a summary comment on the PR. Updates the same comment on subsequent pushes. Skipped for fork PRs. | | Push to main touching config files | **deploy** | Applies changes to the remote state. Runs after validation passes. | **Note:** The diff job's `if:` condition skips fork PRs, where `GITHUB_TOKEN` is read-only and repository secrets are not exposed. The validate job still runs on fork PRs, but will fail if your config uses `{{secrets.*}}` placeholders (since the corresponding environment variables won't be set). If you accept fork contributions, either avoid secret placeholders in validation-critical fields or add the same `full_name == github.repository` guard to the validate job. 
### Secrets The workflow uses workflow-level `env:` so all jobs (including `validate`) can resolve `{{secrets.*}}` and `{{env.*}}` placeholders. Add secrets referenced in your config as additional environment variables: ```yaml env: YORKER_API_KEY: ${{ secrets.YORKER_API_KEY }} YORKER_SECRET_SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }} YORKER_SECRET_AUTH_TOKEN: ${{ secrets.AUTH_TOKEN }} ``` --- ## GitLab CI Create `.gitlab-ci.yml` in your repository: ```yaml stages: - validate - diff - deploy .yorker: image: node:20-slim before_script: - npm install -g @yorker/cli validate: extends: .yorker stage: validate rules: - changes: - yorker.config.yaml - monitors/** script: - yorker validate diff: extends: .yorker stage: diff rules: - if: $CI_PIPELINE_SOURCE == "merge_request_event" changes: - yorker.config.yaml - monitors/** script: - yorker diff variables: YORKER_API_KEY: $YORKER_API_KEY deploy: extends: .yorker stage: deploy rules: - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH changes: - yorker.config.yaml - monitors/** script: - yorker deploy variables: YORKER_API_KEY: $YORKER_API_KEY ``` Add `YORKER_API_KEY` to your project's **Settings > CI/CD > Variables** as a masked variable. Only mark it as protected if you restrict it to protected branches -- otherwise MR pipelines on unprotected branches won't have access. --- ## JSON output format Most CLI commands support `--json` for machine-readable output and share a consistent envelope. There are two exceptions: the interactive `yorker dashboard` command does not emit the envelope at all, and `yorker results tail --json` emits one JSON object per result (newline-delimited) instead of a single envelope, which allows streaming consumption. ### Success ```json { "ok": true, "data": { ... } } ``` ### Error ```json { "ok": false, "error": { "code": "general_error", "message": "3 config error(s): ..." } } ``` ### Exit codes These are the codes most relevant to CI pipelines (`validate`, `diff`, `deploy`). 
Other commands may use additional codes (e.g., `yorker status` exits `10` when monitors are unhealthy). | Code | Meaning | |---|---| | `0` | Success | | `1` | General error (validation failure, API error, missing config) | | `2` | Authentication failure | | `3` | Plan/quota limit exceeded | | `4` | Partial failure (some deploy operations succeeded, others failed) | ### Key command outputs **`yorker validate --json`** ```json { "ok": true, "data": { "valid": true, "monitors": 5, "slos": 2 } } ``` **`yorker diff --json`** ```json { "ok": true, "data": { "changes": [ { "type": "create", "kind": "check", "name": "API Health", "fieldChanges": [] }, { "type": "update", "kind": "check", "name": "Homepage", "fieldChanges": [ { "path": "configJson.timeoutMs", "oldValue": 30000, "newValue": 15000 } ] }, { "type": "unchanged", "kind": "check", "name": "Orders API", "fieldChanges": [] } ] } } ``` Each change has a `type` (`create`, `update`, `delete`, `unchanged`), a `kind` (`check`, `alert`, `slo`, `channel`), and a `fieldChanges` array (empty when there are no field-level differences). Actual CLI output also includes metadata fields such as `remoteId`, `local`, and `remote`, which are omitted here for brevity. **`yorker deploy --json`** Same as `diff`, plus an `applied` field with operation counts: ```json { "ok": true, "data": { "changes": [ ... ], "applied": { "created": 1, "updated": 1, "deleted": 0, "errors": [] } } } ``` If `applied.errors` is non-empty, the exit code is `4` (partial failure). --- ## Tips ### Pin the CLI version Installing the CLI adds only 2-3 seconds per job, but an unpinned install can pick up breaking changes between runs. To lock a specific version: ```bash npm install -g @yorker/cli@0.4.0 ``` ### Deploy with pruning To keep remote state exactly in sync (deleting monitors removed from config): ```bash yorker deploy --prune ``` Only use this if your config is the single source of truth. Monitors created through the web UI will be deleted. 
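### Branch on exit codes In wrapper scripts, the CI-relevant exit codes documented above map naturally onto a `case` statement. A sketch; `handle_yorker_exit` is a hypothetical helper, not part of the CLI:

```bash
# Sketch: act on the CI-relevant yorker exit codes documented above.
# handle_yorker_exit is a hypothetical helper, not part of the CLI.
handle_yorker_exit() {
  case "$1" in
    0) echo "success" ;;
    2) echo "auth failure: check YORKER_API_KEY" ;;
    3) echo "plan or quota limit exceeded" ;;
    4) echo "partial failure: inspect applied.errors in --json output" ;;
    *) echo "general error" ;;
  esac
}

handle_yorker_exit 4
```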
### Gate deploys on diff To require an explicit approval step before deploying, separate the diff and deploy jobs and add a manual gate: ```yaml # GitLab CI deploy: stage: deploy when: manual script: - yorker deploy ``` ### Multiple environments Use environment variables to deploy different configs to different environments: ```yaml # GitHub Actions deploy-staging: env: YORKER_API_KEY: ${{ secrets.YORKER_API_KEY_STAGING }} steps: - run: yorker deploy deploy-production: env: YORKER_API_KEY: ${{ secrets.YORKER_API_KEY_PRODUCTION }} needs: [deploy-staging] steps: - run: yorker deploy ``` Each API key is scoped to a team, so the same config deploys to different teams. --- ## Private Locations > Deploy Yorker runners on your own infrastructure to monitor internal services, staging environments, and VPNs. # Private Locations A private location runs the Yorker check runner on your own infrastructure instead of Yorker's hosted fleet. Use it to monitor internal services, staging environments, or anything behind a VPN -- from inside your own network. The same monitors, alerting, and dashboards work identically; the only difference is where the agent executes. Private location runs are billed at 50% of hosted runs because you provide the compute. ## Prerequisites - A Yorker plan that includes private locations (free tier: 0; paid: up to 2; enterprise: unlimited) - Docker installed on the target machine - Outbound HTTPS access from the agent machine to `app.yorkermonitoring.com:443` -- no inbound connections are required --- ## Step 1: Create a Location To create a location via the CLI, provide a unique slug (`--name`) and a human-readable label (`--display-name`): ```bash yorker locations create --name hq-data-center --display-name "HQ Data Center" ``` The CLI also auto-generates a runner key. Use the key and location ID from the output when continuing with Step 3 below. To create a location via the dashboard: go to **Settings > Locations** and click **Add Location**. 
Both paths return a location ID in the form `loc_...`. Save it -- you will need it in the next step. --- ## Step 2: Generate a Runner Key Runner keys authenticate the agent to the control plane. Each key is shown only once. To generate a key via the CLI: ```bash yorker locations keys create ``` To generate a key via the dashboard: expand the location row, enter a key name, and click **Create Key**. Copy the key immediately -- it cannot be retrieved after the page is closed. --- ## Step 3: Deploy the Agent The repository includes a reference Docker Compose configuration at `docker/docker-compose.private-agent.yml` and an environment template at `docker/private-agent.env.example`. The template pre-fills `RUNNER_MODE`, `CONTROL_PLANE_URL`, and `LOCATION_TYPE` with their required values. You only need to supply the four values specific to your deployment: ```bash # Copy the reference environment file cp docker/private-agent.env.example docker/.env ``` Open `docker/.env` and fill in: ```bash RUNNER_API_KEY=rk_... # The runner key from Step 2 LOCATION_ID=loc_... # The location ID from Step 1 TEAM_ID=team_... # Included in the `yorker locations create` CLI output LOCATION_NAME=HQ Data Center # Human-readable label for logs and telemetry ``` Then start the agent: ```bash docker compose -f docker/docker-compose.private-agent.yml up -d ``` ### Choosing an image Two runner Dockerfiles are available: | Dockerfile | Includes | Use when | |---|---|---| | `Dockerfile.runner-heavy` (~1 GB) | Playwright + Chromium | You need browser checks | | `Dockerfile.runner-light` (~200 MB) | HTTP/API checks only | You only run HTTP/ping monitors | The default Compose file builds the heavy image. To switch to the lighter image, comment out the `yorker-agent` service in the Compose file and uncomment `yorker-agent-light`. --- ## Step 4: Verify Health After the agent starts, it begins polling the control plane. 
To verify it is connected: - **Dashboard:** the location's status badge turns green (Active) in **Settings > Locations**. - **CLI:** run `yorker locations list` and check the status column. Health states are derived from the time since the last successful poll: | Status | Meaning | |---|---| | Active | Last poll within the past 5 minutes | | Degraded | Last poll 5--10 minutes ago | | Offline | Last poll more than 10 minutes ago, or never connected | If the agent does not appear Active within a few minutes, see [Troubleshooting](#troubleshooting) below. --- ## Step 5: Assign the Location to a Monitor Monitors only execute in the locations they are assigned to. A monitor can be assigned to any mix of hosted and private locations. To assign via the dashboard: edit a monitor and open the location picker. Your private location appears under the **Private** group with its current health badge. To assign via `yorker.config.yaml`, add the location ID to the monitor's `locations` array: ```yaml monitors: - name: Internal API type: http url: https://api.internal.example.com/health locations: - loc_us_east # hosted location - loc_abcdef123456 # your private location ``` Deploy the config to apply: ```bash yorker deploy ``` --- ## Network Requirements The agent only needs outbound connectivity -- no inbound connections are required and no firewall rules need to be opened toward Yorker. | Requirement | Detail | |---|---| | Outbound HTTPS | `app.yorkermonitoring.com:443` -- for polling checks and submitting results | | Internal network access | The agent must be able to reach the services it monitors (e.g., internal hostnames, VPN resources) | There is no VPN tunnel from the agent to Yorker's infrastructure. All communication is initiated by the agent over standard HTTPS. 
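The health states from Step 4 reduce to a single rule on minutes since the last successful poll. A minimal sketch of that rule -- boundary handling in the real control plane may differ:

```bash
# Illustrative: map minutes since last successful poll to the Step 4 statuses.
health_status() {
  local minutes=$1
  if [ "$minutes" -le 5 ]; then
    echo "Active"
  elif [ "$minutes" -le 10 ]; then
    echo "Degraded"
  else
    echo "Offline"
  fi
}

health_status 7
```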
--- ## Troubleshooting | Symptom | Likely cause | Fix | |---|---|---| | Location stays Offline | Wrong runner key or location ID | Verify `RUNNER_API_KEY` and `LOCATION_ID` in `.env` match what was generated in Steps 1 and 2 | | Location stays Offline | Firewall blocking outbound HTTPS | Allow outbound connections from the agent machine to `app.yorkermonitoring.com:443` | | Location is Active but no checks run | Wrong `TEAM_ID` | The runner polls with team scoping -- verify `TEAM_ID` matches your team in Settings | | Checks not executing | Agent is connected but no checks assigned | Assign the private location to at least one monitor (Step 5) | | Image build fails | Missing Docker or build context | Run `docker build` from the repository root with `-f docker/Dockerfile.runner-heavy` | | Browser checks fail | Using the light image | Switch to `runner-heavy`, which includes Playwright and Chromium | --- ## Install Dashboards > One-click provisioning of pre-built Yorker synthetic monitoring dashboards into self-hosted HyperDX or ClickStack Cloud. # Install Dashboards Yorker ships 4 pre-built dashboards that provision directly into your observability platform via its Dashboard API. No manual chart creation required. Both self-hosted HyperDX and ClickStack Cloud (ClickHouse-managed HyperDX) are supported. ## Dashboards included | Dashboard | What it shows | |---|---| | **[Yorker] Synthetic Monitoring Overview** | Active monitors, uptime %, response time, success/failure trend, recent alert activity | | **[Yorker] Monitor Deep Dive** | Per-monitor drill-down: percentiles, Web Vitals, HTTP timing breakdown, recent failures with screenshot links | | **[Yorker] Status Page** | System status, uptime by monitor, daily trend, incident timeline | | **[Yorker] Trace Correlation** | Trace counts, duration trends, status distribution, recent traces with screenshot links | ## Install via the web UI 1. Go to **Settings > Telemetry (OTLP)** 2. 
Under **Dashboard Provisioning**, select your deployment type (**Self-hosted** or **ClickStack Cloud**) 3. For ClickStack Cloud: enter your Organization ID, Service ID, and Cloud API key credentials 4. Click **Save Changes**, then click **Install Dashboards** Self-hosted mode uses the same HyperDX API key already configured for telemetry. ClickStack Cloud uses ClickHouse Cloud API credentials (found in your ClickHouse Cloud console under API Keys). ## Install via the CLI ### Self-hosted HyperDX ```bash yorker dashboards install --hyperdx-api-key <key> ``` ### ClickStack Cloud ```bash yorker dashboards install --mode clickstack_cloud \ --clickstack-org-id <org-id> \ --clickstack-service-id <service-id> \ --clickstack-api-key-id <key-id> \ --clickstack-api-secret <secret> ``` All ClickStack options support environment variables (`CLICKSTACK_ORG_ID`, `CLICKSTACK_SERVICE_ID`, `CLICKSTACK_API_KEY_ID`, `CLICKSTACK_API_SECRET`). ### Options | Flag | Description | |---|---| | `--mode <mode>` | `self_hosted` (default) or `clickstack_cloud` (env: `HYPERDX_MODE`) | | `--hyperdx-api-key <key>` | HyperDX API key for self-hosted mode (env: `HYPERDX_API_KEY`) | | `--hyperdx-url <url>` | Override HyperDX API URL for self-hosted (default: `https://www.hyperdx.io`) | | `--clickstack-org-id <id>` | ClickHouse Cloud organization ID (env: `CLICKSTACK_ORG_ID`) | | `--clickstack-service-id <id>` | ClickHouse Cloud service ID (env: `CLICKSTACK_SERVICE_ID`) | | `--clickstack-api-key-id <id>` | ClickHouse Cloud API key ID (env: `CLICKSTACK_API_KEY_ID`) | | `--clickstack-api-secret <secret>` | ClickHouse Cloud API key secret (env: `CLICKSTACK_API_SECRET`) | | `--packs <names>` | Comma-separated subset: `overview`, `deep-dive`, `status-page`, `trace-correlation` | | `--force` | Skip duplicate check (may create additional copies) | | `--json` | Output results as JSON | ### Examples ```bash # Self-hosted: install all dashboards yorker dashboards install --hyperdx-api-key hdx_abc123 # Self-hosted: install only the overview dashboard yorker dashboards install --hyperdx-api-key 
hdx_abc123 --packs overview # ClickStack Cloud: install all dashboards via env vars export CLICKSTACK_ORG_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx export CLICKSTACK_SERVICE_ID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx export CLICKSTACK_API_KEY_ID=your-key-id export CLICKSTACK_API_SECRET=your-key-secret yorker dashboards install --mode clickstack_cloud ``` ## Idempotency Dashboards are named with a `[Yorker]` prefix. The install command checks for existing dashboards by name and skips duplicates automatically. Use `--force` to skip this check (note: this may create additional copies rather than replacing existing ones). ## Screenshots in dashboards The Deep Dive, Status Page, and Trace Correlation dashboards include `synthetics.screenshot.url` in their search panels. When a browser check captures screenshots, the URL appears as a clickable link directly in the dashboard. --- # Concepts ## Architecture > How Yorker's three-tier architecture works — Control Plane, Orchestrator, and Runners. # Architecture Yorker uses a three-tier architecture that separates check management from check execution. This design gives you per-execution isolation for browser checks, lets runners operate in 14 global regions, and lets one control plane coordinate alerting, SLOs, and insights across every monitor. Different check types and location types take different telemetry paths to your OTel backend — the [Telemetry flow](#telemetry-flow) section below lays out exactly which data travels which route. ## Three-tier model ### Control Plane The control plane is a Next.js application deployed on Vercel. It handles: - **Check definitions** -- creating, updating, and deleting monitors via the Web UI, CLI, or API. - **User accounts and teams** -- authentication (Clerk), team membership, API key management. - **Results storage** -- check metadata (pass/fail, response times, Web Vitals) stored in Postgres via Neon. 
- **Artifact storage** -- screenshots and debug artifacts stored in Cloudflare R2 (S3-compatible). - **API** -- RESTful endpoints for all operations, consumed by the dashboard, CLI, and runners. The control plane never executes checks itself. It defines *what* to monitor, stores check results in Postgres, evaluates alert rules and SLO burn, runs anomaly detection, and generates monitor and team insights. It also hands off OTel events to the orchestrator's outbox for asynchronous delivery — see [Telemetry flow](#telemetry-flow) for the full breakdown of which emissions come from the runner versus the control plane. ### Orchestrator The orchestrator is an always-on service running on Fly.io. It: 1. Polls Postgres on a schedule to find checks that are due for execution. 2. Dispatches Fly Machines in the correct region for each check. 3. Manages the machine pool lifecycle -- creating, reusing, and destroying machines. The orchestrator is the bridge between "this check should run every 5 minutes from London" and "spin up a machine in `lhr` right now." ### Runners Runners execute checks. They run on Fly.io across 14 global regions, close to the infrastructure being monitored. Every runner, regardless of mode, does the same three things for each check: 1. Executes the check (HTTP request, Playwright browser session, or MCP tool exchange). 2. Uploads screenshots directly to Cloudflare R2 (browser checks only). 3. Submits the full check result — timing breakdown, assertions, Web Vitals, certificates, network data — to the Yorker control plane via `POST /api/runner/results` for storage and evaluation. Some runners *additionally* emit OTLP metrics/traces directly from the runner to your collector — but not all of them do, and which do depends on check type and deployment mode. That split is explained in [Telemetry flow](#telemetry-flow) below. ## Three execution tiers Different check types have different isolation and resource requirements. 
Yorker uses three execution tiers: ### Tier 1 -- Ephemeral Heavy (Browser checks) Each browser check runs in its own Fly Machine with Playwright and Chromium (~1GB image). The machine is created for the check and destroyed after it completes. This gives you: - **Full isolation** -- no shared browser state between checks or tenants. - **Clean environment** -- no cookies, cache, or extensions carrying over. - **Predictable performance** -- no resource contention from other checks. ### Tier 3 — Per-Tenant Persistent (HTTP and MCP checks) HTTP and MCP checks run in lightweight Node.js containers (~200MB image). Each customer gets one persistent container per region. The container stays alive and executes checks as they come due. This gives you: - **Low latency** — no cold-start overhead for each check. - **Efficiency** — HTTP and MCP checks are fast and lightweight, so sharing a container is safe. - **Tenant isolation** — each customer has their own container, so one customer's checks cannot affect another's. ### Tier 2 — Ephemeral Light (Coming soon) Multi-step API tests will run in ephemeral lightweight containers — isolated like browser checks but without the Chromium overhead. ## Telemetry flow Yorker's emission model has two constants and one split: - **Constant 1:** every check run submits its result (status, timing, assertions, Web Vitals, certificates, screenshots, console logs, etc.) to the Yorker control plane via `POST /api/runner/results`. This is an internal submission protocol, not OTLP. It is how alerting, SLOs, insights, the dashboard, and the CLI all work. It happens whether or not you have configured an OTLP endpoint. 
- **Constant 2:** once you have configured a team OTLP endpoint under **Settings > Telemetry (OTLP)**, the control plane enqueues OTLP log and span events in an outbox whenever it has something worth telling your collector about — a completed or failed check, an alert state change, an SLO burn, a certificate rotation, a new insight, a deployment marker, a maintenance-window edit. The orchestrator (a separate always-on Fly service) drains that outbox every ~10 s, applies SSRF guards, and POSTs the OTLP payload to your collector. You can see every event type in `apps/web/src/lib/otel-events.ts` and the shipper in `apps/orchestrator/src/outbox-drain.ts`. **If no team OTLP endpoint is configured, the control plane skips the enqueue entirely** — the result still lands in Postgres and still drives alerting and the dashboard, but no OTel events are produced. - **The split:** **browser checks** *also* emit OTLP metrics and traces **directly from the runner** to your collector (when a team OTLP endpoint is configured; the orchestrator threads it into each browser execution payload). HTTP and MCP checks running on Yorker-hosted locations do **not** emit OTLP from the runner — the OTel signal for those runs is the `synthetics.check.completed` / `synthetics.check.failed` log event the control plane enqueues on your behalf (it carries the same dimensions: response time, status, assertions, timing breakdown). Private-location runners can opt in to runner-direct OTLP for HTTP and MCP checks by setting `OTLP_ENDPOINT`/`OTLP_API_KEY` as environment variables on the runner container at deploy time. The table below is the short version. **All "outbox → collector" rows assume you have configured a team OTLP endpoint** — without one, the outbox columns are skipped entirely. 
| Check type / location | Per-check metrics + traces | `check.completed` log event | Alerts / SLO / insight / cert / deployment events | |---|---|---|---| | **Browser**, any location | Runner → collector (direct) | Control plane → outbox → collector | Control plane → outbox → collector | | **HTTP / MCP**, Yorker-hosted | *(not emitted from runner today)* | Control plane → outbox → collector | Control plane → outbox → collector | | **HTTP / MCP**, private location | Runner → collector (direct) — only if you set `OTLP_ENDPOINT` on the agent container at startup | Control plane → outbox → collector | Control plane → outbox → collector | A few things to note: - **Every path targets the same `otlpEndpoint`.** Whether a signal is runner-direct or outbox-delivered, it lands in the same collector you configured under **Settings > Telemetry (OTLP)**. - **Every emission is OTLP HTTP JSON.** There is no proprietary ingestion format to learn. Runner-direct emission skips entirely if `otlpEndpoint` is unset (best-effort). The outbox path is retried with exponential backoff by the orchestrator. - **A runner crash skips both emission paths** for the affected run — the control plane only learns about the attempt if and when the runner submits a result. An orphaned attempt shows up as a gap in your result history, not an OTLP event. ### What this means for compliance and data flow - **Check results transit the Yorker control plane.** They land in Yorker's Postgres and power the dashboard, alerting, SLOs, and insights. What the control plane sees per run: timing breakdown, assertion pass/fail, HTTP status codes, truncated response bodies (for assertion re-evaluation and display), TLS certificate metadata, and — for browser checks — Web Vitals, network request metadata, console logs, screenshot *references* (the actual image bytes live in R2, uploaded directly by the runner), and step results. Retention depends on plan tier. 
- **For browser checks, per-check metrics and traces also reach your collector without traversing Yorker.** If you need low-latency, runner-direct OTel for browsers on hosted locations today, you already have it. - **For HTTP and MCP checks on Yorker-hosted locations, any OTel signals reaching your collector flow through the Yorker control plane** (and the orchestrator outbox), once you have configured an OTLP endpoint. This is a real architectural trade-off: it is how Yorker can enrich the `check.completed` event with things the runner doesn't know — SLO state, anomaly scores, alert context. - **If you need full runner-direct OTLP for HTTP and MCP checks**, run a private location and set `OTLP_ENDPOINT`/`OTLP_API_KEY` on the runner container at startup. The agent reads those env vars once and emits OTLP for every check it runs. ### Private locations Private locations let you run Yorker's runner inside your own network. The runner calls your internal services from inside your VPC (so internal hosts stay private), uploads screenshots directly to R2 (or a local fallback if R2 isn't configured), and POSTs results to the Yorker control plane over outbound HTTPS. That last path is mandatory — alerting, SLOs, and the dashboard all depend on it. You can opt your private runners into runner-direct OTLP for HTTP and MCP checks by setting `OTLP_ENDPOINT` and `OTLP_API_KEY` as environment variables on the runner container when you start it. Do that with an internal collector and no per-check telemetry ever leaves your network — only the result submission to the control plane egresses. Browser checks on private locations also emit runner-direct OTLP the same way hosted browser checks do. Derived events (alerts, SLO, cert, insight, check.completed) continue to flow via the control plane outbox → orchestrator → your collector. ## Screenshot pipeline For browser checks, screenshots flow through a dedicated pipeline: 1. 
**Capture** -- Playwright captures screenshots during script execution (every step, on failure only, or disabled).
2. **Upload** -- The runner uploads screenshots to Cloudflare R2 with a path scoped to the team, check, and run.
3. **Serve** -- The control plane API serves screenshots with team ownership validation. Only members of the team that owns the check can access its screenshots.

Screenshots are stored as R2 artifacts with retention based on your plan tier.

## Data tiering

Check result data is split into tiers based on how often it is accessed and how long it needs to be retained:

| Tier | What it stores | Retention | Storage |
|------|---------------|-----------|---------|
| **Tier A** (checkResults) | Core metrics: pass/fail, response time, status code, Web Vitals | Full plan retention | Postgres |
| **Tier B** (checkResultDetails) | Debug data: network request waterfalls, console logs, DOM snapshots | Shorter retention | Postgres (JSONB) |
| **R2 artifacts** | Screenshots, full network headers | Based on plan tier | Cloudflare R2 |

Tier A data is always stored and drives the dashboard, alerting, and SLO calculations. Tier B data is for debugging failed checks and is retained for a shorter window to manage storage costs.

---

## OpenTelemetry

> How Yorker emits standard OTLP signals and correlates with your existing observability stack.

# OpenTelemetry

Yorker is OTel-native. Every check produces standard OpenTelemetry signals -- metrics, traces, and log events -- delivered as OTLP HTTP JSON. There is no proprietary telemetry format. If your backend speaks OTel, it works with Yorker.

## Why OTel-native matters

Most synthetic monitoring tools store results in a proprietary system. When something breaks, you switch between your monitoring dashboard and your observability platform, manually correlating timestamps and URLs. Yorker eliminates that context switch. Your synthetic check results land in the same backend as your application traces, logs, and metrics.
A failing health check and the 500 error it triggered share the same trace ID. ## Metrics emitted Runner-direct OTLP emission (today: browser checks on any location, plus HTTP/MCP checks on private locations that have `OTLP_ENDPOINT` set) produces these metrics as OTLP gauge data points: | Metric | Type | Description | |--------|------|-------------| | `synthetics.http.response_time` | Gauge (ms) | Total response time from request start to last byte received. | | `synthetics.check.success` | Gauge (0 or 1) | Whether the check passed all assertions. | | `synthetics.dns.lookup_duration` | Gauge (ms) | Time spent resolving DNS. | | `synthetics.tls.handshake_duration` | Gauge (ms) | Time spent on TLS handshake. | | `synthetics.browser.lcp` | Gauge (ms) | Largest Contentful Paint (browser checks only). | | `synthetics.browser.fcp` | Gauge (ms) | First Contentful Paint (browser checks only). | | `synthetics.browser.cls` | Gauge (score) | Cumulative Layout Shift (browser checks only). | These metrics follow OpenTelemetry semantic conventions for synthetic monitoring where they exist, and use the `synthetics.*` namespace for domain-specific signals. Once you have configured an OTLP endpoint under **Settings > Telemetry (OTLP)**, the control plane outbox path produces matching log events (`synthetics.check.completed`, `synthetics.check.failed`) for every check regardless of type or location. Those log event bodies carry the same response time, status, and timing breakdown as the runner-emitted metrics, so HTTP and MCP checks on Yorker-hosted locations still land observable data in your collector — you just query log events instead of gauges for those. Until you configure an endpoint, no outbox events are enqueued at all. 
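For orientation, a single `synthetics.http.response_time` gauge data point has roughly this shape on the wire (a hand-written sketch following the OTLP/JSON encoding — the check ID, timestamp, and value are illustrative):

```json
{
  "resourceMetrics": [{
    "resource": {
      "attributes": [
        { "key": "service.name", "value": { "stringValue": "synthetics" } },
        { "key": "synthetics.check.id", "value": { "stringValue": "chk_abc123" } },
        { "key": "synthetics.check.type", "value": { "stringValue": "browser" } }
      ]
    },
    "scopeMetrics": [{
      "metrics": [{
        "name": "synthetics.http.response_time",
        "unit": "ms",
        "gauge": {
          "dataPoints": [{
            "timeUnixNano": "1700000000000000000",
            "asDouble": 412.7
          }]
        }
      }]
    }]
  }]
}
```

Any OTLP-compatible backend can ingest this payload as-is; no Yorker-specific decoding is required.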
## Resource attributes Every metric, trace, and log event includes resource attributes that identify the check, location, and run: | Attribute | Example | Description | |-----------|---------|-------------| | `synthetics.check.id` | `chk_abc123` | Unique check identifier. | | `synthetics.check.name` | `Homepage` | Human-readable check name. | | `synthetics.check.type` | `http`, `browser`, or `mcp` | Check type. | | `synthetics.location.id` | `loc_us_east` | Location identifier. | | `synthetics.location.name` | `US East (Ashburn)` | Human-readable location name. | | `synthetics.location.type` | `hosted` or `private` | Whether the location is Yorker-hosted or a private location. | | `synthetics.run.id` | `run_xyz789` | Unique identifier for this specific execution. | | `url.full` | `https://example.com` | The URL being monitored. | | `service.name` | `synthetics` | Service name used by both runner-direct emissions and control-plane outbox events. | These attributes let you filter, group, and alert on synthetic check data in your observability backend the same way you would with any other OTel-instrumented service. ### Labels as resource attributes Any labels attached to a check are emitted as additional resource attributes on every metric and trace. This lets you slice telemetry by your own dimensions — environment, service, team, criticality — without having to map check IDs back to metadata in your observability backend. | Label format | Resource attribute | |---|---| | `env:production` | `yorker.label.env="production"` | | `service:payments` | `yorker.label.service="payments"` | | `critical` *(no colon)* | `yorker.label.critical="true"` | See [Create a Monitor → Labels](/docs/guides/create-monitor#labels) for how to attach labels. ## Trace correlation Yorker injects a W3C `traceparent` header into outbound requests during check execution. This is how it works: 1. The runner generates a trace ID for the check execution. 2. 
The `traceparent` header is added to the HTTP request (or injected into the browser's network requests for browser checks). 3. Your backend application picks up the trace context via its own OTel instrumentation. 4. The synthetic check span and your backend request span share the same distributed trace. The result: when a check fails, you can click from the Yorker alert directly to the distributed trace in your observability backend. You see the synthetic request, the backend handler, the database query, and the error -- in one view. ## Backend compatibility Yorker works with any OTel-compatible backend. All Yorker telemetry is emitted as OTLP HTTP JSON — the most widely supported OTel transport. Tested backends include: - **ClickStack** (ClickHouse + HyperDX) - **Grafana Cloud** (Tempo + Mimir) - **Datadog** - **Honeycomb** - **New Relic** - **Jaeger** - **Any OTLP-compatible collector** (OpenTelemetry Collector, Alloy, Vector) ## Emission model Yorker has **two OTel emitters**, and they both target the same `otlpEndpoint` you configure on your team: - **Runners** emit OTLP metrics, traces, and logs directly to your collector — but only for browser checks today. Hosted HTTP and MCP runners do not emit OTLP from the runner process; private-location operators can enable runner-direct emission by setting `OTLP_ENDPOINT`/`OTLP_API_KEY` on their runner container when they start it. - **The orchestrator** drains an emission outbox that the control plane writes to, and ships every OTel log/span event Yorker generates to your collector — including `synthetics.check.completed`, `synthetics.check.failed`, `synthetics.step.completed`, alert state changes, SLO burn warnings, TLS certificate events, monitor/team insights, deployment markers, and maintenance-window events. The control plane enqueues; the orchestrator polls the outbox every ~10 seconds, runs SSRF guards, and POSTs. 
The metrics catalogued above (`synthetics.http.response_time`, `synthetics.check.success`, and friends) are currently produced by the runner-direct path, which means you will see them for browser checks. For HTTP and MCP checks on hosted locations, the same information reaches your collector via the `synthetics.check.completed` / `check.failed` log events — they carry `responseTimeMs`, `status`, assertion results, timing breakdown, and the same resource attributes as the metrics, so dashboards and queries can key off either signal. See the [Telemetry flow section in Architecture](/docs/concepts/architecture#telemetry-flow) for the full table of which check type and location combinations take which path. ## Setup To configure OTel emission for your team: 1. Go to **Settings > Telemetry (OTLP)** in the Yorker dashboard. 2. Enter your OTLP endpoint URL (e.g., `https://otel-collector.example.com:4318`). 3. Add any required authentication headers (API key, bearer token). 4. Click **Test Connection** — Yorker's control plane dispatches a test payload and reports success or failure. 5. Save. From this point, the control plane starts enqueueing events for the orchestrator to ship (for every check type and every location), and browser-check runners start including the endpoint in each execution payload for runner-direct metric/trace emission. Team-level OTLP credentials are stored on the team, not per-check. --- ## Alert Correlation > How multi-location correlation and OTel trace linking reduce noise and speed up root cause analysis. # Alert Correlation Synthetic monitors generate a lot of signals. Not every failure is a real outage -- network glitches, regional ISP issues, and transient errors produce false positives. Yorker uses multi-location correlation and consecutive failure thresholds to separate real incidents from noise, and OTel trace linking to get you from alert to root cause in one click. ## The noise problem A single-location failure usually means nothing. 
A DNS resolver in Frankfurt hiccups for 200ms. A CDN edge node in Sydney drops a connection. If you alert on every individual failure, you get paged for problems your users never notice. The question is not "did one check fail?" but "is the service actually down?" ## Multi-location correlation The `multi_location_failure` condition answers that question. It requires N of M monitoring locations to report failure within a time window before triggering an alert. For example, if your check runs from 6 locations and you configure `minLocations: 3`, the alert only fires when at least 3 locations fail in the same window. A single location flaking does not page you. ```yaml alerts: - name: Homepage Down conditions: - type: multi_location_failure minLocations: 3 channels: - "@pagerduty-oncall" ``` This eliminates geographic noise. If only Tokyo fails but Ashburn, London, Frankfurt, Singapore, and Sydney are all passing, the problem is regional -- not an outage. ## Consecutive failure thresholds The `consecutive_failures` condition handles a different class of noise: transient blips. A single timeout or 503 that resolves on the next check interval is not worth alerting on. ```yaml alerts: - name: API Degraded conditions: - type: consecutive_failures count: 5 channels: - "@ops-slack" ``` This alert only fires after 5 checks in a row fail. A one-off timeout is silently recorded in the check history but does not trigger a notification. 
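The semantics of the two conditions can be sketched in a few lines (illustrative Python, not Yorker's implementation — the function and parameter names here are invented):

```python
def consecutive_failures_met(results, count):
    """True once the trailing run of failed checks reaches `count`.

    `results` is ordered oldest-to-newest; True means the check passed.
    Illustrative sketch of the condition's semantics, not Yorker's code.
    """
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1  # any pass resets the run
    return streak >= count


def multi_location_failure_met(failed_locations, min_locations):
    """True when at least `min_locations` locations failed in the window."""
    return len(failed_locations) >= min_locations


# A one-off timeout does not fire; five failures in a row do:
print(consecutive_failures_met([True, False, True, True, True], 5))            # False
print(consecutive_failures_met([True, False, False, False, False, False], 5))  # True
```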
## Multi-tier alerting Combine both conditions to build alert tiers that match your incident response workflow: ```yaml alerts: # Critical: multiple locations confirm the outage - name: Service Outage conditions: - type: multi_location_failure minLocations: 3 channels: - "@pagerduty-oncall" # Warning: persistent failures from any location - name: Service Degraded conditions: - type: consecutive_failures count: 5 channels: - "@ops-slack" # Info: SSL certificate expiring soon - name: SSL Expiry Warning conditions: - type: ssl_expiry daysBeforeExpiry: 14 channels: - "@on-call-email" ``` Critical alerts go to PagerDuty because multiple locations confirm the service is down. Warning alerts go to Slack because the issue is persistent but might be localized. Info alerts go to email for non-urgent action items. ## OTel trace linking When a check fails, the trace ID from that execution links directly to the distributed trace in your observability backend. The flow looks like this: 1. Runner executes the check and injects a `traceparent` header. 2. Your backend processes the request and records the trace. 3. The check fails (assertion failure, timeout, 5xx response). 4. Yorker creates an alert with the trace ID attached. 5. You click the trace link in the alert notification. 6. Your observability backend shows the full distributed trace: the synthetic request, your API handler, the database query that timed out, the error. This collapses the "what broke?" investigation from minutes of log searching to a single click. The synthetic check and the backend error are part of the same trace. The alerts dashboard shows all active, acknowledged, and recovered alerts across your monitors. ## Alert lifecycle Alerts follow a state machine: | State | Meaning | |-------|---------| | **ACTIVE** | The alert condition is met. Notifications have been sent. | | **ACKNOWLEDGED** | A team member has acknowledged the alert. No repeat notifications. 
| | **RESOLVED** | A team member manually resolved the alert. | | **RECOVERED** | The check started passing again. The alert auto-resolves. | When a check that triggered an ACTIVE alert starts passing again, the alert transitions to RECOVERED and a recovery notification is sent to the same channels. This closes the loop without manual intervention. Acknowledged alerts suppress repeat notifications but remain visible in the dashboard until the underlying issue is resolved or the check recovers. --- ## Incidents > How Yorker groups correlated alerts into incidents, tracks their lifecycle, and dispatches opinionated, investigator-grade notifications. # Incidents A Yorker incident is a correlated group of alerts treated as one investigable unit. Each incident has a fingerprint, a severity, a lifecycle, and a notification policy. Incidents reduce noise by collapsing many alerts into one ticket and by emitting structured, investigator-grade payloads to your channels. ## Why incidents exist A single alert answers "is this check failing right now?" It does not answer the question an on-call engineer actually needs: **what is the blast radius, and is it related to something else that's breaking?** Synthetic monitors often fire in bursts. An upstream DNS provider hiccups and ten HTTP checks page at once. A CDN edge degrades and browser checks across three regions turn red. Without correlation, you get ten pages for one problem. Yorker groups those alerts into an incident, computes a scoped hypothesis from the observations (HTTP status codes, locations, shared failing domains, symptom timing), and dispatches **one** ticket per channel per incident — not one per alert. ## The incident lifecycle Every incident moves through a small set of states. Each state transition is recorded as a first-class event and dispatched to subscribed channels. 
| State | Entered by |
| --------------- | ------------------------------------------------------------------ |
| `open` | Correlated alerts above the score threshold |
| `acknowledged` | A user clicks "Acknowledge" in the dashboard or API |
| `auto_resolved` | All member alerts recovered and the 15-minute cool-down elapsed |
| `closed` | A user closes the incident explicitly |
| `reopened` | A user reopens a previously closed/resolved incident |

The transient `reopened` → `open` transition is preserved in the event log so downstream consumers can replay the exact sequence.

## Event types

Every lifecycle transition emits one of these events. Every event carries the full observations + hypothesis snapshot, so a consumer replaying one event has complete context without querying back.

- `opened` — new incident created
- `alert_attached` — an additional alert joined an active incident
- `severity_changed` — severity escalated or de-escalated
- `acknowledged` — a user took ownership
- `note_added` — a user added a freeform note
- `auto_resolved` — all members recovered and cool-down elapsed
- `closed` — a user closed it
- `reopened` — a user reopened a previously resolved incident

Each event is persisted to `incident_events`, emitted as an OTel log record (if an OTLP endpoint is configured for the team), and dispatched to every channel subscribed to incidents for the team.

## Default notification routing

Different channel types have different sensible defaults.
Yorker opts into the minimum-noise routing that matches each channel's audience: | Channel | Receives by default | | ------------ | ------------------------------------------------------------------------------------ | | Slack | Every lifecycle event (timeline-style thread) | | Email | `opened`, `auto_resolved`, `closed` only (inboxes should not be a running timeline) | | Webhook | Every lifecycle event | | PagerDuty | `opened`, `acknowledged`, `auto_resolved`, `closed`, `reopened`, `note_added` | | ServiceNow | `opened`, `severity_changed`, `acknowledged`, `auto_resolved`, `closed`, `note_added` | PagerDuty skips `severity_changed` because the Events API v2 has no matching action. ServiceNow skips `reopened` because Yorker's reopen semantics don't map cleanly to ServiceNow's reopen concept — a Yorker "reopen" after a recurrence creates a new external ticket rather than mutating the old one. See the [Slack](/docs/integrations/slack), [PagerDuty](/docs/integrations/pagerduty), [ServiceNow](/docs/integrations/servicenow), [Email](/docs/integrations/email), and [Webhook](/docs/integrations/webhook) integration pages for the exact payload shapes. ## Scoped hypothesis Every outbound incident payload carries a `hypothesis` block that tells the reader what Yorker thinks is going on — scoped to what an external synthetic sensor can prove: ```json { "hypothesis": { "summary": "Stripe API is returning 503/504; checkout is blocked.", "confidence": 0.75, "ruledIn": ["shared_failing_domain=api.stripe.com"], "ruledOut": [ "DNS resolution: NXDOMAIN not observed", "TLS: handshake completes" ], "correlationDimensionsMatched": ["shared_failing_domain", "error_pattern"], "scope": "external_symptoms_only" } } ``` `scope: external_symptoms_only` is the honesty baseline. Yorker can prove the external symptom — users cannot reach checkout — and can rule out classes of causes it directly measured (DNS, TLS, shared failing domains). 
It cannot see your backend logs, so it never claims the backend is the culprit.

## Dedupe + rate limiting

- **30s dedupe window** — a retry firing the same event to the same channel within 30 seconds is recorded as `skipped_dedupe` in `incident_notification_dispatches`, not sent again.
- **1-per-minute note rate limit** — per (channel, incident), a second `note_added` within 60 seconds of a prior send attempt (successful **or** failed) is recorded as `skipped_rate_limit`. Failed attempts count because each one still hit the upstream endpoint — a flaky webhook returning 5xx must not leak a retry burst past the cap. This prevents an operator running a backfill script from spamming hundreds of notes.

Both checks fail **open** on database errors — losing a notification is worse than double-sending one.

## User-editable templates

Every channel's default payload can be overridden with a Handlebars template attached to the notification channel. The rendering context matches `serializeIncidentEventForExport` plus a few helpers (`severityEmoji`, `eventEmoji`, `join`, `ifHasSource`, `jsonBody`). A render error or JSON-parse failure on the override **falls back to the default** and logs the error — a bad template never fails dispatch.

### In the web UI

For Slack, email, and webhook channels, **Settings > Notification Channels > Templates** opens a full editor with per-event tabs, a live preview rendered against canonical fixtures, a library of starter and example templates, a diff view comparing the draft against the last saved version, and a **Send test** button that dispatches the current saved template to the real channel. The editor is the recommended authoring path for these three channel types. PagerDuty and ServiceNow overrides are currently API-only.
### Via the API Template overrides are sent via the notification-channel API: ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "slack", "overrides": { "opened": { "blocks": "{\"blocks\":[{\"type\":\"section\",\"text\":{\"type\":\"mrkdwn\",\"text\":\"{{severityEmoji incident.severity}} {{incident.title}}\"}}]}" } } } }' ``` To disable a channel from receiving incident events (fall back to legacy per-alert dispatch), set `incidentSubscribed: false` on the channel. ## Audit trail Every dispatch writes one row to `incident_notification_dispatches` with status `sent`, `skipped_dedupe`, `skipped_rate_limit`, `skipped_not_routed`, or `failed`, plus any channel-specific response payload (PagerDuty `dedup_key`, ServiceNow `sys_id`). This is the source of truth for "did we actually notify?" — the UI will expose it in a later iteration. --- ## Locations > Available monitoring locations — 14 hosted regions on Fly.io and private locations. # Locations Yorker runs checks from 14 hosted locations distributed across six continents. You can also run checks from private locations inside your own infrastructure. ## Hosted locations Every hosted location runs on Fly.io infrastructure in the corresponding region. Checks execute close to your users, giving you realistic latency measurements from each geography. 
| Location ID | Display Name | Fly Region | |---|---|---| | `loc_us_east` | US East (Ashburn) | `iad` | | `loc_us_south` | US South (Dallas) | `dfw` | | `loc_us_west` | US West (Los Angeles) | `lax` | | `loc_na_north` | Canada (Toronto) | `yyz` | | `loc_sa_east` | South America (São Paulo) | `gru` | | `loc_eu_west` | Europe West (London) | `lhr` | | `loc_eu_west_2` | Europe West (Paris) | `cdg` | | `loc_eu_central` | Europe Central (Frankfurt) | `fra` | | `loc_eu_north` | Europe North (Stockholm) | `arn` | | `loc_ap_southeast` | Asia Pacific (Singapore) | `sin` | | `loc_ap_northeast` | Asia Pacific (Tokyo) | `nrt` | | `loc_ap_south` | Asia Pacific (Mumbai) | `bom` | | `loc_ap_oceania` | Oceania (Sydney) | `syd` | | `loc_af_south` | Africa (Johannesburg) | `jnb` | ## Using locations in config Specify locations as an array of location IDs in your monitor configuration: ```yaml monitors: - name: Homepage type: http url: https://example.com locations: - loc_us_east - loc_eu_central - loc_ap_northeast ``` When a check runs, it executes independently from each configured location. A check with 3 locations and a 5-minute frequency produces 3 results every 5 minutes — one per location. You can also set default locations for all monitors: ```yaml defaults: locations: - loc_us_east - loc_eu_central monitors: - name: Homepage type: http url: https://example.com # Inherits loc_us_east and loc_eu_central from defaults - name: APAC Health type: http url: https://example.com locations: - loc_ap_southeast - loc_ap_northeast # Overrides defaults with APAC locations ``` ## Plan limits The number of locations you can use depends on your plan: | Plan | Hosted Locations | Private Locations | |------|-----------------|-------------------| | **Free** | 1 | 0 | | **Yorker** ($29.99/mo) | All 14 | Up to 2 | | **Enterprise** | All 14 | Unlimited | Paid and enterprise plans have access to all 14 hosted locations. The free tier is limited to 1 location per check. 
See [Pricing](https://yorkermonitoring.com/pricing) for current plan details. ## Private locations Private locations let you run Yorker's runner inside your own infrastructure — behind a firewall, in a VPC, or on-premises. This is useful for monitoring internal services that are not publicly accessible. ### How they work 1. You deploy the Yorker runner container in your infrastructure (Docker, Kubernetes, or a VM), typically using the Docker snippet that `yorker locations create` prints for you. 2. The runner polls the Yorker control plane over outbound HTTPS for check assignments. No inbound ports needed. 3. The runner executes checks against your internal services — HTTP requests, browser sessions, and MCP tool calls all originate from inside your network, so internal hosts stay private. 4. The runner submits each check result (timing, assertions, Web Vitals, certificates, and screenshot references for browser checks) to the Yorker control plane via `POST /api/runner/results`. This drives alerts, SLOs, insights, and the dashboard — it is mandatory. 5. For browser checks the runner also emits OTLP metrics and traces directly to your collector. For HTTP and MCP checks, runner-direct OTLP emission is opt-in: set `OTLP_ENDPOINT` and `OTLP_API_KEY` as environment variables on the runner container at startup and the persistent-http agent will emit OTLP for every check it runs. 6. The control plane evaluates the result (alerts, SLO burn, baselines, insights). If a team OTLP endpoint is configured, it also enqueues OTel events — `synthetics.check.completed`, alert state changes, SLO warnings, certificate events, insights — in an outbox that the orchestrator polls and ships to your collector asynchronously. If no OTLP endpoint is configured, the enqueue is skipped entirely (results still power the dashboard and alerting, there are just no outbound OTel events). ### Key properties - **Outbound-only** — The runner initiates all connections. 
No firewall rules, no VPN tunnels, no inbound ports. - **Internal targets stay private** — The runner calls your internal services from inside your network, so you never have to expose a staging API or intranet dashboard on the public internet to monitor it. - **You can keep per-check OTLP inside your network** — Point `OTLP_ENDPOINT` on the runner container at an internal collector and the per-check metrics/traces/logs emitted by your private agent never leave your infrastructure. Check results still flow to the Yorker control plane over outbound HTTPS; derived events (alerts, SLO, insights, `check.completed`) still flow through the orchestrator outbox on the way to your collector. - **Same runner image** — Private locations use the same runner container as hosted locations. No separate software to maintain. - **Discounted runs** — Private browser runs are billed at 50% of the hosted rate. ### Setting one up See the [Private Locations guide](/docs/guides/private-locations) for a complete step-by-step walkthrough (create the location, generate a runner key, deploy the agent, verify health, and assign the location to monitors). 
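A private-location deployment can be sketched as a Docker Compose service. This is illustrative only: the image name (`yorker/runner`) and the `YORKER_RUNNER_KEY` variable are assumptions, so use the exact snippet that `yorker locations create` prints for you. `OTLP_ENDPOINT` and `OTLP_API_KEY` are the opt-in variables described above.

```yaml
# Hypothetical compose service for a private-location runner.
# Image name and YORKER_RUNNER_KEY are assumptions; copy the real
# snippet printed by `yorker locations create`.
services:
  yorker-runner:
    image: yorker/runner:latest          # assumption: actual image name may differ
    restart: unless-stopped
    environment:
      YORKER_RUNNER_KEY: "rkey_..."      # runner key secret (shown once at creation)
      # Optional: keep per-check OTLP inside your network
      OTLP_ENDPOINT: "http://otel-collector.internal:4318"
      OTLP_API_KEY: "..."
    # Outbound-only: the runner polls the control plane, so no ports are published
```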
You can drive the same flow from the CLI: ```bash # Create a private location (also auto-provisions an initial runner key and prints a Docker snippet) yorker locations create --name hq-data-center --display-name "HQ Data Center" # List private locations only (default) yorker locations list # Include hosted locations in the list too yorker locations list --all # Create another runner key for an existing location (secret is shown once) yorker locations keys create loc_abcdef123456 --name "eu-west-1" # List runner keys on a location (prefixes only) yorker locations keys list loc_abcdef123456 # Revoke a key by its ID yorker locations keys revoke loc_abcdef123456 rkey_abc123 # Delete a private location (also revokes all its runner keys) yorker locations delete loc_abcdef123456 ``` Once the runner is online, reference the private location ID in any monitor: ```yaml monitors: - name: Internal API type: http url: https://internal-api.staging.local/health locations: - loc_abcdef123456 - loc_us_east # Can mix private and hosted locations ``` --- # Integrations ## Slack > Send incident notifications to Slack via Block Kit — one timeline-style thread per incident, every lifecycle event. # Slack Yorker posts incident notifications to Slack via an [Incoming Webhook](https://api.slack.com/messaging/webhooks). Slack is the **timeline** channel — by default it receives every lifecycle event, so your channel becomes a running record of the incident. For the underlying model (lifecycle states, event types, scoped hypothesis), see [Incidents](/docs/concepts/incidents). ## Set up 1. In Slack, create an Incoming Webhook and copy the URL. 2. In Yorker, go to **Settings > Notification Channels**, click **Create Channel**, pick **Slack**, and paste the webhook URL. 3. The channel is subscribed to incidents by default. Wire it to any alert rule and it will participate in incident dispatch. 
Or via the API: ```bash curl -X POST https://yorkermonitoring.com/api/notification-channels \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "name": "ops-channel", "channel": { "type": "slack", "webhookUrl": "https://hooks.slack.com/services/T.../B.../..." } }' ``` ## What gets posted Slack receives **every** incident event by default: - `opened` - `alert_attached` - `severity_changed` - `acknowledged` - `auto_resolved` - `closed` - `reopened` - `note_added` The `opened` event uses a rich Block Kit layout — header, severity, affected checks, locations, symptom window, shared failing domains, hypothesis, ruled-out list, and a **View in Yorker** button. Subsequent events are single-section status lines so the thread reads like a timeline. Example `opened` payload: ```json { "blocks": [ { "type": "header", "text": { "type": "plain_text", "text": "🔴 Incident opened — Checkout API outage" } }, { "type": "section", "text": { "type": "mrkdwn", "text": "*Severity*: `CRITICAL` · *Incident*: " } }, { "type": "section", "fields": [ { "type": "mrkdwn", "text": "*Affected checks*\nCheckout API" }, { "type": "mrkdwn", "text": "*Locations*\nloc_us_east_1, loc_eu_west_1" }, { "type": "mrkdwn", "text": "*Symptom window*\n2026-04-15T09:58:00Z → ongoing" }, { "type": "mrkdwn", "text": "*Shared failing domains*\napi.stripe.com" } ]}, { "type": "section", "text": { "type": "mrkdwn", "text": "*Hypothesis*\nStripe API is returning 503/504; checkout is blocked." 
} }, { "type": "section", "text": { "type": "mrkdwn", "text": "*Ruled out*\n• DNS resolution: NXDOMAIN not observed\n• TLS: handshake completes" } }, { "type": "context", "elements": [{ "type": "mrkdwn", "text": "Scope: `external_symptoms_only` — Yorker measures external symptoms only" }] }, { "type": "actions", "elements": [{ "type": "button", "text": { "type": "plain_text", "text": "View in Yorker" }, "url": "https://yorkermonitoring.com/dashboard/incidents/inc_abc" }] } ] } ``` ## Template overrides Every event's default payload can be replaced with a Handlebars-rendered Block Kit JSON string. The template renders against the full incident event context. ### Edit in the web UI The fastest way to author overrides is the per-channel template editor. Open **Settings > Notification Channels**, click **Templates** next to the Slack channel, and you get: - A Handlebars editor with JSON syntax highlighting - A live **preview** pane that renders your template against one of six canonical fixtures (single HTTP failure, multi-location burst, browser check, MCP, etc.) - A per-event drop-down so you can author one template per lifecycle event - A **library** sidebar with curated starter blocks and end-to-end examples (runbook-style, terse-ack, etc.) that apply with a click - **Compare with saved** diff view before you save - **Send test**: dispatches the current saved template to the real Slack webhook using the selected fixture (60-second cooldown per channel) - **Reset to default** per field: clears the override and falls back to the shipping payload Saved overrides apply immediately to the next incident dispatch for that channel. 
### Edit via the API ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "slack", "overrides": { "opened": { "blocks": "{\"blocks\":[{\"type\":\"section\",\"text\":{\"type\":\"mrkdwn\",\"text\":\"{{severityEmoji incident.severity}} *{{incident.title}}*\\n{{payload.hypothesis.summary}}\"}}]}" } } } }' ``` The `blocks` body must render to a JSON object with a `blocks: [...]` array. A render error or a parse error falls back to the default payload and logs a warning. A bad template never fails dispatch. Each body is capped at 32 KiB. Use `"default"` as the event key to define a single override that applies to every event type that doesn't have its own entry. ### Available helpers - `{{severityEmoji incident.severity}}` → `🔴 / 🟡 / 🔵` - `{{eventEmoji eventType}}` → `🚨 / ➕ / 🔺 / 👤 / ✅ / ☑️ / 🔁 / 📝` - `{{upperCase str}}`, `{{titleCase str}}` - `{{join array ", "}}` - `{{#ifHasSource "synthetic_http"}}…{{/ifHasSource}}` — `synthetic_http`, `synthetic_browser`, or `synthetic_mcp` - `{{jsonBody payload}}` — splat a value as raw JSON (already `JSON.stringify`d). In JSON-producing channels (Slack, webhook, PagerDuty, ServiceNow) escaping is disabled, so `{{jsonBody x}}` and `{{{jsonBody x}}}` are equivalent. In email HTML templates the double-stash form is HTML-escaped by default — use triple-stash only as an explicit opt-out. 
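The fallback contract (a template that fails to render or parse never fails dispatch) can be sketched as follows. This is a hypothetical illustration of the documented behavior, not Yorker's actual code:

```typescript
// Sketch of the documented fallback: an override must render to a JSON
// object carrying a `blocks` array; a render error, a parse error, or a
// wrong shape all fall back to the default payload.
function resolveSlackBlocks(
  rendered: string | null, // null models a Handlebars render error
  defaultPayload: { blocks: unknown[] },
): { blocks: unknown[] } {
  if (rendered === null) return defaultPayload;
  try {
    const parsed = JSON.parse(rendered);
    if (parsed && Array.isArray(parsed.blocks)) return parsed;
  } catch {
    // invalid JSON: fall through to the default
  }
  return defaultPayload;
}
```

With this shape, a bad override degrades to the shipping payload rather than dropping the notification, which matches the "a bad template never fails dispatch" guarantee above.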
### Render context The full context mirrors `serializeIncidentEventForExport`: - `eventId`, `eventType`, `incidentId`, `teamId`, `occurredAt` - `actor` — `{ type: "user" | "system", id }` - `payload` — the full event payload (observations, hypothesis, event-specific fields) - `incident` — `{ incidentId, title, severity, state, openedAt, triageUrl }` ## Disabling incident routing To fall back to the legacy per-alert Slack dispatch, set `incidentSubscribed: false` on the channel: ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -d '{ "incidentSubscribed": false }' ``` --- ## PagerDuty > Trigger PagerDuty incidents via Events API v2 with dedup_key = Yorker incident ID, observations and scoped hypothesis in custom_details. # PagerDuty Yorker drives PagerDuty through the [Events API v2](https://developer.pagerduty.com/docs/events-api-v2-overview). One Yorker incident maps to one PagerDuty alert; the `dedup_key` is the Yorker `incident_id`. Subsequent lifecycle events update the same PagerDuty alert rather than creating new ones. For the underlying model (lifecycle states, event types, scoped hypothesis), see [Incidents](/docs/concepts/incidents). ## Set up 1. In PagerDuty, add an **Events API v2** integration to the service that should receive Yorker alerts. Copy the **Integration Key** (also called the routing key). 2. Note your PagerDuty region: `us` for `events.pagerduty.com`, `eu` for `events.eu.pagerduty.com`. 3. 
In Yorker, create a notification channel of type `pagerduty` with the routing key and region: ```bash curl -X POST https://yorkermonitoring.com/api/notification-channels \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "name": "pd-oncall", "channel": { "type": "pagerduty", "routingKey": "R0UTINGKEY1234567890ABCDEF", "serviceRegion": "us" } }' ``` | Field | Required | Default | Description | | ---------------- | -------- | ------- | ----------------------------------------- | | `routingKey` | yes | — | Events API v2 integration key | | `serviceRegion` | no | `us` | `us` or `eu` | ## Event mapping | Yorker event | PD `event_action` | Notes | | ----------------- | ----------------- | -------------------------------------------------- | | `opened` | `trigger` | First alert created | | `acknowledged` | `acknowledge` | Forwards the Yorker ack to PD | | `auto_resolved` | `resolve` | After cool-down elapses | | `closed` | `resolve` | User-initiated close | | `reopened` | `trigger` | Recurrence re-triggers the alert | | `note_added` | `trigger` | PD treats duplicate dedup_key triggers as updates | | `severity_changed`| *(skipped)* | Events API v2 has no matching action | | `alert_attached` | *(skipped)* | Internal-only | `dedup_key` is always the Yorker `incident_id`. `severity` maps `critical → critical`, `warning → warning`, `info → info`. 
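The event mapping above can be sketched as a lookup. This is an illustrative reimplementation of the documented table, not Yorker's source:

```typescript
// Yorker incident event → PagerDuty Events API v2 event_action,
// per the mapping table above. `null` means the event is skipped.
type PdAction = "trigger" | "acknowledge" | "resolve" | null;

function pdEventAction(eventType: string): PdAction {
  switch (eventType) {
    case "opened":
    case "reopened":
    case "note_added":   // PD treats duplicate dedup_key triggers as updates
      return "trigger";
    case "acknowledged":
      return "acknowledge";
    case "auto_resolved":
    case "closed":
      return "resolve";
    default:             // severity_changed, alert_attached
      return null;
  }
}
```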
Example `opened` payload: ```json { "routing_key": "R0UTINGKEY…", "event_action": "trigger", "dedup_key": "inc_abc", "payload": { "summary": "[Yorker] Checkout API outage", "source": "yorker", "severity": "critical", "class": "synthetic-monitoring", "custom_details": { "incident_id": "inc_abc", "event_type": "opened", "event_id": "ievt_001", "sources": ["synthetic_http"], "affected_checks": ["Checkout API"], "locations_affected": ["loc_us_east_1", "loc_eu_west_1"], "shared_failing_domains": ["api.stripe.com"], "hypothesis": { "summary": "Stripe API is returning 503/504; checkout is blocked.", "confidence": 0.75, "ruled_in": ["shared_failing_domain=api.stripe.com"], "ruled_out": ["DNS resolution: NXDOMAIN not observed", "TLS: handshake completes"], "scope": "external_symptoms_only", "correlation_dimensions_matched": ["shared_failing_domain", "error_pattern"] }, "actor": { "type": "system", "id": null } } }, "links": [ { "href": "https://yorkermonitoring.com/dashboard/incidents/inc_abc", "text": "View in Yorker" } ] } ``` When the incident is a recurrence of prior closed incidents, Yorker adds up to 5 prior incident links to the `links` array so the on-call engineer can pivot to history without leaving PD. ## Template overrides Only the `custom_details` block is user-editable. The envelope (routing key, event action, dedup key, severity, source, summary) is fixed so dedupe and severity semantics stay consistent across your tenants. > **UI editor note:** the in-app per-channel template editor (at **Settings > Notification Channels > Templates**) currently supports Slack, email, and webhook channels only. PagerDuty `customDetails` overrides are authored through the API below; a UI editor for PagerDuty is planned for a future release. 
```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "pagerduty", "overrides": { "opened": { "customDetails": "{\"runbook\":\"https://wiki.acme.com/runbooks/checkout\",\"yorker_link\":\"{{incident.triageUrl}}\",\"severity\":\"{{incident.severity}}\",\"hypothesis\":\"{{payload.hypothesis.summary}}\"}" } } } }' ``` The rendered string must parse as a JSON object. On render failure or invalid JSON the default custom_details block is used instead; dispatch does not fail. Helper and render context are the same as the [Slack integration](/docs/integrations/slack). ## Ack propagation Acknowledging in Yorker sends `event_action: "acknowledge"` to PagerDuty. Acknowledging in PagerDuty does **not** propagate back to Yorker in this release — bidirectional sync is tracked as a future enhancement. ## Disabling The PagerDuty channel type is **incident-pipeline-only** — it has no legacy per-alert dispatch path. `incidentSubscribed` is locked to `true` for PagerDuty channels; the API rejects create/update requests that set it to `false`. To stop routing incidents to a PagerDuty channel, delete it or remove it from the alert rule's channel list. --- ## ServiceNow > Create and update ServiceNow incident records via inbound REST, keyed by correlation_id = Yorker incident ID. # ServiceNow Yorker integrates with ServiceNow via inbound REST against the standard `incident` table. The first `opened` event creates a new incident record; subsequent lifecycle events update that record via `correlation_id` lookup — no `sys_id` round-trip required. For the underlying model (lifecycle states, event types, scoped hypothesis), see [Incidents](/docs/concepts/incidents). ## Set up Yorker needs an inbound REST credential with permission to create and update `incident` records. 1. 
In ServiceNow, create a service account (or re-use an existing integration user) with the **Incident** table permissions needed to `POST` and `PATCH`. 2. Note your instance URL (e.g. `https://acme.service-now.com`) and optionally the assignment group new incidents should route to. 3. Create the channel in Yorker: ```bash curl -X POST https://yorkermonitoring.com/api/notification-channels \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "name": "snow-prod", "channel": { "type": "servicenow", "instanceUrl": "https://acme.service-now.com", "username": "yorker_integration", "password": "…", "assignmentGroup": "Network Operations" } }' ``` | Field | Required | Description | | ------------------ | -------- | --------------------------------------------------------------------- | | `instanceUrl` | yes | Base URL of the ServiceNow instance | | `username` | yes | Integration user | | `password` | yes | Password or OAuth token (stored at rest; treated as secret) | | `assignmentGroup` | no | Group name — maps to `assignment_group` on new incidents | Authentication is HTTP Basic over TLS. 
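The correlation-keyed update pattern can be sketched as a request builder. This is a hypothetical helper mirroring the documented lookup (Basic auth over TLS, updates keyed by `correlation_id` with no `sys_id` round-trip), not Yorker's implementation:

```typescript
// Build the Basic-auth header and the correlation_id-keyed PATCH URL
// that lifecycle updates target against the standard incident table.
function snowPatchRequest(
  instanceUrl: string,
  username: string,
  password: string,
  incidentId: string,
): { url: string; headers: Record<string, string> } {
  const token = Buffer.from(`${username}:${password}`).toString("base64");
  const query = encodeURIComponent(`correlation_id=${incidentId}`);
  return {
    url: `${instanceUrl}/api/now/table/incident?sysparm_query=${query}`,
    headers: {
      Authorization: `Basic ${token}`,
      "Content-Type": "application/json",
    },
  };
}
```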
## Event mapping

| Yorker event | Action | Notes |
| ----------------- | ------------------- | -------------------------------------------------------------- |
| `opened` | `POST /incident` | Creates record with `correlation_id = incident_id` |
| `severity_changed`| `PATCH` record | Updates `impact`, `urgency`, adds `work_notes` |
| `acknowledged` | `PATCH` record | Sets `state = 2` (Work in Progress), adds ack note |
| `note_added` | `PATCH` record | Appends to `work_notes` |
| `auto_resolved` | `PATCH` record | Sets `state = 6` (Resolved), fills `close_code` + `close_notes` |
| `closed` | `PATCH` record | Sets `state = 6` (Resolved), uses Yorker `closeReason` |
| `reopened` | *(skipped)* | A Yorker reopen creates a new external ticket rather than mutating the old |
| `alert_attached` | *(skipped)* | Internal-only |

Lifecycle updates target `PATCH /api/now/table/incident?sysparm_query=correlation_id=` — no `sys_id` is stored.

### Field mapping (on `opened`)

| ServiceNow field | Source |
| ---------------------- | --------------------------------------------------------------- |
| `short_description` | `payload.hypothesis.summary` (stripped of CR/LF, capped at 160) |
| `description` | Hypothesis summary + observations summary + triage URL |
| `work_notes` | `scope: external_symptoms_only` + ruled-out list |
| `impact`, `urgency` | `critical → 1`, `warning → 2`, `info → 3` |
| `correlation_id` | Yorker `incident_id` |
| `correlation_display` | `"Yorker Incident"` |
| `assignment_group` | Channel config (if set) |

### Severity mapping

| Yorker severity | ServiceNow `impact` | ServiceNow `urgency` |
| --------------- | ------------------- | -------------------- |
| `critical` | 1 (High) | 1 (High) |
| `warning` | 2 (Medium) | 2 (Medium) |
| `info` | 3 (Low) | 3 (Low) |

## Why reopened isn't routed ServiceNow's "Reopen" action reverts a closed incident to an active state on the **same** record.
Yorker's recurrence model treats a new burst of correlated alerts after auto-resolution as a **new incident** with a pointer back to the prior one (`recurrenceOf`). Those two semantics don't map cleanly, so Yorker creates a new ServiceNow record on recurrence (via the next `opened` event) rather than mutating the closed one. Use the **View in Yorker** link on each record to pivot between related incidents. ## Template overrides Three fields are user-editable: `short_description`, `description`, and `work_notes`. The envelope (`state`, `impact`, `urgency`, `correlation_id`, `assignment_group`) stays under Yorker's control so the lookup pattern and severity mapping remain consistent. > **UI editor note:** the in-app per-channel template editor (at **Settings > Notification Channels > Templates**) currently supports Slack, email, and webhook channels only. ServiceNow overrides are authored through the API below; a UI editor for ServiceNow is planned for a future release. ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "servicenow", "overrides": { "opened": { "shortDescription": "[{{upperCase incident.severity}}] {{incident.title}}", "description": "Yorker incident {{incident.incidentId}}\n\nHypothesis: {{payload.hypothesis.summary}}\n\nTriage: {{incident.triageUrl}}", "workNotes": "Scope: {{payload.hypothesis.scope}}\nRuled out:\n{{#each payload.hypothesis.ruledOut}} - {{this}}\n{{/each}}" } } } }' ``` Template render failures fall back to the default string and log a warning. Dispatch never fails on a bad template. Helpers and render context are the same as the [Slack integration](/docs/integrations/slack). `short_description` is always stripped of CR/LF (header injection guard) and capped at 160 characters after render. 
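The `short_description` guard described above can be sketched as follows. Illustrative only; the docs say CR/LF is stripped and the result capped at 160 characters, and this sketch assumes stripped newlines collapse to a single space:

```typescript
// Header-injection guard for short_description: strip CR/LF
// (collapsed to one space here, an assumption), then cap at 160 chars.
function sanitizeShortDescription(rendered: string): string {
  return rendered.replace(/[\r\n]+/g, " ").trim().slice(0, 160);
}
```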
## Disabling The ServiceNow channel type is **incident-pipeline-only** — it has no legacy per-alert dispatch path. `incidentSubscribed` is locked to `true` for ServiceNow channels; the API rejects create/update requests that set it to `false`. To stop routing incidents to a ServiceNow channel, delete it or remove it from the alert rule's channel list. --- ## Email > Email notifications for incident lifecycle milestones (opened, auto-resolved, closed) with multipart HTML + plain-text bodies. # Email Yorker sends incident notifications by email via AWS SES. Email is the **low-noise** channel: inboxes shouldn't be a running timeline, so only the lifecycle milestones that matter for async followers are routed by default. For the underlying model (lifecycle states, event types, scoped hypothesis), see [Incidents](/docs/concepts/incidents). ## Set up 1. Go to **Settings > Notification Channels**, create an **Email** channel, and add one or more recipient addresses. 2. Or via the API: ```bash curl -X POST https://yorkermonitoring.com/api/notification-channels \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "name": "on-call-email", "channel": { "type": "email", "addresses": ["oncall@example.com", "sre-team@example.com"] } }' ``` | Field | Required | Description | | ----------- | -------- | ------------------------------------------------- | | `addresses` | yes | At least one valid email address | ## What gets sent Email dispatches on **three** events by default: - `opened`: full context mail with observations, hypothesis, ruled-out list, and an **Acknowledge in Yorker** button - `auto_resolved`: resolution mail - `closed`: close-reason mail `acknowledged`, `severity_changed`, `note_added`, `alert_attached`, and `reopened` are **not** routed to email by default. They'd turn the inbox into a timeline; use Slack or webhooks for timeline-style updates. 
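The low-noise default routing above can be sketched as a filter (illustrative, not Yorker's code):

```typescript
// Email dispatches only on lifecycle milestones by default;
// timeline-style events are excluded, per the defaults above.
const EMAIL_ROUTED_EVENTS = new Set(["opened", "auto_resolved", "closed"]);

function isRoutedToEmail(eventType: string): boolean {
  return EMAIL_ROUTED_EVENTS.has(eventType);
}
```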
Every mail is multipart: an HTML body for rich clients and a plain-text alternative for others. The HTML body is styled dark (`#000` background, `#111827` card) and includes: - Event type badge (color-coded: red for opened, teal for resolved/closed) - Incident ID, severity, affected checks, locations, shared failing domains - Hypothesis summary - Ruled-out list - **Acknowledge in Yorker** call-to-action Subject line format: ``` Yorker Incident — CRITICAL — Checkout API outage Yorker Incident Auto-Resolved — Checkout API outage Yorker Incident Closed — Checkout API outage ``` CR/LF characters are stripped from every subject (header-injection guard) and the line is capped at 200 characters after render. ## Template overrides All three parts (subject, HTML body, plain-text body) can be overridden per event type. ### Edit in the web UI Open **Settings > Notification Channels** and click **Templates** next to the email channel. The editor has three tabs (`subject` / `html` / `text`), so you can author each part independently and see a live rendered preview: - The HTML tab previews in a sandboxed iframe with no same-origin access, so user-supplied markup can't escape into the settings UI - The subject tab shows a live rendered preview. 
The 200-character cap is enforced after render at dispatch time; the editor's character counter reflects template source length, not rendered length, so keep rendered subjects concise - The **fixture selector** picks which canonical incident context to render against (single HTTP failure, multi-location burst, browser check, MCP) - The **library** sidebar includes a branded incident-digest HTML example you can drop in as a starting point - **Send test** delivers to the channel's addresses via SES using the active fixture (60-second cooldown per channel) ### Edit via the API ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "email", "overrides": { "opened": { "subject": "[{{upperCase incident.severity}}] {{incident.title}}", "html": "
<h2>{{incident.title}}</h2><p>{{payload.hypothesis.summary}}</p><p><a href=\"{{incident.triageUrl}}\">View in Yorker</a></p>
", "text": "{{incident.title}}\n\n{{payload.hypothesis.summary}}\n\n{{incident.triageUrl}}" } } } }' ``` A render failure on any of the three parts falls back to the default version of **that** part, keeping the other overridden parts. Dispatch does not fail on a bad template. The HTML body part renders with Handlebars' default `{{...}}` **HTML escaping enabled**, so user-supplied strings in `incident.title`, `payload.hypothesis.summary`, etc. are safely escaped. The subject and text parts render with escaping disabled (plain text). Helper output is escaped too: `{{jsonBody x}}` is HTML-escaped by default. Use `{{{jsonBody x}}}` (triple-stash) only when you consciously need raw JSON interpolation into the HTML body; the double-stash form is always safe. Helpers and render context are the same as the [Slack integration](/docs/integrations/slack). ## SES requirements Yorker uses your platform's shared SES sender. Self-hosters must provision SES credentials via the `AWS_SES_REGION` / `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `SES_FROM_ADDRESS` environment variables. Dispatch fails with a logged error if SES is not configured. ## Disabling Set `incidentSubscribed: false` on the channel to fall back to the legacy per-alert email dispatch (one mail per alert, no aggregation). --- ## Webhook > Send every incident lifecycle event to your own HTTP endpoint as JSON, schema-versioned for forward compatibility. # Webhook The webhook integration posts a JSON body for every incident lifecycle event to your own HTTP endpoint. Use this for custom integrations, Opsgenie, Zapier, workflow engines, or anywhere Yorker doesn't ship a purpose-built adapter. For the underlying model (lifecycle states, event types, scoped hypothesis), see [Incidents](/docs/concepts/incidents). 
## Set up ```bash curl -X POST https://yorkermonitoring.com/api/notification-channels \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "name": "incident-sink", "channel": { "type": "webhook", "url": "https://hooks.example.com/yorker-incidents", "method": "POST", "headers": { "Authorization": "Bearer ${INCOMING_TOKEN}" } } }' ``` | Field | Required | Default | Description | | ---------- | -------- | ------- | -------------------------------------------- | | `url` | yes | n/a | Destination endpoint | | `method` | no | `POST` | `POST` or `PUT` | | `headers` | no | n/a | Extra headers (e.g., auth). A `Content-Type` header (any casing) is rejected at create/update time. | Yorker always sends `Content-Type: application/json`. A user-supplied `Content-Type` header would break the documented body-parser contract and is refused by the channel schema. ## What gets sent The webhook channel receives **every** incident event by default: - `opened` - `alert_attached` - `severity_changed` - `acknowledged` - `auto_resolved` - `closed` - `reopened` - `note_added` ### Default payload ```json { "schema_version": 1, "event": { "eventId": "ievt_001", "incidentId": "inc_abc", "teamId": "team_123", "eventType": "opened", "actor": { "type": "system", "id": null }, "payload": { "eventType": "opened", "observations": { "sources": ["synthetic_http"], "syntheticHttp": { "affectedChecks": [{ "checkId": "chk_api", "checkName": "Checkout API" }], "symptomWindow": { "startedAt": "2026-04-15T09:58:00.000Z" }, "errorSignature": { "httpStatusCodes": [503, 504], "errorCategories": ["upstream_error"], "locationsAffected": ["loc_us_east_1", "loc_eu_west_1"], "sampleMessages": ["Bad Gateway", "Gateway Timeout"] }, "sharedFailingDomains": ["api.stripe.com"] } }, "hypothesis": { "summary": "Stripe API is returning 503/504; checkout is blocked.", "confidence": 0.75, "ruledIn": ["shared_failing_domain=api.stripe.com"], "ruledOut": ["DNS resolution: NXDOMAIN not 
observed", "TLS: handshake completes"], "correlationDimensionsMatched": ["shared_failing_domain", "error_pattern"], "scope": "external_symptoms_only" }, "title": "Checkout API outage", "severity": "critical", "fingerprintHash": "…", "memberAlertInstanceIds": ["ainst_1", "ainst_2"], "recurrenceOf": [] }, "occurredAt": "2026-04-15T10:00:00.000Z" }, "incident": { "incidentId": "inc_abc", "title": "Checkout API outage", "severity": "critical", "state": "open", "openedAt": "2026-04-15T10:00:00.000Z", "triageUrl": "https://yorkermonitoring.com/dashboard/incidents/inc_abc" } } ``` ### schema_version Every default payload carries `schema_version: 1`. Gate your consumer on this field and Yorker will not silently break your integration when the default shape evolves: breaking changes bump the version; additive changes don't. ## Observations shape Each source in `observations.sources[]` (snake_case: `synthetic_http`, `synthetic_browser`, `synthetic_mcp`) has a matching camelCase block (`syntheticHttp`, `syntheticBrowser`, `syntheticMcp`) on the same object. A multi-source incident carries every relevant block. Example consumer: ```ts const obs = event.payload.observations; if (obs.sources.includes("synthetic_http")) { // obs.syntheticHttp is present const statusCodes = obs.syntheticHttp.errorSignature.httpStatusCodes; } ``` ## Template overrides Render your own JSON body with Handlebars. The rendered string must parse as valid JSON; on failure, the default payload is sent instead. ### Edit in the web UI Open **Settings > Notification Channels** and click **Templates** next to the webhook channel. The editor has JSON syntax highlighting, a live preview that renders your body against one of six canonical fixtures, and a **library** sidebar with two starter bodies (**Default: flat envelope** and **Nested: incident + event objects**) plus curated examples (PagerDuty Events API v2-shaped, OTel log record shape) you can apply with a click. 
**Send test** posts the current saved template to your webhook URL with the selected fixture's context (60-second cooldown per channel). The audit row for the test run is written to `incident_notification_dispatches` with `detail_json.isTest = true` so a real incident replay can be distinguished from a smoke-test. ### Edit via the API ```bash curl -X PUT https://yorkermonitoring.com/api/notification-channels/nch_abc \ -H "Authorization: Bearer $YORKER_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "incidentTemplate": { "channelType": "webhook", "overrides": { "opened": { "body": "{\"type\":\"incident.opened\",\"id\":\"{{incident.incidentId}}\",\"severity\":\"{{incident.severity}}\",\"hypothesis\":\"{{payload.hypothesis.summary}}\",\"triage\":\"{{incident.triageUrl}}\"}" } } } }' ``` For payloads that want to splat in arbitrary nested structure without mustache-ing every key, use `{{jsonBody payload}}`. The render context has the following top-level keys (same as the [Slack integration](/docs/integrations/slack)). The event envelope fields mirror `serializeIncidentEventForExport`: `eventId`, `eventType`, `incidentId`, `teamId`, `actor`, `occurredAt`, `payload`. In addition, a materialized `incident` snapshot (`title`, `severity`, `state`, `openedAt`, `triageUrl`) is exposed for direct use in templates. There is no top-level `event` key; use the individual fields or helper as shown below. ```json { "body": "{\"type\":\"{{eventType}}\",\"id\":\"{{eventId}}\",\"occurredAt\":\"{{occurredAt}}\",\"actor\":{{{jsonBody actor}}},\"payload\":{{{jsonBody payload}}},\"incident\":{{{jsonBody incident}}} }" } ``` Notes: - JSON-producing templates (webhook, Slack, PagerDuty, ServiceNow) compile with Handlebars HTML escaping disabled, so for these channels `{{foo}}` and `{{{foo}}}` produce identical output. Triple-stash is shown here by convention: it makes the intent (raw interpolation into JSON) obvious to readers. 
Email HTML templates compile with escaping on, where the two forms are NOT equivalent: `{{jsonBody payload}}` gets HTML-escaped by default (safe) and `{{{jsonBody payload}}}` is an explicit opt-out of escaping that the template author must choose deliberately. - Handlebars' tokenizer fails on a mustache close that runs directly into a JSON `}`. Both `{{{foo}}}}` (triple-close + literal) and `{{foo}}}` (double-close + literal) raise a parse error. Add a space before the JSON close brace (`{{{foo}}} }`) to disambiguate. The rendered JSON is otherwise unchanged; if your consumer verifies a canonical-JSON HMAC over the body, re-serialize (e.g., `JSON.stringify(JSON.parse(body))`) before hashing so whitespace differences don't break the signature. A render error or invalid-JSON result falls back to the default payload and logs a warning. Dispatch never fails on a bad template. Helpers and render context are the same as the [Slack integration](/docs/integrations/slack). ## Delivery and retry - **Timeout:** Yorker expects a response within the platform HTTP timeout. Slow endpoints risk being recorded as `failed`. - **Retry:** Yorker does not retry failed webhook deliveries on the same event. Use the [audit trail](/docs/concepts/incidents#audit-trail) (`incident_notification_dispatches`) to replay deliveries from your own backfill tooling. - **Dedupe:** Within a 30-second window, a duplicate event to the same channel is recorded as `skipped_dedupe` and not re-sent. This protects against runner retry bursts. ## Disabling Set `incidentSubscribed: false` to fall back to the legacy per-alert webhook dispatch. --- # Reference ## Configuration (YAML) > Complete reference for yorker.config.yaml — every field, type, default, and constraint. # Configuration (YAML) Yorker uses a declarative YAML file to define monitors, alerts, SLOs, and notification channels as code. 
The CLI validates every field through the same Zod schemas that the API uses, so defaults and constraints are applied identically. To get started, create a `yorker.config.yaml` in your project root. --- ## Root Fields | Field | Type | Required | Description | |---|---|---|---| | `project` | `string` | Yes | Project name. Displayed in deploy plan output. | | `alertChannels` | `object` | No | Named notification channel definitions (key-value map). | | `defaults` | `object` | No | Default values inherited by all monitors. | | `groups` | `array` | No | Groups of monitors that share frequency, locations, and alerts. | | `monitors` | `array` | No | Top-level monitor definitions (outside any group). | | `slos` | `array` | No | Service Level Objective definitions. | | `maintenanceWindows` | `array` | No | Scheduled silences / pauses. See [`maintenanceWindows`](#maintenancewindows). | ```yaml project: "my-app" alertChannels: # ... defaults: # ... groups: # ... monitors: # ... slos: # ... ``` --- ## `alertChannels` To configure notification channels, define them as named entries under `alertChannels`. Alerts and SLOs reference channels by name using `@channel-name` syntax. Each channel must have a `type` field. The remaining fields depend on the type. ### Slack | Field | Type | Required | Description | |---|---|---|---| | `type` | `"slack"` | Yes | Channel type. | | `webhookUrl` | `string` (URL) | Yes | Slack incoming webhook URL. | ```yaml alertChannels: ops-slack: type: slack webhookUrl: "https://hooks.slack.com/services/T00/B00/xxxx" ``` ### Email | Field | Type | Required | Description | |---|---|---|---| | `type` | `"email"` | Yes | Channel type. | | `addresses` | `string[]` | Yes | At least one valid email address. | ```yaml alertChannels: team-email: type: email addresses: - oncall@example.com - devops@example.com ``` ### Webhook | Field | Type | Required | Default | Description | |---|---|---|---|---| | `type` | `"webhook"` | Yes | -- | Channel type. 
| | `url` | `string` (URL) | Yes | -- | Webhook endpoint URL. | | `method` | `"POST"` \| `"PUT"` | No | `"POST"` | HTTP method. | | `headers` | `object` | No | -- | Custom headers (key-value string pairs). | ```yaml alertChannels: pagerduty: type: webhook url: "https://events.pagerduty.com/v2/enqueue" method: POST headers: Content-Type: "application/json" ``` --- ## `defaults` To set values that apply to all monitors unless overridden, use the `defaults` block. Monitors and groups can override any default. | Field | Type | Default | Description | |---|---|---|---| | `frequency` | `string` | `"5m"` (300s) | Check interval. Format: `Ns`, `Nm`, or `Nh`. Range: 10s-86400s. | | `locations` | `string[]` | `["loc_us_east", "loc_eu_central"]` | Location IDs to run from. Must be non-empty. | | `http` | `object` | -- | Default HTTP check configuration. | | `browser` | `object` | -- | Default browser check configuration. | | `alerts` | `array` | -- | Default alert rules applied to all monitors. | ### Frequency Format The frequency value uses a duration string with a numeric value and a unit suffix. | Unit | Suffix | Example | Seconds | |---|---|---|---| | Seconds | `s` | `30s` | 30 | | Minutes | `m` | `5m` | 300 | | Hours | `h` | `1h` | 3600 | The resolved value in seconds must be between **10** and **86400** (24 hours). ```yaml defaults: frequency: "5m" locations: - loc_us_east - loc_eu_central - loc_ap_northeast ``` ### `defaults.http` To configure default values for all HTTP monitors, use the `defaults.http` block. | Field | Type | Default | Description | |---|---|---|---| | `timeoutMs` | `number` | `30000` | Request timeout in milliseconds. | | `followRedirects` | `boolean` | `true` | Whether to follow HTTP redirects. | | `maxRedirects` | `number` | `5` | Maximum number of redirects to follow. | | `assertions` | `array` | `[]` | Default assertion rules. See [Assertions](/docs/reference/assertions). 
| ```yaml defaults: http: timeoutMs: 15000 followRedirects: true maxRedirects: 3 assertions: - type: status_code value: 200 - type: response_time max: 5000 ``` ### `defaults.browser` To configure default values for all browser monitors, use the `defaults.browser` block. | Field | Type | Default | Constraints | Description | |---|---|---|---|---| | `timeoutMs` | `number` | `30000` | 5000-120000 | Script execution timeout in milliseconds. | | `viewport` | `object` | `{ width: 1280, height: 720 }` | -- | Browser viewport dimensions. | | `viewport.width` | `number` | `1280` | -- | Viewport width in pixels. | | `viewport.height` | `number` | `720` | -- | Viewport height in pixels. | | `screenshotMode` | `string` | `"every_step"` | `every_step` \| `failure_only` \| `disabled` | When to capture screenshots. | | `videoEnabled` | `boolean` | `false` | -- | Whether to record video. | | `device` | `string` | -- | -- | Playwright device name for emulation (e.g., `"iPhone 14"`). | ```yaml defaults: browser: timeoutMs: 60000 viewport: width: 1920 height: 1080 screenshotMode: every_step videoEnabled: false ``` ### `defaults.alerts` To set alert rules that apply to all monitors by default, define them under `defaults.alerts`. Each alert requires at least one condition and at least one channel reference. ```yaml defaults: alerts: - name: "default-failure-alert" conditions: - type: consecutive_failures count: 3 channels: - "@ops-slack" ``` See [Groups and Monitors](#groups) for how alert inheritance works. --- ## `groups` To organize monitors that share configuration, use groups. Groups can override `defaults` for frequency, locations, and alerts. | Field | Type | Required | Description | |---|---|---|---| | `name` | `string` | Yes | Group name (for display in deploy plan). | | `frequency` | `string` | No | Overrides `defaults.frequency` for all monitors in this group. | | `locations` | `string[]` | No | Overrides `defaults.locations` for all monitors in this group. 
| | `alerts` | `array` | No | Overrides `defaults.alerts` for all monitors in this group. | | `monitors` | `array` | Yes | Monitor definitions within this group. | ```yaml groups: - name: "US API endpoints" frequency: "1m" locations: - loc_us_east - loc_us_west alerts: - name: "api-down" conditions: - type: consecutive_failures count: 2 channels: - "@ops-slack" - "@team-email" monitors: - name: "Users API" type: http url: "https://api.example.com/v1/users" - name: "Orders API" type: http url: "https://api.example.com/v1/orders" ``` --- ## `monitors` ### HTTP Monitors To define an HTTP monitor, set `type: http` and provide a `url`. | Field | Type | Required | Default | Constraints | Description | |---|---|---|---|---|---| | `name` | `string` | Yes | — | 1-255 characters | Unique monitor name. | | `type` | `"http"` | Yes | — | — | Monitor type. | | `url` | `string` | Yes | — | Valid URL | Target URL to check. | | `method` | `string` | No | `"GET"` | `GET` \| `POST` \| `PUT` \| `DELETE` \| `PATCH` \| `HEAD` | HTTP method. | | `headers` | `object` | No | — | — | Custom request headers (key-value string pairs). | | `body` | `string` | No | — | — | Request body (ignored for GET and HEAD). | | `auth` | `object` | No | — | — | Authentication configuration. See [Auth](#auth). | | `followRedirects` | `boolean` | No | `true` | — | Whether to follow redirects. Overrides `defaults.http`. | | `maxRedirects` | `number` | No | `5` | — | Maximum redirects. Overrides `defaults.http`. | | `timeoutMs` | `number` | No | `30000` | — | Request timeout in ms. Overrides `defaults.http`. | | `assertions` | `array` | No | `[]` | — | Assertion rules. **Replaces** defaults (not merged). See [Assertions](/docs/reference/assertions). | | `frequency` | `string` | No | From defaults/group | 10s-86400s | Check interval. | | `locations` | `string[]` | No | From defaults/group | Non-empty | Location IDs. | | `alerts` | `array` | No | From defaults/group | — | Alert rules. 
| | `labels` | `string[]` | No | — | See [Labels](#labels) | Labels attached to this check. Emitted as OTel resource attributes. | | `enabled` | `boolean` | No | `true` | — | Whether this monitor is active. | ```yaml monitors: - name: "Homepage" type: http url: "https://www.example.com" method: GET timeoutMs: 10000 assertions: - type: status_code value: 200 - type: response_time max: 3000 - type: body_contains value: "Welcome" - name: "Create Order API" type: http url: "https://api.example.com/v1/orders" method: POST headers: Content-Type: "application/json" body: '{"item": "test", "quantity": 1}' auth: type: bearer token: "{{secrets.API_TOKEN}}" assertions: - type: status_code value: 201 ``` ### Browser Monitors Browser monitors defined in `yorker.config.yaml` are always scripted — point `script` at a Playwright TypeScript file and Yorker runs it on each check. URL-mode browser monitors (which navigate a single URL without a script) are currently created through the Web UI or the REST API only; they cannot be deployed via YAML yet. | Field | Type | Required | Default | Constraints | Description | |---|---|---|---|---|---| | `name` | `string` | Yes | — | 1-255 characters | Unique monitor name. | | `type` | `"browser"` | Yes | — | — | Monitor type. | | `script` | `string` | Yes | — | — | Path to Playwright script file (relative to config file). | | `steps` | `array` | No | — | — | Optional named steps. Each entry has `name` (must match a `// @step: Name` marker in the script), optional `timeoutMs`, and optional `assertions`. | | `viewport` | `object` | No | `{ width: 1280, height: 720 }` | — | Browser viewport dimensions. Overrides `defaults.browser`. | | `device` | `string` | No | — | — | Playwright device name for emulation. Overrides `defaults.browser`. | | `screenshotMode` | `string` | No | `"every_step"` | `every_step` \| `failure_only` \| `disabled` | Screenshot capture mode. Overrides `defaults.browser`. 
| | `videoEnabled` | `boolean` | No | `false` | — | Whether to record video. Overrides `defaults.browser`. | | `timeoutMs` | `number` | No | `30000` | 5000-120000 | Script timeout in ms. Overrides `defaults.browser`. | | `frequency` | `string` | No | From defaults/group | 10s-86400s | Check interval. | | `locations` | `string[]` | No | From defaults/group | Non-empty | Location IDs. | | `alerts` | `array` | No | From defaults/group | — | Alert rules. | | `labels` | `string[]` | No | — | See [Labels](#labels) | Labels attached to this check. | | `enabled` | `boolean` | No | `true` | — | Whether this monitor is active. | ```yaml monitors: - name: "Login Flow" type: browser script: "./monitors/login.ts" viewport: width: 1920 height: 1080 screenshotMode: every_step timeoutMs: 60000 frequency: "10m" locations: - loc_us_east - loc_eu_west ``` ### MCP Monitors To define an MCP monitor (for checking Model Context Protocol servers over Streamable HTTP), set `type: mcp` and provide an `endpoint`. | Field | Type | Required | Default | Constraints | Description | |---|---|---|---|---|---| | `name` | `string` | Yes | — | 1-255 characters | Unique monitor name. | | `type` | `"mcp"` | Yes | — | — | Monitor type. | | `endpoint` | `string` | Yes | — | Valid URL | Streamable HTTP endpoint of the MCP server. | | `timeoutMs` | `number` | No | `30000` | 5000-120000 | Request timeout in ms. | | `auth` | `object` | No | — | — | Same shape as HTTP [`auth`](#auth). | | `expectedTools` | `string[]` | No | — | — | Tool names that must be present. Missing tools fail the check. | | `testCalls` | `array` | No | — | — | Tool invocations to exercise. See below. | | `detectSchemaDrift` | `boolean` | No | `true` | — | Emit events when the tool list or tool signatures change. | | `frequency` | `string` | No | From defaults/group | 10s-86400s | Check interval. | | `locations` | `string[]` | No | From defaults/group | Non-empty | Location IDs. 
| | `alerts` | `array` | No | From defaults/group | — | Alert rules. | | `labels` | `string[]` | No | — | See [Labels](#labels) | Labels attached to this check. | | `enabled` | `boolean` | No | `true` | — | Whether this monitor is active. | `testCalls` entry fields: | Field | Type | Required | Description | |---|---|---|---| | `toolName` | `string` | Yes | Name of the tool to invoke. | | `arguments` | `object` | No | Plain JSON key-value map of arguments passed to the tool. | | `expectedOutputContains` | `string` | No | Substring that must appear in the tool's result. | ```yaml monitors: - name: "Docs MCP" type: mcp endpoint: "https://mcp.example.com/sse" frequency: "5m" auth: type: bearer token: "{{secrets.MCP_TOKEN}}" expectedTools: - search_docs - fetch_page testCalls: - toolName: search_docs arguments: query: "pricing" expectedOutputContains: "Plans" detectSchemaDrift: true locations: - loc_us_east ``` --- ## `auth` To authenticate HTTP requests, add an `auth` block to an HTTP monitor. Three authentication types are supported. ### Basic Auth | Field | Type | Required | Description | |---|---|---|---| | `type` | `"basic"` | Yes | Auth type. | | `username` | `string` | Yes | Username. | | `password` | `string` | Yes | Password. | ```yaml auth: type: basic username: "{{secrets.BASIC_USER}}" password: "{{secrets.BASIC_PASS}}" ``` ### Bearer Token | Field | Type | Required | Description | |---|---|---|---| | `type` | `"bearer"` | Yes | Auth type. | | `token` | `string` | Yes | Bearer token value. | ```yaml auth: type: bearer token: "{{secrets.API_TOKEN}}" ``` ### API Key | Field | Type | Required | Default | Description | |---|---|---|---|---| | `type` | `"api-key"` | Yes | -- | Auth type. | | `header` | `string` | No | `"X-API-Key"` | Header name to send the key in. | | `value` | `string` | Yes | -- | API key value. 
| ```yaml auth: type: api-key header: "X-Custom-Key" value: "{{secrets.CUSTOM_API_KEY}}" ``` --- ## `slos` To define Service Level Objectives, add entries to the `slos` array. Each SLO is linked to a monitor by name. | Field | Type | Required | Default | Constraints | Description | |---|---|---|---|---|---| | `name` | `string` | Yes | -- | 1-255 characters | Unique SLO name. | | `monitor` | `string` | Yes | -- | Must match a monitor `name` | Reference to the monitor this SLO tracks. | | `target` | `string` \| `number` | Yes | -- | Resolves to 1-9999 basis points | Availability target. `"99.9%"` or `99.9`. | | `window` | `string` \| `number` | Yes | -- | `"7d"` \| `"14d"` \| `"30d"` or `7` \| `14` \| `30` | Rolling window. | | `burnRateAlerts` | `boolean` | No | `true` | -- | Enable burn rate alerting. | | `channels` | `array` | No | `[]` | -- | Channel references for burn rate alerts (e.g., `["@ops-slack"]`). | | `enabled` | `boolean` | No | `true` | -- | Whether this SLO is active. | ### Target Format The target can be specified as a percentage string or a number: - `"99.9%"` -- parsed as 9990 basis points - `99.9` -- treated as a percentage, parsed as 9990 basis points - Valid range: 1-9999 basis points (0.01% to 99.99%) ### Window Format The window can be a duration string or a number: - `"7d"`, `"14d"`, `"30d"` -- string format - `7`, `14`, `30` -- numeric format (days) - Only these three values are allowed. ```yaml slos: - name: "Homepage Availability" monitor: "Homepage" target: "99.9%" window: "30d" burnRateAlerts: true channels: - "@ops-slack" - name: "API Uptime" monitor: "Users API" target: 99.95 window: 7d channels: - "@ops-slack" - "@team-email" ``` --- ## `maintenanceWindows` To silence alerts during scheduled work, add entries to `maintenanceWindows`. A window can pause checks entirely or let them continue running while suppressing notifications. 
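As a mental model of the two modes — a sketch of the semantics described above, not Yorker's scheduler code (all names here are illustrative, and only non-recurring windows are modeled):

```typescript
// Illustrative model of maintenance-window semantics: a `pause` window stops
// checks from running; either mode silences alerts while it covers a monitor.
type MaintenanceWindow = {
  mode: "pause" | "continue";
  checks: "all" | string[];
  startsAt: string; // ISO-8601
  endsAt: string;   // ISO-8601, after startsAt
};

// A window covers a monitor when the time is in range and the monitor is in scope.
function covers(w: MaintenanceWindow, monitor: string, at: Date): boolean {
  const inRange = at >= new Date(w.startsAt) && at < new Date(w.endsAt);
  const inScope = w.checks === "all" || w.checks.includes(monitor);
  return inRange && inScope;
}

// Checks stop running only under a `pause` window...
function isPaused(ws: MaintenanceWindow[], monitor: string, at: Date): boolean {
  return ws.some((w) => w.mode === "pause" && covers(w, monitor, at));
}

// ...but alerts are silenced under either mode.
function isSilenced(ws: MaintenanceWindow[], monitor: string, at: Date): boolean {
  return ws.some((w) => covers(w, monitor, at));
}
```

A recurring window would additionally expand its `recurrenceRule` into concrete occurrences before applying the same coverage test.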
> **Single-file format only.** `maintenanceWindows` is supported when you deploy from a single `yorker.config.yaml`. The directory format (`yorker init --format directory`) does not currently support a per-resource maintenance-window YAML file. If you need maintenance windows in code, stay on the single-file format. Also note that `yorker pull` does not export maintenance windows and overwrites the YAML file; see the [CLI reference for `yorker pull`](/docs/reference/cli#yorker-pull) for the round-trip caveat. | Field | Type | Required | Default | Description | |---|---|---|---|---| | `name` | `string` | Yes | — | Unique window name. | | `mode` | `string` | No | `pause` | `pause` (stop running checks) or `continue` (run but silence alerts). | | `checks` | `"all"` \| `string[]` | Yes | — | `"all"` or a list of monitor names covered by the window. | | `startsAt` | `string` | Yes | — | ISO-8601 start timestamp. | | `endsAt` | `string` | Yes | — | ISO-8601 end timestamp. Must be after `startsAt`. | | `recurring` | `boolean` | No | `false` | Enable recurrence. | | `recurrenceRule` | `string` | No | — | RRULE string (e.g., `FREQ=WEEKLY;BYDAY=SU`). Required when `recurring: true`. | ```yaml maintenanceWindows: - name: "Weekly DB maintenance" mode: pause checks: all startsAt: "2026-04-12T02:00:00Z" endsAt: "2026-04-12T03:00:00Z" recurring: true recurrenceRule: "FREQ=WEEKLY;BYDAY=SU" - name: "Black Friday cut-over" mode: continue checks: - "Homepage" - "Checkout Flow" startsAt: "2026-11-27T05:00:00Z" endsAt: "2026-11-27T06:00:00Z" ``` --- ## `alerts` (alert rule definitions) Alert rules live on monitors, in `groups`, or in `defaults`. Each rule has a list of `conditions` (ANDed together) and `channels` (channel references). | Field | Type | Required | Default | Description | |---|---|---|---|---| | `name` | `string` | No | — | Optional rule name. | | `enabled` | `boolean` | No | `true` | Whether the rule is active. 
| | `conditions` | `array` | Yes | — | At least one condition. | | `channels` | `string[]` | Yes | — | Channel references using `@channel-name` syntax. | ### Condition types | Type | Fields | Description | |---|---|---| | `consecutive_failures` | `count` (default `2`, min `1`) | Trigger after N failures in a row. | | `response_time_threshold` | `maxMs` (required) | Trigger when response time exceeds threshold. | | `multi_location_failure` | `minLocations` (default `2`, min `2`), `windowSeconds` (default `300`) | Trigger when failures correlate across multiple locations. | | `ssl_expiry` | `daysBeforeExpiry` (default `14`, min `1`), `severity` (optional) | Trigger when SSL cert approaches expiration. | | `ssl_certificate_changed` | `severity` (optional) | Trigger when the leaf certificate fingerprint changes between runs. | | `ssl_self_signed` | `severity` (optional) | Trigger when a self-signed or untrusted certificate is detected. | | `ssl_protocol_deprecated` | `minProtocol` (default `TLSv1.2`; allowed `TLSv1.2`, `TLSv1.3`), `severity` (optional) | Trigger when the handshake negotiates a protocol older than `minProtocol`. | | `burn_rate` | `sloId`, `burnRateThreshold`, `longWindowMinutes` (min `60`), `shortWindowMinutes` (min `5`, must be less than long) | SLO burn-rate alert. Most users let SLOs generate burn-rate alerts automatically via `burnRateAlerts: true`. | | `baseline_anomaly` | `metric` (required; `response_time`, `dns_lookup`, `tls_handshake`, `ttfb`, `content_transfer`, `lcp`, `fcp`, or `cls`), `sigmaThreshold` (default `3`, 2–10), `consecutiveCount` (default `3`, integer 2–20), `direction` (default `above`; allowed `above`, `below`, `both`), `severity` (default `warning`) | Trigger when the last `consecutiveCount` runs are all successes and each deviates by more than `sigmaThreshold`·σ from its own (hour × day-of-week × location) baseline in the configured direction. Any non-success run inside the window breaks the chain. 
| All SSL conditions (including `ssl_expiry`), `mcp_schema_drift`, and `baseline_anomaly` support an optional `severity` field: `critical`, `warning`, or `info`. `mcp_schema_drift` and `baseline_anomaly` default to `warning` from the schema; SSL conditions fall back to `critical` via the evaluator when unset. See [Set Up Alerts](/docs/guides/set-up-alerts) for worked examples. --- ## Labels Labels attach metadata to checks. They serve two purposes: 1. **Filtering and grouping** in the dashboard. 2. **OTel resource attributes** — Yorker emits every label as a `yorker.label.*` resource attribute on metrics and traces, so you can slice telemetry by label in your observability backend. Labels follow this format: `[a-zA-Z0-9][a-zA-Z0-9_.:-]*`, max 128 characters. | Label form | OTel attribute | |---|---| | `env:production` | `yorker.label.env="production"` | | `service:payments` | `yorker.label.service="payments"` | | `critical` *(no colon)* | `yorker.label.critical="true"` | ```yaml monitors: - name: "Payments API" type: http url: "https://api.example.com/payments" labels: - env:production - service:payments - critical ``` | Behavior | Meaning | |---|---| | `labels` omitted | Labels are **unmanaged** by config — the CLI preserves whatever labels exist on the remote. | | `labels: []` | Explicitly clears all labels on the check. | | `labels: [...]` | Sets the full list of labels on the check. | --- ## Secret Interpolation To inject secrets and environment variables into your config, use placeholder syntax. Placeholders are resolved at deploy time from environment variables. ### `{{secrets.NAME}}` Reads `YORKER_SECRET_NAME` first, then falls back to `NAME` from the environment. ```yaml auth: type: bearer token: "{{secrets.API_TOKEN}}" # Resolves: $YORKER_SECRET_API_TOKEN ?? $API_TOKEN ``` ### `{{env.NAME}}` Reads the environment variable `NAME` directly. 
```yaml url: "{{env.BASE_URL}}/health" # Resolves: $BASE_URL ``` ### `${NAME}` (Legacy) Reads the environment variable `NAME` directly. This is the legacy syntax, supported only in YAML config values. It is **not** applied inside browser script files because `${...}` conflicts with JavaScript template literals. ```yaml url: "${BASE_URL}/health" # Resolves: $BASE_URL ``` ### Script Interpolation Browser script files (`.ts` files referenced by `script:`) support `{{secrets.NAME}}` and `{{env.NAME}}` interpolation. The legacy `${NAME}` syntax is intentionally excluded from scripts to avoid conflicts with JavaScript template literals. ### Unresolved Placeholders Each placeholder that cannot be resolved produces a warning, and its raw text is left in place. Once interpolation completes, the CLI scans for any remaining unresolved placeholders and fails with an error listing each one and its location in the config, so an unresolved placeholder is never deployed as a literal value. --- ## Inheritance Rules Configuration values cascade through three levels: **defaults**, **group**, and **monitor**. The most specific value wins. ``` monitor > group > defaults ``` | Setting | Behavior | |---|---| | `frequency` | Monitor overrides group, group overrides defaults. Falls back to 300s (5m). | | `locations` | Monitor overrides group, group overrides defaults. Falls back to `["loc_us_east", "loc_eu_central"]`. | | `alerts` | Monitor overrides group, group overrides defaults. **Explicit presence (even an empty array) is an intentional override.** An empty `alerts: []` on a monitor disables all alerts for that monitor. | | `assertions` | Monitor-level assertions **replace** defaults entirely. They are **not merged** with `defaults.http.assertions`. This matches Terraform/Checkly semantics. | | `labels` | Monitor-only (no cascade). Omitting leaves labels unmanaged. `labels: []` clears all labels. | | HTTP config (`timeoutMs`, `followRedirects`, `maxRedirects`) | Monitor overrides defaults.
| | Browser config (`timeoutMs`, `viewport`, `device`, `screenshotMode`, `videoEnabled`) | Monitor overrides defaults. | --- ## Available Locations To see all available locations and their IDs, call `GET /api/locations`. | Location ID | Display Name | Fly Region | |---|---|---| | `loc_us_east` | US East (Ashburn) | `iad` | | `loc_us_south` | US South (Dallas) | `dfw` | | `loc_us_west` | US West (Los Angeles) | `lax` | | `loc_na_north` | Canada (Toronto) | `yyz` | | `loc_sa_east` | South America (São Paulo) | `gru` | | `loc_eu_west` | Europe West (London) | `lhr` | | `loc_eu_west_2` | Europe West (Paris) | `cdg` | | `loc_eu_central` | Europe Central (Frankfurt) | `fra` | | `loc_eu_north` | Europe North (Stockholm) | `arn` | | `loc_ap_southeast` | Asia Pacific (Singapore) | `sin` | | `loc_ap_northeast` | Asia Pacific (Tokyo) | `nrt` | | `loc_ap_south` | Asia Pacific (Mumbai) | `bom` | | `loc_ap_oceania` | Oceania (Sydney) | `syd` | | `loc_af_south` | Africa (Johannesburg) | `jnb` | --- ## Full Example ```yaml project: "acme-production" alertChannels: ops-slack: type: slack webhookUrl: "{{secrets.SLACK_WEBHOOK_URL}}" oncall-email: type: email addresses: - oncall@acme.com pagerduty: type: webhook url: "https://events.pagerduty.com/v2/enqueue" method: POST headers: Content-Type: "application/json" defaults: frequency: "5m" locations: - loc_us_east - loc_eu_central http: timeoutMs: 15000 followRedirects: true assertions: - type: status_code value: 200 browser: timeoutMs: 60000 screenshotMode: every_step alerts: - name: "default-alert" conditions: - type: consecutive_failures count: 3 channels: - "@ops-slack" groups: - name: "Critical APIs" frequency: "1m" locations: - loc_us_east - loc_us_west - loc_eu_west alerts: - name: "critical-api-alert" conditions: - type: consecutive_failures count: 2 - type: multi_location_failure minLocations: 2 windowSeconds: 300 channels: - "@ops-slack" - "@pagerduty" monitors: - name: "Auth API" type: http url: 
"https://api.acme.com/v1/auth/health" - name: "Payments API" type: http url: "https://api.acme.com/v1/payments/health" assertions: - type: status_code value: 200 - type: response_time max: 1000 monitors: - name: "Homepage" type: http url: "https://www.acme.com" - name: "Checkout Flow" type: browser script: "./monitors/checkout.ts" frequency: "10m" locations: - loc_us_east alerts: - name: "checkout-alert" conditions: - type: consecutive_failures count: 1 channels: - "@ops-slack" - "@oncall-email" slos: - name: "Auth API SLO" monitor: "Auth API" target: "99.95%" window: "30d" channels: - "@ops-slack" - name: "Homepage SLO" monitor: "Homepage" target: "99.9%" window: "7d" burnRateAlerts: true channels: - "@ops-slack" ``` --- ## CLI > Complete reference for the Yorker CLI — all commands, flags, and environment variables. # CLI The Yorker CLI manages monitors, alerts, SLOs, notification channels, and private locations. Use it for infrastructure-as-code deployments from `yorker.config.yaml`, or manage resources imperatively with CRUD commands. --- ## Installation ```bash npm install -g @yorker/cli ``` Or locally in a project: ```bash npm install --save-dev @yorker/cli ``` --- ## Authentication Generate an API key from **Settings > API Keys** in the dashboard, then set it: ```bash export YORKER_API_KEY=sk_your_key_here ``` --- ## Global Flags These flags are available on every command: | Flag | Description | |---|---| | `--json` | Output in structured JSON format. | | `--quiet` | Suppress non-essential output. | | `--no-color` | Disable colored output. | | `--verbose` | Enable verbose/debug output. | | `--api-url <url>` | Control plane URL (default: `https://app.yorkermonitoring.com`). Also settable via `YORKER_API_URL`. | | `--api-key <key>` | API key. Also settable via `YORKER_API_KEY`. | | `--help`, `-h` | Show help. | | `--version`, `-V` | Show version.
| --- ## Environment Variables | Variable | Required | Default | Description | |---|---|---|---| | `YORKER_API_KEY` | Yes | — | API key for authentication. Keys start with `sk_`. | | `YORKER_API_URL` | No | `https://app.yorkermonitoring.com` | Control plane base URL. Override for self-hosted or development. | --- ## Exit Codes | Code | Meaning | |---|---| | `0` | Success. | | `1` | General error (validation failure, API error, missing config). | | `2` | Authentication failure (invalid or missing API key). | | `3` | Plan/quota limit exceeded. | | `4` | Partial failure (some operations succeeded, others failed). | | `5` | Drift detected — remote resources were modified outside the CLI. | | `10` | `yorker status` reports one or more unhealthy monitors. | --- ## Commands ### `yorker init` Scaffold a new monitoring config in the current directory. Interactive prompts guide you through project name, first monitor URL, type, and frequency. In non-interactive contexts (CI, `--json`, or a non-TTY shell) you must pass `--url` (and optionally the other fields) as flags or the command errors out. ```bash yorker init [--name <name>] [--url <url>] [--type <type>] [--frequency <frequency>] [--format file|directory] [--force] ``` | Flag | Description | |---|---| | `--name <name>` | Project name (skips prompt). | | `--url <url>` | First monitor URL (skips prompt). Required in non-interactive contexts. | | `--type <type>` | Monitor type: `http`, `browser`, or `mcp` (skips prompt). Default: `http`. | | `--frequency <frequency>` | Check frequency, e.g. `30s`, `5m`, `1h` (skips prompt). Default: `5m`. | | `--format <file|directory>` | Output layout: `file` (a single `yorker.config.yaml`, the default) or `directory` (a `.yorker/` directory with one YAML file per resource). | | `--force` | Overwrite existing config without asking. | The generated config uses the two default locations `loc_us_east` and `loc_eu_central` and the frequency you choose.
HTTP monitors get a starter `status_code: 200` assertion; browser monitors get a `./monitors/<name>.ts` script stub; MCP monitors just get the endpoint URL. --- ### `yorker deploy` Push your local `yorker.config.yaml` to the control plane. Computes a diff against remote state, displays a Terraform-style plan, and applies changes. ```bash yorker deploy [--dry-run] [--prune] [--force] [--accept-remote] [--wait] ``` | Flag | Description | |---|---| | `--dry-run` | Show the deploy plan without applying changes. | | `--prune` | Delete remote resources not defined in the local config. | | `--force` | Overwrite remote changes (ignore drift). Local config wins. | | `--accept-remote` | Skip drifted resources, keeping their remote changes. | | `--wait` | After a successful create, poll for the first check result from each new monitor (3-minute overall timeout). Exits with code `4` (`PARTIAL_FAILURE`) if any new check's first result is a failure, or if waiting for a first result times out. Useful in CI to catch broken monitors before merging. | > `--force` and `--accept-remote` are mutually exclusive — pass one or the other, not both. #### Deploy pipeline 1. **Load** — reads `yorker.config.yaml` from the current directory. 2. **Interpolate** — resolves `{{secrets.NAME}}`, `{{env.NAME}}`, and `${NAME}` placeholders from environment variables. 3. **Validate** — parses every resource through shared Zod schemas. Fails with all errors collected. 4. **Resolve** — applies cascading defaults (defaults → group → monitor), reads browser script files, builds API payloads. 5. **Fetch remote state** — gets all checks, alerts, SLOs, channels, and maintenance windows from the API. 6. **Detect drift** — compares remote `updatedAt` timestamps against the stored deploy state. See [Drift detection](#drift-detection) below. 7. **Compute plan** — field-level diffs between local and remote state. 8. **Display plan** — Terraform-style plan output. 9.
**Apply** — if not `--dry-run`, applies changes in ordered phases. 10. **Save state** — writes `.yorker/.deploy-state.json` with config hashes and remote timestamps. #### Deploy phases Changes are applied in dependency order: | Phase | Action | Why | |---|---|---| | 0 | Create/update notification channels | Channels must exist before alerts or SLOs reference them. | | A | Delete SLOs | Must be deleted before their parent checks. | | B | Delete alerts | Must be deleted before their parent checks. | | C | Delete checks | Safe after dependent resources removed. | | D+E | Create and update checks | New checks get server-assigned IDs needed by alerts and SLOs. | | E2 | Sync check labels | Labels are a separate API, applied after checks exist so new check IDs are known. | | F | Create/update alerts | Alerts reference check IDs and channel IDs. | | G | Create/update SLOs | SLOs reference check IDs and channel IDs. | | H | Create/update maintenance windows | Maintenance windows reference check IDs, so they run after checks exist. | | I | Delete maintenance windows (only with `--prune`) | Stale maintenance windows are only removed when you opt in via `--prune`. | | Z | Delete channels | Runs last so alerts/SLOs referring to channels are gone first. Delete only runs when `--prune` is set, or when a channel's `type` changed in YAML (type changes require delete-and-recreate). | #### Plan output ``` Yorker deploy plan for "my-project" Channels: + CREATE "ops-slack" = UNCHANGED "team-email" Checks: + CREATE http "Users API" (60s, 3 locations) ~ UPDATE http "Homepage" ~ configJson.timeoutMs 30000 -> 15000 - DELETE http "Old Endpoint" = UNCHANGED http "Orders API" Alerts: + CREATE "api-down" (Users API) SLOs: ~ UPDATE "Homepage SLO" ~ targetBasisPoints 9990 -> 9995 Summary: 2 to create, 2 to update, 1 to delete, 2 unchanged ``` | Symbol | Meaning | |---|---| | `+` CREATE | Resource will be created. | | `~` UPDATE | Resource will be updated. Field-level diffs shown below. 
| | `-` DELETE | Resource will be deleted (requires `--prune` or type change). | | `=` UNCHANGED | No action needed. | #### Drift detection The CLI tracks the state of each resource after every deploy in `.yorker/.deploy-state.json`. On the next deploy, it compares remote `updatedAt` timestamps against the stored state to detect resources modified outside the CLI (e.g., via the web UI). | Local changed? | Remote changed? | Result | |---|---|---| | No | No | Unchanged — skip. | | Yes | No | Normal update — proceed. | | No | Yes | **Drift** — remote was edited outside the CLI. | | Yes | Yes | **Conflict** — both local config and remote were changed. | If drift or conflicts are detected, the deploy aborts with a report: ``` Warning: 1 drifted, 1 conflicting resource(s) modified outside config Checks: ! DRIFT "api-health" (remote changed since last deploy) Alerts: ! CONFLICT "api-health:latency" (both local and remote changed) Use --force to overwrite remote changes, or --accept-remote to keep them. ``` Resolution options: | Flag | Behavior | |---|---| | *(default)* | Abort with drift/conflict report. | | `--force` | Local config wins — overwrite remote changes. | | `--accept-remote` | Skip drifted/conflicting resources — keep remote as-is. | The deploy state file `.yorker/.deploy-state.json` is per-machine state. Add it to your `.gitignore` to avoid committing it. Every successful `yorker pull` rewrites this file with a fresh snapshot of remote state, so the next deploy treats everything as a clean baseline. --- ### `yorker diff` Show changes between local config and remote state without applying them. ```bash yorker diff ``` Displays the same Terraform-style plan as `yorker deploy --dry-run`. Designed for CI pipelines and quick checks — see [CI/CD Integration](/docs/guides/ci-cd) for full workflow examples. --- ### `yorker pull` Export remote monitors, alerts, SLOs, and notification channels to a local config. The reverse of `yorker deploy`. 
```bash
yorker pull [--output <path>] [--format file|directory] [--monitor <name-or-id>] [--force]
```

| Flag | Description |
|---|---|
| `-o, --output <path>` | Output path. Default: `yorker.config.yaml` in file format, or `.yorker/` in directory format. |
| `--format <format>` | Output layout: `file` (single `yorker.config.yaml`, default) or `directory` (a `.yorker/` tree with one YAML file per resource). |
| `--monitor <name-or-id>` | Pull a single monitor by name or ID. |
| `--force` | Overwrite existing config file and scripts. |

Fetches all monitors, alerts, SLOs, and notification channels and writes a config file that `yorker deploy` can consume. Browser monitor scripts are extracted to `./monitors/<slug>.ts`, where `<slug>` is the monitor name lowercased with non-alphanumeric characters replaced by hyphens (e.g., `Login Flow` becomes `login-flow.ts`).

> **`yorker pull` does not export maintenance windows and overwrites your entire config file.** If you manage maintenance windows through YAML, running `pull` will erase them from your local file (remote maintenance windows are not touched, but they will not reappear in the re-pulled YAML). If you need to round-trip safely, either avoid `pull` on projects that use `maintenanceWindows`, or re-add them to the YAML after each pull. A subsequent `yorker deploy` without `--prune` leaves remote maintenance windows alone; `yorker deploy --prune` will delete any remote maintenance windows that are not in the (now empty) local YAML.

---

### `yorker import`

> **Coming soon.** The `import` command is a scaffold today — running it reports the selected source, the source and output paths, and a link to the changelog where the release will land, then exits with code `1`. The importers themselves ship in a follow-up release.

Import existing monitors from another synthetic monitoring tool into a Yorker config. The resulting YAML is ready to deploy with `yorker deploy`, with inline `TODO` comments on anything that doesn't translate cleanly.
```bash
yorker import --from <source> [--output <dir>] [path]
```

| Flag / Argument | Description |
|---|---|
| `--from <source>` | **Required.** Source tool to import from. See supported sources below. |
| `--output <dir>` | Directory to write the generated Yorker YAML into. Default: `.` (current directory). |
| `[path]` | Path to the source repo or config file (e.g. `./src/__checks__` for Checkly, `./tests` for a Playwright project). |

**Supported sources** (planned):

| Source | `--from` value |
|---|---|
| Checkly | `checkly` |
| Datadog Synthetics | `datadog-synthetics` |
| Better Stack | `better-stack` |
| Pingdom | `pingdom` |
| Grafana Synthetic Monitoring | `grafana-synthetic-monitoring` |
| Playwright | `playwright` |

Example (once available):

```bash
# Import a Checkly project
yorker import --from checkly ./src/__checks__

# Import a Playwright test suite as browser monitors
yorker import --from playwright ./tests --output ./yorker/
```

Watch the [changelog](/changelog) for release dates. Until the importers land, the [comparison pages](/compare) document the migration story for each source.

---

### `yorker validate`

Validate your config without deploying. Runs the full load, interpolate, and validate pipeline without contacting the API.

```bash
yorker validate
```

Validation includes YAML syntax, Zod schema validation, secret interpolation, browser script file existence, frequency range checks, name uniqueness, SLO/alert cross-references, channel resolution, and maintenance-window time ranges.

```
Config is valid: 5 monitor(s), 2 SLO(s)
```

On failure, all errors are collected:

```
3 config error(s):
  monitors[Homepage].httpConfig.url: Invalid url
  monitors[Login Flow]: cannot read script file "./monitors/missing.ts"
  slos[API SLO]: references monitor "Nonexistent" which does not exist
```

---

### `yorker status`

Show a status overview of all monitors.

```bash
yorker status [--watch] [--interval <duration>]
```

| Flag | Description |
|---|---|
| `--watch` | Continuously refresh. In an interactive TTY the screen is cleared and redrawn in place. In `--json` mode each cycle emits a single newline-delimited JSON object. When stdout is piped (non-TTY, non-JSON), each snapshot is appended with a `--- Status @ <timestamp> ---` header. Press `Ctrl+C` to exit. |
| `--interval <duration>` | Refresh interval for `--watch` (default: `30s`, range: `5s` to `1h`). Ignored without `--watch`. |

Displays each monitor with its enabled/paused state, type, frequency, health, and recent-result sparkline. Exits with code `10` when any monitor is unhealthy.

---

### `yorker dashboard`

Open an interactive full-screen status dashboard in your terminal. Polls the API on a configurable interval and renders monitor health, recent results, and sparklines.

```bash
yorker dashboard [--interval <duration>]
```

| Flag | Description |
|---|---|
| `--interval <duration>` | Refresh interval (default: `30s`, range: `5s` to `1h`). |

**Keybindings** (list view):

| Key | Action |
|---|---|
| `↑` / `k` | Move selection up |
| `↓` / `j` | Move selection down |
| `Enter` | Open detail view for the selected monitor |
| `/` | Filter monitors by name |
| `r` | Refresh now (outside the polling interval) |
| `q` / `Ctrl+C` | Quit |

In the detail view, press `Esc` to return to the list, `q` to quit, or `r` to refresh.

This is the interactive TUI (singular `dashboard`). For installing HyperDX / ClickStack dashboard packs, see `yorker dashboards install` below.

---

### `yorker dashboards install`

Install pre-built Yorker dashboards into HyperDX (self-hosted) or ClickStack Cloud. See [Install Dashboards](/docs/guides/install-dashboards) for full setup.
```bash
# Self-hosted HyperDX
yorker dashboards install --hyperdx-api-key <key>

# ClickStack Cloud
yorker dashboards install --mode clickstack_cloud \
  --clickstack-org-id <org-id> \
  --clickstack-service-id <service-id> \
  --clickstack-api-key-id <key-id> \
  --clickstack-api-secret <secret>
```

| Flag | Description |
|---|---|
| `--mode <mode>` | `self_hosted` (default) or `clickstack_cloud` (env: `HYPERDX_MODE`) |
| `--hyperdx-api-key <key>` | HyperDX API key for self-hosted mode (env: `HYPERDX_API_KEY`) |
| `--hyperdx-url <url>` | Override HyperDX API URL for self-hosted (default: `https://www.hyperdx.io`) |
| `--clickstack-org-id <id>` | ClickHouse Cloud organization ID (env: `CLICKSTACK_ORG_ID`) |
| `--clickstack-service-id <id>` | ClickHouse Cloud service ID (env: `CLICKSTACK_SERVICE_ID`) |
| `--clickstack-api-key-id <id>` | ClickHouse Cloud API key ID (env: `CLICKSTACK_API_KEY_ID`) |
| `--clickstack-api-secret <secret>` | ClickHouse Cloud API key secret (env: `CLICKSTACK_API_SECRET`) |
| `--packs <packs>` | Comma-separated subset: `overview`, `deep-dive`, `status-page`, `trace-correlation` |
| `--force` | Skip duplicate check (may create additional copies) |

---

### `yorker log`

View config change history. Shows what changed, when, and how.

```bash
yorker log [--type <type>] [--name <name>] [--source <source>] [--limit <n>] [--offset <n>]
```

| Flag | Description |
|---|---|
| `--type <type>` | Filter by resource type (`check`, `alert`, `slo`, `channel`). |
| `--name <name>` | Filter by resource name (case-insensitive substring match). |
| `--source <source>` | Filter by change source (`ui`, `yaml`, `api`). |
| `--limit <n>` | Number of changes to show (default: 20, max: 200). |
| `--offset <n>` | Skip the first N changes for pagination (cannot be used together with `--name`). |

```
Config changes (showing 5):

Time    Action  Type     Name            Source
2m ago  create  check    api-health      yaml
5m ago  update  alert    api-health:p95  ui
1h ago  delete  channel  old-slack       yaml
2h ago  update  slo      availability    api
3h ago  create  check    login-flow      yaml
```

Use `--json` for machine-readable output in CI pipelines.
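If you post-process `--json` change records yourself, the same filter semantics can be mirrored in a few lines. A minimal sketch, assuming each record is a dict with `type`, `name`, and `source` keys matching the table columns above (the record shape is an assumption, not the documented JSON schema):

```python
def filter_changes(changes, rtype=None, name=None, source=None):
    """Mimic yorker log's --type / --name / --source filters on change records."""
    out = []
    for c in changes:
        if rtype is not None and c["type"] != rtype:
            continue
        # --name is documented as a case-insensitive substring match
        if name is not None and name.lower() not in c["name"].lower():
            continue
        if source is not None and c["source"] != source:
            continue
        out.append(c)
    return out

changes = [
    {"type": "check", "name": "api-health", "source": "yaml"},
    {"type": "alert", "name": "api-health:p95", "source": "ui"},
    {"type": "channel", "name": "old-slack", "source": "yaml"},
]
print(len(filter_changes(changes, name="API-HEALTH")))  # 2
```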
---

### `yorker test`

Run HTTP monitors locally against their configured URLs. Uses the same config pipeline as deploy, so all defaults, auth, and secrets are applied identically.

```bash
yorker test
```

```
Running monitors locally...

HTTP     Homepage ... 200 (142ms)
HTTP     Users API ... 200 (89ms)
Browser  Login Flow (4 steps)
         1. Navigate
         2. Log in
         3. Dashboard loads
         4. Check account
         Browser monitors require remote execution — use `yorker deploy` then check results
MCP      Docs Server
         MCP monitors require remote execution — use `yorker deploy` then check results
```

Only HTTP monitors are exercised locally. Browser and MCP monitors are listed (browser monitors also display extracted step markers) but skipped — deploy them and view results remotely.

---

### `yorker completions [shell]`

Generate shell completion scripts. Auto-detects your shell if the argument is omitted.

```bash
yorker completions       # auto-detect
yorker completions bash
yorker completions zsh
yorker completions fish
```

Pipe the output to the appropriate file for your shell. For example:

```bash
yorker completions zsh > ~/.zfunc/_yorker
```

---

## Monitor Commands

### `yorker monitors list`

List all monitors. The output includes each monitor's name, type, frequency, location count, and any labels attached to the check.

```bash
yorker monitors list [--type <type>] [--status <status>]
```

| Flag | Description |
|---|---|
| `--type <type>` | Filter by type: `http`, `browser`, or `mcp`. |
| `--status <status>` | Filter by status: `enabled` or `paused`. |

In `--json` mode, each monitor entry includes a `labels` array (omitted when label lookup failed, e.g. due to permissions or a transient API error).

---

### `yorker monitors get <name-or-id>`

Show detailed information about a monitor, including recent results.

```bash
yorker monitors get "Homepage"
yorker monitors get chk_abc123
```

Displays monitor details (ID, type, status, frequency, locations, URL or endpoint) and a table of recent results with response times and status codes.
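When scripting against `--json` output, every command wraps its payload in the standard `{ ok: true, data: ... }` envelope (documented under the analyze commands). A minimal consumer sketch; the `error` field name and the exact `data` payload shape are assumptions beyond what the docs state:

```python
import json

def unwrap(raw: str):
    """Unwrap the standard { ok, data } envelope emitted in --json mode.

    The "error" field consulted on failure is an assumption.
    """
    envelope = json.loads(raw)
    if not envelope.get("ok"):
        raise RuntimeError(f"CLI reported failure: {envelope.get('error')}")
    return envelope["data"]

# e.g. raw = subprocess.run(["yorker", "monitors", "list", "--json"],
#                           capture_output=True, text=True).stdout
raw = '{"ok": true, "data": [{"name": "Homepage", "type": "http", "labels": ["prod"]}]}'
monitors = unwrap(raw)
print([m["name"] for m in monitors])  # ['Homepage']
```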
---

### `yorker monitors create`

Create a new HTTP monitor imperatively. Browser and MCP monitors must be created via `yorker deploy` from YAML.

```bash
yorker monitors create --name "API Health" --type http --url https://api.example.com/health [--method GET] [--frequency 5m] [--locations loc_us_east,loc_eu_central]
```

| Flag | Required | Description |
|---|---|---|
| `--name <name>` | Yes | Monitor name. |
| `--type <type>` | Yes | Must be `http`. |
| `--url <url>` | Yes | URL to monitor. |
| `--method <method>` | No | HTTP method (default: `GET`). |
| `--frequency <frequency>` | No | Check frequency (default: `5m`). |
| `--locations <ids>` | No | Comma-separated location IDs (default: all active hosted locations). |

---

### `yorker monitors edit <name-or-id>`

Edit an existing monitor.

```bash
yorker monitors edit "Homepage" --frequency 1m --add-location loc_eu_central
```

| Flag | Description |
|---|---|
| `--name <name>` | New monitor name. |
| `--frequency <frequency>` | New frequency. |
| `--add-location <id>` | Add a location. |
| `--remove-location <id>` | Remove a location. |
| `--yes` | Skip confirmation prompt. |

---

### `yorker monitors delete <name-or-id>`

Delete a monitor and all its results and alert rules.

```bash
yorker monitors delete "Old Endpoint" [--yes]
```

Requires confirmation unless `--yes` is passed.

---

### `yorker monitors pause <name-or-id>`

Pause a running monitor.

```bash
yorker monitors pause "Homepage"
```

---

### `yorker monitors resume <name-or-id>`

Resume a paused monitor.

```bash
yorker monitors resume "Homepage"
```

---

### `yorker monitors analyze <name-or-id>`

Run a deep insights analysis on a monitor. Sends recent results and context to the Yorker insights engine and returns a classified summary (type, severity, confidence, summary text).
```bash
yorker monitors analyze "Homepage"
yorker monitors analyze chk_abc123
yorker monitors analyze chk_abc123 --json
```

Output in human mode:

```
Type        degradation
Severity    warning
Confidence  0.87
Summary     Response times increased 3x over the last hour across US-East,
            correlated with an upstream dependency slowdown.
Model       claude-sonnet-4-20250514
Generated   Apr 9, 2026, 3:23:04 PM UTC

Deep analysis usage: 2/5 used for this resource.
```

In `--json` mode, emits the full validated response envelope (`insight`, `stale`, `deepAnalysisCount`, `deepAnalysisLimit`) under the standard `{ ok: true, data: ... }` envelope.

Deep analysis is rate-limited per monitor (5-minute cooldown between runs, 5 lifetime calls per monitor) and per team (50 per month). When rate-limited, the command exits with code `3` and prints a message explaining how long to wait or when the monthly quota resets.

---

## Results Commands

### `yorker results list <monitor>`

List check results for a monitor.

```bash
yorker results list "Homepage" [--limit 20] [--status success] [--since 24h]
```

| Flag | Description |
|---|---|
| `--limit <n>` | Number of results (default: `20`, max: `200`). |
| `--offset <n>` | Skip the first N results for pagination. Cannot be combined with `--status` or `--since`. |
| `--status <status>` | Filter: `success`, `failure`, `error`, `timeout`. Applied client-side. |
| `--since <duration>` | Only show results from this period (e.g. `1h`, `24h`, `7d`). Applied client-side. |

> `--offset` is server-side pagination; `--status` and `--since` are client-side filters. Mixing them would produce inconsistent page sizes, so the CLI rejects the combination. Use `--limit` and re-run with a later `--since` if you need to page through filtered results.

---

### `yorker results get <monitor> <result-id>`

Show full details for a specific check result.
```bash
yorker results get "Homepage" res_abc123
```

Includes timing breakdown (DNS, TLS, TTFB, content transfer), assertions (pass/fail), screenshots, Web Vitals (browser monitors), tool-call results (MCP monitors), network requests, and console errors.

---

### `yorker results tail <monitor>`

Live-stream new results as they arrive. Long-running command that polls at a configurable interval.

```bash
yorker results tail "Homepage" [--interval 30s]
```

| Flag | Description |
|---|---|
| `--interval <duration>` | Poll interval (default: `30s`, range: `5s` to `1h`). |

```
Tailing Homepage (every 30s, Ctrl+C to stop)

Apr 9, 2026, 12:34:56 PM UTC  ✓ 200      US East Coast  142ms
Apr 9, 2026, 12:35:26 PM UTC  ✓ 200      EU Central     89ms
Apr 9, 2026, 12:36:56 PM UTC  ✗ timeout  US East Coast  30000ms  connection timeout
```

Timestamps are formatted as localized US English strings using the team's configured timezone (defaults to UTC). In `--json` mode, outputs one JSON object per result (newline-delimited).

---

## Alert Commands

### `yorker alerts list`

List alert instances.

```bash
yorker alerts list [--all] [--monitor "Homepage"] [--state active,acknowledged]
```

| Flag | Description |
|---|---|
| `--all` | Include resolved and recovered alerts (default: active and acknowledged only). |
| `--monitor <name-or-id>` | Filter by monitor. |
| `--state <states>` | Comma-separated states: `active`, `acknowledged`, `recovered`, `resolved`. |

The command returns at most 100 alert instances per invocation (the most recent by start time). For larger historical windows, use `yorker alerts history` or the REST API with pagination.

---

### `yorker alerts ack <alert-instance-id>`

Acknowledge an active alert.

```bash
yorker alerts ack ainst_abc123
```

---

### `yorker alerts resolve <alert-instance-id>`

Manually resolve an alert.

```bash
yorker alerts resolve ainst_abc123
```

---

### `yorker alerts analyze <alert-instance-id>`

Run a deep insights analysis on an alert instance.
Sends context about the alert (monitor, recent results, correlations) to the Yorker insights engine and returns a classified summary with a type, severity, confidence score, and summary text.

```bash
yorker alerts analyze ainst_abc123
yorker alerts analyze ainst_abc123 --json
```

Output in human mode matches `yorker monitors analyze` — labeled fields for `Type`, `Severity`, `Confidence`, `Summary`, `Model`, and `Generated`, followed by a usage footer. In `--json` mode, emits the full validated response envelope (`insight`, `stale`, `deepAnalysisCount`, `deepAnalysisLimit`) under the standard `{ ok: true, data: ... }` envelope.

Deep analysis is rate-limited per alert instance (5-minute cooldown, 5 lifetime calls per instance) and per team (50 per month). When rate-limited, the command exits with code `3` and prints a message explaining how long to wait or when the monthly quota resets.

---

### `yorker alerts history`

Show alert history.

```bash
yorker alerts history [--monitor "Homepage"] [--since 7d] [--limit 20]
```

| Flag | Description |
|---|---|
| `--monitor <name-or-id>` | Filter by monitor. |
| `--since <duration>` | Time window (e.g. `30m`, `24h`, `7d`). |
| `--limit <n>` | Number of alerts (default: `20`, max: `200`). |

---

### `yorker alerts rules list`

List alert rules across monitors.

```bash
yorker alerts rules list [--monitor "Homepage"]
```

---

### `yorker alerts rules create`

Create a new alert rule with a single condition. For multi-condition rules, use `yorker.config.yaml` and `yorker deploy`.

```bash
yorker alerts rules create --monitor "Homepage" --condition "consecutive_failures >= 3" --channel nch_abc123 [--name "homepage-down"] [--severity critical]
```

| Flag | Required | Description |
|---|---|---|
| `--monitor <name-or-id>` | Yes | Monitor to attach the rule to. |
| `--condition <condition>` | Yes | Alert condition (see formats below). |
| `--channel <id>` | Yes | Notification channel ID (`nch_...`). |
| `--name <name>` | No | Optional rule name. |
| `--severity <severity>` | No | Severity for SSL/certificate conditions: `critical`, `warning`, or `info`. Ignored for non-certificate conditions. |

#### Condition formats

| Syntax | Type |
|---|---|
| `consecutive_failures >= 3` | Trigger after N consecutive failures. |
| `response_time > 5000` | Trigger if response time exceeds N ms. |
| `multi_location >= 2` | Trigger if N+ locations fail simultaneously. |
| `ssl_expiry < 14` | Trigger if the SSL cert expires within N days. |
| `ssl_certificate_changed` | Trigger when the leaf certificate fingerprint changes between runs. |
| `ssl_self_signed` | Trigger when a self-signed or untrusted-root certificate is detected. |
| `ssl_protocol < TLSv1.3` | Trigger when the TLS handshake negotiates a protocol older than the minimum. Allowed minimums: `TLSv1.2`, `TLSv1.3`. |

---

## Location Commands

The `yorker locations` command group manages private locations and runner keys.

### `yorker locations list`

List your private locations. By default only private locations are shown; pass `--all` to include the 14 hosted locations as well.

```bash
yorker locations list         # private locations only
yorker locations list --all   # private + hosted
```

| Flag | Description |
|---|---|
| `--all` | Include hosted locations in the output. |

Shows each location's ID, type (`hosted` / `private`), display name, region, health status, and (for private locations) last heartbeat.

---

### `yorker locations create`

Create a private location and auto-provision an initial runner key.

```bash
yorker locations create --name <slug> --display-name "<display name>" [--region <region>]
```

| Flag | Required | Description |
|---|---|---|
| `--name <slug>` | Yes | Unique lowercase slug (e.g. `staging-eu`). |
| `--display-name <name>` | Yes | Human-readable display name. |
| `--region <region>` | No | Region identifier (default: `private`). |

The runner key secret is printed **once** — save it immediately. The command also prints a ready-to-run Docker snippet for starting the runner.
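If you generate location names programmatically, it can help to validate the slug before calling the CLI. A sketch, assuming "lowercase slug" means lowercase letters and digits separated by single hyphens (inferred from the `staging-eu` example, not an authoritative spec):

```python
import re

# Assumed slug shape, inferred from the "staging-eu" example.
SLUG_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_location_name(name: str) -> bool:
    """Check a candidate --name value against the assumed slug shape."""
    return bool(SLUG_RE.fullmatch(name))

print(is_valid_location_name("staging-eu"))   # True
print(is_valid_location_name("Staging EU"))   # False
```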
---

### `yorker locations delete <location-id>`

Delete a private location. Fails if any monitors still reference it. Also revokes every runner key associated with the location. The argument is the location ID (`loc_...`).

```bash
yorker locations delete loc_staging_eu [--yes]
```

---

### `yorker locations keys list <location-id>`

List runner keys for a private location. Only the key ID, name, prefix, and last-used time are returned — the secret is never exposed again after creation.

```bash
yorker locations keys list loc_staging_eu
```

---

### `yorker locations keys create <location-id>`

Provision a new runner key for a private location. The secret is printed **once**, followed by a ready-to-run Docker snippet.

```bash
yorker locations keys create loc_staging_eu [--name "