Monitoring as Code

Monitoring as Code lets you define your LoadFocus monitoring setup as version-controlled files and apply it from the command line or CI — the same way you manage infrastructure with Terraform or Pulumi. You describe the monitors, groups, alerts, maintenance windows, dashboards and status pages you want; the @loadfocus/monitoring CLI computes the difference against what is live and reconciles it (create, update, delete).

It is declarative and idempotent: running deploy twice changes nothing the second time. Your files are the source of truth, so changes go through pull requests and your monitoring history lives in git.

Everything runs inside your account and your active team, with your plan limits enforced by the LoadFocus backend exactly as in the dashboard. The CLI only does what you could do yourself in the UI.

How it works

You keep a folder of small YAML (or JavaScript) files — one resource per file — plus a loadfocus.config.yaml that points at them. The CLI sends those definitions to LoadFocus, which maps them to live resources, diffs them, and returns a plan. You review the plan, then apply it.

  • Author: describe resources as files (YAML or JS constructs).
  • Plan: deploy --dry-run shows exactly what will be created, updated, adopted or deleted.
  • Apply: deploy reconciles your account to match the files.
  • Reconcile identity: every resource carries a stable logicalId you choose. That is how the CLI tracks a resource across renames, so changing a check's display name never recreates it.
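
The plan step can be pictured as a diff keyed on logicalId. The following is a minimal sketch, not the CLI's implementation: the real diff runs on the LoadFocus backend, computePlan is a hypothetical helper, and it leaves out the "adopted" outcome for brevity.

```javascript
// Hypothetical sketch of a logicalId-keyed plan. The real diff runs server-side.
// Field comparison via JSON.stringify assumes both sides use the same key order.
function computePlan(desired, live) {
  const liveById = new Map(live.map((r) => [r.logicalId, r]));
  const desiredIds = new Set(desired.map((r) => r.logicalId));
  const plan = { create: [], update: [], unchanged: [], delete: [] };

  for (const resource of desired) {
    const current = liveById.get(resource.logicalId);
    if (!current) {
      plan.create.push(resource.logicalId);
    } else if (JSON.stringify(current) !== JSON.stringify(resource)) {
      plan.update.push(resource.logicalId);
    } else {
      plan.unchanged.push(resource.logicalId);
    }
  }
  // Anything live in the project but missing from the files is deleted.
  for (const resource of live) {
    if (!desiredIds.has(resource.logicalId)) plan.delete.push(resource.logicalId);
  }
  return plan;
}

const live = [
  { logicalId: 'home', name: 'Home API' },
  { logicalId: 'old', name: 'Retired check' },
];
const desired = [
  { logicalId: 'home', name: 'Homepage API' }, // renamed, same logicalId
  { logicalId: 'cart', name: 'Cart API' },
];

const firstPlan = computePlan(desired, live);
// Once applied, live matches desired, so planning again changes nothing.
const secondPlan = computePlan(desired, desired);
console.log(firstPlan, secondPlan);
```

The renamed check shows up as an update rather than a delete plus a create, and the second plan contains only unchanged entries, which is what idempotent means here.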

Install

The CLI is a Node package (Node 18+). Run it on demand with npx:

npx @loadfocus/monitoring --help

…or install it globally to get the loadfocus-monitoring command:

npm install -g @loadfocus/monitoring
loadfocus-monitoring --help

Authenticate

The CLI authenticates with a LoadFocus API key and a team id. Find your API key in the dashboard under your account/API settings, and your team id on the teams page.

Sign in once and the credentials are saved to ~/.loadfocus/config.json:

loadfocus-monitoring login
loadfocus-monitoring whoami # confirm who you are and which team you're targeting

For CI, prefer environment variables (they override the saved config and never touch disk):

export LOADFOCUS_API_KEY="apikey_xxxxxxxx"
export TEAM_ID="team_xxxxxxxx"
# optional: export API_URL="https://apimonitor.loadfocus.com"
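
The precedence described above (environment variables win over the saved config) can be sketched as follows. This is an illustration only: resolveCredentials is a hypothetical helper, and the apiKey and teamId field names for the saved config are assumptions, since the layout of ~/.loadfocus/config.json is not documented here.

```javascript
// Hypothetical helper illustrating the documented precedence, not the CLI's code.
// The savedConfig field names (apiKey, teamId) are assumed for this example.
function resolveCredentials(env, savedConfig) {
  const apiKey = env.LOADFOCUS_API_KEY || savedConfig.apiKey;
  const teamId = env.TEAM_ID || savedConfig.teamId;
  if (!apiKey || !teamId) {
    throw new Error('Missing API key or team id');
  }
  return { apiKey, teamId };
}

const saved = { apiKey: 'apikey_saved', teamId: 'team_saved' };

// In CI both values come from the environment and override the saved config.
const fromCi = resolveCredentials(
  { LOADFOCUS_API_KEY: 'apikey_ci', TEAM_ID: 'team_ci' },
  saved
);
// Locally, with no variables set, the saved login is used.
const fromDisk = resolveCredentials({}, saved);
console.log(fromCi, fromDisk);
```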

Create a project

Scaffold a config file and a sample monitor in your repository:

loadfocus-monitoring init

This writes loadfocus.config.yaml:

project: my-project # a namespace for this set of resources
checkMatch:
  - "monitors/**/*.{check,group,alertRule,maintenanceWindow,dashboard,statusPage,alertChannel,variable}.{yaml,yml,js}"
defaults:
  schedule: "300" # applied to checks that omit a schedule
  locations: [us-east-1]
  • project scopes everything the CLI manages. Resources in a project are reconciled together; anything in the project that is no longer in your files is deleted on deploy. Use separate projects to manage independent sets of monitors.
  • checkMatch is the glob(s) for your authoring files.
  • defaults fill in schedule, locations and alertChannels for checks that omit them.
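
As a rough illustration of how defaults fill the gaps, the sketch below merges the three documented default fields into a check only where the check omits them. applyDefaults is a hypothetical name, and whether the real merge happens in the CLI or on the server is not stated here.

```javascript
// Hypothetical sketch: a check's own values always win over project defaults.
function applyDefaults(check, defaults) {
  const merged = { ...check };
  for (const key of ['schedule', 'locations', 'alertChannels']) {
    if (merged[key] === undefined && defaults[key] !== undefined) {
      merged[key] = defaults[key];
    }
  }
  return merged;
}

const defaults = { schedule: '300', locations: ['us-east-1'] };

// This check sets its own locations but omits a schedule.
const check = { kind: 'check', logicalId: 'home', locations: ['eu-west-1'] };
const effective = applyDefaults(check, defaults);
console.log(effective);
```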

The workflow

loadfocus-monitoring validate # compile locally + server-side dry-run; great as a PR gate
loadfocus-monitoring deploy --dry-run # show the plan (created / updated / adopted / deleted)
loadfocus-monitoring deploy # apply it
loadfocus-monitoring list # inventory of what's deployed in the project
loadfocus-monitoring list --status # …with each check's latest up/down/degraded status
loadfocus-monitoring get <logicalId> # show one deployed resource
loadfocus-monitoring trigger <logicalId> # run a check now
loadfocus-monitoring destroy # delete everything managed in the project

deploy is safe by default: it shows the plan and, when run interactively, asks before deleting anything. In CI (non-interactive), it refuses to delete without --yes and exits with a clear code instead of hanging on a prompt. Add --json to read/result commands for machine-readable output.
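
The delete guard can be summarised as a small decision table. The sketch below is an assumption-laden illustration: decideDeploy is a hypothetical function, and it assumes that --yes also skips the prompt in an interactive session, which the text above does not confirm.

```javascript
// Hypothetical decision logic for the delete guard described above.
// Assumption: --yes skips confirmation in interactive sessions as well.
function decideDeploy(plan, { interactive, yes }) {
  if (plan.delete.length === 0 || yes) return 'apply';
  return interactive ? 'prompt' : 'refuse';
}

const planWithDelete = { create: ['cart'], delete: ['old'] };
const planWithoutDelete = { create: ['cart'], delete: [] };

const inTerminal = decideDeploy(planWithDelete, { interactive: true, yes: false });
const inCi = decideDeploy(planWithDelete, { interactive: false, yes: false });
const inCiWithYes = decideDeploy(planWithDelete, { interactive: false, yes: true });
const nothingToDelete = decideDeploy(planWithoutDelete, { interactive: false, yes: false });
console.log({ inTerminal, inCi, inCiWithYes, nothingToDelete });
```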

Adopt existing monitors

Already have monitors in the dashboard? Pull them into files instead of recreating them:

loadfocus-monitoring import --project my-project --out monitors

This writes one file per resource and a loadfocus.config.yaml. Review, commit, then run deploy --dry-run — matching resources are adopted in place (brought under management) rather than duplicated.

Resources

Every resource is one file with a kind, a logicalId (your stable identifier), and the fields for that kind. References between resources use logicalIds (or names for alert channels) — the server resolves them, and deploy order is handled for you.

Checks

One check kind (the Monitor construct in JavaScript) covers every check type via type: api, browser, multistep, tcp, heartbeat.

kind: check
type: api
logicalId: home
name: Home API
schedule: "300" # seconds between runs
locations: [us-east-1, eu-west-1]
request:
  url: "https://example.com/health"
  method: GET
assertions:
  - { type: statusCode, comparison: equals, value: 200 }
  - { type: responseTime, comparison: lessThan, value: 1000 }
  • api — HTTP request with assertions on status, body, headers, response time, SSL expiry.
  • browser — a Playwright user-flow script with screenshots and per-step timings (paid).
  • multistep — an ordered sequence of API requests passing data between steps.
  • tcp — a port/reachability check from multiple regions.
  • heartbeat — a dead-man's switch: an external job pings a URL on a schedule, and LoadFocus alerts if a ping is missed.
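
To make the assertion format in the api example concrete, here is a minimal sketch of how such assertions could be evaluated against one run. It is not LoadFocus's evaluator: evaluateAssertions is hypothetical, only the two comparisons shown in the example are implemented, and the shape of the result object is assumed.

```javascript
// Hypothetical evaluator covering only the comparisons used in the example.
const comparisons = {
  equals: (actual, expected) => actual === expected,
  lessThan: (actual, expected) => actual < expected,
};

function evaluateAssertions(assertions, result) {
  return assertions.map((assertion) => {
    const compare = comparisons[assertion.comparison];
    if (!compare) {
      throw new Error(`Unsupported comparison: ${assertion.comparison}`);
    }
    const actual = result[assertion.type];
    return { ...assertion, actual, passed: compare(actual, assertion.value) };
  });
}

const assertions = [
  { type: 'statusCode', comparison: 'equals', value: 200 },
  { type: 'responseTime', comparison: 'lessThan', value: 1000 },
];

// An assumed run result: the status is fine but the response was slow.
const outcome = evaluateAssertions(assertions, { statusCode: 200, responseTime: 1240 });
const checkPassed = outcome.every((a) => a.passed);
console.log(outcome, checkPassed);
```

In this sketch a check passes only when every assertion passes, so the slow response alone fails the run.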

Groups

Share locations, alert channels, frequency and activation across many checks. A check joins a group with group: <logicalId>.

kind: group
logicalId: web
name: Web services
locations: [us-east-1, eu-west-1]

Alert rules

Alert when a check's metric crosses a threshold.

kind: alertRule
logicalId: home-latency
name: Home API latency
check: home # reference a check by logicalId
metric: responseTime # responseTime | statusCode | duration
condition: above
conditionValue: 1500 # milliseconds

Alert channels

Manage notification channels as code and reference them by name from a check, group or alert rule. Supported types: email, slack, microsoftteams, webhook, discord, pagerduty, opsgenie. Secret fields (webhookUrl, routingKey, apiKey) take a {{secrets.NAME}} reference — the value is stored with env set-secret and resolved when an alert is sent, never committed to your files.

kind: alertChannel
logicalId: oncall # the name checks / groups / alert rules reference
type: pagerduty
routingKey: "{{secrets.PAGERDUTY_KEY}}"

Maintenance windows

Suppress alerts during planned work. Times are UTC. startsAt / endsAt accept an ISO-8601 string (e.g. "2026-07-01T00:00:00Z") or unix milliseconds.

kind: maintenanceWindow
logicalId: weekly-deploy
name: Weekly deploy window
enabled: true
startsAt: "2026-07-01T00:00:00Z" # ISO-8601 or unix ms
endsAt: "2026-07-01T02:00:00Z"
repeat: weekly # none | daily | weekly | monthly
weekdays: [2] # 0=Sun ... 6=Sat
targets:
  allChecks: false
  checkIds: [home] # by logicalId
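
Because startsAt and endsAt accept either form, it can help to see both normalised to the same value. The sketch below is an illustration only: toMillis is a hypothetical helper, and it does not model the repeat or weekdays fields.

```javascript
// Hypothetical helper: normalise an ISO-8601 string or unix milliseconds to ms.
function toMillis(value) {
  const ms = typeof value === 'number' ? value : Date.parse(value);
  if (Number.isNaN(ms)) {
    throw new Error(`Invalid timestamp: ${value}`);
  }
  return ms;
}

const startsAt = toMillis('2026-07-01T00:00:00Z');
const endsAt = toMillis('2026-07-01T02:00:00Z');

// The same instant written as unix milliseconds.
const sameStart = toMillis(1782864000000);

const durationMinutes = (endsAt - startsAt) / 60000;
console.log(startsAt === sameStart, durationMinutes);
```

Writing the trailing Z (or an explicit offset) matters: without it, Date.parse reads a date-time string as local time, which would shift the window on machines that are not set to UTC.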

Dashboards

A shared view of selected checks, optionally public via a slug.

kind: dashboard
logicalId: status-overview
name: Status overview
visibility: private # private | public
checks: [home] # by logicalId
window: 24h # 24h | 7d | 30d

Status pages

A public status page at <slug>.loadfoc.us, optionally on your own custom domain.

kind: statusPage
logicalId: public-status
title: Acme Status
slug: acme # -> acme.loadfoc.us (globally unique)
enabled: true
customDomain: status.acme.com # optional, paid; point a CNAME at cname.loadfoc.us
groups:
  - { id: core, name: Core Services, order: 0 }
components:
  - id: api
    name: API
    groupId: core
    monitors: [home] # checks shown on this component, by logicalId
branding:
  brandColor: "#5353a4"
  colorTheme: dark

A custom domain goes live once you create the CNAME and the certificate is issued — deploy declares it; verification happens out of band.

Variables

Non-secret values (base URLs, IDs) that checks reference at run time as {{vars.NAME}}. The logicalId is the variable key. (For secrets, use env set-secret — never put them in files.)

kind: variable
logicalId: BASE_URL
value: "https://api.example.com"

Authoring in JavaScript or TypeScript

If you prefer code over YAML, build the same definitions programmatically and export them — the constructs produce identical resources:

const { Monitor, Group, AlertRule, Maintenance, Dashboard, StatusPage, AlertChannel, Variable } = require('@loadfocus/monitoring');
new Monitor({
  type: 'api', logicalId: 'home', name: 'Home API', schedule: '300',
  locations: ['us-east-1'],
  request: { url: 'https://example.com/health', method: 'GET' },
  assertions: [{ type: 'statusCode', comparison: 'equals', value: 200 }],
});
new Group({ logicalId: 'web', name: 'Web services', locations: ['us-east-1'] });

Point checkMatch at your .js files and the CLI loads them like any other resource.

Secrets and variables

Reference values from your checks without committing them. Secrets (tokens, passwords) are managed imperatively only and referenced as {{secrets.NAME}} in check fields and alert-channel secret fields. Variables (non-secret) can be declared as files (kind: variable, above) or set imperatively, and are referenced as {{vars.NAME}}.

loadfocus-monitoring env set-secret API_TOKEN "s3cr3t"
loadfocus-monitoring env set-variable BASE_URL "https://example.com"
loadfocus-monitoring env ls # list secret + variable keys (values never shown)
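
The {{vars.NAME}} and {{secrets.NAME}} placeholders are resolved by LoadFocus at run time. As a rough mental model only, substitution could look like the sketch below; resolveReferences is a hypothetical function, and details such as tolerating spaces inside the braces or failing on unknown names are assumptions.

```javascript
// Hypothetical sketch of placeholder substitution. The real resolution
// happens on the LoadFocus side and may differ in its details.
function resolveReferences(text, { vars = {}, secrets = {} } = {}) {
  const pattern = /\{\{\s*(vars|secrets)\.([A-Za-z0-9_]+)\s*\}\}/g;
  return text.replace(pattern, (match, scope, name) => {
    const store = scope === 'vars' ? vars : secrets;
    if (!Object.hasOwn(store, name)) {
      throw new Error(`Unknown ${scope} reference: ${name}`);
    }
    return String(store[name]);
  });
}

const context = {
  vars: { BASE_URL: 'https://api.example.com' },
  secrets: { API_TOKEN: 's3cr3t' },
};

const url = resolveReferences('{{vars.BASE_URL}}/health', context);
const header = resolveReferences('Bearer {{secrets.API_TOKEN}}', context);
console.log(url, header.length);
```

Only the placeholder lives in your files; the secret value itself exists solely in the store that env set-secret writes to.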

Run it in CI

A typical pipeline validates on every pull request and deploys on merge to the main branch.

# .github/workflows/monitoring.yml
name: monitoring
on:
  pull_request:
  push:
    branches: [main]
jobs:
  monitoring:
    runs-on: ubuntu-latest
    env:
      LOADFOCUS_API_KEY: ${{ secrets.LOADFOCUS_API_KEY }}
      TEAM_ID: ${{ secrets.LOADFOCUS_TEAM_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20 }
      - run: npx @loadfocus/monitoring validate
      - if: github.ref == 'refs/heads/main'
        run: npx @loadfocus/monitoring deploy --yes

Things worth knowing

  • logicalId is the identity. Keep it stable. You can rename a check's name or title freely; changing its logicalId is treated as deleting one resource and creating another.
  • Deletes are scoped to the project. deploy only removes resources in the current project that are no longer in your files — never anything in another project or created outside Monitoring as Code (until you adopt it).
  • Status-page slugs are global. slug becomes a subdomain, so it must be unique across all LoadFocus customers.
  • Paid features fail loudly. A free team that declares a paid-only field (a status-page custom domain, removing the "Powered by" badge) gets a clear error on deploy rather than a silent partial result.
  • Plan limits apply. Creating resources via the CLI is subject to the same plan quotas as the dashboard.