What this is
An outage you find out about from a customer is an outage you found out about too late. This workflow checks a single website or API on a schedule and posts to Slack the moment its status changes, so the alert lands in the channel your team already watches instead of in someone's inbox an hour later.
Because it hits the real endpoint from outside, it catches failures an internal health check sitting on the same box would miss: DNS, TLS, CDN, and routing problems all show up. It also tracks three things, not just up or down: a hard failure, a slow-but-alive response, and a recovery back to healthy.
How it runs
The flow is linear with one branch at the end. It reads the endpoint and thresholds from a config step, reads the endpoint's previous status, makes a single HTTP request, and classifies the result as up, down, or degraded. A request that errors or returns the wrong status is treated as down; a request that succeeds but exceeds the latency threshold is degraded.
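The classification rule above can be sketched in a few lines. This is a minimal illustration, not the template's actual evaluate code: the shape of `result` (`error`, `statusCode`, `latencyMs`) and the `reason` strings are assumptions, while `expectedStatus` and `maxLatencyMs` are the config fields described in this document.

```javascript
// Hypothetical shapes: `result` is what the HTTP step captured,
// `config` mirrors the fields described for the config step.
function classify(result, config) {
  // A transport error (timeout, DNS, TLS, refused connection) counts as down.
  if (result.error) {
    return { status: "DOWN", reason: `request failed: ${result.error}` };
  }
  // A response with an unexpected status code also counts as down.
  if (result.statusCode !== config.expectedStatus) {
    return {
      status: "DOWN",
      reason: `expected ${config.expectedStatus}, got ${result.statusCode}`,
    };
  }
  // A maxLatencyMs of 0 disables the slow check.
  if (config.maxLatencyMs > 0 && result.latencyMs > config.maxLatencyMs) {
    return {
      status: "DEGRADED",
      reason: `${result.latencyMs} ms exceeds ${config.maxLatencyMs} ms`,
    };
  }
  return { status: "UP", reason: "healthy" };
}
```

The order of the checks matters: a failed or wrong-status response is reported as down even if it was also slow, so the more serious condition wins.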
It then compares the current result to the previous run and decides whether anything is worth saying. A Switch routes the outcome: if the status is a new outage, an ongoing outage, a slow response, or a recovery, it builds a message and posts to Slack. If the endpoint is healthy and unchanged, the run exits quietly with no message.
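As a rough sketch of that comparison, the function below maps the previous and current status to an alert kind. The kind names (`new_outage`, `still_down`, `slow`, `recovered`, `none`) are illustrative, and treating a return to UP from either DOWN or DEGRADED as a recovery is an assumption rather than something the template confirms.

```javascript
// Alert kind names are illustrative, not the template's actual values.
// previousStatus may be undefined on the very first run.
function decideAlert(previousStatus, currentStatus) {
  if (currentStatus === "DOWN") {
    return previousStatus === "DOWN" ? "still_down" : "new_outage";
  }
  if (currentStatus === "DEGRADED") {
    return "slow";
  }
  // currentStatus is UP: only worth a message if the last run was not healthy.
  if (previousStatus === "DOWN" || previousStatus === "DEGRADED") {
    return "recovered";
  }
  return "none";
}

// The Switch then only has to ask whether there is anything to say.
function shouldAlert(kind) {
  return kind !== "none";
}
```

With this shape, a first run against a healthy endpoint has no previous status and falls through to `none`, so the workflow stays quiet until something actually changes.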
The steps
- Config (config): Holds the endpoint to monitor and its thresholds: name, url, method, expectedStatus, and maxLatencyMs. Because these live in the workflow, it needs no input and can run standalone on a schedule.
- Persisted State (load_state): Reads the endpoint's status from the previous run so the workflow has something to compare against.
- HTTP (http_check): A single GET against the URL. Marked optional, so a timeout or connection error is captured as a result rather than failing the whole run.
- JavaScript (evaluate): Classifies the result as UP, DOWN, or DEGRADED, compares it to the previous run, and decides the alert kind: new outage, still down, slow, or recovered.
- Persisted State (save_state): Stores this run's status so the next run can detect a recovery or an ongoing outage.
- Switch (alert_switch): Sends anything noteworthy to the alert branch; healthy-and-unchanged results exit quietly.
- Slack (slack_alert): Posts a formatted alert to your channel, with an icon for the state, the URL, and the reason.
- Exit (exit_quiet): Ends the run with no message when nothing has changed.
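To make the alert format concrete, here is one way the message text could be assembled before it is handed to the Slack step. Everything in it is hypothetical: the alert kind names, the emoji shortcodes, the wording, and the `buildAlertText` helper are placeholders to adapt, not the template's real output.

```javascript
// Illustrative icon and wording per alert kind; adjust to your team's style.
const ICONS = {
  new_outage: ":red_circle:",
  still_down: ":red_circle:",
  slow: ":large_yellow_circle:",
  recovered: ":large_green_circle:",
};

const LABELS = {
  new_outage: "is DOWN",
  still_down: "is still DOWN",
  slow: "is responding slowly",
  recovered: "has recovered",
};

// Returns the plain text of the alert: icon and state, then URL, then reason.
function buildAlertText(kind, config, reason) {
  const icon = ICONS[kind] || ":white_circle:";
  const label = LABELS[kind] || kind;
  return `${icon} ${config.name} ${label}\n${config.url}\nReason: ${reason}`;
}
```

Keeping the icon and wording in lookup tables means a new alert kind only needs one entry in each table rather than another branch in the function.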
Design notes
The HTTP step is optional on purpose. A failed request is the signal this workflow exists to catch, not an error to abort on. Marking it optional lets the run continue and read the failure instead of stopping at it.
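The same idea can be shown in plain JavaScript: catch the failure and return it as data. This is an analogue of what the optional flag achieves, not how the workflow platform implements it. The `checkEndpoint` name, the injectable `fetchImpl` parameter, and the 10-second timeout are assumptions made for the sketch, which relies on the global `fetch` and `AbortController` available in Node 18 and later.

```javascript
// Performs one request and always resolves with a result object,
// so a timeout or connection error becomes data instead of an exception.
async function checkEndpoint(config, fetchImpl = fetch, timeoutMs = 10000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  const startedAt = Date.now();
  try {
    const response = await fetchImpl(config.url, {
      method: config.method,
      signal: controller.signal,
    });
    return {
      statusCode: response.status,
      latencyMs: Date.now() - startedAt,
      error: null,
    };
  } catch (err) {
    // DNS failures, refused connections, TLS errors, and aborts all land here.
    return {
      statusCode: null,
      latencyMs: Date.now() - startedAt,
      error: String(err.message || err),
    };
  } finally {
    clearTimeout(timer);
  }
}
```

Because the function never rejects, whatever runs next can inspect `error` and `statusCode` and classify the outcome, which is the behavior the optional flag is there to provide.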
The two Persisted State steps are what make the workflow quiet and smart rather than noisy. Without a memory of the previous run, it could only ever say "currently down" and would repeat that every minute for the same incident. By remembering the last status, it distinguishes a new outage from an ongoing one, suppresses repeat-spam, and can announce when the endpoint recovers. Healthy, unchanged runs say nothing at all.
Setup
- Connect Slack and set the connection and target channel on the slack_alert step.
- Edit the config step with the endpoint to watch: name, url, method, expectedStatus, and maxLatencyMs (set maxLatencyMs to 0 to skip the slow check, or expectedStatus to something other than 200 for an auth-gated endpoint).
- Attach a schedule so it runs at your chosen interval, for example every minute. No input is needed at run time; everything it checks lives in the config step.
- Optionally adjust the alert text in the build_alert step to match your team's style.
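For orientation, a filled-in config might look roughly like the object below. The field names come from this document; every value is a placeholder, and the exact syntax the config step expects may differ from a plain JavaScript object.

```javascript
// Placeholder values only; replace with the endpoint you want to watch.
const config = {
  name: "Checkout API",              // label shown in the alert
  url: "https://example.com/health", // endpoint to request
  method: "GET",
  expectedStatus: 200,               // e.g. 401 for an auth-gated endpoint
  maxLatencyMs: 1500,                // 0 skips the slow check
};
```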
When to use it
- You have a public site or API whose downtime matters and want to hear about it fast.
- Your team lives in Slack and would rather get an alert there than watch a dashboard.
- You want recovery and repeat-outage awareness, not just a raw "it's down right now" ping.
- You are monitoring one endpoint for now and want something simple to stand up and extend later.