How to Add Self-Monitoring to Your OpenClaw Agent

OpenClaw heartbeat polls confirm your agent is alive but skip everything else. A HEARTBEAT.md checklist file turns those recurring check-ins into real health checks.

If your OpenClaw agent uses heartbeat polling, it checks in on a regular interval (every 30 minutes by default) but only confirms the process is alive. It won't catch broken cron delivery, a disk filling up, or skipped post-task logging unless you tell it to look. A single checklist file in your workspace root changes that.

Why this happens

Agents start every session from scratch, so they have no memory of what happened in previous sessions. The heartbeat gives them a recurring window to do work, but the default instruction is just "confirm you're alive," and that's all they do. So things like cron delivery failures or disk usage creeping up go unnoticed unless you explicitly add those checks to the heartbeat's scope.

The fix

Add a HEARTBEAT.md to your workspace root listing the checks you want run on each poll, then update your heartbeat prompt to reference the file. The agent picks it up on the next cycle and executes whatever you listed.

Step-by-step

1. Create a HEARTBEAT.md file

Put it in your workspace root with one section per thing you want monitored:

# Heartbeat Checks

## Cron health
- Run `openclaw cron list` and verify all expected jobs show enabled
- Confirm each job has explicit `--channel` and `--to` flags
- Flag any job with a timeout under 300s that runs a multi-step pipeline

This isn't a built-in OpenClaw feature. The agent only reads it because your heartbeat prompt tells it to, so name and structure the file however makes sense for your setup.
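Because the file is plain markdown that the agent reads as-is, a quick structural check before the next poll can catch a section you left empty. A minimal sketch in Python, assuming the `## Section` heading and `- item` bullet layout used in the example above. The function names `parse_checklist` and `empty_sections` are hypothetical helpers, not part of OpenClaw:

```python
def parse_checklist(text: str) -> dict[str, list[str]]:
    """Group '- ' bullets under the most recent '## ' heading.

    Assumes the layout from the example above; any other structure
    you choose for HEARTBEAT.md would need its own parsing rules.
    """
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is None:
            continue  # title or text before the first section
        elif line.startswith("- "):
            sections[current].append(line[2:].strip())
        elif raw[0].isspace() and sections[current]:
            # Indented continuation of the previous bullet
            sections[current][-1] += " " + line
    return sections


def empty_sections(sections: dict[str, list[str]]) -> list[str]:
    """Names of sections that contain no checks."""
    return [name for name, items in sections.items() if not items]
```

Running `empty_sections(parse_checklist(text))` on the file's contents returns the headings that have no bullets under them, which is usually a sign of a half-written check.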

2. Be specific about thresholds

The more specific you are, the better this works. Vague instructions like "make sure things are healthy" won't produce useful results, but an entry that names a command, a threshold, and what to do when the threshold is crossed will:

## Memory size
- Count lines in MEMORY.md and flag if over 500
- If over threshold, archive older entries to memory/archive/
- Log the archive action to today's daily note

## Disk usage
- Run `df -h /` and flag if usage exceeds 80%
- Measure `.openclaw/.browser/` directory size and flag if over 100MB
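For reference, the threshold logic in those entries can be sketched in Python. This is only an illustration of what "a command, a threshold, and an action" boils down to, not how the agent runs the checks. It assumes `MEMORY.md` and `.openclaw/.browser/` both sit under a single workspace directory you pass in (the post does not say where the browser directory lives), and every function name here is hypothetical:

```python
import os
import shutil
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count lines without loading the whole file into memory."""
    with path.open(encoding="utf-8", errors="replace") as f:
        return sum(1 for _ in f)


def disk_percent_used(mount: str = "/") -> float:
    """Used space as a percentage of total capacity.

    `df` derives its Use% column slightly differently (it excludes
    reserved blocks), so expect small differences from `df -h /`.
    """
    usage = shutil.disk_usage(mount)
    return 100 * usage.used / usage.total


def dir_size_mb(root: Path) -> float:
    """Total size of regular files under root, in MiB."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total / (1024 * 1024)


def run_checks(workspace: Path, memory_limit: int = 500,
               disk_limit: float = 80.0,
               browser_limit_mb: float = 100.0) -> list[str]:
    """Return one human-readable flag per threshold that was crossed."""
    flags = []
    memory = workspace / "MEMORY.md"
    if memory.exists():
        lines = count_lines(memory)
        if lines > memory_limit:
            flags.append(f"MEMORY.md has {lines} lines (limit {memory_limit})")
    percent = disk_percent_used("/")
    if percent > disk_limit:
        flags.append(f"disk usage at {percent:.0f}% (limit {disk_limit:.0f}%)")
    browser = workspace / ".openclaw" / ".browser"  # assumed location
    if browser.is_dir():
        size = dir_size_mb(browser)
        if size > browser_limit_mb:
            flags.append(f".browser is {size:.0f} MiB (limit {browser_limit_mb:.0f})")
    return flags
```

The defaults mirror the numbers in the checklist (500 lines, 80%, 100 MB). An empty list from `run_checks` corresponds to "nothing needs attention."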

Whenever something breaks in a way you didn't expect, add a check for it here.

If you're running OpenClaw agents in production, you can just point them to this post and they'll set up the pattern for you. With Pazi, you skip the manual config entirely and tell your agent in plain language what to monitor.

3. Add checks for skipped work

Suppose your agent is meant to update a tracking file after finishing a pipeline. If it skips that step, you won't find out until you check manually. A heartbeat rule can catch the omission:

## Post-task obligations
- Read today's daily note (memory/YYYY-MM-DD.md)
- If any fix/debug entries exist, verify a matching entry was added to the
  relevant tracking file
- If missing: create the entry before doing anything else
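The same rule can be sketched in Python to show what "matching entry" might mean in practice. The daily-note path follows the `memory/YYYY-MM-DD.md` pattern from the checklist. Everything else is an assumption made for illustration: that fix/debug entries are lines containing a word starting with "fix" or "debug", and that an entry counts as tracked when its text appears verbatim in the tracking file. The function names are hypothetical:

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# Assumption: a fix/debug entry is any line with a word starting with
# "fix" or "debug" ("Fixed", "debugging"), case-insensitive.
FIX_PATTERN = re.compile(r"\b(fix|debug)", re.IGNORECASE)


def daily_note_path(workspace: Path, day: Optional[date] = None) -> Path:
    """Path to the daily note, e.g. memory/2024-01-05.md."""
    day = day or date.today()
    return workspace / "memory" / f"{day.isoformat()}.md"


def fix_entries(note_text: str) -> list[str]:
    """Lines from the daily note that look like fix/debug entries."""
    entries = []
    for raw in note_text.splitlines():
        line = raw.strip()
        if FIX_PATTERN.search(line):
            entries.append(line.removeprefix("- ").strip())
    return entries


def missing_from_tracking(entries: list[str], tracking_text: str) -> list[str]:
    """Entries whose text does not appear in the tracking file.

    Verbatim matching is a stand-in; use whatever key (ticket ID,
    date, job name) your tracking file is actually organized by.
    """
    return [entry for entry in entries if entry not in tracking_text]
```

Anything returned by `missing_from_tracking` is work the agent logged in its daily note but never recorded where it was supposed to.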

4. Update the heartbeat prompt

If the heartbeat prompt doesn't reference HEARTBEAT.md by name, the agent will improvise its own routine or ignore yours:

Read HEARTBEAT.md if it exists. Follow it strictly. Do not infer or repeat
old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK.

Without that reference, the agent falls back on whatever context it picks up from the session, which means the checks it runs will vary from one poll to the next.

How to verify

Disable a cron job on purpose:

openclaw cron edit <job-id> --enabled false

Wait for the next heartbeat poll (up to 30 minutes at the default interval). If the check is written correctly, the agent catches the disabled job and either re-enables it or notifies you, depending on your instructions.

For memory checks, pad MEMORY.md past your threshold and wait for the next poll.
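One way to do the padding without losing the original file is a small Python helper. This is a sketch under assumptions: the 500-line default mirrors the threshold used earlier, the `.bak` backup name and the filler text are arbitrary, and `pad_past_threshold` is a hypothetical name:

```python
import shutil
from pathlib import Path


def pad_past_threshold(memory_file: Path, threshold: int = 500,
                       margin: int = 10) -> int:
    """Append filler lines until the file exceeds the threshold.

    Copies the original to <name>.bak first so it can be restored
    after the test. Returns the resulting line count.
    """
    backup = memory_file.with_name(memory_file.name + ".bak")
    shutil.copy2(memory_file, backup)

    text = memory_file.read_text(encoding="utf-8")
    current = len(text.splitlines())
    needed = max(0, threshold + margin - current)

    with memory_file.open("a", encoding="utf-8") as f:
        # Make sure the filler starts on its own line
        if text and not text.endswith("\n"):
            f.write("\n")
        for i in range(needed):
            f.write(f"- heartbeat test filler line {i + 1}\n")
    return current + needed
```

After the poll, copy the `.bak` file back over `MEMORY.md` if the agent has not already archived the filler for you.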


This post is based on how we run production agents at Pazi, powered by OpenClaw.