Claude Code routines are scheduled cloud agents that run a prompt on a cron schedule without you in the loop. I use them for the boring, recurring work I used to do by hand. The one I lean on most runs every 6 hours, reads production exceptions from my Cloudflare Workers, diagnoses each one against the codebase, and then either opens a pull request with a fix, files a ClickUp task and tags me, or just posts an all-clear to Slack. This guide shows exactly how it works, the prompt behind it, and the rules that keep it from doing anything stupid.
The concept is simple, and so is the setup. Because Slack and ClickUp connect through Claude Connectors, wiring up a routine like this takes under five minutes: pick the connectors, paste the prompt, set the schedule. The value is entirely in the judgment you encode and the boundaries you set. That is what this post is about.
What are Claude Code routines?
What are Claude Code routines?
Claude Code routines are scheduled cloud agents that run a defined prompt on a cron schedule, autonomously, with access to the tools and MCP servers you grant them. They run in an isolated cloud environment instead of your terminal, report back when finished, and need no human prompting to start. Use a routine when a task is recurring, rule-based, and worth doing whether or not you remember to do it.
If you have used Claude Code in your terminal, a routine is the same agent with two differences: it runs on a schedule you set instead of when you type, and it runs in the cloud instead of on your machine. You give it a prompt, a cadence, and a set of tools. It wakes up in a fresh session with zero prior context, does the work, and tells you what it did.
People also call these "Claude Code scheduled tasks" or "Claude routines". Same thing.
Why I automate triage first
I pick automation targets with one question: what do I do on a recurring basis that follows rules I can write down?
Log triage was the obvious first answer. Before the routine, my morning looked like this: open the Cloudflare dashboard, scan for exception spikes, cross-reference anything weird against open issues, decide whether it was real or noise, and only then start actual work. Twenty to thirty minutes of context-switching before writing a line of code, every day, and I still missed things that happened overnight.
That task fits a routine because the decisions are mostly mechanical:
- Is this error new, or already tracked?
- Is it a real code bug, a config issue, or just a bot probing a route?
- Was it an unhandled crash, or an error we already caught?
- Can I describe a safe fix, or do I need a human to look closer?
Those are rules. Rules are exactly what an agent can run on a schedule.
The triage routine, end to end
Here is the whole shape before the details. This runs against one of my production apps, a Cloudflare Workers project:
Every 6 hours (fresh cloud session, no prior context): 1. Compute a 360-minute window from `date -u` 2. Run a saved Observability query against the Cloudflare API (production workers only; skip anything -staging / -preview / -dev) 3. Dedupe within the run, then classify each error 4. Read the codebase to diagnose 5a. Isolated, testable code bug -> open ONE PR against `canary` (TDD, never main) 5b. Anything ambiguous -> create a ClickUp task and tag me 6. Post a summary to Slack #claude-logs (every run, even when clean)Four moving parts: the query, the prompt, the schedule, and the output contract. The output contract is the part people skip, and it is the most important.
Step 1: A saved query against the Workers Observability API
I keep a saved query inside Cloudflare's Workers Observability and the routine calls it by id. Authentication is two environment variables ($CF_API_TOKEN and $CF_ACCOUNT_ID) and a plain curl. No wrangler, no installed tooling, because a fresh cloud session should stay minimal and predictable. The routine enumerates production workers and ignores anything ending in -staging, -preview, or -dev.
Set the env vars in the cloud environment, not your laptop
A routine runs in a fresh cloud session that has no access to your local shell. Add $CF_API_TOKEN, $CF_ACCOUNT_ID, and $CF_OBSERVABILITY_QUERY_ID to the environment configuration of the Claude Code cloud environment you set up for the routine, not your .zshrc. If they are missing, the routine has nothing to authenticate with, which is exactly why the prompt tells it to stop and post to Slack rather than guess.
The core call posts the saved query id and a time window to the telemetry endpoint, filtered to records that have an error, grouped by worker, error, and outcome:
curl -sS -X POST \ "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/workers/observability/telemetry/query" \ -H "Authorization: Bearer $CF_API_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "queryId": "'"$CF_OBSERVABILITY_QUERY_ID"'", "timeframe": { "from": '"$FROM_MS"', "to": '"$TO_MS"' }, "parameters": { "filters": [ { "key": "$metadata.error", "operation": "exists", "type": "string" } ], "filterCombination": "and", "calculations": [{ "operator": "count", "alias": "Count" }], "groupBys": [ { "type": "string", "value": "$workers.scriptName" }, { "type": "string", "value": "$metadata.error" }, { "type": "string", "value": "$workers.outcome" } ], "orderBy": { "value": "Count" }, "limit": 100 } }'The key field is $workers.outcome. An exception is an unhandled crash; an ok with an error attached means the Worker caught and handled it. Both are worth a look, but a handled error is lower severity than a crash. For stack traces, the routine runs a second query per affected worker grouped by the error stack.
If the telemetry query shape gets rejected after a real attempt to fix it, the routine falls back to the GraphQL Analytics API (workersInvocationsAdaptive) and notes in Slack that full stack traces were unavailable. It never uses the Log Explorer API, which has no Workers dataset.
This pattern ports anywhere
Nothing here is Cloudflare or ClickUp specific. If you do not run on either, your platform of choice almost certainly has the equivalent. Swap the saved Observability query for a Sentry issues feed, a Logpush export in R2, a Supabase SQL query against your logs table, or a Vercel log drain. Swap ClickUp and Slack for Linear, Jira, GitHub Issues, Discord, or Teams, most of which have a Claude Connector already. The structure stays identical: a stable, saved query in, structured data out, the agent judges, the agent proposes.
Step 2: The routine prompt
The prompt is where the judgment lives. Here is a faithful, trimmed version of mine:
# RoleYou run every 6 hours, unattended, in a fresh cloud session with zero priorcontext. Surface PRODUCTION Cloudflare Worker exceptions from the last 6 hours,diagnose them against the codebase, and either propose a fix via PR or escalateto ClickUp. Read AGENTS.md at the repo root first; it governs architecture, TDD,and the multi-tenancy invariants you must follow for any code change.# Tools & auth- Query the Workers Observability telemetry API with curl, using $CF_API_TOKEN and $CF_ACCOUNT_ID. Do NOT use wrangler or install tooling.- If a token, account id, or query id is missing, or the API returns 401/403: STOP and post the failure to Slack #claude-logs. No fallback, no guessing.- Slack and ClickUp are reached via their MCP connectors.# Scope (hard constraints)- PRODUCTION ONLY. Ignore staging/preview/dev workers entirely.- Window: exceptions from the last 360 minutes, computed from `date -u` at start.- Enumerate every production worker from Cloudflare; do not assume a fixed list.- Never push to main. Never deploy. Never touch infrastructure or secrets.# Procedure1. Compute [from, to] in epoch ms (now-360min .. now).2. POST the saved telemetry query ($CF_OBSERVABILITY_QUERY_ID), filtered to errors, grouped by scriptName + error + outcome. Pull stack traces with a second query per affected worker.3. Dedupe within the run: group by (worker, error type, normalized message, top stack frame); collapse IDs, URLs, and timestamps first. The run is stateless.4. Classify each group: CODE BUG / CONFIG-ENV / TRANSIENT / UNCLEAR.# Act- Open ONE PR against `canary` only for an isolated, well-understood code bug with a safe, testable fix. Follow AGENTS.md TDD (failing test first); run typecheck, lint, and format before opening. Bias toward escalation when unsure.- Otherwise create a ClickUp task assigned to me, titled "[prod-triage] <worker> — <issue>", with the exception, a stack trace, occurrence count, root-cause analysis, and what needs verifying. Before creating, search for an open "[prod-triage]" task for the same worker and issue; if one exists, comment the new count instead of creating a duplicate.# ReportPost to Slack #claude-logs every run: PRs opened, items needing verification withClickUp links, and transient/no-action notes. If zero exceptions, say so.Notice how much of this is restraint. "Bias toward escalation when unsure." "Do not report anything already tracked." "If a query id is missing, STOP and post to Slack, no guessing." A routine that cannot stay quiet, or that improvises when it lacks data, becomes noise you will mute within a week.
Step 3: The schedule
Every 6 hours. In cron terms that is 0 */6 * * *, and the routine derives its 360-minute lookback window from date -u at the start of each run, so the schedule and the query window stay in sync. That cadence matches my traffic: enough that nothing festers overnight, rare enough that runs are cheap and I am not drowning in updates. On a higher-traffic app I would tighten it to hourly. On a side project, once a day.
The cadence is a dial you tune to your traffic, not a fixed default. Set it to how fast a problem can hurt you before you would otherwise notice.
Step 4: The output contract (the part that matters)
This is the difference between a useful routine and a liability:
- Confident, isolated, testable code bug: open one PR against
canarywith a failing-test-first fix, after running typecheck, lint, and format. I review and merge. - Anything ambiguous or risky: create a ClickUp task with the evidence and tag me. Bias here is heavy; the agent escalates rather than guesses.
- Clean run: post one line to Slack and stop.
I will say the important rule plainly: the routine cannot push to main and cannot deploy. Its most powerful action is opening a PR against canary, a branch that does not ship on its own, where a human gates the merge. Everything else is just writing things down for me to look at. That single boundary is what lets me point an autonomous agent at production logs and sleep fine.
The triage logic worth stealing
The mechanics are easy. These three rules are what make the output good instead of annoying.
Dedupe within a stateless run. Each run starts fresh with no memory of the last one, so the agent groups errors by worker, error type, normalized message, and top stack frame, collapsing high-cardinality noise like IDs and URLs before grouping. Without this, one ongoing incident shows up as fifty rows and you learn to ignore the report.
Treat a handled error and a crash differently. Cloudflare's outcome=exception is an unhandled crash; outcome=ok with an error means your code caught it. Both are worth seeing, but they are not the same severity, and folding that distinction into the prompt stops the agent from paging you over an error you already handle gracefully.
Make the default "escalate," not "act." The agent opens a PR only for an isolated bug with an obvious, testable fix. Everything else becomes a ClickUp task for a human. An agent that acts when uncertain is far more expensive than one that asks, so the prompt biases hard toward escalation.
What it can and cannot do
I think about routine permissions the way I think about a junior engineer's first week: lots of room to investigate and propose, no ability to break production.
When to give a routine write access:
- Opening PRs against a non-deploying branch (a human gates the merge)
- Creating issues or tasks
- Adding labels and comments
- Posting run summaries to a Slack channel
When to withhold it:
- Pushing to main or any branch that deploys
- Closing issues or resolving tasks (it lacks the full context)
- Messaging customers or pinging the whole team
- Changing infrastructure, billing, secrets, or auth config
If unsure: make the routine propose, not perform. A PR or a task is a proposal. A merge or a deploy is an action. Keep actions on the human side of the line, and make the routine fail loud (stop and post to Slack) the moment it lacks the access or data it needs.
What it has caught, and what I am still watching
Honest status: this routine is young, so I am reporting what it has done, not a year of triumphant metrics.
So far it has been most useful on the unglamorous stuff. It flagged a misconfiguration I would not have gone looking for, and it consistently surfaces bad bots, the probe-and-error pattern that never shows up in a quick dashboard glance but is obvious when something reads every production worker's exceptions on a schedule.
What I am watching for, because I have not hit it yet and I expect to:
- Severity inflation: agents tend to think everything is important. The outcome-aware classification and the escalate-by-default bias are my hedges, and I will tighten them if it over-reports.
- Log noise read as signal: a third-party outage can look like your bug. The dedupe and "classify transient, do not act" rules help, but this is the failure mode I trust least.
- Cost creep: every 6 hours is cheap; hourly across more workers is not free. Worth metering before you scale the cadence.
I would rather tell you the guardrails I built for problems I am anticipating than invent war stories. If you run this, watch those three.
Other routines worth automating
Triage is one example. The same pattern (stable saved input, encoded judgment, propose-don't-perform output) fits plenty of recurring work:
- Dependency triage: weekly, read the changelog and lockfile diff for outdated packages, open a PR for safe patch bumps, file a task for anything with breaking changes.
- Daily SEO brief: every morning, pull ranking and traffic changes, summarize what moved and why, flag pages that dropped.
- PR hygiene: a few times a day, find PRs that are stale, missing a description, or failing CI, and nudge with a comment.
- Docs drift: weekly, diff recent code changes against the docs and open tasks where they disagree.
Each one follows the same contract: it gathers, it judges against rules you wrote, and it proposes rather than acts.
Routine vs script vs interactive agent
A routine is not always the right tool. Quick framework:
| Tool | Best for | Avoid when |
|---|---|---|
| Cron script | Deterministic work with no judgment (backups, syncs) | The task needs reasoning about context |
| Claude Code routine | Recurring work that needs judgment and can propose changes | The task is one-off, or needs your input mid-task |
| Interactive agent | One-off or exploratory work where you steer | The work is recurring and you keep redoing it |
If a plain script can do it, use the script. Reach for a routine when the task needs to read a situation and decide, not just execute fixed steps.
Quick Recommendation
Claude Code routines are best for:
- Recurring work that follows rules you can write down (triage, audits, briefs)
- Tasks worth doing on a schedule whether or not you remember
- Anywhere a proposal (PR, task, Slack summary) is more useful than a raw alert
Skip routines if:
- The task is one-off or needs you steering it mid-flight
- A deterministic cron script already handles it
- You are not willing to define a strict output contract (an unbounded routine becomes noise)
My pick: start with one read-only routine that only files tasks and posts to Slack, run it for a week, and only grant PR access once you trust its judgment. Earn the write access incrementally, exactly like you would with a new hire.
Frequently Asked Questions
What are Claude Code routines?
How are Claude Code routines different from a cron job?
How do I pull Cloudflare logs into a routine?
Is it safe to give a scheduled agent access to production logs?
How often should a triage routine run?
What stops the routine from spamming me with false positives?
Next steps
If you want the foundations first, read our Claude Code best practices guide for how we structure AGENTS.md rules, MCP servers, and subagents, then see how to build a SaaS with Claude Code for the interactive version of this workflow. If you are choosing where to run the app this triages, we cover that in best hosting for Next.js. Routines are what you graduate to once those patterns are stable.
MakerKit ships with the AGENTS.md rules and MCP setup that make routines like this reliable out of the box, because an agent is only as good as the codebase it reasons about. That is the same principle behind every automation here: encode good structure once, then let the schedule do the rest.
Set one up and it handles the unglamorous part of running production while you sleep. And if you are feeling very brave, point it at a frontier model like Opus 4.8 with its 1M-token context, drop the canary guardrail, and let it push straight to production on its own. I would not, and everything above is an argument for why. But the routine will happily match whatever risk appetite you give it. Use at your own risk.