Build durable workflows with Cloudflare Workflows (GA April 2025). Features step.do, step.sleep, waitForEvent, Vitest testing, automatic retries, and state persistence for long-running tasks. Use when: creating workflows, implementing retries, or troubleshooting NonRetryableError, I/O context, serialization errors.
Installation
Details
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
skills listSkill Instructions
name: cloudflare-workflows description: | Build durable workflows with Cloudflare Workflows (GA April 2025). Features step.do, step.sleep, waitForEvent, Vitest testing, and runs for hours to days with automatic retries and state persistence.
Use when: creating long-running workflows, implementing retry logic, building event-driven processes, testing workflows with cloudflare:test, coordinating API calls, or troubleshooting NonRetryableError, I/O context errors, serialization failures.
Keywords: cloudflare workflows, workflows workers, durable execution, workflow step, WorkflowEntrypoint, step.do, step.sleep, workflow retries, NonRetryableError, workflow state, wrangler workflows, workflow events, long-running tasks, step.sleepUntil, step.waitForEvent, workflow bindings, vitest testing, cloudflare:test, introspectWorkflowInstance
Cloudflare Workflows
Status: Production Ready ✅ (GA since April 2025) Last Updated: 2025-11-25 Dependencies: cloudflare-worker-base (for Worker setup) Latest Versions: wrangler@4.50.0, @cloudflare/workers-types@4.20251121.0
Recent Updates (2025):
- April 2025: Workflows GA release - waitForEvent API, Vitest testing, CPU time metrics, 4,500 concurrent instances
- October 2025: Instance creation rate 10x faster (100/sec), concurrency increased to 10,000
- 2025 Limits: Max steps 1,024, state persistence 1MB/step (100MB-1GB per instance), event payloads 1MB, CPU time 5 min max
- Testing: cloudflare:test module with introspectWorkflowInstance, disableSleeps, mockStepResult, mockEvent modifiers
- Platform: Waiting instances don't count toward concurrency, retention 3-30 days, subrequests 50-1,000
Quick Start (5 Minutes)
# 1. Scaffold project
npm create cloudflare@latest my-workflow -- --template cloudflare/workflows-starter --git --deploy false
cd my-workflow
# 2. Configure wrangler.jsonc
{
"name": "my-workflow",
"main": "src/index.ts",
"compatibility_date": "2025-11-25",
"workflows": [{
"name": "my-workflow",
"binding": "MY_WORKFLOW",
"class_name": "MyWorkflow"
}]
}
# 3. Create workflow (src/index.ts)
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';
export class MyWorkflow extends WorkflowEntrypoint<Env, Params> {
async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
const result = await step.do('process', async () => { /* work */ });
await step.sleep('wait', '1 hour');
await step.do('continue', async () => { /* more work */ });
}
}
# 4. Deploy and test
npm run deploy
npx wrangler workflows instances list my-workflow
CRITICAL: Extends WorkflowEntrypoint, implements run() with step methods, bindings in wrangler.jsonc
Step Methods
step.do() - Execute Work
step.do<T>(name: string, config?: WorkflowStepConfig, callback: () => Promise<T>): Promise<T>
Parameters:
name- Step name (for observability)config(optional) - Retry configuration (retries, timeout, backoff)callback- Async function that does the work
Returns: Value from callback (must be serializable)
Example:
const result = await step.do('call API', { retries: { limit: 10, delay: '10s', backoff: 'exponential' }, timeout: '5 min' }, async () => {
return await fetch('https://api.example.com/data').then(r => r.json());
});
CRITICAL - Serialization:
- ✅ Allowed: string, number, boolean, Array, Object, null
- ❌ Forbidden: Function, Symbol, circular references, undefined
- Throws error if return value isn't JSON serializable
step.sleep() - Relative Sleep
step.sleep(name: string, duration: WorkflowDuration): Promise<void>
Parameters:
name- Step nameduration- Number (ms) or string:"second","minute","hour","day","week","month","year"(plural forms accepted)
Examples:
await step.sleep('wait 5 minutes', '5 minutes');
await step.sleep('wait 1 hour', '1 hour');
await step.sleep('wait 2 days', '2 days');
await step.sleep('wait 30 seconds', 30000); // milliseconds
Note: Resuming workflows take priority over new instances. Sleeps don't count toward step limits.
step.sleepUntil() - Sleep to Specific Date
step.sleepUntil(name: string, timestamp: Date | number): Promise<void>
Parameters:
name- Step nametimestamp- Date object or UNIX timestamp (milliseconds)
Examples:
await step.sleepUntil('wait for launch', new Date('2025-12-25T00:00:00Z'));
await step.sleepUntil('wait until time', Date.parse('24 Oct 2024 13:00:00 UTC'));
step.waitForEvent() - Wait for External Event (GA April 2025)
step.waitForEvent<T>(name: string, options: { type: string; timeout?: string | number }): Promise<T>
Parameters:
name- Step nameoptions.type- Event type to matchoptions.timeout(optional) - Max wait time (default: 24 hours, max: 30 days)
Returns: Event payload sent via instance.sendEvent()
Example:
export class PaymentWorkflow extends WorkflowEntrypoint<Env, Params> {
async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
await step.do('create payment', async () => { /* Stripe API */ });
const webhookData = await step.waitForEvent<StripeWebhook>(
'wait for payment confirmation',
{ type: 'stripe-webhook', timeout: '1 hour' }
);
if (webhookData.status === 'succeeded') {
await step.do('fulfill order', async () => { /* fulfill */ });
}
}
}
// Worker sends event to workflow
export default {
async fetch(req: Request, env: Env): Promise<Response> {
if (req.url.includes('/webhook/stripe')) {
const instance = await env.PAYMENT_WORKFLOW.get(instanceId);
await instance.sendEvent({ type: 'stripe-webhook', payload: await req.json() });
return new Response('OK');
}
}
};
Timeout handling:
try {
const event = await step.waitForEvent('wait for user', { type: 'user-submitted', timeout: '10 minutes' });
} catch (error) {
await step.do('send reminder', async () => { /* reminder */ });
}
WorkflowStepConfig
interface WorkflowStepConfig {
retries?: {
limit: number; // Max attempts (Infinity allowed)
delay: string | number; // Delay between retries
backoff?: 'constant' | 'linear' | 'exponential';
};
timeout?: string | number; // Max time per attempt
}
Default: { retries: { limit: 5, delay: 10000, backoff: 'exponential' }, timeout: '10 minutes' }
Backoff Examples:
// Constant: 30s, 30s, 30s
{ retries: { limit: 3, delay: '30 seconds', backoff: 'constant' } }
// Linear: 1m, 2m, 3m, 4m, 5m
{ retries: { limit: 5, delay: '1 minute', backoff: 'linear' } }
// Exponential (recommended): 10s, 20s, 40s, 80s, 160s
{ retries: { limit: 10, delay: '10 seconds', backoff: 'exponential' }, timeout: '5 minutes' }
// Unlimited retries
{ retries: { limit: Infinity, delay: '1 minute', backoff: 'exponential' } }
// No retries
{ retries: { limit: 0 } }
Error Handling
NonRetryableError
Force workflow to fail immediately without retrying:
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from 'cloudflare:workers';
import { NonRetryableError } from 'cloudflare:workflows';
export class MyWorkflow extends WorkflowEntrypoint<Env, Params> {
async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
await step.do('validate input', async () => {
if (!event.payload.userId) {
throw new NonRetryableError('userId is required');
}
// Validate user exists
const user = await this.env.DB.prepare(
'SELECT * FROM users WHERE id = ?'
).bind(event.payload.userId).first();
if (!user) {
// Terminal error - retrying won't help
throw new NonRetryableError('User not found');
}
return user;
});
}
}
When to use NonRetryableError:
- ✅ Authentication/authorization failures
- ✅ Invalid input that won't change
- ✅ Resource doesn't exist (404)
- ✅ Validation errors
- ❌ Network failures (should retry)
- ❌ Rate limits (should retry with backoff)
- ❌ Temporary service outages (should retry)
Catch Errors to Continue Workflow
Prevent workflow failure by catching optional step errors:
export class MyWorkflow extends WorkflowEntrypoint<Env, Params> {
async run(event: WorkflowEvent<Params>, step: WorkflowStep) {
await step.do('process payment', async () => { /* critical */ });
try {
await step.do('send email', async () => { /* optional */ });
} catch (error) {
await step.do('log failure', async () => {
await this.env.DB.prepare('INSERT INTO failed_emails VALUES (?, ?)').bind(event.payload.userId, error.message).run();
});
}
await step.do('update status', async () => { /* continues */ });
}
}
Graceful Degradation:
let result;
try {
result = await step.do('call primary API', async () => await callPrimaryAPI());
} catch {
result = await step.do('call backup API', async () => await callBackupAPI());
}
Triggering Workflows
Configure binding (wrangler.jsonc):
{
"workflows": [{
"name": "my-workflow",
"binding": "MY_WORKFLOW",
"class_name": "MyWorkflow",
"script_name": "workflow-worker" // If workflow in different Worker
}]
}
Trigger from Worker:
const instance = await env.MY_WORKFLOW.create({ params: { userId: '123' } });
return Response.json({ id: instance.id, status: await instance.status() });
Instance Management:
const instance = await env.MY_WORKFLOW.get(instanceId);
const status = await instance.status(); // { status: 'running'|'complete'|'errored'|'queued', error, output }
await instance.sendEvent({ type: 'user-action', payload: { action: 'approved' } });
await instance.pause();
await instance.resume();
await instance.terminate();
State Persistence
Workflows automatically persist state returned from step.do():
✅ Serializable:
- Primitives:
string,number,boolean,null - Arrays, Objects, Nested structures
❌ Non-Serializable:
- Functions, Symbols, circular references, undefined, class instances
Example:
// ✅ Good
const result = await step.do('fetch data', async () => ({
users: [{ id: 1, name: 'Alice' }],
timestamp: Date.now(),
metadata: null
}));
// ❌ Bad - function not serializable
const bad = await step.do('bad', async () => ({ data: [1, 2, 3], transform: (x) => x * 2 })); // Throws error!
Access State Across Steps:
const userData = await step.do('fetch user', async () => ({ id: 123, email: 'user@example.com' }));
const orderData = await step.do('create order', async () => ({ userId: userData.id, orderId: 'ORD-456' }));
await step.do('send email', async () => sendEmail({ to: userData.email, subject: `Order ${orderData.orderId}` }));
Observability
Built-in Metrics (Enhanced in 2025)
Workflows automatically track:
- Instance status: queued, running, complete, errored, paused, waiting
- Step execution: start/end times, duration, success/failure
- Retry history: attempts, errors, delays
- Sleep state: when workflow will wake up
- Output: return values from steps and run()
- CPU time (GA April 2025): Active processing time per instance for billing insights
View Metrics in Dashboard
Access via Cloudflare dashboard:
- Workers & Pages
- Select your workflow
- View instances and metrics
Metrics include:
- Total instances created
- Success/error rates
- Average execution time
- Step-level performance
- CPU time consumption (2025 feature)
Programmatic Access
const instance = await env.MY_WORKFLOW.get(instanceId);
const status = await instance.status();
console.log(status);
// {
// status: 'complete' | 'running' | 'errored' | 'queued' | 'waiting' | 'unknown',
// error: string | null,
// output: { userId: '123', status: 'processed' }
// }
CPU Time Configuration (2025):
// wrangler.jsonc
{ "limits": { "cpu_ms": 300000 } } // 5 minutes max (default: 30 seconds)
Limits (Updated 2025)
| Feature | Workers Free | Workers Paid |
|---|---|---|
| Max steps per workflow | 1,024 | 1,024 |
| Max state per step | 1 MiB | 1 MiB |
| Max state per instance | 100 MB | 1 GB |
| Max event payload size | 1 MiB | 1 MiB |
| Max sleep/sleepUntil duration | 365 days | 365 days |
| Max waitForEvent timeout | 365 days | 365 days |
| CPU time per step | 10 ms | 30 sec (default), 5 min (max) |
| Duration (wall clock) per step | Unlimited | Unlimited |
| Max workflow executions | 100,000/day | Unlimited |
| Concurrent instances | 25 | 10,000 (Oct 2025, up from 4,500) |
| Instance creation rate | 100/second | 100/second (Oct 2025, 10x faster) |
| Max queued instances | 100,000 | 1,000,000 |
| Max subrequests per instance | 50/request | 1,000/request |
| Retention (completed state) | 3 days | 30 days |
| Max Workflow name length | 64 chars | 64 chars |
| Max instance ID length | 100 chars | 100 chars |
CRITICAL Notes:
step.sleep()andstep.sleepUntil()do NOT count toward 1,024 step limit- Waiting instances (sleeping, retrying, or waiting for events) do NOT count toward concurrency limits
- Instance creation rate increased 10x (October 2025): 100 per 10 seconds → 100 per second
- Max concurrency increased (October 2025): 4,500 → 10,000 concurrent instances
- State persistence limits increased (2025): 128 KB → 1 MiB per step, 100 MB - 1 GB per instance
- Event payload size increased (2025): 128 KB → 1 MiB
- CPU time configurable via
wrangler.jsonc:{ "limits": { "cpu_ms": 300000 } }(5 min max)
Pricing
Requires Workers Paid plan ($5/month)
Workflow Executions:
- First 10,000,000 step executions/month: FREE
- After that: $0.30 per million step executions
What counts as a step execution:
- Each
step.do()call - Each retry of a step
step.sleep(),step.sleepUntil(),step.waitForEvent()do NOT count
Cost examples:
- Workflow with 5 steps, no retries: 5 step executions
- Workflow with 3 steps, 1 step retries 2 times: 5 step executions (3 + 2)
- 10M simple workflows/month (5 steps each): ((50M - 10M) / 1M) × $0.30 = $12/month
Troubleshooting
Issue: "Cannot perform I/O on behalf of a different request"
Cause: Trying to use I/O objects created in one request context from another request handler
Solution: Always perform I/O within step.do() callbacks
// ❌ Bad - I/O outside step
const response = await fetch('https://api.example.com/data');
const data = await response.json();
await step.do('use data', async () => {
// Using data from outside step's I/O context
return data; // This will fail!
});
// ✅ Good - I/O inside step
const data = await step.do('fetch data', async () => {
const response = await fetch('https://api.example.com/data');
return await response.json(); // ✅ Correct
});
Issue: NonRetryableError behaves differently in dev vs production
Known Issue: Throwing NonRetryableError with empty message in dev mode causes retries, but works correctly in production
Workaround: Always provide a message to NonRetryableError
// ❌ May retry in dev
throw new NonRetryableError();
// ✅ Works consistently
throw new NonRetryableError('User not found');
Source: workers-sdk#10113
Vitest Testing (GA April 2025)
Workflows support full testing integration via cloudflare:test module.
Setup
npm install -D vitest@latest @cloudflare/vitest-pool-workers@latest
vitest.config.ts:
import { defineWorkersConfig } from '@cloudflare/vitest-pool-workers/config';
export default defineWorkersConfig({ test: { poolOptions: { workers: { miniflare: { bindings: { MY_WORKFLOW: { scriptName: 'workflow' } } } } } } });
Introspection API
import { env, introspectWorkflowInstance } from 'cloudflare:test';
it('should complete workflow', async () => {
const instance = await introspectWorkflowInstance(env.MY_WORKFLOW, 'test-123');
try {
await instance.modify(async (m) => {
await m.disableSleeps(); // Skip all sleeps
await m.mockStepResult({ name: 'fetch data' }, { users: [{ id: 1 }] }); // Mock step result
await m.mockEvent({ type: 'approval', payload: { approved: true } }); // Send mock event
await m.mockStepError({ name: 'call API' }, new Error('Network timeout'), 1); // Force error once
});
await env.MY_WORKFLOW.create({ id: 'test-123' });
await expect(instance.waitForStatus('complete')).resolves.not.toThrow();
} finally {
await instance.dispose(); // Cleanup
}
});
Test Modifiers
disableSleeps(steps?)- Skip sleeps instantlymockStepResult(step, result)- Mock step.do() resultmockStepError(step, error, times?)- Force step.do() to throwmockEvent(event)- Send mock event to step.waitForEvent()forceStepTimeout(step, times?)- Force step.do() timeoutforceEventTimeout(step)- Force step.waitForEvent() timeout
Official Docs: https://developers.cloudflare.com/workers/testing/vitest-integration/
Related Documentation
- Cloudflare Workflows Docs: https://developers.cloudflare.com/workflows/
- Get Started Guide: https://developers.cloudflare.com/workflows/get-started/guide/
- Workers API: https://developers.cloudflare.com/workflows/build/workers-api/
- Vitest Testing: https://developers.cloudflare.com/workers/testing/vitest-integration/
- Sleeping and Retrying: https://developers.cloudflare.com/workflows/build/sleeping-and-retrying/
- Events and Parameters: https://developers.cloudflare.com/workflows/build/events-and-parameters/
- Limits: https://developers.cloudflare.com/workflows/reference/limits/
- Pricing: https://developers.cloudflare.com/workflows/platform/pricing/
- Changelog: https://developers.cloudflare.com/workflows/reference/changelog/
- MCP Tool: Use
mcp__cloudflare-docs__search_cloudflare_documentationfor latest docs
Last Updated: 2025-11-25 Version: 1.0.0 Maintainer: Jeremy Dawes | jeremy@jezweb.net
More by jezweb
View allSelf-hosted auth for TypeScript/Cloudflare Workers with social auth, 2FA, passkeys, organizations, RBAC, and 15+ plugins. Requires Drizzle ORM or Kysely for D1 (no direct adapter). Self-hosted alternative to Clerk/Auth.js. Use when: self-hosting auth on D1, building OAuth provider, multi-tenant SaaS, or troubleshooting D1 adapter errors, session caching, rate limits.
/review-skill - Skill Audit Command: Comprehensive skill documentation audit with automated checks and manual review phases.
Build rich text editors with Tiptap - headless editor framework with React and Tailwind v4. Covers SSR-safe setup, image uploads, prose styling, and collaborative editing. Use when creating blog editors, comment systems, or Notion-like apps, or troubleshooting SSR hydration errors, typography issues, or image upload problems.
Run LLMs and AI models on Cloudflare's GPU network with Workers AI. Includes Llama 4, Gemma 3, Mistral 3.1, Flux images, BGE embeddings, streaming, and AI Gateway. Handles 2025 breaking changes. Use when: implementing LLM inference, images, RAG, or troubleshooting AI_ERROR, rate limits, max_tokens, BGE pooling.
