
Documentation

Everything you need to build with CesaFlow.

# Quick Start

Get your first AI-generated project running in under 2 minutes.

1. Create an account

curl -X POST https://api.cesaflow.ai/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "[email protected]", "password": "secret", "organization_name": "My Org"}'

# Response:
# { "api_key": "sk_...", "org_id": "..." }
2. Start a run

curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"objective": "Build a FastAPI todo app with JWT auth", "project_id": "my-todo"}'

# Response:
# { "run_id": "abc-123", "status": "pending" }
3. Watch it build

# Poll for status
curl https://api.cesaflow.ai/api/v1/runs/abc-123 \
  -H "x-api-key: sk_YOUR_KEY"

# Or connect WebSocket for live events:
# wss://api.cesaflow.ai/ws/runs?api_key=sk_YOUR_KEY

# Download when done:
curl https://api.cesaflow.ai/api/v1/runs/abc-123/download \
  -H "x-api-key: sk_YOUR_KEY" -o workspace.zip
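
The polling step can also be scripted. The following is a minimal Python sketch, not an official client: the set of terminal statuses is an assumption (the examples here only show `pending` and `completed`), and `fetch_run` and `wait_for_run` are hypothetical helper names. The `fetch` and `sleep` parameters are injectable so the loop can be exercised without touching the network.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.cesaflow.ai"
# Assumption: terminal statuses; only "pending" and "completed" appear in the docs.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def fetch_run(run_id, api_key):
    """GET /api/v1/runs/{run_id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v1/runs/{run_id}",
        headers={"x-api-key": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_run(run_id, api_key, fetch=fetch_run, interval_s=5.0,
                 max_polls=120, sleep=time.sleep):
    """Poll until the run reports a terminal status or max_polls is reached."""
    for _ in range(max_polls):
        run = fetch(run_id, api_key)
        if run.get("status") in TERMINAL_STATUSES:
            return run
        sleep(interval_s)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")
```

Calling `wait_for_run("abc-123", "sk_YOUR_KEY")` would block until the run finishes, after which the download request above can be issued.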

# REST API Reference

Base URL: https://api.cesaflow.ai
Auth header: x-api-key: sk_YOUR_KEY

Endpoints

| Method | Path | Description | Auth |
|---|---|---|---|
| POST | /api/v1/auth/register | Create an account and get an API key | |
| POST | /api/v1/auth/login | Log in and get an API key | |
| POST | /api/v1/runs | Start a new run (returns run_id immediately) | 🔑 auth |
| GET | /api/v1/runs | List all runs for your org | 🔑 auth |
| GET | /api/v1/runs/{run_id} | Get run status, nodes, and files | 🔑 auth |
| GET | /api/v1/runs/{run_id}/download | Download workspace as ZIP | 🔑 auth |
| GET | /api/v1/benchmark/tasks | List all benchmark tasks | 🔑 auth |
| POST | /api/v1/benchmark/run | Start a benchmark run | 🔑 auth |
| GET | /api/v1/benchmark/{id} | Poll benchmark results | 🔑 auth |
| GET | /api/v1/billing/usage | Get monthly usage and budget | 🔑 auth |
| WS | /ws/runs?api_key=KEY | WebSocket: stream live run events | 🔑 auth |
| POST | /api/v1/runs/{run_id}/cancel | Cancel a running run | 🔑 auth |
| POST | /api/v1/runs/{run_id}/respond | Submit human-in-the-loop answer | 🔑 auth |
| GET | /api/v1/runs/{run_id}/tokens | Token usage and cost breakdown by agent | 🔑 auth |
| POST | /api/v1/estimate | Estimate cost/time for a project (3 scenarios) | 🔑 auth |
| POST | /api/v1/discussion/start | Start AI discussion session | 🔑 auth |
| POST | /api/v1/discussion/turn | Continue discussion (AI asks questions) | 🔑 auth |
| POST | /api/v1/projects | Create a new project | 🔑 auth |
| GET | /api/v1/projects | List all projects for your org | 🔑 auth |
| POST | /api/v1/cron/jobs | Create a scheduled cron job | 🔑 auth |
| GET | /api/v1/cron/jobs | List all cron jobs | 🔑 auth |
| DELETE | /api/v1/cron/jobs/{job_id} | Delete a cron job | 🔑 auth |
| POST | /api/v1/mcp/servers | Add an MCP server | 🔑 auth |
| GET | /api/v1/mcp/servers | List MCP servers | 🔑 auth |
| GET | /api/v1/mcp/servers/{id}/tools | Get tools from MCP server | 🔑 auth |
| POST | /api/v1/team/invite | Invite a team member by email | 🔑 auth |
| GET | /api/v1/team/members | List team members with roles | 🔑 auth |
| POST | /api/v1/inline/edit | Cmd+K: Edit selected code (IDE) | 🔑 auth |
| POST | /api/v1/inline/chat | Cmd+L: Chat about file (IDE) | 🔑 auth |
| POST | /api/v1/inline/complete | Tab completion (IDE) | 🔑 auth |
| GET | /api/v1/marketplace/templates | Browse agent templates | 🔑 auth |
| POST | /api/v1/marketplace/templates/{id}/install | Install a template | 🔑 auth |
| POST | /api/v1/github/webhook | GitHub App webhook receiver | |
| POST | /api/v1/policies | Create execution policy | 🔑 auth |
| GET | /api/v1/models/free | List models with free tiers | 🔑 auth |
| POST | /api/v1/models/compare | Compare models by cost/speed | 🔑 auth |
| POST | /api/v1/review | Submit code for AI review (diff or PR URL) | 🔑 auth |
| GET | /api/v1/review/{review_id} | Get review results | 🔑 auth |
| GET | /api/v1/runs/{run_id}/chain | Get all runs in a chain | 🔑 auth |
| GET | /api/v1/runs/learning/stats | Learning engine stats for your org | 🔑 auth |
| GET | /api/v1/runs/hierarchy/personas | List agent hierarchy personas and roles | |
| GET | /api/v1/runs/performance/insights | Performance metrics and optimization suggestions | 🔑 auth |
| GET | /api/v1/runs/money/templates | List revenue-generating project templates | 🔑 auth |
| POST | /api/v1/runs/money/start | Start a Money Mode run from template | 🔑 auth |
| GET | /api/v1/runs/clone/profile | Get your digital clone profile (learned preferences) | 🔑 auth |
| GET | /api/v1/runs/clone/context | Get prompt-injectable clone context | 🔑 auth |
| POST | /api/v1/runs/goal/start | Start goal-mode autonomous execution | 🔑 auth |
| GET | /api/v1/runs/goal/{goal_id} | Get goal execution status | 🔑 auth |
| GET | /api/v1/runs/auto-config | Get auto-optimized execution config | 🔑 auth |
| POST | /api/v1/runs/clone/decide | Digital Clone autonomous decision | 🔑 auth |
| GET | /api/v1/runs/clone/style | Inferred code style preferences | 🔑 auth |
| GET | /api/v1/runs/benchmark/trend | Benchmark regression trend | 🔑 auth |
| GET | /api/v1/runs/modules/available | List composable modules | 🔑 auth |
| POST | /api/v1/runs/modules/resolve | Resolve module dependencies | 🔑 auth |
| POST | /api/v1/marketplace/rate | Rate a marketplace template (1-5) | 🔑 auth |
| GET | /api/v1/marketplace/ratings/{id} | Get template average rating | |
| GET | /api/v1/runs/{run_id}/graph | Get execution graph (nodes, dependencies, progress) | 🔑 auth |
| GET | /api/v1/runs/{run_id}/logs | Get all shared-memory logs for a run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/workspace | File listing and summary for a run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/files/{path} | Get file content (preview, max 500KB) | 🔑 auth |
| PATCH | /api/v1/runs/{run_id}/files/{path} | Save/edit file content (browser IDE) | 🔑 auth |
| GET | /api/v1/runs/{run_id}/ide-url | Get OpenVSCode Server URL for this run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/waiting | Check if run has pending human input | 🔑 auth |
| GET | /api/v1/billing/plans | List all plan details (free/pro/enterprise) | |
| GET | /api/v1/billing/status | Current subscription status | 🔑 auth |
| POST | /api/v1/billing/checkout | Create Stripe Checkout Session | 🔑 auth |
| POST | /api/v1/billing/portal | Get Stripe Customer Portal URL | 🔑 auth |
| DELETE | /api/v1/team/members/{user_id} | Remove team member | 🔑 auth |
| PATCH | /api/v1/team/members/{user_id}/role | Update member role | 🔑 auth |
| GET | /api/v1/team/invite/{token} | Get invite metadata (public) | |
| POST | /api/v1/team/invite/accept | Accept invite and create account | |
| GET | /api/v1/projects/{id} | Get specific project | 🔑 auth |
| PUT | /api/v1/projects/{id} | Update project | 🔑 auth |
| DELETE | /api/v1/projects/{id} | Delete project | 🔑 auth |
| POST | /api/v1/model-credentials | Store encrypted model API key (BYOM) | 🔑 auth |
| GET | /api/v1/model-credentials | List stored credentials (keys hidden) | 🔑 auth |
| DELETE | /api/v1/model-credentials/{id} | Remove stored credential | 🔑 auth |
| POST | /api/v1/model-credentials/{id}/set-default | Set default credential | 🔑 auth |
| GET | /api/v1/marketplace/categories | List template categories | 🔑 auth |
| GET | /api/v1/marketplace/installed | List installed templates for org | 🔑 auth |
| DELETE | /api/v1/marketplace/templates/{id}/install | Uninstall template | 🔑 auth |
| GET | /api/v1/auth/me | Get current user and org info | 🔑 auth |
| POST | /api/v1/auth/forgot-password | Send password reset email | |
| POST | /api/v1/auth/reset-password | Reset password with token | |
| DELETE | /api/v1/mcp/servers/{id} | Remove MCP server | 🔑 auth |
| POST | /api/v1/tasks | Create and queue a task | 🔑 auth |
| GET | /api/v1/tasks | List tasks (filter by project, status) | 🔑 auth |
| GET | /api/v1/tasks/{id} | Get task details | 🔑 auth |
| PUT | /api/v1/tasks/{id} | Update task | 🔑 auth |
| POST | /api/v1/tasks/{id}/cancel | Cancel a task | 🔑 auth |
| POST | /api/v1/capture/events | Capture event from IDE/CLI/Git hook | 🔑 auth |
| GET | /api/v1/capture/events | List captured events | 🔑 auth |
| POST | /api/v1/capture/events/{id}/convert | Convert event to task | 🔑 auth |
| GET | /api/v1/admin/vault | List all API vault entries (admin) | 🔑 auth |
| POST | /api/v1/admin/vault | Add API key to vault (admin) | 🔑 auth |
| POST | /api/v1/admin/vault/{id}/test | Test vault entry connection (admin) | 🔑 auth |
| GET | /api/v1/runs/models/catalog | Full model catalog (57+ models, 25+ providers) | |
| GET | /api/v1/runs/models/catalog/{provider} | Models for specific provider | |
| GET | /api/v1/runs/free-runs | Free runs remaining for this org | 🔑 auth |
| GET | /health | System health check | |
| GET | /ping | Simple ping/pong | |

Example: Start a Run with project_id
curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "Add WebSocket support to the chat feature",
    "project_id": "my-chat-app"
  }'

# If project_id exists:
# - Reuses /workspace/project_my-chat-app/ (files preserved)
# - Loads previous plan and file history
# - Planner knows what was already built
WebSocket Events
// Connect: wss://api.cesaflow.ai/ws/runs?api_key=sk_KEY

// Events streamed:
{ "event": "run_started",   "run_id": "...", "node_count": 4 }
{ "event": "token_chunk",   "agent": "backend", "chunk": "..." }
{ "event": "file_written",  "agent": "backend", "path": "main.py", "bytes": 1240 }
{ "event": "run_completed", "run_id": "...", "files": ["main.py", ...] }
{ "event": "human_input_required", "question": "Which database?" }
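
A client typically dispatches on the `event` field of each message. The sketch below, assuming every message is a single JSON object shaped like the samples above, turns one raw message into a log line; `describe_event` is a hypothetical helper name, and the WebSocket connection itself is left to whichever client library you use.

```python
import json


def describe_event(raw_message):
    """Turn one JSON event message into a short human-readable line."""
    event = json.loads(raw_message)
    kind = event.get("event")
    if kind == "run_started":
        return f"run {event['run_id']} started with {event['node_count']} nodes"
    if kind == "token_chunk":
        return f"[{event['agent']}] {event['chunk']}"
    if kind == "file_written":
        return f"[{event['agent']}] wrote {event['path']} ({event['bytes']} bytes)"
    if kind == "run_completed":
        return f"run {event['run_id']} completed with {len(event['files'])} files"
    if kind == "human_input_required":
        return f"input needed: {event['question']}"
    # Unknown event types are reported rather than raising, so new events do not break the client.
    return f"unhandled event: {kind}"
```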

# CLI Reference

Install the CLI for terminal-first workflows.

Install
cd cli/
pip install -e .

# Or with pipx:
pipx install -e .
aios run: Start a task
aios run "Build a REST API for a blog"
aios run "Add JWT auth" --project my-blog
aios run "Build a FastAPI app" --tasks backend,qa   # select agents
aios run "..." --model gpt-4o                       # override model
aios benchmark: Run CesaFlow Benchmark Suite
aios benchmark              # run all 10 tasks
aios benchmark --list      # list available tasks
aios benchmark --tasks hello_world,calculator  # run specific tasks
aios logs: Stream live run events
aios logs <run-id>          # stream events for a run
aios logs <run-id> --follow  # keep streaming until done
Environment Variables
AIOS_API_KEY=sk_...      # your API key
AIOS_API_URL=https://api.cesaflow.ai  # API base URL (default)
OPENAI_API_KEY=sk_...   # for agents to call LLMs
GITHUB_TOKEN=ghp_...    # for git/PR tools (optional)
PLAYWRIGHT_ENABLED=1    # enable browser tool (optional)

# Agent Tools

All tools available to agents during a run. Agents choose which tools to call based on the task.

Files
write_file

Write content to a file in the workspace.

params: path: string, content: string

Files
read_file

Read the contents of a workspace file.

params: path: string

Files
list_files

List all files in the workspace.

Shell
run_command

Execute a shell command (pip install, pytest, npm, etc.).

params: command: string

Shell
read_test_output

Read the output of the last run_command call.

Web
web_search

Search the web via DuckDuckGo for documentation or packages.

params: query: string

Browser
browser_navigate

Navigate to a URL (Playwright or httpx fallback).

params: url: string

Browser
browser_get_content

Read text content of the current page.

Browser
browser_click

Click an element on the current page.

params: selector: string

Browser
browser_screenshot

Take a screenshot of the current page (base64 PNG).

Git
git_init

Initialize a git repo and make the first commit.

params: message?: string

Git
git_status

Run git status and show last 5 commits.

Git
github_create_pr

Open a pull request via GitHub API (requires GITHUB_TOKEN).

params: owner, repo, title, body, head, base

Interactive
ask_human

Pause execution and ask the user a question. Run waits until user responds.

params: question: string

Design
figma_get_file

Access Figma design files and extract component information.

params: file_key: string

Web
firecrawl_scrape

Scrape and extract web content with JavaScript rendering and Markdown conversion.

params: url: string

Files
pdf_to_text

Extract text content from a PDF file.

params: path: string

Deploy
generate_deploy

Generate deployment configs for Vercel, Railway, Fly.io, Docker, or Render.

params: platform: string, app_name?: string

Server
ssh_execute

Execute commands on remote servers via SSH.

params: host: string, command: string, username?: string

Communication
send_email

Send emails via SMTP for notifications and reports.

params: to: string, subject: string, body: string

Payment
stripe_create_product

Create a Stripe product with pricing for subscription or one-time billing.

params: name: string, price_cents?: int, recurring?: bool

Payment
stripe_create_checkout

Create a Stripe Checkout Session URL for accepting payments.

params: price_id: string, success_url?: string

Database
setup_database

Generate database config, Docker Compose, and initialization scripts for PostgreSQL, SQLite, or MongoDB.

params: db_type?: string, app_name?: string

Config
generate_env

Generate .env, .env.example, and .gitignore with proper project configuration.

params: app_name?: string, db_type?: string, include_stripe?: bool

Deploy
deploy_to_platform

Execute actual deployment to Vercel, Railway, Fly.io, or Docker.

params: platform: string, token?: string

Web
http_request

Make HTTP GET/POST/PUT/DELETE requests to external APIs. SSRF-protected.

params: url: string, method?: string, headers?: object, body?: string

Database
db_query

Execute read-only SQL queries against PostgreSQL. SELECT only, max 100 rows.

params: query: string, connection_string?: string

Web
download_file

Download files from URLs into the workspace. Max 10MB, SSRF-protected.

params: url: string, filename?: string

# Advanced Tools

Beyond the standard file/shell/browser tools, CesaFlow agents have access to real-world integration tools.

🖥️ SSH

Execute commands on remote servers. Useful for deployment, server management, and monitoring.

ssh_execute(host="server.com", command="docker ps")

📧 Email

Send emails via SMTP. Useful for notifications, reports, and automated communication.

send_email(to="[email protected]", subject="Deploy Complete", body="...")

💳 Stripe

Create products, prices, and checkout sessions. Agents can add payments to generated apps.

stripe_create_product(name="Pro Plan", price_cents=2900)

🗄️ Database

Generate database configs for PostgreSQL, SQLite, or MongoDB with Docker Compose and init scripts.

setup_database(db_type="postgresql", app_name="myapp")

🚀 Deploy

Generate platform-specific deployment configs (Vercel, Railway, Fly.io, Docker, Render).

generate_deploy(platform="vercel", app_name="my-saas")

⚙️ Environment

Generate .env, .env.example, and .gitignore with proper configuration.

generate_env(app_name="myapp", include_stripe=true)

# Architecture

How CesaFlow runs your task from start to finish.

Agent Pipeline

POST /api/v1/runs
       │
       ▼
  GraphScheduler
       │
  Wave 1: ┌─────────┐
          │ Planner │  ← generates API Contract
          └─────────┘
               │
  Wave 2: ┌─────────┐   ┌──────────┐
          │ Backend │   │ Frontend │  ← run in PARALLEL
          └─────────┘   └──────────┘
               │               │
  Wave 3:      └───────┬───────┘
                  ┌────┴────┐
                  │   QA    │  ← waits for both
                  └─────────┘
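
One way to derive waves like these from a dependency graph is to repeatedly take every node whose dependencies have already run. This is an illustrative sketch only; how GraphScheduler actually computes waves is not documented here, and `plan_waves` is a hypothetical name.

```python
def plan_waves(dependencies):
    """Group nodes into waves; a node runs once all of its dependencies have run."""
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    done = set()
    waves = []
    while remaining:
        # Every node whose dependencies are all finished can run in parallel.
        ready = sorted(node for node, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        waves.append(ready)
        done.update(ready)
        for node in ready:
            del remaining[node]
    return waves


graph = {
    "planner": [],
    "backend": ["planner"],
    "frontend": ["planner"],
    "qa": ["backend", "frontend"],
}
print(plan_waves(graph))  # [['planner'], ['backend', 'frontend'], ['qa']]
```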

Self-Debugging Loop

Agent runs → validates (pytest/syntax check)
    │
    ├─ PASS → next wave
    │
    └─ FAIL → inject debug_info into prompt
                  {
                    error_type: "validation_failure",
                    failed_command: "pytest tests/",
                    full_output: "...",   // full pytest output
                    failed_tests: [...],  // parsed test names
                    attempt: 2
                  }
              → agent retries (max 3 attempts)
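
The loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CesaFlow's implementation: `run_agent` and `validate` are hypothetical callables standing in for the real agent and validator, and the `attempt` field is assumed to hold the number of the upcoming retry.

```python
MAX_ATTEMPTS = 3


def run_with_retries(run_agent, validate, max_attempts=MAX_ATTEMPTS):
    """Run the agent, validate its output, and retry with debug_info on failure.

    run_agent(debug_info) -> output
    validate(output) -> (ok, full_output, failed_tests)
    """
    debug_info = None
    for attempt in range(1, max_attempts + 1):
        output = run_agent(debug_info)
        ok, full_output, failed_tests = validate(output)
        if ok:
            return {"status": "pass", "attempts": attempt, "output": output}
        # Context handed to the next attempt (field names follow the diagram above).
        debug_info = {
            "error_type": "validation_failure",
            "failed_command": "pytest tests/",
            "full_output": full_output,
            "failed_tests": failed_tests,
            "attempt": attempt + 1,  # assumption: number of the upcoming retry
        }
    return {"status": "fail", "attempts": max_attempts, "failed_tests": failed_tests}
```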

Project Memory

project_id = "my-blog"

Workspace:  /workspace/project_my-blog/   ← shared, persistent
DB key:     project:my-blog:state         ← PostgreSQL project_memory

State stored:
  { project_id, total_runs, files[], runs[{objective, files, ts}] }

Next run with same project_id:
  → Planner sees: "Continuation of project, existing files: [...]"
  → DO NOT rewrite; build on top

# Benchmark

CesaFlow includes a built-in benchmark suite: 10 coding tasks evaluated automatically, no human grading. This is CesaFlow's internal evaluation suite, not the academic SWE-bench dataset. It measures end-to-end agent capability across file generation, code correctness, and test execution.

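Task List

| Task | Difficulty |
|---|---|
| hello_world | trivial |
| calculator | easy |
| fastapi_health | easy |
| todo_api | medium |
| data_processing | medium |
| auth_jwt | medium |
| websocket_chat | hard |
| async_scraper | hard |
| react_component | medium |
| docker_compose | medium |
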
Scoring
Each task: 10 points max
Check types:
  file_exists:       does the required file exist?
  file_contains:     does it contain the expected code?
  any_file_contains: does any file contain the pattern?

Score = (passed_checks / total_checks) * 10
Grade: S (90+) · A (80+) · B (70+) · C (60+) · D (50+) · F (<50)
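
The scoring rule can be worked through in Python. One assumption is made loudly here: the grade thresholds (90+, 80+, ...) are taken to apply to the suite total, that is, the sum of the ten per-task scores on a 0-100 scale. The text does not state this explicitly, and `task_score` and `grade` are hypothetical names.

```python
def task_score(passed_checks, total_checks):
    """Score one task on the 0-10 scale: (passed_checks / total_checks) * 10."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return passed_checks / total_checks * 10


def grade(total_score):
    """Map a suite total (assumed 0-100, the sum of ten task scores) to a letter."""
    for letter, floor in (("S", 90), ("A", 80), ("B", 70), ("C", 60), ("D", 50)):
        if total_score >= floor:
            return letter
    return "F"


# Ten tasks that each pass 3 of 4 checks: 7.5 points per task, 75 in total.
total = sum(task_score(3, 4) for _ in range(10))
print(total, grade(total))  # 75.0 B
```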

# Execution Personas

Choose a persona to tailor agent behavior to your use case. Pass persona in your run request.

  • fullstack: Planner + Backend + Frontend + QA, for complete web apps. Example: "Build a blog with auth and admin panel"
  • api_only: Planner + Backend + QA, for REST APIs and microservices. Example: "Create a payment webhook handler"
  • data_pipeline: Planner + Backend + QA, for ETL, pandas, and SQL processing. Example: "Build a CSV import pipeline with validation"
  • saas_starter: Full stack + auth, billing, multi-tenancy hints. Example: "Launch a SaaS dashboard with Stripe"
  • microservice: Single-responsibility service with Docker + health checks. Example: "Image resizing microservice with S3"

Usage
curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"objective": "Build a payment API", "persona": "api_only"}'

# AI Discussion & Planning

Refine vague ideas through multi-turn conversation before executing. The AI asks focused questions to clarify requirements.

Flow
# 1. Start a session
POST /api/v1/discussion/start
{ "message": "I want to build something for managing tasks" }

# 2. AI asks clarifying questions
Response: { "reply": "Should it have user auth? REST or GraphQL?", "ready": false }

# 3. Continue the conversation
POST /api/v1/discussion/turn
{ "session_id": "...", "message": "REST with JWT, PostgreSQL, and team support" }

# 4. After 3-5 turns, AI declares ready
Response: { "ready": true, "refined_objective": "Build a task management REST API with JWT auth, PostgreSQL, team workspaces, and role-based access control" }

# 5. Execute the refined objective
POST /api/v1/runs
{ "objective": "<refined_objective>" }
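
The five steps can be driven by a small loop. The sketch below is a hedged illustration: `post` and `answer` are hypothetical callables supplied by the caller (one performs the HTTP request, the other produces your reply to each question), and it assumes the start response includes a `session_id`, which the example above implies but does not show.

```python
def refine_objective(first_message, post, answer, max_turns=5):
    """Drive a discussion session until the API reports ready.

    post(path, body) -> dict   performs the HTTP call (hypothetical helper)
    answer(question) -> str    supplies the user's reply (hypothetical helper)
    """
    reply = post("/api/v1/discussion/start", {"message": first_message})
    session_id = reply.get("session_id")  # assumption: returned by /discussion/start
    for _ in range(max_turns):
        if reply.get("ready"):
            return reply["refined_objective"]
        reply = post(
            "/api/v1/discussion/turn",
            {"session_id": session_id, "message": answer(reply["reply"])},
        )
    if reply.get("ready"):
        return reply["refined_objective"]
    raise RuntimeError("discussion did not converge within max_turns")
```

The returned string can then be sent as the `objective` of a `POST /api/v1/runs` request.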

# Cost Estimation

Get detailed cost, time, and quality estimates before running. Returns 3 execution scenarios.

POST /api/v1/estimate
{ "objective": "Build a blog with auth and comments" }

Response:
{
  "tasks": [
    { "name": "Setup project", "complexity": "low", "tokens_est": 2000 },
    { "name": "User auth system", "complexity": "medium", "tokens_est": 8000 },
    ...
  ],
  "plans": [
    { "name": "Premium Fast", "model": "gpt-4o", "cost_usd": 0.12, "quality": 9.2, "risk": "low" },
    { "name": "Balanced", "model": "gemini-1.5-flash", "cost_usd": 0.03, "quality": 7.8, "risk": "medium" },
    { "name": "Free", "model": "groq-llama", "cost_usd": 0.00, "quality": 6.5, "risk": "high" }
  ]
}
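
Once the estimate comes back, choosing among the three plans is a simple filter. A minimal sketch, assuming the `plans` array has the shape shown above; `choose_plan` and its budget and quality thresholds are hypothetical.

```python
def choose_plan(plans, max_cost_usd, min_quality):
    """Return the highest-quality plan within budget, or None if none qualifies."""
    eligible = [
        p for p in plans
        if p["cost_usd"] <= max_cost_usd and p["quality"] >= min_quality
    ]
    return max(eligible, key=lambda p: p["quality"], default=None)


# The three plans from the sample response above.
plans = [
    {"name": "Premium Fast", "model": "gpt-4o", "cost_usd": 0.12, "quality": 9.2, "risk": "low"},
    {"name": "Balanced", "model": "gemini-1.5-flash", "cost_usd": 0.03, "quality": 7.8, "risk": "medium"},
    {"name": "Free", "model": "groq-llama", "cost_usd": 0.00, "quality": 6.5, "risk": "high"},
]
print(choose_plan(plans, max_cost_usd=0.05, min_quality=7.0)["name"])  # Balanced
```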

# MCP Integration

Connect external tools via Model Context Protocol. Agents can access GitHub, Linear, Slack, Notion, PostgreSQL, and any MCP-compatible server.

# Add an MCP server
POST /api/v1/mcp/servers
{
  "name": "GitHub",
  "url": "https://mcp-github.example.com",
  "api_key": "ghp_..."
}

# List available tools from the server
GET /api/v1/mcp/servers/{server_id}/tools
Response: [
  { "name": "create_issue", "description": "Create GitHub issue" },
  { "name": "search_code", "description": "Search code in repos" },
  ...
]

# Tools are automatically available to agents during runs

# Scheduled Runs (Cron)

Automate recurring tasks with standard cron expressions. Jobs persist across restarts.

# Create a daily job at 9 AM
POST /api/v1/cron/jobs
{
  "label": "Daily report",
  "objective": "Generate a summary report of yesterday's sales data",
  "cron_expr": "0 9 * * *",
  "project_id": "daily-reports"
}

# List all jobs
GET /api/v1/cron/jobs

# Delete a job
DELETE /api/v1/cron/jobs/{job_id}

# Cron format: minute hour day month day_of_week
# Examples: "0 9 * * *" (daily 9am), "0 */6 * * *" (every 6h), "0 9 * * 1" (Mon 9am)
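
To check an expression before submitting it, a simplified matcher is enough for the examples above. This sketch supports only `*`, `*/n`, and a single integer per field; ranges, lists, names, and `7` for Sunday are not handled, and it requires all five fields to match (standard cron treats day and day_of_week as alternatives when both are restricted). `cron_matches` is a hypothetical helper, not part of the API.

```python
from datetime import datetime

# (field name, minimum value) in standard five-field order.
FIELDS = (("minute", 0), ("hour", 0), ("day", 1), ("month", 1), ("day_of_week", 0))


def field_matches(spec, value, minimum):
    """Match one field; supports only '*', '*/n', and a single integer."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return (value - minimum) % int(spec[2:]) == 0
    return int(spec) == value


def cron_matches(expr, when):
    """Return True if the datetime falls on the schedule described by expr."""
    specs = expr.split()
    if len(specs) != len(FIELDS):
        raise ValueError("expected 5 fields: minute hour day month day_of_week")
    values = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        (when.weekday() + 1) % 7,  # Python: Monday=0; cron: Sunday=0
    )
    return all(
        field_matches(spec, value, minimum)
        for spec, value, (_, minimum) in zip(specs, values, FIELDS)
    )


monday_9am = datetime(2024, 1, 1, 9, 0)  # 1 January 2024 was a Monday
print(cron_matches("0 9 * * 1", monday_9am))  # True
```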

# GitHub App Integration

Connect CesaFlow as a GitHub App for automated PR reviews, issue fixes, and CI debugging.

Capabilities

  • PR Auto-Review: Opens a PR → CesaFlow reviews code, posts comments
  • Issue Fix Command: Comment /cesaflow fix on any issue → agents analyze and submit a fix PR
  • CI Debugging: Workflow fails → CesaFlow reads logs, identifies root cause, suggests fix
  • Issue Analysis: New issue opened → CesaFlow analyzes and creates a run

# Template Marketplace

Pre-built agent templates with optimized system prompts for specific frameworks and use cases.

| Template | Category | Description |
|---|---|---|
| Django | Backend | Django REST Framework with auth, admin, migrations |
| Rails | Backend | Ruby on Rails with ActiveRecord, RSpec |
| Rust CLI | Systems | Rust CLI app with clap, error handling |
| Data Science | Data & ML | Jupyter, pandas, scikit-learn pipelines |
| DevOps | DevOps | Terraform, Docker, CI/CD, monitoring |
| React Native | Mobile | Cross-platform mobile with Expo |
| Golang | Backend | Go service with Chi router, GORM |
| Node.js | Backend | Express/Fastify with TypeScript |
| Java Spring | Backend | Spring Boot with JPA, Security |
| Kubernetes | DevOps | K8s manifests, Helm charts, operators |
| LLM Chatbot | AI | Chatbot with LangChain, vector DB, RAG |
| Python CLI | Systems | Click CLI with rich output, testing |

# Human-in-the-Loop

Agents can pause execution and ask you questions when they need clarification.

# During a run, an agent calls the ask_human tool:
# Agent: "Should I use PostgreSQL or SQLite for the database?"

# WebSocket event:
{ "event": "human_input_required", "run_id": "...", "question": "Which database?" }

# Check if a run is waiting for input:
GET /api/v1/runs/{run_id}/waiting

# Submit your answer:
POST /api/v1/runs/{run_id}/respond
{ "answer": "Use PostgreSQL with SQLAlchemy" }

# Agent receives answer and continues execution

# Inline AI (IDE Endpoints)

Fast single-turn endpoints for the CesaFlow IDE (browser-based, OpenVSCode Server). Target: <300ms response time.

Cmd+K: Edit Code
POST /api/v1/inline/edit
{ "code": "def hello():\n  print('hi')", "instruction": "Add type hints and docstring" }

Response: { "edited_code": "def hello() -> None:\n  \"\"\"Greet the user.\"\"\"\n  print('hi')", "summary": "Added return type and docstring" }
Cmd+L: Chat About Code
POST /api/v1/inline/chat
{ "file_context": "...", "message": "What does this function do?" }

Response: { "reply": "This function validates JWT tokens..." }
Tab: Completion
POST /api/v1/inline/complete
{ "prefix": "def calculate_tax(amount: float, ", "suffix": "):\n  return" }

Response: { "completion": "rate: float = 0.2" }

# Code Review API

AI-powered code review for pull requests and diffs. Submit code changes and get structured feedback with risk assessment, security analysis, and rollback plans.

Submit a Review
# Review a GitHub PR
POST /api/v1/review
{
  "github_url": "https://github.com/owner/repo/pull/123",
  "focus": "security",      // general | security | performance | style
  "language": "auto"
}

# Or review a raw diff
POST /api/v1/review
{
  "diff_content": "--- a/main.py\n+++ b/main.py\n@@ -1,3 +1,5 @@...",
  "focus": "general"
}

# Response: { "review_id": "abc123", "status": "processing" }

# Poll for results
GET /api/v1/review/abc123
# Returns: { "status": "completed", "result": "## Summary\n..." }

Review Output Includes

  • Summary: What the change does in 1-3 sentences
  • Risk Assessment: LOW / MEDIUM / HIGH with explanation
  • Issues Found: Critical, warning, and suggestion-level issues with fixes
  • Security Check: SQL injection, XSS, hardcoded secrets, input validation
  • Suggestions: Code quality, performance, and pattern improvements
  • Rollback Plan: What to do if the change causes production issues

# Multi-Run Chaining

Chain runs together for multi-step autonomous execution. When a run completes, it can automatically trigger the next objective.

Start a Chain
POST /api/v1/runs
{
  "objective": "Build a REST API for a blog",
  "project_id": "my-blog",
  "continuation": {
    "next_objective": "Add frontend with React for the blog API",
    "trigger_condition": "on_success",
    "max_depth": 3,
    "inherit_workspace": true,
    "then": {
      "next_objective": "Write E2E tests for the full-stack blog",
      "trigger_condition": "on_success"
    }
  }
}

# Flow:
# Run 1: Build API → completes → triggers Run 2
# Run 2: Build frontend → completes → triggers Run 3
# Run 3: Write tests → completes → chain done

# Track the chain:
GET /api/v1/runs/{run_id}/chain
# Returns all runs in order with status

Trigger Conditions

  • on_success: Continue only if the run completed successfully
  • on_failure: Continue only if the run failed (useful for fallback strategies)
  • always: Continue regardless of outcome
  • spec_pass: Continue only if spec validation score ≥ 70
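
The trigger conditions and the nested `continuation` structure can be modeled client-side, for example to preview what a chain will do. This is a sketch under assumptions: the status strings `"completed"` and `"failed"` are assumed values, and both function names are hypothetical.

```python
def should_trigger(condition, status, spec_score=None):
    """Decide whether the next run starts, given the finished run's outcome."""
    if condition == "on_success":
        return status == "completed"
    if condition == "on_failure":
        return status == "failed"
    if condition == "always":
        return True
    if condition == "spec_pass":
        return spec_score is not None and spec_score >= 70
    raise ValueError(f"unknown trigger_condition: {condition}")


def flatten_chain(run_request):
    """List the objectives of a nested continuation in execution order."""
    objectives = [run_request["objective"]]
    step = run_request.get("continuation")
    while step:
        objectives.append(step["next_objective"])
        step = step.get("then")  # each level nests the following step under "then"
    return objectives
```

Applied to the request body in "Start a Chain", `flatten_chain` returns the three objectives in the order Run 1, Run 2, Run 3.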

# Learning Engine

CesaFlow learns from every run. Error patterns are tracked per organization, and lessons are automatically injected into agent prompts to avoid repeating the same mistakes.

How It Works

  1. Agent encounters a validation failure (tests fail, syntax error, etc.)
  2. Error pattern is recorded: error type, message, agent, context
  3. If the agent fixes the issue on retry, the fix is recorded too
  4. On future runs, relevant lessons are injected into the agent's prompt
  5. Agent avoids known patterns and applies known fixes proactively
Learning Stats API
GET /api/v1/runs/learning/stats

Response:
{
  "total_patterns": 23,
  "fixed_patterns": 18,
  "by_agent": {
    "backend": 12,
    "frontend": 8,
    "qa": 3
  },
  "top_errors": [
    { "type": "validation_failure", "count": 7, "agent": "backend" },
    { "type": "validation_failure", "count": 5, "agent": "frontend" }
  ]
}

# Agent Hierarchy

CesaFlow supports multi-level agent organizations. Choose a hierarchy persona to control how agents are structured.

Available Personas

  • standard: Planner → Backend + Frontend → QA (default)
  • enterprise: CEO → CTO → Backend + Frontend → QA (full hierarchy with strategic planning)
  • with_devops: Planner → Backend + Frontend → QA → DevOps (includes deployment setup)

7 Agent Roles

  • CEO (Level 0): Strategic planning, prioritization, resource allocation. Delegates to CTO and Growth.
  • CTO (Level 1): Architecture, tech stack selection, system design. Delegates to Dev agents.
  • Backend (Level 2): API development, database, server-side logic.
  • Frontend (Level 2): UI development, responsive design, accessibility.
  • QA (Level 2): Testing, code review, quality assurance.
  • DevOps (Level 2): CI/CD, Docker, infrastructure, deployment.
  • Growth (Level 1): SEO, analytics, marketing, user research.

# Performance Insights

CesaFlow tracks run metrics and provides optimization suggestions. The system identifies trends and recommends the best models for your workload.

GET /api/v1/runs/performance/insights

Response:
{
  "total_runs": 47,
  "pass_rate": 85.1,
  "avg_duration_s": 124.5,
  "avg_cost_usd": 0.0342,
  "trend": "improving",
  "best_model": "claude-3-5-sonnet",
  "model_usage": { ... },
  "persona_stats": { ... },
  "suggestions": [
    "Performance looks healthy. Keep going!",
    "Model 'claude-3-5-sonnet' has the best success rate"
  ]
}

# Money Mode

Revenue-generating project templates. Pick a business model and CesaFlow builds the entire application, including auth, payments, dashboard, and deployment config.

Available Templates

| Template | Revenue model | Time |
|---|---|---|
| SaaS MVP | Subscription ($29-499/mo) | 15-30 min |
| API-as-a-Service | Usage-based ($0.001-0.01/req) | 10-20 min |
| E-Commerce Store | Product sales | 20-40 min |
| Marketplace | Commission (10-20%) | 30-50 min |
| Lead Generation | Lead sales ($5-50/lead) | 10-20 min |
| AI Wrapper Product | Usage credits ($10-100/mo) | 10-20 min |
| Booking System | Booking fees ($5-100) | 15-25 min |
| Paid Newsletter | Subscriptions ($5-20/mo) | 15-25 min |

Start a Money Mode Run
POST /api/v1/runs/money/start
{
  "template_id": "saas_mvp",
  "customization": "Use React + Tailwind, target fitness industry"
}

# Response: { "run_id": "...", "template": "SaaS MVP", "revenue_model": "Subscription" }

# Deploy

CesaFlow generates deployment configurations automatically. Use the with_devops build mode or the generate_deploy tool to create platform-specific configs.

Supported Platforms

  • Vercel: Best for Next.js and React apps. Generates vercel.json + .vercelignore
  • Railway: Full-stack apps with databases. Generates railway.toml + Procfile
  • Fly.io: Global Docker deployments. Generates fly.toml + Dockerfile
  • Render: Simple web services. Generates render.yaml
  • Docker: Self-hosted. Generates Dockerfile + docker-compose.yml

Agent Tool Usage
# During a run, agents can call:
generate_deploy(platform="vercel", app_name="my-saas")

# This creates deployment files in the workspace:
# → vercel.json
# → .vercelignore

# Or for Docker:
generate_deploy(platform="docker", app_name="my-api")
# → Dockerfile
# → docker-compose.yml
# → .dockerignore

# Digital Clone

CesaFlow builds a digital clone of your preferences over time. It learns your tech stack, coding patterns, and project preferences, then instructs agents to work the way you would.

What It Learns

  • Tech Stack: Which languages and frameworks you use most (Python, TypeScript, React, FastAPI, etc.)
  • Build Mode: Your preferred agent persona (fullstack, api_only, enterprise, etc.)
  • AI Model: Which model performs best for your workloads
  • Project Patterns: Common keywords, file types, and architecture choices

Clone Profile API
GET /api/v1/runs/clone/profile

Response:
{
  "status": "active",
  "runs_analyzed": 23,
  "preferred_stack": ["python", "typescript", "react"],
  "preferred_persona": "fullstack",
  "preferred_model": "claude-3-5-sonnet",
  "top_keywords": [
    { "word": "api", "count": 8 },
    { "word": "auth", "count": 5 }
  ],
  "profile_summary": "Primarily works with python, typescript, react. Prefers fullstack builds."
}

# Clone context is automatically injected into agent prompts:
GET /api/v1/runs/clone/context
# → "This user prefers: python, typescript. Default: fullstack. Adapt output accordingly."

# Goal Mode

Set a high-level goal and CesaFlow autonomously decomposes it into tasks, chains them together, and executes until done. No "continue" needed.

Start a Goal
POST /api/v1/runs/goal/start
{
  "goal": "Build a complete SaaS product for project management with auth, teams, tasks, billing, and deploy to Vercel",
  "project_id": "my-saas"
}

# CesaFlow:
# 1. Decomposes into 5 tasks:
#    → Build auth system with JWT
#    → Create team + task management API
#    → Build React frontend with dashboard
#    → Add Stripe billing integration
#    → Generate Vercel deploy config + tests
#
# 2. Chains them automatically
# 3. Each task validates before continuing
# 4. Self-corrects on failure
#
# Response: { "goal_id": "...", "run_id": "...", "tasks": [...], "total_steps": 5 }
Auto-Optimization
GET /api/v1/runs/auto-config

# Returns optimal config based on your run history:
{
  "auto_optimized": true,
  "recommended_model": "claude-3-5-sonnet",
  "recommended_strategy": "balanced",
  "recommended_provider": "anthropic"
}

# CesaFlow analyzes your pass rate, cost trends, and model performance
# to automatically suggest the best configuration.

Full Autonomy Stack

These systems work together for end-to-end autonomous execution:

  • 1. Goal Mode decomposes objective into sequential tasks
  • 2. Multi-Run Chaining auto-starts next task on completion
  • 3. Agent Hierarchy assigns roles (CEO→CTO→Dev→QA)
  • 4. Self-Debug Loop retries on failure (3 attempts)
  • 5. Learning Engine avoids past mistakes
  • 6. Digital Clone applies your preferences automatically
  • 7. Auto-Optimization selects best model from history
  • 8. Guardrails enforce cost/runtime/tool limits safely
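
The interaction between task chaining (steps 1 and 2) and the self-debug loop (step 4) can be sketched as follows. This is a simplified Python model, not CesaFlow's orchestrator: `run_chain`, `fake_execute`, and the rule that the chain stops after a task exhausts its retries are all assumptions for illustration.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # mirrors the "3 attempts" noted for the Self-Debug Loop


def run_chain(tasks: list, execute: Callable[[str, int], bool]) -> list:
    """Run tasks in order, retrying each up to MAX_ATTEMPTS times."""
    results = []
    for task in tasks:
        passed = False
        attempts = 0
        while attempts < MAX_ATTEMPTS and not passed:
            attempts += 1
            passed = execute(task, attempts)  # True means validation passed
        results.append({"task": task, "attempts": attempts, "passed": passed})
        if not passed:
            break  # assumed behavior: later tasks do not start after a failure
    return results


def fake_execute(task: str, attempt: int) -> bool:
    """Stand-in executor: the billing task fails once, then passes on retry."""
    return not ("billing" in task and attempt == 1)


tasks = ["Build auth system with JWT", "Add Stripe billing integration"]
for row in run_chain(tasks, fake_execute):
    print(row)
```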

# Webhook Continuation

Enable indefinite autonomous execution by connecting external systems via webhooks. When a run completes, CesaFlow notifies your webhook; if the webhook responds with a new objective, a follow-up run starts automatically.

# Start a run with webhook
POST /api/v1/runs
{
  "objective": "Build a REST API for blog",
  "webhook_url": "https://your-server.com/cesaflow-hook"
}

# When run completes, CesaFlow POSTs to your webhook:
{
  "event": "run_completed",
  "run_id": "abc-123",
  "status": "completed",
  "files_count": 12,
  "chain_id": "chain_abc"
}

# Your webhook can respond:
{ "continue": true, "next_objective": "Add frontend for the blog API" }
# → CesaFlow automatically starts a new run!

# Or stop the chain:
{ "continue": false }

# Use cases:
# • CI/CD pipelines (build → test → deploy stages)
# • Monitoring (daily check → fix if broken)
# • Multi-system orchestration
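
On the receiving side, the decision your webhook makes can be as small as the Python sketch below. It covers only the response body and leaves out the HTTP server itself. The function `decide_next` and the `pending` queue are hypothetical names; the payload fields and response keys are the ones shown above.

```python
import json


def decide_next(payload: dict, queue: list) -> dict:
    """Build the webhook response body for an incoming CesaFlow event."""
    finished = (
        payload.get("event") == "run_completed"
        and payload.get("status") == "completed"
    )
    if not finished or not queue:
        return {"continue": False}  # stop the chain
    # Hand the next queued objective back so a follow-up run starts.
    return {"continue": True, "next_objective": queue.pop(0)}


pending = ["Add frontend for the blog API"]
event = {
    "event": "run_completed",
    "run_id": "abc-123",
    "status": "completed",
    "files_count": 12,
    "chain_id": "chain_abc",
}
print(json.dumps(decide_next(event, pending)))  # continues with the queued objective
print(json.dumps(decide_next(event, pending)))  # queue is empty, so the chain stops
```

Keeping the decision in a pure function makes it easy to unit-test separately from whichever web framework serves the endpoint.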

# Composable Modules

Add pre-built capabilities to any project. Modules inject specific instructions into the agent prompt so it generates the right code for auth, payments, email, and more.

Available Modules
auth_jwt            JWT authentication, registration, login, password reset
payments_stripe     Stripe checkout, subscriptions, webhooks, billing portal
email_smtp          SMTP email sending with Jinja2 templates
file_upload         S3-compatible file upload with image processing
websocket_realtime  WebSocket real-time events, chat, notifications
admin_panel         Admin dashboard, user management, CRUD
search_full_text    PostgreSQL full-text search with ranking
rate_limiting       API rate limiting with per-user quotas
Dependency Resolution
POST /api/v1/runs/modules/resolve
{ "modules": ["admin_panel", "rate_limiting"] }

# Response:
{
  "resolved": ["admin_panel", "rate_limiting", "auth_jwt"],
  "added": [{ "module": "auth_jwt", "required_by": "admin_panel" }],
  "conflicts": [],
  "total": 3
}
# auth_jwt auto-added because admin_panel depends on it
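
To make the resolution step concrete, here is a minimal Python sketch that reproduces the response above. The `DEPENDS_ON` map is an assumption: only the `admin_panel` to `auth_jwt` edge is confirmed by the example. Conflict detection is not modeled, so `conflicts` is always empty here.

```python
# Hypothetical dependency map; only admin_panel -> auth_jwt appears in the docs.
DEPENDS_ON = {"admin_panel": ["auth_jwt"]}


def resolve(modules: list) -> dict:
    """Expand a module list with its transitive dependencies."""
    resolved = list(modules)
    added = []
    index = 0
    # Walk the growing list so dependencies of dependencies are also picked up.
    while index < len(resolved):
        module = resolved[index]
        for dep in DEPENDS_ON.get(module, []):
            if dep not in resolved:
                resolved.append(dep)
                added.append({"module": dep, "required_by": module})
        index += 1
    return {
        "resolved": resolved,
        "added": added,
        "conflicts": [],
        "total": len(resolved),
    }


print(resolve(["admin_panel", "rate_limiting"]))
```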

# Policies & Guardrails

Create execution policies to control model selection, cost limits, runtime constraints, tool access, and approval workflows.

Create a Policy
POST /api/v1/policies
{
  "name": "secure-production",
  "model_strategy": "balanced",
  "max_cost_usd": 5.0,
  "max_runtime_minutes": 30,
  "preferred_providers": ["anthropic", "openai", "groq"],
  "fallback_enabled": true,
  "blocked_commands": ["rm -rf", "sudo", "shutdown", "curl | bash"],
  "allowed_tools": ["write_file", "read_file", "run_command", "list_files"],
  "require_approval": true
}
Use a Policy in a Run
POST /api/v1/runs
{
  "objective": "Build a payment API",
  "policy_id": "abc123"
}

# Enforcement:
# • Cost tracked per node; run stops if limit exceeded
# • Runtime checked per wave; remaining nodes skipped if timeout
# • Blocked commands rejected at execution time
# • Only allowed tools available to agents
# • PR creation pauses for human approval

Guardrail Features

  • Cost Limit: Run stops if accumulated token cost exceeds max_cost_usd. A cost_limit_exceeded event is emitted.
  • Runtime Limit: Remaining nodes are skipped if execution exceeds max_runtime_minutes.
  • Blocked Commands: Shell commands matching blocked patterns are rejected before execution. Default blocks include rm -rf /, sudo, shutdown.
  • Tool Allowlist: When specified, agents can only use the listed tools. Empty = all tools available.
  • Approval Gate: When require_approval is true, the agent pauses and asks for human confirmation before creating pull requests.
  • Failover Chain: If the primary provider fails, CesaFlow automatically tries the next provider in the preferred_providers list.
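
Two of these guardrails, blocked commands and the cost limit, can be illustrated with a short Python sketch. The names `is_blocked` and `CostGuard` are hypothetical, and plain substring matching is an assumption; the matcher CesaFlow actually uses may be stricter or pattern-based.

```python
BLOCKED = ["rm -rf", "sudo", "shutdown", "curl | bash"]


def is_blocked(command: str, patterns: list = BLOCKED) -> bool:
    """Return True if the command contains any blocked pattern.

    Naive substring check: it can over-match (e.g. a file named "pseudo")
    and will miss variants such as "curl <url> | bash".
    """
    normalized = " ".join(command.split())  # collapse repeated whitespace
    return any(pattern in normalized for pattern in patterns)


class CostGuard:
    """Accumulates per-node cost and reports when the limit is exceeded."""

    def __init__(self, max_cost_usd: float) -> None:
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def record(self, node_cost_usd: float) -> bool:
        """Add a node's cost; return True while the run may continue."""
        self.spent += node_cost_usd
        return self.spent <= self.max_cost_usd


print(is_blocked("sudo apt install nginx"))
print(is_blocked("ls -la"))

guard = CostGuard(max_cost_usd=5.0)
print(guard.record(3.0))  # still under the limit
print(guard.record(2.5))  # total is now 5.5, over the limit
```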

# API Vault & Key Management

CesaFlow separates admin and user API keys. Admin keys power chat/planning. User BYOM keys power full code generation.

How Keys Work

  • Admin Keys (API Vault): Used for chat, planner, discussion, code review, cost estimation. Admin adds keys via Nexus → API Vault. Supports 25+ providers with auto-fallback.
  • User Keys (BYOM): Used for full code generation (backend, frontend, QA agents). Users add their own keys via Dashboard → Model Settings. Admin keys are never used for user builds.
  • Fallback Chain: Active AI keys in the vault are tried in priority order. If provider 1 fails (quota, rate limit), provider 2 is tried automatically.
  • Separation: Admin and user keys are stored in different tables and never mixed. Admin costs are predictable (only lightweight operations).
Key Priority Flow
# For each AI call, CesaFlow checks in this order:

1. User BYOM key (model_credentials table)
   → If found: use for ALL agents (planner + backend + frontend + qa)
   → User pays their provider directly

2. Admin Vault key (api_vault table), only for allowed purposes
   → chat, planner, discussion, inline edit, review, estimate
   → NOT used for: backend, frontend, qa, devops agents
   → If no BYOM: backend/frontend/qa return "Add your own key" error

3. System config / env var (legacy fallback)
   → Same restrictions as vault

# Fallback within vault:
# Priority 10 (Groq) fails → try Priority 20 (Gemini) → Priority 30 (OpenAI)
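
The lookup order above can be modeled as a function that returns the keys to try, in order, for a single AI call. This Python sketch is illustrative only: `candidate_keys`, the purpose identifiers, and the shape of the vault entries are assumptions, since the document names the purposes only in prose.

```python
# Hypothetical identifiers for the purposes the vault is allowed to serve.
VAULT_PURPOSES = {"chat", "planner", "discussion", "inline_edit", "review", "estimate"}


def candidate_keys(purpose, byom_key=None, vault=(), env_key=None):
    """Return the keys to try, in order, for one AI call."""
    if byom_key:
        return [byom_key]  # step 1: a BYOM key covers every agent
    if purpose not in VAULT_PURPOSES:
        # backend/frontend/qa/devops agents never fall back to admin keys
        raise PermissionError("Add your own key")
    # step 2: active vault keys, lowest priority number first
    ordered = sorted(vault, key=lambda entry: entry["priority"])
    keys = [entry["key"] for entry in ordered if entry.get("active", True)]
    if env_key:
        keys.append(env_key)  # step 3: legacy fallback, same restrictions
    return keys


vault = [
    {"key": "openai-key", "priority": 30},
    {"key": "groq-key", "priority": 10},
    {"key": "gemini-key", "priority": 20},
]
print(candidate_keys("chat", vault=vault))
print(candidate_keys("backend", byom_key="user-key", vault=vault))
```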

# Model Catalog

57+ AI models across 25+ providers. Each model includes rate limits, pricing, and context window information.

Browse the Catalog
GET /api/v1/runs/models/catalog

Response:
{
  "total_models": 38,
  "providers": [
    {
      "id": "groq",
      "name": "Groq",
      "model_count": 4,
      "models": [
        {
          "id": "llama-3.3-70b-versatile",
          "name": "Llama 3.3 70B",
          "context": 128000,
          "rate_limit": "30 req/min, 14400 tokens/min",
          "pricing": "Free",
          "best_for": "General, coding"
        },
        ...
      ]
    },
    ...
  ]
}

# Per-provider:
GET /api/v1/runs/models/catalog/openai
GET /api/v1/runs/models/catalog/anthropic
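
Once you have the catalog response, picking models by context window or pricing is a simple filter. The Python sketch below works on an already-parsed response shaped like the example above. `find_models` is a hypothetical helper, and it assumes `pricing` is compared as an exact string such as "Free".

```python
from typing import Optional


def find_models(catalog: dict, min_context: int = 0,
                pricing: Optional[str] = None) -> list:
    """Return "provider/model" ids that satisfy the given filters."""
    matches = []
    for provider in catalog.get("providers", []):
        for model in provider.get("models", []):
            if model.get("context", 0) < min_context:
                continue
            if pricing is not None and model.get("pricing") != pricing:
                continue
            matches.append(f'{provider["id"]}/{model["id"]}')
    return matches


# Trimmed copy of the example response, with only the fields used here.
catalog = {
    "providers": [
        {
            "id": "groq",
            "models": [
                {"id": "llama-3.3-70b-versatile", "context": 128000, "pricing": "Free"}
            ],
        }
    ]
}
print(find_models(catalog, min_context=100000, pricing="Free"))
```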

Free Runs

Every new user gets 3 free runs using admin-provided AI keys. After that, users must add their own BYOM key.

GET /api/v1/runs/free-runs
→ { "free_runs_remaining": 2, "free_runs_total": 3, "has_byom_key": false, "status": "free_runs_available" }

# Resilience & Recovery

CesaFlow is built for reliability with wave-level checkpointing, graceful shutdown, and automatic recovery.

Wave Checkpointing

After each wave completes, the execution state is persisted to PostgreSQL. If the server crashes mid-run, completed waves are not re-executed on restart.

Graceful Shutdown

On SIGTERM, the orchestrator saves current state within a 25-second drain timeout. The run is marked as "interrupted" and can be resumed.

Auto-Resume

On startup, the system scans for interrupted or orphaned runs and automatically resumes them, skipping already-completed nodes.
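
The checkpoint-and-resume idea can be demonstrated in a few lines of Python. This sketch swaps PostgreSQL for a local JSON file so that it runs standalone; `run_waves`, `load_done`, and the file layout are assumptions for illustration and do not reflect CesaFlow's actual schema.

```python
import json
import os
import tempfile


def load_done(path: str) -> set:
    """Read the indices of completed waves, or an empty set if none saved."""
    if not os.path.exists(path):
        return set()
    with open(path) as fh:
        return set(json.load(fh))


def run_waves(waves: list, path: str, execute) -> None:
    """Execute waves in order, persisting a checkpoint after each one."""
    done = load_done(path)
    for index, wave in enumerate(waves):
        if index in done:
            continue  # finished before the restart, so skip it
        execute(wave)
        done.add(index)
        with open(path, "w") as fh:
            json.dump(sorted(done), fh)


path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
waves = ["wave-0", "wave-1", "wave-2"]
log = []


def crash_on_wave_1(wave: str) -> None:
    if wave == "wave-1":
        raise RuntimeError("simulated crash")
    log.append(wave)


try:
    run_waves(waves, path, crash_on_wave_1)  # first process dies mid-run
except RuntimeError:
    pass
run_waves(waves, path, log.append)  # restart: wave-0 is not re-executed
print(log)
```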

Multi-Provider Failover

If the primary AI provider fails, CesaFlow automatically tries the next provider in the fallback chain. A run only fails after all providers are exhausted.
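
A fallback chain of this kind reduces to a loop over providers that returns on the first success and raises only once every provider has failed. The Python sketch below uses a fake provider call; `call_with_failover` and `fake_call` are hypothetical names, and real code would likely distinguish retryable errors (quota, rate limit) from permanent ones.

```python
def call_with_failover(providers: list, call):
    """Try each provider in order; fail only after all are exhausted."""
    errors = {}
    for name in providers:
        try:
            return name, call(name)
        except Exception as exc:  # broad on purpose for this sketch
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")


def fake_call(provider: str) -> str:
    """Stand-in for a model request: the first provider is rate limited."""
    if provider == "groq":
        raise TimeoutError("rate limit")
    return f"response from {provider}"


print(call_with_failover(["groq", "gemini", "openai"], fake_call))
```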

# FAQ

Which AI models does CesaFlow use?

By default, CesaFlow uses Qwen Plus (DashScope), so new users need no API key. You can override this with any model via Dashboard → Model Settings: Gemini, Claude, Groq, Mistral, DeepSeek, Ollama, and more are all supported.

Is GITHUB_TOKEN required?

No. Git tools (git_init, git_status) work without it. You only need GITHUB_TOKEN for github_create_pr, which opens pull requests via the GitHub API.

Is the browser tool always active?

Browser navigation with Playwright requires PLAYWRIGHT_ENABLED=1 in your environment. Without it, browser_navigate falls back to plain HTTP via httpx, with no JavaScript rendering.

How does project_id work?

When you pass project_id to a run, CesaFlow reuses the same workspace directory (/workspace/project_{id}/) so files persist across runs. The Planner also loads previous run history and tells agents not to overwrite existing code.

Can I bring my own model (BYOM)?

Yes. Go to Dashboard → Model Settings and enter your API key for any supported provider. CesaFlow will use your credentials for all runs.

How do I run CesaFlow locally?

Clone the repo and run: docker compose up -d. No API key is needed, because CesaFlow ships with a built-in Qwen system key. For your own models, add keys via Model Settings. Backend: port 8001, Frontend: port 3000.