Documentation

Everything you need to build with CesaFlow.

# Quick Start

Get your first AI-generated project running in under 2 minutes.

Create an account

curl -X POST https://api.cesaflow.ai/api/v1/auth/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@example.com", "password": "secret", "organization_name": "My Org"}'

# Response:
# { "api_key": "sk_...", "org_id": "..." }

Start a run

curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"objective": "Build a FastAPI todo app with JWT auth", "project_id": "my-todo"}'

# Response:
# { "run_id": "abc-123", "status": "pending" }

Watch it build

# Poll for status
curl https://api.cesaflow.ai/api/v1/runs/abc-123 \
  -H "x-api-key: sk_YOUR_KEY"

# Or connect WebSocket for live events:
# wss://api.cesaflow.ai/ws/runs?api_key=sk_YOUR_KEY

# Download when done:
curl https://api.cesaflow.ai/api/v1/runs/abc-123/download \
  -H "x-api-key: sk_YOUR_KEY" -o workspace.zip
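For scripted use, the polling step can be wrapped in a small loop. The Python sketch below is illustrative only: `fetch_status` is a hypothetical callable you supply (for example, a wrapper around GET /api/v1/runs/{run_id}), and the terminal status names other than `completed` are assumptions, since this page only shows `pending` and `completed`.

```python
import time
from typing import Callable

# Assumed terminal states; only "pending" and "completed" appear in this documentation.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_run(
    fetch_status: Callable[[str], str],
    run_id: str,
    interval_s: float = 2.0,
    max_polls: int = 150,
) -> str:
    """Poll until the run reaches a terminal status or the poll budget runs out.

    fetch_status is any callable that returns the current status string for a
    run, e.g. a thin wrapper around GET /api/v1/runs/{run_id}.
    """
    for attempt in range(max_polls):
        status = fetch_status(run_id)
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"run {run_id} still not finished after {max_polls} polls")
```

Passing the fetch function in keeps the loop testable without network access; swap in a real HTTP call once the response shape is confirmed.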

# REST API Reference

Base URL: https://api.cesaflow.ai
Auth header: x-api-key: sk_YOUR_KEY

Endpoints

| Method | Path | Description | Auth |
|---|---|---|---|
| POST | /api/v1/auth/register | Create an account and get an API key | |
| POST | /api/v1/auth/login | | |
| POST | /api/v1/runs | Start a new run (returns run_id immediately) | 🔑 auth |
| GET | /api/v1/runs | List all runs for your org | 🔑 auth |
| GET | /api/v1/runs/{run_id} | Get run status, nodes, and files | 🔑 auth |
| GET | /api/v1/runs/{run_id}/download | Download workspace as ZIP | 🔑 auth |
| GET | /api/v1/benchmark/tasks | List all benchmark tasks | 🔑 auth |
| POST | /api/v1/benchmark/run | Start a benchmark run | 🔑 auth |
| GET | /api/v1/benchmark/{id} | Poll benchmark results | 🔑 auth |
| GET | /api/v1/billing/usage | Get monthly usage and budget | 🔑 auth |
| WS | /ws/runs?api_key=KEY | WebSocket: stream live run events | 🔑 auth |
| POST | /api/v1/runs/{run_id}/cancel | Cancel a running run | 🔑 auth |
| POST | /api/v1/runs/{run_id}/respond | Submit human-in-the-loop answer | 🔑 auth |
| GET | /api/v1/runs/{run_id}/tokens | Token usage and cost breakdown by agent | 🔑 auth |
| POST | /api/v1/estimate | Estimate cost/time for a project (3 scenarios) | 🔑 auth |
| POST | /api/v1/discussion/start | Start AI discussion session | 🔑 auth |
| POST | /api/v1/discussion/turn | Continue discussion (AI asks questions) | 🔑 auth |
| POST | /api/v1/projects | Create a new project | 🔑 auth |
| GET | /api/v1/projects | List all projects for your org | 🔑 auth |
| POST | /api/v1/cron/jobs | Create a scheduled cron job | 🔑 auth |
| GET | /api/v1/cron/jobs | List all cron jobs | 🔑 auth |
| DELETE | /api/v1/cron/jobs/{job_id} | Delete a cron job | 🔑 auth |
| POST | /api/v1/mcp/servers | Add an MCP server | 🔑 auth |
| GET | /api/v1/mcp/servers | List MCP servers | 🔑 auth |
| GET | /api/v1/mcp/servers/{id}/tools | Get tools from MCP server | 🔑 auth |
| POST | /api/v1/team/invite | Invite a team member by email | 🔑 auth |
| GET | /api/v1/team/members | List team members with roles | 🔑 auth |
| POST | /api/v1/inline/edit | Cmd+K: Edit selected code (IDE) | 🔑 auth |
| POST | /api/v1/inline/chat | Cmd+L: Chat about file (IDE) | 🔑 auth |
| POST | /api/v1/inline/complete | Tab completion (IDE) | 🔑 auth |
| GET | /api/v1/marketplace/templates | Browse agent templates | 🔑 auth |
| POST | /api/v1/marketplace/templates/{id}/install | Install a template | 🔑 auth |
| POST | /api/v1/github/webhook | GitHub App webhook receiver | |
| POST | /api/v1/policies | Create execution policy | 🔑 auth |
| GET | /api/v1/models/free | List models with free tiers | 🔑 auth |
| POST | /api/v1/models/compare | Compare models by cost/speed | 🔑 auth |
| POST | /api/v1/review | Submit code for AI review (diff or PR URL) | 🔑 auth |
| GET | /api/v1/review/{review_id} | Get review results | 🔑 auth |
| GET | /api/v1/runs/{run_id}/chain | Get all runs in a chain | 🔑 auth |
| GET | /api/v1/runs/learning/stats | Learning engine stats for your org | 🔑 auth |
| GET | /api/v1/runs/hierarchy/personas | List agent hierarchy personas and roles | |
| GET | /api/v1/runs/performance/insights | Performance metrics and optimization suggestions | 🔑 auth |
| GET | /api/v1/runs/money/templates | List revenue-generating project templates | 🔑 auth |
| POST | /api/v1/runs/money/start | Start a Money Mode run from template | 🔑 auth |
| GET | /api/v1/runs/clone/profile | Get your digital clone profile (learned preferences) | 🔑 auth |
| GET | /api/v1/runs/clone/context | Get prompt-injectable clone context | 🔑 auth |
| POST | /api/v1/runs/goal/start | Start goal-mode autonomous execution | 🔑 auth |
| GET | /api/v1/runs/goal/{goal_id} | Get goal execution status | 🔑 auth |
| GET | /api/v1/runs/auto-config | Get auto-optimized execution config | 🔑 auth |
| POST | /api/v1/runs/clone/decide | Digital Clone autonomous decision | 🔑 auth |
| GET | /api/v1/runs/clone/style | Inferred code style preferences | 🔑 auth |
| GET | /api/v1/runs/benchmark/trend | Benchmark regression trend | 🔑 auth |
| GET | /api/v1/runs/modules/available | List composable modules | 🔑 auth |
| POST | /api/v1/runs/modules/resolve | Resolve module dependencies | 🔑 auth |
| POST | /api/v1/marketplace/rate | Rate a marketplace template (1-5) | 🔑 auth |
| GET | /api/v1/marketplace/ratings/{id} | Get template average rating | |
| GET | /api/v1/runs/{run_id}/graph | Get execution graph (nodes, dependencies, progress) | 🔑 auth |
| GET | /api/v1/runs/{run_id}/logs | Get all shared-memory logs for a run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/workspace | File listing and summary for a run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/files/{path} | Get file content (preview, max 500KB) | 🔑 auth |
| PATCH | /api/v1/runs/{run_id}/files/{path} | Save/edit file content (browser IDE) | 🔑 auth |
| GET | /api/v1/runs/{run_id}/ide-url | Get OpenVSCode Server URL for this run | 🔑 auth |
| GET | /api/v1/runs/{run_id}/waiting | Check if run has pending human input | 🔑 auth |
| GET | /api/v1/billing/plans | List all plan details (free/pro/enterprise) | |
| GET | /api/v1/billing/status | Current subscription status | 🔑 auth |
| POST | /api/v1/billing/checkout | Create Stripe Checkout Session | 🔑 auth |
| POST | /api/v1/billing/portal | Get Stripe Customer Portal URL | 🔑 auth |
| DELETE | /api/v1/team/members/{user_id} | Remove team member | 🔑 auth |
| PATCH | /api/v1/team/members/{user_id}/role | Update member role | 🔑 auth |
| GET | /api/v1/team/invite/{token} | Get invite metadata (public) | |
| POST | /api/v1/team/invite/accept | Accept invite and create account | |
| GET | /api/v1/projects/{id} | Get specific project | 🔑 auth |
| PUT | /api/v1/projects/{id} | Update project | 🔑 auth |
| DELETE | /api/v1/projects/{id} | Delete project | 🔑 auth |
| POST | /api/v1/model-credentials | Store encrypted model API key (BYOM) | 🔑 auth |
| GET | /api/v1/model-credentials | List stored credentials (keys hidden) | 🔑 auth |
| DELETE | /api/v1/model-credentials/{id} | Remove stored credential | 🔑 auth |
| POST | /api/v1/model-credentials/{id}/set-default | Set default credential | 🔑 auth |
| GET | /api/v1/marketplace/categories | List template categories | 🔑 auth |
| GET | /api/v1/marketplace/installed | List installed templates for org | 🔑 auth |
| DELETE | /api/v1/marketplace/templates/{id}/install | Uninstall template | 🔑 auth |
| GET | /api/v1/auth/me | Get current user and org info | 🔑 auth |
| POST | /api/v1/auth/forgot-password | Send password reset email | |
| POST | /api/v1/auth/reset-password | Reset password with token | |
| DELETE | /api/v1/mcp/servers/{id} | Remove MCP server | 🔑 auth |
| POST | /api/v1/tasks | Create and queue a task | 🔑 auth |
| GET | /api/v1/tasks | List tasks (filter by project, status) | 🔑 auth |
| GET | /api/v1/tasks/{id} | Get task details | 🔑 auth |
| PUT | /api/v1/tasks/{id} | Update task | 🔑 auth |
| POST | /api/v1/tasks/{id}/cancel | Cancel a task | 🔑 auth |
| POST | /api/v1/capture/events | Capture event from IDE/CLI/Git hook | 🔑 auth |
| GET | /api/v1/capture/events | List captured events | 🔑 auth |
| POST | /api/v1/capture/events/{id}/convert | Convert event to task | 🔑 auth |
| GET | /api/v1/admin/vault | List all API vault entries (admin) | 🔑 auth |
| POST | /api/v1/admin/vault | Add API key to vault (admin) | 🔑 auth |
| POST | /api/v1/admin/vault/{id}/test | Test vault entry connection (admin) | 🔑 auth |
| GET | /api/v1/runs/models/catalog | Full model catalog (57+ models, 25+ providers) | |
| GET | /api/v1/runs/models/catalog/{provider} | Models for specific provider | |
| GET | /api/v1/runs/free-runs | Free runs remaining for this org | 🔑 auth |
| GET | /health | System health check | |
| GET | /ping | Simple ping/pong | |

Example: Start a Run with project_id

curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "objective": "Add WebSocket support to the chat feature",
    "project_id": "my-chat-app"
  }'

# If project_id exists:
# - Reuses /workspace/project_my-chat-app/ (files preserved)
# - Loads previous plan and file history
# - Planner knows what was already built

WebSocket Events

// Connect: wss://api.cesaflow.ai/ws/runs?api_key=sk_KEY

// Events streamed:
{ "event": "run_started",   "run_id": "...", "node_count": 4 }
{ "event": "token_chunk",   "agent": "backend", "chunk": "..." }
{ "event": "file_written",  "agent": "backend", "path": "main.py", "bytes": 1240 }
{ "event": "run_completed", "run_id": "...", "files": ["main.py", ...] }
{ "event": "human_input_required", "question": "Which database?" }
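A client usually dispatches on the `event` field of each message. The following Python sketch shows one way to do that, assuming every WebSocket message is a single JSON object shaped like the samples above; `summarize_event` is a hypothetical helper name, and the real stream may carry additional event types or fields.

```python
import json
from typing import Any


def summarize_event(raw: str) -> str:
    """Turn one JSON event message into a short human-readable line."""
    event: dict[str, Any] = json.loads(raw)
    kind = event.get("event")
    if kind == "run_started":
        return f"run {event['run_id']} started with {event['node_count']} nodes"
    if kind == "token_chunk":
        return f"[{event['agent']}] {event['chunk']}"
    if kind == "file_written":
        return f"[{event['agent']}] wrote {event['path']} ({event['bytes']} bytes)"
    if kind == "run_completed":
        return f"run {event['run_id']} completed with {len(event['files'])} files"
    if kind == "human_input_required":
        return f"input needed: {event['question']}"
    # Unknown event types are reported rather than silently dropped.
    return f"unhandled event: {kind}"
```

Keeping the handler a pure function of the raw message makes it easy to plug into whichever WebSocket client library you use.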

# CLI Reference

Install the CLI for terminal-first workflows.

Install

cd cli/
pip install -e .

# Or with pipx:
pipx install -e .

aios run — Start a task

aios run "Build a REST API for a blog"
aios run "Add JWT auth" --project my-blog
aios run "Build a FastAPI app" --tasks backend,qa   # select agents
aios run "..." --model gpt-4o                       # override model

aios benchmark — Run CesaFlow Benchmark Suite

aios benchmark              # run all 10 tasks
aios benchmark --list      # list available tasks
aios benchmark --tasks hello_world,calculator  # run specific tasks

aios logs — Stream live run events

aios logs <run-id>          # stream events for a run
aios logs <run-id> --follow  # keep streaming until done

Environment Variables

AIOS_API_KEY=sk_...      # your API key
AIOS_API_URL=https://api.cesaflow.ai  # API base URL (default)
OPENAI_API_KEY=sk-...   # for agents to call LLMs
GITHUB_TOKEN=ghp_...    # for git/PR tools (optional)
PLAYWRIGHT_ENABLED=1    # enable browser tool (optional)

# Agent Tools

All tools available to agents during a run. Agents choose which tools to call based on the task.

| Category | Tool | Description | Params |
|---|---|---|---|
| Files | write_file | Write content to a file in the workspace. | path: string, content: string |
| Files | read_file | Read the contents of a workspace file. | path: string |
| Files | list_files | List all files in the workspace. | |
| Shell | run_command | Execute a shell command (pip install, pytest, npm, etc.). | command: string |
| Shell | read_test_output | Read the output of the last run_command call. | |
| Web | web_search | Search the web via DuckDuckGo for documentation or packages. | query: string |
| Browser | browser_navigate | Navigate to a URL (Playwright or httpx fallback). | url: string |
| Browser | browser_get_content | Read text content of the current page. | |
| Browser | browser_click | Click an element on the current page. | selector: string |
| Browser | browser_screenshot | Take a screenshot of the current page (base64 PNG). | |
| Git | git_init | Initialize a git repo and make the first commit. | message?: string |
| Git | git_status | Run git status and show last 5 commits. | |
| Git | github_create_pr | Open a pull request via GitHub API (requires GITHUB_TOKEN). | owner, repo, title, body, head, base |
| Interactive | ask_human | Pause execution and ask the user a question. The run waits until the user responds. | question: string |
| Design | figma_get_file | Access Figma design files and extract component information. | file_key: string |
| Web | firecrawl_scrape | Scrape and extract web content with JavaScript rendering and Markdown conversion. | url: string |
| Files | pdf_to_text | Extract text content from a PDF file. | path: string |
| Deploy | generate_deploy | Generate deployment configs for Vercel, Railway, Fly.io, Docker, or Render. | platform: string, app_name?: string |
| Server | ssh_execute | Execute commands on remote servers via SSH. | host: string, command: string, username?: string |
| Communication | send_email | Send emails via SMTP for notifications and reports. | to: string, subject: string, body: string |
| Payment | stripe_create_product | Create a Stripe product with pricing for subscription or one-time billing. | name: string, price_cents?: int, recurring?: bool |
| Payment | stripe_create_checkout | Create a Stripe Checkout Session URL for accepting payments. | price_id: string, success_url?: string |
| Database | setup_database | Generate database config, Docker Compose, and initialization scripts for PostgreSQL, SQLite, or MongoDB. | db_type?: string, app_name?: string |
| Config | generate_env | Generate .env, .env.example, and .gitignore with proper project configuration. | app_name?: string, db_type?: string, include_stripe?: bool |
| Deploy | deploy_to_platform | Execute actual deployment to Vercel, Railway, Fly.io, or Docker. | platform: string, token?: string |
| Web | http_request | Make HTTP GET/POST/PUT/DELETE requests to external APIs. SSRF-protected. | url: string, method?: string, headers?: object, body?: string |
| Database | db_query | Execute read-only SQL queries against PostgreSQL. SELECT only, max 100 rows. | query: string, connection_string?: string |
| Web | download_file | Download files from URLs into the workspace. Max 10MB, SSRF-protected. | url: string, filename?: string |

# Advanced Tools

Beyond the standard file/shell/browser tools, CesaFlow agents have access to real-world integration tools.

🖥️

SSH

Execute commands on remote servers. Useful for deployment, server management, and monitoring.

ssh_execute(host="server.com", command="docker ps")

📧

Email

Send emails via SMTP. Useful for notifications, reports, and automated communication.

send_email(to="team@example.com", subject="Deploy Complete", body="...")

💳

Stripe

Create products, prices, and checkout sessions. Agents can add payments to generated apps.

stripe_create_product(name="Pro Plan", price_cents=2900)

🗄️

Database

Generate database configs for PostgreSQL, SQLite, or MongoDB with Docker Compose and init scripts.

setup_database(db_type="postgresql", app_name="myapp")

🚀

Deploy

Generate platform-specific deployment configs (Vercel, Railway, Fly.io, Docker, Render).

generate_deploy(platform="vercel", app_name="my-saas")

⚙️

Environment

Generate .env, .env.example, and .gitignore with proper configuration.

generate_env(app_name="myapp", include_stripe=true)

# Architecture

How CesaFlow runs your task from start to finish.

Agent Pipeline

POST /api/v1/runs
       │
       ▼
  GraphScheduler
       │
  Wave 1: ┌─────────┐
          │ Planner │  ← generates API Contract
          └─────────┘
               │
  Wave 2: ┌─────────┐   ┌──────────┐
          │ Backend │   │ Frontend │  ← run in PARALLEL
          └─────────┘   └──────────┘
               │               │
  Wave 3:      └───────┬───────┘
                  ┌────┴────┐
                  │   QA    │  ← waits for both
                  └─────────┘
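The wave layout in the diagram follows from the dependencies between agents: a node can start once everything it depends on has finished. The Python sketch below illustrates that idea with a simple layered topological sort. It is not the GraphScheduler implementation; `plan_waves` and the dependency map are hypothetical names chosen for this example.

```python
def plan_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into waves; a node runs once all of its dependencies are done."""
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # Every node whose dependencies are all finished can run in parallel.
        ready = sorted(node for node, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency detected")
        waves.append(ready)
        done.update(ready)
        for node in ready:
            del remaining[node]
    return waves


graph = {
    "planner": set(),
    "backend": {"planner"},
    "frontend": {"planner"},
    "qa": {"backend", "frontend"},
}
print(plan_waves(graph))
# [['planner'], ['backend', 'frontend'], ['qa']]
```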

Self-Debugging Loop

Agent runs → validates (pytest/syntax check)
    │
    ├─ PASS → next wave
    │
    └─ FAIL → inject debug_info into prompt
                  {
                    error_type: "validation_failure",
                    failed_command: "pytest tests/",
                    full_output: "...",   // full pytest output
                    failed_tests: [...],  // parsed test names
                    attempt: 2
                  }
              → agent retries (max 3 attempts)
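The loop above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `run_agent` and `validate` are hypothetical callables standing in for the real agent call and validation step, and the `failed_tests` parsing shown in the diagram is omitted.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # matches the "max 3 attempts" limit described above


def run_with_self_debug(
    run_agent: Callable[[Optional[dict]], str],
    validate: Callable[[str], tuple[bool, str]],
    failed_command: str = "pytest tests/",
) -> tuple[bool, int]:
    """Run the agent, validate its output, and retry with debug_info on failure.

    Returns (passed, attempts_used).
    """
    debug_info: Optional[dict] = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        output = run_agent(debug_info)
        passed, full_output = validate(output)
        if passed:
            return True, attempt
        # Build the context that is injected into the next attempt's prompt.
        debug_info = {
            "error_type": "validation_failure",
            "failed_command": failed_command,
            "full_output": full_output,
            "attempt": attempt + 1,
        }
    return False, MAX_ATTEMPTS
```

The first call receives no debug context; each retry sees the full validator output from the attempt before it.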

Project Memory

project_id = "my-blog"

Workspace:  /workspace/project_my-blog/   ← shared, persistent
DB key:     project:my-blog:state         ← PostgreSQL project_memory

State stored:
  { project_id, total_runs, files[], runs[{objective, files, ts}] }

Next run with same project_id:
  → Planner sees: "Continuation of project, existing files: [...]"
  → DO NOT rewrite — build on top

# Benchmark

CesaFlow includes a built-in benchmark suite: 10 coding tasks evaluated automatically, no human grading. This is CesaFlow's internal evaluation suite — not the academic SWE-bench dataset. It measures end-to-end agent capability across file generation, code correctness, and test execution.

Task List

| Task | Difficulty | Description |
|---|---|---|
| hello_world | trivial | Write a Python hello world script |
| calculator | easy | Implement a calculator with unit tests |
| fastapi_health | easy | Create a FastAPI app with /health endpoint |
| todo_api | medium | Full CRUD todo API with SQLite |
| data_processing | medium | CSV processing pipeline with pandas |
| auth_jwt | medium | JWT authentication system |
| websocket_chat | hard | Real-time WebSocket chat server |
| async_scraper | hard | Async web scraper with rate limiting |
| react_component | medium | React data table component |
| docker_compose | medium | Multi-service docker-compose setup |

Scoring

Each task: 10 points max
Check types:
  file_exists       — does the required file exist?
  file_contains     — does it contain the expected code?
  any_file_contains — does any file contain the pattern?

Score = (passed_checks / total_checks) * 10
Total = sum of the 10 task scores (max 100)
Grade (by total): S (90+) · A (80+) · B (70+) · C (60+) · D (50+) · F (<50)
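The scoring rule can be worked through in a short Python sketch. It assumes the letter grade is applied to the sum of the ten task scores (maximum 100), which is an inference from the 10-point task scale rather than something stated explicitly; the function names are illustrative.

```python
def task_score(passed_checks: int, total_checks: int) -> float:
    """Score one task on the 0-10 scale: (passed_checks / total_checks) * 10."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    return passed_checks / total_checks * 10


def grade(total: float) -> str:
    """Map a suite total to a letter grade using the thresholds above."""
    for letter, threshold in (("S", 90), ("A", 80), ("B", 70), ("C", 60), ("D", 50)):
        if total >= threshold:
            return letter
    return "F"


# One task passes 3 of 3 checks, one passes 2 of 4, the other eight pass 4 of 4.
scores = [task_score(3, 3), task_score(2, 4)] + [task_score(4, 4)] * 8
total = sum(scores)         # 10 + 5 + 80 = 95.0
print(total, grade(total))  # 95.0 S
```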


# Execution Personas

Choose a persona to tailor agent behavior to your use case. Pass persona in your run request.

fullstack

Planner + Backend + Frontend + QA — complete web apps

"Build a blog with auth and admin panel"

api_only

Planner + Backend + QA — REST APIs and microservices

"Create a payment webhook handler"

data_pipeline

Planner + Backend + QA — ETL, pandas, SQL processing

"Build a CSV import pipeline with validation"

saas_starter

Full stack + auth, billing, multi-tenancy hints

"Launch a SaaS dashboard with Stripe"

microservice

Single-responsibility service with Docker + health checks

"Image resizing microservice with S3"

Usage

curl -X POST https://api.cesaflow.ai/api/v1/runs \
  -H "x-api-key: sk_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"objective": "Build a payment API", "persona": "api_only"}'

# AI Discussion & Planning

Refine vague ideas through multi-turn conversation before executing. The AI asks focused questions to clarify requirements.

Flow

# 1. Start a session
POST /api/v1/discussion/start
{ "message": "I want to build something for managing tasks" }

# 2. AI asks clarifying questions
Response: { "reply": "Should it have user auth? REST or GraphQL?", "ready": false }

# 3. Continue the conversation
POST /api/v1/discussion/turn
{ "session_id": "...", "message": "REST with JWT, PostgreSQL, and team support" }

# 4. After 3-5 turns, AI declares ready
Response: { "ready": true, "refined_objective": "Build a task management REST API with JWT auth, PostgreSQL, team workspaces, and role-based access control" }

# 5. Execute the refined objective
POST /api/v1/runs
{ "objective": "<refined_objective>" }

# Cost Estimation

Get detailed cost, time, and quality estimates before running. Returns 3 execution scenarios.

POST /api/v1/estimate
{ "objective": "Build a blog with auth and comments" }

Response:
{
  "tasks": [
    { "name": "Setup project", "complexity": "low", "tokens_est": 2000 },
    { "name": "User auth system", "complexity": "medium", "tokens_est": 8000 },
    ...
  ],
  "plans": [
    { "name": "Premium Fast", "model": "gpt-4o", "cost_usd": 0.12, "quality": 9.2, "risk": "low" },
    { "name": "Balanced", "model": "gemini-1.5-flash", "cost_usd": 0.03, "quality": 7.8, "risk": "medium" },
    { "name": "Free", "model": "groq-llama", "cost_usd": 0.00, "quality": 6.5, "risk": "high" }
  ]
}
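Once the three scenarios come back, a client can choose among them programmatically. The Python sketch below picks the highest-quality plan that fits a budget, using the sample response above as data; `pick_plan` and the budget parameter are hypothetical and not part of the API.

```python
from typing import Optional


def pick_plan(plans: list[dict], budget_usd: float) -> Optional[dict]:
    """Return the highest-quality plan whose cost fits the budget, or None."""
    affordable = [p for p in plans if p["cost_usd"] <= budget_usd]
    return max(affordable, key=lambda p: p["quality"], default=None)


plans = [
    {"name": "Premium Fast", "model": "gpt-4o", "cost_usd": 0.12, "quality": 9.2, "risk": "low"},
    {"name": "Balanced", "model": "gemini-1.5-flash", "cost_usd": 0.03, "quality": 7.8, "risk": "medium"},
    {"name": "Free", "model": "groq-llama", "cost_usd": 0.00, "quality": 6.5, "risk": "high"},
]
print(pick_plan(plans, budget_usd=0.05)["name"])  # Balanced
```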

# MCP Integration

Connect external tools via Model Context Protocol. Agents can access GitHub, Linear, Slack, Notion, PostgreSQL, and any MCP-compatible server.

# Add an MCP server
POST /api/v1/mcp/servers
{
  "name": "GitHub",
  "url": "https://mcp-github.example.com",
  "api_key": "ghp_..."
}

# List available tools from the server
GET /api/v1/mcp/servers/{server_id}/tools
Response: [
  { "name": "create_issue", "description": "Create GitHub issue" },
  { "name": "search_code", "description": "Search code in repos" },
  ...
]

# Tools are automatically available to agents during runs

# Scheduled Runs (Cron)

Automate recurring tasks with standard cron expressions. Jobs persist across restarts.

# Create a daily job at 9 AM
POST /api/v1/cron/jobs
{
  "label": "Daily report",
  "objective": "Generate a summary report of yesterday's sales data",
  "cron_expr": "0 9 * * *",
  "project_id": "daily-reports"
}

# List all jobs
GET /api/v1/cron/jobs

# Delete a job
DELETE /api/v1/cron/jobs/{job_id}

# Cron format: minute hour day month day_of_week
# Examples: "0 9 * * *" (daily 9am), "0 */6 * * *" (every 6h), "0 9 * * 1" (Mon 9am)
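To check an expression before creating a job, the five-field format can be evaluated locally. The Python sketch below is a deliberately reduced matcher that handles only `*`, `*/n`, and single numbers, which covers the three examples above. It does not implement ranges, lists, or names, and it is not how CesaFlow evaluates schedules.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', '*/n', and a single number only."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    """Check whether a five-field cron expression fires at the given minute."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected: minute hour day month day_of_week")
    minute, hour, day, month, dow = fields
    # cron counts Sunday as 0, while datetime.weekday() counts Monday as 0.
    cron_dow = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(dow, cron_dow)
    )
```

For example, `cron_matches("0 9 * * 1", datetime(2024, 1, 1, 9, 0))` is true because 1 January 2024 was a Monday.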

# GitHub App Integration

Connect CesaFlow as a GitHub App for automated PR reviews, issue fixes, and CI debugging.

Capabilities

• PR Auto-Review: When a PR is opened, CesaFlow reviews the code and posts comments
• Issue Fix Command: Comment /cesaflow fix on any issue and agents analyze it and submit a fix PR
• CI Debugging: When a workflow fails, CesaFlow reads the logs, identifies the root cause, and suggests a fix
• Issue Analysis: When a new issue is opened, CesaFlow analyzes it and creates a run

# Template Marketplace

Pre-built agent templates with optimized system prompts for specific frameworks and use cases.

| Template | Category | Description |
|---|---|---|
| Django | Backend | Django REST Framework with auth, admin, migrations |
| Rails | Backend | Ruby on Rails with ActiveRecord, RSpec |
| Rust CLI | Systems | Rust CLI app with clap, error handling |
| Data Science | Data & ML | Jupyter, pandas, scikit-learn pipelines |
| DevOps | DevOps | Terraform, Docker, CI/CD, monitoring |
| React Native | Mobile | Cross-platform mobile with Expo |
| Golang | Backend | Go service with Chi router, GORM |
| Node.js | Backend | Express/Fastify with TypeScript |
| Java Spring | Backend | Spring Boot with JPA, Security |
| Kubernetes | DevOps | K8s manifests, Helm charts, operators |
| LLM Chatbot | AI | Chatbot with LangChain, vector DB, RAG |
| Python CLI | Systems | Click CLI with rich output, testing |

# Human-in-the-Loop

Agents can pause execution and ask you questions when they need clarification.

# During a run, an agent calls the ask_human tool:
# Agent: "Should I use PostgreSQL or SQLite for the database?"

# WebSocket event:
{ "event": "human_input_required", "run_id": "...", "question": "Which database?" }

# Check if a run is waiting for input:
GET /api/v1/runs/{run_id}/waiting

# Submit your answer:
POST /api/v1/runs/{run_id}/respond
{ "answer": "Use PostgreSQL with SQLAlchemy" }

# Agent receives answer and continues execution

# Inline AI (IDE Endpoints)

Fast single-turn endpoints for the CesaFlow IDE (browser-based, OpenVSCode Server). Target: <300ms response time.

Cmd+K — Edit Code

POST /api/v1/inline/edit
{ "code": "def hello():\n  print('hi')", "instruction": "Add type hints and docstring" }

Response: { "edited_code": "def hello() -> None:\n  \"\"\"Greet the user.\"\"\"\n  print('hi')", "summary": "Added return type and docstring" }

Cmd+L — Chat About Code

POST /api/v1/inline/chat
{ "file_context": "...", "message": "What does this function do?" }

Response: { "reply": "This function validates JWT tokens..." }

Tab — Completion

POST /api/v1/inline/complete
{ "prefix": "def calculate_tax(amount: float, ", "suffix": "):\n  return" }

Response: { "completion": "rate: float = 0.2" }
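The `prefix` and `suffix` fields describe the text on either side of the cursor, so an editor would presumably splice the returned `completion` between them. The Python sketch below shows that splice with the sample values; the insertion behavior is an assumption about the client side, not documented server behavior.

```python
def apply_completion(prefix: str, completion: str, suffix: str) -> str:
    """Insert the returned completion between the text before and after the cursor."""
    return prefix + completion + suffix


result = apply_completion(
    prefix="def calculate_tax(amount: float, ",
    completion="rate: float = 0.2",
    suffix="):\n  return",
)
print(result)
# def calculate_tax(amount: float, rate: float = 0.2):
#   return
```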

# Code Review API

AI-powered code review for pull requests and diffs. Submit code changes and get structured feedback with risk assessment, security analysis, and rollback plans.

Submit a Review

# Review a GitHub PR
POST /api/v1/review
{
  "github_url": "https://github.com/owner/repo/pull/123",
  "focus": "security",      // general | security | performance | style
  "language": "auto"
}

# Or review a raw diff
POST /api/v1/review
{
  "diff_content": "--- a/main.py\n+++ b/main.py\n@@ -1,3 +1,5 @@...",
  "focus": "general"
}

# Response: { "review_id": "abc123", "status": "processing" }

# Poll for results
GET /api/v1/review/abc123
# Returns: { "status": "completed", "result": "## Summary\n..." }

Review Output Includes

• Summary: What the change does in 1-3 sentences
• Risk Assessment: LOW / MEDIUM / HIGH with explanation
• Issues Found: Critical, warning, and suggestion-level issues with fixes
• Security Check: SQL injection, XSS, hardcoded secrets, input validation
• Suggestions: Code quality, performance, and pattern improvements
• Rollback Plan: What to do if the change causes production issues

# Multi-Run Chaining

Chain runs together for multi-step autonomous execution. When a run completes, it can automatically trigger the next objective.

Start a Chain

POST /api/v1/runs
{
  "objective": "Build a REST API for a blog",
  "project_id": "my-blog",
  "continuation": {
    "next_objective": "Add frontend with React for the blog API",
    "trigger_condition": "on_success",
    "max_depth": 3,
    "inherit_workspace": true,
    "then": {
      "next_objective": "Write E2E tests for the full-stack blog",
      "trigger_condition": "on_success"
    }
  }
}

# Flow:
# Run 1: Build API → completes → triggers Run 2
# Run 2: Build frontend → completes → triggers Run 3
# Run 3: Write tests → completes → chain done

# Track the chain:
GET /api/v1/runs/{run_id}/chain
# Returns all runs in order with status
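Because follow-up steps nest under `then`, it can help to flatten a request into its ordered objectives before submitting it. The Python sketch below walks the structure shown above; `chain_objectives` is a hypothetical client-side helper, and it assumes each step carries a `next_objective` and an optional `then`.

```python
def chain_objectives(run_request: dict) -> list[str]:
    """List the objectives of a chained run request in execution order."""
    objectives = [run_request["objective"]]
    step = run_request.get("continuation")
    while step:
        objectives.append(step["next_objective"])
        step = step.get("then")  # nested follow-up step, if any
    return objectives


request = {
    "objective": "Build a REST API for a blog",
    "continuation": {
        "next_objective": "Add frontend with React for the blog API",
        "trigger_condition": "on_success",
        "then": {
            "next_objective": "Write E2E tests for the full-stack blog",
            "trigger_condition": "on_success",
        },
    },
}
print(len(chain_objectives(request)))  # 3
```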

Trigger Conditions

• on_success: Continue only if the run completed successfully
• on_failure: Continue only if the run failed (useful for fallback strategies)
• always: Continue regardless of outcome
• spec_pass: Continue only if spec validation score ≥ 70
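The four conditions reduce to a small decision function. The Python sketch below mirrors the list above; the function name, the boolean `run_succeeded` flag, and the handling of a missing spec score are assumptions made for illustration.

```python
from typing import Optional

SPEC_PASS_THRESHOLD = 70  # from the spec_pass rule above


def should_continue(
    trigger_condition: str,
    run_succeeded: bool,
    spec_score: Optional[float] = None,
) -> bool:
    """Decide whether the next run in a chain should start."""
    if trigger_condition == "on_success":
        return run_succeeded
    if trigger_condition == "on_failure":
        return not run_succeeded
    if trigger_condition == "always":
        return True
    if trigger_condition == "spec_pass":
        # A missing score is treated as a failed spec check (an assumption).
        return spec_score is not None and spec_score >= SPEC_PASS_THRESHOLD
    raise ValueError(f"unknown trigger_condition: {trigger_condition}")
```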

# Learning Engine

CesaFlow learns from every run. Error patterns are tracked per organization, and lessons are automatically injected into agent prompts to avoid repeating the same mistakes.

How It Works

1. Agent encounters a validation failure (tests fail, syntax error, etc.)
2. Error pattern is recorded: error type, message, agent, context
3. If the agent fixes the issue on retry, the fix is recorded too
4. On future runs, relevant lessons are injected into the agent's prompt
5. Agent avoids known patterns and applies known fixes proactively
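The steps above amount to a record-and-inject cycle. The Python sketch below models it with an in-memory store; `LessonStore`, its method names, and the prompt wording are all hypothetical, and the real engine presumably persists patterns per organization and selects lessons by relevance.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LessonStore:
    """In-memory stand-in for per-organization error patterns and their fixes."""

    patterns: dict[tuple[str, str], Optional[str]] = field(default_factory=dict)

    def record_error(self, agent: str, message: str) -> None:
        # Steps 1-2: remember the failure; keep an existing fix if one is known.
        self.patterns.setdefault((agent, message), None)

    def record_fix(self, agent: str, message: str, fix: str) -> None:
        # Step 3: attach the fix that made the retry pass.
        self.patterns[(agent, message)] = fix

    def lessons_for(self, agent: str) -> str:
        # Step 4: render the lessons that would be injected into the agent prompt.
        lines = []
        for (owner, message), fix in sorted(self.patterns.items(), key=lambda kv: kv[0]):
            if owner != agent:
                continue
            hint = f" Known fix: {fix}" if fix else ""
            lines.append(f"- Avoid: {message}.{hint}")
        return "\n".join(lines)
```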

Learning Stats API

GET /api/v1/runs/learning/stats

Response:
{
  "total_patterns": 23,
  "fixed_patterns": 18,
  "by_agent": {
    "backend": 12,
    "frontend": 8,
    "qa": 3
  },
  "top_errors": [
    { "type": "validation_failure", "count": 7, "agent": "backend" },
    { "type": "validation_failure", "count": 5, "agent": "frontend" }
  ]
}

# Agent Hierarchy

CesaFlow supports multi-level agent organizations. Choose a hierarchy persona to control how agents are structured.

Available Personas

• standard: Planner → Backend + Frontend → QA (default)
• enterprise: CEO → CTO → Backend + Frontend → QA (full hierarchy with strategic planning)
• with_devops: Planner → Backend + Frontend → QA → DevOps (includes deployment setup)

7 Agent Roles

• CEO (Level 0): Strategic planning, prioritization, resource allocation. Delegates to CTO and Growth.
• CTO (Level 1): Architecture, tech stack selection, system design. Delegates to Dev agents.
• Backend (Level 2): API development, database, server-side logic.
• Frontend (Level 2): UI development, responsive design, accessibility.
• QA (Level 2): Testing, code review, quality assurance.
• DevOps (Level 2): CI/CD, Docker, infrastructure, deployment.
• Growth (Level 1): SEO, analytics, marketing, user research.

# Performance Insights

CesaFlow tracks run metrics and provides optimization suggestions. The system identifies trends and recommends the best models for your workload.

GET /api/v1/runs/performance/insights

Response:
{
  "total_runs": 47,
  "pass_rate": 85.1,
  "avg_duration_s": 124.5,
  "avg_cost_usd": 0.0342,
  "trend": "improving",
  "best_model": "claude-3-5-sonnet",
  "model_usage": { ... },
  "persona_stats": { ... },
  "suggestions": [
    "Performance looks healthy. Keep going!",
    "Model 'claude-3-5-sonnet' has the best success rate"
  ]
}

# Money Mode

Revenue-generating project templates. Pick a business model and CesaFlow builds the entire application — auth, payments, dashboard, and deployment config included.

Available Templates

| Template | Revenue model | Time |
|---|---|---|
| SaaS MVP | Subscription ($29-499/mo) | 15-30 min |
| API-as-a-Service | Usage-based ($0.001-0.01/req) | 10-20 min |
| E-Commerce Store | Product sales | 20-40 min |
| Marketplace | Commission (10-20%) | 30-50 min |
| Lead Generation | Lead sales ($5-50/lead) | 10-20 min |
| AI Wrapper Product | Usage credits ($10-100/mo) | 10-20 min |
| Booking System | Booking fees ($5-100) | 15-25 min |
| Paid Newsletter | Subscriptions ($5-20/mo) | 15-25 min |

Start a Money Mode Run

POST /api/v1/runs/money/start
{
  "template_id": "saas_mvp",
  "customization": "Use React + Tailwind, target fitness industry"
}

# Response: { "run_id": "...", "template": "SaaS MVP", "revenue_model": "Subscription" }

# Deploy

CesaFlow generates deployment configurations automatically. Use the with_devops build mode or the generate_deploy tool to create platform-specific configs.

Supported Platforms

• Vercel: Best for Next.js and React apps. Generates vercel.json + .vercelignore
• Railway: Full-stack apps with databases. Generates railway.toml + Procfile
• Fly.io: Global Docker deployments. Generates fly.toml + Dockerfile
• Render: Simple web services. Generates render.yaml
• Docker: Self-hosted. Generates Dockerfile + docker-compose.yml

Agent Tool Usage

# During a run, agents can call:
generate_deploy(platform="vercel", app_name="my-saas")

# This creates deployment files in the workspace:
# → vercel.json
# → .vercelignore

# Or for Docker:
generate_deploy(platform="docker", app_name="my-api")
# → Dockerfile
# → docker-compose.yml
# → .dockerignore

# Digital Clone

CesaFlow builds a digital clone of your preferences over time. It learns your tech stack, coding patterns, and project preferences — then instructs agents to work the way you would.

What It Learns

• Tech Stack: Which languages and frameworks you use most (Python, TypeScript, React, FastAPI, etc.)
• Build Mode: Your preferred agent persona (fullstack, api_only, enterprise, etc.)
• AI Model: Which model performs best for your workloads
• Project Patterns: Common keywords, file types, and architecture choices

Clone Profile API

GET /api/v1/runs/clone/profile

Response:
{
  "status": "active",
  "runs_analyzed": 23,
  "preferred_stack": ["python", "typescript", "react"],
  "preferred_persona": "fullstack",
  "preferred_model": "claude-3-5-sonnet",
  "top_keywords": [
    { "word": "api", "count": 8 },
    { "word": "auth", "count": 5 }
  ],
  "profile_summary": "Primarily works with python, typescript, react. Prefers fullstack builds."
}

# Clone context is automatically injected into agent prompts:
GET /api/v1/runs/clone/context
# → "This user prefers: python, typescript. Default: fullstack. Adapt output accordingly."

# Goal Mode

Set a high-level goal and CesaFlow autonomously decomposes it into tasks, chains them together, and executes until done. No "continue" needed.

Start a Goal

POST /api/v1/runs/goal/start
{
  "goal": "Build a complete SaaS product for project management with auth, teams, tasks, billing, and deploy to Vercel",
  "project_id": "my-saas"
}

# CesaFlow:
# 1. Decomposes into 5 tasks:
#    → Build auth system with JWT
#    → Create team + task management API
#    → Build React frontend with dashboard
#    → Add Stripe billing integration
#    → Generate Vercel deploy config + tests
#
# 2. Chains them automatically
# 3. Each task validates before continuing
# 4. Self-corrects on failure
#
# Response: { "goal_id": "...", "run_id": "...", "tasks": [...], "total_steps": 5 }

Auto-Optimization

GET /api/v1/runs/auto-config

# Returns optimal config based on your run history:
{
  "auto_optimized": true,
  "recommended_model": "claude-3-5-sonnet",
  "recommended_strategy": "balanced",
  "recommended_provider": "anthropic"
}

# CesaFlow analyzes your pass rate, cost trends, and model performance
# to automatically suggest the best configuration.

Full Autonomy Stack

These systems work together for end-to-end autonomous execution:

1. Goal Mode decomposes objective into sequential tasks
2. Multi-Run Chaining auto-starts next task on completion
3. Agent Hierarchy assigns roles (CEO→CTO→Dev→QA)
4. Self-Debug Loop retries on failure (3 attempts)
5. Learning Engine avoids past mistakes
6. Digital Clone applies your preferences automatically
7. Auto-Optimization selects best model from history
8. Guardrails enforce cost/runtime/tool limits safely
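
Step 4 in the list above can be pictured as a bounded retry loop. The sketch below is illustrative only: run_with_self_debug, the feedback string, and the reading of "3 attempts" as three total tries are assumptions, not CesaFlow's internal implementation.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # assumed to mean three total tries


def run_with_self_debug(task: Callable[[Optional[str]], str]) -> str:
    """Call task up to MAX_ATTEMPTS times, passing the previous error as feedback."""
    last_error: Optional[str] = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return task(last_error)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = f"attempt {attempt} failed: {exc}"
    raise RuntimeError(f"gave up after {MAX_ATTEMPTS} attempts ({last_error})")


# Hypothetical task: fails on the first try, succeeds once it receives feedback.
def flaky_task(feedback: Optional[str]) -> str:
    if feedback is None:
        raise ValueError("tests failed")
    return "ok"


result = run_with_self_debug(flaky_task)
```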

# Webhook Continuation

Enable indefinite autonomous execution by connecting external systems via webhooks. When a run completes, CesaFlow notifies your webhook. If the webhook responds with a new objective, a follow-up run starts automatically.

# Start a run with webhook
POST /api/v1/runs
{
  "objective": "Build a REST API for blog",
  "webhook_url": "https://your-server.com/cesaflow-hook"
}

# When run completes, CesaFlow POSTs to your webhook:
{
  "event": "run_completed",
  "run_id": "abc-123",
  "status": "completed",
  "files_count": 12,
  "chain_id": "chain_abc"
}

# Your webhook can respond:
{ "continue": true, "next_objective": "Add frontend for the blog API" }
# → CesaFlow automatically starts a new run!

# Or stop the chain:
{ "continue": false }

# Use cases:
# • CI/CD pipelines (build → test → deploy stages)
# • Monitoring (daily check → fix if broken)
# • Multi-system orchestration
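
The decision your webhook makes can be sketched as a small pure function. This is a sketch under assumptions: decide_next_step and the remaining queue of follow-up objectives are hypothetical names kept on your own server, and the HTTP layer that receives the POST and returns the JSON body is left out.

```python
from typing import Any, Dict, List


def decide_next_step(event: Dict[str, Any], remaining: List[str]) -> Dict[str, Any]:
    """Return the JSON body a webhook could send back to CesaFlow.

    `remaining` is a hypothetical queue of follow-up objectives; the next
    objective is taken from the front of the list.
    """
    if event.get("event") != "run_completed" or event.get("status") != "completed":
        return {"continue": False}  # unexpected event or unfinished run: stop the chain
    if not remaining:
        return {"continue": False}  # nothing left to do
    return {"continue": True, "next_objective": remaining.pop(0)}


event = {"event": "run_completed", "run_id": "abc-123", "status": "completed",
         "files_count": 12, "chain_id": "chain_abc"}
queue = ["Add frontend for the blog API"]
first = decide_next_step(event, queue)   # continues with the queued objective
second = decide_next_step(event, queue)  # queue is now empty, so the chain stops
```

Always returning an explicit { "continue": false } when the queue is empty keeps a chain from running longer than intended.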

# Composable Modules

Add pre-built capabilities to any project. Modules inject specific instructions into the agent prompt so it generates the right code for auth, payments, email, and more.

Available Modules

• auth_jwt: JWT authentication, registration, login, password reset
• payments_stripe: Stripe checkout, subscriptions, webhooks, billing portal
• email_smtp: SMTP email sending with Jinja2 templates
• file_upload: S3-compatible file upload with image processing
• websocket_realtime: WebSocket real-time events, chat, notifications
• admin_panel: Admin dashboard, user management, CRUD
• search_full_text: PostgreSQL full-text search with ranking
• rate_limiting: API rate limiting with per-user quotas

Dependency Resolution

POST /api/v1/runs/modules/resolve
{ "modules": ["admin_panel", "rate_limiting"] }

# Response:
{
  "resolved": ["admin_panel", "rate_limiting", "auth_jwt"],
  "added": [{ "module": "auth_jwt", "required_by": "admin_panel" }],
  "conflicts": [],
  "total": 3
}
# auth_jwt auto-added because admin_panel depends on it
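
The resolution step can be approximated with a breadth-first walk over a dependency map. The sketch below is illustrative: the DEPENDENCIES table contains only the one dependency stated above (admin_panel requires auth_jwt), resolve_modules is a hypothetical name, and conflict detection is not modelled, so conflicts is always empty.

```python
from typing import Dict, List

# Hypothetical dependency map. Only admin_panel -> auth_jwt is stated in the docs.
DEPENDENCIES: Dict[str, List[str]] = {"admin_panel": ["auth_jwt"]}


def resolve_modules(requested: List[str]) -> dict:
    """Expand the requested modules with their transitive dependencies."""
    resolved = list(requested)
    added = []
    queue = list(requested)
    while queue:
        module = queue.pop(0)
        for dep in DEPENDENCIES.get(module, []):
            if dep not in resolved:
                resolved.append(dep)
                added.append({"module": dep, "required_by": module})
                queue.append(dep)  # a dependency may have dependencies of its own
    return {"resolved": resolved, "added": added, "conflicts": [], "total": len(resolved)}


result = resolve_modules(["admin_panel", "rate_limiting"])
```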

# Policies & Guardrails

Create execution policies to control model selection, cost limits, runtime constraints, tool access, and approval workflows.

Create a Policy

POST /api/v1/policies
{
  "name": "secure-production",
  "model_strategy": "balanced",
  "max_cost_usd": 5.0,
  "max_runtime_minutes": 30,
  "preferred_providers": ["anthropic", "openai", "groq"],
  "fallback_enabled": true,
  "blocked_commands": ["rm -rf", "sudo", "shutdown", "curl | bash"],
  "allowed_tools": ["write_file", "read_file", "run_command", "list_files"],
  "require_approval": true
}

Use a Policy in a Run

POST /api/v1/runs
{
  "objective": "Build a payment API",
  "policy_id": "abc123"
}

# Enforcement:
# • Cost tracked per node — run stops if limit exceeded
# • Runtime checked per wave — remaining nodes are skipped on timeout
# • Blocked commands rejected at execution time
# • Only allowed tools available to agents
# • PR creation pauses for human approval

Guardrail Features

• Cost Limit: Run stops if accumulated token cost exceeds max_cost_usd. A cost_limit_exceeded event is emitted.
• Runtime Limit: Remaining nodes are skipped if execution exceeds max_runtime_minutes.
• Blocked Commands: Shell commands matching blocked patterns are rejected before execution. Default blocks include rm -rf /, sudo, shutdown.
• Tool Allowlist: When specified, agents can only use the listed tools. An empty list means all tools are available.
• Approval Gate: When require_approval is true, the agent pauses and asks for human confirmation before creating pull requests.
• Failover Chain: If the primary provider fails, CesaFlow automatically tries the next provider in the preferred_providers list.
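
Two of these checks are simple enough to sketch. The code below is an assumption about how such checks could work, not CesaFlow's implementation: it treats blocked patterns as plain substrings after whitespace normalization, and check_command and cost_exceeded are hypothetical names.

```python
from typing import List, Optional


def check_command(command: str, blocked: List[str]) -> Optional[str]:
    """Return the first blocked pattern found in command, or None if it is allowed."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    for pattern in blocked:
        if pattern in normalized:
            return pattern
    return None


def cost_exceeded(node_costs_usd: List[float], max_cost_usd: float) -> bool:
    """True once the accumulated per-node cost is above the policy limit."""
    return sum(node_costs_usd) > max_cost_usd


BLOCKED = ["rm -rf", "sudo", "shutdown", "curl | bash"]  # from the example policy
hit = check_command("sudo apt install nginx", BLOCKED)
```

Substring matching is deliberately strict and can reject harmless commands that merely contain a blocked word, so a production check would likely parse the command first.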

# API Vault & Key Management

CesaFlow separates admin and user API keys. Admin keys power chat and planning. User BYOM (bring your own model) keys power full code generation.

How Keys Work

• Admin Keys (API Vault): Used for chat, planner, discussion, code review, cost estimation. Admin adds keys via Nexus → API Vault. Supports 25+ providers with auto-fallback.
• User Keys (BYOM): Used for full code generation (backend, frontend, QA agents). Users add their own keys via Dashboard → Model Settings. Admin keys are never used for user builds.
• Fallback Chain: Active AI keys in the vault are tried in priority order. If provider 1 fails (quota, rate limit), provider 2 is tried automatically.
• Separation: Admin and user keys are stored in different tables and never mixed. Admin costs are predictable (only lightweight operations).

Key Priority Flow

# For each AI call, CesaFlow checks in this order:

1. User BYOM key (model_credentials table)
   → If found: use for ALL agents (planner + backend + frontend + qa)
   → User pays their provider directly

2. Admin Vault key (api_vault table) — only for allowed purposes
   → chat, planner, discussion, inline edit, review, estimate
   → NOT used for: backend, frontend, qa, devops agents
   → If no BYOM: backend/frontend/qa return "Add your own key" error

3. System config / env var (legacy fallback)
   → Same restrictions as vault

# Fallback within vault:
# Priority 10 (Groq) fails → try Priority 20 (Gemini) → Priority 30 (OpenAI)
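
The priority order above can be expressed as a small selection function. This is a sketch, assuming each source yields at most one key; select_key and the purpose strings (for example inline_edit) are hypothetical identifiers chosen for illustration.

```python
from typing import Optional

# Purposes the admin vault may serve, per the list above (identifiers assumed).
ADMIN_ALLOWED = {"chat", "planner", "discussion", "inline_edit", "review", "estimate"}


def select_key(purpose: str, byom_key: Optional[str],
               vault_key: Optional[str], env_key: Optional[str]) -> str:
    """Pick the key for one AI call, following the documented priority order."""
    if byom_key:  # 1. a user BYOM key is used for every agent
        return byom_key
    if purpose not in ADMIN_ALLOWED:  # backend, frontend, qa, devops need BYOM
        raise PermissionError("Add your own key")
    key = vault_key or env_key  # 2. vault key, then 3. legacy env fallback
    if not key:
        raise LookupError("no admin key configured")
    return key


planner_key = select_key("planner", None, "vault-key", "env-key")
```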

# Model Catalog

57+ AI models across 25+ providers. Each model includes rate limits, pricing, and context window information.

Browse the Catalog

GET /api/v1/runs/models/catalog

# Response:
{
  "total_models": 38,
  "providers": [
    {
      "id": "groq",
      "name": "Groq",
      "model_count": 4,
      "models": [
        {
          "id": "llama-3.3-70b-versatile",
          "name": "Llama 3.3 70B",
          "context": 128000,
          "rate_limit": "30 req/min, 14400 tokens/min",
          "pricing": "Free",
          "best_for": "General, coding"
        },
        ...
      ]
    },
    ...
  ]
}

# Per-provider:
GET /api/v1/runs/models/catalog/openai
GET /api/v1/runs/models/catalog/anthropic
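
Once fetched, the catalog is easy to filter client-side. The sketch below assumes the response shape shown above; models_with_context is a hypothetical helper, and the sample data is trimmed to the single model listed in the example response.

```python
from typing import Any, Dict, List


def models_with_context(catalog: Dict[str, Any], min_context: int) -> List[str]:
    """Return "provider/model" ids whose context window is at least min_context."""
    matches = []
    for provider in catalog.get("providers", []):
        for model in provider.get("models", []):
            if model.get("context", 0) >= min_context:
                matches.append(f'{provider["id"]}/{model["id"]}')
    return matches


sample_catalog = {
    "providers": [
        {
            "id": "groq",
            "name": "Groq",
            "models": [
                {"id": "llama-3.3-70b-versatile", "name": "Llama 3.3 70B",
                 "context": 128000, "pricing": "Free"},
            ],
        }
    ]
}
large = models_with_context(sample_catalog, 100_000)
```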

Free Runs

Every new user gets 3 free runs using admin-provided AI keys. After that, users must add their own BYOM key.

GET /api/v1/runs/free-runs
→ { "free_runs_remaining": 2, "free_runs_total": 3, "has_byom_key": false, "status": "free_runs_available" }
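
A client can use this response to decide whether to start a run or prompt for a key first. A minimal sketch, assuming the field names shown above; can_start_run is a hypothetical helper.

```python
def can_start_run(status: dict) -> bool:
    """True if the user has a BYOM key or at least one free run left."""
    return bool(status.get("has_byom_key")) or status.get("free_runs_remaining", 0) > 0


allowed = can_start_run({"free_runs_remaining": 2, "free_runs_total": 3,
                         "has_byom_key": False, "status": "free_runs_available"})
```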

# Resilience & Recovery

CesaFlow is built for reliability with wave-level checkpointing, graceful shutdown, and automatic recovery.

Wave Checkpointing

After each wave completes, the execution state is persisted to PostgreSQL. If the server crashes mid-run, completed waves are not re-executed on restart.

Graceful Shutdown

On SIGTERM, the orchestrator saves current state within a 25-second drain timeout. The run is marked as "interrupted" and can be resumed.

Auto-Resume

On startup, the system scans for interrupted or orphaned runs and automatically resumes them, skipping already-completed nodes.
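
Checkpointing and resume together behave roughly like the loop below. This is a simplified sketch, not the orchestrator's code: a plain dict stands in for the PostgreSQL state, nodes in a wave run sequentially, and run_waves is a hypothetical name. Because state is saved only after a whole wave finishes, a wave that was interrupted midway is re-run in full.

```python
from typing import Callable, Dict, List


def run_waves(waves: List[List[str]], checkpoint: Dict[str, str],
              execute: Callable[[str], None]) -> List[str]:
    """Run nodes wave by wave, skipping nodes already marked completed.

    Returns the names of the nodes executed during this call.
    """
    executed = []
    for wave in waves:
        for node in wave:
            if checkpoint.get(node) == "completed":
                continue  # finished before the restart, do not re-run
            execute(node)
            executed.append(node)
        # Persist only after the whole wave succeeded (wave-level checkpoint).
        for node in wave:
            checkpoint[node] = "completed"
    return executed


waves = [["plan"], ["backend", "frontend"], ["qa"]]
checkpoint: Dict[str, str] = {}
state = {"crash": True}  # simulates one server crash during the second wave


def execute(node: str) -> None:
    if node == "frontend" and state["crash"]:
        raise RuntimeError("server crashed")


try:
    run_waves(waves, checkpoint, execute)
except RuntimeError:
    pass  # the process would exit here; the checkpoint survives

after_crash = dict(checkpoint)
state["crash"] = False
resumed = run_waves(waves, checkpoint, execute)  # the "restart"
```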

Multi-Provider Failover

If the primary AI provider fails, CesaFlow automatically tries the next provider in the fallback chain. A run only fails after all providers are exhausted.
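
The failover behaviour amounts to trying providers in priority order and failing only when the list is exhausted. A minimal sketch, assuming lower priority numbers are tried first (as in the Groq, Gemini, OpenAI example under Key Priority Flow); call_with_failover and the call callback are hypothetical.

```python
from typing import Callable, List, Tuple


def call_with_failover(providers: List[Tuple[int, str]],
                       call: Callable[[str], str]) -> str:
    """Try providers in ascending priority order; raise only if all of them fail."""
    errors = []
    for _, name in sorted(providers):
        try:
            return call(name)
        except Exception as exc:  # quota, rate limit, network error, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_call(name: str) -> str:
    # Stand-in for a real provider request; groq is simulated as rate limited.
    if name == "groq":
        raise TimeoutError("rate limit")
    return f"answer from {name}"


answer = call_with_failover([(30, "openai"), (10, "groq"), (20, "gemini")], fake_call)
```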

# FAQ

Which AI models does CesaFlow use?

By default, CesaFlow uses Qwen Plus (DashScope) — no API key needed for new users. You can override this with any model via Dashboard → Model Settings: Gemini, Claude, Groq, Mistral, DeepSeek, Ollama, and more are all supported.

Is GITHUB_TOKEN required?

No. Git tools (git_init, git_status) work without it. You only need GITHUB_TOKEN for github_create_pr — opening pull requests via the GitHub API.

Is the browser tool always active?

Browser navigation with Playwright requires PLAYWRIGHT_ENABLED=1 in your environment. Without it, browser_navigate falls back to plain HTTP via httpx — no JavaScript rendering.

How does project_id work?

When you pass project_id to a run, CesaFlow reuses the same workspace directory (/workspace/project_{id}/) so files persist across runs. The Planner also loads previous run history and tells agents not to overwrite existing code.

Can I bring my own model (BYOM)?

Yes. Go to Dashboard → Model Settings and enter your API key for any supported provider. CesaFlow will use your credentials for all runs.

How do I run CesaFlow locally?

Clone the repo and run: docker compose up -d. No API key needed — CesaFlow ships with a built-in Qwen system key. For your own models, add keys via Model Settings. Backend: port 8001, Frontend: port 3000.