Convoship — Sketch your bot. Ship it in five minutes.

Use Playground to validate a draft config before publishing. Use Evals for repeatable regression checks; optionally gate publishing on a workspace pass-rate threshold.

Playground

Route: /app/agents/{slug}/playground.
POST /v1/ai-agents/{slug}/playground — JSON response by default.
POST .../playground?stream=true returns Server-Sent Events with event types token, trace, and done.
Studio UI streams tokens into the bot bubble when streaming is enabled.
Each turn persists an AiAgentConversation plus its trace; the full trace is linked under Conversations.
The playground channel is always allowed, even if Web is the only listed channel.
The BUDGET guardrail blocks LLM calls once daily_spend_cap_cents is exceeded.
RPM limit: the same workspace rpm_cap_per_agent applies as for the public embed (0 = unlimited).
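
The streaming response can be consumed with a small Server-Sent Events reader. The sketch below is illustrative only: the event names token, trace, and done come from the notes above, but the payload shapes (plain text for token, JSON text for trace) and the helper names parse_sse and collect_reply are assumptions, not the documented Convoship contract. It works on an iterable of lines so it can be fed from any HTTP client.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from Server-Sent Events lines."""
    event, data, seen = "message", [], False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends one event. Leniency (assumption): events
            # that carry only a name and no data are still yielded.
            if seen:
                yield event, "\n".join(data)
            event, data, seen = "message", [], False
        elif line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # SSE strips one leading space
            if field == "event":
                event, seen = value, True
            elif field == "data":
                data.append(value)
                seen = True


def collect_reply(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Join token payloads into the reply and keep raw trace payloads."""
    tokens: List[str] = []
    traces: List[str] = []
    for event, data in parse_sse(lines):
        if event == "token":
            tokens.append(data)  # assumed: data is a text fragment
        elif event == "trace":
            traces.append(data)  # assumed: data is JSON text
        elif event == "done":
            break
    return "".join(tokens), traces


sample = [
    "event: token", "data: Hel", "",
    "event: token", "data: lo", "",
    "event: trace", 'data: {"kind": "think"}', "",
    "event: done", "data: {}", "",
]
reply, traces = collect_reply(sample)
print(reply)
```

In practice the lines would come from a streaming POST to the ?stream=true endpoint; the in-memory sample here only stands in for that stream.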

Trace panel

Event kind	Meaning
user	Inbound message
think	Model reasoning step
tool	Dispatch result — name, args, output, duration_ms
ask	Clarifying question to user
guardrail	AUTH / LIMIT / BUDGET / REDACT / ESCALATE hit
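
As an illustration of how these event kinds might be consumed, the sketch below totals tool time and collects guardrail hits from a list of trace events. The kind values and the tool fields name, args, output, and duration_ms come from the table; representing events as dictionaries, the "kind" and "code" keys, and the summarize_trace helper are hypothetical choices made for this example.

```python
from typing import Any, Dict, List

# Event kinds listed in the trace panel table.
TRACE_KINDS = {"user", "think", "tool", "ask", "guardrail"}


def summarize_trace(events: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Total tool latency and list tool calls and guardrail hits."""
    summary: Dict[str, Any] = {"tool_ms": 0, "tool_calls": [], "guardrails": []}
    for ev in events:
        kind = ev.get("kind")
        if kind not in TRACE_KINDS:
            raise ValueError(f"unknown trace event kind: {kind!r}")
        if kind == "tool":
            summary["tool_ms"] += ev.get("duration_ms", 0)
            summary["tool_calls"].append(ev.get("name"))
        elif kind == "guardrail":
            # "code" is a hypothetical key holding AUTH, LIMIT, BUDGET, ...
            summary["guardrails"].append(ev.get("code"))
    return summary


events = [
    {"kind": "user", "text": "Where is my order?"},
    {"kind": "think"},
    {"kind": "tool", "name": "lookup_order", "args": {"id": "A1"},
     "output": "shipped", "duration_ms": 120},
    {"kind": "tool", "name": "eta", "args": {}, "output": "2d",
     "duration_ms": 80},
    {"kind": "guardrail", "code": "REDACT"},
]
summary = summarize_trace(events)
print(summary)
```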

Evals tab

Create eval — name, user prompt, expected outcome text.
Run — POST .../evals/{id}/run executes playground + judge_eval (LLM JSON verdict or substring fallback).
Counters — pass_count, fail_count, last_status on the eval row.
Publish trend — bar chart from version eval_pass_rate (click a bar for per-version runs).
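
The counters on an eval row can be pictured with a small model. This is a sketch under stated assumptions: pass_count, fail_count, and last_status are the documented fields, while the EvalRow class, its record_run method, and the "pass" / "fail" status strings are hypothetical and only show how the counters would move after each run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalRow:
    name: str
    prompt: str
    expected: str
    pass_count: int = 0
    fail_count: int = 0
    last_status: Optional[str] = None  # assumed values: "pass" or "fail"

    def record_run(self, passed: bool) -> None:
        """Update the counters and last_status after one eval run."""
        if passed:
            self.pass_count += 1
        else:
            self.fail_count += 1
        self.last_status = "pass" if passed else "fail"


row = EvalRow(name="refund policy", prompt="Can I get a refund?",
              expected="30 days")
for outcome in (True, False, True):
    row.record_run(outcome)
print(row.pass_count, row.fail_count, row.last_status)  # 2 1 pass
```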

LLM judge

judge_eval returns { passed, rationale }. It uses the workspace's BYOK keys when configured and platform keys otherwise. Without any keys, it falls back to a substring match on the expected text.
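
The two judging paths can be sketched as follows. This is not the platform's judge_eval: the function name judge_sketch, the prompt wording, the injected llm callable, and the case-insensitive comparison in the fallback are all assumptions. Only the { passed, rationale } result shape and the substring fallback come from the description above.

```python
import json
from typing import Callable, Optional, TypedDict


class Verdict(TypedDict):
    passed: bool
    rationale: str


def judge_sketch(
    reply: str,
    expected: str,
    llm: Optional[Callable[[str], str]] = None,
) -> Verdict:
    """Return a verdict from an LLM judge, or from a substring check."""
    if llm is not None:
        # Hypothetical prompt; the judge is asked to answer in JSON.
        prompt = (
            "Does the reply satisfy the expected outcome? Answer as JSON "
            'with keys "passed" and "rationale".\n'
            f"Expected: {expected}\nReply: {reply}"
        )
        parsed = json.loads(llm(prompt))
        return {
            "passed": bool(parsed["passed"]),
            "rationale": str(parsed.get("rationale", "")),
        }
    # No keys available: substring fallback (case-insensitive by assumption).
    hit = expected.strip().lower() in reply.lower()
    return {
        "passed": hit,
        "rationale": "substring match" if hit else "expected text not found",
    }


def fake_llm(prompt: str) -> str:
    """Stand-in judge used only for this example."""
    return '{"passed": false, "rationale": "tone is off"}'


print(judge_sketch("Your order ships Monday.", "ships monday"))
print(judge_sketch("Your order ships Monday.", "ships monday", llm=fake_llm))
```

Passing the judge in as a callable keeps the sketch independent of any provider SDK and makes both paths easy to exercise offline.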

Conversations

/app/agents/{slug}/convos lists playground and embed traffic. The detail view shows the chronological trace with tool timing, using the same schema as public production conversations.

After evals pass in Playground, set eval_pass_threshold under Workspace Settings and publish from Deploy. Eval runs are snapshotted on the version for audit.
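
A publish gate of this kind could work as sketched below. Several details are assumptions rather than documented behavior: that the pass rate is the fraction of evals whose latest run passed, that eval_pass_threshold is a fraction between 0.0 and 1.0, that an unset threshold disables the gate, and that a version with no eval runs is blocked when a threshold is set. The function names are hypothetical.

```python
from typing import Optional, Sequence


def eval_pass_rate(results: Sequence[bool]) -> Optional[float]:
    """Fraction of evals that passed, or None when there are no runs."""
    if not results:
        return None
    return sum(results) / len(results)


def can_publish(results: Sequence[bool], threshold: Optional[float]) -> bool:
    """Apply the optional workspace pass-rate gate."""
    if threshold is None:
        return True  # gating is optional
    rate = eval_pass_rate(results)
    if rate is None:
        return False  # assumption: no eval runs means the gate fails
    return rate >= threshold


latest = [True, True, False, True]  # latest result per eval
print(eval_pass_rate(latest), can_publish(latest, 0.75))
```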