Stagent
jie-worldstatelabs/prep-test-suite · public

Interviews scope, drafts a test plan, implements the suite, and dry-runs it — looping until every test passes cleanly, with app bugs logged separately.

by jie-worldstatelabs · updated Apr 25, 2026 · 3 stages · 4 runs

Run in Claude Code

```
/stagent:start --flow=cloud://jie-worldstatelabs/prep-test-suite <task_description>
```

Paste in Claude Code and replace <task_description>

Template blueprint


plan.md

inline · interruptible · transitions: approved → writing

Stage: plan

Runtime config (canonical): workflow.json → stages.plan

Purpose: interview the user about test scope and category, then produce a structured test plan that downstream stages will implement and execute.

Output artifact: write to the absolute path provided in your I/O context.

Valid results this stage writes: pending (plan drafted, awaiting user approval), approved (user has explicitly confirmed).

<HARD-GATE> Do NOT transition out of this stage until the user explicitly confirms the plan. Write `result: approved` only after they have said so. </HARD-GATE>

This is an interruptible stage — the stop hook allows natural pauses for Q&A.

Note: by the time you read this file, state.md already exists with status: plan and the current epoch set. Read state.md for the epoch you must stamp into the artifact's frontmatter.
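
For illustration only, a minimal sketch of pulling that epoch out of state.md, assuming state.md carries it as a plain `epoch:` line (the actual file format is owned by the workflow runtime, not shown here):

```python
import re
from pathlib import Path

def read_epoch(state_path: str = "state.md") -> str:
    """Return the `epoch` value from state.md.

    Hypothetical helper: assumes a plain `epoch: <value>` line,
    matching the frontmatter templates used in this stage file.
    """
    match = re.search(
        r"^epoch:[ \t]*(\S+)",
        Path(state_path).read_text(),
        flags=re.MULTILINE,
    )
    if match is None:
        raise ValueError(f"no epoch found in {state_path}")
    return match.group(1)
```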

Step 1 — Write a pending artifact immediately

Before asking any questions, write the output artifact at the absolute path shown in your I/O context with this body:

```markdown
---
epoch: <epoch from state.md>
result: pending
---
# Test Suite Plan: <Topic>

(Draft in progress — interviewing user.)
```

This signals to the stop hook that the stage is in flight so the loop pauses naturally for your Q&A.

Step 2 — Explore the project

Understand the application under test before drafting questions:

  • Read package.json, pyproject.toml, pubspec.yaml, go.mod, Cargo.toml, etc. — whichever is present — to identify the language, framework, and existing test tooling (see the sketch after this list).
  • Note any existing tests (their location, framework, conventions) so the new suite fits the project rather than fighting it.
  • For new projects with nothing yet: just note the absence and design a fresh layout.
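
A hedged sketch of that manifest check, assuming the project root is the current directory and covering only the manifest names listed above:

```python
from pathlib import Path

# Manifest file -> likely stack. Covers only the manifests named
# in the list above; extend for other ecosystems as needed.
MANIFESTS = {
    "package.json": "JavaScript/TypeScript (npm)",
    "pyproject.toml": "Python",
    "pubspec.yaml": "Dart/Flutter",
    "go.mod": "Go",
    "Cargo.toml": "Rust",
}

def detect_stack(project_dir: str = ".") -> list[str]:
    """Report which known manifests exist in the project root."""
    root = Path(project_dir)
    return [
        f"{name}: {stack}"
        for name, stack in MANIFESTS.items()
        if (root / name).exists()
    ]
```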

Step 3 — Ask clarifying questions

Inline Q&A — the stop hook allows natural pauses.

  • One question per message. Prefer multiple choice (A/B/C) where possible.
  • Typically 3–6 questions; stop when you have enough.
  • Cover at minimum:
    1. Scope — which features / modules / endpoints / screens are in scope for this suite?
    2. Test category — quick smoke test, unit, integration, end-to-end, browser-use, performance, accessibility, etc. (one or several).
    3. Framework preference — pytest / jest / vitest / playwright / cypress / xcuitest / go test / etc., or "you pick based on the stack."
    4. Runner / execution command — how the user wants the suite invoked (npm test, pytest, npx playwright test, custom script).
    5. Pass criteria — coverage target, must-pass paths, performance budgets, etc.
    6. Anything off-limits — code that should NOT be touched, network/services that must be mocked.

Step 4 — Draft the plan body

Replace the artifact body (keep the frontmatter at result: pending) with this structure:

```markdown
---
epoch: <epoch>
result: pending
---
# Test Suite Plan: <Topic>

## Application Under Test
<one-paragraph summary: what the app is, language/framework, where its code lives>

## Scope
<bullet list of features / modules / endpoints / screens covered by this suite, plus an explicit "out of scope" list>

## Test Categories
<which categories the user picked — e.g. unit + integration, or e2e only — with a one-line rationale per category>

## Framework & Tooling
- Framework: <e.g. pytest, jest, playwright>
- Runner command: <exact shell command — MUST target `<project-dir>/test/`, e.g. `pytest test/`, `npx playwright test --config test/playwright.config.ts`, `npx jest test/`>
- Reporter / output format: <if relevant>
- Mocks / fixtures / test doubles: <approach for external dependencies>

## File Layout
<directory tree the writer stage will create — every test file path, plus any helper / fixture / config files. **Hard constraint**: every path MUST be rooted under `<project-dir>/test/`. The writer stage will reject and silently relocate any entry outside `test/`. Place test-framework configuration (e.g. `playwright.config.ts`, `pytest.ini`, `jest.config.js`) inside `test/` as well.>

## Test Cases
For each in-scope item, list the concrete test cases the writer must implement.

### <Module / Feature 1>
- [ ] <test case 1 — input / setup → expected outcome>
- [ ] <test case 2 — ...>

### <Module / Feature 2>
- [ ] ...

## Pass Criteria
- [ ] All listed test cases implemented
- [ ] Suite runs cleanly via the runner command — zero test-side errors
- [ ] <coverage target if given>
- [ ] <any other user-supplied gate>

## Bug Classification Policy (for `dry_run`)
- **test-side bug** — fault is in the test code (wrong assertion, stale fixture, bad selector, missing setup). Loops back to `writing` for repair.
- **app-side bug** — fault is in the application under test. Logged in the dry-run report's "App bugs" ledger, but does NOT block the suite from passing this workflow.

## Constraints
<anything off-limits, mocking rules, environment requirements>
```
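
The `test/`-rooted layout rule in the template is mechanical enough to check. A minimal sketch of the documented reject-and-relocate behavior, assuming project-relative paths (hypothetical helper, not part of the workflow):

```python
from pathlib import PurePosixPath

def relocate_under_test(path: str) -> str:
    """Enforce the File Layout hard constraint: every planned file
    must live under test/. Paths already rooted there pass through;
    anything else is relocated under test/, mirroring the writer
    stage's documented behavior. Illustrative only.
    """
    p = PurePosixPath(path)
    if p.is_absolute():
        raise ValueError("plan paths must be project-relative")
    if p.parts and p.parts[0] == "test":
        return str(p)
    return str(PurePosixPath("test") / p)

# e.g. relocate_under_test("src/foo.spec.ts") -> "test/src/foo.spec.ts"
```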

Step 5 — Get user approval

Tell the user:

"Plan saved to the session's plan-report.md. Please review and confirm to start writing the suite, or request changes."

If the user requests changes, edit the plan body but keep result: pending.

Step 6 — Finalize

When the user explicitly approves, edit the artifact: change `result: pending` → `result: approved`. That is the only action needed here. The main loop reads the `result:` and advances the state machine — do NOT call update-status.sh from this stage file.
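
A minimal sketch of that one-line edit, assuming the artifact keeps `result:` on its own frontmatter line as in the templates above (a plain text edit works just as well):

```python
import re
from pathlib import Path

def approve(artifact_path: str) -> None:
    """Flip `result: pending` to `result: approved` in the
    artifact's frontmatter, leaving everything else untouched."""
    path = Path(artifact_path)
    updated, count = re.subn(
        r"^result:[ \t]*pending[ \t]*$",
        "result: approved",
        path.read_text(),
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError("expected exactly one `result: pending` line")
    path.write_text(updated)
```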

workflow.json (raw config) drives the state machine.