Stagent
jie-worldstatelabs/root-cause-analysis · public

Simplified RCA flow — repro, diagnose with cited evidence, fix, then independently re-validate. Same discipline as before, one fewer stage.

by jie-worldstatelabs · updated Apr 27, 2026 · 4 stages · 0 runs

Run in Claude Code

/stagent:start --flow=cloud://jie-worldstatelabs/root-cause-analysis <task_description>

Paste in Claude Code and replace <task_description>

Template blueprint

State machine


Stage: reproduce

reproduce.md

inline · interruptible · transitions: done → diagnose

Stage: reproduce

Runtime config (canonical): workflow.json → stages.reproduce

Purpose: interview the bug reporter, capture symptoms, and produce a minimal reproduction plus an expected-vs-actual table that downstream stages can rely on.

Output artifact: write to the absolute path provided in your prompt.

Valid results this stage writes: pending (gathering info / repro in progress), done (repro and expected-vs-actual table are locked in).

This is an interruptible inline stage — the stop hook allows natural pauses for Q&A.

Note: by the time you read this file, state.md already exists with status: reproduce and the current epoch.

Inputs

  • Required: (none)
  • Optional: previous epoch's diagnose report — present only on loop-back when the prior diagnosis was inconclusive. Read it first: its "Why Inconclusive" section names the exact missing information that prevented confirmation. Use that to focus this round's interview questions instead of starting from scratch.

Step 1 — Stamp pending immediately

Read the current epoch from state.md. Before doing any work, write the output artifact at the path shown in your I/O context with:

```markdown
---
epoch: <epoch>
result: pending
---
# Reproduce Report (in progress)
```

This signals the stop hook that the stage is mid-work and is safe to pause for user input.
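The two actions in this step (read the epoch, write the stub) can be sketched in Python. This is a minimal sketch, not part of the workflow contract: the `epoch:` frontmatter key in state.md and the `stamp_pending` name are assumptions for illustration.

```python
import re
from pathlib import Path

def stamp_pending(state_path: str, artifact_path: str) -> int:
    """Read the current epoch from state.md, then write the pending stub artifact."""
    state = Path(state_path).read_text()
    # Assumes state.md carries an `epoch:` line in its frontmatter.
    match = re.search(r"^epoch:\s*(\d+)", state, re.MULTILINE)
    if match is None:
        raise ValueError("state.md has no epoch field")
    epoch = int(match.group(1))
    # Stamp the artifact before doing any real work, so the stop hook
    # knows the stage is mid-work and safe to pause for user input.
    Path(artifact_path).write_text(
        f"---\nepoch: {epoch}\nresult: pending\n---\n"
        "# Reproduce Report (in progress)\n"
    )
    return epoch
```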

Step 2 — Gather the bug report

Ask the user for the report. Pull on the loose threads — one question per message, multiple-choice when possible, typically 3–6 questions:

  • What did they do (user action, command, request)?
  • What did they expect to happen?
  • What actually happened (exact error message, stack trace, screenshot, log line)?
  • When did it start? Last known good state? Recent deploys / config changes?
  • Environment: OS, version, browser, runtime, branch / commit hash?
  • Frequency: every time, sometimes, only with specific inputs?

If the optional diagnose report is present, prioritise the gaps it called out — ask the user the specific question(s) that would have let the previous diagnose round confirm a hypothesis.

Stop asking when you have enough to attempt a repro.

Step 3 — Build the minimal repro

A "minimal repro" is the smallest set of steps or smallest script that triggers the bug deterministically. Keep cutting until removing one more thing makes the bug disappear.

Forms the repro can take, in order of preference:

  1. A shell command or script the next stage can run unattended (e.g. python repro.py, curl …, a failing test).
  2. A failing test case added to the project's test suite.
  3. A step-by-step manual sequence (only when 1 and 2 are impossible — e.g. requires a physical device or third-party UI).

Write the repro into the body of the report under ## Repro. If you created a script or test file in the project, list its path.
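A form-1 repro might look like the sketch below: a script the diagnose stage can run unattended, with its verdict encoded in the exit code. Everything here (the `lookup` function, the record shape, the bug itself) is hypothetical and stands in for whatever the real project exhibits.

```python
"""repro.py -- hypothetical minimal repro, runnable unattended.

Convention used here: exit 0 = bug reproduced, exit 1 = not reproduced.
State the convention you pick in the report so the next stage can rely on it.
"""
import sys

def lookup(record: dict, key: str):
    # Illustrative bug: a missing key silently returns None instead of
    # raising, so callers crash later with "cannot read ... of None".
    return record.get(key)

def main() -> int:
    user = {"name": "ada"}        # smallest input that triggers the bug
    value = lookup(user, "id")    # expected: an error or default; actual: None
    if value is None:
        print("REPRODUCED: lookup returned None for missing 'id'")
        return 0
    print("NOT REPRODUCED")
    return 1

if __name__ == "__main__":
    sys.exit(main())
```

Note how the script is already "minimal": one input, one call, one check. If deleting any line makes the bug disappear, that line belongs in the repro.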

Step 4 — Lock the expected-vs-actual table

```markdown
| # | Step / Input          | Expected                | Actual                   |
|---|-----------------------|-------------------------|--------------------------|
| 1 | <action>              | <observable result>     | <observed result>        |
```

Be specific — "an error" is not actual behavior, "TypeError: cannot read 'id' of undefined at user.js:42" is.

Step 5 — Confirm with the user

Show the user the repro and the expected-vs-actual table. Iterate until they confirm it matches the bug they were reporting. Do NOT advance until they explicitly confirm.

Step 6 — Finalize

Once the user confirms, overwrite the output artifact:

```markdown
---
epoch: <epoch>
result: done
---
# Reproduce Report

## Bug Summary
<one-line description>

## Environment
<OS, runtime/version, branch + commit hash, any relevant config>

## Repro
<exact commands, script path, or step-by-step sequence>

## Expected vs Actual
| # | Step / Input | Expected | Actual |
|---|---|---|---|
| 1 | ... | ... | ... |

## Notes
<frequency, recent changes, anything else the next stage should know>

## Loop-back Notes
(Include this section ONLY if the optional prior diagnose report was provided.)

- Diagnose gap that motivated this round: <quoted from the prior diagnose's "Why Inconclusive">
- New information gathered to close it: <one or two sentences>
```

That is the only action needed here. The main loop reads the artifact's result: and calls update-status.sh to advance the state machine — do NOT call it yourself.
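For context only (this stage must not do it), the main loop's read of the artifact can be pictured as a small frontmatter parse. The function below is an illustrative sketch, not the actual implementation behind update-status.sh.

```python
import re
from pathlib import Path

def read_result(artifact_path: str) -> str:
    """Extract the `result:` value from the artifact's YAML frontmatter (sketch)."""
    text = Path(artifact_path).read_text()
    match = re.search(r"^result:\s*(\w+)", text, re.MULTILINE)
    # Treat a missing or unreadable field as still pending rather than
    # advancing the state machine on malformed data.
    return match.group(1) if match else "pending"
```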

Rules

  • Do NOT propose fixes — your job is to capture, not to solve.
  • Do NOT skip user confirmation. A repro the user has not approved is worthless to downstream stages.
  • Do NOT mark the stage done until the expected-vs-actual table has at least one concrete row.
workflow.json (raw config) — drives the state machine above.