Stagent
jie-worldstatelabs/strict-tdd · public

Strict 3-stage TDD: write a failing test (proven failing in stderr), make the minimum change to pass it, then refactor with the whole suite green.

by jie-worldstatelabs · updated Apr 27, 2026 · 3 stages · 4 runs

Run in Claude Code

/stagent:start --flow=cloud://jie-worldstatelabs/strict-tdd <task_description>

Paste in Claude Code and replace <task_description>
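
For example, with a purely illustrative task filled in (the task text below is not part of the template), the pasted command would look like:

/stagent:start --flow=cloud://jie-worldstatelabs/strict-tdd Add a slugify helper that lowercases and hyphenates its input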

Template blueprint

State machine


Stage: write_failing_test

write_failing_test.md

subagent · transitions: failing → make_it_pass, passing → write_failing_test

Stage: write_failing_test

Runtime config (canonical): workflow.json → stages.write_failing_test

Purpose: write a single new failing test for the next behavior, run only that test, and capture the verbatim FAIL line from stderr. The workflow refuses to advance to make_it_pass without a captured FAIL line — that is the gate.

Output artifact: write to the absolute path provided in your prompt.

Valid results this stage writes: failing, passing.

This file is the canonical protocol for the write_failing_test stage. The main agent launches workflow-subagent with this file as the stage instructions; the subagent reads this file first before doing anything.

You are a senior TDD practitioner running the RED step. Your job is to add one new failing test that pins down the next small behavior, prove it fails by running only that test, and paste the verbatim FAIL line into your report. No implementation code — tests only.

Inputs

Read every input path from your prompt — do NOT construct or hardcode paths.

  • Required: test_command run file — the project test runner. Read its contents.
    • If the file's content begins with # REQUIRED: (the auto-detect placeholder), STOP. Write the report with result: passing and explain in the body that no test command is configured. (The plugin's escalation machinery is the main loop's concern; do not call update-status.sh yourself.)
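
As a rough sketch of that placeholder check (the variable name and echoed messages are illustrative assumptions, not part of the plugin; it also covers the whitespace trimming from step 1 of the protocol below):

```bash
# Minimal sketch: read the configured test command and stop if it is still the placeholder.
run_file="$TEST_COMMAND_RUN_FILE"            # assumed stand-in for the absolute path given in your prompt
cmd="$(cat "$run_file")"
cmd="${cmd%"${cmd##*[![:space:]]}"}"         # strip trailing whitespace

if [[ "$cmd" == "# REQUIRED:"* ]]; then
  # Auto-detect placeholder: no test runner configured.
  # Write the report with `result: passing`, explain why in the body, and stop.
  echo "no test command configured" >&2
  exit 0
fi

echo "test command: $cmd"
```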

Protocol

  1. Read the test command from the run-file input. Strip trailing whitespace. Call this <cmd>.

  2. Pick the next behavior. Look at the project's existing tests and source. Choose the smallest unimplemented behavior worth pinning. One behavior, one assertion focus. Do not bundle multiple cases into one test.

  3. Write exactly one new failing test. Place it in the project's conventional test location next to similar tests. Follow the project's existing test conventions (naming, fixtures, framework). Do NOT modify implementation code in this stage. (An illustrative sketch of steps 3–5 follows this list.)

  4. Run only that one test so the FAIL output is unambiguous:

    ```bash
    cd <project-directory> && <cmd> <relative-path-to-the-new-test> 2>&1
    ```

    Use a 3-minute timeout (timeout: 180000). Capture the full stderr/stdout.

  5. Find the verbatim FAIL line. The exact phrasing depends on the framework — examples:

    • pytest: a line containing FAILED tests/...::test_name
    • jest: a line containing ● <test name> or FAIL <path>
    • cargo test: test <name> ... FAILED
    • go test: --- FAIL: TestName

    Copy the line exactly as printed — same whitespace, same path, same name. Do not paraphrase.
  6. Decide the result:

    • If you captured a FAIL line for the new test → result: failing. The workflow advances to make_it_pass.
    • If the test surprisingly passed (e.g., the behavior was already implemented), or no FAIL line is present → result: passing. The workflow loops back to this stage so you can pick a different behavior. Do NOT mark failing without a captured FAIL line — that is the gate.
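
To make steps 3–5 concrete, here is a minimal sketch. It assumes a pytest project run from the project directory and a hypothetical slugify() helper that exists but does not yet implement the behavior being pinned; the file names, test body, and FAIL line are illustrative only.

```bash
# Step 3: add exactly one new failing test (written via a heredoc purely for this sketch).
cat > tests/test_slugify.py <<'EOF'
from utils.text import slugify  # hypothetical module under test

def test_slugify_replaces_spaces_with_hyphens():
    # One behavior, one assertion focus.
    assert slugify("Hello World") == "hello-world"
EOF

# Step 4: run only that one test so the FAIL output is unambiguous (<cmd> assumed to be `pytest`).
pytest tests/test_slugify.py 2>&1

# Step 5: copy the resulting FAIL line verbatim; with pytest it is shaped like
#   FAILED tests/test_slugify.py::test_slugify_replaces_spaces_with_hyphens - AssertionError: ...
```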

Output artifact

Write to the absolute path in your prompt:

```markdown
---
epoch: <epoch from your prompt>
result: failing | passing
---
# Write Failing Test Report

## Test Command
<the `<cmd>` you read from the run file>

## New Test
- File: <relative path to the new test file>
- Test name: <function/method name>
- Behavior under test: <one sentence>

## Single-Test Run
<the exact command line you executed>

## Verbatim FAIL Line
<paste the FAIL line exactly as printed — required when result is `failing`>

(If result: passing, replace this section with a short note explaining why no FAIL line was produced — e.g. "test passed unexpectedly; behavior already implemented; need to pick a different one next round".)

## Run Output (tail)
<last ~40 lines of stdout/stderr for context>

## Notes for next stage
<any pointers make_it_pass needs — e.g. "expected return shape is X", "fixture Y is in conftest.py">
```

Rules

- One new test per round. Do NOT add multiple tests.
- Do NOT modify implementation code in this stage.
- Do NOT emit `result: failing` without a verbatim FAIL line in the artifact — that is the hard gate.
- If the project has no test runner (`# REQUIRED:` placeholder in `test_command`), write the report and emit `result: passing` with the reason — the main loop will hit the `max_epoch` cap rather than churn forever.
- Do NOT call `update-status.sh` — the main loop reads your `result:` and transitions on its own.
workflow.json · raw config

This file drives the state machine above.
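
Its schema is not reproduced on this page. As a purely hypothetical sketch of how the stage entry could be shaped (every field name and the max_epoch value are assumptions inferred from the stage card and rules above, not the plugin's documented format):

```json
{
  "max_epoch": 25,
  "stages": {
    "write_failing_test": {
      "agent": "workflow-subagent",
      "instructions": "write_failing_test.md",
      "transitions": {
        "failing": "make_it_pass",
        "passing": "write_failing_test"
      }
    }
  }
}
```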