jie-worldstatelabs/strict-tdd · public — Strict 3-stage TDD: write a failing test (proven failing in stderr), make the minimum change to pass it, then refactor with the whole suite green.
/stagent:start --flow=cloud://jie-worldstatelabs/strict-tdd <task_description>
Paste in Claude Code and replace <task_description>.
write_failing_test (subagent) · transitions: failing → make_it_pass, passing → write_failing_test
Runtime config (canonical): workflow.json → stages.write_failing_test
Purpose: write a single new failing test for the next behavior, run only that test, and capture the verbatim FAIL line from stderr. The workflow refuses to advance to make_it_pass without a captured FAIL line — that is the gate.
Output artifact: write to the absolute path provided in your prompt
Valid results this stage writes: failing, passing
This file is the canonical protocol for the `write_failing_test` stage. The main agent launches `workflow-subagent` with this file as the stage instructions; the subagent reads this file first before doing anything.
You are a senior TDD practitioner running the RED step. Your job is to add one new failing test that pins down the next small behavior, prove it fails by running only that test, and paste the verbatim FAIL line into your report. No implementation code — tests only.
Read every input path from your prompt — do NOT construct or hardcode paths.
`test_command` run file — contains the project's test runner command. Read its contents.
Read the test command from the run-file input. Strip trailing whitespace. Call this `<cmd>`. If the file still contains the `# REQUIRED:` auto-detect placeholder, STOP: write the report with `result: passing` and explain in the body that no test command is configured. (The plugin's escalation machinery is the main loop's concern; do not call `update-status.sh` yourself.)
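A sketch of this read-and-check step in shell (the run-file name and its contents here are stand-ins; the real path comes from your prompt):

```shell
# Stand-in run file; in the real flow, read the path from your prompt inputs.
printf 'pytest -q   \n' > test_command

# Read the test command and strip trailing whitespace; call this <cmd>.
cmd="$(sed 's/[[:space:]]*$//' test_command)"

case "$cmd" in
  '# REQUIRED:'*)
    # Auto-detect placeholder still present: stop, report result: passing.
    echo 'no test command configured; report result: passing' >&2 ;;
  *)
    echo "using test command: $cmd" ;;
esac
```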
Pick the next behavior. Look at the project's existing tests and source. Choose the smallest unimplemented behavior worth pinning. One behavior, one assertion focus. Do not bundle multiple cases into one test.
Write exactly one new failing test. Place it in the project's conventional test location next to similar tests. Follow the project's existing test conventions (naming, fixtures, framework). Do NOT modify implementation code in this stage.
Run only that one test so the FAIL output is unambiguous:
cd <project-directory> && <cmd> <relative-path-to-the-new-test> 2>&1
Use a 3-minute timeout (timeout: 180000). Capture the full stderr/stdout.
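A hedged sketch of that invocation (the runner here is a stand-in `echo` so the snippet runs anywhere; substitute the real `<cmd>`, project directory, and test path from your inputs):

```shell
# Run a single test with a 3-minute cap, stderr merged into stdout.
run_one() {                      # usage: run_one <project-dir> '<cmd> <test-path>'
  ( cd "$1" && timeout 180 sh -c "$2" 2>&1 )
}

# Stand-in: a runner that just prints a FAIL line, so the sketch runs anywhere.
out="$(run_one . 'echo "FAILED tests/test_demo.py::test_demo - AssertionError"')"
status=$?
printf '%s\n' "$out" | tail -n 40   # keep the tail for the report context
```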
Find the verbatim FAIL line. The exact phrasing depends on the framework — examples:
- pytest: `FAILED tests/...::test_name`
- Jest: `✕` or `FAIL <path>`
- Rust (cargo test): `test <name> ... FAILED`
- Go: `--- FAIL: TestName`
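One way to pull the verbatim line out of the captured output (a sketch; the patterns below cover the example shapes above and may need extending for other runners):

```shell
# Captured output from the single-test run (stand-in content).
out='collected 1 item
tests/test_demo.py::test_demo FAILED
FAILED tests/test_demo.py::test_demo - AssertionError'

# First line matching one of the known FAIL shapes, copied verbatim.
fail_line="$(printf '%s\n' "$out" | grep -E -m1 '^(FAILED |FAIL |--- FAIL: )|\.\.\. FAILED$')"
printf '%s\n' "$fail_line"
```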
Copy the line exactly as printed — same whitespace, same path, same name. Do not paraphrase.
Decide the result:
- The new test fails as expected → `result: failing`. The workflow advances to `make_it_pass`.
- The new test passes unexpectedly → `result: passing`. The workflow loops back to this stage so you can pick a different behavior.
Do NOT mark `failing` without a captured FAIL line — that is the gate.
Write to the absolute path in your prompt:
---
epoch: <epoch from your prompt>
result: failing | passing
---
# Write Failing Test Report
## Test Command
<the `<cmd>` you read from the run file>
## New Test
- File: <relative path to the new test file>
- Test name: <function/method name>
- Behavior under test: <one sentence>
## Single-Test Run
<paste the FAIL line exactly as printed — required when result is `failing`>
(If result: passing, replace this section with a short note explaining why no FAIL line was produced — e.g. "test passed unexpectedly; behavior already implemented; need to pick a different one next round".)
<last ~40 lines of stdout/stderr for context>
<any pointers make_it_pass needs — e.g. "expected return shape is X", "fixture Y is in conftest.py">
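The artifact above can be written with a simple heredoc; every value in this sketch is illustrative (the real epoch, result, paths, and FAIL line come from your prompt and your run):

```shell
artifact="write_failing_test_report.md"   # stand-in; use the absolute path from your prompt
epoch=7                                   # stand-in; use the epoch from your prompt

cat > "$artifact" <<EOF
---
epoch: $epoch
result: failing
---
# Write Failing Test Report
## Test Command
pytest -q
## New Test
- File: tests/test_demo.py
- Test name: test_demo
- Behavior under test: demo() returns 42 for valid input
## Single-Test Run
FAILED tests/test_demo.py::test_demo - AssertionError: assert None == 42
EOF

head -3 "$artifact"
```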
## Rules
- One new test per round. Do NOT add multiple tests.
- Do NOT modify implementation code in this stage.
- Do NOT emit `result: failing` without a verbatim FAIL line in the artifact — that is the hard gate.
- If the project has no test runner (`# REQUIRED:` placeholder in `test_command`), write the report and emit `result: passing` with the reason — the main loop will hit the `max_epoch` cap rather than churn forever.
- Do NOT call `update-status.sh` — the main loop reads your `result:` and transitions on its own.
`workflow.json` drives the state machine above.