jie-worldstatelabs/bug-hunt (public): Auto-scopes the whole app, hunts bugs with an adversarial subagent, fixes them surgically, and loops until verification is clean.
Paste `/stagent:start --flow=cloud://jie-worldstatelabs/bug-hunt <task_description>` into Claude Code and replace `<task_description>` with your task.
`hunting` (subagent) · transitions: found → fixing, clean → complete
Runtime config (canonical): workflow.json → stages.hunting
Purpose: auto-discover the project's detection suite (tests, lint, type-check, static analysis) across the entire repo, run every command, then perform an adversarial code review pass. Emit a structured bug list with severity, location, evidence, and fix direction — or declare the codebase clean. This stage records the exact detection commands it ran so verifying can replay them.
Output artifact: write to the absolute path provided in your prompt
Valid results this stage writes: found, clean
This file is the canonical protocol for the `hunting` stage. The main agent launches `workflow-subagent` with this file as the stage instructions; the subagent reads this file first before doing anything.
You are an adversarial bug hunter. The default scope is the whole app — auto-discover the project root and the detection toolchain rather than waiting on a scoping handoff. Your job is to find real defects, not rubber-stamp the codebase. Read the output artifact path and epoch from your prompt — never construct or hardcode paths.
Treat the entire repository as in-scope.
1. Find the project root: run `git rev-parse --show-toplevel` from the working directory the prompt gives you. If that fails (no git), use the working directory itself.
2. Survey the root (`ls -la`; look for monorepo markers like `packages/`, `apps/`, `crates/`, `services/`).
3. Identify manifests: `package.json` (npm / pnpm / yarn workspaces), `pyproject.toml`, `setup.cfg`, `requirements*.txt`, `go.mod`, `Cargo.toml`, `pubspec.yaml`, `Gemfile`, `composer.json`, `pom.xml`, `build.gradle`, and any `Makefile` or `justfile` (note any `test`, `lint`, `check` targets).

Build the detection command list from what the project actually has. Use this priority for each category — pick the first that applies:
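The root-discovery step can be sketched in shell (a sketch only; the `$PWD` fallback is the no-git case):

```shell
#!/usr/bin/env sh
# Resolve the project root: prefer the git toplevel, fall back to the
# working directory when the tree is not a git checkout.
if root=$(git rev-parse --show-toplevel 2>/dev/null); then
  echo "project root (git): $root"
else
  root=$PWD
  echo "project root (no git): $root"
fi

# Flag common monorepo markers at the root.
for marker in packages apps crates services; do
  [ -d "$root/$marker" ] && echo "monorepo marker: $marker/" || true
done
```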
| Category | Discovery rule |
|---|---|
| Tests | package.json scripts.test → npm test; pyproject.toml/pytest.ini → pytest; go.mod → go test ./...; Cargo.toml → cargo test; pubspec.yaml → flutter test; Makefile test target → make test |
| Lint | package.json scripts.lint → npm run lint; ESLint config → npx eslint .; ruff.toml/pyproject.toml [tool.ruff] → ruff check .; flake8 config → flake8; golangci-lint config → golangci-lint run; Cargo.toml → cargo clippy --all-targets -- -D warnings |
| Type-check | tsconfig.json → npx tsc --noEmit; mypy.ini/pyproject.toml [tool.mypy] → mypy .; pyrightconfig.json → npx pyright; go.mod → go vet ./...; Cargo.toml → cargo check --all-targets |
| Static analysis | bandit.yaml → bandit -r .; semgrep.yml/.semgrep → semgrep --config auto; staticcheck config → staticcheck ./... (skip if not installed) |
If a category has no discoverable command, mark it none and move on. Do not invent commands the project does not already declare or have configured.
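As an illustration, the "pick the first that applies" rule for the Tests row could look like this file-based probe (a sketch; a real pass might use `jq` to check `scripts.test` precisely rather than a `grep`):

```shell
#!/usr/bin/env sh
# Pick the first applicable test command, mirroring the priority table.
# Checks are purely file-based; nothing is executed here.
detect_test_cmd() {
  dir=$1
  if [ -f "$dir/package.json" ] && grep -q '"test"' "$dir/package.json"; then
    echo "npm test"
  elif [ -f "$dir/pyproject.toml" ] || [ -f "$dir/pytest.ini" ]; then
    echo "pytest"
  elif [ -f "$dir/go.mod" ]; then
    echo "go test ./..."
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/pubspec.yaml" ]; then
    echo "flutter test"
  elif [ -f "$dir/Makefile" ] && grep -q '^test:' "$dir/Makefile"; then
    echo "make test"
  else
    echo "none"
  fi
}
```

`detect_test_cmd .` prints the chosen command, or `none` when nothing is declared.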
For each non-none command:
`cd <project-root> && <command> 2>&1`

Use a 3-minute timeout per command (`timeout: 180000`). Capture full output. Run every command even if earlier ones failed — you need a complete picture, not a short-circuit.
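One way to sketch that run loop, assuming the discovered commands were collected into a list (the 180-second cap matches the 180000 ms above; `timeout` here is the GNU coreutils tool):

```shell
#!/usr/bin/env sh
# Run each detection command from the project root with a 3-minute cap.
# Failures never stop the loop: the hunt needs the full picture.
run_all() {
  root=$1; shift
  for cmd in "$@"; do
    out=$(cd "$root" && timeout 180 sh -c "$cmd" 2>&1)
    status=$?
    echo "== $cmd (exit $status) =="
    echo "$out"
  done
}
```

For example: `run_all /path/to/repo "npm test" "npx eslint ."`.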
Read representative source files across the repo (start with anything modified recently per git log --since="7 days ago" --name-only) and look for defects automated tools miss:
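The recent-change heuristic above amounts to something like the following (assuming a git checkout; `--pretty=format:` suppresses the commit headers so only file paths remain):

```shell
# Unique list of files touched in the last 7 days; start the manual
# review pass with these before sampling the rest of the repo.
git log --since="7 days ago" --name-only --pretty=format: \
  | sed '/^$/d' | sort -u
```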
Be adversarial — assume the code is wrong and look for evidence it's right.
For each finding, assign severity:
Each finding needs: file:line, one-line description, severity, reproduction steps or evidence, and suggested fix direction (one sentence — the fixer will decide the actual change).
Default severity gate is CRITICAL or HIGH.
If ambiguous, pick found and let the fixer/verifier sort it out.
Write the hunt report to the absolute output path in your prompt. The report MUST start with YAML frontmatter and MUST list every detection command verbatim under Detection Commands Run so verifying can replay them.
---
epoch: <epoch from your prompt>
result: found | clean
---
# Hunt Report
## Project Root
<absolute path discovered>
## Detection Commands Run
- `<cmd 1>` — <PASS / FAIL / N warnings>
- `<cmd 2>` — ...
- (list every command verbatim — `verifying` replays this exact list)
## Severity Gate
CRITICAL or HIGH
## Findings
### CRITICAL
- `<file:line>` — <description>
- **Evidence:** <error output, reproduction, or reasoning>
- **Fix direction:** <one sentence>
### HIGH
- ...
### MEDIUM
- ...
### LOW
- ...
## Summary
<one-line verdict and count at each severity>

Rules:
- Do not fix anything you find; that is the `fixing` stage's job.
- Do not soften findings into a `clean` verdict — honesty is the loop's only safeguard against shipping bugs.
- List every detection command verbatim; `verifying` depends on this list.
- If the detection suite genuinely cannot run (missing toolchain, unbuildable project), still write the report with `result: found` and document the blocker as a CRITICAL finding describing the environment issue.
- Only the main agent can escalate via `update-status.sh --status escalated`; that's not your call.
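A minimal `clean` report in the shape above could be written with a heredoc; `OUT`, `EPOCH`, and the example command are placeholders for values that actually come from the prompt and the discovery pass:

```shell
#!/usr/bin/env sh
# Emit a minimal "clean" hunt report. OUT and EPOCH stand in for the
# absolute artifact path and epoch supplied in the subagent prompt.
OUT=${OUT:-/tmp/hunt-report.md}
EPOCH=${EPOCH:-0}
cat > "$OUT" <<EOF
---
epoch: $EPOCH
result: clean
---
# Hunt Report
## Project Root
$(pwd)
## Detection Commands Run
- \`go test ./...\` — PASS
## Severity Gate
CRITICAL or HIGH
## Findings
(none)
## Summary
Clean: 0 CRITICAL, 0 HIGH, 0 MEDIUM, 0 LOW.
EOF
```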