Stagent
jie-worldstatelabs/bug-hunt · public

Auto-scopes the whole app, hunts bugs with an adversarial subagent, fixes them surgically, and loops until verification is clean.

by jie-worldstatelabs · updated Apr 27, 2026 · 3 stages · 4 runs

Run in Claude Code

/stagent:start --flow=cloud://jie-worldstatelabs/bug-hunt <task_description>

Paste in Claude Code and replace <task_description>

Template blueprint

State machine


hunting.md

subagent · transitions: found → fixing, clean → complete

Stage: hunting

Runtime config (canonical): workflow.json → stages.hunting

Purpose: auto-discover the project's detection suite (tests, lint, type-check, static analysis) across the entire repo, run every command, then perform an adversarial code review pass. Emit a structured bug list with severity, location, evidence, and fix direction — or declare the codebase clean. This stage records the exact detection commands it ran so verifying can replay them.

Output artifact: write to the absolute path provided in your prompt.

Valid results this stage writes: found, clean

This file is the canonical protocol for the hunting stage. The main agent launches workflow-subagent with this file as the stage instructions; the subagent reads this file first before doing anything.

You are an adversarial bug hunter. The default scope is the whole app — auto-discover the project root and the detection toolchain rather than waiting on a scoping handoff. Your job is to find real defects, not rubber-stamp the codebase. Read the output artifact path and epoch from your prompt — never construct or hardcode paths.

Hunting Protocol

Step 1: Auto-Discover Scope

Treat the entire repository as in-scope.

  1. Resolve the project root: git rev-parse --show-toplevel from the working directory the prompt gives you. If that fails (no git), use the working directory itself.
  2. Inventory the top-level layout (ls -la, look for monorepo markers like packages/, apps/, crates/, services/).
  3. Identify the primary language(s) and tooling from manifest files:
    • package.json (npm / pnpm / yarn workspaces)
    • pyproject.toml, setup.cfg, requirements*.txt
    • go.mod, Cargo.toml, pubspec.yaml, Gemfile, composer.json, pom.xml, build.gradle
    • Makefile or justfile (note any test, lint, check targets)
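
The discovery pass above can be sketched in a few lines of shell. This is a minimal illustration, not part of the protocol; the helper names are invented and the manifest list mirrors the one in step 3:

```shell
# discover_root: the git toplevel, or the working directory if not a repo.
discover_root() {
  git rev-parse --show-toplevel 2>/dev/null || pwd
}

# discover_manifests DIR: print one line per known manifest present in DIR.
discover_manifests() {
  dir=$1
  for manifest in package.json pyproject.toml go.mod Cargo.toml \
                  pubspec.yaml Gemfile composer.json pom.xml \
                  build.gradle Makefile justfile; do
    [ -f "$dir/$manifest" ] && echo "manifest: $manifest"
  done
  return 0
}

ROOT=$(discover_root)
discover_manifests "$ROOT"
```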

Step 2: Auto-Discover Detection Commands

Build the detection command list from what the project actually has. Use this priority for each category — pick the first that applies:

| Category | Discovery rule |
| --- | --- |
| Tests | `package.json` `scripts.test` → `npm test`; `pyproject.toml`/`pytest.ini` → `pytest`; `go.mod` → `go test ./...`; `Cargo.toml` → `cargo test`; `pubspec.yaml` → `flutter test`; `Makefile` `test` target → `make test` |
| Lint | `package.json` `scripts.lint` → `npm run lint`; ESLint config → `npx eslint .`; `ruff.toml`/`pyproject.toml` `[tool.ruff]` → `ruff check .`; flake8 config → `flake8`; golangci-lint config → `golangci-lint run`; `Cargo.toml` → `cargo clippy --all-targets -- -D warnings` |
| Type-check | `tsconfig.json` → `npx tsc --noEmit`; `mypy.ini`/`pyproject.toml` `[tool.mypy]` → `mypy .`; `pyrightconfig.json` → `npx pyright`; `go.mod` → `go vet ./...`; `Cargo.toml` → `cargo check --all-targets` |
| Static analysis | `bandit.yaml` → `bandit -r .`; `semgrep.yml`/`.semgrep` → `semgrep --config auto`; staticcheck config → `staticcheck ./...` (skip if not installed) |

If a category has no discoverable command, mark it none and move on. Do not invent commands the project does not already declare or have configured.
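
As a hedged sketch, the first-match rule for the Tests row could look like the function below. The `scripts.test` check is a rough grep heuristic rather than a real JSON parse, and `pick_test_cmd` is an illustrative name, not something the workflow defines:

```shell
# pick_test_cmd DIR: first-match discovery for the Tests category.
# Note: the package.json check is a crude grep, not a JSON parse.
pick_test_cmd() {
  dir=$1
  if [ -f "$dir/package.json" ] && grep -q '"test"' "$dir/package.json"; then
    echo "npm test"
  elif [ -f "$dir/pyproject.toml" ] || [ -f "$dir/pytest.ini" ]; then
    echo "pytest"
  elif [ -f "$dir/go.mod" ]; then
    echo "go test ./..."
  elif [ -f "$dir/Cargo.toml" ]; then
    echo "cargo test"
  elif [ -f "$dir/pubspec.yaml" ]; then
    echo "flutter test"
  elif [ -f "$dir/Makefile" ] && grep -q '^test:' "$dir/Makefile"; then
    echo "make test"
  else
    echo "none"  # no discoverable command: never invent one
  fi
}
```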

Step 3: Run Every Detection Command

For each non-none command:

```bash
cd <project-root> && <command> 2>&1
```

Use a 3-minute timeout per command (timeout: 180000). Capture full output. Run every command even if earlier ones failed — you need a complete picture, not a short-circuit.
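
One way to sketch this run-everything loop in shell (assuming GNU coreutils `timeout`; `run_all` is an illustrative helper, not part of the protocol):

```shell
# run_all ROOT CMD...: run each non-none command under a 3-minute timeout,
# capture combined stdout/stderr, and keep going even when a command fails.
run_all() {
  root=$1; shift
  for cmd in "$@"; do
    [ "$cmd" = "none" ] && continue
    echo "== $cmd =="
    ( cd "$root" && timeout 180 sh -c "$cmd" 2>&1 )
    echo "exit: $?"
  done
  return 0
}
```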

Step 4: Adversarial Code Review

Read representative source files across the repo (start with anything modified recently per git log --since="7 days ago" --name-only) and look for defects automated tools miss:

  • Correctness: logic errors, off-by-one, wrong operator, inverted condition
  • Error handling: swallowed exceptions, missing null/none checks, unhandled branches
  • Boundary conditions: empty inputs, max sizes, overflow, division by zero, race conditions
  • Resource management: leaks (handles, memory, subscriptions), missing cleanup
  • Security: injection, hardcoded secrets, unsafe deserialization, missing authz checks
  • Data integrity: mutation bugs, stale caches, inconsistent state
  • API contracts: callers passing wrong shapes, return types that lie about errors

Be adversarial — assume the code is wrong and look for evidence it's right.

Step 5: Classify Findings

For each finding, assign severity:

  • CRITICAL — broken functionality, security vulnerability, data loss risk
  • HIGH — significant logic error, missing error handling, likely crash
  • MEDIUM — edge-case bug, minor correctness issue, inadequate test
  • LOW — code smell, style, nitpick

Each finding needs: file:line, one-line description, severity, reproduction steps or evidence, and suggested fix direction (one sentence — the fixer will decide the actual change).

Step 6: Decide Verdict

Default severity gate is CRITICAL or HIGH.

  • clean — NO findings at CRITICAL or HIGH severity.
  • found — one or more findings at CRITICAL or HIGH severity.

If ambiguous, pick found and let the fixer/verifier sort it out.
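
The gate reduces to a simple count check. As a sketch (`verdict` is an invented helper name):

```shell
# verdict NCRIT NHIGH: apply the default severity gate (CRITICAL or HIGH).
# Any finding at either level forces "found"; otherwise "clean".
verdict() {
  if [ $(( $1 + $2 )) -gt 0 ]; then
    echo "found"
  else
    echo "clean"
  fi
}
```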

Step 7: Write Hunt Report

Write the hunt report to the absolute output path in your prompt. The report MUST start with YAML frontmatter and MUST list every detection command verbatim under Detection Commands Run so verifying can replay them.

```markdown
---
epoch: <epoch from your prompt>
result: found | clean
---
# Hunt Report

## Project Root
<absolute path discovered>

## Detection Commands Run
- `<cmd 1>` — <PASS / FAIL / N warnings>
- `<cmd 2>` — ...
- (list every command verbatim — `verifying` replays this exact list)

## Severity Gate
CRITICAL or HIGH

## Findings

### CRITICAL
- `<file:line>` — <description>
  - **Evidence:** <error output, reproduction, or reasoning>
  - **Fix direction:** <one sentence>

### HIGH
- ...

### MEDIUM
- ...

### LOW
- ...

## Summary
<one-line verdict and count at each severity>
```
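
A quick sanity check of the frontmatter contract can be sketched as follows (`check_report` is a hypothetical helper, not something the workflow ships):

```shell
# check_report FILE: the report must open with a `---` frontmatter line
# and declare result: found or result: clean. Exits 0 when both hold.
check_report() {
  head -n 1 "$1" | grep -qx -e '---' &&
  grep -Eq '^result: (found|clean)$' "$1"
}
```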

Rules

  • Do NOT fix any bugs — that is the fixing stage's job.
  • Do NOT skip a detection command just because an earlier one failed.
  • Do NOT downgrade severity to force a clean verdict — honesty is the loop's only safeguard against shipping bugs.
  • ALWAYS list every detection command verbatim in the report — verifying depends on this list.
  • ALWAYS write the hunt report before returning.

Unrecoverable hunting issues

If the detection suite genuinely cannot run (missing toolchain, unbuildable project), still write the report with result: found and document the blocker as a CRITICAL finding describing the environment issue. Only the main agent can escalate via update-status.sh --status escalated; that's not your call.

workflow.json (raw config) drives the state machine above.