The Problem with Task-Level AI
Most AI workflows are still task-level: write this function, fix this bug, draft this PR comment. Useful, but small.
Projects are different. They are multi-hour or multi-day, cross-language, full of ambiguity, and packed with tradeoffs. You cannot run that safely with a single prompt and a generic coding bot.
This is not a side interest. It is the next operational unlock, and for people like me it is becoming a requirement. If you want to stay competitive and stay employed, you need systems that can own projects, not just isolated tasks. That means role specialization, explicit gates, adversarial review loops, and decision state that can be audited later.
The Project Flow
The entry point is simple: /project [TICKET]. From there, specialist agents run the lifecycle.
Layer 1: Research + Architecture run in one agent to define acceptance criteria (AC) and technical direction
Layer 2: Specialist + Vanguard pairs move through planning, programming, and PR review phases
Layer 3: QA runs validation and drives a bugfix loop until release criteria are met
Final gate: Research/Architecture agent verifies AC compliance and design integrity
Human input is required only when ambiguity cannot be resolved from available context, or when agent consensus stalls and cannot converge safely.
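To make the flow concrete, here is a minimal orchestration sketch. Everything in it (ProjectState, run_agent, the stub gates, the phase strings) is an illustrative assumption, not the actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ProjectState:
    ticket: str
    acceptance_criteria: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

def run_agent(role: str, phase: str, state: ProjectState) -> ProjectState:
    """Stub: dispatch an agent for one phase and return the updated state."""
    print(f"[{role}] {phase} on {state.ticket}")
    return state

def needs_human(state: ProjectState) -> bool:
    """Escalate only when ambiguity cannot be resolved from context."""
    return bool(state.open_questions)

def qa_validate(state: ProjectState) -> bool:
    """Stub: QA checks release criteria; always passes in this sketch."""
    return True

def run_project(ticket: str) -> ProjectState:
    state = ProjectState(ticket=ticket)

    # Layer 1: research + architecture in one agent sets AC and direction.
    state = run_agent("research_architecture", "define_ac", state)

    # Layer 2: the same specialist/vanguard pair carries every phase.
    for phase in ("planning", "programming", "pr_review"):
        state = run_agent("specialist_vanguard_pair", phase, state)
        if needs_human(state):
            raise RuntimeError(f"human input required during {phase}")

    # Layer 3: QA validation drives a bugfix loop until criteria pass.
    while not qa_validate(state):
        state = run_agent("specialist_vanguard_pair", "bugfix", state)

    # Final gate: Layer 1 re-verifies AC compliance and design integrity.
    return run_agent("research_architecture", "final_gate", state)

run_project("TICKET-123")
```

The shape matters more than the names: one entry point, one state object threaded through every layer, and loops that exit only on explicit criteria.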
The 3-Layer Model
Flattening the workflow to three layers improved reliability and reduced orchestration overhead. It keeps accountability clear while preserving specialization where it matters.
Layer 1 combines research and architecture into one role, eliminating handoff drift between requirements and design. Layer 2 keeps specialist and vanguard pairs, but uses the same pairs across planning, programming, and PR review phases. Layer 3 is QA with a direct bugfix feedback loop.
Same checks and balances, fewer moving parts.
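One way to keep the moving parts countable is a declarative layer map the orchestrator iterates over. The shape below is an assumption for illustration, not the system's actual configuration format.

```python
# Hypothetical layer map: role and phases per layer, in execution order.
LAYERS = [
    {"layer": 1, "role": "research_architecture",
     "phases": ["define_ac"]},
    {"layer": 2, "role": "specialist_vanguard_pair",
     "phases": ["planning", "programming", "pr_review"]},
    {"layer": 3, "role": "qa",
     "phases": ["validate", "bugfix_loop"]},
]

# The final gate reuses Layer 1, keeping accountability in one place.
FINAL_GATE = {"layer": 1, "role": "research_architecture",
              "phases": ["final_gate"]}
```

Flattening shows up directly here: fewer entries means fewer handoffs to audit when something drifts.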
One Pair, Multiple Phases
Each Specialist is paired with a Vanguard. Same technical depth, different objective. The difference now is continuity: the same pair stays active through planning, programming, and PR review.
Specialists drive implementation. Vanguards challenge assumptions and protect codebase integrity. Keeping the same pair across phases preserves context and cuts re-explaining overhead.
If they deadlock, they escalate to Layer 1 (Research + Architecture) or request targeted human input.
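The control flow of a single phase might look like the sketch below. The helper functions and the round cap are invented for illustration; only the shape (propose, challenge, revise, escalate) follows the description above.

```python
MAX_ROUNDS = 3  # assumed cap before the pair counts as deadlocked

def specialist_propose(phase: str, context: dict) -> dict:
    """Stub: the specialist drafts an implementation plan or diff."""
    return {"phase": phase, "proposal": "initial draft"}

def vanguard_challenge(proposal: dict, context: dict) -> list[str]:
    """Stub: the vanguard returns objections; an empty list is sign-off."""
    return []

def specialist_revise(proposal: dict, objections: list[str]) -> dict:
    """Stub: the specialist addresses each objection in turn."""
    return {**proposal, "addressed": objections}

def escalate(phase: str, proposal: dict) -> dict:
    """Stub: hand the deadlock to Layer 1 or request targeted human input."""
    return {**proposal, "escalated": True}

def run_pair_phase(phase: str, context: dict) -> dict:
    proposal = specialist_propose(phase, context)
    for _ in range(MAX_ROUNDS):
        objections = vanguard_challenge(proposal, context)
        if not objections:
            return proposal  # consensus reached, phase complete
        proposal = specialist_revise(proposal, objections)
    return escalate(phase, proposal)  # deadlock after MAX_ROUNDS
```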
State Files: The Most Important Layer
Every agent writes state JSON throughout the flow: stage, rationale, unresolved questions, confidence, and handoff data.
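A single update might look like the sketch below. The field names come from the list above; the agent name, values, and file path are invented for illustration.

```python
import json
import time

# Hypothetical snapshot from one agent; every value here is made up.
state = {
    "agent": "specialist_backend",
    "stage": "programming",
    "rationale": "chose a queue-based retry over a sync call to isolate failures",
    "unresolved_questions": ["is a 30s timeout acceptable for bulk jobs?"],
    "confidence": 0.8,
    "handoff": {"to": "vanguard_backend",
                "artifacts": ["branch: feature/retry-queue"]},
    "updated_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
}

# Assumed convention: one JSON file per agent, rewritten at each stage change.
with open("specialist_backend.state.json", "w") as f:
    json.dump(state, f, indent=2)
```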
This gives us an auditable trail of why decisions were made, not just what code changed. That becomes fuel for post-project review, process tuning, and future automation.
Without state, you get output. With state, you get a system that can improve itself.
What This System Is (and Isn't)
This is long-horizon, autonomous, directed development: project ownership by AI teams with policy-based controls and minimal, high-leverage human intervention.
It is not fire-and-forget autonomy. High-ambiguity and high-impact decisions still need a human gate.
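A policy-based gate can be as small as an explicit threshold check. The scores and cutoffs below are invented; the point is that the escalation rule lives in auditable code rather than buried in a prompt. A sketch:

```python
from dataclasses import dataclass

@dataclass
class Decision:
    description: str
    impact: float     # 0..1, assumed to be scored upstream by the agents
    ambiguity: float  # 0..1, same assumption

# Hypothetical policy thresholds; real values would be tuned per team.
IMPACT_GATE = 0.7
AMBIGUITY_GATE = 0.6

def requires_human(d: Decision) -> bool:
    """Route to a human only when impact or ambiguity crosses policy."""
    return d.impact >= IMPACT_GATE or d.ambiguity >= AMBIGUITY_GATE

# Example: a schema migration is high impact, so it stops at the gate.
print(requires_human(Decision("drop legacy column", impact=0.9, ambiguity=0.2)))
```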
The point is not to remove humans from software. The point is to remove humans from repetitive project mechanics so they can stay focused on judgment, direction, and outcomes.
We are not automating tasks. We are operationalizing project delivery.
Closing
This is early, and it is still evolving. But the direction is clear: coordinated agent teams can now carry significant project weight with surprisingly little human input.
The teams that win will not be the ones with the biggest model. They will be the ones with the clearest workflows, strictest gates, and best feedback loops.