Flow: Moving Work from Idea to PR Through Structured Handoffs
AI coding tools are genuinely good at the first 80% of a task. The last 20% is where they get unreliable — they either silently guess wrong, or they dump the ambiguity back on you without framing it in a way that’s actionable.
I built Flow to solve that. It’s a skill system for Claude Code that moves work from idea to shipped PR through a series of structured stages, with a document at each boundary that both humans and AI agents can read and edit.
The repo is at github.com/jyliang/flow.
The Core Insight: Documents Are the Interface
The key idea is that between every stage of development, there should be a document. Not a chat message, not a CLI flag — a file on disk that both you and the AI can see, read, and edit.
That document serves two purposes:
- You read it to understand what happened, and edit it to redirect if needed
- The next AI agent reads it to continue work from exactly where things stand
This creates a clean separation. The AI does a stage of work and produces a document. You review the document and decide whether to advance. If you want to change direction, you edit the document. The AI picks up your edits in the next stage.
The documents are the API between human and AI, and between AI and AI.
The Pipeline
Idea
↓ [explore]
Spec ← is this what we're building?
↓ [plan]
Plan ← is this how we're building it?
↓ [implement]
Changes ← normal code review
↓ [review]
Findings ← what needs my judgment?
↓ [ship]
PR
Five stages. Each stage is a skill that reads the previous document and produces the next one. The entry point is a single /flow command — it detects where you are in the pipeline and advances work to the next stage.
Stage detection is straightforward:
- No agent/spec.md → run explore, produce a spec
- Spec exists, no plan → run plan, produce an implementation plan
- Plan exists with incomplete steps → run implement, make the changes
- Plan complete or unreviewed changes on branch → run review, produce findings
- Findings exist with unresolved items → run ship, open a PR
If you give explicit intent — “review this PR”, “ship it” — Flow skips detection and goes directly there.
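The detection rules above can be sketched as a small shell function. This is an illustrative reconstruction, not code from the repo: agent/spec.md is the path named in this post, but plan.md and findings.md are assumed filenames, and the "unreviewed changes on branch" check is omitted for brevity.

```shell
# Hypothetical sketch of Flow's stage detection. Filenames other than
# agent/spec.md are assumptions; branch state is not checked.
detect_stage() {
  if [ ! -f agent/spec.md ]; then
    echo explore                        # no spec yet: explore the idea
  elif [ ! -f agent/plan.md ]; then
    echo plan                           # spec exists, no plan: plan it
  elif grep -q '^- \[ \]' agent/plan.md; then
    echo implement                      # plan has unchecked steps
  elif [ ! -f agent/findings.md ]; then
    echo review                         # plan done, changes unreviewed
  else
    echo ship                           # findings exist: open the PR
  fi
}
```

Explicit intent would simply bypass this function and dispatch straight to the named stage.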
Document Depth Scales With Complexity
One thing I worked hard to get right: the ceremony is proportional to the complexity of the task.
A one-line bug fix produces a three-line spec and skips straight to implementation. The document still exists — it’s still the interface — but it’s minimal. There’s no six-section impact analysis for renaming a variable.
A complex feature gets the full treatment: a detailed spec with edge cases and impact analysis, a multi-step plan with dependencies, multiple review rounds. The structure is always there. The depth is proportional.
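To make the contrast concrete, a minimal spec for a small fix might look something like this (every name and detail below is invented for illustration, not taken from Flow):

```markdown
# Spec: fix empty-input crash in parseDate

Problem: parseDate throws on empty strings.
Fix: return null for empty or whitespace-only input; add a regression test.
```

Same document, same role as interface, just scaled down to match the task.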
This matters because the overhead of the system has to be lower than the value it creates. If every change required a PhD thesis to ship, people would stop using it immediately. The system has to be fast for the simple cases and thorough for the complex ones.
Revisions Are Communication
Work isn’t linear. During implementation, you discover the spec was wrong. During review, you realize the plan missed a step.
Flow handles this explicitly through revisions. When work deviates from an earlier document, the system updates that document and appends a revision entry:
## Revisions
- **implement → spec** 2026-04-16: Changed auth from JWT to session cookies
  **Why**: Existing middleware only supports sessions. Rewriting is out of scope.
  **Impact**: Plan steps 3-5 updated. No JWT dependency needed.
This is not a bug in the process — it’s a feature. The revision trail answers questions that come up constantly in software teams: “Why does the code differ from the spec?” “When did we change approach?” “Who decided this and why?” The decisions get captured instead of lost in a chat thread.
The Review Stage Self-Verifies
The review skill is the one I spent the most time on, because it’s where the last-20% problem is most acute.
After implementation, the review skill reads the diff and produces a findings document. Each finding is categorized: blocking issues that need to be fixed, warnings worth noting, and things that look fine.
But the key behavior is that the review stage self-verifies every finding before surfacing it to you. If the agent thinks there’s a bug in line 42, it checks line 42 before putting it in the findings. It looks for counter-evidence. It confirms the issue is actually present.
This dramatically reduces false positives. A review that cries wolf on every trivial thing trains you to ignore it. A review that’s accurate trains you to take it seriously.
Only genuinely ambiguous findings — things that require human judgment about product direction or architectural tradeoffs — make it to the top of the list. The mechanical stuff gets fixed automatically before it reaches you.
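As a sketch, a findings document from this stage might be shaped roughly like the following. The structure and contents are hypothetical, reconstructed from the description above rather than copied from the repo:

```markdown
# Findings: feature/session-auth

## Needs your judgment
- Session TTL defaults to 24h. Is that acceptable for admin accounts,
  or should privileged sessions expire sooner?

## Fixed during review
- Missing null check in logout handler (verified against the diff, then patched).

## Verified fine
- Cookie flags (HttpOnly, Secure, SameSite) match the middleware defaults.
```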
The teach Skill
One part of Flow that I think is underappreciated: the teach skill.
teach lets you codify patterns you discover during development into new skills. If you find yourself repeatedly explaining the same thing to the AI — “always use server actions for mutations in this codebase,” “our error handling convention is X” — you can run /teach and it captures that into a skill file.
Skills are just markdown files. They’re readable, editable, and versionable. The system evolves with your codebase instead of staying static.
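For illustration, a captured skill might be a file shaped roughly like this. The frontmatter fields and the rule itself are assumptions made up for this example, not copied from the Flow repo:

```markdown
---
name: server-action-mutations
description: Use server actions for all data mutations in this codebase
---

When writing mutation code:
- Use a server action, never a client-side fetch to an API route.
- Wrap the mutation in our standard error-handling helper and surface
  failures as typed results, not thrown exceptions.
```

Because it is plain markdown, you can review and version it like any other file in the repo.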
This is important because the value of any workflow system compounds over time. The more you use it, the more it knows about your specific context. teach is the mechanism for that accumulation.
How to Install It
Flow ships as a Claude Code plugin:
/plugin marketplace add jyliang/flow
/plugin install flow
Or via npx skills, which works across Claude Code, Cursor, Codex, Copilot, and others:
# Install globally for Claude Code
npx skills add jyliang/flow -g -a claude-code
# Pick individual skills
npx skills add jyliang/flow --skill flow --skill review -g -a claude-code
Why I Built This
The thing that motivated Flow was a frustration I kept running into: AI coding assistants are great at generating code but bad at knowing when to stop and ask. They’ll silently fill in an assumption that turns out to be wrong, and you won’t find out until you’re deep into reviewing something that went in the wrong direction.
The solution isn’t smarter AI — it’s better structure. If you build clear handoff points into the workflow, with human review baked in at each one, you catch the wrong turns early. You also get a cleaner record of what happened and why, which matters for teams and for your future self.
Flow is the structure I wanted. I’ve been using it on real projects, including some of the features in Jotcache, and it’s changed how I work with AI coding tools. Less cleanup after the fact, more intentional collaboration throughout.
The code is open source. Issues and PRs welcome.