But why?
Most AI coding sessions still look like this: prompt, wait, get distracted, come back, review, fix, repeat. The limiting factor isn't model capability — it's human sequencing overhead, and the fact that lessons from prior sessions evaporate.
What if you batch-planned a whole stretch of work upfront — with full project context and experiments carried forward from your last plan — then ran the prompts in parallel while you did something else? And then reviewed and shipped with evidence, not vibes?
That's Conducty. Systems-level AI agent orchestration for Claude Code, backed by an Obsidian vault as the context engine. Every plan, design, project context, improvement, failure pattern, and metrics row is a wikilinked note. The next plan reads the graph and gets sharper.
Shape → Plan → Trace → Execute → Verify → Improve → Code Review → Ship.
The cycle
- Non-trivial goals go through appetite-driven design first: set boundaries, define no-go zones, and identify rabbit holes before writing a single prompt.
- Load the project's context sub-graph from the vault and pull failure patterns, metrics, and improvement experiments from prior plans.
- Generate all prompts upfront with acceptance criteria, verification steps, calibrated review levels, and tracer markers. Every prompt is checked for smells before it ships.
- The first prompt in each group runs alone as a tracer bullet, validating plan assumptions end-to-end.
- If the tracer fails, it's the plan that needs revision, not just the code. The remaining prompts don't blindly execute.
- Remaining prompts run as Claude Code Task subagents in parallel, each with precisely curated context and no-go zones.
- Git worktrees isolate parallel prompts targeting the same repo. Time budgets act as circuit breakers.
- Review rigor scales with risk: low-risk prompts get verify-only, high-risk prompts get a full two-stage review.
- Health metrics (first-attempt pass rate, retry count, blocked count) are computed after each group.
- Hill chart positions updated for every goal. Systemic failures (2+ related) flagged as plan-level issues, not individual code bugs.
- Fixes generated at the right leverage point: plan, prompt, or code. Three failures on the same prompt trigger a circuit breaker.
- Evidence-based audit of every change: verdicts, failure patterns, and velocity metrics, all written into the vault.
- Improvement kata: target vs. actual, obstacles, and specific experiments for the next plan.
- History that doesn't change behavior is just a log. The next plan reads this graph and gets sharper.
- Whole-branch holistic review across five lenses: spec alignment, correctness, security, architecture & coupling, tests & maintainability.
- Findings triaged Critical / Important / Minor with file:line references. Critical+ findings flow back to Failure Patterns so the next plan can prevent them.
- Goes beyond the in-cycle reviewers: sees the cumulative diff as a single artifact, not prompt-by-prompt.
- Six-gate battery: code-review verdict, lint, typecheck, full test suite, secrets scan, dependency-vulnerability check.
- Single green / yellow / red verdict written to the vault. Mechanical, not subjective: failures cite verbatim command output.
- Ship never auto-merges. Verdict is advisory; you own the merge.
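To make the tracer, circuit-breaker, and health-metric ideas above concrete, here is a minimal Python sketch. It is not Conducty's implementation: the names `run_group`, `run_with_breaker`, `PromptResult`, and `MAX_ATTEMPTS` are hypothetical, a thread pool stands in for Claude Code Task subagents, and giving the tracer the same retry budget as other prompts is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 3  # assumption: the third failure on a prompt trips the breaker


@dataclass
class PromptResult:
    name: str
    attempts: int  # how many times the prompt was run
    passed: bool   # False means the breaker tripped and the prompt is blocked


def run_with_breaker(name: str, run: Callable[[str], bool]) -> PromptResult:
    """Retry one prompt until it passes or has failed MAX_ATTEMPTS times."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if run(name):
            return PromptResult(name, attempt, True)
    return PromptResult(name, MAX_ATTEMPTS, False)


def run_group(prompts: list[str], run: Callable[[str], bool]) -> list[PromptResult]:
    """Run the first prompt alone as the tracer, then the rest in parallel."""
    tracer, rest = prompts[0], prompts[1:]
    tracer_result = run_with_breaker(tracer, run)
    if not tracer_result.passed:
        # Tracer failed: stop so the plan can be revised before anything else runs.
        return [tracer_result]
    with ThreadPoolExecutor() as pool:  # stand-in for parallel subagents
        others = list(pool.map(lambda p: run_with_breaker(p, run), rest))
    return [tracer_result, *others]


def health_metrics(results: list[PromptResult]) -> dict[str, float]:
    """Compute the three group-level metrics named in the list above."""
    first_pass = sum(1 for r in results if r.passed and r.attempts == 1)
    return {
        "first_attempt_pass_rate": first_pass / len(results),
        "retry_count": sum(r.attempts - 1 for r in results),
        "blocked_count": sum(1 for r in results if not r.passed),
    }
```

Here `run` is any callable that executes one prompt and reports whether its verification passed. Keeping the breaker in one small function means the tracer and the parallel prompts share the same retry limit.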
Claude Code native
Context engine
Conducty's state isn't a flat log. It's an Obsidian vault where every artifact is a wikilinked note. A plan links the designs it consumed, the project context it loaded, the improvement experiments it's testing, and the prior plan whose work carries forward. The next plan navigates that graph instead of re-grepping a write-only history.
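As an illustration of what navigating that graph could look like, the following Python sketch walks Obsidian-style `[[wikilinks]]` breadth-first from a starting note. It is a sketch under assumptions, not Conducty's loader: the function names `linked_notes` and `context_subgraph`, the `max_depth` parameter, and the flat one-file-per-note vault layout are all hypothetical.

```python
import re
from collections import deque
from pathlib import Path

# Matches [[Note]], [[Note|alias]], and [[Note#Heading]]; captures only the note name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def linked_notes(text: str) -> list[str]:
    """Return the note names a piece of Markdown links to, in order."""
    return [name.strip() for name in WIKILINK.findall(text)]


def context_subgraph(vault: Path, start: str, max_depth: int = 2) -> list[str]:
    """Breadth-first walk over wikilinks, starting from one note.

    Assumes a flat vault where note "X" lives at vault/"X.md" (hypothetical layout).
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        name, depth = queue.popleft()
        path = vault / f"{name}.md"
        if depth >= max_depth or not path.exists():
            continue  # do not expand beyond the depth limit or into missing notes
        for target in linked_notes(path.read_text(encoding="utf-8")):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order
```

Calling `context_subgraph(vault, "Plan 2")` on a vault where "Plan 2" links to its design and its prior plan would return those notes plus whatever they link to, up to the depth limit. The depth limit is one simple way to keep the loaded context bounded.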
Quality principles
Ten engineering-grounded principles enforced across every session through always-on rules. No manual checklist.
Engineering roots
A learning system that compounds. Every plan's failures become the next plan's better prompts. The vault remembers so you don't have to.
