01 The pivot
The catalyst wasn’t a spreadsheet of SaaS subscriptions. It was a bottle of olive oil.
Angelica’s Organic EVOO is small-batch, limited-quantity — the kind of product where every detail of the customer experience is part of the product. The harvest story belongs on the product page. The provenance belongs in the confirmation email. The voice that answers a contact form has to sound like the people who picked the olives. Templated commerce can’t do that. Not really.
So I made the decision that, on paper, was insane for a non-engineer: I’d build the whole thing. Storefront, admin, agent swarm, webhook fan-out, shipping integration, the Discord bot that manages the marketing team. Not to save money on a Shopify subscription — that framing misses the point. I built it because the ceiling on what a team of one could orchestrate had moved, and templated commerce was now sitting below that ceiling.
The coinage: context-parity
Here’s the term I want to own: context-parity. Context-parity is the property that your AI collaborator can hold, in a single reasoning pass, the same mental model of the system that you hold. If you can’t achieve it, the AI hallucinates integration points — proposing function calls to modules that don’t exist, assuming env vars are wired when they aren’t, “fixing” a bug in the admin app by editing a file in the website app that has the same name but different semantics.
Context-parity is the thing that breaks first when a solo operator scales via AI. Most teams have the opposite problem — too many engineers, not enough shared context — so their tooling (microservices, separate repos, platform teams) reflects that. For a team of one using AI, that shape is poison. Every repo boundary is a context boundary. Every context boundary is a place the AI has to guess.
So I made the structural bet that defines everything downstream: one monorepo, npm workspaces, customer site and admin dashboard as siblings — sharing one node_modules resolution graph and the same Supabase generated types.
evoo-monorepo/
├── apps/
│   ├── website/            → angelicasevoo.com (storefront)
│   │   ├── src/
│   │   │   ├── app/              Next.js 16 App Router
│   │   │   ├── features/orders/  webhook + fulfillment
│   │   │   └── libs/supabase/    generated types ──┐
│   │   └── supabase/                               │
│   │       ├── functions/        9 edge functions  │
│   │       └── migrations/       pg_cron schedules │
│   │                                               │
│   └── admin/ ───── shared ────────────────────────┘
│       └── src/            (reads same types, same auth boundary)
│
├── Angelica EVOO/
│   └── project-architecture.canvas   ← Obsidian · living map
├── docs/architecture/AI-AGENTS.md
└── package.json                      ← workspaces + overrides
The toolchain, and a lesson about buying
I bought a year of Cursor upfront. Paid in full. Convinced I’d found the thing. Then Claude Code shipped, I tried it once, and I never opened Cursor again.
The lesson isn’t “Claude Code beats Cursor.” The lesson is: in a market moving this fast, the cost of staying flexible is almost always lower than the cost of a locked-in bet. Monthly over annual. Try the new thing the week it ships. Treat your toolchain as a rotation, not a commitment. I’d rather pay 20% more per month and swap tools three times a year than lock in a “deal” that stops being the deal the moment something better ships.
So the toolchain today is two things: Claude Code is where I pair on the system — one seat, one context, one conversation at the repo root. And Obsidian, specifically one canvas file, is where the architecture lives. The canvas isn’t a Notion wiki that decays from write-once-read-never. It’s load-bearing — the next Claude Code session reads it at session start. If the canvas is wrong, the next session builds the wrong thing. That incentive is what keeps it current.
@types/react caret range
Next.js 16 tightened layout typing. Both apps started failing typecheck on files I hadn’t touched in weeks. Root cause: ^19.0.4 had silently resolved to 19.2.x, which shipped breaking changes in a minor. The fix was not a type cast. It was pinning the exact version in the root package.json and letting workspaces propagate the pin.
Caret ranges are a trust contract with the upstream maintainer that they won’t ship breaking changes in a minor. That trust is sometimes misplaced — and when the AI is doing the diagnosis, it chases the error into your code before it suspects the lockfile.
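One way to make that lesson checkable is a small audit that flags any dependency range that is not an exact pin. This is a minimal sketch, not the repo's actual tooling: the helper name findFloatingRanges and the example dependency map are assumptions, and the next version shown is illustrative only.

```typescript
// Hypothetical helper: list dependencies whose version range is not an exact pin.
type DependencyMap = Record<string, string>;

// Matches a bare semver such as "19.0.4" or "1.2.3-beta.1". Anything else
// (^, ~, >=, *, x-ranges, tags like "latest") is treated as a floating range.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

function findFloatingRanges(deps: DependencyMap): string[] {
  return Object.keys(deps)
    .filter((pkg) => !EXACT_VERSION.test(deps[pkg]))
    .map((pkg) => `${pkg}@${deps[pkg]}`);
}

const exampleDeps: DependencyMap = {
  "@types/react": "^19.0.4", // the caret range from the incident above
  next: "16.0.0", // illustrative exact pin, not the real project version
};

const floating = findFloatingRanges(exampleDeps);
console.log(floating);
```

Run against the merged dependencies of each workspace, a check like this could fail CI before a silent minor bump reaches the typecheck step.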
next lint is gone
Next.js 16 removed the next lint command. That broke CI, pre-commit hooks, and every doc that said “run npm run lint before committing.” Both apps had to migrate to ESLint 9 flat config independently.
Framework conveniences are leased, not owned. When the lease expires, you pay the moving cost. Owning a boring default lint config means the framework upgrade doesn’t break the build pipeline.
current_setting()
Scheduled jobs fired. Nothing happened. No log rows. The auth header was being built from current_setting('supabase.service_role_key'), which resolves in the interactive SQL editor but not inside a pg_cron background worker. Two weeks of silent failure. Fix: move the secret to Supabase Vault and read decrypted_secret, which does not depend on session state.
A thing that works in the interactive console is not necessarily a thing that works in a scheduled background worker. That’s the kind of question AI collaboration accelerates once you know to ask it. The hard part is knowing the question exists.
02 The factory floor
Intelligence agents write reports. The Brain reads the reports and writes a strategy. Content agents read the strategy and write content. Everything else is plumbing.
The question I kept getting was: why not n8n, Zapier, Make? Three reasons, each rooted in context-parity. An n8n workflow is a graph in a GUI — Claude can’t read it without crossing a context boundary. A marketing-automation SaaS is exactly the upfront commitment the Cursor subscription taught me to avoid. And every third-party connector is a new auth surface. In Supabase, the edge functions already run inside the auth boundary and the secrets live in Vault.
So the factory floor lives in Postgres. Three tables hold the state between tiers — intelligence_reports, content_strategy, agent_runs. The architectural move is that the tiers communicate through the database, not through function calls. An intelligence agent doesn’t hand its report to the Brain. It writes a row. The Brain reads rows. If the Brain is down on Sunday, the reports sit in the table until next week. The database is the message bus — with transactions, constraints, and RLS.
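The row-based handoff can be sketched with in-memory arrays standing in for the tables. This is an illustration under assumptions, not the real schema: the column names, the agent names, and the week label are all hypothetical, and a real version would read and write Postgres rows instead of arrays.

```typescript
// In-memory stand-ins for intelligence_reports and content_strategy.
// Column names are assumptions made for this sketch.
interface IntelligenceReport {
  agent: string;
  week: string;
  summary: string;
}

interface ContentStrategy {
  week: string;
  themes: string[];
  sourceReports: number;
}

const intelligenceReports: IntelligenceReport[] = [];
const contentStrategy: ContentStrategy[] = [];

// An intelligence agent never calls the Brain. It only writes a row.
function writeReport(report: IntelligenceReport): void {
  intelligenceReports.push(report);
}

// The Brain reads whatever rows exist for the week and writes one strategy row.
function runBrain(week: string): ContentStrategy | null {
  const rows = intelligenceReports.filter((r) => r.week === week);
  if (rows.length === 0) return null; // nothing to synthesize yet
  const row: ContentStrategy = {
    week,
    themes: rows.map((r) => r.summary),
    sourceReports: rows.length,
  };
  contentStrategy.push(row);
  return row;
}

// Hypothetical agents and week label, for illustration only.
writeReport({ agent: "seo", week: "week-1", summary: "harvest story" });
writeReport({ agent: "social", week: "week-1", summary: "recipe pairings" });
const weeklyStrategy = runBrain("week-1");
console.log(weeklyStrategy);
```

Because the writer and the reader share only the table, either side can be down, rerun, or replaced without the other noticing.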
Graceful degradation, by design
Each content agent has a fallback. If content_strategy is empty — because the Brain failed, because Sunday silently broke, because I cleared the row — the agent reverts to a “freestyle” mode that reads from a marketing_context table and ships anyway. The output is worse, but the business keeps shipping content.
Graceful degradation isn’t a bolt-on here. Intelligence agents degrade by returning partial reports. The Brain degrades by synthesizing fewer reports. Content agents degrade to freestyle. The business is still operating at every failure level — just with less intelligence layered on.
And the content_strategy row is overridable. Some weeks the Brain gets it wrong. Some weeks I have knowledge the Brain can’t have: a supplier conversation, a customer email, a harvest note. Being able to UPDATE the strategy directly, as a human, without touching any agent’s code, is the feature that makes me trust the system. If I couldn’t override it, I wouldn’t have shipped it.
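The freestyle fallback described above comes down to one branch at the top of each content agent. A minimal sketch, assuming hypothetical row shapes for content_strategy and marketing_context (the field names here are not from the real schema):

```typescript
// Assumed shapes for this sketch. The real tables may differ.
interface StrategyRow {
  week: string;
  themes: string[];
}

interface MarketingContext {
  brandVoice: string;
  evergreenTopics: string[];
}

interface Brief {
  mode: "strategy" | "freestyle";
  topics: string[];
}

function buildBrief(strategy: StrategyRow | null, context: MarketingContext): Brief {
  // A usable strategy row wins, whether the Brain wrote it or a human edited it.
  if (strategy !== null && strategy.themes.length > 0) {
    return { mode: "strategy", topics: strategy.themes };
  }
  // Degrade loudly: worse output, but content still ships.
  console.warn("content_strategy is empty; using freestyle mode");
  return { mode: "freestyle", topics: context.evergreenTopics };
}
```

The agent only ever sees a row, so a manual UPDATE takes the same path as a Brain-written strategy and needs no code change.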
03 The router
Nine agents. Four model tiers. One ~270-line file that decides who runs where — and makes the rest of the code not care.
If you use one of the AI content SaaS tools, you’re paying for a model choice they made on your behalf. That choice is locked in because the whole product is built around one provider’s interface, prompt conventions, billing. That’s fine until the day you realize social posts need a different model than blog posts — structured JSON vs warm long-form prose — and you have to leave the platform to change it.
Multi-model routing decouples the “what to write” from the “what to write it with.” Agents are defined by their job. The router decides which model serves which job. When a better model ships, the agents don’t change — one line of config does.
The interface that makes it work
The whole file hinges on one move: making Claude-via-OpenRouter quack like Gemini. Every agent calls model.generateContent(prompt) and reads result.response.text(). So the OpenRouter adapter exposes exactly that shape:
class OpenRouterModel {
  async generateContent(prompt: string):
    Promise<{ response: { text: () => string } }> {
    const res = await fetch(
      "https://openrouter.ai/api/v1/chat/completions",
      {
        method: "POST",
        headers: { Authorization: `Bearer ${this.apiKey}`, ... },
        body: JSON.stringify({
          model: this.modelId,
          messages: [{ role: "user", content: prompt }],
          max_tokens: this.maxTokens,
        }),
      }
    );
    if (res.ok) {
      const data = await res.json();
      const content = data.choices?.[0]?.message?.content ?? "";
      return { response: { text: () => content } };
    }
    // 400: don't fall back, won't self-resolve
    if (res.status === 400) throw new Error(`OpenRouter 400: ${...}`);
    // 401/403/429/5xx: fall back to Gemini
    return this.geminiFallback(prompt);
  }
}
What this replaces
A typical AI content SaaS charges $50–$300/mo per seat, for one workflow. Stack three or four (blog + social + email + intelligence) and you’re at $400–$1,200/mo. My all-in AI spend across the swarm — Gemini, OpenRouter credits for Claude, Pexels for images — runs well under $50/mo. The arbitrage is three things: Flash Lite for the quality gate (pennies per call), Haiku for short-form and Sonnet for long-form (~5× cheaper where it matters), and the free Gemini tier for intelligence agents.
And — circling back to the “pay monthly, rotate fast” discipline — the router makes switching providers almost free. If Claude Sonnet gets worse, I change "anthropic/claude-sonnet-4-6" to "openai/gpt-4.1-mini" in one line and the agent starts running on a different provider. No architectural rewrite. One commit.
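The one-line swap can be pictured as a lookup table from job to model ID. This is a sketch under assumptions rather than the actual router file: the job names and the placeholder IDs are hypothetical, and only the two model IDs quoted above come from the text.

```typescript
// Hypothetical job names. The real swarm has nine agents across four tiers.
type AgentJob = "blog" | "social" | "quality-gate" | "intelligence";

// Only the Sonnet ID is taken from the text above. The rest are placeholders.
const MODEL_FOR_JOB: Record<AgentJob, string> = {
  blog: "anthropic/claude-sonnet-4-6",
  social: "placeholder/short-form-model",
  "quality-gate": "placeholder/cheap-gate-model",
  intelligence: "placeholder/free-tier-model",
};

// Agents ask for a model by job and never name a provider themselves.
function resolveModel(
  job: AgentJob,
  overrides: Partial<Record<AgentJob, string>> = {},
): string {
  return overrides[job] ?? MODEL_FOR_JOB[job];
}

console.log(resolveModel("blog"));
console.log(resolveModel("blog", { blog: "openai/gpt-4.1-mini" }));
```

Swapping providers then means editing one value in the table, or passing an override for a single run, while the agent code stays untouched.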
First draft: any non-200 fell back to Gemini. 400, 429, 500: all treated the same. The blog-agent was sending prompts that exceeded OpenRouter’s max context for the configured model. OpenRouter returned 400. The router silently fell back to Gemini. Blog posts shipped with the wrong voice for a week. I noticed only because a reader commented that “the last few posts feel different.”
A fallback that masks the original error is worse than no fallback. Split “temporary provider failure” from “permanent request failure.” Log loudly on fallback. Never swallow the diagnosis.
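The split between permanent and temporary failures can be sketched as a classifier plus a wrapper that logs before it falls back. This is an illustration, not the router's real code: the names classifyStatus, ProviderResult, and generateWithFallback are assumptions, and the status mapping simply mirrors the rule above (400 is permanent, everything else is treated as temporary).

```typescript
type FailureKind = "permanent" | "transient";

// 400 means the request itself is wrong, so another provider would only hide it.
function classifyStatus(httpStatus: number): FailureKind {
  return httpStatus === 400 ? "permanent" : "transient";
}

// Hypothetical result shape returned by the primary provider call.
type ProviderResult =
  | { ok: true; text: string }
  | { ok: false; status: number; detail: string };

async function generateWithFallback(
  primary: () => Promise<ProviderResult>,
  fallback: () => Promise<string>,
  log: (line: string) => void = console.error,
): Promise<string> {
  const result = await primary();
  if (result.ok) return result.text;
  if (classifyStatus(result.status) === "permanent") {
    // Surface the diagnosis instead of masking it with a different model.
    throw new Error(`primary rejected the request (${result.status}): ${result.detail}`);
  }
  log(`FALLBACK after ${result.status}: ${result.detail}`);
  return fallback();
}
```

Injecting the log function keeps the fallback path observable in tests and makes it easy to route the same line to a channel a human actually reads.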
04 The operator console
I didn’t build a dashboard. I built a Discord bot. Because when you can talk to the swarm from your phone, the swarm stops feeling autonomous and starts feeling like a team you manage.
The obvious thing — the thing every tutorial nudges you toward — is an admin dashboard. A React app with a sidebar, cards, buttons to trigger runs. I already have half of that at apps/admin. I deliberately didn’t extend it.
A dashboard is a lie about the interface I want. When I interact with the swarm, it’s almost never at my laptop. It’s at a coffee shop, on my phone, between tastings, handing out samples at a farmers’ market. A dashboard implies a workstation. A chat interface implies a pocket. The pocket is where the work happens.
The swarm doesn’t have a long list of nouns to browse — it has verbs: run the blog agent now, show me this week’s strategy, skip Friday’s social post. Verbs want a command line, not a sidebar. Discord is a command line with good ergonomics and free push notifications.
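The verbs-over-nouns idea maps naturally onto a table of commands and handlers. The sketch below treats a command as plain text and returns a reply string. It deliberately leaves out the Discord library and gateway wiring, and the handler bodies and reply wording are placeholders rather than the bot's real behavior.

```typescript
type CommandHandler = (args: string[]) => string;

// Each verb gets one entry. Replies here are placeholders for illustration.
const commandHandlers = new Map<string, CommandHandler>([
  ["/run-blog", () => "blog agent queued"],
  ["/show-strategy", () => "strategy for this week would be printed here"],
]);

function dispatch(message: string): string {
  const [command, ...args] = message.trim().split(/\s+/);
  const handler = commandHandlers.get(command);
  return handler ? handler(args) : `unknown command: ${command}`;
}

console.log(dispatch("/run-blog"));
console.log(dispatch("/skip-friday"));
```

Adding a verb is one new map entry, which keeps the console's surface area tied to the commands that are actually used.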
/run-blog · /show-strategy
Why the VPS
Vercel can’t run a long-lived Discord gateway connection — bots subscribe, they don’t poll. Serverless that spins up on request doesn’t fit. A Hostinger VPS is $4/month, runs Docker, stays up. The cheapest way to own a long-running process. The operational cost is that I have to think about Linux hosting, which I hadn’t done in years. That’s where the scars live.
First deploy’s container was openclaw-abc1-.... Second deploy’s was openclaw-xyz9-.... I’d written the slug into every script. Every script broke on redeploy. Fix: an auto-discovery block at the top of every script that reads the current slug from docker ps.
Any identifier assigned by a platform is a trust contract that they won’t change it. Vercel project IDs are stable. Supabase refs are stable. Hostinger container slugs are explicitly not. Assume volatile until proven otherwise.
I tried to auto-discover GITHUB_PAT by grepping the VPS’s .env during deploy. That works in steady state: docker ps returns the container, the env file is mounted, grep works. During a fresh deploy, the old container has exited and the new one hasn’t started. SLUG goes empty, grep returns empty, GITHUB_PAT becomes an empty string, git-pull fails with a confusing 401.
Bootstrapping has a chicken-and-egg problem that steady-state doesn’t. Any script that assumes steady-state invariants will pass testing and fail in production. Distinguish bootstrap path from operation path.
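Both lessons, volatile slugs and the bootstrap gap, can be combined in one discovery step that refuses to continue on an empty result. A minimal sketch, assuming the input is the text printed by docker ps --format "{{.Names}}" (one container name per line) and that the container name always starts with openclaw-. The function names are hypothetical, and the real scripts are presumably shell rather than TypeScript.

```typescript
// Pick the first running container whose name starts with the expected prefix.
function discoverSlug(dockerPsOutput: string, prefix = "openclaw-"): string | null {
  const match = dockerPsOutput
    .split("\n")
    .map((line) => line.trim())
    .find((line) => line.startsWith(prefix));
  return match ?? null;
}

// Bootstrap-aware wrapper: fail with a clear message instead of carrying
// an empty string into later steps such as reading the env file.
function requireSlug(dockerPsOutput: string): string {
  const slug = discoverSlug(dockerPsOutput);
  if (slug === null) {
    throw new Error("no running openclaw-* container found; is a deploy in progress?");
  }
  return slug;
}
```

Failing at discovery turns the confusing downstream 401 into an error that names the real cause: no container is running yet.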
The commands I use daily work. The commands I imagined and never needed — bulk replay, historical diffing, scheduled-run editing from Discord — don’t exist. Earlier in the build, I shipped hooks for features that didn’t exist yet “because I’ll probably want them.” Three were never used. One had a bug that took a day to find because the hook only fired in a code path I’d never exercised.
Don’t add features, refactor, or introduce abstractions beyond what the task requires. The opposite of the instinct most engineers have — and the instinct AI pushes toward, because training data is full of over-engineered code. Staying at 70% survives only if both sides of the pairing agree to it.
05 What it adds up to
AI didn’t remove the engineering. It raised the ceiling on what one operator could orchestrate.
I learned Postgres sessions the hard way because I had to. I learned npm workspaces overrides the hard way because I had to. I learned that pg_cron background workers don’t inherit GUC state, and that a fallback masking a 400 is a lie, and that Hostinger randomizes container slugs, and that caret ranges on @types/react can silently eat your Next.js build. More Postgres, more React, more Linux, more TypeScript in six months than the prior decade — because AI collaboration made the learning loop tight enough that the scars were worth paying for.
The four pieces are the shape that falls out when you optimize for context-parity. One monorepo, so the AI reads the whole system in one pass. A database-as-message-bus, so the tiers communicate in a substrate the AI can diff. A router that swaps providers in one line, so the frontier can move without breaking the agents. An operator console in Discord, so I can manage the swarm from anywhere.
None of these are SaaS products. All of them are cheaper and more flexible than the SaaS I would have bought. That’s not the point. The point is that the SaaS couldn’t have been the product I wanted — because the product I wanted required composing seven services, three models, two deployment targets, and one chat interface. No vendor sells that composition.
What AI gave me wasn’t a shortcut. It gave me access to the composition. I could hold the whole system in my head because the AI was holding it with me. That’s context-parity. That’s the ceiling raising.