Part II: How Do I Work With This Thing?

Agents as Teammates

It was 2pm on a Tuesday and I was deep in a Kai debugging session when Claude's quota wall hit. Mid-sentence. The conversation where I'd spent twenty minutes building context about the sensor pipeline, the event bus, the timing issue that only surfaced when two automations fired within the same second — gone. Not deleted, just frozen. Come back at 5pm.

I sat there for about ten seconds feeling sorry for myself. Then I opened Gemini.

Gemini didn't know anything about Kai. It didn't know the codebase, the architecture, the bug. But it didn't need to. I'd been putting off prototyping a dashboard layout for the sensor data — the kind of visual, component-level work that Gemini handles better than Claude anyway. So I pivoted. Ninety minutes of productive UI work later, Claude came back online, and I picked up the debugging session where it left off. Different agents, different shifts.

That's the day I stopped thinking of AI tools as interchangeable and started thinking of them as a roster. Cursor burns through its monthly allocation in three weeks if you're on a productive streak. So Codex takes the admin shifts: file organization, boilerplate, the chores that need doing but don't need your best model. That frees up premium quota for work that actually requires it. You're balancing cost, capability, availability, and specialization across a team whose members each have different pricing, different limits, and different strengths. The person who uses one model for everything is like a manager who puts senior engineers on data entry.

API access is the escape hatch, and it's also the trap. The subscription models — twenty dollars a month for Claude, twenty for ChatGPT, twenty for Gemini — are artificially cheap. They're loss leaders designed to get you dependent. The moment you need more than the quota allows, you're on API pricing, and API pricing reflects the actual cost of running these models. A conversation that cost you nothing on the subscription might cost two dollars on the API. Do that fifty times a day and you've got a real line item. The subscription is the buffet. The API is à la carte. Knowing when you're about to exceed the buffet and should slow down — or when the à la carte is worth it because the task is high-value — is a budgeting skill that nobody teaches.
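To put numbers on the buffet-versus-à-la-carte comparison, here is a rough back-of-the-envelope sketch. The twenty-dollar subscription, the two-dollar conversation, and the fifty conversations a day come from the paragraph above; the 22 working days per month is my own assumption, and real API bills depend on token counts rather than a flat per-conversation price.

```python
# Figures from the text, except where marked as an assumption.
SUBSCRIPTION_PER_MONTH = 20.00     # dollars, flat plan
API_COST_PER_CONVERSATION = 2.00   # dollars, the "might cost two dollars" example
CONVERSATIONS_PER_DAY = 50         # the "fifty times a day" example
WORKING_DAYS_PER_MONTH = 22        # assumption, not from the text


def monthly_api_cost(per_conversation, per_day, days):
    """Total API spend for a month of steady usage."""
    return per_conversation * per_day * days


def break_even_conversations(subscription, per_conversation):
    """Conversations per month at which API spend equals the subscription."""
    return subscription / per_conversation


daily = API_COST_PER_CONVERSATION * CONVERSATIONS_PER_DAY
monthly = monthly_api_cost(
    API_COST_PER_CONVERSATION, CONVERSATIONS_PER_DAY, WORKING_DAYS_PER_MONTH
)
break_even = break_even_conversations(
    SUBSCRIPTION_PER_MONTH, API_COST_PER_CONVERSATION
)

print(f"API spend per day:   ${daily:,.2f}")
print(f"API spend per month: ${monthly:,.2f}")
print(f"Break-even:          {break_even:.0f} conversations per month")
```

Under these assumptions the API habit runs a hundred dollars a day, and the subscription pays for itself after ten conversations in a month. The exact figures matter less than the gap between them.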

The specialization is real and it shifts. Right now, one model builds the best user interfaces — it understands layout, spacing, component architecture in a way the others don't. Another model is the strongest at reasoning through complex system design. A third generates images that the others can't touch. A fourth is fast and cheap and good enough for the tasks that don't need the best. These rankings will change. They change every few months. The model that's untouchable in March gets leapfrogged in June. The durable skill isn't memorizing which model is best at what. It's developing the instinct to evaluate quickly: what kind of problem is this, which class of tool handles it well, and which tool in that class still has quota left today?
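Those three questions can be written down as a tiny routing function. This is only a sketch of the instinct, not a tool I actually run: the tool names, the problem categories, and the idea of counting quota in requests per day are all hypothetical, and breaking ties by picking whichever candidate has the most quota left is one arbitrary choice among many.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str            # hypothetical label, not a real product
    good_at: frozenset   # kinds of problem this tool handles well
    quota_left: int      # requests remaining today (assumed unit)


def pick_tool(problem_kind: str, roster: list[Tool]) -> Optional[Tool]:
    """Return a tool that suits the problem and still has quota, or None."""
    candidates = [
        tool for tool in roster
        if problem_kind in tool.good_at and tool.quota_left > 0
    ]
    # Assumption: among suitable tools, prefer the one with the most quota left.
    return max(candidates, key=lambda tool: tool.quota_left, default=None)


roster = [
    Tool("ui_specialist", frozenset({"ui"}), 0),
    Tool("systems_thinker", frozenset({"architecture", "debugging"}), 12),
    Tool("cheap_generalist", frozenset({"ui", "boilerplate", "debugging"}), 40),
]

for kind in ("ui", "architecture", "images"):
    choice = pick_tool(kind, roster)
    print(kind, "->", choice.name if choice else "nobody available")
```

The useful part is that the roster is just data. When the rankings shift, you edit the entries and the routing logic stays the same.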

There's a meta-skill here that most people miss. The act of switching between tools — being forced to by quotas, by cost, by capability gaps — teaches you more about what each tool does well than any benchmark could. You learn that Claude thinks in systems but gets verbose. Gemini is fast and visual but shallow on architecture. Codex is reliable for repetitive tasks but doesn't improvise. ChatGPT is the generalist — good at most things, best at few. You learn this by working with them, by hitting their walls, by being forced into the next one and noticing what changes.

Hold the roster loosely. The names will change. The pricing will change. The strengths will shift. But the pattern — multiple agents, different shifts, different skills, different costs, optimized as a team — is the shape that stays.

