Bring Your Own Fleet — why your existing GPUs are the right execution layer
· IsoKron team · 3 min read
Your Qwen-Coder-class compute is closer to your code, your data, and your money than any hosted fleet. Why IsoKron is built around BYO-fleet from day one — and where we're going with hardware-native execution agents.
- byo-fleet
- infrastructure
- ai-codegen
The hosted-fleet pitch, and what's wrong with it
Most current AI development platforms run their own agent fleet. You send them a request. Their fleet does the work in their data center, on their hardware, against a copy of your code pulled into their environment. They send you the result.
The pitch sounds reasonable. The customer doesn't think about infrastructure. The platform manages capacity. Everything just works.
In practice, three things bite:
- The platform's fleet sees your code. Sometimes they retain copies for training. Sometimes they don't. The privacy boundary lives in their terms of service, not in your network.
- The platform's hardware is expensive. GPU rentals at hyperscaler markup are 2–5× the cost of running the same model on workstation-class hardware you already own.
- You can't extend the fleet. When you need a specialized worker — one that knows your internal stack, or has access to your local databases, or runs a model you've fine-tuned — there's no surface to add it. The platform is closed.
What BYO Fleet actually means
Bring Your Own Fleet inverts this. The platform compiles. Your fleet executes. The fleet runs wherever you want — on a Mac Mini in your home office, on an old workstation under a desk, on a NUC in a closet, on rented GPU capacity, on whatever you've got that can run a 7B–14B class coder model.
Your fleet connects to IsoKron via MCP (Model Context Protocol). The platform's compile pipeline produces a knowledge graph + tickets. Your fleet claims tickets, runs them locally against your code, posts results back. Your code never leaves your hardware unless you explicitly export it.
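The claim-execute-report loop above can be sketched in a few lines. Everything here is illustrative: the `Ticket` shape, the `claim`/`work` names, and the in-memory queue are assumptions for the sketch, not the actual IsoKron MCP surface.

```python
import dataclasses
from typing import Callable, Optional

@dataclasses.dataclass
class Ticket:
    """Hypothetical ticket shape; the real schema is not documented here."""
    ticket_id: str
    task: str                      # e.g. "implement parser for config format"
    status: str = "open"
    result: Optional[str] = None

class FleetWorker:
    """Minimal sketch of a BYO-fleet worker loop."""

    def __init__(self, queue: list, run_model: Callable[[str], str]):
        self.queue = queue          # stands in for the platform's ticket feed
        self.run_model = run_model  # local 7B-14B coder model on YOUR hardware

    def claim(self) -> Optional[Ticket]:
        # Take the first unclaimed ticket, if any.
        for t in self.queue:
            if t.status == "open":
                t.status = "claimed"
                return t
        return None

    def work(self) -> int:
        """Claim tickets, execute locally, post results back.
        Source code stays on this machine; only results travel upstream."""
        done = 0
        while (ticket := self.claim()) is not None:
            ticket.result = self.run_model(ticket.task)  # local inference
            ticket.status = "done"
            done += 1
        return done

# Usage: a stub model standing in for local inference.
queue = [Ticket("T-1", "add retry logic"), Ticket("T-2", "fix flaky test")]
worker = FleetWorker(queue, run_model=lambda task: f"patch for: {task}")
print(worker.work())  # → 2
```

The point of the shape: the model call happens inside `work()`, on the worker's own machine, so the platform only ever sees the ticket and its result.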
Three things fall out of this:
- Your data is your data. The platform stores the compiled graph (declarations, decisions, tickets, conventions). The platform does not see your source code unless you choose to surface a piece of it explicitly.
- Your hardware is your hardware. That Mac Mini you bought to run local models? Now it's earning. The RTX 5090 in your gaming machine? Same.
- Your fleet is your fleet. Custom workers, fine-tuned models, internal tools — all yours, all running with your credentials, all accessible from your workstation.
Why this scales for indie and small-team operators
The hosted-fleet model is built for sales-led enterprise. Big budget, low complexity, predictable workloads. It doesn't work for the indie operator running three side projects from a 2-bedroom apartment with a Mac Mini + a Pi cluster.
BYO Fleet works for the indie operator. We've measured fleet throughput with current open-weight coder models on consumer-grade hardware. We do not need frontier compute for execution — only for compilation. A reasonable Qwen-Coder-class model on a 5090 sustains thousands of tokens per second. That's fast enough to ship features.
The economics:
- Compile cost (per project): pennies-to-low-dollars on the frontier provider you BYOK to
- Execution cost: your existing hardware, plus electricity
- Storage: a few dollars on Cloudflare R2 for archived audit data
A solo operator running three projects year-round on this setup pays under $200/month total, versus $1,000+/month on a hosted-fleet platform doing the same work.
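The under-$200 claim can be sanity-checked with rough numbers. Every figure below is an illustrative assumption for the sketch, not a quoted price or a measured bill:

```python
# Rough monthly cost sketch for a solo operator with three projects.
# All numbers are assumptions chosen to illustrate the cost shape.
projects = 3
compiles_per_project_per_month = 20
compile_cost = 1.50   # $ per compile on a BYOK frontier provider (assumed)
electricity = 30.00   # $ / month to run workstation-class hardware (assumed)
storage = 5.00        # $ / month on Cloudflare R2 for archived audit data (assumed)

monthly = projects * compiles_per_project_per_month * compile_cost \
          + electricity + storage
print(f"${monthly:.2f}/month")  # → $125.00/month
```

Even with generous assumptions on compile frequency, the execution side contributes only electricity, which is what keeps the total well under the hosted-fleet figure.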
A glimpse of where this is going — hardware-native execution
The fleet-on-existing-hardware story is the v1 form. We're building toward a v1.5+ extension we call HNAO — hardware-native autonomous operators — where the fleet has direct physical access to the systems it's working on. Not just running a model. Running a system that can see what's on the screen, operate the device, and report back.
It's deliberately not in v1. It's the natural next step once BYO Fleet has proven out, and once customers have asked for it. We'll have more to say on this as it materializes. For now, the relevant point is that IsoKron's MCP surface is the integration point. Whatever hardware-shaped execution layer comes next will plug into the same place your existing fleet does today.
Bottom line
BYO Fleet is not the trendy answer in 2026. The trendy answer is hosted, managed, opaque, expensive. We disagree. Your existing compute is closer to your code, your data, and your money than any third-party fleet will ever be. The platform that respects that is the platform you can build a business on.
Related: BYOK economics, Why structured graphs beat markdown for AI codegen.