2025-10-08 providersarchitectureengineering

Provider-portable prompts: what the abstraction actually buys you

A provider abstraction is not about hedging your bets. It is about keeping the prompt as the unit of work — and making the SDK the implementation detail it should always have been.

The first time you read a pitch for a provider-agnostic prompt library, the marketing line is always some variation of “swap providers in one line.” That line is true, and it is also misleading, because the value of provider portability is not the day you migrate. The value is every day before that day, when nobody is migrating, and the prompt still gets to be the unit of work in your codebase.

This is a quieter argument than “switch from OpenAI to Anthropic in one flag,” but it is the one that actually matters. Let’s walk through it.

The lock-in is not where you think

People imagine vendor lock-in as the OpenAI key in their .env file. It is not. The key is the cheapest part of switching. The real lock-in is what your code thinks a prompt is.

If you wrote the original integration against the OpenAI SDK, your prompt is shaped like a messages array — a list of objects with role and content fields. If you wrote it against Anthropic, your prompt is a system string plus a messages array with subtly different rules about alternating roles. If you wrote it against Groq, the shape mostly matches OpenAI but the model strings are different and a few parameters behave differently under load.

None of that matters until the day it does. The day it does — when GPU prices spike, when a new model lands on Groq that you want for one route only, when an outage drags on for six hours — you discover that “the prompt” in your codebase is not a thing you can lift and move. It is a method call against a specific SDK with a specific message shape and a specific set of parameters. Migrating that is not a config change. It is a refactor.

The provider abstraction is the wall between “what the prompt is” and “which SDK is calling it.” Once that wall is in place, the day of migration becomes a day of editing a flag. The reason it becomes a day of editing a flag is that you spent every previous day forcing the wall to stay vertical.

What the abstraction actually does

A useful provider abstraction is not a least-common-denominator wrapper. It is not “every provider gets a complete(text) method and we hope for the best.” It is a translation layer that takes a single, structured representation of a prompt — the AST — and lowers it into the call shape that each SDK expects, faithfully.

That means:

The system role becomes a system message for OpenAI, a system parameter for Anthropic, and the appropriate slot for Groq.
The Chain-of-Thought block becomes part of the user-visible prompt for plain providers, and part of a reasoning channel for providers that expose one.
The Harmony Protocol channels are surfaced as structured fields when the provider supports them, and folded into the prompt body when it does not.
The maxTokens constraint becomes max_tokens for some providers and max_completion_tokens for others, without you having to remember which is which.
The temperature, top-p, and stop tokens all land in their right slots.

The translation layer is allowed to be opinionated. It is allowed to say “this technique is supported on this provider and falls back to that representation on others.” What it must not do is pretend the providers are identical. They are not, and your prompt is more robust when the abstraction acknowledges the differences instead of hiding them.

What you gain on day zero

The migration argument is the headline, but the day-zero gains are the reason this pattern earns its keep.

Prompts become unit-testable. When a prompt is an AST plus a deterministic lowering step, you can snapshot-test the lowered form. You can write a test that says “given these params, this prompt produces this messages array for this provider.” That test runs in milliseconds without touching the network. The day someone tweaks the wording, the snapshot diff makes the change visible. The day someone tweaks the lowering rule, every snapshot for every provider catches it.

Evals stop being provider-specific. An eval harness that wants to compare three prompts across two providers is, at its core, a nested loop. With raw SDK code, the inner loop is six branches and a lot of “wait, does Anthropic want the system message inside the user message in this case?” With a provider abstraction, the inner loop calls executePrompt(p, params, { provider }) and trusts the abstraction to do the right thing. The eval results become comparable in a way that they otherwise are not.

Per-route provider choice stops being scary. Some routes want Groq because they care about latency more than ceiling capability. Some routes want Claude because the reasoning is more careful. Some routes want OpenAI because of structured output mode. Without a provider abstraction, every route that picks a different provider becomes a small special case in your codebase. With an abstraction, the route just picks a provider for that prompt and moves on.

Outage hedging becomes a runtime decision. It is not the most common case, but when it matters, it matters a lot. A prompt that can run against three SDKs is a prompt you can retry against a different provider when one of them is degraded. That is not interesting on a normal Tuesday, and it is the entire ballgame on the worst Tuesday of your year.

What you do not get

Provider portability does not give you provider parity. The models behave differently. A Chain-of-Thought block produces a different shape of response on Claude than on GPT-class models. A maxTokens of 800 might run hot on one and cold on another. A system message phrased in one way might be respected on Anthropic and ignored on Groq.

The abstraction does not, and should not, paper over those differences. It standardises the wire format so you can write the prompt once. The quality of the output still depends on the model. The reasonable response to this is to evaluate your prompts on every provider you intend to ship to, version the prompt against the providers it has been evaluated on, and treat the provider as a parameter the eval harness owns. This is something a structured prompt makes easy and a soup of f-strings makes nearly impossible.

It also does not give you cost portability. The math is still the math. A long-context Anthropic call is not the same cost as a short Groq call, and no abstraction is going to fix that. The abstraction’s job is to make the cost-aware decision a thing you can actually make.

Where the line is

A good provider abstraction supports a small, opinionated set of providers — three is plenty — and adds new ones grudgingly. The line lives at “official SDK, stable API, real use cases.” Anything past that line is community work and should live in a community-maintained adapter, not the core.

The same goes for techniques. Chain-of-Thought, Few-Shot, Zero-Shot, Tree-of-Thoughts, ReAct, Self-Consistency. There are more, but the named six cover most of what production prompts actually do. Past that, the technique is custom, and the prompt should embed it as a sequence of body blocks rather than asking the framework to grow.

The abstraction earns its keep by being small. The more it tries to do, the more lock-in you have invented for yourself, in trade for the lock-in you were trying to avoid. Promptel’s bet is that a thin, faithful translation layer beats a thick one, and that the right level of abstraction is the prompt-as-AST, not the prompt-as-agent.

The headline is “swap providers in one line.” The story is “your prompt is now the unit of work, and the SDK is the detail it should have been from the start.” That is the actual sale.

More notes Docs ↗