You're Talking to ChatGPT, But Not Always the Same One
When you ask ChatGPT a question, an invisible dispatcher decides which model answers. And that changes everything about what you get back.

Who are you even talking to?
You open ChatGPT, type your question, get an answer. Simple. Except what happens between your message and the response is a bit like a massive call center. You think you're calling a specific person, but an invisible operator redirects you to someone else based on the time, the load, and what they think you deserve.
That someone isn't always the same. And nobody tells you.
It's a disorienting reality. For months, millions of users thought they were talking to "one" intelligence. They were actually talking to a dispatch system that chose for them.
The dispatcher under the hood
Since August 2025, OpenAI has deployed what they call a "router". The idea: when you send a message to ChatGPT, an algorithm analyzes the complexity of your request in milliseconds. Based on its verdict, it sends you to one of the available models.
Concretely, there are three tiers. GPT-5 Instant, fast but lightweight: the colleague who shoots back a quick answer between meetings. GPT-5 Thinking, more powerful, takes time to reason through things. And GPT-5 Pro, the heavy-duty version reserved for complex tasks.
Under the hood, there's even more granularity. GPT-5.3 Instant, GPT-5.4 Thinking, GPT-5.4 mini... The router juggles these versions based on your question, your subscription, and current quotas. A restaurant where the waiter decides whether you deserve the head chef or the line cook, without telling you.
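OpenAI hasn't published how its router actually works, but the basic idea is easy to sketch. Everything below is invented for illustration: the tier names, the complexity heuristic, the quota fallback. Think of it as a toy dispatcher, not OpenAI's real code.

```python
# Toy sketch of a complexity-based model router.
# The scoring heuristic and tier names are illustrative assumptions;
# the real router's logic is not public.

def complexity_score(prompt: str) -> int:
    """Crude proxy for request complexity: word count plus 'reasoning' keywords."""
    score = len(prompt.split())
    for kw in ("prove", "debug", "analyze", "step by step", "refactor"):
        if kw in prompt.lower():
            score += 50
    return score

def route(prompt: str, quota_remaining: int) -> str:
    """Pick a tier from the score, falling back to the cheapest model when quota runs out."""
    if quota_remaining <= 0:
        return "instant"          # hard fallback: cheapest tier only
    score = complexity_score(prompt)
    if score > 100:
        return "pro"
    if score > 40:
        return "thinking"
    return "instant"

print(route("What's the capital of France?", quota_remaining=10))        # -> instant
print(route("Debug this race condition step by step", quota_remaining=10))  # -> pro
```

Notice the quota check runs before anything else: that's the "routing becomes more aggressive when limits are hit" behavior users complained about, reduced to three lines.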
The backlash
When OpenAI launched the router in August 2025, it didn't go well. On Reddit and X, the feedback was brutal. Users paying for Plus complained about short, sloppy responses that looked like what they got on the free tier.
"I'm paying for Pro but it feels like I'm getting the free model." That's a real quote from a Reddit thread that made the rounds. And it wasn't an isolated case. Another user summed up the problem: "Two people, same prompt, different results."
The dominant feeling was one of being deceived. Not anger at a technical glitch, but disappointment. They'd sold these users a Ferrari and some days delivered a Clio.
Result: in December 2025, OpenAI backtracked for Free and Go users (the entry-level subscription). The router was disabled for those plans. For Plus and Pro, it stayed, but with an option to turn it off in settings. In March 2026, the interface was simplified with three clear modes: Instant, Thinking, and Pro, plus an "auto-switch" toggle in Configure.
It's better. But the fact that it existed in invisible mode for months raises a real question.
What about everyone else?
This isn't unique to OpenAI. Google does the same thing with Gemini. The "Auto" mode routes requests between Flash (fast) and Pro (powerful). The difference: Google owns it openly. Their Gemini CLI offers a documented "Adaptive" mode, open-source, viewable on GitHub. Vertex AI even offers an experimental "Model Optimizer". Everything's on the table.
Two restaurants both using pre-made ingredients. One writes it on the menu, the other lets you believe everything's made from scratch. The dish might be the same, but the trust relationship is very different.
Anthropic's Claude takes yet another approach. On the web interface, there's no automatic routing. Pro subscribers choose for themselves between Opus, Sonnet, and Haiku. The user decides which model runs, not an algorithm. Claude Code, their developer tool, offers an "opusplan" mode that does routing (Opus plans, Sonnet executes), but it's optional and documented.
Quick summary:
- ChatGPT: automatic routing, transparent since March 2026, but opaque history
- Gemini: automatic routing, documented and open from day one
- Claude: no routing, explicit user choice
Why it exists: the impossible equation
Running ChatGPT costs a fortune. According to industry estimates (Mirantis, March 2026), OpenAI spends about $700,000 per day on inference, over $250 million per year. Just to answer users' questions.
And not all models cost the same to run. A GPT-5 Thinking query consumes roughly ten times the resources of an Instant one. It's as if each "serious" question burned ten tickets instead of one. With hundreds of millions of users, that adds up fast.
Dynamic routing cuts these costs by 40% to 85% depending on the case (Requesty figures). Today, inference represents 80% of AI budgets for companies, versus only 20% for model training (NVIDIA, 2026). In other words, running AI day-to-day costs four times more than building it.
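The figures above are third-party estimates, but the arithmetic behind them is easy to verify yourself:

```python
# Back-of-the-envelope check of the cost figures cited above (all estimates).
daily_inference_cost = 700_000                      # USD/day (Mirantis estimate)
annual_cost = daily_inference_cost * 365
print(f"annual inference: ${annual_cost:,}")        # -> annual inference: $255,500,000

# Savings range attributed to dynamic routing (Requesty): 40% to 85%
for pct in (0.40, 0.85):
    print(f"routing would save ~${annual_cost * pct:,.0f}/year at {pct:.0%}")
```

Even at the low end of that range, routing is worth on the order of a hundred million dollars a year, which explains why it exists regardless of how users feel about it.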
So no, routing isn't malicious. It's economic survival. The problem isn't that it exists. It's that nobody said so.
What this means for you
In practice, this explains something you've probably experienced: some days, ChatGPT gives you brilliant answers. Other days, on the same question, you get something flat, generic. "Why is my answer garbage today?" It's a question that comes up constantly on forums.
The answer: it wasn't the same model. And when usage limits are hit, routing becomes even more aggressive. Quiet redirection to smaller models, no notification.
A streaming service that lowered the quality of your movie halfway through without telling you. Technically, you're still watching a movie. But the experience isn't the same.
Taking back control: practical guide
Good news: you can do something about it. Here's how to take back the wheel, whatever tool you're using.
On ChatGPT: go to Configure (the gear icon at the top right) and turn off "auto-switch". Manually choose Instant, Thinking, or Pro based on your task: Instant for quick factual questions, Thinking for analysis and writing, Pro for complex code or long reasoning chains.
On Gemini: turn off Auto mode and manually choose between Flash and Pro. It's in the chat settings. At least with Google, the option is clear.
On Claude: nothing to do. The choice is already explicit.
A concrete exercise: this week, ask the same complex question to your preferred AI three times throughout the day. Note the quality of each response. If you see significant variation, it's probably the router at work. Then rerun the test by forcing the model manually. Compare.
That's the kind of small test that teaches you more about your tool than hours of reading.
The real question
This isn't black and white. Routing is a technical solution to a real problem: running these models costs a fortune, and without optimization, prices would explode or the service would collapse. The logic holds.
But transparency isn't negotiable. When you pay for a service, you have the right to know what you're getting. And when the quality of what you receive depends on an invisible algorithmic choice, that's a trust problem, not a technology problem.
Google got it from the start. OpenAI is learning the hard way. Anthropic chose not to play this game. And now that you know, you can choose with open eyes.
Next time you ask your AI a question and the answer disappoints you, before thinking "AI is trash", ask yourself: was it actually the right model that answered?
If this article helped you, share it with someone who uses ChatGPT every day without knowing what's happening behind the scenes. This is the kind of thing we should all know.



