You're on Claude Code and you've heard about Amp. You want to know whether it's worth the switch, not to read another feature table.

Short version up top.

TL;DR

Amp is Sourcegraph's coding agent. It routes across models automatically (GPT-5.5 for deep reasoning, Opus 4.8 for smart work, GPT-5.5 again for fast small tasks), and it bills credits with zero markup on the underlying API. Claude Code is Anthropic's agent. It's Anthropic-only, and it bills as a subscription with rate limits.

Stay on Claude Code if you want one predictable monthly bill and you live in the Claude ecosystem. The $20 Pro or $100 Max plan caps your spend and the model is consistent.

Look hard at Amp if the thing you've always wanted is automatic model switching: a cheaper, faster model on the small stuff and the expensive one saved for real reasoning. You also get parallel agent runs you can watch out of the box. You pay for exactly what you burn, and on heavy Opus or GPT-5.5 work that adds up fast.

Most people reading this don't need to switch. They need to know what Amp does that Claude Code doesn't, and where both fall down. So that's the rest of this, including a real run where both agents choked on the same thing.

| Feature | Amp | Claude Code |
| --- | --- | --- |
| Maker | Sourcegraph (spun out Dec 2025) | Anthropic |
| Models | Multi-provider (GPT-5.5, Opus 4.8, Gemini, more) | Anthropic only (Sonnet 4.6, Opus) |
| Model routing | Automatic, per task | You pick, stays fixed |
| Billing | Pay-as-you-go credits, zero markup | Subscription with rate limits |
| Entry price | $5 minimum credit (no subscription) | Pro $20/mo (also Max $100 / $200) |
| Remote codebase search | Native (Librarian subagent) | Via MCP servers (not native) |
| Parallel agent visibility | Out of the box | Needs TUI plugins |
| Editors | VS Code, Cursor, Windsurf, JetBrains, Zed, Neovim | Terminal, VS Code, JetBrains, desktop |
| SWE-bench (agent) | None published | In Terminal-Bench; model-level proxies only |

What Amp actually is

Amp is the successor to Cody, Sourcegraph's old assistant. It launched May 2025. Cody Free and Pro were shut down in July 2025.

In December 2025 Amp spun out as its own company, led by Quinn Slack, the Sourcegraph co-founder. So it's no longer a side product. It's the main thing.

It runs in the terminal and in your editor: VS Code, Cursor, Windsurf, JetBrains, Zed, Neovim. You can run it headless with amp -x.

The pitch is simple. Amp uses the best model for each job instead of locking you to one. That's the whole identity.

I ran both on the same task

I gave both the same real job: a Python agent built with FastMCP, wired into an agent-SDK web chat app. Not a toy. Something from my actual backlog.

Amp got it built. Start to finish took me two days, which for a multi-agent setup with an MCP server and a web frontend is fast. I was impressed.

On cost, I'll be straight with you: I'm not going to hand you a per-task dollar figure, because I didn't capture a clean before-and-after on my credit balance and I'm not going to invent one. What I can tell you is how it felt. Amp felt expensive. Anthropic and OpenAI both subsidise their own models inside their own products, and that subsidy is doing a lot of quiet work to make those tools feel cheap. Amp passes the real inference cost straight through. So you're seeing the unsubsidised number, and the unsubsidised number stings. That's not a knock on Amp's pricing. It's honest pass-through. But it's the thing nobody warns you about when they say "zero markup," because zero markup also means zero subsidy.

The model switching is the part I'd actually switch for. It's the optimisation I've wanted from Claude Code since day one. You're not paying top-tier rates to rename a variable. Amp drops to a faster, cheaper model for the small stuff and saves the heavy reasoning for when it matters. A lot of engineers have quietly wished for exactly this, and Claude Code still doesn't do it out of the box.

The thread model is nicer to live in. Every interaction is saved as a thread you can reopen and share. Claude Code keeps history too, but it's buried. Amp puts it in front of you.

Parallel agents are where the CLI earns its keep. When Amp spins up subagents you can watch each one's progress. In Claude Code you're bolting on TUI plugins to get the same view. Out of the box, Amp just shows you.

What broke: ruff

The failure that taught me the most wasn't really about Amp.

Amp choked on ruff. It couldn't get the Python linting right, threw bad errors, and I couldn't get it to a clean state no matter how I prompted. Frustrating.

So I tried the same thing in Claude Code. It couldn't find or fix the ruff warnings either.

Then I opened it in Cursor and it handled ruff fine.

Cursor has an unfair advantage on this specific class of problem, and the advantage is the IDE. Cursor sees the linter output inline, live, the moment it fires. A terminal agent has to run ruff, read the output, and reason about it from text. An IDE agent is sitting inside the surface where the squiggly red line already exists.
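To make that loop concrete, here's a minimal sketch of what "run ruff, read the output, reason about it from text" looks like from the terminal side. It assumes ruff is installed and on your PATH, and that its JSON report carries filename, location, code, and message fields. The function names and the sample report are mine, for illustration. This is not how Amp or Claude Code actually wire it up.

```python
import json
import subprocess


def run_ruff(path: str) -> str:
    """Run ruff on a path and return its JSON report as text.

    Assumes ruff is on PATH. ruff exits non-zero when it finds
    violations, so the return code is not treated as a failure here.
    """
    result = subprocess.run(
        ["ruff", "check", "--output-format", "json", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def summarize(report_json: str) -> list[str]:
    """Flatten a ruff JSON report into one text line per diagnostic."""
    lines = []
    for diag in json.loads(report_json):
        loc = diag["location"]
        lines.append(
            f'{diag["filename"]}:{loc["row"]}:{loc["column"]} '
            f'{diag["code"]} {diag["message"]}'
        )
    return lines


# Hypothetical sample in the shape ruff emits, so the parser can be
# tried without ruff installed.
SAMPLE = json.dumps([
    {
        "filename": "app.py",
        "location": {"row": 1, "column": 8},
        "code": "F401",
        "message": "`os` imported but unused",
    }
])

if __name__ == "__main__":
    for line in summarize(SAMPLE):
        print(line)
```

Everything the agent knows about the lint state comes from those flattened lines, one subprocess call at a time. The editor gets the same diagnostics pushed to it continuously, without asking.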

So if your work is lint-heavy Python and that tight feedback loop matters, neither CLI agent is going to feel as good as the editor. That's not an Amp problem or a Claude Code problem. It's a terminal-agent problem. Worth knowing before you pick a tool for that kind of work.

How each one handles context and models

This is the real architectural split, not a spec sheet.

Amp routes across providers. You don't manage any of this. Amp picks the model per mode, and the subagents run their own models underneath.

| Mode / subagent | Model | Job |
| --- | --- | --- |
| Deep | GPT-5.5 | Hard reasoning, extended thinking |
| Smart | Opus 4.8 | State-of-the-art day-to-day work |
| Rush | GPT-5.5 | Fast, low-token, small tasks |
| Librarian | Sonnet 4.6 | Remote codebase search |
| Oracle | GPT-5.5 | Second-opinion reasoning |
| Search | Gemini 3 Flash | Fast code retrieval |
| Review | Gemini 3.1 Pro | Bug and code review |
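If it helps to think of the routing as data, the table boils down to a lookup. This is a sketch of the idea only: the dictionary and the function name are hypothetical, the assignments are copied from the table as published, and none of this is Amp's actual implementation.

```python
# Mode and subagent assignments copied from the table; expect them to drift.
ROUTES = {
    "deep": "GPT-5.5",
    "smart": "Opus 4.8",
    "rush": "GPT-5.5",
    "librarian": "Sonnet 4.6",
    "oracle": "GPT-5.5",
    "search": "Gemini 3 Flash",
    "review": "Gemini 3.1 Pro",
}


def model_for(mode: str) -> str:
    """Return the model listed for a mode or subagent (case-insensitive)."""
    try:
        return ROUTES[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown mode or subagent: {mode!r}") from None


if __name__ == "__main__":
    print(model_for("Rush"))
    print(model_for("Librarian"))
```

The point of the sketch is the shape. You pick a mode, or Amp picks a subagent, and the model follows from that choice rather than from a setting you maintain.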

Amp's Librarian is the differentiator most people miss. It's a subagent that searches remote codebases without leaving your session: all public GitHub plus your connected private repos. It rides Sourcegraph's code-intel infrastructure, which is the one thing Sourcegraph has always been good at. Ask it to go read how a framework you depend on actually works, and it will.

Claude Code is local-first. It walks your filesystem, reads files, greps on demand. No pre-built index. It reads CLAUDE.md for guidance and supports subagents with their own context windows. You can bolt on remote or semantic search through MCP servers, but it's not native the way Amp's Librarian is.

Anthropic's own engineering blog is honest about the core limit: Claude Code's ability to help in a big codebase is bounded by its ability to find the right context. Too much and performance drops. Too little and it's blind. That's the problem the Librarian is built to solve.

The models tell the rest of the story. Amp is multi-provider by design. Claude Code is Anthropic-only: Sonnet 4.6 by default, Opus on Max plans and via API, no OpenAI or Google models. If model choice matters to you, that's a one-sided difference.

What each one costs

This is the sharpest contrast, and it moves, so check both pages before you commit.

| | Amp | Claude Code |
| --- | --- | --- |
| Billing model | Credits, pay-as-you-go | Subscription, soft caps |
| Markup | Zero over provider API cost | Bundled, subsidised inference |
| Minimum | $5 credit purchase | $20/mo Pro |
| Higher tiers | Buy more credits as needed | Max 5x $100, Max 20x $200 |
| Free tier | $10/day credits, now clawed back for some | None (no Claude Code on free) |
| Overflow | Just keep spending credits | Opt-in extra usage at API rates, capped |
| Enterprise | +50%, one-time $1,000 to unlock | Custom, Premium seats $100+/seat |

Amp: credits, zero markup

Amp is pay-as-you-go credits. No subscription, no commitment, $5 minimum purchase. For individuals and non-enterprise workspaces there's zero markup on the providers' API pricing.

The example from their own page: a thread that incurs $2 of Anthropic usage and $0.50 of OpenAI usage deducts $2.50 from your balance. You're paying raw inference cost.

There was a free tier: $10/day in credits, replenished hourly, roughly $300/month, no payment, originally ad-supported and later ad-free. Worth knowing it got clawed back. On May 23, 2026, Amp started pausing or reducing the free daily allowance for less-active users and for people on older client versions. So "just try Amp free" is weaker advice now than it was in January. You might get the grant. You might not.

Enterprise is 50% more than individual and team pricing, plus a one-time $1,000 purchase to unlock it.
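The arithmetic is simple enough to sketch. This assumes credits are deducted as the plain sum of provider usage, and that "50% more" for enterprise works as a 1.5x multiplier on that sum. The second part is my reading of the pricing description, not something I've confirmed. The function and the provider keys are hypothetical.

```python
from decimal import Decimal

# Assumption: "50% more" is applied as a flat multiplier on raw usage.
ENTERPRISE_MULTIPLIER = Decimal("1.5")


def credits_deducted(provider_usage: dict[str, str], enterprise: bool = False) -> Decimal:
    """Sum per-provider usage (USD strings) into the credits a thread costs."""
    raw = sum((Decimal(v) for v in provider_usage.values()), Decimal("0"))
    return raw * ENTERPRISE_MULTIPLIER if enterprise else raw


if __name__ == "__main__":
    # The example from Amp's pricing page: $2 Anthropic plus $0.50 OpenAI.
    thread = {"anthropic": "2.00", "openai": "0.50"}
    print(credits_deducted(thread))                   # 2.50
    print(credits_deducted(thread, enterprise=True))  # 3.750
```

Decimal keeps the cents exact, which matters if you ever total a month of threads this way.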

Claude Code: subscription with caps

Claude Code is a subscription. Pro is $20/month, Max 5x is $100, Max 20x is $200. Usage runs against a 5-hour rolling window plus weekly caps, shared across Claude Code, Claude.ai chat, and Cowork. Hit the ceiling and you're throttled, not charged.

Two things worth knowing. Anthropic doubled the 5-hour limits on May 6, 2026, and removed the peak-hours reduction for Pro and Max. And every paid plan now has an opt-in extra-usage toggle: when you hit your included limit you can choose to continue at API rates with a spend cap you set. So the old "Amp meters, Claude Code throttles" line isn't clean anymore. Both can meter if you let them. The difference is the default. Amp's default is the meter. Claude Code's default is the cap.

If you'd rather pay per token directly, Claude Code runs on the Anthropic API: Opus 4.8 at $5/$25 per million in/out, Sonnet 4.6 at $3/$15, Haiku 4.5 at $1/$5.
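For a feel of what those rates mean per task, here's a rough calculator. The prices are the ones quoted above, and the dictionary keys are shorthand labels of mine, not real API model IDs. It ignores prompt caching and any other discounts, so treat the result as a ceiling rather than a bill.

```python
# USD per million tokens as (input, output), as quoted in this article.
# Keys are shorthand labels, not API model identifiers.
PRICES = {
    "opus-4.8": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
    "haiku-4.5": (1.00, 5.00),
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one run, ignoring caching and discounts."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # A hypothetical agentic task: 200k tokens read, 20k tokens written.
    for name in PRICES:
        print(name, api_cost(name, 200_000, 20_000))
```

On that hypothetical task, Opus comes out at $1.50, Sonnet at $0.90, and Haiku at $0.30. The token counts are invented, but the fivefold spread between the top and bottom model is the reason routing small jobs to a cheaper model is worth wanting.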

For reference, Anthropic's own docs put measured Claude Code spend at roughly $13 per developer per active day, $150 to $250 per developer per month, with most users under $30/day. That's real spend, not an estimate.

The honest cost read

Amp shows you a number that ticks as you work. Some people find that clarifying. Others find it stressful. The HN threads on Amp are full of both: engineers who love seeing exact per-task cost, and engineers who spent $5, then $10, then $20 in single sittings and got nervous watching the meter.

Claude Code hides the meter behind a flat fee. You trade visibility for predictability.

Neither is cheaper in the abstract. Amp is cheaper if your usage is light and bursty. Claude Code is cheaper if you're a heavy daily user, because the flat rate amortises and cache reads are bundled. The crossover depends entirely on how much you actually run it.
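You can put a rough number on that crossover. This sketch assumes the same work would cost the same on either side and that the subscription's caps never bite. Both are generous assumptions. The function names are hypothetical, and the inputs are yours to fill in.

```python
def breakeven_daily_spend(monthly_fee: float, active_days: int) -> float:
    """Metered spend per active day at which a flat fee costs the same."""
    if active_days <= 0:
        raise ValueError("active_days must be positive")
    return monthly_fee / active_days


def cheaper_option(monthly_fee: float, active_days: int, daily_spend: float) -> str:
    """Compare a month of metered spend against a flat monthly fee."""
    metered = active_days * daily_spend
    if metered < monthly_fee:
        return "pay-as-you-go"
    if metered > monthly_fee:
        return "subscription"
    return "tie"


if __name__ == "__main__":
    # Plan prices from this article; 20 active days a month is an assumption.
    for fee in (20, 100, 200):
        print(fee, breakeven_daily_spend(fee, 20))
    print(cheaper_option(200, 20, 13))
```

With 20 active days a month, the break-even points are $1, $5, and $10 a day for the $20, $100, and $200 plans. At the roughly $13 per active day that Anthropic reports for measured Claude Code spend, even the $200 plan beats the meter, provided the caps let you finish the work.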

One thing the credit model exposes that the subscription hides: the real, unsubsidised cost of inference. Anthropic and OpenAI both subsidise their own models inside their own products. Amp passes the raw provider cost straight through. So Amp can feel more expensive even when it isn't marking anything up, because you're seeing the number the subscription is quietly absorbing for you. Zero markup also means zero subsidy.

Where each one breaks

The real failure modes for both tools, because a comparison that only lists wins is marketing.

Claude Code burned limits too fast in early 2026. Users flooded GitHub and Reddit in March reporting that the 5-hour limits were exhausting faster than expected. One measured suite (WOZCODE, May 2) logged 161 turns versus 52 on the same 11 prompts, depending on config. People also reported hitting the cap before lunch; that framing is aggregated from user reports, and the turn counts are the measured part. Even after the May 6 doubling, heavy agentic tool use drains faster than chat does.

Claude Code had a real quality regression. Anthropic published its own postmortem on April 23, tracing it to three separate changes that hit Claude Code, the Agent SDK, and Cowork. The fixes had already landed by April 20, in v2.1.116. Credit to them for owning it publicly, but it happened.

Amp's failure modes are barely documented. That could mean it has fewer problems, or it could mean a smaller user base files fewer reports. I lean toward the second read: Amp is younger and pricier, so it has fewer users generating bug threads. Not enough public signal to call it either way with confidence.

Amp's cost is the failure mode everyone names. Go read any Amp discussion. The single most common complaint isn't quality, it's the meter. "Great, but pricey" shows up over and over. The model is working as designed. But if a ticking balance makes you code more cautiously, that caution is a real cost too.

The benchmark question

There is no head-to-head benchmark of Amp versus Claude Code. Nobody has run both on the same eval with the same harness. Amp has no published SWE-bench score at all, vendor or independent.

Terminal-Bench (January 2026) tested Claude Code, Codex CLI, and Gemini CLI on agent-level terminal tasks. Amp wasn't included. So Claude Code has an agent benchmark and Amp has none.

You'll see model-level SWE-bench numbers floating around, but read them carefully: those are model scores, not agent scores. Opus 4.8 lands around 88.6%, Sonnet 4.6 around 77.2%, Gemini 3.1 Pro around 80.6%. Amp's smart mode runs Opus 4.8 and Claude Code can run the same model on Max plans, so on comparable runs they're drawing from the same model strength. But the Librarian and tool-use overhead are unmeasured, so a leaderboard score doesn't translate directly to agent behaviour. Treat any "Amp is X% better" claim as unfounded. The data doesn't exist yet.

That absence is exactly why a real run matters. When there's no benchmark, the honest comparison is one engineer running both on the same job and reporting what happened, including what broke. Which is the scene above.

The verdict

Stay on Claude Code if you want a predictable bill and you're happy in the Anthropic ecosystem. The flat subscription is easier to reason about, the model is consistent, and the rate limits just got more generous.

Switch to Amp, or run it alongside, if automatic model routing is the feature you've been missing, or you want parallel agent runs you can actually watch without bolting on plugins. The Librarian is a genuine edge if you do a lot of cross-repo or framework-internals research.

The catch with Amp is the meter. You pay for exactly what you burn, and on heavy Opus or GPT-5.5 work that climbs fast. The free tier that used to soften the trial got clawed back in May, so going in you should expect to pay.

For most engineers already shipping on Claude Code, don't switch. Keep Claude Code as the daily driver and reach for Amp when you want multi-model routing or remote-codebase search on a specific task. Run the same job through both once and see what each one does with your actual code. And if your work is lint-heavy Python, keep an editor in the loop, because both CLI agents went blind on ruff in a way Cursor didn't.

Frequently asked questions

What is Amp and who makes it?

Amp is an AI coding agent from Sourcegraph, the successor to Cody. It launched in May 2025 and spun out as an independent company in December 2025, led by Sourcegraph co-founder Quinn Slack. It runs in the terminal and in editors including VS Code, Cursor, JetBrains, Zed, and Neovim.

How is Amp different from Claude Code?

Two main differences. Amp routes across models automatically (GPT-5.5, Opus 4.8, and others depending on the task), while Claude Code is Anthropic-only. And Amp bills credits at pass-through API cost with no subscription, while Claude Code is a flat monthly subscription with rate limits. Amp also has the Librarian, a subagent that searches remote codebases natively.

Is Amp more expensive than Claude Code?

It depends on usage. Amp is pay-as-you-go credits with zero markup, so light or bursty users often pay less. Claude Code is a flat subscription ($20 Pro, $100 or $200 Max), so heavy daily users usually pay less because the flat rate amortises. There's no per-task dollar cost for Amp published anywhere, which is why a real side-by-side run is the only honest way to compare.

Does Amp have a free tier?

It did: $10/day in credits, roughly $300/month, no payment required. But in May 2026 Amp started pausing or reducing that free allowance for less-active users and people on older client versions. So you may or may not get it now. Claude Code has no free tier at all; it needs at least a $20 Pro subscription.

Should I switch from Claude Code to Amp?

Most Claude Code users don't need to switch. Consider Amp if you specifically want automatic model switching, native remote-codebase search, or clearer parallel-agent runs. The strongest move for many people is running Amp alongside Claude Code rather than replacing it, using each for what it's best at.

Is there a benchmark comparing Amp and Claude Code?

No. No one has run both on the same eval with the same harness, and Amp has no published SWE-bench score. Claude Code appears in agent benchmarks like Terminal-Bench; Amp does not. Any SWE-bench numbers you see are model-level scores, not agent scores, so they don't tell you which tool wins.

Start by doing this

5 mins: Install Amp (curl -fsSL https://ampcode.com/install.sh | bash), sign in, and run one small real task from your backlog. Watch the credit balance move. That's the unsubsidised cost of inference, and seeing it once tells you more than any pricing page.

15 mins: Run the same task through Claude Code. Note which finished cleaner, which felt faster, and how much of your 5-hour window it ate.

30 mins: If your stack is Python, throw a deliberately lint-broken file at both CLI agents and watch how they handle ruff. Then open it in your editor. That gap is the thing this whole comparison turns on.
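If you don't have a lint-broken file handy, here's a small one to start from. It runs fine as Python but is written to trip a few rules I'd expect ruff to flag under its default settings. The file name is made up, and which rules actually fire depends on your ruff configuration.

```python
"""lint_bait.py: runs fine, but is written to trip common lint rules."""
import os  # unused import (F401)


def greet(name=None):
    unused = 42  # assigned but never used (F841)
    if name == None:  # comparison to None with == (E711)
        return f"hello, stranger"  # f-string without placeholders (F541)
    return "hello, " + name


if __name__ == "__main__":
    print(greet())
    print(greet("Amp"))
```

Ask each agent to get the file to a clean ruff check without changing what greet returns. Then compare how many round trips each one needs against how quickly your editor surfaces the same four problems.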