
The 8 Levels of Context Maturity in AI-Native Engineering

Brandon Waselnuk · May 8, 2026 · 14 min read
"Software development is shifting from an activity centered on writing code to an activity grounded in orchestrating agents that write code." (Anthropic, 2026 Agentic Coding Trends Report)

Most engineering teams are caught between two realities: AI now shows up in roughly 60% of their work, but according to Anthropic's research (Anthropic, 2026 Agentic Coding Trends Report), only about 20% can be fully delegated.

The gap between those two numbers is what this guide is about. It builds on Bassim Eledath's framework for agentic engineering (Bassim Eledath, Levels of Agentic Engineering) and outlines the eight stages organizations move through as they adopt AI-native development.

This guide will help you move from "you are the context provider" to "curated context" to a real context layer.

The maturity curve
Eight levels, three zones, one trajectory
  • Levels 1-2: You are the context. Engineers carry the system in their heads.
  • Levels 3-4: Curated context. Rules files work, until they go stale.
  • Levels 5-8: Context layer. Live retrieval becomes load-bearing infrastructure.
Figure: bar height represents the autonomy each level unlocks and the infrastructure investment it demands.
Reading guide

Every level follows the same four-section structure. We call it GATE:

  • G: Ground truth. What teams at this stage actually look like, including the signals that you're here.
  • A: AI native. How teams that have figured this level out operate.
  • T: Tactics. Concrete steps to advance, with examples.
  • E: Exit criteria. The blockers you need to clear before advancing to the next stage.

The level map at a glance

Eight levels mapped to three context-maturity zones.

Level 01 · You are the context

AI Code Completion (Tab Complete)

The floor. Autocomplete saves keystrokes, but the team's planning and review loops happen elsewhere.

Ground truth

Engineers use Copilot-style autocomplete inside their IDE. AI fills in code snippets, completes function bodies, and saves keystrokes.

You're at Level 1 if:

  • One or more autocomplete tools are approved and rolled out.
  • Senior engineers like it. Junior engineers find it inconsistent or distracting.
  • AI hasn't changed how the team plans, reviews, or debugs work.
  • Adoption is uneven. Some engineers leave the suggestions on, others turn them off.

The team gets a mild productivity bump on routine code. The mental model is "AI is a better autocomplete," not "AI is a teammate." Output quality depends on whether the engineer already knows the shape of the answer before they start typing.

The size of the bump is real but modest. DORA research correlates a 25% increase in AI adoption inside an organization with a 7.5% gain in documentation quality, 3.4% in code quality, and 3.1% in code review speed (DX, The AI Strategy Playbook). These aren't transformational numbers. They're the floor.

AI native

AI is a baseline floor. Every engineer has it on, but the team's planning, generation, and review loops happen elsewhere.

Teams that have moved past Level 1 treat autocomplete as the residual layer underneath their actual workflow. They don't measure success in keystrokes saved. They measure it in tasks delegated.

There's a quiet pattern in the productivity research: engineers using AI report a net decrease in time spent per task category but a much larger net increase in output volume (Anthropic, 2026 Agentic Coding Trends Report). AI doesn't speed up individual tasks as much as it makes more tasks viable. The shift in mindset that gets a team out of Level 1 is recognizing it as a floor, not a ceiling.

Tactics

Pair the IDE with a Q&A layer. Autocomplete doesn't help when the question is "why does this service do X?" Give engineers a way to ask questions about your code, PRs, and docs from a surface they already use.

Add an agent IDE alongside, not instead. Most teams stuck at Level 1 jumped from "no AI" to "Copilot" and stopped. Letting senior engineers experiment with chat-first agent IDEs on real tasks creates the pull for Level 2.

Write down your stack. A short README that names approved tools, shows how to get access, and points at where to ask for help beats six months of organic confusion.

Don't measure adoption by license seats. Measure by what work actually got done with AI in the loop. Seats are a vanity metric. incident.io found that nearly half of teams calculate metrics but take no action on them (incident.io, Industry Benchmark Report, 100,000+ incidents).

Exit criteria

You're ready for Level 2 when all four are true:
  • Engineers can ask a tool or agent questions about your codebase and get grounded, cited answers, not just autocomplete.
  • At least one engineer has shipped a non-trivial change drafted with an agent.
  • Senior engineers can name one task they delegate to AI rather than write themselves.
  • You have a written rollout doc that names approved tools and how to get help, with a named DRI.
Zone 1
Level 02 · You are the context

Agent IDE

Chat is the primary surface. Output quality varies wildly with how much context the engineer feeds in.

Ground truth

The team has moved into agent-first IDEs where chat is the primary surface, not autocomplete. Engineers do multi-file edits, ask the agent to refactor, generate tests, and explain code.

You're at Level 2 if:

  • The chosen agent IDE is approved and most engineers use it daily.
  • Engineers manually paste files, snippets, docs into chat to get useful output.
  • Output quality varies wildly with how much context the engineer feeds in.
  • "You are the context." Your brain is running compute for the agent.

The 10x sessions are when an engineer brings deep context. The off-the-rails sessions are when the agent guesses. Power users feel productive. Newer engineers struggle to reproduce those results because they don't yet know what context to bring.

AI native

Engineers stop pasting context from memory. They start writing it down because doing it from scratch every session is unsustainable.
"Before Unblocked, I was manually compiling documentation into a local folder just so Claude Code could reference it. Now it pulls everything directly. I'm getting 90% accuracy on complex data structure questions that would have taken hours to figure out."
Austin Rojan · Onboarding Specialist at Subsplash

AI-native teams at Level 2 are honest about the bottleneck. The agent isn't dumb. It's blind. Every interaction is gated by how much the engineer is aware of and remembers to paste in. They start writing down what they're pasting because doing it from memory every time is unsustainable. That instinct — the "this should be captured somewhere" moment — is what propels a team into Level 3.

Tactics

Catalog what engineers paste. Spend a week noting which files, conventions, and explanations get re-pasted across sessions. That's your future curated-context inventory.
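One way to run that catalog, as a minimal sketch: assume each engineer exports the snippets they pasted during each chat session. The `sessions` structure and the `repeated_pastes` helper are hypothetical and not a feature of any agent IDE. Collapsing whitespace and hashing each snippet lets you count how many distinct sessions re-pasted the same thing.

```python
import hashlib
from collections import defaultdict


def fingerprint(snippet: str) -> str:
    """Hash a snippet after collapsing whitespace so trivial spacing changes still match."""
    normalized = " ".join(snippet.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def repeated_pastes(sessions: dict[str, list[str]], min_sessions: int = 2) -> list[tuple[str, int]]:
    """Return (snippet, session_count) for snippets pasted in at least min_sessions sessions."""
    seen_in: dict[str, set[str]] = defaultdict(set)
    example: dict[str, str] = {}
    for session_id, snippets in sessions.items():
        for snippet in snippets:
            key = fingerprint(snippet)
            seen_in[key].add(session_id)
            example.setdefault(key, snippet)  # keep the first wording we saw
    repeated = [(example[k], len(ids)) for k, ids in seen_in.items() if len(ids) >= min_sessions]
    return sorted(repeated, key=lambda item: item[1], reverse=True)


# Hypothetical data: three chat sessions and what was pasted into each.
sessions = {
    "mon-1": ["Use pnpm, not npm.", "Errors go through AppError."],
    "tue-1": ["Use  pnpm, not npm.", "Schema lives in db/schema.sql"],
    "wed-1": ["Use pnpm, not npm.", "Errors go through AppError."],
}
for snippet, count in repeated_pastes(sessions):
    print(f"{count} sessions: {snippet}")
```

Anything that shows up in most sessions is a strong candidate for the first rules file.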

Run a context-dump session. Have your top three power users walk through their workflow. What do they always include? What do they wish the agent had? That's your Level 3 roadmap.

Generate a starter rules file. Most agent IDEs have a command for this. Cursor has /create-rule; Claude Code has /init to write CLAUDE.md.

Stop asking "is the model good enough?" Start asking "is the agent's input good enough?" Most quality gaps at Level 2 are context gaps, not capability gaps.

Exit criteria

You're ready for Level 3 when all four are true:
  • You have at least one rules file checked in and maintained.
  • Engineers know where to find shared prompts and conventions.
  • The team can name three specific patterns they want agents to follow.
  • Onboarding a new engineer to your AI workflows takes hours, not weeks.
Zone 1 → Zone 2 · Crossing into curated context
Level 03 · Curated context

Context Engineering

Every token in the prompt fights for its place. Rules files exist, are maintained, and have visibly improved output.

Ground truth

The team is intentional about context. CLAUDE.md, .cursorrules, system prompts, curated docs folders. Every token in the prompt fights for its place.

You're at Level 3 if:

  • Rules files exist and are maintained, not abandoned.
  • The team has shared prompt patterns and conventions for using them.
  • Engineers can describe what context the agent has versus what it doesn't.
  • Output quality has improved since curating, and the team can point to why.

The shift has happened. You've moved from "you are the context" to "curated context." Output is more consistent. Engineers spend less time pasting. Code review still catches plenty, but fewer issues are basic convention misses.

AI native

An agent ships a small bug fix end-to-end without an engineer driving every step.
"My setup tells the agent: before you implement anything, go check Unblocked. It has everything (our repos, Notion, Slack, coding standards) and it surfaces things I wouldn't have thought to look for."
Justin McCraw · Software Engineer at The Information

At Level 3, the leading edge is recognizing that curated context will hit a ceiling. Rules files capture what you already know to write down. They can't capture the Slack thread from yesterday, the PR that changed a convention, or the design doc. AI-native teams at this level start running multiple agents in parallel to amortize their context investment, while quietly looking for what comes after curation.

Tactics

Treat your rules files like code. Version them, review them, refactor them when conventions change. Stale rules generate confidently wrong output, which is worse than no rules at all.
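One cheap staleness check, sketched under a stated assumption: your rules files mention repo paths in backticks. The `missing_references` helper and the regex are illustrative, not part of any tool named in this guide. It flags paths a rules file still refers to after they have been moved or deleted.

```python
import re
from pathlib import Path

# Assumption: rules files mention repo paths in backticks, e.g. `src/db/client.py`.
PATH_PATTERN = re.compile(r"`([\w.-]+(?:/[\w.-]+)+)`")


def missing_references(rules_text: str, repo_root: Path) -> list[str]:
    """Return backtick-quoted paths from the rules text that do not exist under repo_root."""
    referenced = dict.fromkeys(PATH_PATTERN.findall(rules_text))  # de-duplicate, keep order
    return [path for path in referenced if not (repo_root / path).exists()]
```

Run in CI against each rules file, a non-empty result is a signal that a convention changed and the rules did not.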

Map your knowledge sources. Which questions do agents need answered that aren't in any rules file? PR discussions? Slack? Incident postmortems? Design docs? That's the gap a context layer closes.

Make your rules queryable. As rules grow across multiple files, agents either read the wrong sections or load all of them and burn tokens. The Repo Rules Agent (Unblocked, open-source) scans your existing rules files and exposes them as a structured index agents can query by task, language, or file scope.
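To make the idea of a queryable index concrete, here is a minimal sketch. It does not describe how the Repo Rules Agent works; the `Rule` fields and the `rules_for` helper are assumptions for illustration. Each rule carries a file-scope glob and an optional language tag, and a query returns only the rules that apply to the file being edited.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    text: str
    scope: str = "*"                 # glob matched against the file path
    language: Optional[str] = None   # None means the rule applies to any language


def rules_for(rules: list[Rule], path: str, language: str) -> list[str]:
    """Return the text of rules whose scope and language match the target file."""
    return [
        rule.text
        for rule in rules
        if fnmatchcase(path, rule.scope) and rule.language in (None, language)
    ]


# Hypothetical rules; in practice these would be parsed from your rules files.
RULES = [
    Rule("Never log request bodies."),
    Rule("Use the shared retry helper for HTTP calls.", scope="services/*"),
    Rule("Prefer dataclasses over dicts for DTOs.", language="python"),
    Rule("Migrations must be reversible.", scope="db/migrations/*"),
]

print(rules_for(RULES, "services/billing/client.py", "python"))
```

The agent loads three rules for that file instead of all four, and the saving grows with the size of the rule set.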

Start delegating multi-step tasks. If your rules are working, an agent should be able to ship a small bug fix end-to-end. If it can't, the rules aren't the problem. The context is.

Exit criteria

You're ready for Level 4 when all four are true:
  • An engineer can show one task where they set a goal and let an agent do multi-step work without driving every step.
  • When agents fail, the team blames missing context first and model capability second.
  • The team has a "lessons learned" channel or doc where agent and prompt patterns get codified.
  • You can describe what your rules files cover and, just as importantly, what they don't.
Zone 2 · The curated context trap begins
Level 04 · Curated context (under strain)

Compounding Engineering

Plan, delegate, assess, codify. The codify loop reveals just how much knowledge isn't written down anywhere.
Author's note: Kieran Klaassen and the team at Every introduced "compound engineering" to describe self-improving development systems where each unit of work makes the next easier. We use "compounding" because it emphasizes the loop.

Ground truth

The team runs a deliberate "plan, delegate, assess, codify" feedback loop. Agent failures aren't one-off corrections; they trigger updates to rules, prompts, or shared docs so the same failure doesn't happen twice.

You're at Level 4 if:

  • The team captures lessons from agent sessions and embeds them back into rules.
  • When an agent goes off the rails, the first instinct is "what context was missing."
  • Power users run multiple agents in parallel, but parallelism is a how, not a what.
  • The team can name three rules updates triggered by specific past failures.

This is where teams start to feel the strain of curated context. The codify loop reveals just how much knowledge isn't written down anywhere. You've maintained CLAUDE.md, written prompts, standardized your rules. But every new agent, every new task, every new repo needs the same context fed in again. Copy-paste is still a job, even with rules in place.

AI native

Every agent failure is one rule, prompt, or doc update away from never happening again.
"When I plugged Unblocked into our context-gathering and PR review steps, it brought in the Slack conversations where real architectural decisions get made. We went from three rounds of PR review to one before the code was production-ready."
Pablo Vallejo · Engineering Manager at Clio

AI-native teams at Level 4 see the codify loop running and notice it's becoming a maintenance treadmill. Every fix exposes more gaps. They start asking how to give every agent the same understanding of the system without manually piping it in. Curated context stops being enough at this stage. The teams that pull ahead recognize they need a context layer, not a bigger rules file.

They also use the codify loop in reverse. Because the discipline is in place and the cost of cleanup has dropped close to zero, Level 4 teams aggressively delegate refactors, renames, and minor cleanups to agents. Code smells that used to be deferred for "when there's time" become free fixes. Simon Willison's argument lands here (Simon Willison, Better Code, agentic engineering patterns): agents shouldn't be tolerated as a quality tradeoff. They should be deployed to raise quality.

About 27% of AI-assisted work consists of tasks that wouldn't have been attempted otherwise (Anthropic, 2026 Agentic Coding Trends Report). That shift from "save time" to "do work that wasn't possible before" is also why the productivity gains at this level can be deceptively large. One Anthropic customer reported their development system "doubled execution speed, not by eliminating human involvement, but by shifting developers toward higher-value work."

Most teams stall at this level. The fix is a different architecture for how context reaches the agent. That takes serious work and distributed systems thinking.

Tactics

Run the codify loop deliberately. After every session that goes wrong, ask: which rule, prompt, or doc would have prevented this? Update it before starting the next session.

Apply agents at the bottleneck, not the keyboard. Eli Goldratt put it bluntly: "An hour saved on something that isn't the bottleneck is worthless." For most engineering orgs, writing code was never the bottleneck. PR review latency, incident triage, documentation drift, and onboarding probably are.

Invert your time-on-task ratio. Every's compound-engineering approach: 80% in planning and review, 20% in execution. If your engineers are still spending most of their time writing code line-by-line, you're operating at Level 3, not Level 4.

Try Every's compound-engineering plugin (Every, compound-engineering-plugin, open-source). It drops the plan-delegate-assess-codify loop into Claude Code, Cursor, Codex, and Copilot as ready-made skills.

Time-track context overhead. How much engineer time goes to re-feeding context across agents and sessions? If it's more than a few hours a week per power user, you've hit the ceiling.

Audit your failure modes. When agents miss conventions, which conventions? Where were they documented? Most failures trace to information that exists somewhere but isn't reaching the agent.

Pilot an MCP-based context layer. Connecting agents to a system that synthesizes code, PRs, Slack, and docs lets you stop pasting and start querying. Start with one team and one workflow.

Exit criteria

You're ready for Level 5 when all four are true:
  • The team can show 5+ rules updates traced to specific past failures.
  • At least one workflow has agents pulling context from a shared source rather than pasted-in prompts.
  • Engineers use a shared registry of skills and prompts rather than each maintaining their own.
  • At least one MCP server is connected and used regularly, not just installed.
Zone 2 → Zone 3 · The context layer emerges
Level 05 · Context layer emerging

MCP + Skills

Agents have access to external systems. Access is not understanding — and that's where the new failure mode lives.

Ground truth

Agents have access to external systems through MCP and custom skills: databases, APIs, CI pipelines, design systems, browser automation.

You're at Level 5 if:

  • One or more MCP servers are connected and used regularly.
  • The team maintains a shared skills and prompt registry.
  • Agents take actions in external systems, not just suggest code.
  • Some workflows are delegated end-to-end. Engineers kick off and review.

The team is fast. The team is also seeing a new failure mode. Agents have access to lots of information through MCP, but they don't always pick the right source, reconcile conflicts, or recognize when a Slack thread overrides a doc.

AI native

Agents have decision-grade context, not just access to MCP servers.
"Individual MCPs are great when you already know what you're looking for. Unblocked is what you use when you need the full picture."
Zachary Goldberg · Engineering Manager at Lilt

Access to information is not the same as understanding. AI-native teams at Level 5 stop adding MCP servers and start asking which one is authoritative, what happens when two disagree, and when the agent should keep looking instead of stopping at the first plausible answer. The context layer becomes essential here.

This is also the level where "satisfaction of search" failures, a known bias from radiology, become visible. Agents stop at the first answer that looks plausible. If your agent is consistently and confidently wrong, that's almost always why: it called the first MCP server, found the first thing, and started burning tokens on bad information.

Tactics

Inventory every MCP connection and what it provides. If two of them can answer the same question differently, decide which is authoritative or invest in something that reconciles between them.
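A sketch of what "decide which is authoritative" can look like in code. The source names, the `AUTHORITY` ranking, and the `Answer` type are all hypothetical. The point is that the agent collects every candidate answer before choosing, and reports disagreement instead of silently taking the first hit.

```python
from dataclasses import dataclass

# Hypothetical ranking: the lower number wins when sources disagree.
AUTHORITY = {"code": 0, "adr_docs": 1, "wiki": 2, "slack": 3}


@dataclass(frozen=True)
class Answer:
    source: str
    value: str


def reconcile(answers: list[Answer]) -> tuple[Answer, bool]:
    """Pick the answer from the most authoritative source.

    Returns (chosen_answer, conflict), where conflict is True when at least
    two sources returned different values. Unknown sources rank last.
    """
    if not answers:
        raise ValueError("no answers to reconcile")
    ranked = sorted(answers, key=lambda a: AUTHORITY.get(a.source, len(AUTHORITY)))
    conflict = len({a.value for a in answers}) > 1
    return ranked[0], conflict


answers = [
    Answer("wiki", "timeout is 30s"),
    Answer("code", "timeout is 10s"),
]
chosen, conflict = reconcile(answers)
print(chosen.source, chosen.value, "conflict" if conflict else "agreed")
```

A static ranking is the simplest possible policy. It cannot express cases where a recent Slack thread overrides an older doc, which is the kind of reconciliation a fuller context layer has to handle.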

Watch for satisfaction of search failures. When agents are wrong, did they find one source and stop? Or did they reconcile across sources and still pick the wrong answer? The first is a context architecture problem.

Add a who-knows-what graph agents can traverse. Source prioritization often depends on who's asking. The OSS Engineering Social Graph Builder (Unblocked) creates that graph from git history and code ownership.

Consolidate skills. When multiple engineers have written nearly the same skill, that's a registry problem. Move skills into a shared place with naming conventions and version control.

Measure correction loops. If agents at Level 5 still need three or more rounds of correction on average, the context layer isn't doing its job. Don't blame the model.
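The measurement itself is simple arithmetic once the rounds are logged. A minimal sketch, assuming you record the number of human correction rounds per task; the task IDs and the `correction_summary` helper are hypothetical:

```python
from statistics import mean


def correction_summary(rounds_by_task: dict[str, int], threshold: int = 3) -> tuple[float, list[str]]:
    """Return the average correction rounds and the tasks at or above the threshold."""
    average = mean(list(rounds_by_task.values()))
    heavy = sorted(task for task, rounds in rounds_by_task.items() if rounds >= threshold)
    return average, heavy


# Hypothetical log: task id -> human correction rounds before the change merged.
rounds = {"T-101": 1, "T-102": 4, "T-103": 3, "T-104": 0, "T-105": 5}
average, heavy = correction_summary(rounds)
print(f"average rounds: {average:.1f}")
print(f"tasks needing 3+ rounds: {heavy}")
```

The list of heavy tasks is usually more useful than the average, because it tells you which sessions to audit for missing context.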

Consider an MCP gateway. Organizations are using this pattern to distribute approved MCPs and skills, like Stripe's toolshed or Cloudflare's MCP server portals.

Exit criteria

You're ready for Level 6 when all four are true:
  • Agents have decision-grade context (synthesized, reconciled, ranked), not just access to MCP servers.
  • Agents operate inside a feedback loop with automated checks (lint, type, test, security).
  • At least one workflow runs end-to-end without an engineer in the chair.
  • You have a shared way to capture and reuse skills across the org.
Zone 3 · The context layer is now load-bearing
Level 06 · Context layer essential

Harness Engineering

The harness is the product, not the agent. Constraints over instructions. Backpressure over hand-holding.
Author's note: OpenAI introduced "harness engineering" to describe the runtime, feedback loops, and guardrails an agent operates inside. We use it here as the level where the harness becomes the team's primary infrastructure investment.

Ground truth

Agents operate inside complete environments: observability, testing, validation, security boundaries.

You're at Level 6 if:

  • Agents have automated feedback loops (lint, type-check, test, security scan) they self-correct against.
  • Security boundaries for what agents can and can't do are enforced, not requested.
  • The team optimizes for throughput over perfection.
  • The harness includes context. Agents pull needed context outside the code without prompting.

The team isn't running agents anymore. The team is running an environment that runs agents.

AI native

The harness self-corrects. When CI fails, agents see it and fix it without engineer intervention.

At Level 6, the AI-native team treats the harness as the most important infrastructure they own. The agent is interchangeable. The harness — feedback loops, context layer, security guardrails — is what makes the work reliable. Constraints over instructions. Backpressure over hand-holding. Throughput over perfection.

The context layer is now critical. In fact, it's looking a lot like a context engine. Agents need access to team conventions, past decisions, and cross-repo knowledge to operate without an engineer filling gaps in real time.

Tactics

Make the feedback loop visible to the agent. If your CI fails, the agent should see the failure and self-correct. If your security scan flags an issue, the agent should know without an engineer translating.
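A minimal sketch of that loop, assuming the agent is reachable through a callable. `propose_fix` is a hypothetical hook standing in for whatever agent you run; only the `subprocess` calls are real. The harness runs each check, captures the raw failure output, and hands it to the agent unedited.

```python
import subprocess
from typing import Callable


def run_checks(commands: list[list[str]]) -> list[str]:
    """Run each check command and return the captured output of every failing one."""
    failures = []
    for cmd in commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}")
    return failures


def fix_until_green(
    commands: list[list[str]],
    propose_fix: Callable[[list[str]], None],
    max_rounds: int = 3,
) -> bool:
    """Feed raw check failures to the agent, allowing at most max_rounds fix attempts.

    Returns True once every check passes, False if the rounds run out first.
    """
    for _ in range(max_rounds):
        failures = run_checks(commands)
        if not failures:
            return True
        propose_fix(failures)  # hypothetical hook: the agent edits the working tree
    return not run_checks(commands)
```

In a Python repo, `commands` might be `[["ruff", "check", "."], ["pytest", "-q"]]`. The `max_rounds` cap matters: without it, an agent that cannot fix the failure will loop indefinitely.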

Codify constraints, not instructions. "Don't write to this file" beats "follow these conventions." Agents respond to enforced boundaries better than to documented preferences.
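An enforced boundary can be as small as a guard that every agent write passes through. This is a sketch: the `PROTECTED` globs and the `guard_write` function are hypothetical, and a real harness would enforce the same rule at the filesystem or sandbox level as well.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical deny-list of globs the agent may never write to.
PROTECTED = ["infra/prod/*", "*.pem", ".github/workflows/*"]


class WriteDenied(Exception):
    """Raised when an agent attempts to write to a protected path."""


def guard_write(path: str, protected: list[str] = PROTECTED) -> str:
    """Return the normalized relative path if writing is allowed, else raise WriteDenied."""
    normalized = posixpath.normpath(path)  # collapses "./" and "a/../b" tricks
    if normalized.startswith("/") or normalized == ".." or normalized.startswith("../"):
        raise WriteDenied(f"{path} is outside the workspace")
    for pattern in protected:
        if fnmatchcase(normalized, pattern):
            raise WriteDenied(f"{path} matches protected pattern {pattern}")
    return normalized
```

The agent never has to remember the rule. A write to `src/../infra/prod/db.tf` fails the same way every time, regardless of what the prompt said.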

Treat the context layer as critical infrastructure. Cross-repo knowledge, team conventions, past decisions — these need to be queryable by every agent, every time, with dynamic context.

Watch your throughput numbers. How many tasks does the harness complete end-to-end without engineer intervention? If that number isn't climbing, something in the loop is broken.

Exit criteria

You're ready for Level 7 when all four are true:
  • At least one workflow runs autonomously without human approval gates.
  • Agents self-correct based on automated feedback without engineer intervention.
  • The context layer serves multiple agents and roles.
  • You can describe what the harness does in fewer sentences than what your agents do.
Zone 3 · Lights-out engineering
Level 07 · Context layer load-bearing

Background Agents

Agents churn away at 3am while nobody's watching. It only works because the context layer doesn't sleep either.

Ground truth

Agents execute autonomously without human approval gates. Engineers set goals; agents generate plans and run them asynchronously.

You're at Level 7 if:

  • One or more agents run in the background. Ralph loops, scheduled runs, event-triggered workflows.
  • An orchestrator pattern is in place: one agent dispatches workers across isolated contexts.
  • The team uses different models for different roles.
  • Engineers are on the loop, not in it: they prompt, walk away, and review the output.

There's something unsettling about agents churning away at 3am while nobody's watching. That's the day-to-day at Level 7, and it only works because the context layer is load-bearing. By this stage it's really a context engine: it has its own compute, reasoning, memory, and access. Agents, wherever they run, pull from the same understanding. If it goes down, work stops.

Background agents in production
  • Stripe: 1,000+ PRs per week from Minions agents
  • Spotify: 1,500+ merged PRs from background coding agent
  • Ramp: ~30% share of merged PRs from Inspect agent across main repos

The work that becomes routine at Level 7 is the kind that used to be economically untenable. Weekend-long refactors. Fifty-repo CVE patches. Dependency upgrades that took a quarter to plan and a sprint to execute. None of that requires more model intelligence. It requires patience, parallelism, and context that doesn't fade.

The case for Level 7 lands hardest if you've ever been on call. The hardest part of incident response is rarely the fix. It's getting eyes on the problem at 3am. The average gap between an alert firing and the first human message is 4-5 minutes during the day, and overnight incidents take nearly twice as long (incident.io, 100,000+ incidents analyzed). Background agents don't sleep. They're already triaging the alert when the human gets to their laptop.

AI native

An agent ships a non-trivial change overnight while nobody is watching, and the morning review confirms it shipped correctly.
"When I'm on call and we hit a data discrepancy, that investigation used to take a full day. Now the agent calls Unblocked to pattern match against past Slack conversations and Confluence pages, and I have an answer in 30 minutes."
Nazmus Sakib · Software Engineer at Workday

AI-native teams at Level 7 think about agents the way they used to think about microservices. Each one has a job, an SLA, a budget, and an owner. They invest in evals and validators, because they can't review every PR by eye. They invest in observability, because async agents fail silently if you let them. They invest in context, because background agents have no human around to babysit.

The shift from Level 6 is subtle but important. At Level 6, the harness supports the human. At Level 7, the harness replaces the human in most loops. That's only safe if the context layer is reliably delivering decision-grade understanding to every agent.

Tactics

Build evaluation harnesses. Take real review comments from your senior engineers. See if agents with your current context would have caught the issues. Measure the output, not the intermediate retrieval step.
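The core metric can be sketched as recall against human review. This assumes you have labeled each senior reviewer finding and each agent finding with a shared issue label. That labeling step is the hard part and is not shown; `ReviewCase` and `review_recall` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewCase:
    pr: str
    human_findings: frozenset[str]  # issue labels raised by senior reviewers
    agent_findings: frozenset[str]  # issue labels the agent raised on the same diff


def review_recall(cases: list[ReviewCase]) -> float:
    """Fraction of human-raised issues that the agent also raised, across all cases."""
    expected = sum(len(case.human_findings) for case in cases)
    caught = sum(len(case.human_findings & case.agent_findings) for case in cases)
    return caught / expected if expected else 1.0


# Hypothetical labeled cases drawn from past PR reviews.
cases = [
    ReviewCase("PR-1", frozenset({"missing-index", "no-retry"}), frozenset({"missing-index", "style"})),
    ReviewCase("PR-2", frozenset({"n-plus-one-query"}), frozenset({"n-plus-one-query"})),
]
print(f"recall: {review_recall(cases):.2f}")
```

Tracking this number before and after a context change tells you whether the change helped the output, which is what the tactic asks you to measure.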

Run model diversity by role. Hard reasoning gets the biggest model. Repetitive tasks get smaller, faster ones. Don't pay top dollar for tasks that don't need it.

Treat context layer SLAs like database SLAs. Latency, availability, consistency. If your agents are waiting on context, they're not working.

Set throughput budgets. Background agents can run forever. Define when they stop. "Until the PRD is complete," "until three attempts have failed," "until $X in inference is spent."
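Those three stop conditions fit in one small object. A sketch with hypothetical names and limits: the `Budget` class, its defaults, and the dollar figures are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Budget:
    """Stop conditions for one background run (all limits are hypothetical defaults)."""
    max_failed_attempts: int = 3
    max_spend_usd: float = 25.0
    failed_attempts: int = 0
    spent_usd: float = 0.0

    def record(self, succeeded: bool, cost_usd: float) -> None:
        """Account for one attempt's outcome and inference cost."""
        self.spent_usd += cost_usd
        if not succeeded:
            self.failed_attempts += 1

    def stop_reason(self, goal_met: bool = False) -> Optional[str]:
        """Return why the agent must stop, or None if it may continue."""
        if goal_met:
            return "goal met"
        if self.failed_attempts >= self.max_failed_attempts:
            return "failed attempts exhausted"
        if self.spent_usd >= self.max_spend_usd:
            return "spend limit reached"
        return None


budget = Budget(max_failed_attempts=3, max_spend_usd=5.0)
for cost in (1.0, 1.5, 2.0):
    budget.record(succeeded=False, cost_usd=cost)
print(budget.stop_reason())
```

The orchestrator checks `stop_reason()` before every attempt and logs the reason when it halts, so a stopped run is never ambiguous.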

Exit criteria

You're ready for Level 8 when all four are true:
  • Multiple agents work on the same problem and coordinate without funneling through a single orchestrator.
  • Agents claim tasks, share findings, and resolve dependencies with minimal human routing.
  • The context layer serves as a shared source that all agents read from and contribute to.
  • Your eval coverage is broad enough that you'd notice quality regressions before users do.
Zone 3 · The frontier
Level 08 · Mature context layer

Agent Teams

The emerging frontier. Most teams should focus on getting Level 7 right before chasing Level 8.

Ground truth

Multiple agents coordinate directly. They claim tasks, share findings, flag dependencies, and resolve conflicts without a single orchestrator routing every message.

You're at Level 8 if:

  • Agents communicate with each other, not just with a central coordinator.
  • The system can decompose a task across roles and reassemble the result.
  • Failures get recovered by the agent team, not by an engineer reading logs.
  • Most teams aren't here yet, and shouldn't pretend to be.

Level 8 is the emerging frontier. For most teams, Level 7 is the destination, not a stop on the way to Level 8, so get Level 7 right before chasing it. Background agents with a solid context layer give you most of the leverage. Agent-to-agent coordination adds complexity that very few teams are positioned to manage.

AI native

Agents claim work, hand off, and resolve conflicts without a human routing the messages.

At Level 8, the AI-native team has accepted a hard truth: when agents coordinate directly, the context engine is the product. Whatever the agents see, they must all see the same way. Whatever conflicts exist must be resolved before agents act on them. Whatever the team has decided in the past must be available to every agent making a related decision now. The context engine isn't infrastructure here. It's the operating system.

Tactics

Don't skip Level 7. Level 8 without solid background agent infrastructure is a demo, not a workflow.

Define agent-to-agent protocols. How do agents claim work? How do they hand off? How do they resolve conflicts? Without this, "agent teams" is just "agents in the same room."
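The smallest useful protocol answers the first two questions: who owns a task, and how does ownership move. Below is an in-process sketch using a lock. The `TaskBoard` class is hypothetical, and agents running in separate processes or machines would need the same compare-and-set semantics from a shared store instead.

```python
import threading
from typing import Optional


class TaskBoard:
    """Tracks which agent owns each task; claims and hand-offs are atomic."""

    def __init__(self, task_ids: list[str]) -> None:
        self._lock = threading.Lock()
        self._owner: dict[str, Optional[str]] = {task_id: None for task_id in task_ids}

    def claim(self, task_id: str, agent: str) -> bool:
        """Claim an unowned task; return False if another agent already holds it."""
        with self._lock:
            if self._owner[task_id] is not None:
                return False
            self._owner[task_id] = agent
            return True

    def hand_off(self, task_id: str, from_agent: str, to_agent: str) -> bool:
        """Transfer ownership; only the current owner may hand a task off."""
        with self._lock:
            if self._owner[task_id] != from_agent:
                return False
            self._owner[task_id] = to_agent
            return True

    def owner(self, task_id: str) -> Optional[str]:
        """Return the current owner, or None if the task is unclaimed."""
        with self._lock:
            return self._owner[task_id]
```

Because `claim` checks and sets under one lock, two agents can never both believe they own the same task, which rules out the "overlapping work" failure at its source. Conflict resolution on the work itself still needs its own rule.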

Invest in shared memory. Agents that don't share context produce conflicting work. Agents that do share context can coordinate. The boundary is your context engine and handoff method.

Plan for the failure modes. Two agents working on overlapping code. An agent holding a stale view of a decision. A skill registry that drifts between agents. These aren't edge cases — they're the day-to-day at Level 8.

Exit criteria

There's no Level 9 yet. The criterion is staying credible — keeping the harness, the evals, and the context engine ahead of the agents you're running.
  • The context engine reflects new decisions within minutes, not days.
  • Eval coverage grows as agent capability grows.
  • Agent teams have an owner. Someone responsible for their performance, like an SRE for a service.
  • You're publishing what works. The state of the art at Level 8 is being defined right now.

Where most teams really are

If this guide feels mostly aspirational, that's because it is. The honest distribution today:

The real distribution, mid-2026

Estimated based on what teams report doing vs. what they claim. The ones at Level 7 don't talk about it; the ones who claim to be are usually one outage away from being wrong.

  • L1: 2%
  • L2: 38%
  • L3: 29%
  • L4: 18%
  • L5: 8%
  • L6: 3%
  • L7: 1%
  • L8: <1%
  • Most teams: Level 2 or 3. Adopted an agent IDE, shoveling context manually or starting to write rules files.
  • Meaningful minority: Level 4 or 5. Running a codify loop, connecting MCP servers. Feeling the limits of curated context.
  • Small number: Level 6. Built harnesses with feedback loops. Honest about what they need to invest in.
  • Almost no one: Level 7 or 8. The ones who claim to be are usually one outage away from being wrong.

The teams that get this right pull ahead by an order of magnitude that isn't always visible from the outside. One Anthropic customer reported 13,000 custom AI solutions shipped across the org alongside engineering code shipping 30% faster, with over 500,000 hours saved (Anthropic, 2026 Agentic Coding Trends Report). That kind of compounding doesn't come from any single tool. It comes from getting the levels right in sequence.