Building AI Code Review with context as a first-class system
Brandon Waselnuk · January 21, 2026

Our customers kept telling us the same thing: code review was eating their week. As AI coding assistants became ubiquitous, engineers were spending more time reviewing, not less. Agent-generated code still needed human eyes, and the review bottleneck was becoming the critical path. As we dug in, it became clear that most AI code review tools were optimizing for speed and surface-level correctness, not understanding. That gap was exactly where reviews were breaking down.

We'd already built a context engine that could answer questions by synthesizing information across an organization's knowledge base - Slack conversations, documentation, pull request history, and source code. The question became: could we apply that same contextual intelligence to code review?

By December 2025, after weeks of testing with beta customers, we had our answer.

Unblocked reviews code the way a 20-year veteran engineer would, with full awareness of architectural decisions, team conventions, and the conversations that shaped the codebase.

Since launch, we've seen customers replace multiple code review tools with ours, consistently reporting higher signal-to-noise ratios and comments that actually get resolved.

The context problem in AI code review

Most code review tools start and end with the diff. They might expand to the file, or even the repository. But that's not how the best reviewers think.

When a senior engineer reviews code, they're drawing on years of institutional knowledge. They remember that Slack thread from six months ago about why the team chose this particular database pattern. They know that David on the platform team has strong opinions about error handling. They've internalized dozens of unwritten conventions that never made it into any style guide or lint rule.

We realized that the real problem was context.

One of our founding engineers put it this way: "You can think of a context engine as an omniscient mind that can see all the context at once. You're taking that talented engineer who's like a new employee and transforming them into an expert engineer that's been at your organization for 20 years."

Internally, we call this "decision-grade context": the ability to surface exactly the historical, architectural, and social information that would change how a human evaluates a piece of code.

Building on the context engine

The breakthrough came from recognizing that we'd already solved the hard problem. Our context engine had spent years learning how to retrieve, rank, and synthesize information across an organization's knowledge graph.

Code review was a natural extension.

Consider a common scenario: a PR that adds retry logic to an API client. The diff looks reasonable. But what the reviewer can't see is the Slack thread from four months ago where the platform team explained why they removed retry logic from this exact client - it was masking cascading failures during incidents.

Or the Linear issue where someone already attempted this fix and reverted it.

Or the RFC in Confluence that established the team's position on retry boundaries.

That's the context gap that burns teams. The code is correct in isolation but wrong in context, and the reviewer either doesn't know enough history to catch it or spends thirty minutes hunting through Slack and docs to confirm a hunch.

Unblocked closes that gap automatically. When it reviews a PR, it synthesizes information from across your organization's knowledge graph (tickets, conversations, documentation, previous reviews) and surfaces only what's relevant. Not a wall of links, but the specific context that would change how you evaluate the code.
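To make that concrete, here's a minimal sketch of what a retrieve-and-filter step like this could look like. Everything in it - the KnowledgeItem shape, the scores, the thresholds - is illustrative, not our actual API:

```python
# Illustrative only: a toy version of "surface only what's relevant".
# A real retriever would compute `relevance` from the diff against the
# org's knowledge graph; here it's just a given score.
from dataclasses import dataclass

@dataclass
class KnowledgeItem:
    source: str       # e.g. "slack", "linear", "confluence", "pr_review"
    text: str
    relevance: float  # similarity of this item to the diff under review

def decision_grade_context(items: list[KnowledgeItem],
                           threshold: float = 0.8,
                           limit: int = 3) -> list[KnowledgeItem]:
    """Keep only the few items likely to change how a reviewer
    evaluates the code - not a wall of links."""
    relevant = [i for i in items if i.relevance >= threshold]
    relevant.sort(key=lambda i: i.relevance, reverse=True)
    return relevant[:limit]

# The retry-logic scenario from above:
items = [
    KnowledgeItem("slack", "We removed retries from this client; they were masking cascading failures.", 0.93),
    KnowledgeItem("linear", "Reverted: an earlier attempt at this same retry fix.", 0.88),
    KnowledgeItem("confluence", "Quarterly roadmap overview.", 0.21),
]
for item in decision_grade_context(items):
    print(f"[{item.source}] {item.text}")
```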

That last piece - previous reviews - changed how we thought about the whole product. We built what we call "memories": the distilled best practices extracted from your organization's actual PR history.

When a reviewer leaves the same comment for the nth time about not using useEffect for a particular pattern, that becomes organizational knowledge that Unblocked can surface automatically.
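In sketch form, the promotion rule might look something like this. The Memory shape and the clustering shortcut are stand-ins, not our real schema:

```python
# Hypothetical sketch: promoting a recurring review comment to a "memory".
from collections import Counter
from dataclasses import dataclass

@dataclass
class Memory:
    rule: str
    occurrences: int  # how many times humans made this point in past reviews

def normalize(comment: str) -> str:
    # A real pipeline would cluster semantically similar comments;
    # lowercasing is just a stand-in for that step.
    return comment.strip().lower()

def extract_memories(past_review_comments: list[str],
                     min_occurrences: int = 3) -> list[Memory]:
    """A comment pattern becomes organizational knowledge once reviewers
    have made the same point enough times."""
    counts = Counter(normalize(c) for c in past_review_comments)
    return [Memory(rule, n) for rule, n in counts.items() if n >= min_occurrences]

comments = [
    "Don't use useEffect for derived state.",
    "don't use useEffect for derived state.",
    "Don't use useEffect for derived state.",
    "Missing null check here.",
]
print(extract_memories(comments))  # the useEffect rule qualifies; the one-off doesn't
```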

"Unblocked uses real application context to provide accurate and relevant feedback in code reviews, making it a strong starting point for other reviewers to initiate discussions. Compared with other AI tools, Unblocked is more concise, provides the right amount of feedback, and effectively highlights best practices."

Hector (Ivan) Cabrera — Senior Software Engineer, Drata

Learning from human biases

Early in development, Richie, who led the code review effort, noticed something unexpected. The model exhibited many of the same biases that plague human reviewers.

"Surprisingly, this is something I actually found happened in our LLM-based review," he explained. "I think it's because the models are trained on human reviews, human text, and so it's subject to some of the same biases."

One example: satisfaction of search. Human reviewers often find one bug, feel accomplished, and unconsciously reduce their vigilance for the rest of the review. The model did the same thing.

We also encountered what Richie calls "null finding anxiety" - when the model couldn't find anything wrong, it would manufacture increasingly obscure issues rather than return empty-handed. The first time we saw this, the model agonized turn after turn over a single-line change until it found something, anything, to report. It wasn’t a useful report.

The solution came from an unexpected place: reliability engineering.

Aviation uses checklists and copilot verification not because pilots are incompetent, but because systematic processes catch what intuition misses. We applied the same thinking by creating structured review passes, adversarial validation, and explicit bias awareness in our prompts.
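Here's a simplified sketch of that pipeline's shape. The pass names, prompts, and the VALID/INVALID protocol are illustrative, not our production prompts; `ask` stands in for any function that sends a prompt to a model and returns text:

```python
# Checklist-style review passes with explicit bias guards and an
# adversarial validation step. All prompts here are illustrative.
REVIEW_PASSES = [
    "Correctness: does the change do what the PR description claims?",
    "Context: does it contradict prior decisions, conventions, or memories?",
    "Safety: error handling, retries, resource cleanup, concurrency.",
]

BIAS_GUARD = (
    "Complete every checklist item even if you have already found an issue "
    "(avoid satisfaction of search). An empty finding list is a valid, good "
    "outcome; do not invent issues (avoid null-finding anxiety)."
)

def review(diff: str, context: str, ask) -> list[str]:
    findings: list[str] = []
    for pass_prompt in REVIEW_PASSES:  # systematic passes, like a pilot's checklist
        reply = ask(f"{BIAS_GUARD}\n\n{pass_prompt}\n\nContext:\n{context}\n\nDiff:\n{diff}")
        findings.extend(line for line in reply.splitlines() if line.strip())

    # Adversarial validation: a second prompt tries to knock each finding down.
    validated = []
    for finding in findings:
        verdict = ask(
            "Try to refute this review finding. End your reply with a single "
            f"word, VALID or INVALID:\n{finding}"
        )
        words = verdict.split()
        if words and words[-1].strip(".").upper() == "VALID":
            validated.append(finding)
    return validated
```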

How our PR Failure Agent informed Code Review

Before building code review, we released a tool we called the PR Failure Agent. It analyzes CI failures and connects them to the specific code changes that caused them. Users reacted to comments with GitHub emojis, giving us signal on what was working.
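Collecting that signal is simple, since GitHub's REST API exposes reactions on issue comments directly. A minimal sketch (the net-score tally is ours, and pagination is ignored):

```python
# Tally thumbs-up vs. thumbs-down reactions on a bot comment
# via GitHub's reactions API.
import requests

def reaction_signal(owner: str, repo: str, comment_id: int, token: str) -> int:
    """Net thumbs-up minus thumbs-down on a single PR/issue comment."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/issues/comments/{comment_id}/reactions",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    reactions = resp.json()  # each item has a "content" field like "+1" or "-1"
    ups = sum(1 for r in reactions if r["content"] == "+1")
    downs = sum(1 for r in reactions if r["content"] == "-1")
    return ups - downs
```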

Then something strange happened. People started downvoting comments that appeared entirely correct.

When we investigated, we learned two things that shaped our approach:

First, developers don't appreciate obvious feedback. Finding that a lint check failed or that changing a function to return 5 broke a test expecting 6 - these technically correct observations added no value. We were wasting their time.

Second, and more surprising: developers were deleting our comments. We assumed this meant they hated them. When we reached out, we got the opposite response: "I love Unblocked. These comments were great." The deletion was about appearances. They wanted to clean up before a senior engineer reviewed their PR.

This taught us that code review is really about moving the conversation forward. Every comment has a social cost, and that cost compounds in noisy environments where PRs already have multiple bots competing for attention.

What makes a review actually helpful

Based on everything we learned, we kept coming back to a few ideas. We'd rather miss a minor style issue than annoy a developer with noise. Our validation pipeline discards over 50% of issues found in the first pass. Every comment that survives should tell the developer exactly what's wrong and how to fix it.

The highest-value comments aren't about the code itself. They're about context the author couldn't have known. "Did you consider this conversation from three months ago about deprecating this approach?" That's the kind of insight that only comes from organizational memory. And even when we're wrong, the comment should be grounded in real citations and previous discussions. It should open a productive dialogue, not shut one down.
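Put in sketch form, the bar a finding has to clear before it becomes a comment might look like this. The Finding fields and the threshold are illustrative:

```python
# Toy gate: a finding only ships as a PR comment if it names the problem,
# proposes a fix, and is grounded in real citations. Fields are illustrative.
from dataclasses import dataclass, field

@dataclass
class Finding:
    problem: str
    suggested_fix: str = ""
    citations: list[str] = field(default_factory=list)  # Slack threads, PRs, docs
    confidence: float = 0.0

def should_comment(f: Finding, min_confidence: float = 0.7) -> bool:
    """Prefer silence over noise: missing a minor issue is cheaper than
    spending a developer's attention on a weak comment."""
    return bool(f.suggested_fix) and bool(f.citations) and f.confidence >= min_confidence
```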

We also eliminated social biases entirely. An intern's code gets the same review as a principal engineer's. We won't fixate on a color change while ignoring architectural issues. And because the bot doesn't have feelings, it can surface hard feedback that humans might sugarcoat.

One customer's feedback really drove this home for us:

"I've been ignoring AI code reviews because they were more noise than signal, but Unblocked code reviews have high signal, and contain context that only someone who has a full view of the codebase and relevant git history could provide. I added a TODO comment, not knowing how to address the bug but planning to come back to it later. Unblocked said, 'No silly, that's because when you converted your Objective-C code to Swift, you missed one line of code.'"

Lemuel Dulfo — Senior Software Developer (Mobile Team), Clio

Conflict resolution at scale

One of the hardest problems in context-aware review is conflicting information. What happens when Bob says one thing in Slack and Susan says another? When documentation describes an architecture that the code doesn't follow?

Our approach draws on multiple signals. We track who reviews whose code, who contributes to which areas, and who the organization treats as authoritative on specific topics. When there's conflict, we weight toward the domain experts.

More recent information generally wins, but not always: some architectural documents from years ago describe still-relevant constraints. We learned to distinguish between "old and stale" and "old and foundational." And at the end of the day, what's in main is what's real. If documentation says one thing and the code says another, we know which one actually runs in production.
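One illustrative way to fold those signals into a single weight (the formula and constants are a sketch, not our actual model):

```python
# Sketch: weight a piece of context by authority, recency, and whether
# it agrees with what's actually in main. Constants are illustrative.
import math

def context_weight(authority: float, age_days: float,
                   is_foundational: bool, matches_main: bool,
                   half_life_days: float = 180.0) -> float:
    """authority: how authoritative the org treats this source (0..1).
    Recency decays unless the document is foundational, and agreement
    with main trumps everything else."""
    recency = 1.0 if is_foundational else math.exp(-age_days / half_life_days)
    weight = authority * recency
    if matches_main:  # what's in main is what's real
        weight += 1.0
    return weight

# A recent Slack message vs. a years-old RFC the code still follows:
slack_msg = context_weight(authority=0.6, age_days=14, is_foundational=False, matches_main=False)
old_rfc = context_weight(authority=0.8, age_days=900, is_foundational=True, matches_main=True)
print(f"slack={slack_msg:.2f} rfc={old_rfc:.2f}")  # the foundational, code-backed RFC wins
```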

This isn't fully solved, but the combination of signals gets us surprisingly far.

What's next

We're still figuring out a lot. The memories feature currently focuses on PR history, but we're expanding it to capture patterns from incident responses, architectural discussions, and onboarding documentation. Some of this works well already. Some of it is harder than we expected.

We're also exploring deeper integration with the development lifecycle, catching issues not at PR time but while code is being written, and surfacing relevant context directly in the IDE before the first commit.

The goal is simple: take the expert engineer who's been at your company for decades and make that knowledge accessible to everyone at any moment in the SDLC. Code review felt like the right problem to solve next.

---

Unblocked Code Review is available now. Read our docs or reach out to see it in action on your codebase.

