Alexandre Gaubert
October 1, 2025 · 5 min read

AI Won't Replace Code Reviews, But It Can Fix Them

Code reviews often fail not because the code is wrong, but because no one knows why it was written that way. This post explores how AI-generated comments can add missing intent to pull requests—making both human and AI reviews smarter, faster, and more effective.

Most code reviews don’t fail because the code is wrong: they fail because no one knows why the code was written that way. This week, I saw something new: Copilot left a brilliant review comment, but only because Claude had already documented the code’s intent.

AI helping AI, and suddenly, reviews were better than ever.

The Problem: Missing Context in Reviews

Every engineer has felt it: you open a pull request, skim through hundreds of lines, and… you’re lost. The logic might be fine, but the intent is invisible. Why is this loop written that way? Why not just call an existing helper? Is this defensive code, or a bug?

Humans are bad at documenting intent. We know we should add comments, but most of us don’t. The result is that reviewers (human or AI) end up blind. They comment on syntax, formatting, or trivial edge cases—while the deeper design decisions go unchallenged.

The Experiment: Claude Writes, Copilot Reviews

This week, I had Claude generate some code for a feature. Claude, unlike me on a Friday afternoon, is very good at adding inline comments. Not just “this is a loop,” but actual reasoning: why it was doing something in a specific way.

At the same time, we’ve enabled GitHub Copilot’s automatic PR reviews. It scans every pull request and leaves suggestions, much like a teammate who never sleeps.

And here’s where it got interesting: Copilot flagged an issue in my PR. But reading the comment carefully, I realized it wasn’t actually Copilot being smart—it was Copilot using Claude’s comment as context to make a smart suggestion.

The Surprise: AI Helping AI

Here’s what happened:

[Screenshot: Copilot’s review comment questioning whether `.first()` was a real fix, citing Claude’s “this was for flakiness” note]

🤯

That review wouldn’t have happened if Claude hadn’t left the initial comment. Copilot needed that “this was for flakiness” explanation to challenge whether .first() was the correct fix or just a workaround. The AI review became smarter, not because Copilot understood the code in isolation, but because it had context.
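To make the pattern concrete, here is a hypothetical sketch (the function and element names are invented; the actual PR involved a `.first()` locator call in a UI test) of what an AI-written intent comment looks like next to the code it explains:

```python
def visible_submit_button(buttons):
    """Return the submit button the test should click.

    Intent (the kind of note Claude left): several buttons can match
    while the page re-renders, which made this test flaky. Taking the
    first match sidesteps the race rather than fixing its root cause,
    which is exactly the trade-off a reviewer should question.
    """
    matches = [b for b in buttons if b.get("label") == "Submit"]
    return matches[0] if matches else None
```

With that intent written down, a reviewer, human or Copilot, can ask the right question: is taking the first match a fix, or a workaround for an unstable selector?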

This is the missing ingredient in most reviews. Without intent, reviewers can only guess. With intent, even if it’s provided by another AI, the quality of the feedback skyrockets.

The Takeaway: Let AI Do the Documenting

So here’s the opinionated stance:

  • Humans won’t document intent.
  • AI can, and should.
  • Commit the context, not just the code.
  • Reviews—human and AI—get instantly smarter.

Hear me out: if humans won’t document their code (and let’s be honest, most won’t), then we should let AI do it. And not just as a ghostwriting tool whose output we copy and paste locally: AI-generated comments should be committed with the code.

Why?

  • They make AI reviewers smarter (as we just saw).
  • They also make human reviewers smarter.
  • They create a lightweight paper trail of reasoning for future maintainers.

This isn’t about replacing reviewers. It’s about giving them the context they need to do the job we’re asking them to do.

But What About Hallucinations?

Of course, there are risks: comments can rot, AI might hallucinate intent, etc. But here’s the thing: the trade-off still improves reviews.

In fact, even when AI does hallucinate, at least it gets your attention. 😅 A slightly-wrong comment is often better than no comment at all: it forces the reviewer to pause, double-check, and clarify intent.

A hallucinated comment might be annoying, but it still sparks the exact conversation a good review should have (I know, I know, it brings discomfort just thinking about that).

Lessons Learned

  1. AI is surprisingly slick at capturing intent (the very thing humans often overlook).
  2. Reviews, whether by human or machine, are only as good as the context provided.
  3. We should normalize pushing AI-written comments along with our code.

The more I think about it, the less this feels like a hack and the more it feels like the future of code review. Just as CI systems run automated tests on every PR, we should run an AI that auto-documents every PR. That context makes every other tool—from Copilot reviews to human eyes—much more effective.
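As a sketch of what such a CI gate could check, here is a deliberately simplified heuristic I am inventing for illustration: it treats a docstring as a stand-in for an intent comment and flags functions that ship without one.

```python
import ast

def functions_missing_intent(source: str) -> list[str]:
    """Flag functions that carry no written intent.

    Simplified heuristic: a docstring stands in for an intent comment.
    A real gate would also accept leading '#' comments, or ask an AI
    to draft the missing note instead of just failing the build.
    """
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None
    ]
```

Run against a diff in CI, anything this returns is a candidate for auto-documentation before the PR reaches a reviewer.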

Final Reflection

If reviewers can’t see your intent, their feedback will always fall short. AI won’t replace reviews, but it can finally give them the missing context.

Consider this: today, no one ships code without automated tests. Tomorrow, nobody should ship code without AI-written intent annotations. Context will be as essential as tests, because without it, every review is flying blind.

