Muhammet Şafak
Journal 6 min read

Asking a Tool to Review My Own Diff — Why I Built CommitBrief

One afternoon I ran headfirst into the limits of self-review, so I wrote a tool that audits my own diff right from the terminal: CommitBrief. A note on why I built it, how it works, and what problem it actually solves.


One afternoon last month, I was reading a PR I was about to open, top to bottom, for the third time. The diff was small: 80 lines added, 14 removed, four files. Even on the third read, nothing caught my eye. Two hours after I opened the PR, a teammate left a comment: “This error wrap is duplicated. %w is already there, but you also appended a text prefix, so the log currently reads ‘auth failed: auth failed:’.” The bug was sitting right where my eyes had landed, in a diff I had read three times.

What made this incident striking was how small it was. It wasn’t a careless read; it was three careful reads. The person who wrote the code couldn’t really read it, because they had already written it, already read it, and already knew what it was supposed to say.
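
The duplicated wrap described above is easy to reproduce. The sketch below is one plausible reconstruction, not the code from the actual PR: the function names, the sentinel error, and the "login handler" prefix are all hypothetical. It assumes the lower layer already adds the "auth failed" prefix, so adding it again at the call site doubles it.

```go
package main

import (
	"errors"
	"fmt"
)

// errTokenExpired is a hypothetical sentinel error used for illustration.
var errTokenExpired = errors.New("token expired")

// authenticate is a hypothetical lower layer that already adds the
// "auth failed" prefix when it wraps the underlying error.
func authenticate() error {
	return fmt.Errorf("auth failed: %w", errTokenExpired)
}

// handleDuplicated repeats the prefix that authenticate already added.
func handleDuplicated() error {
	if err := authenticate(); err != nil {
		return fmt.Errorf("auth failed: %w", err)
	}
	return nil
}

// handleFixed adds only the context that is new at this layer.
func handleFixed() error {
	if err := authenticate(); err != nil {
		return fmt.Errorf("login handler: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(handleDuplicated()) // auth failed: auth failed: token expired
	fmt.Println(handleFixed())      // login handler: auth failed: token expired
	// The wrap chain stays intact, so errors.Is still finds the sentinel.
	fmt.Println(errors.Is(handleFixed(), errTokenExpired)) // true
}
```

Both versions compile and both pass any test that only checks errors.Is, which is part of why this kind of bug survives a self-review.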

I Am My Own Code’s Worst Reader

There is something peculiar about self-review: when you read code you wrote, your mind sees the version you thought you wrote, not the version you actually wrote. It works much like anchoring bias: the memory of “I wrote this line for this reason” obscures what the line actually says. That’s why a developer reviewing their own diff alone tends to catch fewer bugs than a fresh pair of eyes would, no matter how experienced they are.

The practical solutions were always the same:

  • Wait. Give it a day, look at it tomorrow with fresh eyes. I usually can’t wait; the deploy is today.
  • Ask a teammate. Best option, but they’re not always available — and interrupting someone’s day for a minor typo doesn’t feel right.
  • Ask the AI assistant in the IDE. Copilot review, Cursor review, JetBrains AI — they all work, but they share a common problem: they think at the file level, not the change level. They don’t know my project’s rules. Behavior varies from IDE to IDE.
  • Copy-paste into ChatGPT. It works, but copying the diff to the clipboard every time, retyping the project context, and never caching any of it adds up to a lot of friction.

As I’ve written before, the place where these tools generate value is no longer “writing code” — it’s applying reasoning to the code you’ve already written. What I needed wasn’t “let AI write it” but “let AI be a second set of eyes” — and none of the existing tools did that without leaving the terminal, at the diff level, with my own rules.

So I wrote CommitBrief.

What It Does in Five Seconds

The moment I type commitbrief, here’s what happens: it takes my staged diff, runs it through a three-layer filter (built-in noise → .commitbriefignore → semantic rules in COMMITBRIEF.md), sends it to the LLM provider of my choice, processes the response against a fixed JSON schema, and prints colored cards to the terminal. Each finding includes: severity (critical/high/medium/low/info), file, line, title, description, and an optional code snippet.
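
The first two filter layers can be pictured as simple path matching. This is a minimal sketch, not CommitBrief's implementation: the built-in pattern list, the function names, and the assumption that ignore entries are plain globs matched against the file's base name are all hypothetical. The third layer (the rules in COMMITBRIEF.md) is applied by the model and is not shown.

```go
package main

import (
	"fmt"
	"path"
)

// builtinNoise is a hypothetical list of patterns for files that rarely
// deserve review, such as lockfiles and minified bundles.
var builtinNoise = []string{"*.lock", "*.min.js", "go.sum"}

// matchesAny reports whether the file's base name matches any glob pattern.
// Malformed patterns are skipped rather than treated as matches.
func matchesAny(patterns []string, file string) bool {
	base := path.Base(file)
	for _, p := range patterns {
		if ok, err := path.Match(p, base); err == nil && ok {
			return true
		}
	}
	return false
}

// filterFiles drops files matched by the built-in list or by the
// project's own ignore patterns and returns the rest in order.
func filterFiles(files, projectIgnore []string) []string {
	var kept []string
	for _, f := range files {
		if matchesAny(builtinNoise, f) || matchesAny(projectIgnore, f) {
			continue
		}
		kept = append(kept, f)
	}
	return kept
}

func main() {
	files := []string{"internal/api/handler.go", "go.sum", "web/app.min.js", "docs/notes.md"}
	// "*.md" stands in for an entry read from a project ignore file.
	fmt.Println(filterFiles(files, []string{"*.md"})) // [internal/api/handler.go]
}
```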

$ commitbrief --staged
commitbrief v1.0.0 · provider: anthropic/claude-sonnet-4-6 · cache: miss
analyzing 3 files · 42 added · 11 removed

┌─ HIGH ─ internal/api/handler.go:201 ──────────────────────────┐
│ Wrapped error duplicated in message                           │
│ "%w" is already present; the prefix repeats the wrapped       │
│ error, producing "auth failed: auth failed: …" in the log.   │
└───────────────────────────────────────────────────────────────┘

✓ Done in 4.2s · 1 finding · 8,421 tokens · Cost: $0.0319
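
A fixed schema is what makes cards like the one above possible: the response is decoded into typed values and rejected if it does not fit. The sketch below assumes the response is a JSON array and that the keys match the field names listed earlier; the real schema and key names may differ, and `parseFindings` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding mirrors the fields listed in the post. The JSON key names are
// assumptions made for this sketch.
type Finding struct {
	Severity    string `json:"severity"`
	File        string `json:"file"`
	Line        int    `json:"line"`
	Title       string `json:"title"`
	Description string `json:"description"`
	Snippet     string `json:"snippet,omitempty"` // optional
}

// validSeverity holds the five levels named in the post.
var validSeverity = map[string]bool{
	"critical": true, "high": true, "medium": true, "low": true, "info": true,
}

// parseFindings decodes a provider response and rejects unknown severities.
func parseFindings(raw []byte) ([]Finding, error) {
	var findings []Finding
	if err := json.Unmarshal(raw, &findings); err != nil {
		return nil, fmt.Errorf("decode findings: %w", err)
	}
	for i, f := range findings {
		if !validSeverity[f.Severity] {
			return nil, fmt.Errorf("finding %d: unknown severity %q", i, f.Severity)
		}
	}
	return findings, nil
}

func main() {
	raw := []byte(`[{"severity":"high","file":"internal/api/handler.go","line":201,
		"title":"Wrapped error duplicated in message",
		"description":"The prefix repeats the wrapped error."}]`)
	findings, err := parseFindings(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range findings {
		fmt.Printf("%s %s:%d %s\n", f.Severity, f.File, f.Line, f.Title)
	}
}
```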

If I run the same diff again, it takes 13 microseconds instead of four seconds: the local cache kicks in and the footer shows Saved: $0.0319. As long as the diff hasn’t changed, the rerun is free.
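
One common way to get this behavior is a content-addressed cache: hash everything that influences the review and use the digest as the key. The sketch below is an assumption about the general technique, not a description of CommitBrief's cache; the choice of key inputs, the in-memory map, and all names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey hashes every input that could change the review, so editing
// the diff, the rules, or the provider yields a different key.
func cacheKey(provider, model, rules, diff string) string {
	h := sha256.New()
	for _, part := range []string{provider, model, rules, diff} {
		// Length-prefix each part so ("ab", "c") and ("a", "bc") differ.
		fmt.Fprintf(h, "%d:%s", len(part), part)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// reviewCache is an in-memory stand-in for an on-disk cache.
type reviewCache map[string]string

// review returns the cached result when present and calls run otherwise.
func (c reviewCache) review(key string, run func() string) (result string, hit bool) {
	if cached, ok := c[key]; ok {
		return cached, true
	}
	result = run()
	c[key] = result
	return result, false
}

func main() {
	cache := reviewCache{}
	key := cacheKey("anthropic", "claude-sonnet-4-6", "no magic numbers", "+added line\n")

	calls := 0
	run := func() string { calls++; return "1 finding" } // stands in for the provider call

	_, firstHit := cache.review(key, run)
	_, secondHit := cache.review(key, run)
	fmt.Println(firstHit, secondHit, calls) // false true 1
}
```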

A quick design summary:

  • Provider-agnostic. Anthropic, OpenAI, Gemini, Ollama (local). On top of that, it can use your existing Claude Code / Gemini CLI subscription via claude-cli and gemini-cli — no extra API key needed.
  • Local. Both the diff and the review output stay on your machine. The only outbound request goes to your chosen provider.
  • Project-specific rules. COMMITBRIEF.md lives in the project root and is sent as the system prompt. “No magic numbers in this project”, “don’t wrap errors outside of context”, “don’t import the logger directly” — you write the rules.
  • Pre-send safety. If the diff contains credential-shaped substrings, it warns you before sending to the provider. --allow-secrets is required to bypass.
  • CI gate. commitbrief --fail-on=critical exits with code 1 if there are any critical findings — drop it in as a pre-push hook or a CI step.
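
Two of the bullets above, the pre-send safety check and the CI gate, reduce to small pure functions. This is a sketch under stated assumptions: the two credential patterns are well-known shapes chosen for illustration, not CommitBrief's actual list, and treating the threshold as "this severity or higher" is an assumption (the post only describes the critical case). All names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// credentialPatterns holds two well-known credential shapes for illustration.
var credentialPatterns = []*regexp.Regexp{
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),                   // AWS access key ID
	regexp.MustCompile(`-----BEGIN [A-Z ]*PRIVATE KEY-----`), // PEM private key header
}

// containsCredential reports whether the diff has a credential-shaped substring.
func containsCredential(diff string) bool {
	for _, re := range credentialPatterns {
		if re.MatchString(diff) {
			return true
		}
	}
	return false
}

// severityRank orders the five levels from the post; higher is more severe.
var severityRank = map[string]int{
	"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4,
}

// exitCode returns 1 when any finding is at or above the failOn threshold.
func exitCode(severities []string, failOn string) int {
	threshold, ok := severityRank[failOn]
	if !ok {
		return 0 // unknown threshold: the gate is disabled in this sketch
	}
	for _, s := range severities {
		if rank, known := severityRank[s]; known && rank >= threshold {
			return 1
		}
	}
	return 0
}

func main() {
	fmt.Println(containsCredential(`+key := "AKIAIOSFODNN7EXAMPLE"`)) // true
	fmt.Println(containsCredential("+x := 42"))                       // false
	fmt.Println(exitCode([]string{"high", "info"}, "critical"))       // 0
	fmt.Println(exitCode([]string{"critical"}, "critical"))           // 1
}
```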

Single-file Go binary; install via brew install CommitBrief/tap/commitbrief, scoop install commitbrief, or go install github.com/CommitBrief/commitbrief/cmd/commitbrief@latest.

The Core Insight: The Right Unit Is “Change,” Not “File”

In the early weeks I was thinking at the file level: “let me have AI read this file.” That turned out to be the wrong abstraction. The real unit of a review isn’t a file; it’s a change. A human reviewer may not have looked at that file in three years; what they will look at is the 12 lines in the diff. You need to show the AI the same thing.

This small shift in perspective solves half the problem: token budget drops, noise drops, severity assignments become meaningful — because you’re no longer asking “is this function well-written” but “does this change introduce a problem”.
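
To make the "change, not file" unit concrete, here is a minimal sketch that measures a change the way a summary line like "42 added · 11 removed" does. It assumes a git-style unified diff (each file section starts with a "diff --git" line) and is not taken from CommitBrief; `diffStats` is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// diffStats counts added and removed lines in a git-style unified diff.
// Only lines inside a hunk are counted, so the "---" and "+++" file
// headers are never mistaken for changes.
func diffStats(diff string) (added, removed int) {
	inHunk := false
	sc := bufio.NewScanner(strings.NewReader(diff))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.HasPrefix(line, "diff --git "):
			inHunk = false // a new file section begins with headers
		case strings.HasPrefix(line, "@@"):
			inHunk = true // hunk header: change lines follow
		case inHunk && strings.HasPrefix(line, "+"):
			added++
		case inHunk && strings.HasPrefix(line, "-"):
			removed++
		}
	}
	return added, removed
}

func main() {
	diff := strings.Join([]string{
		"diff --git a/handler.go b/handler.go",
		"--- a/handler.go",
		"+++ b/handler.go",
		"@@ -1,3 +1,3 @@",
		" func handle() error {",
		"-\treturn fmt.Errorf(\"auth failed: %w\", err)",
		"+\treturn fmt.Errorf(\"login handler: %w\", err)",
		" }",
	}, "\n")
	added, removed := diffStats(diff)
	fmt.Println(added, removed) // 1 1
}
```

However long handler.go is, the model is asked about two lines plus their context, which is where the drop in tokens and noise comes from.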

The other half is flow: having the tool in the same place as you — the terminal — turns the act of “let me ask a tool” into a one-second reflex. Opening a browser tab, copy-pasting, then switching back — all of that is context-switch overhead. In a developer’s day, that cost grows not by how many seconds it takes but by how many times it happens.

Limitations

CommitBrief doesn’t question architectural decisions: it won’t ask whether you should build this feature at all. It doesn’t know intent, so it can’t tell you whether you are building the right thing. It can produce false positives; in my workflow these tend to land at the “info” level and are easy to ignore, but they’re there.

It’s important to position this tool correctly: not a replacement for lint, but a layer on top of lint + tests. “A second set of eyes on top of my own, before a human reviewer ever sees my PR” — that’s where it’s useful, and that’s where it stops.

Try It

How much it has sped up my process in concrete numbers, which finding patterns have repeated over three months, how far I trust it and where I stay cautious — I’ll cover all of that in a follow-up post. For now, the gain is clear to me: if producing is getting cheaper, auditing what you produce shouldn’t stay expensive.
