Teach an AI to Clean Up Horrible Code

There are few things that make my palms sweat like getting a notification for a code review on a piece of legacy code I just had to touch.

You know the one. It’s that ancient, cursed file every developer avoids like the plague. In my case, it was a function with the charming name:
handle_user_permissions_and_report_generation_v2.

Yes, really.

This function was a masterclass in what I like to call “archaeological coding.” Layers upon layers of contributions from developers long gone, with commented-out blocks like fossils, and if-else branches that spiraled off into oblivion. Adding one new permission felt like brain surgery—with a butter knife.

During a particularly brutal review (where my changes worked but still drew remarks like, “Can we make this… easier to read?”), I hit a wall. Frustrated and tired, I asked myself:

What if I could get a robot to untangle this for me?

That’s when I fell down the rabbit hole of AI for code—specifically, a model called CodeBERT.

Meet CodeBERT

Let me be clear: I’m not an AI researcher. I’m just a developer who likes Python and hates headaches. So this wasn’t a rigorous research project. It was more like a weekend fueled by caffeine and desperation.

The big idea behind CodeBERT? It’s not just a text model. It doesn’t just understand words—it understands code. Pretrained on millions of functions and their documentation pulled from GitHub (the CodeSearchNet corpus), CodeBERT learns both code structure and the relationship between comments and logic.

Think of it as a multilingual translator—not from Spanish to English, but from “Clunky, Confusing Code” to “Readable, Elegant Code.”

The Plan: Train Good Taste

My goal? To teach this AI to have good taste in code. And the way to do that was to feed it examples. A lot of them.

Using Python scripts, I crawled GitHub for commits labeled “Refactor,” “Simplify,” or “Clean up.” These commits were golden—they were before-and-after snapshots of code improvements.

  • Before: Messy, overcomplicated logic.
  • After: Clean, thoughtful design.

Thousands of these became training material. Essentially, I was creating a digital mentor—a robot trained on the wisdom of open-source developers making their code better.
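The filtering step above can be sketched in a few lines. This is a simplified stand-in, not my actual crawler: the commit dictionaries and their field names (`message`, `before`, `after`) are invented to show the shape of the data, and in reality each side of a pair comes from the file contents on either side of the commit.

```python
import re

# Match the commit-message keywords used to find refactoring commits.
REFACTOR_PATTERN = re.compile(r"\b(refactor|simplify|clean\s*up)\b", re.IGNORECASE)

def collect_training_pairs(commits):
    """Return (before, after) code snapshots from refactoring commits."""
    pairs = []
    for commit in commits:
        if REFACTOR_PATTERN.search(commit["message"]):
            pairs.append((commit["before"], commit["after"]))
    return pairs

# Toy input: only the first commit is a refactor, so only it survives.
commits = [
    {"message": "Refactor: flatten nested ifs",
     "before": "if a:\n    if b:\n        do()",
     "after": "if a and b:\n    do()"},
    {"message": "Add logging", "before": "x = 1", "after": "x = 1  # log"},
]
training_pairs = collect_training_pairs(commits)
```

The keyword filter is crude (it happily matches "don't refactor this"), but over thousands of commits the noise washes out enough for fine-tuning.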

Facing the Monster

After a weekend of collecting data, fine-tuning, and making my laptop sound like a rocket about to launch, I had my model. Now, it was time to return to the beast:

handle_user_permissions_and_report_generation_v2

I copied the nastiest part of the function and fed it to my AI.

Did I expect magic? No.
Did I get magic? Also no.

But I did get something valuable: a suggestion—and a really good one.

Instead of a maze of nested if statements, it suggested using a dictionary to map user roles to permissions. It was a pattern I hadn’t even considered, but as soon as I saw it, it clicked.
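To make the suggestion concrete, here is a hypothetical before-and-after. The role names and permission sets are invented for illustration; the real function was far uglier than this.

```python
# Before: the nested-if style the original function used.
def get_permissions_nested(role):
    if role == "admin":
        return {"read", "write", "delete", "generate_report"}
    else:
        if role == "editor":
            return {"read", "write"}
        else:
            if role == "viewer":
                return {"read"}
            else:
                return set()

# After: a dictionary mapping user roles to permissions.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete", "generate_report"},
    "editor": {"read", "write"},
    "viewer": {"read"},
}

def get_permissions(role):
    # Adding a new role is now a one-line change to the mapping,
    # not another branch grafted onto the if-else tree.
    return ROLE_PERMISSIONS.get(role, set())
```

Same behavior, but the data (who gets what) is separated from the logic (look it up), which is exactly what made adding a new permission stop feeling like brain surgery.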

Not a Replacement—A Partner

This wasn’t some automated fix-all. I still had to implement and test everything. But this AI wasn’t replacing me—it was pair programming with me. A partner that had read millions of refactors and could spot a smarter path when I was too deep in the weeds.

That weekend project changed how I think about AI.

It’s not coming for our jobs. It’s coming for the worst parts of our jobs—the tedious, repetitive, spaghetti-code-deciphering parts. It’s a tool to help us level up, to recognize our own bad habits, and to spend more time building than unraveling.


Code Review: Round 2

The review for my AI-assisted refactor? Way smoother.

And honestly, it was one of the few times I walked away from a legacy code edit feeling like I actually made it better, instead of just duct-taping it to work another day.
