How We Built a Watchdog with Prometheus and Node.js

The Slack Ping That Sparked It All

You know the one.

“Hey, is the app running slow for anyone else?”

Cue the detective scramble.
Frontend? Looks fine.
Backend? Maybe.
Database? Cue dramatic music.

These incidents weren’t just occasional—they were constant. And we were tired of guessing. Was the DB genuinely choking? Or was it a rogue feature triggering 10,000 tiny queries per second?

We needed visibility. We needed proof. And most importantly—we needed a solution that didn’t involve signing a check for yet another enterprise-grade monitoring suite.

The Idea: A Custom Informant That Always Tells the Truth

We didn’t need fancy. We needed functional.

The goal was simple:

Build a lightweight process that checks in on our database every few seconds, gathers performance vitals, and reports them in a way we could graph, understand, and act on.

So we turned to our trusty companion: Node.js.

We built a small, self-contained Node.js service—our internal “watchdog.” It didn’t touch production data. It just ran diagnostic queries like:

  • Total DB connections
  • Active vs. idle connections
  • Average query execution time
  • Longest running queries

Simple checks. But incredibly powerful when viewed over time.
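
To make that concrete, here's a minimal sketch of the collector, assuming PostgreSQL and the pg client. The post doesn't name the database, so the pg_stat_activity queries below are illustrative, not our actual code:

// watchdog.js - minimal sketch of the diagnostic collector (assumes PostgreSQL + pg)
const { Pool } = require('pg');

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

async function collectVitals() {
  // Total, active, and idle connections from pg_stat_activity
  const { rows: [conn] } = await pool.query(`
    SELECT count(*)                                 AS total,
           count(*) FILTER (WHERE state = 'active') AS active,
           count(*) FILTER (WHERE state = 'idle')   AS idle
      FROM pg_stat_activity
  `);

  // Age of the longest-running query currently executing, in seconds
  const { rows: [slow] } = await pool.query(`
    SELECT coalesce(max(extract(epoch FROM now() - query_start)), 0) AS seconds
      FROM pg_stat_activity
     WHERE state = 'active'
  `);

  return {
    total: Number(conn.total),
    active: Number(conn.active),
    idle: Number(conn.idle),
    longestQuerySeconds: Number(slow.seconds),
  };
}

module.exports = { collectVitals };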

Prometheus: Turning Stats into a Time Machine

Getting the vitals was half the battle. We needed to store them, track them, and query them across time.

Prometheus was perfect. It doesn’t just store metrics—it archives history. Every 15 seconds, it knocks on our Node.js endpoint, grabs the current state, and tucks it away with a timestamp.

To make Node.js speak Prometheus’ language, we used the prom-client library. It let us expose metrics in this simple text format:

# HELP db_active_connections Number of active DB connections
# TYPE db_active_connections gauge
db_active_connections 42
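
Wiring that up takes only a few lines of Node. Here's a rough sketch using prom-client and Express; the gauge names and the collectVitals() helper from the watchdog sketch above are assumptions, not the exact code we shipped:

// metrics.js - sketch of exposing the watchdog's vitals to Prometheus
const express = require('express');
const client = require('prom-client');
const { collectVitals } = require('./watchdog');

const registry = new client.Registry();

const activeConnections = new client.Gauge({
  name: 'db_active_connections',
  help: 'Number of active DB connections',
  registers: [registry],
});

const idleConnections = new client.Gauge({
  name: 'db_idle_connections',
  help: 'Number of idle DB connections',
  registers: [registry],
});

const app = express();

// Prometheus scrapes this endpoint; refresh the gauges on every scrape
app.get('/metrics', async (req, res) => {
  const vitals = await collectVitals();
  activeConnections.set(vitals.active);
  idleConnections.set(vitals.idle);

  res.set('Content-Type', registry.contentType);
  res.end(await registry.metrics());
});

app.listen(9400, () => console.log('watchdog metrics on :9400'));

Hit /metrics in a browser and you see exactly the text format above, one line per metric.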

From there, it was just:

  1. Collect →
  2. Expose →
  3. Scrape →
  4. Store.
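
On the Prometheus side, the scrape step boils down to one short job in prometheus.yml. Something like this, where the job name, target host, and port are assumptions for illustration:

# prometheus.yml (excerpt)
scrape_configs:
  - job_name: 'db-watchdog'
    scrape_interval: 15s
    static_configs:
      - targets: ['watchdog:9400']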

Now we had a full archive of our database’s health. And for the first time—we had visibility.

Grafana: Where the Real Magic Happened

Prometheus is the brain. Grafana is the face.

We hooked up our metrics and created a dashboard that would make any ops engineer proud:

  • A gauge showing real-time query latency
  • A line chart for idle connections over time
  • A histogram of slow query durations
  • Alerts triggered when connections exceeded thresholds (examples below)
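
Behind those panels and alerts sit plain PromQL expressions. A few illustrative ones, using the metric names from the collector sketch above; the histogram metric and the threshold value are assumptions:

# Idle connections over time (line chart)
db_idle_connections

# 95th-percentile query duration, if the collector exposes a histogram
histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m]))

# Alert condition: too many active connections
db_active_connections > 80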

The first win came quickly.
We saw a slow, steady increase in idle connections. Turns out, one of our new features wasn’t closing DB connections. We had a leak. Grafana spotted it days before the problem could crash our system.

Later, a latency spike in one graph aligned perfectly with a code deployment. A single poorly written query had slipped through. We knew exactly when it started, why, and how to roll it back—in minutes, not hours.

The Payoff: From Chaos to Clarity

Before this system, performance issues looked like:

“It’s slow… maybe… for some users?”

After?

“At 2:11 PM, query latency spiked to 580ms. Slow query logs confirm a new join-heavy SELECT deployed at 2:08 PM. Reverting now.”

The difference?
Data over guesswork.

We didn’t need a bloated observability platform. Just:

  • A few smart queries
  • A lean Node.js collector
  • Prometheus to store the data
  • Grafana to visualize the story

Together, they gave us confidence and clarity.
No more firefighting. No more finger-pointing. Just insight.


The Takeaway

Sometimes, all you need is a bit of glue and a few open-source tools to build something that just works. We didn’t reinvent monitoring. We made it ours. Lightweight, simple, and perfectly tuned to the real problems we faced.

And honestly? That tiny watchdog we built out of Node and Prometheus is one of the most powerful tools in our stack.
