
👀 Monitoring – Keep an Eye Before Things Explode

Author: Mohamed Alabbas
wd3bbas = Sudanese + Lazy**Smart

Why Monitoring Matters

Think of monitoring as your app’s personal bodyguard. 🕶️
Without it, you’re like someone driving from Khartoum to Port Sudan without checking the car’s water, oil, or tires. You’re just driving and hoping nothing breaks down along the way.

Monitoring keeps your app in shape in terms of performance, security, and scalability. And in today’s world, competition is ruthless. If your app lags, the customer taps “uninstall” faster than you can say logfile.

It’s like checking your car’s dashboard before a road trip. Ignore it, and you’ll end up on the roadside with no fuel or an overheated engine. 🚗🔥

What you need to monitor depends on your stack. For example, if you’re running a Java app, you’ll want to keep a close eye on memory usage, JVM behavior, garbage collection, and possible leaks. Otherwise, you’ll wake up to an app that can’t process requests.

CPU, memory, disk, and I/O are all connected. A single overloaded disk can bottleneck the entire system. Long-term usage data helps you predict growth and resize before things break.


Manual Monitoring Example

Here’s a quick DIY monitoring hack. Let’s say you’ve got 3 servers:

  • 192.168.0.10
  • 192.168.0.11
  • 192.168.0.12

We’ll create a script to SSH into each server and grab resource stats.

Step 1 – Create inventory file

cat > env <<'EOF'
192.168.0.10
192.168.0.11
192.168.0.12
EOF

Step 2 – Run monitoring script

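# Loop over every host in the inventory file and collect stats over SSH.
# The remote script is single-quoted so that $(...) and ${...} run on the server.
# Inside it, the awk programs are double-quoted, so their " and $ are backslash-escaped.
for i in $(cat env); do ssh user@"$i" '
CPU=$(LC_ALL=C top -bn1 | awk -F"[, ]+" "/Cpu/ {printf(\"%.1f\", 100 - \$8)}")
MEM=$(free | awk "/Mem:/ {printf(\"%.1f%%\", \$3/\$2*100)}")
ROOT=$(df -h / | awk "NR==2 {print \$5}")
echo "DateTime: $(date) | Host: $(hostname) | CPU: ${CPU}% | MEM: ${MEM} | Disk(/): ${ROOT}"
'; done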

Example Output

DateTime: Mon Aug 18 14:05:20 UTC 2025 | Host: GitLab   | CPU: 2.4% | MEM: 21.9% | Disk(/): 38%
DateTime: Mon Aug 18 14:05:21 UTC 2025 | Host: Personal | CPU: 0.0% | MEM: 3.3%  | Disk(/): 6%
DateTime: Mon Aug 18 14:05:21 UTC 2025 | Host: Docker   | CPU: 0.0% | MEM: 23.1% | Disk(/): 78%
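
Output in this format is easy to post-process. As a minimal sketch, the helpers below read lines like the ones above and flag any host whose root disk usage is over a limit. The function names `over_limit` and `alert_lines` are made up for this illustration, and the limit you pass in is your own choice.

```shell
# over_limit VALUE LIMIT
# Succeeds (exit status 0) when VALUE is greater than LIMIT.
# A trailing "%" on either argument is ignored. awk handles the decimal comparison.
over_limit() {
  awk -v v="${1%\%}" -v l="${2%\%}" 'BEGIN { exit !(v + 0 > l + 0) }'
}

# alert_lines LIMIT
# Reads monitoring lines on stdin and prints one ALERT line
# for every host whose Disk(/) value exceeds LIMIT.
alert_lines() {
  limit=$1
  while IFS= read -r line; do
    # Everything after "Disk(/): " is the usage value, e.g. "78%".
    disk=${line##*"Disk(/): "}
    # The host name is the first word after "Host: ".
    host=${line#*"Host: "}
    host=${host%% *}
    if over_limit "$disk" "$limit"; then
      echo "ALERT: $host disk usage $disk exceeds ${limit}%"
    fi
  done
}
```

For example, piping the monitoring loop into `alert_lines 75` would print an alert for the Docker host in the sample output (78%) and stay silent for the other two.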

Build some charts

Running this as a cron job and appending each run to a log file gives you trend data you can chart. For example, you might notice CPU spikes every morning at 8–9 AM due to the business rush hour. 📈
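
Here is a rough sketch of how that could look. It assumes the monitoring loop is saved as `~/monitor.sh` and logs to `~/monitor.log`. Both paths, the 5-minute schedule, and the `cpu_by_hour` helper are placeholders for illustration, not part of the setup above. The helper averages the CPU column per host and hour of day, which is the kind of table you would feed into a chart.

```shell
# Hypothetical crontab entry (add it with `crontab -e`):
# run the monitoring script every 5 minutes and append its output to a log.
# */5 * * * * "$HOME/monitor.sh" >> "$HOME/monitor.log" 2>&1

# cpu_by_hour LOGFILE
# Prints the average CPU% per host and hour of day,
# assuming log lines in the "DateTime: ... | Host: ... | CPU: ...%" format.
cpu_by_hour() {
  awk -F' [|] ' '
    {
      # Field 1 looks like "DateTime: Mon Aug 18 14:05:20 UTC 2025";
      # the clock time is its 5th word, and the hour is the first 2 characters.
      split($1, dt, " ")
      hour = substr(dt[5], 1, 2)

      # Field 2 is "Host: NAME" (possibly padded), field 3 is "CPU: N%".
      host = $2; sub(/^Host: */, "", host); sub(/ *$/, "", host)
      cpu = $3;  sub(/^CPU: */, "", cpu);   sub(/%.*$/, "", cpu)

      key = host " " hour
      sum[key] += cpu
      n[key]++
    }
    END {
      for (k in sum) printf "%s:00 avg CPU %.1f%%\n", k, sum[k] / n[k]
    }
  ' "$1" | sort
}
```

Calling `cpu_by_hour ~/monitor.log` prints one line per host and hour. This assumes `date` prints in the same layout as the example output, so a different locale on the servers may need a different field index.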


Beyond Scripts – Free Tools to Try

Scripts are cool, but dashboards are cooler. Here are a couple of open-source monitoring tools worth exploring:

  • Beszel – Lightweight, agent-based monitoring (with container stats!). Simple to deploy on Docker.
  • checkmk – More advanced, enterprise-grade monitoring. Perfect for larger environments with many servers, apps, and services.

Why?

Real-time Alerts
Daily reports are fine, but what if CPU spikes at 2 AM? These tools can send instant alerts (for example by email or SMS, depending on the tool and how you configure it) so you can act before it becomes an outage.

Pretty Dashboards
Text logs are useful, but dashboards are better. Beszel and checkmk turn raw numbers into clear charts and graphs — like a car dashboard for your servers.

Historical Data
Scripts show you what is happening right now. These tools keep weeks or months of data, helping you spot trends, plan upgrades, and catch resource-hungry apps early.