DevOps Lab: Run Your Own Monitoring Server
By Sudheer S
Many tools are available for monitoring and performance analysis on Linux systems. Popular options include:
- top - This is a command-line utility that shows real-time information about the processes running on a Linux system, such as their CPU and memory usage.
- htop - This is an interactive alternative to top (a separate program, not a newer version of it) with a more user-friendly interface and extra conveniences, such as scrolling through the full process list, a tree view, and sorting or killing processes without typing their PIDs.
- sar - This is a command-line utility that collects and displays performance metrics for a Linux system over time. It can be used to analyze CPU, memory, I/O, and network usage, as well as other metrics.
- iostat - This is a command-line utility that shows real-time information about I/O performance on a Linux system. It can be used to monitor the performance of disks and other storage devices.
- vmstat - This is a command-line utility that shows real-time information about various system resources, such as memory, CPU, and I/O. It can be used to monitor the overall health of a Linux system.
- netstat - This is a command-line utility that shows information about network connections on a Linux system. It can be used to monitor the status of network connections and to diagnose networking issues. On many modern distributions it ships in the net-tools package and has largely been superseded by ss.
There are also modern monitoring systems for Linux, such as Prometheus and Zabbix. Unlike the command-line utilities above, which mostly inspect a single host, these systems collect metrics from many hosts, store them centrally over time, and generate alerts when certain conditions are met.
Prometheus, Alertmanager, Grafana, and Node Exporter are open source programs that can be used together to monitor and analyze the performance of a Linux system. Prometheus is a monitoring and alerting tool that scrapes metrics from various sources over HTTP, stores them in a time-series database, and evaluates alerting rules against them. Alertmanager receives the alerts that Prometheus fires, then groups, deduplicates, and routes them as notifications to receivers such as email. Grafana is a visualization tool that can be used to create dashboards and graphs based on the metrics stored in Prometheus. Node Exporter is a Prometheus exporter that is installed on a Linux system to expose host metrics, such as CPU, memory, and network usage. Together, these tools provide a powerful and flexible solution for monitoring and analyzing the performance of a Linux system.
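To make the scrape step more concrete, here is a minimal Python sketch of reading the plain-text format that exporters such as Node Exporter serve for Prometheus to collect. The sample text is hand-written for illustration, and the values are invented. The metric names node_cpu_seconds_total and node_memory_MemAvailable_bytes are real Node Exporter metrics, but the parse_metrics helper is hypothetical and deliberately simplified: it ignores optional timestamps and escaped characters inside label values.

```python
import re

# Hand-written sample resembling exporter output; the values are invented.
SAMPLE = '''\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1200.5
node_cpu_seconds_total{cpu="0",mode="user"} 300.25
node_memory_MemAvailable_bytes 2.5e+09
'''

# metric_name, optional {labels}, then a numeric value (no timestamp support).
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith('#'):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

Each sample is a metric name, an optional set of labels, and a number. Prometheus stores one time series per unique name and label combination, which is why the idle and user CPU counters above become separate series.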
Steps
- Prepare a virtual machine guest on your laptop or desktop and install a Linux distribution on it. This guest will be the monitoring server.
- Create two more virtual machines, which will be the target servers you monitor.
- On the monitoring server, install Prometheus. Configure Prometheus to scrape metrics from the other two servers.
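- Install the Prometheus Node Exporter on the two target virtual machines. Expose host metrics such as CPU usage, memory usage, disk usage, and systemd units. The systemd collector is disabled by default and must be enabled with the --collector.systemd flag.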
- Install and configure Grafana on the monitoring server. Create a dashboard using the metrics collected.
- Install and configure Alertmanager on the monitoring server. Define an alerting rule in Prometheus and send alert notifications over email when the CPU usage metric breaches a threshold.
- Use a tool such as stress-ng to artificially increase CPU, memory, and disk usage on the target servers, and confirm that the dashboard and the alert respond.
- Optional: Expose the Prometheus, Grafana, and Alertmanager web interfaces behind a reverse proxy such as Nginx. On the reverse proxy, use a domain name with a TLS certificate.
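As a rough illustration of the scrape and alerting steps above, the following Python sketch renders a Prometheus configuration that scrapes Node Exporter on two targets, plus a rule file that fires when CPU usage stays above a threshold. Several things here are assumptions for the sake of the example: the IP addresses, the 80% threshold, the 15s scrape interval, the 5m windows, the cpu_alerts.yml file name, the HighCpuUsage alert name, and the function names. Ports 9100 (Node Exporter) and 9093 (Alertmanager) are the projects' defaults. The email receiver itself is configured separately in Alertmanager and is not shown.

```python
# Hypothetical addresses for the two monitored VMs; replace with your own.
TARGET_HOSTS = ["192.168.56.11", "192.168.56.12"]
NODE_EXPORTER_PORT = 9100   # Node Exporter's default listen port
CPU_THRESHOLD_PERCENT = 80  # arbitrary example threshold


def render_prometheus_config(hosts, port=NODE_EXPORTER_PORT):
    """Return prometheus.yml text that scrapes Node Exporter on each host."""
    targets = "\n".join(f'          - "{host}:{port}"' for host in hosts)
    return (
        "global:\n"
        "  scrape_interval: 15s\n"
        "rule_files:\n"
        '  - "cpu_alerts.yml"\n'
        "alerting:\n"
        "  alertmanagers:\n"
        "    - static_configs:\n"
        "        - targets:\n"
        '            - "localhost:9093"\n'  # Alertmanager on the same server
        "scrape_configs:\n"
        '  - job_name: "node"\n'
        "    static_configs:\n"
        "      - targets:\n"
        f"{targets}\n"
    )


def cpu_alert_expr(threshold):
    """PromQL: percent of CPU time not spent idle, averaged per instance."""
    return (
        "100 - (avg by (instance) "
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > '
        + str(threshold)
    )


def render_cpu_alert_rules(threshold=CPU_THRESHOLD_PERCENT):
    """Return rule-file text with a single CPU usage alert."""
    summary = (
        "CPU usage above " + str(threshold) + "% on {{ $labels.instance }}"
    )
    return (
        "groups:\n"
        "  - name: cpu\n"
        "    rules:\n"
        "      - alert: HighCpuUsage\n"
        "        expr: '" + cpu_alert_expr(threshold) + "'\n"
        "        for: 5m\n"
        "        labels:\n"
        "          severity: warning\n"
        "        annotations:\n"
        '          summary: "' + summary + '"\n'
    )


if __name__ == "__main__":
    print(render_prometheus_config(TARGET_HOSTS))
    print(render_cpu_alert_rules())
```

Running the script prints both files so you can review them before copying them to the monitoring server. Once Prometheus is installed, promtool check config and promtool check rules can validate the results.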
IaC It
As in all our DevOps labs, use IaC (infrastructure as code) tools such as Ansible to install and configure Prometheus, Node Exporter, Alertmanager, and Grafana.
Tech Chorus References
Resources
- Prometheus
- Node Exporter
- Grafana
- stress-ng, a tool to help conduct stress tests