Keeping an eye on your IT infrastructure is essential to keeping your mission-critical applications up and running. To help with this, there are programs that monitor servers remotely and act as a single pane of glass for viewing your full server environment. Not only do they track performance, they also alert you when something goes down. There are many such programs, but I’ll be focusing on three popular solutions that have fully featured free offerings: Zabbix, Nagios, and Cacti. All of these solutions can communicate with servers via SNMP.
Nagios offers a full monitoring system backed by reporting and alerting features that can notify IT admins around the clock when an outage is detected. It also tracks usage over time, so IT admins can use that data to plan for upcoming expansions and obtain the right amount of resources for the change, and it lets them schedule machines for maintenance. Since Nagios is an open-source project, many other developers build plug-ins for it, so functionality you need beyond what Nagios ships with on install is often available from the community.
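To give a sense of how small a community plug-in can be, here is a minimal sketch of a disk-usage check written in Python. It relies on the Nagios plug-in convention of signalling state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and printing one status line, with optional performance data after a pipe character. The function names, thresholds, and the checked path are illustrative assumptions, not something Nagios provides.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios-style disk usage check (names and thresholds are illustrative)."""
import shutil

# Exit codes defined by the Nagios plug-in convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a plug-in exit code."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def status_line(used_percent, warn=80.0, crit=90.0):
    """Build the single output line: readable text, then perfdata after '|'."""
    code = evaluate(used_percent, warn, crit)
    text = f"DISK {LABELS[code]} - {used_percent:.1f}% used"
    perfdata = f"used={used_percent:.1f}%;{warn:.0f};{crit:.0f}"
    return code, f"{text} | {perfdata}"


def main(path="/"):
    """Check one filesystem path (assumed here to be '/') and return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, line = status_line(100.0 * usage.used / usage.total)
    print(line)
    return code
```

A real plug-in would finish by passing the result of `main()` to `sys.exit()` so that Nagios sees the exit code. The thresholds would normally come from command-line arguments rather than defaults.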
Cacti is primarily focused on network monitoring, though it offers hardware monitoring functionality as well. Cacti can pull in data through custom scripts written by users and convert it into graphs, so that administrators can more easily understand the data coming at them. Cacti offers significant control from the command line, though it does feel a bit rougher around the edges than its counterparts.
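As a rough illustration of such a custom script, the sketch below gathers the system load averages and prints them in the space-separated `name:value` form that Cacti's script-based data input methods expect when a script returns more than one field. The field names (`load1`, `load5`, `load15`) are assumptions and would need to match the output fields you define in Cacti. Note that `os.getloadavg()` is only available on Unix-like systems.

```python
#!/usr/bin/env python3
"""Sketch of a Cacti data input script (field names are illustrative)."""
import os


def format_fields(fields):
    """Render a dict as 'name:value name:value', the multi-field output form."""
    return " ".join(f"{name}:{value}" for name, value in fields.items())


def collect_load():
    """Read the 1, 5, and 15 minute load averages (Unix-like systems only)."""
    one, five, fifteen = os.getloadavg()
    return {
        "load1": round(one, 2),
        "load5": round(five, 2),
        "load15": round(fifteen, 2),
    }


def main():
    """Print one line of output for the poller to parse."""
    print(format_fields(collect_load()))
```

Keeping the formatting separate from the data collection makes it easy to swap in a different source, such as a log file or an API call, without touching the output format.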
Zabbix, the monitoring tool used by my organization, is focused on monitoring via an agent installed on the servers being monitored, in addition to the port monitoring utilized by Cacti and Nagios, which don’t rely on agents by default. Since Zabbix uses an agent, it is able to dive deeper into the behavior of monitored machines, such as CPU performance or RAM usage, which is harder for the other tools to see without an agent. The solution can also function without agents if you are monitoring appliances that can’t have them installed (e.g., power systems), so it is flexible as well as powerful. Of course, in a scenario where agents aren't an option, Zabbix is more limited in what it can see.
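The agentless side of this comparison can be sketched in a few lines: a port check simply tries to open a TCP connection and reports whether it succeeded and how long it took. It can tell you a service is reachable, but nothing about CPU or memory on the host, which is the kind of detail an agent adds. The function name and the default timeout below are hypothetical and not taken from any of the three tools.

```python
import socket
import time


def check_port(host, port, timeout=3.0):
    """Return (is_open, seconds_elapsed) for a single TCP connection attempt."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            is_open = True
    except OSError:
        is_open = False
    return is_open, time.monotonic() - start
```

Calling `check_port` with a server's address and its SSH or HTTP port, for example, would return `True` plus the connection time when the service is listening, and `False` when the connection is refused or times out.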
While we use Zabbix, most organizations should have at least one monitoring solution like these in place (or another, such as Zenoss) to maintain environments that house mission-critical applications. Having any of these solutions in place is better than nothing at all, so if you feel more comfortable with Cacti than the others, for example, use Cacti. These monitoring tools let you know when a server goes down or starts misbehaving, as opposed to waiting to hear about it from a client. The ability to be proactive in solving IT problems is paramount to delivering quality services and applications over the internet today, and these monitoring tools, combined with highly available and redundant infrastructure, are key to making that happen.