System monitoring is a broad topic that spans many different areas. A complete monitoring system is complex and is usually made up of multiple components working together. We are not going to develop a complete, self-sufficient system here, but we'll look into two important areas of a typical monitoring system: information gathering and representation. In this chapter we'll implement a system that queries devices using the SNMP protocol and stores the data using the RRDTool library, which is also used to generate the graphs for visual data representation. All this is tied together into simple web pages using the Jinja2 templating library. We'll look at each of these components in more detail as we go through the chapter.
Before we start designing our application we need to come up with some requirements for our system. First of all we need to understand the functionality we expect our system to provide. This will help us to create an effective (and we hope easy-to-implement) system design. In this chapter we are going to create a system that monitors network-attached devices, such as network switches and routers, using the SNMP protocol. So the first requirement is that the system needs to be able to query any device using SNMP.
The information gathered from the devices needs to be stored for future reference and analysis. Let's make some assumptions about the use of this information. First, we don't need to store it indefinitely. (I'll talk more about permanent information storage in Chapters 9-11.) This means that the information is stored only for a predefined period of time, and once it becomes obsolete it will be erased. This defines our second requirement: the information needs to be deleted after it has "expired."
Second, the information needs to be stored so that graphs can be produced. We are not going to use it for anything else, and therefore the data store should be optimized for the data representation tasks.
Finally, we need to generate the graphs and represent this information on easily accessible web pages. The information needs to be structured by the device names only. For example, if we are monitoring several devices for CPU and network interface utilization, this information needs to be presented on a single page. We don't need to present this information on multiple time scales; by default the graphs should show the performance indicators for the last 24 hours.
Now that we have some ideas about the functionality of our system, let's create a simple design, which we'll use as a guide in the development phase. The basic approach is that each of the requirements we specified earlier should be covered by one or more design decisions.
The first requirement is that we need to monitor the network-attached devices, and we need to do so using the SNMP protocol. This means that we have to use an appropriate Python library that deals with SNMP objects. An SNMP module is not included in the default Python installation, so we'll have to use one of the external modules. I recommend using the PySNMP library (available at http://pysnmp.sourceforge.net/), which is readily available on most of the popular Linux distributions.
The perfect candidate for the data store engine is RRDTool (available at http://oss.oetiker.ch/rrdtool/index.en.html). A Round Robin Database is structured in such a way that each "table" has a limited length, and once the limit is reached, the oldest entries are dropped. Strictly speaking, they are not deleted; the new entries are simply written over them.
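To make the round-robin idea concrete, here is a minimal sketch in plain Python. It is not RRDTool and does not use its API; the class name RoundRobinArchive and its methods are hypothetical, chosen only to show how a fixed-length store overwrites its oldest slot once it is full.

```python
class RoundRobinArchive:
    """Illustrative fixed-size store; NOT the RRDTool implementation."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size  # preallocated, never grows
        self.next_slot = 0          # index the next write goes to
        self.count = 0              # number of slots holding real data

    def update(self, timestamp, value):
        # Once the store is full, this write lands on the oldest entry.
        self.slots[self.next_slot] = (timestamp, value)
        self.next_slot = (self.next_slot + 1) % self.size
        self.count = min(self.count + 1, self.size)

    def fetch(self):
        """Return the stored samples ordered from oldest to newest."""
        if self.count < self.size:
            return self.slots[:self.count]
        return self.slots[self.next_slot:] + self.slots[:self.next_slot]


# Five samples taken 300 seconds apart, written to a three-slot archive.
archive = RoundRobinArchive(size=3)
for step, value in enumerate([10, 20, 30, 40, 50]):
    archive.update(step * 300, value)

samples = archive.fetch()
print(samples)
```

Only the three most recent samples survive, and the storage never grows beyond the three slots allocated at creation. This is what satisfies the expiry requirement without any explicit cleanup job.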
The RRDTool library provides two distinct functionalities: the database service and the graph-generation toolkit. There is no native support for RRD databases in Python, but there is an external library available that provides an interface to the RRDTool library.
Finally, to generate the web page we will use the Jinja2 templating library (available at http://jinja.pocoo.org/2/), which lets us create sophisticated templates and decouple the design and development tasks.
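As a rough illustration of that decoupling, the sketch below renders a tiny device page from an inline Jinja2 template. It assumes the jinja2 package is installed; the device name, graph titles, and image file names are made up for the example and are not the ones used later in the chapter.

```python
from jinja2 import Template

# The markup lives in the template; the code only supplies the data.
PAGE_TEMPLATE = Template(
    "<h1>{{ device }}</h1>\n"
    "{% for graph in graphs %}"
    '<img src="{{ graph.file }}" alt="{{ graph.title }}">\n'
    "{% endfor %}"
)

# Hypothetical device and graph file names, for illustration only.
html = PAGE_TEMPLATE.render(
    device="router-core",
    graphs=[
        {"title": "CPU utilization", "file": "router-core-cpu.png"},
        {"title": "Interface traffic", "file": "router-core-if.png"},
    ],
)
print(html)
```

In a real application the template would normally be kept in its own file, so that the page layout can change without touching the Python code.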
We are going to use a simple Windows INI-style configuration file to store the information about the devices we will be monitoring. This information will include details such as the device address, SNMP object reference, and access control details.
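A configuration file of this kind might look like the sample embedded in the sketch below, which reads it with the configparser module from the Python 3 standard library. The section names, option names (address, oid, community), addresses, and community string are all assumptions made for illustration; the actual layout is defined later in the chapter.

```python
import configparser

# Hypothetical sample: one section per monitored device.
SAMPLE_CONFIG = """
[router-core]
address = 192.0.2.1
oid = 1.3.6.1.2.1.2.2.1.10.1
community = public

[switch-floor2]
address = 192.0.2.2
oid = 1.3.6.1.2.1.2.2.1.16.1
community = public
"""


def load_devices(text):
    """Parse INI text into a {device_name: {option: value}} dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


devices = load_devices(SAMPLE_CONFIG)
for name, settings in devices.items():
    print(name, settings["address"], settings["oid"])
```

The two object identifiers used here are the standard IF-MIB counters for inbound and outbound octets on interface 1. Reading from a file instead of a string only requires replacing read_string() with read() and a file name.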
The application will be split into two parts: the first part is the information-gathering tool that queries all configured devices and stores the data in the RRDTool database, and the second part is the report generator, which generates the web site structure along with all required images. Both components will be run from cron, the standard UNIX scheduler. These two scripts will be named snmp-manager.py and snmp-pages.py respectively.