Zabbix Network Monitoring (Second Edition)
Using Net-SNMP

If you installed Zabbix from the distribution packages, SNMP support should be already included. If you compiled Zabbix from source, it should still have SNMP support, as we included that in the configure flags. All that's left to do is set up SNMP monitoring configuration. Before we do that, we'll need a device that has an SNMP agent installed. This is where you can choose between various options—you can use any networked device that you have access to, such as a manageable switch, network printer, or a UPS with an SNMP interface. As SNMP agents usually listen on port 161, you will need the ability to connect to such a device on this port over User Datagram Protocol (UDP). Although TCP is also supported, UDP is much more widely used.

If you don't have access to such a device, you could also start up an SNMP daemon on a computer. For example, you could easily use Another host as a test bed for SNMP querying. Many distributions ship with the SNMP daemon from the Net-SNMP package, and often it is enough to simply start the snmpd service. If that's not the case for your chosen distribution, you'll either have to find one of those networked devices with an SNMP agent already available or configure snmpd manually.

For testing, it may be enough to have a line like the following in /etc/snmp/snmpd.conf:

rocommunity public

This allows full read access to anybody who uses the public community string.

Tip

Do not use such a configuration in production.
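If the test daemon has to stay reachable on a shared network, one way to narrow the exposure is to limit the community to a single source address. The following is a minimal sketch that only writes the candidate line to a temporary file for review. The address 192.0.2.10 is a placeholder for the querying machine, and the variable names are made up for this example.

```shell
# Placeholder address of the querying machine (for example, the Zabbix server).
zabbix_server=192.0.2.10

# Write the candidate configuration to a temporary file instead of
# touching /etc/snmp/snmpd.conf directly.
conf=$(mktemp)
cat > "$conf" <<EOF
# Read-only access with the "public" community, from one address only.
rocommunity public $zabbix_server
EOF

cat "$conf"
```

After reviewing it, the rocommunity line can replace the unrestricted one in /etc/snmp/snmpd.conf, followed by a restart of snmpd.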

Whichever way you choose, you will have to find out what data the device actually provides and how to get it. This is where Net-SNMP comes in, providing many useful tools to work with SNMP-enabled devices. We will use several of these tools to discover information that is required to configure SNMP items in Zabbix.

Let's start by verifying whether our SNMP device is reachable and responds to our queries.

While SNMPv3 has been the current version of SNMP since 2004, it is still not as widespread as SNMPv1 and SNMPv2. There are a whole lot of old devices in use that only support older protocol versions, and many vendors do not hurry with SNMPv3 implementations.

To complicate things further, SNMPv2 also isn't widely used. Instead, a variation of it, the community-based SNMPv2, or SNMPv2c, is used. While devices can support both v1 and v2c, some only support one of these. Both use so-called community authentication, where user authentication is performed based on a single community string. Therefore, to query a device, you would have to know which protocol version it supports and the community string to use. It's not as hard as it sounds. By default, many devices use a common string for access, public, as does the Net-SNMP daemon. Unless you explicitly change this string, you can just assume that's what is needed to query any host.

Tip

In some distributions, the Net-SNMP daemon and tools can be split out in separate packages. In such cases, install the tool package as well.

If you have installed and started Net-SNMP daemon on Another host, you can perform a simple query to verify SNMP connectivity:

$ snmpstatus -v 2c -c public <IP address>

If the daemon has been started correctly and network connectivity is fine, you should get some output, depending on the system you have:

[UDP: [<IP address>]:161->[0.0.0.0]:51887]=>[Linux another 3.11.10-29-default #1 SMP Thu Mar 5 16:24:00 UTC 2015 (338c513) x86_64] Up: 10:10:46.20 
Interfaces: 3, Recv/Trans packets: 300/281 | IP: 286/245

We can see here that it worked, and by default, communication was done over UDP to port 161. We can see the target system's operating system, hostname, kernel version, when it was compiled and what hardware architecture it was compiled for, and the current uptime. Some network statistics are tacked on as well.

If you are trying to query a network device, it might have restrictions on who is allowed to use the SNMP agent. Some devices allow free access to SNMP data, while some restrict it by default and every connecting host has to be allowed explicitly. If a device does not respond, check its configuration—you might have to add the IP address of the querying machine to the SNMP permission list.

Looking at the snmpstatus command itself, we passed two parameters to it: the SNMP version (2c in this case) and community (which is, as discussed before, public).

If you have other SNMP-enabled hosts, you can try the same command on them. Let's look at various devices:

$ snmpstatus -v 2c -c public <IP address>
[UDP: [<IP address>]:161]=>[IBM Infoprint 1532 version NS.NP.N118 kernel 2.6.6 All-N-1] Up: 5 days, 0:29:53.22
Interfaces: 0, Recv/Trans packets: 63/63 | IP: 1080193/103316

As we can see, this has to be an IBM printer. And hey, it seems to be using a Linux kernel.

While many systems will respond to version 2c queries, sometimes you might see the following:

$ snmpstatus -v 2c -c public <IP address>
Timeout: No Response from <IP address>

This could of course mean network problems, but sometimes SNMP agents ignore requests coming in with a protocol version they do not support or an incorrect community string. If the community string is incorrect, you would have to find out what it has been set to; this is usually easily available in the device or SNMP daemon configuration (for example, Net-SNMP usually has it set in the /etc/snmp/snmpd.conf configuration file). If you believe a device might not support a particular protocol version, you can try another command:

$ snmpstatus -v 1 -c public <IP address>
[UDP: [<IP address>]:161]=>[HP ETHERNET MULTI-ENVIRONMENT,SN:CNBW71B06G,FN:JK227AB,SVCID:00000,PID:HP LaserJet P2015 Series] Up: 3:33:44.22
Interfaces: 2, Recv/Trans packets: 135108/70066 | IP: 78239/70054

So this HP LaserJet printer did not support SNMPv2c, only v1. Still, when queried using SNMPv1, it divulged information such as the serial number and series name.
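The manual trial of protocol versions can be wrapped in a small helper. This is a minimal sketch, assuming the community string is already known to be correct; the function name probe_snmp_version is hypothetical, and the timeout and retry values are arbitrary choices.

```shell
# Try SNMPv2c first, then fall back to SNMPv1.
# Prints the first version that gets an answer; fails if neither does.
# Usage: probe_snmp_version HOST [COMMUNITY]
probe_snmp_version() {
    host=$1
    community=${2:-public}
    for version in 2c 1; do
        # -t 2 -r 1: two-second timeout and one retry, so that a silent
        # device does not hold up the check for long.
        if snmpstatus -v "$version" -c "$community" -t 2 -r 1 "$host" >/dev/null 2>&1; then
            echo "$version"
            return 0
        fi
    done
    return 1
}

# Example call (replace the placeholder with a real address):
#   probe_snmp_version <IP address>
```

A failure here cannot tell an unsupported version apart from a wrong community string or a network problem, so it only narrows things down.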

Let's look at another SNMPv1-only device:

$ snmpstatus -v 1 -c public <IP address>
[UDP: [<IP address>]:161]=>[APC Web/SNMP Management Card (MB:v3.6.8 PF:v2.6.4 PN:apc_hw02_aos_264.bin AF1:v2.6.1 AN1:apc_hw02_sumx_261.bin MN:AP9617 HR:A10 SN: ZA0542025896 MD:10/17/2005) (Embedded PowerNet SNMP Agent SW v2.2 compatible)] Up: 157 days, 20:42:55.19
Interfaces: 1, Recv/Trans packets: 2770626/2972781 | IP: 2300062/2388450

This seems to be an APC UPS, and it's providing a lot of information stuffed in this output, including serial number and even firmware versions. It also has considerably longer uptime than the previous systems: over 157 days.

But surely, there must be more information obtainable through SNMP, and also this looks a bit messy. Let's try another command from the Net-SNMP arsenal, snmpwalk. This command tries to return all the values available from a particular SNMP agent, so the output could be very large—we'd better restrict it to a few lines at first:

$ snmpwalk -v 2c -c public 10.1.1.100 | head -n 6
SNMPv2-MIB::sysDescr.0 = STRING: Linux zab 2.6.16.60-0.21-default #1 Tue May 6 12:41:02 UTC 2008 i686
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (8411956) 23:21:59.56
SNMPv2-MIB::sysContact.0 = STRING: Sysadmin (root@localhost)
SNMPv2-MIB::sysName.0 = STRING: zab
SNMPv2-MIB::sysLocation.0 = STRING: Server Room

Tip

This syntax did not specify OID, and snmpwalk defaulted to SNMPv2-SMI::mib-2. Some devices will have useful information in other parts of the tree. To query the full tree, specify a single dot as the OID value, like this:

$ snmpwalk -v 2c -c public 10.1.1.100 .

As we can see, this command outputs various values, with a name or identifier displayed on the left and the value itself on the right. Indeed, the identifier is called the object identifier or OID, and it is a unique string, identifying a single value.

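Calling everything on the left-hand side an OID is a simplification. It actually consists of an MIB, OID, and UID. For example, in SNMPv2-MIB::sysDescr.0, SNMPv2-MIB is the MIB, sysDescr is the OID, and the trailing 0 is the UID.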

Nevertheless, it is commonly referred to as just the OID, and we will use the same shorthand in this book. Exceptions will be cases when we will actually refer to the MIB or UID part.
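To make the three parts concrete, the short form can be taken apart with plain shell parameter expansion. This is only a sketch: the function name split_identifier is hypothetical, and it assumes the UID is whatever follows the first dot after the object name.

```shell
# Split a short-form identifier such as SNMPv2-MIB::sysDescr.0
# into its MIB, OID, and UID parts.
split_identifier() {
    full=$1
    mib=${full%%::*}     # everything before the double colon
    rest=${full#*::}     # everything after the double colon
    name=${rest%%.*}     # object name up to the first dot
    case "$rest" in
        *.*) uid=${rest#*.} ;;   # suffix after the first dot
        *)   uid= ;;             # no suffix present
    esac
    printf 'MIB=%s OID=%s UID=%s\n' "$mib" "$name" "$uid"
}

split_identifier SNMPv2-MIB::sysDescr.0
split_identifier DISMAN-EVENT-MIB::sysUpTimeInstance
```

The first call reports a UID of 0, while the second reports an empty UID because the identifier has no numeric suffix.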

Looking at the output, we can also identify some of the data we saw in the output of snmpstatus—SNMPv2-MIB::sysDescr.0 and DISMAN-EVENT-MIB::sysUpTimeInstance. Two other values, SNMPv2-MIB::sysContact.0 and SNMPv2-MIB::sysLocation.0, haven't been changed from the defaults, and thus aren't too useful right now. While we are at it, let's compare this output to the one from the APC UPS:

$ snmpwalk -v 1 -c public <IP address> | head -n 6
SNMPv2-MIB::sysDescr.0 = STRING: APC Web/SNMP Management Card (MB:v3.6.8 PF:v2.6.4 PN:apc_hw02_aos_264.bin AF1:v2.6.1 AN1:apc_hw02_sumx_261.bin MN:AP9617 HR:A10 SN: ZA0542025896 MD:10/17/2005) (Embedded PowerNet SNMP Agent SW v2.2 compatible)
SNMPv2-MIB::sysObjectID.0 = OID: PowerNet-MIB::smartUPS450
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (1364829916) 157 days, 23:11:39.16
SNMPv2-MIB::sysContact.0 = STRING: Unknown
SNMPv2-MIB::sysName.0 = STRING: Unknown
SNMPv2-MIB::sysLocation.0 = STRING: Unknown

The output is quite similar, containing the same OIDs, and the system contact and location values aren't set as well. But to monitor some things, we have to retrieve a single value per item, and we can verify that it works with another command, snmpget:

$ snmpget -v 2c -c public 10.1.1.100 DISMAN-EVENT-MIB::sysUpTimeInstance
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (8913849) 1 day, 0:45:38.49

We can add any valid OID, such as DISMAN-EVENT-MIB::sysUpTimeInstance in the previous example, after the host to get whatever value it holds. The OID itself currently consists of two parts, separated by two colons. As discussed earlier, the first part is the name of a Management Information Base or MIB. MIBs are collections of item descriptions, mapping numeric forms to textual ones. The second part is the OID itself. There is no UID in this case. We can look at the full identifier by adding a -Of flag to modify the output:

$ snmpget -v 2c -c public -Of 10.1.1.100 DISMAN-EVENT-MIB::sysUpTimeInstance
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (8972788) 1 day, 0:55:27.88

Tip

To translate from the numeric to the textual form, an MIB is needed. In some cases, the standard MIBs are enough, but many devices have useful information in vendor-specific extensions. Some vendors provide quality MIBs for their equipment, some are less helpful. Contact your vendor to obtain any required MIBs. We will discuss basic MIB management later in this chapter.

That's a considerably long name, showing the tree-like structure. It starts with a no-name root object and goes further, with all the values attached at some location to this tree. Well, we mentioned numeric form, and we can make snmpget output numeric names as well with the -On flag:

$ snmpget -v 2c -c public -On 10.1.1.100 DISMAN-EVENT-MIB::sysUpTimeInstance
.1.3.6.1.2.1.1.3.0 = Timeticks: (9048942) 1 day, 1:08:09.42

So, each OID can be referred to in one of three notations: short, long, or numeric. In this case, DISMAN-EVENT-MIB::sysUpTimeInstance, .iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance, and .1.3.6.1.2.1.1.3.0 all refer to the same value.
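The numeric form makes the tree structure easy to test mechanically, for example to see whether a value sits in the standard mib-2 branch or in the vendor-specific enterprises branch. The helper name oid_in_subtree and the variable names are made up for this sketch.

```shell
# Succeed if numeric OID $1 lies inside (or equals) the subtree rooted at $2.
# A trailing dot is appended so that .1.3.6.1.2.10 is not mistaken for a
# child of .1.3.6.1.2.1.
oid_in_subtree() {
    case "$1." in
        "$2".*) return 0 ;;
        *)      return 1 ;;
    esac
}

mib2=.1.3.6.1.2.1          # SNMPv2-SMI::mib-2
enterprises=.1.3.6.1.4.1   # SNMPv2-SMI::enterprises, where vendor OIDs live

oid_in_subtree .1.3.6.1.2.1.1.3.0 "$mib2" && echo "sysUpTimeInstance is under mib-2"
oid_in_subtree .1.3.6.1.2.1.1.3.0 "$enterprises" || echo "sysUpTimeInstance is not vendor-specific"
```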

Tip

Take a look at the snmpcmd man page for other supported output-formatting options.
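One formatting detail worth knowing is the Timeticks type seen in the sysUpTimeInstance output earlier: the number in parentheses counts hundredths of a second. The following sketch extracts that number from an output line and converts it; both function names are hypothetical, and the output format is an arbitrary choice.

```shell
# Pull the raw tick count out of a line such as
#   DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (8913849) 1 day, 0:45:38.49
extract_ticks() {
    sed -n 's/.*Timeticks: (\([0-9][0-9]*\)).*/\1/p'
}

# Convert hundredths of a second into "Nd HH:MM:SS.hh".
ticks_to_uptime() {
    ticks=$1
    secs=$((ticks / 100))
    printf '%dd %02d:%02d:%02d.%02d\n' \
        $((secs / 86400)) $((secs % 86400 / 3600)) \
        $((secs % 3600 / 60)) $((secs % 60)) $((ticks % 100))
}

line='DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (8913849) 1 day, 0:45:38.49'
ticks_to_uptime "$(printf '%s\n' "$line" | extract_ticks)"
```

For the sample line, this prints 1d 00:45:38.49, which matches the human-readable part that snmpget already shows. The -Ot output option should also make the Net-SNMP tools print the raw tick count directly.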

But how does this fit into Zabbix SNMP items? Well, to create an SNMP item in Zabbix, you have to enter an OID. How do you know what OID to use? Often, you might have the following choices:

  • Just know it
  • Ask somebody
  • Find it out yourself

More often than not, the first two options don't work, so finding it out yourself will be the only way. As we have learned, Net-SNMP tools are fairly good at supporting such a discovery process.

Using SNMPv3 with Net-SNMP

The latest version of SNMP, version 3, is still not that common, and it is somewhat more complex than the previous versions. Device implementations can also vary in quality, so it might be useful to test your Zabbix configuration against a known solution: the Net-SNMP daemon. Let's add an SNMPv3 user to it and get a value. Make sure Net-SNMP is installed and that snmpd starts up successfully.

To configure SNMPv3, first stop snmpd, and then, as root, run this:

# net-snmp-create-v3-user -ro zabbix

This utility will prompt for a password. Enter a password of at least eight characters; although shorter passwords will be accepted here, they will fail the default length requirement later. Start snmpd again, and test the retrieval of values using version 3:

$ snmpget -u zabbix -A zabbixzabbix -v 3 -l authNoPriv localhost SNMPv2-MIB::sysDescr.0

This should return data successfully, as follows:

SNMPv2-MIB::sysDescr.0 = STRING: Linux another 3.11.10-29-default #1 SMP Thu Mar 5 16:24:00 UTC 2015 (338c513) x86_64

We don't need to configure versions 1 or 2c separately, so now we have a general SNMP agent, providing all common versions for testing or exploring.
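For repeated test setups, the user-creation step can be scripted so that a passphrase that is too short is rejected up front. This is a sketch under assumptions: the function names are hypothetical, it relies on the -A option of net-snmp-create-v3-user to pass the authentication passphrase without a prompt, and it must be run as root with snmpd stopped, just like the interactive command.

```shell
# Succeed only if the passphrase meets the eight-character minimum.
valid_v3_passphrase() {
    [ "${#1}" -ge 8 ]
}

# Create a read-only SNMPv3 user without an interactive prompt.
# Usage: create_v3_user USER PASSPHRASE
create_v3_user() {
    user=$1
    pass=$2
    if ! valid_v3_passphrase "$pass"; then
        echo "passphrase must be at least 8 characters" >&2
        return 1
    fi
    # -ro: read-only user; -A: authentication passphrase.
    net-snmp-create-v3-user -ro -A "$pass" "$user"
}

# Example call (as root, with snmpd stopped):
#   create_v3_user zabbix zabbixzabbix
```

Keep in mind that a passphrase given on the command line may end up in the shell history, so this is better suited to throwaway test systems.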

The engine ID

There is a very common misconfiguration done when attempting to use SNMPv3. According to RFC 3414 (https://tools.ietf.org/html/rfc3414), each device must have a unique identifier. Each SNMP engine maintains a value, snmpEngineID, which uniquely identifies the SNMP engine.

Sometimes, users tend to set this ID to the same value for several devices. As a result, Zabbix is unable to successfully monitor those devices. To make things worse, each device responds nicely to commands such as snmpget or snmpwalk. These commands only talk to a single device at a time; thus, they do not care about snmpEngineID much.

In Zabbix, this could manifest as one device working properly but stopping when another one is added to monitoring.

If there are mysterious problems with SNMPv3 device monitoring with Zabbix that do not manifest when using command line tools, snmpEngineID should be checked very carefully.
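Once the engine IDs have been collected (the value is exposed as SNMP-FRAMEWORK-MIB::snmpEngineID.0), spotting duplicates is a simple counting exercise. The sketch below assumes a hypothetical input format of one "host engineID" pair per line, with the ID written as a single token. The host names and IDs shown are invented.

```shell
# Read "host engineID" pairs on standard input and print every
# engine ID that is used by more than one host, with the hosts involved.
find_duplicate_engine_ids() {
    awk '{ count[$2]++; hosts[$2] = hosts[$2] " " $1 }
         END { for (id in count) if (count[id] > 1) print id ":" hosts[id] }'
}

find_duplicate_engine_ids <<'EOF'
switch1 80001f8880aaaa0001
switch2 80001f8880bbbb0002
ups1 80001f8880aaaa0001
EOF
```

With this sample input, the only line printed names switch1 and ups1 as sharing an engine ID.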

Authentication, encryption, and context

With SNMPv3, several additional features are available. Most notably, one may choose strong authentication and encryption of communication. For authentication, Zabbix currently supports the following methods:

  • Message-Digest algorithm 5 (MD5)
  • Secure Hash Algorithm (SHA)

For encryption, Zabbix supports these:

  • Data Encryption Standard (DES)
  • Advanced Encryption Standard (AES)

While it seems that one might always want to use the strongest possible encryption, keep in mind that this can be quite resource intensive. Querying a lot of values over SNMP can overload the target device quite easily. To have reasonable security, you may choose the authNoPriv option in the Security level dropdown. With this level, every message is authenticated, but the transferred data is not encrypted.

Another SNMPv3 feature is context. In some cases, one SNMP endpoint is responsible for providing information about multiple devices, for example, about multiple UPS devices. A single OID will get a different value, depending on the context specified. Zabbix allows you to specify the context for each individual SNMPv3 item.
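Before entering these settings in Zabbix, it can help to see how they map to Net-SNMP command-line options. The following sketch only prints an snmpget command line and does not run it. The function name v3_command, the V3_AUTH_PROTO and V3_PRIV_PROTO variables, and the context name ups2 are all made up for illustration. The SHA and AES defaults are assumptions and have to match how the user was actually created on the agent.

```shell
# Print (not run) an snmpget command line for the given SNMPv3 settings.
# Usage: v3_command LEVEL USER AUTHPASS PRIVPASS CONTEXT HOST OID
# Pass '' for values that do not apply to the chosen level.
v3_command() {
    level=$1 user=$2 authpass=$3 privpass=$4 context=$5 host=$6 oid=$7
    # Protocol choices are assumptions; override them through the environment.
    auth_proto=${V3_AUTH_PROTO:-SHA}
    priv_proto=${V3_PRIV_PROTO:-AES}

    cmd="snmpget -v 3 -l $level -u $user"
    case "$level" in
        authNoPriv|authPriv) cmd="$cmd -a $auth_proto -A $authpass" ;;
    esac
    if [ "$level" = authPriv ]; then
        cmd="$cmd -x $priv_proto -X $privpass"
    fi
    if [ -n "$context" ]; then
        cmd="$cmd -n $context"
    fi
    echo "$cmd $host $oid"
}

v3_command authNoPriv zabbix zabbixzabbix '' '' localhost SNMPv2-MIB::sysDescr.0
v3_command authPriv zabbix zabbixzabbix zabbixzabbix ups2 localhost SNMPv2-MIB::sysDescr.0
```

The first line printed has authentication options only, while the second adds the privacy options and the -n option that selects the context.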