Josh Stevens, SolarWinds’ “Head Geek”, did an excellent video overview of the network protocols that network admins must be familiar with to fully utilize the network management tools out there.
We’ve also transcribed this video below for the benefit of visitors without access to video in their environment:
Understanding network management protocols is an important part of your job as a network engineer and also within your role of doing network management. Network management protocols are also a key part of the certification program for SolarWinds. This training video will provide you an overview of how network management applications leverage network management protocols. We’ll also describe the different types of network management protocols that are available and explain how the most common ones work.
Let’s go ahead and go to my desk now and take a look at some network management protocols, how they work and how network management applications leverage them. As we discuss network management protocols, we should think about three main things.
First of all, let’s talk about how Orion leverages network management protocols and which types of protocols are leveraged for different types of monitoring. Let’s also discuss the most common network management protocols and how they work.
As you know, Orion does both fault monitoring and performance monitoring. For fault management, most network management systems, including Orion, use ICMP or ping to detect whether a device is up or down. It’s as simple as the NMS sending out a ping request and waiting for a response. If a response isn’t returned, the NMS assumes the device is down.
Now, on Orion it works slightly differently. If the response isn’t returned, Orion places the node into a node warning state, which means that Orion is going to fast poll the device to verify that it’s really down. Orion by default will monitor the device in a fast polling mode for 120 seconds before it notifies you that the device is down. This is configurable under “Advanced Settings” in the Orion system manager.
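The warning-then-down behavior described above can be sketched as a small state machine. This is only an illustration under stated assumptions, not Orion's actual implementation: the class name `NodeStatus`, the method `record_poll`, and the state labels are hypothetical, and the only value taken from the transcript is the 120-second default.

```python
from dataclasses import dataclass
from typing import Optional

UP, WARNING, DOWN = "up", "warning", "down"


@dataclass
class NodeStatus:
    """Hypothetical model of the up / warning / down logic described above."""
    warning_window: float = 120.0          # seconds of fast polling before "down"
    state: str = UP
    first_miss_at: Optional[float] = None  # time of the first unanswered ping

    def record_poll(self, got_reply: bool, now: float) -> str:
        if got_reply:
            # Any reply clears the warning and returns the node to "up".
            self.state = UP
            self.first_miss_at = None
        elif self.first_miss_at is None:
            # First missed reply: enter the warning state and start the clock.
            self.state = WARNING
            self.first_miss_at = now
        elif now - self.first_miss_at >= self.warning_window:
            # Still no reply after the whole window: report the node as down.
            self.state = DOWN
        return self.state


node = NodeStatus()
for timestamp, reply in [(0, True), (10, False), (60, False), (130, False)]:
    print(timestamp, node.record_poll(reply, now=timestamp))
```

Passing the timestamp in explicitly, rather than reading the clock inside the method, keeps the sketch easy to test and makes the window configurable, much like the setting mentioned above.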
Monitoring for faults of sub‑elements like interfaces and volumes is done differently, as it’s done via SNMP. Now, because it’s done via SNMP, it’s more reliable than device status. For example, if the NMS is pinging a router to see if it’s up or down and a response doesn’t come back, the NMS really doesn’t know that the device is down. It only knows that the response didn’t come back. The device could be down, there could be a routing problem, an intermediary device could be down, or something could have blocked the packet on its way to or from the device.
With SNMP, if you get a message that an interface is down, it’s verifiable. It means you asked the device for status via SNMP and the device told you specifically that the interface was down, so it’s 100% accurate. Now, in the event that the device is down and you’re querying it for a sub‑interface or a sub‑element status, you’ll get into an unknown state. So that’s a different type of signifier than if the interface itself was down.
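The difference between "down" and "unknown" for a sub-element can be sketched as follows. The function name `interface_status` is hypothetical, and the sketch assumes the interface state comes from the standard IF-MIB `ifOperStatus` object, where 1 means up and 2 means down; the transcript does not say which object Orion actually reads.

```python
from typing import Optional

# ifOperStatus values defined by the standard IF-MIB.
IF_OPER_UP = 1
IF_OPER_DOWN = 2


def interface_status(node_reachable: bool, oper_status: Optional[int]) -> str:
    """Map an SNMP answer (or the lack of one) to a display status.

    oper_status is None when the device returned no SNMP answer.
    """
    if not node_reachable or oper_status is None:
        # The device could not tell us anything, so the state is unknown
        # rather than down.
        return "unknown"
    if oper_status == IF_OPER_UP:
        return "up"
    if oper_status == IF_OPER_DOWN:
        # The device itself reported the interface as down.
        return "down"
    # Other values (testing, dormant, and so on) are lumped together here.
    return "unknown"


print(interface_status(True, IF_OPER_DOWN))   # device answered: interface is down
print(interface_status(False, None))          # device did not answer: unknown
```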
In terms of performance management metrics, ICMP or ping is used for availability calculations and latency or response time. SNMP is used for almost all of the other statistics, including CPU, memory, buffers, interface traffic and errors and many, many more statistics. In some cases, for Windows systems, the NMS might leverage WMI when trying to check for performance counter type values.
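As a rough illustration of how ping results can turn into availability and response-time figures, here is a minimal sketch. The function name `summarize_pings` and the exact formulas are assumptions; the transcript does not spell out how Orion computes these values.

```python
from typing import Dict, Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize a series of ping results.

    Each entry is the round-trip time in milliseconds, or None if no
    reply came back for that request.
    """
    if not rtts_ms:
        raise ValueError("at least one ping result is required")
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    availability = 100.0 * len(replies) / len(rtts_ms)
    return {
        "availability_pct": availability,
        "packet_loss_pct": 100.0 - availability,
        # Average latency only makes sense if at least one reply arrived.
        "avg_response_ms": sum(replies) / len(replies) if replies else None,
    }


print(summarize_pings([10.0, 20.0, None, 30.0]))
```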
Now, let’s go ahead and take a look at Orion and see which of the data is collected by each protocol. What I have here is the Orion main home page and I’m going to click on the “Top Ten” view. The top ten lists are one of my favorite pages to go look and see different information that’s available and has been collected by Orion.
As we scroll down the list, the first things we notice are the top ten nodes by response time and packet loss on the left. That data is collected via ICMP or ping. Top ten interfaces by percent utilization and by traffic on the right are collected via SNMP, the Simple Network Management Protocol. Pretty much all of the other data on the screen ‑‑ errors and discards, the CPU load, memory ‑‑ is collected via SNMP.
Let’s go ahead and drill down on one of these routers and view the node details page. In this example, we see response time and packet loss on the top left ‑‑ again, ICMP‑based ‑‑ as well as the fact that the device is green, meaning it’s up, which, as we know, verifies we’re receiving ping packets back. The rest of the data on the screen in terms of CPU load, memory utilization, buffer hits and misses, interface status and interface traffic, errors and statistics, is all collected via Simple Network Management Protocol, SNMP.
Let’s now take a look at Wireshark, my favorite protocol analyzer, and look at some packets. This is an ICMP or ping request sent from a network management system to a host to verify its status. One of the things you’ll want to notice is that this ping packet has data within the data portion or the payload portion of the packet.
Orion, by default, does include data within the packet, although it’s not random data like this, it’s actually text and readable data. Now, it’s important to know this because some devices in between your NMS and the device you’re trying to poll may block ICMP packets with certain criteria in their data fields. Some firewalls will block packets with a zero‑size data field; some firewalls will block packets with large data in the field and I’ve even seen devices that would have trouble passing ICMP packets that were even‑ or odd‑sized. So, you’ll want to pay attention to that.
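To make the data portion of a ping packet concrete, here is a minimal sketch that builds an ICMP echo request with a chosen payload. The function names and the sample payload text are assumptions for illustration; the sketch only constructs the bytes and does not send anything, since sending raw ICMP usually needs elevated privileges.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request; the code field is 0


def internet_checksum(data: bytes) -> int:
    """Standard Internet checksum: one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with one zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Return the ICMP header plus payload, with the checksum filled in."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"hello from the NMS")
print(len(packet), packet.hex())
```

Changing the `payload` argument (empty, very large, even or odd length) produces the kinds of packets that the firewalls mentioned above may treat differently.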
The next packet in the series is the response, or the reply, and the NMS will calculate the difference between the time that the request was sent and the reply was received and that is your roundtrip response time or latency.
For SNMP, there are a few things that are very important to notice. The first thing to notice is that SNMP is a UDP‑based protocol. That means it’s connectionless and therefore it’s less reliable than a connection‑based protocol.
It also by default runs on UDP port 161. Now, this is really important to know because as you’re needing to change firewall rules and access lists, you’ll want to allow UDP port 161 through there for polling. Some devices leverage a non‑default port for SNMP. You can usually configure that on that device. You can also configure that in Orion for each individual node.
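The idea of a default port with a per-node override can be sketched like this. `NodeConfig` and `snmp_endpoint` are hypothetical names used only for illustration; they are not Orion settings or an Orion API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_SNMP_PORT = 161  # standard UDP port for SNMP polling


@dataclass
class NodeConfig:
    """Hypothetical per-node polling settings."""
    host: str
    snmp_port: Optional[int] = None  # None means "use the default port"

    def snmp_endpoint(self) -> Tuple[str, int]:
        port = DEFAULT_SNMP_PORT if self.snmp_port is None else self.snmp_port
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid UDP port: {port}")
        return (self.host, port)


print(NodeConfig("10.0.0.1").snmp_endpoint())                  # default port
print(NodeConfig("10.0.0.2", snmp_port=1161).snmp_endpoint())  # per-node override
```

Whatever port a node ends up using is the one that firewall rules and access lists between the NMS and that node need to allow.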
When you look within the data portion of the packet, both in the request and in the reply, you’ll see the MIBs and variables that are used, and the OIDs, to get the data. We’ll discuss that in more detail in the video entitled “Understanding MIBs, OIDs, and Performance Counters.”
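If you want to recognize an OID in the raw bytes of a capture, it helps to know that SNMP encodes it with BER: the first two numbers are combined into one byte, and each later number is written in base 128 with the high bit set on every byte except the last. The sketch below shows only that content encoding, without the leading type and length bytes. The function name `encode_oid` is hypothetical, and the sketch assumes the common case where the OID starts with 0 or 1 (as in 1.3.6.1...).

```python
def encode_oid(dotted: str) -> bytes:
    """BER-encode the content bytes of a dotted OID (no tag or length)."""
    arcs = [int(part) for part in dotted.strip(".").split(".")]
    if len(arcs) < 2 or arcs[0] > 1 or arcs[1] > 39:
        raise ValueError("sketch only handles OIDs starting with 0.x or 1.x, x < 40")
    # The first two arcs share one byte: 40 * first + second.
    encoded = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        # Collect 7-bit groups, least significant first.
        groups = [arc & 0x7F]
        arc >>= 7
        while arc:
            groups.append((arc & 0x7F) | 0x80)  # high bit marks "more bytes follow"
            arc >>= 7
        encoded.extend(reversed(groups))
    return bytes(encoded)


# sysUpTime.0 from the standard MIB-2 system group
print(encode_oid("1.3.6.1.2.1.1.3.0").hex())
```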
This concludes the video on network management protocols. Remember, the video is not intended to teach you everything you need to know to pass a certification exam. You’ll also want to actually use some of the protocols, experiment with network management applications and tools, watch the other videos on the solarwinds.com website and read the recommended reading material on the education section of the solarwinds.com website as well.
Thanks a lot, and good luck.
To view more of SolarWinds’ video training features, check out their excellent “Tech Talks” video library here.