In Chapter 2, “Data for Network Automation,” you learned about data types and data models, as well as common methods to gather data from your network infrastructure. After you have stored your network data, what do you do with it? Are you currently storing any of your network’s data (logs, metrics, configurations)? If so, which data? What for?
You typically store data in order to find insights in it and to comply with regulatory requirements. An insight, in this sense, can be any useful information or action.
This chapter helps you understand how to use the collected data from your network and derive value from it. In the context of enterprise networks, this chapter covers the following topics:
Data preparation techniques
Data visualization techniques
Network insights
At the end of this chapter are some real case studies that are meant to inspire you to implement automation solutions in your own network.
Data Preparation
Data comes in various formats, such as XML, JSON, and flow logs (refer to Chapter 2). It would be nice if we could gather bread from a field, but we must gather wheat and process it into bread. Similarly, after we gather data from a variety of devices, we need to prepare it. When you have a heterogeneous network, even if you are gathering the same type of data from different places, it may come in different formats (for example, NetFlow on a Cisco device and IPFIX on an HPE device). Data preparation involves tailoring gathered data to your needs.
There are a number of data preparation methods. Data preparation can involve simple actions such as normalizing the date format to a common one as well as more complex actions such as aggregating different data points. The following sections discuss some popular data preparation methods that you should be familiar with.
Parsing
Most of the time, when you gather data, it does not come exactly as you need it. It may be in different units, it may be too verbose, or you might want to split it in order to store different components separately in a database. In such cases, you can use parsing techniques.
There are many ways you can parse data, including the following:
Type formatting: You might want to change the type or unit of a value, such as converting seconds to minutes.
Splitting into parts: You might want to divide a bigger piece of information into smaller pieces, such as changing a sentence into words.
Tokenizing: You might want to transform a data field into something less sensitive, such as when you store payment information.
You can parse data before storage or when consuming it from storage. There is really no preferred way, and the best method depends on the architecture and storage capabilities. For example, if you store raw data, it is possible that afterward, you might parse it in different ways for different uses. If you store raw data, you have many options for how to work with that data later; however, it will occupy more storage space, and those different uses may never occur. If you choose to parse data and then store it, you limit what you store, which saves space. However, you might discard or filter a field that you may later need for a new use case.
Regular expressions (regex) play a big part in parsing. Regex, which are used to find patterns in text, exist in most automation tools and programming languages (for example, Python, Java). It is important to note that, regardless of the tool or programming language used, the regex syntax is essentially the same. Regex can match specific characters, wildcards, and sequences. They are not predefined; you write your own to suit your needs, as long as you use the appropriate regex syntax. Table 3-1 describes the regex special characters.
Table 3-1 Regex Special Characters
| Character | Description |
| --- | --- |
| \d | Matches any digit character (0 to 9). |
| \D | Matches any character that is not a digit (0 to 9). |
| \w | Matches any word character (a to z, A to Z, 0 to 9, or the underscore character). |
| \W | Matches any non-word character (not a to z, A to Z, 0 to 9, or the underscore character). |
| \s | Matches any whitespace character (spaces, tabs, newlines, carriage returns). |
| \S | Matches any non-whitespace character. |
Table 3-2 describes a number of regex meta characters; although this is not a complete list, these are the most commonly used meta characters.
Table 3-2 Regex Meta Characters
| Characters | Description |
| --- | --- |
| [] | A set of characters |
| . | Any character |
| ^ | Starts with |
| $ | Ends with |
| + | One or more occurrences |
| * | Zero or more occurrences |
| {} | Exact number of occurrences |
| \| | OR |
Example 3-1 shows how to use regex to find the IP address of an interface.
Example 3-1 Using Regex to Identify an IPv4 Address
Example 3-1 shows a simple regex that matches three blocks of characters ranging from 0 to 999, each followed by a dot, and then a final block of characters ranging from 0 to 999. The result in this configuration is two entries, corresponding to the IP address and the mask of the Loopback0 interface.
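As a rough sketch of the idea, the following play applies such a regex with Ansible's regex_findall filter to an illustrative configuration snippet; the play structure, variable names, and addresses are assumptions, not the exact listing:

```yaml
---
- name: Identify IPv4 addresses with a simple regex
  hosts: localhost
  gather_facts: false
  vars:
    # Illustrative configuration text; in practice this would be gathered from a device
    loopback_config: |
      interface Loopback0
       ip address 10.255.0.1 255.255.255.255
  tasks:
    - name: Find strings shaped like x.x.x.x, with each block between 0 and 999
      ansible.builtin.debug:
        msg: "{{ loopback_config | regex_findall('\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}') }}"
```

Run against this snippet, the filter returns two matches: the address and the mask of Loopback0.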
IPv4 addresses are made up of four octets, each consisting of 8 bits and therefore ranging in value from 0 to 255. As an exercise, improve the regex in Example 3-1 to match only the 0 to 255 range in each octet. To try it out, find one of the many websites that let you insert text and a regex and then show you the resulting match.
You can also use a number of tools, such as Python and Ansible, to parse data. Example 3-2 shows how to use Ansible for parsing. First, it lists all the network interfaces available on a MacBook laptop, using the code in the file all_interfaces.yml. Next, it uses the regex ^en to display only the interfaces prefixed with en, using the code in en_only.yml.
Example 3-2 Using Ansible to Parse Device Facts
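The following is a minimal sketch of what all_interfaces.yml and en_only.yml could contain, assuming facts are gathered from the local machine; the play and task names are illustrative:

```yaml
---
# all_interfaces.yml (sketch): list every interface reported by the gathered facts
- name: List all network interfaces
  hosts: localhost
  gather_facts: true
  tasks:
    - name: Show all interface names
      ansible.builtin.debug:
        var: ansible_facts.interfaces
```

```yaml
---
# en_only.yml (sketch): keep only interfaces whose names start with en
- name: List en-prefixed interfaces only
  hosts: localhost
  gather_facts: true
  tasks:
    - name: Show interfaces matching ^en
      ansible.builtin.debug:
        msg: "{{ ansible_facts.interfaces | select('match', '^en') | list }}"
```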
What if you need to apply modifications to interface names, such as replacing en with Ethernet? In such a case, you can apply mapping functions or regex, as shown with Ansible in Example 3-3.
Example 3-3 Using Ansible to Parse and Alter Interface Names
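A sketch of that approach, chaining the match filter with regex_replace (the play structure is illustrative), might look like this:

```yaml
---
- name: Rename en-prefixed interfaces to Ethernet
  hosts: localhost
  gather_facts: true
  tasks:
    - name: Replace the en prefix with Ethernet in each matching interface name
      ansible.builtin.debug:
        msg: "{{ ansible_facts.interfaces | select('match', '^en') | map('regex_replace', '^en', 'Ethernet') | list }}"
```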
Another technique that can be considered parsing is enhancing data with other fields. Although this is typically done before storage rather than before usage, consider that sometimes the data you gather might not have all the information you need to derive insights. For example, flow data might have SRC IP, DST IP, SRC port, and DST port information but no date. If you store that data as is, you might be able to get insights about the events but not about when they happened. Something you could consider doing in this scenario is appending or prepending the current date to each flow and then storing the flow data.
As in the previous example, there are many use cases where adding extra data fields is helpful; for example, during a maintenance window, it helps to have sensor data that includes the sensor's location. Adding extra data fields is a commonly used technique when you know you will need something more than just the available exported data.
Example 3-4 enhances the previous Ansible code (refer to Example 3-3) by listing the available interfaces along with the time the data was collected.
Example 3-4 Using Ansible to List Interfaces and Record Specific Times
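A minimal sketch of the idea, assuming the timestamp comes from the ansible_date_time fact gathered on the local machine, follows:

```yaml
---
- name: List interfaces together with the collection time
  hosts: localhost
  gather_facts: true        # provides both the interface list and ansible_date_time
  tasks:
    - name: Show the interfaces and when they were collected
      ansible.builtin.debug:
        msg: "{{ ansible_date_time.iso8601 }}: {{ ansible_facts.interfaces | select('match', '^en') | list }}"
```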
As part of parsing, you might choose to ignore some data. That is, you might simply drop it instead of storing or using it. Why would you do this? Well, you might know that some of the events that are taking place taint your data. For example, say that during a maintenance window you must physically replace a switch. If you have two redundant switches in your architecture, while you are replacing one of them, all your traffic is going through the other one. The data collected will reflect this, but it is not a normal scenario, and you know why it is happening. In such scenarios, ignoring data points can be useful, especially to prevent outliers on later analysis.
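As a rough illustration of dropping such data points, assuming each sample is tagged with a maintenance flag (the data structure here is purely hypothetical), a filter such as rejectattr can discard the tainted entries before they are stored or analyzed:

```yaml
---
- name: Ignore data points collected during a maintenance window
  hosts: localhost
  gather_facts: false
  vars:
    # Hypothetical samples; the maintenance flag would be set by whatever tags the collection run
    cpu_samples:
      - { time: "10:00", value: 31, maintenance: false }
      - { time: "10:01", value: 97, maintenance: true }
      - { time: "10:02", value: 29, maintenance: false }
  tasks:
    - name: Keep only samples taken outside the maintenance window
      ansible.builtin.debug:
        msg: "{{ cpu_samples | rejectattr('maintenance') | list }}"
```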
So far, we have mostly looked at examples of using Ansible to parse data. However, as mentioned earlier, you can use a variety of tools for parsing.
Something to keep in mind is that the difficulty of parsing data is tightly coupled with its format. Regex are typically used for text parsing, and text is the most challenging type of data to parse. Chapter 2 mentions that XQuery and XPath can help you navigate XML documents. This should give you the idea that different techniques can be used with different types of data. Chapter 2’s message regarding replacing the obsolete CLI access with NETCONF, RESTCONF, and APIs will become clearer when you need to parse gathered data. Examples 3-5 and 3-6 show how you can parse the same information gathered in different formats from the same device.
Example 3-5 Using Ansible with RESTCONF to Retrieve an Interface Description
In Example 3-5, you can see that when using a RESTCONF module, you receive the interface information in JSON format. Using Ansible, you can navigate the JSON structure by using square bracket syntax. It is quite simple to access the interface description, and if you needed to access some other field, such as the IP address field, you would need to make only a minimal change:
debug: var=cat9k_rest_config['response']['ietf-interfaces:interface']['ietf-ip:ipv4']
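A minimal sketch of such a play, assuming a RESTCONF-enabled device reachable as cat9k over the httpapi connection and an illustrative interface path, might look like this:

```yaml
---
- name: Retrieve an interface description over RESTCONF
  hosts: cat9k   # assumes ansible_connection=ansible.netcommon.httpapi and ansible_network_os=ansible.netcommon.restconf in inventory
  gather_facts: false
  tasks:
    - name: Get the interface data (the path shown is illustrative)
      ansible.netcommon.restconf_get:
        path: /data/ietf-interfaces:interfaces/interface=GigabitEthernet1
        output: json
      register: cat9k_rest_config

    - name: Show only the description field
      ansible.builtin.debug:
        var: cat9k_rest_config['response']['ietf-interfaces:interface']['description']
```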
Example 3-6 achieves the same outcome by using a CLI module and regex.
Example 3-6 Using Ansible with SSH and Regex to Retrieve an Interface Description
Example 3-6 uses the following regex:
"description [\\w+ *]*"
This is a simple regex example, but things start to get complicated when you need to parse several values or complex values. Modifications to the expected values might require building new regex, which can be troublesome.
By now you should be seeing the value of the structured data formats introduced in Chapter 2.
Aggregation
Data can be aggregated—that is, used in a summarized format—from multiple sources or from a single source. There are multiple reasons you might need to aggregate data, such as when you do not have enough computing power or networking bandwidth to use all the data points or when single data points without the bigger context can lead to incorrect insights.
Let’s look at a networking example focused on the CPU utilization percentage in a router. If you are polling the device for this percentage every second, it is possible that for some reason (such as a traffic burst punted to CPU), it could be at 100%, but then, in the next second, it drops to around 20% and stays there. In this case, if you have an automated system to act on the monitored metric, you will execute a preventive measure that is not needed. Figures 3-1 and 3-2 show exactly this, where a defined threshold for 80% CPU utilization would trigger if you were measuring each data point separately but wouldn’t if you aggregated the data and used the average of the three data points.
Figure 3-1 % CPU Utilization Graph per Time Measure T
Figure 3-2 % CPU Utilization Graph per Aggregated Time Measure T
In the monitoring use case, it is typical to monitor using aggregated results at time intervals (for example, 15 or 30 seconds). If the aggregated result is over a defined CPU utilization threshold, it is a more accurate metric to act on. Tools like Kibana support aggregation natively.
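As a small illustration of the idea, the following play averages three hypothetical per-second samples and acts only when the averaged value crosses the threshold; with the values shown, the single 100% spike does not trigger any action:

```yaml
---
- name: Act on aggregated CPU utilization rather than single samples
  hosts: localhost
  gather_facts: false
  vars:
    cpu_samples: [100, 21, 19]   # illustrative per-second readings within one window
    cpu_threshold: 80
  tasks:
    - name: Trigger the preventive action only if the window average crosses the threshold
      ansible.builtin.debug:
        msg: "Average {{ (cpu_samples | sum) / (cpu_samples | length) }}% is above {{ cpu_threshold }}%"
      when: (cpu_samples | sum) / (cpu_samples | length) > cpu_threshold
```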
As another example of aggregating data from multiple sources to achieve better insights, consider the following scenario: You have two interfaces connecting the same two devices, and you are monitoring all interfaces for bandwidth utilization. Your monitoring tool has a defined threshold for bandwidth utilization percentage and automatically provisions a new interface if the threshold is reached. For some reason, most of your traffic is taking one of the interfaces, which triggers your monitoring tool’s threshold. However, you still have the other interface bandwidth available. A more accurate aggregated metric would be the combined bandwidth available for the path (an aggregate of the data on both interfaces).
Finally, in some cases, you can aggregate repeated logs by adding a quantifier instead of storing the same entry multiple times, although this is often out of your control because many tools either do not support this feature or apply it automatically. This type of aggregation can occur either on the device producing the logs or on the log collector. It can be seen as a compression technique as well (see Example 3-7). This type of aggregation is something to keep in mind when analyzing data rather than something that you configure.